Resource summary
An Empirical Study of Chinese Name Matching and Applications
- Introduction
- Name matching
- Important
- Downstream tasks
- Entity linking
- Includes context of mentions
- Entity clustering
- Includes context of mentions
- ?
- Entity coreference
- ?
- Name transliteration
- Identifying names for mining paraphrases
- ?
- Standalone name matching
- Context independent
- Entity disambiguation
- Determine whether two mention strings refer to the same entity
- Methods
- Language type
- Alphabetic languages
- Focus of existing work
- Example
- English
- Indo-European
- Logogram languages
- Example
- Chinese
- Hanzi
- Challenge
- A small number of hanzi represents an entire name
- Tens of thousands of hanzi are in use
- Current methods
- Largely UNTESTED
- Coreference resolution errors
- Caused by
- Chinese name matching errors
- Focus on person names
- Challenge
- Issue: Name variations
- Nicknames
- Aliases
- Acronyms
- Differences in translation
- Exact string matching
- POOR results!
- Determine whether two strings refer to the same entity based on the strings alone
- Research
- Evaluate name matching methods
- In Chinese
- Approaches
- Existing
- String matching
- ?
- Learning
- ?
- New
- New Representation for Chinese
- Experiments
- New Representation for Chinese
- Improves
- name matching
- Entity clustering
- No details?!
- Newly developed data sets
- Matched Chinese name pairs
- Mingpipe
- Name matching tool
- Python package
- Usage
- As a standalone tool
- Integrated in a larger system
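
The notes say that exact string matching gives poor results on name variations and that string matching is one of the existing approaches. A minimal sketch of that contrast follows, assuming a character-level Levenshtein distance as the string-matching baseline. The function names, the 0.5 threshold, and the example pair are illustrative assumptions; this is not the paper's method and not the Mingpipe API.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counted per character (one hanzi = one character)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance normalised to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def is_match(a: str, b: str, threshold: float = 0.5) -> bool:
    """Hypothetical decision rule; the threshold is an assumption, not a tuned value."""
    return similarity(a, b) >= threshold


if __name__ == "__main__":
    # Two transliterations of the same surname that differ in one hanzi.
    name_a, name_b = "奥巴马", "欧巴马"
    print("exact match:", name_a == name_b)
    print("edit distance:", levenshtein(name_a, name_b))
    print("fuzzy match:", is_match(name_a, name_b))
```

Because a Chinese name is only a few hanzi long, a single differing character already removes a large share of the similarity, so a threshold that works for alphabetic names may not transfer. That is consistent with the challenge noted above that a small number of hanzi represents an entire name.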