E-dictionaries and finite-state automata for the recognition of named entities
2011
Conference paper (Published version)
Abstract
In this paper we present a system for named entity recognition and tagging in Serbian that relies on large-scale lexical resources and finite-state transducers. Our system recognizes several types of names, as well as temporal and numerical expressions. Finite-state automata are used to describe the context of named entities, thus improving the precision of recognition. The widest context was used for personal names, and it included the recognition of nominal phrases describing a person's position. For the evaluation of the named entity recognition system we used a corpus of 2,300 short agency news items. Through manual evaluation we precisely identified all omissions and incorrect recognitions, which enabled the computation of recall and precision. The overall recall (R = 0.84 for types and R = 0.93 for tokens) and overall precision (P = 0.95 for types and P = 0.98 for tokens) show that our system gives priority to precision.
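The abstract reports recall and precision computed from manually identified omissions and incorrect recognitions. As a minimal sketch of how such figures are typically derived (not the authors' evaluation code), the snippet below uses the standard definitions. The function names and the example counts are hypothetical; only the four R and P values passed to `f1` come from the abstract, and the F1 scores are derived here, not reported in the record.

```python
from typing import Tuple


def precision_recall(correct: int, spurious: int, missed: int) -> Tuple[float, float]:
    """Return (precision, recall) from manually verified counts.

    correct  -- entities recognized correctly
    spurious -- incorrect recognitions (false positives)
    missed   -- omissions (false negatives)
    """
    recognized = correct + spurious   # everything the system tagged
    expected = correct + missed       # everything it should have tagged
    precision = correct / recognized if recognized else 0.0
    recall = correct / expected if expected else 0.0
    return precision, recall


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical counts, chosen only to illustrate the arithmetic.
p, r = precision_recall(correct=90, spurious=10, missed=30)
print(f"example counts: P = {p:.2f}, R = {r:.2f}")

# Values reported in the abstract; the F1 scores are derived, not reported.
print(f"types:  F1 = {f1(0.95, 0.84):.3f}")
print(f"tokens: F1 = {f1(0.98, 0.93):.3f}")
```

Counting by types scores each distinct entity once, while counting by tokens scores every occurrence, which is why the abstract gives two values for each measure.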
Keywords:
lexical resources / finite state transducers / named entity recognition / Serbian language / system evaluation
Source:
FSMNLP 2011 - Proceedings of the 9th International Workshop Finite State Methods and Natural Language Processing, 2011, 48-56
Publisher:
- Association for Computational Linguistics (ACL)
Funding / projects:
- Infrastructure for Electronically Supported Learning in Serbia (RS-MESTD-Integrated and Interdisciplinary Research (IIR or III)-47003)
Institution/group
Filološki fakultet / Faculty of Philology

TY  - CONF
AU  - Krstev, Cvetana
AU  - Vitas, Duško
AU  - Obradović, Ivan
AU  - Utvić, Miloš
PY  - 2011
UR  - https://repff.fil.bg.ac.rs/handle/123456789/569
PB  - Association for Computational Linguistics (ACL)
C3  - FSMNLP 2011 - Proceedings of the 9th International Workshop Finite State Methods and Natural Language Processing
T1  - E-dictionaries and finite-state automata for the recognition of named entities
SP  - 48
EP  - 56
UR  - conv_2102
ER  -
@conference{
  author = "Krstev, Cvetana and Vitas, Duško and Obradović, Ivan and Utvić, Miloš",
  year = "2011",
  publisher = "Association for Computational Linguistics (ACL)",
  booktitle = "FSMNLP 2011 - Proceedings of the 9th International Workshop Finite State Methods and Natural Language Processing",
  title = "E-dictionaries and finite-state automata for the recognition of named entities",
  pages = "48--56",
  url = "conv_2102"
}
Krstev, C., Vitas, D., Obradović, I., & Utvić, M. (2011). E-dictionaries and finite-state automata for the recognition of named entities. In FSMNLP 2011 - Proceedings of the 9th International Workshop Finite State Methods and Natural Language Processing. Association for Computational Linguistics (ACL), 48-56. conv_2102
Krstev C, Vitas D, Obradović I, Utvić M. E-dictionaries and finite-state automata for the recognition of named entities. In FSMNLP 2011 - Proceedings of the 9th International Workshop Finite State Methods and Natural Language Processing. 2011:48-56. conv_2102.
Krstev, Cvetana, Vitas, Duško, Obradović, Ivan, Utvić, Miloš, "E-dictionaries and finite-state automata for the recognition of named entities" in FSMNLP 2011 - Proceedings of the 9th International Workshop Finite State Methods and Natural Language Processing (2011):48-56, conv_2102.