Building and Using Existing Hunspell Dictionaries and TeX Hyphenators as Finite-State Automata


Author(s): Pirinen, Tommi; Linden, Krister
Contributor(s)

University of Helsinki, Käyttäytymistieteellisen tiedekunnan kanslia

University of Helsinki, Department of Modern Languages

Date(s)

01/10/2010

Abstract

There are numerous formats for writing spell-checkers for open-source systems, and many languages have descriptions written in these formats. Similarly, there are TeX hyphenation rules for many languages. In this paper we demonstrate a method for converting these spell-checking lexicons and hyphenation rule sets into finite-state automata, and present a new finite-state based system for writer’s tools used in current open-source software such as Firefox, OpenOffice.org and enchant via the spell-checking library voikko.

Format

8

Identifier

http://hdl.handle.net/10138/29360

Language(s)

eng

Relation

Proceedings of International Multiconference on Computer Science and Information Technology: Computational Linguistics—Applications (CLA'10)

Proceedings of the International Multiconference on Computer Science and Information Technology

Source

Pirinen, T & Linden, K 2010, 'Building and Using Existing Hunspell Dictionaries and TeX Hyphenators as Finite-State Automata', in Proceedings of International Multiconference on Computer Science and Information Technology: Computational Linguistics—Applications (CLA'10), pp. 477–484, Proceedings of the International Multiconference on Computer Science and Information Technology.

Keywords

#113 Computer and information sciences; #612 Languages and Literature

Type

A4 Article in conference publication (refereed)

info:eu-repo/semantics/conferencePaper

http://purl.org/eprint/status/NonPeerReviewed