Digital Library

3 results for Tagset.

Maca - a configurable tool to integrate Polish morphological data

Abstract:

There are a number of morphological analysers for Polish. Most of these, however, are non-free resources. What is more, different analysers employ different tagsets and tokenisation strategies. This situation calls for a simple and universal framework to join different sources of morphological information, including the existing resources as well as user-provided dictionaries. We present such a configurable framework, which allows users to write simple configuration files that define tokenisation strategies and the behaviour of morphological analysers, including simple tagset conversion.


Development of a POS Tagger for Malayalam - An Experience

Abstract:

A part-of-speech tagger for Malayalam that uses a stochastic approach has been proposed. The tagger makes use of word frequencies and bigram statistics from a corpus. A morphological analyzer is used to generate a tagged corpus because no annotated corpus is available for Malayalam. Although the experiments have been performed on a very small corpus, the results show that the statistical approach works well with a highly agglutinative language like Malayalam.
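As a rough illustration of what a tagger built on word frequencies and bigram statistics can look like, here is a minimal sketch. It is not the authors' implementation: the function names, the add-one smoothing, the Viterbi decoding, and the toy English corpus are all assumptions made for this example.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start marker


def train(tagged_sentences):
    """Count word-per-tag frequencies and tag-bigram frequencies."""
    emit = defaultdict(Counter)   # tag -> counts of words seen with that tag
    trans = defaultdict(Counter)  # previous tag -> counts of following tags
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            emit[tag][word] += 1
            trans[prev][tag] += 1
            prev = tag
    return emit, trans


def prob(counter, key, size):
    """Add-one smoothed relative frequency (an assumed smoothing choice)."""
    return (counter[key] + 1) / (sum(counter.values()) + size)


def tag(words, emit, trans):
    """Return the most probable tag sequence using Viterbi over tag bigrams."""
    if not words:
        return []
    tags = list(emit)
    vocab = {w for counts in emit.values() for w in counts}
    v = len(vocab) + 1  # one extra slot for unseen words
    t = len(tags)
    # best[i][tag] = (score of best path ending in tag, previous tag)
    best = [{tg: (prob(trans[START], tg, t) * prob(emit[tg], words[0], v), None)
             for tg in tags}]
    for i in range(1, len(words)):
        column = {}
        for tg in tags:
            score, prev = max(
                (best[i - 1][p][0] * prob(trans[p], tg, t), p) for p in tags
            )
            column[tg] = (score * prob(emit[tg], words[i], v), prev)
        best.append(column)
    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda tg: best[-1][tg][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]


# Toy corpus, invented for this sketch only.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
emit, trans = train(corpus)
print(tag(["the", "cat", "barks"], emit, trans))  # ['DET', 'NOUN', 'VERB']
```

In the setting the abstract describes, the training pairs would come from the output of a morphological analyzer rather than from a hand-annotated corpus. A real system would also work with log probabilities to avoid underflow on long sentences.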


From constituents to syntax-oriented dependencies

Abstract:

This paper describes the automatic process of building a dependency-annotated corpus based on AnCora constituent structures. The AnCora corpus already has a dependency structure information layer, but the newly annotated data applies a purely syntactic orientation and thus offers a new resource to the linguistic research community. The paper details the process of reannotating the corpus, the linguistic criteria used, and the results obtained.

