7 results for Spelling mistakes

at Universidad de Alicante


Relevance:

20.00%

Abstract:

The large amount of text produced every day on the Web has made it one of the main sources of linguistic corpora, which are then analyzed with Natural Language Processing techniques. On a global scale, languages such as Portuguese (official in 9 countries) appear on the Web in several varieties, which differ lexically, morphologically and syntactically, among other ways. In addition, a unified spelling system for Portuguese was recently approved, and its implementation has already begun in some countries. However, the transition will take several years, so different varieties and spelling systems coexist. Since PoS-taggers for Portuguese are built for a particular variety, this work analyzes different combinations of training corpora and lexica aimed at building a model that annotates with high precision across several varieties and spelling systems of the language. Moreover, this paper presents different dictionaries of the new orthography (Spelling Agreement) as well as a new, freely available test corpus containing different varieties and text types.

Relevance:

10.00%

Abstract:

Biotic indices have been developed to summarise the information provided by benthic macroinvertebrates, but their use can require specialized taxonomic expertise and time-consuming processing. Using a high taxonomic level in biotic indices reduces sample processing time but should be considered with caution, since assigning a tolerance level to high taxonomic levels may introduce uncertainty. A methodology for family-level tolerance categorization, based on the affinity of each family with disturbed or undisturbed conditions, was employed. This family tolerance classification approach was tested in two different areas of the Mediterranean Sea affected by sewage discharges. Biotic indices applied at family level responded correctly to the presence of sewage. However, in areas where communities differ among stations and species diversity within each family is high, assigning the same tolerance level to a whole family could lead to mistakes. Thus, the use of a high taxonomic level in biotic indices should be restricted to areas where the community is homogeneous and families have similar species composition across sites.

Relevance:

10.00%

Abstract:

The need to digitise music scores has led to the development of Optical Music Recognition (OMR) tools. Unfortunately, the performance of these systems is still far from acceptable. This forces the user to take part in the process in order to correct the mistakes made during recognition. However, the correction is performed on the output of the system, so these interventions are not exploited to improve recognition performance. This work sets out a scenario in which human and machine interact to complete the OMR task accurately with the least possible effort for the user.

Relevance:

10.00%

Abstract:

Information and communication technologies are making geographic information accessible to a growing number of professionals through Geographic Information Technologies. Multidisciplinary intervention in the territory enriches research and the ways in which these technological resources are applied. However, this technological ease carries the risk of improper use, whether through a lack of technical knowledge suited to the complexity of geographic information or through misuse of software applications. Cadastral work can benefit greatly from these geographic information technologies, which facilitate its use, communication and electronic administration, but ignorance of the geometric and topological properties of geographic information can lead non-specialist professionals to make errors with serious consequences. In this article we present the results of the research work of various jurists and technicians, aimed at developing automated methods and software applications that allow specialists who are not experts in cartography to use this type of information with guarantees of accuracy at the highest level, as an effective way for geographic information of topological quality to strengthen legal certainty in real estate transactions.

Relevance:

10.00%

Abstract:

Information Retrieval systems normally have to work with rather heterogeneous sources, such as Web sites or documents produced by Optical Character Recognition tools. Converting these sources correctly into flat text files is not a trivial task, since noise may easily be introduced as a result of spelling or typesetting errors. Interestingly, this is not a great drawback when the corpus is sufficiently large, since redundancy helps to overcome noise problems. However, noise becomes a serious problem in restricted-domain Information Retrieval, especially when the corpus is small and has little or no redundancy. This paper devises an approach that adds noise tolerance to Information Retrieval systems. A set of experiments carried out in the agricultural domain demonstrates the effectiveness of the approach presented.
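
The abstract does not say how noise tolerance is achieved, so the following is only an illustrative sketch of one common technique, approximate term matching by edit distance, and not the method of the paper. The function names (`edit_distance`, `build_index`, `noisy_search`), the `max_distance` parameter and the three sample documents are all hypothetical.

```python
from typing import Dict, Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lower-cased token to the ids of the documents containing it."""
    index: Dict[str, Set[str]] = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index.setdefault(token, set()).add(doc_id)
    return index


def noisy_search(index: Dict[str, Set[str]], query: str,
                 max_distance: int = 1) -> Set[str]:
    """Return documents with a token within max_distance edits of a query term."""
    hits: Set[str] = set()
    for term in query.lower().split():
        for indexed_term, doc_ids in index.items():
            if edit_distance(term, indexed_term) <= max_distance:
                hits |= doc_ids
    return hits


# Hypothetical small corpus; "tomatoe" stands in for a spelling or OCR error.
docs = {
    "d1": "irrigation schedule for tomato crops",
    "d2": "tomatoe blight treatment",
    "d3": "wheat harvest yield",
}
index = build_index(docs)
exact_hits = noisy_search(index, "tomato", max_distance=0)
tolerant_hits = noisy_search(index, "tomato", max_distance=1)
```

With `max_distance=0` the search behaves like exact matching and misses the misspelled document, while allowing one edit recovers it. In a small corpus with little redundancy, that single missed document may be the only relevant one, which is the situation the abstract describes. The linear scan over the vocabulary is acceptable only because the sketch assumes a small index.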

Relevance:

10.00%

Abstract:

Diversity-based design, the goal of ensuring that web-based information is accessible to as many diverse users as possible, has received growing international acceptance in recent years, with many countries introducing legislation to enforce it. This paper analyses web content accessibility levels in Spanish education portals according to the international guidelines established by the World Wide Web Consortium (W3C) and the Web Accessibility Initiative (WAI). Additionally, it proposes calculating an inaccessibility rate as a tool for measuring the degree of non-compliance with WAI Guidelines 2.0 and for illustrating the significant gap that separates people with disabilities from digital education environments (with a 7.77% average). A total of twenty-one educational web portals, each at two different web depth levels (42 sampling units), were assessed for this purpose using the automated analysis tool Web Accessibility Test 2.0 (TAW, for its initials in Spanish). The study reveals a general trend towards non-compliance with the technical accessibility recommendations issued by the W3C-WAI group (97.62% of the websites examined present mistakes in Level A conformance). Furthermore, despite the growing number of legal and regulatory measures on accessibility, their practical application remains unsatisfactory. Greater involvement is needed to raise awareness and enhance training efforts towards accessibility in the context of collective Information and Communication Technologies (ICTs), since this represents not only a necessity but also an ethical, social, political and legal commitment to be assumed by society.

Relevance:

10.00%

Abstract:

The forthcoming reopening of Valencian public radio and television, given the years of failed management at RTVV, raises questions about how the creation of the new public body is being carried out, and about what measures are being taken to guarantee the political independence of the new corporation and avoid the mistakes of the past. The aim of this study is to compare the mechanisms and tools that RTVV had with those that the new body will be given, since a priori they will mark the difference between the two. To this end, the research is based on a content analysis of the legislation governing both RTVV and the future Corporación Valenciana de Medios de Comunicación, taking as a reference the expert report Bases per a la renovació de l’espai comunicatiu valencià i la restitució del servei públic de radiotelevisió. Finally, it was concluded that some of the internal self-regulation tools proposed today were already part of the former RTVV. These bodies did not fulfil their purpose because they were politicised, which was ultimately the main cause of its decline. Nevertheless, there were also shortcomings in external regulation, which are being addressed in this new stage in order to strengthen guarantees of independence and professional ethical compliance.