10 results for Multilingual lexical

at Universidad de Alicante


Relevance: 20.00%

Abstract:

Doctoral thesis with European mention in natural language processing, completed at the Universidad de Alicante by Ester Boldrini under the supervision of Dr. Patricio Martínez-Barco. The thesis defense took place at the Universidad de Alicante on 23 January 2012 before a committee formed by Dr. Manuel Palomar (Universidad de Alicante), Dr. Paloma Moreda (UA), Dr. Mariona Taulé (Universidad de Barcelona), Dr. Horacio Saggion (Universitat Pompeu Fabra) and Dr. Mike Thelwall (University of Wolverhampton). Grade: Sobresaliente Cum Laude, awarded unanimously.

Relevance: 20.00%

Abstract:

Extension to new languages is a well-known bottleneck for rule-based systems. Transferring the knowledge available to the system from one language to another requires considerable human effort, which typically consists of rewriting large numbers of rules from scratch. Given sufficient annotated data, machine learning algorithms can minimize the cost of such knowledge transfer but have, to date, proved ineffective for some specific tasks. Among these, the recognition and normalization of temporal expressions remains out of their reach. Focusing on this task, and still adhering to the rule-based framework, this paper presents a set of experiments on the automatic porting to Italian of a system originally developed for Spanish. Different automatic rule translation strategies are evaluated and discussed, providing a comprehensive overview of the challenge.

Relevance: 20.00%

Abstract:

This paper presents the automatic extension to other languages of TERSEO, a knowledge-based system for the recognition and normalization of temporal expressions originally developed for Spanish. TERSEO was first extended to English through the automatic translation of the temporal expressions. Then, an improved porting process was applied to Italian, where the automatic translation of the temporal expressions from English and from Spanish was combined with the extraction of new expressions from an Italian annotated corpus. Experimental results demonstrate how, while still adhering to the rule-based paradigm, the development of automatic rule translation procedures allowed us to minimize the effort required for porting to new languages. Relying on such procedures, and without any manual effort or previous knowledge of the target language, TERSEO recognizes and normalizes temporal expressions in Italian with good results (72% precision and 83% recall for recognition).

Relevance: 20.00%

Abstract:

This paper presents an automatic system for extracting syntactic-semantic patterns, applied to the development of multilingual processing tools. To arrive at effective methods for the automatic treatment of more than one language, we propose the use of syntactic-semantic patterns. These patterns are formed by a verbal head and its main arguments, and they are aligned across languages. The system extracts and aligns syntactic-semantic patterns from two manually annotated corpora, and we evaluate the main linguistic problems that must be dealt with in the alignment process.

Relevance: 20.00%

Abstract:

In the last few years, research on textual information systems has developed widely. The goal is to improve these systems so that information stored in digital format (digital databases, document databases, and so on) can be easily located, processed and accessed. Many applications focus on information access (for example, Web search systems such as Google or Altavista). However, these applications have problems when they must access cross-language information, or when they need to show information in a language different from that of the query. This paper explores the use of syntactic-semantic patterns as a method for accessing multilingual information and reviews, in the case of Information Retrieval, where it is possible and useful to employ patterns with respect to the multilingual and interactive aspects. On the one hand, the multilingual aspects studied are those related to accessing documents in languages different from that of the query, as well as the automatic translation of the document, i.e. a machine translation system based on patterns. On the other hand, the paper examines in depth the interactive aspects related to the reformulation of a query based on the syntactic-semantic pattern of the request.

Relevance: 20.00%

Abstract:

This paper presents a proposal for a multi-modal dialogue system oriented to multilingual question answering. The system includes the following modes of access: voice, text, avatar, gestures and sign language. The proposal is oriented to the question-answering task as a user interaction mechanism. It is in the first stages of its development, and the architecture is presented here for the first time, on the basis of experience with previously developed question-answering and dialogue systems. The main objective of this research is the development of a solid platform that will permit the modular integration of the proposed architecture.

Relevance: 20.00%

Abstract:

Paper presented at the Cross-Language Evaluation Forum (CLEF 2008), Aarhus, Denmark, September 17-19, 2008.

Relevance: 20.00%

Abstract:

The goal of the project is to analyze, experiment with, and develop intelligent, interactive and multilingual Text Mining technologies as a key element of the next generation of search engines: systems with the capacity to find "the need behind the query". This new generation will provide specialized services and interfaces according to the search domain and the type of information needed. Moreover, it will integrate textual search (websites) and multimedia search (images, audio, video), and it will be able to find and organize information rather than generate ranked lists of websites.

Relevance: 20.00%

Abstract:

The English language and the Internet, both separately and taken together, are nowadays widely acknowledged as powerful forces which influence the lexico-grammatical characteristics of other languages world-wide. In fact, many authors like Crystal (2004) have pointed out the emergence of the so-called Netspeak, that is, the language used in the Net or World Wide Web; as Crystal himself (2004: 19) puts it, ‘a type of language displaying features that are unique to the Internet […] arising out of its character as a medium which is electronic, global and interactive’. This ‘language’, however, may be differently understood: either as an adaptation of the English language proper to Internet requirements and purposes, or as a new, rapidly changing language resulting from the rapid evolution or adaptation to Internet requirements of almost all world languages, for which English is a trendsetter. If the second and probably most plausible interpretation is adopted, there are three salient features of ‘Netspeak’: (a) the rapid expansion of all its new linguistic developments thanks to the Internet itself, which may lead to the generalization and widespread acceptance of new words, coinages, or meanings, hundreds of times faster than was the case with the printed media. As said above, (b) the visible influence of English, the most prevalent language on the Internet. Consequently, (c) this new language tends to reduce the ‘distance’ between English and other languages as well as the ignorance of the former by speakers of other languages, since the ‘Netspeak’ version of the latter adopts grammatical, syntactic and lexical features of English. Thus, linguistic differences may even disappear when code-switching and/or borrowing occurs, as whole fragments of English appear in other language contexts.
As a consequence of the new situation, an ideal context appears for interlanguage or multilingual word formation to thrive: puns, blends, compounds and word creativity in general find in the web the ideal place to gain rapid acceptance world-wide, as a result of fashion, coincidence, or sheer merit of the new linguistic proposals.

Relevance: 20.00%

Abstract:

One of the most important factors of recognition, belonging and identification in scientific communities is their specialized language: doctors, mathematicians and anthropologists feel they are part of a group with which they can interact because they share a common “language”. While ideology is present in all academic registers, it is in human sciences where its presence (or absence) leads to more visible linguistic phenomena. An interesting example is that of lesbian studies: as non-heterosexual members of society have become less stigmatized, lesbian studies have developed a language of their own. In our paper, we shall explore the mechanisms used in the creation of specific vocabulary in this academic area, paying special attention to the refashioning or deconstruction of meaning of established terms as a result of changes in social perception or the challenging of pre-determined meanings.