54 results for Natural Language Queries, NLPX, Bricks, XML-IR, Users


Relevance:

100.00%

Publisher:

Abstract:

This work studies the semantic properties of a group of motion verbs and their corresponding arguments. Information about the type of complement each verb requires is important for determining the syntactic structure of the sentence and for offering practical solutions in Natural Language Processing tasks. The analysis focuses on the verbs conduir ('drive'), navegar ('sail') and volar ('fly'), starting from the basic senses that the Diccionari d'ús dels verbs catalans (DUVC) describes for each of these verbs and from their selectional restrictions. Using about a hundred sentences extracted from the Corpus d'Ús del Català a la Web of Universitat Pompeu Fabra and the Corpus Textual Informatitzat de la Llengua Catalana of the Institut d'Estudis Catalans, we check whether only the senses and uses described in the DUVC actually occur in the language and which of them are the most frequent. Finally, we describe the nouns that act as argument heads in terms of semantic features.
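
Purely as an illustration of how selectional restrictions of this kind could be operationalized, the sketch below assigns hypothetical feature sets to verb senses and to candidate argument head nouns, and accepts a corpus-attested head only if it carries every feature the sense requires. The verbs are the ones studied here, but the sense labels, features and nouns are invented for the example (Python).

    # Hypothetical selectional restrictions for the direct object of each verb sense.
    RESTRICTIONS = {
        ("conduir", "basic-1"): {"+vehicle", "+land"},   # drive a land vehicle
        ("navegar", "basic-1"): {"+vehicle", "+water"},  # sail a watercraft
        ("volar",   "basic-1"): {"+vehicle", "+air"},    # fly an aircraft
    }

    # Hypothetical semantic features of candidate argument head nouns.
    NOUN_FEATURES = {
        "cotxe":   {"+vehicle", "+land"},
        "vaixell": {"+vehicle", "+water"},
        "avió":    {"+vehicle", "+air"},
        "idea":    {"-concrete"},
    }

    def satisfies(verb, sense, head_noun):
        """True if the noun carries every feature the verb sense requires."""
        required = RESTRICTIONS.get((verb, sense), set())
        return required <= NOUN_FEATURES.get(head_noun, set())

    print(satisfies("conduir", "basic-1", "cotxe"))   # True
    print(satisfies("volar", "basic-1", "idea"))      # False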

Relevance:

100.00%

Publisher:

Abstract:

In this paper we present the theoretical and methodological foundations for the development of a multi-agent Selective Dissemination of Information (SDI) service model that applies Semantic Web technologies for specialized digital libraries. These technologies make it possible to achieve more efficient information management, improve agent–user communication processes, and facilitate accurate access to relevant resources. Other tools used are fuzzy linguistic modelling techniques (which ease the interaction between users and the system) and natural language processing (NLP) techniques for semiautomatic thesaurus generation. In addition, RSS feeds are used as "current awareness bulletins" to generate personalized bibliographic alerts.
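
As a minimal sketch of the alerting idea only, not of the system described in the paper, the Python fragment below fetches a standard RSS 2.0 feed and keeps the items that mention at least one term from a user's interest profile. The feed URL and the profile terms are placeholders.

    import urllib.request
    import xml.etree.ElementTree as ET

    def fetch_items(feed_url):
        """Yield (title, description) pairs from a standard RSS 2.0 feed."""
        with urllib.request.urlopen(feed_url) as resp:
            root = ET.fromstring(resp.read())
        for item in root.iter("item"):
            yield item.findtext("title", default=""), item.findtext("description", default="")

    def personalized_alert(feed_url, profile_terms):
        """Keep items whose text mentions at least one profile term."""
        for title, desc in fetch_items(feed_url):
            text = (title + " " + desc).lower()
            if any(term.lower() in text for term in profile_terms):
                yield title

    if __name__ == "__main__":
        # Placeholder feed URL and a tiny thesaurus-based user profile.
        for hit in personalized_alert("https://example.org/library/feed.rss",
                                      {"semantic web", "digital libraries"}):
            print(hit)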

Relevance:

100.00%

Publisher:

Abstract:

Lexical resources are a critical component of Natural Language Processing applications. However, the high cost of comparing and merging different resources has been a bottleneck to obtaining richer resources with a broad range of potential uses for a significant number of languages. With the objective of reducing cost by eliminating human intervention, we present a new method for automating the merging of resources, with special emphasis on what we call the mapping step. This mapping step, which converts the resources into a common format that later allows the merging, is usually performed with huge manual effort and thus makes the whole process very costly. We therefore propose a method to perform this mapping fully automatically. To test our method, we have addressed the merging of two verb subcategorization frame lexica for Spanish. The results achieved, which almost replicate human work, demonstrate the feasibility of the approach.

Relevance:

100.00%

Publisher:

Abstract:

Lexical resources are a critical component of Natural Language Processing applications. However, the high cost of comparing and merging different resources has been a bottleneck to obtaining richer resources and a broader range of potential uses for a significant number of languages. With the objective of reducing cost by eliminating human intervention, we present a new method towards the automatic merging of resources. This method includes both the automatic mapping of the resources involved to a common format and their merging once in this format. This paper presents how we have addressed the merging of two verb subcategorization frame lexica for Spanish, but our method will be extended to cover other types of lexical resources. The results achieved, which almost replicate human work, demonstrate the feasibility of the approach.
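
To make the mapping-then-merging idea concrete, the sketch below uses two invented lexicon formats: each entry is mapped to a common (verb, frame) representation and the mapped entries are merged by union. The formats and example entries are hypothetical and are not those of the Spanish lexica used in the paper.

    from collections import defaultdict

    # Lexicon A encodes frames as strings like "subj#obj".
    LEXICON_A = {"comprar": ["subj#obj", "subj"], "dormir": ["subj"]}

    # Lexicon B encodes frames as lists of argument labels.
    LEXICON_B = {"comprar": [["subj", "obj", "iobj"]], "correr": [["subj"]]}

    def map_a(lexicon):
        """Map lexicon A entries to the common format: a set of label tuples per verb."""
        mapped = defaultdict(set)
        for verb, frames in lexicon.items():
            for frame in frames:
                mapped[verb].add(tuple(frame.split("#")))
        return mapped

    def map_b(lexicon):
        """Map lexicon B entries to the same common format."""
        mapped = defaultdict(set)
        for verb, frames in lexicon.items():
            for frame in frames:
                mapped[verb].add(tuple(frame))
        return mapped

    def merge(*mapped_lexica):
        """Union of frames per verb once everything is in the common format."""
        merged = defaultdict(set)
        for lex in mapped_lexica:
            for verb, frames in lex.items():
                merged[verb] |= frames
        return merged

    print(dict(merge(map_a(LEXICON_A), map_b(LEXICON_B))))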

Relevance:

100.00%

Publisher:

Abstract:

This research investigates the phenomenon of translationese in two monolingual comparable corpora of original and translated Catalan texts. Translationese has been defined as the dialect, sub-language or code of translated language. This study aims at giving empirical evidence of translation universals regardless of the source language. Traditionally, research on translation strategies has been mainly intuition-based. Computational Linguistics and Natural Language Processing techniques provide reliable information on lexical frequencies and on morphological and syntactic distribution in corpora, and they have therefore been applied to observe which translation strategies occur in these corpora. The results seem to support the simplification, interference and explicitation hypotheses, whereas no sign of normalization has been detected with the methodology used. The data collected and the resources created for identifying lexical, morphological and syntactic patterns of translations can be useful for Translation Studies teachers, scholars and students: teachers will have more tools to help students avoid reproducing translationese patterns, and the resources developed will help in detecting non-genuine or inadequate structures in the target language, which may imply an improvement in the stylistic quality of translations. Translation professionals can also take advantage of these resources to improve their translation quality.
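
As a small illustration of one simplification indicator of the kind such studies rely on, the sketch below compares type/token ratio and average sentence length between a file of originals and a file of translations. The file names are placeholders, and the paper's actual feature set (lexical, morphological and syntactic) is much richer.

    import re

    def stats(text):
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        tokens = re.findall(r"\w+", text.lower())
        types = set(tokens)
        return {
            "type_token_ratio": len(types) / len(tokens) if tokens else 0.0,
            "avg_sentence_length": len(tokens) / len(sentences) if sentences else 0.0,
        }

    if __name__ == "__main__":
        for label, path in [("originals", "originals.txt"),
                            ("translations", "translations.txt")]:
            with open(path, encoding="utf-8") as f:
                print(label, stats(f.read()))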

Relevance:

100.00%

Publisher:

Abstract:

By providing a better understanding of paraphrase and coreference in terms of similarities and differences in their linguistic nature, this article delimits what the focus of paraphrase extraction and coreference resolution tasks should be, and to what extent they can help each other. We argue for the relevance of this discussion to Natural Language Processing.

Relevance:

100.00%

Publisher:

Abstract:

Finding an adequate paraphrase representation formalism is a challenging issue in Natural Language Processing. In this paper, we analyse the performance of Tree Edit Distance as a paraphrase representation baseline. Our experiments using the Edit Distance Textual Entailment Suite show that, since Tree Edit Distance is a purely syntactic approach, paraphrase alternations that are not based on structural reorganizations do not find an adequate representation. They also show that there is much scope for better modelling of the way trees are aligned.
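
The following minimal sketch, assuming the third-party zss package (a Python implementation of the Zhang-Shasha tree edit distance), compares two toy dependency-like trees for a paraphrase pair. The trees are invented for the example; the paper's experiments use the Edit Distance Textual Entailment Suite rather than this toy data.

    from zss import Node, simple_distance

    # "The company fired the employee."
    t1 = (Node("fired")
          .addkid(Node("company").addkid(Node("the")))
          .addkid(Node("employee").addkid(Node("the"))))

    # "The employee was dismissed by the company."
    t2 = (Node("dismissed")
          .addkid(Node("employee").addkid(Node("the")))
          .addkid(Node("was"))
          .addkid(Node("company").addkid(Node("by")).addkid(Node("the"))))

    # Number of node insertions, deletions and relabellings needed to turn t1
    # into t2: a purely syntactic score, which is exactly the limitation noted
    # above for paraphrase alternations that are not structural reorganizations.
    print(simple_distance(t1, t2))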

Relevance:

100.00%

Publisher:

Abstract:

In this paper, we present a critical analysis of the state of the art in the definition and typologies of paraphrasing. This analysis shows that no existing characterization of paraphrasing is at once comprehensive, linguistically based and computationally tractable. We then set out to define and delimit the concept on the basis of propositional content, and present a general, inclusive and computationally oriented typology of the linguistic mechanisms that give rise to form variations between paraphrase pairs.

Relevance:

100.00%

Publisher:

Abstract:

Although paraphrasing is the linguistic mechanism underlying many plagiarism cases, little attention has been paid to its analysis in the framework of automatic plagiarism detection. Therefore, state-of-the-art plagiarism detectors find it difficult to detect cases of paraphrase plagiarism. In this article, we analyse the relationship between paraphrasing and plagiarism, paying special attention to which paraphrase phenomena underlie acts of plagiarism and which of them are detected by plagiarism detection systems. With this aim in mind, we created the P4P corpus, a new resource which uses a paraphrase typology to annotate a subset of the PAN-PC-10 corpus for automatic plagiarism detection. The results of the Second International Competition on Plagiarism Detection were analysed in the light of this annotation. The presented experiments show that (i) more complex paraphrase phenomena and a high density of paraphrase mechanisms make plagiarism detection more difficult, (ii) lexical substitutions are the paraphrase mechanisms used the most when plagiarising, and (iii) paraphrase mechanisms tend to shorten the plagiarized text. For the first time, the paraphrase mechanisms behind plagiarism have been analysed, providing critical insights for the improvement of automatic plagiarism detection systems.
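
As an illustration of two of the corpus-level measurements behind findings (i) and (iii), the sketch below computes paraphrase-mechanism density and a source-to-plagiarism length ratio over a hypothetical annotation format; the example case and field names are invented and are not the P4P annotation scheme itself.

    # Each case: source fragment, plagiarized fragment, annotated paraphrase mechanisms.
    cases = [
        {"source": "The results were surprising to the researchers.",
         "plagiarized": "The findings surprised the researchers.",
         "mechanisms": ["same-polarity substitution", "diathesis alternation"]},
    ]

    for case in cases:
        src_len = len(case["source"].split())
        plg_len = len(case["plagiarized"].split())
        density = len(case["mechanisms"]) / plg_len   # mechanisms per token
        length_ratio = plg_len / src_len              # < 1 means the plagiarized text is shorter
        print("density=%.2f length_ratio=%.2f" % (density, length_ratio))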

Relevance:

100.00%

Publisher:

Abstract:

In this paper we present ClInt (Clinical Interview), a bilingual Spanish-Catalan spoken corpus that contains 15 hours of clinical interviews. It consists of audio files aligned with multiple-level transcriptions comprising orthographic, phonetic and morphological information, as well as linguistic and extralinguistic encoding. No such resource previously existed for these languages, and it offers wide-ranging exploitation potential in a broad variety of disciplines such as Linguistics, Natural Language Processing and related fields.

Relevance:

100.00%

Publisher:

Abstract:

CoCo is a collaborative web interface for the compilation of linguistic resources. In this demo we are presenting one of its possible applications: paraphrase acquisition.

Relevance:

30.00%

Publisher:

Abstract:

This project presents the development of an application that translates Coloured Petri Nets designed in CPN Tools into a language used to generate input files for a Coloured Petri Net simulator/optimizer. In this way, models created in CPN Tools can be optimized, since that tool itself does not support optimization. The whole project has been implemented in C++.
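
A minimal sketch of the translation step, written in Python for brevity although the actual project is in C++: it reads a net saved by CPN Tools as XML and emits a flat text format that a simulator/optimizer could read. The element and attribute names below are a simplified, assumed subset of the real .cpn schema, and the file names are placeholders.

    import xml.etree.ElementTree as ET

    def translate(cpn_path, out_path):
        root = ET.parse(cpn_path).getroot()
        with open(out_path, "w", encoding="utf-8") as out:
            for place in root.iter("place"):      # assumed element name
                out.write("PLACE %s\n" % place.get("id"))
            for trans in root.iter("trans"):      # assumed element name
                out.write("TRANSITION %s\n" % trans.get("id"))
            for arc in root.iter("arc"):          # assumed element and attribute names
                out.write("ARC %s %s\n" % (arc.get("id"), arc.get("orientation")))

    if __name__ == "__main__":
        translate("model.cpn", "model.sim")       # placeholder file names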

Relevance:

30.00%

Publisher:

Abstract:

The study tested three analytic tools applied in SLA research (T-unit, AS-unit and Idea-unit) against FL learner monologic oral data. The objective was to analyse their effectiveness for assessing the complexity of learners' academic production in English. The data were learners' individual productions gathered during the implementation of a CLIL teaching sequence on Natural Sciences in a Catalan state secondary school. The analysis showed that only the AS-unit was easily applicable and highly effective for segmenting the data and taking complexity measures.
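
To illustrate the kind of complexity measures such segmentation supports, the sketch below assumes transcripts that have already been segmented by hand into AS-units and clauses, and computes two common indices. The sample data are invented.

    # Each AS-unit is given as the list of its clauses.
    as_units = [
        ["I think", "that the water evaporates"],   # 2 clauses
        ["then it condenses into clouds"],          # 1 clause
    ]

    tokens = sum(len(clause.split()) for unit in as_units for clause in unit)
    clauses = sum(len(unit) for unit in as_units)

    print("mean length of AS-unit (words):", tokens / len(as_units))
    print("clauses per AS-unit:", clauses / len(as_units))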

Relevance:

30.00%

Publisher:

Abstract:

The project focuses on the development of a collector of news items published on a long list of blogs, continuously extended by the developer and by users adding their favourite blogs. The application performs continuous news collection by checking for new posts on each of the blogs registered in the application. Each item is run through a classifier by language and by topic, and it is linked to other existing news items if they deal with the same subject. The application lets the user choose among the topics offered and the language in which a news item was published. During development, the aim was to make the platform as compatible as possible with current technology, using several programming languages to implement the algorithms needed for the overall application; in order of use: Php, Matlab, Html, MySql, CSS3, Javascript and XML. It is worth noting that the project offers convenience to blog readers who so often run into already-read news items across the different blogs they follow.
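
As a minimal sketch of the per-item classification step only, not of the deployed pipeline, the Python fragment below assigns a language to a new post by stopword counting and a topic by keyword matching. The stopword and keyword lists are tiny placeholders invented for the example.

    STOPWORDS = {
        "ca": {"el", "la", "els", "les", "i", "amb", "per"},
        "es": {"el", "la", "los", "las", "y", "con", "para"},
        "en": {"the", "and", "with", "for", "of"},
    }
    TOPICS = {
        "sports": {"match", "league", "partit", "lliga"},
        "technology": {"software", "android", "web"},
    }

    def classify(text):
        """Return (language, topic) with the largest overlap with the post's tokens."""
        tokens = set(text.lower().split())
        language = max(STOPWORDS, key=lambda lang: len(tokens & STOPWORDS[lang]))
        topic = max(TOPICS, key=lambda t: len(tokens & TOPICS[t]))
        return language, topic

    if __name__ == "__main__":
        print(classify("Nou web per seguir la lliga amb el mòbil"))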