47 results for Language processing


Relevance:

60.00%

Abstract:

By providing a better understanding of paraphrase and coreference in terms of similarities and differences in their linguistic nature, this article delimits what the focus of paraphrase extraction and coreference resolution tasks should be, and to what extent they can help each other. We argue for the relevance of this discussion to Natural Language Processing.

Relevance:

60.00%

Abstract:

Finding an adequate paraphrase representation formalism is a challenging issue in Natural Language Processing. In this paper, we analyse the performance of Tree Edit Distance as a paraphrase representation baseline. Our experiments using the Edit Distance Textual Entailment Suite show that, because Tree Edit Distance is a purely syntactic approach, paraphrase alternations not based on structural reorganizations do not find an adequate representation. They also show that there is much scope for better modelling of the way trees are aligned.
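
As a rough illustration of the baseline this abstract names, the following sketch computes a unit-cost edit distance between two small ordered trees using the standard rightmost-root recursion. It is not the Edit Distance Textual Entailment Suite or the authors' setup; the tree encoding, the helper names (`tree`, `forest_distance`, `tree_edit_distance`) and the toy parse trees are all assumptions made for this example.

```python
from functools import lru_cache

def tree(label, *children):
    """Hypothetical encoding: a tree is a (label, children) pair of tuples."""
    return (label, tuple(children))

@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests (tuples of trees)."""
    if not f and not g:
        return 0
    if not f:
        # Insert the rightmost root of g; its children take its place.
        _, g_kids = g[-1]
        return forest_distance(f, g[:-1] + g_kids) + 1
    if not g:
        # Delete the rightmost root of f; its children take its place.
        _, f_kids = f[-1]
        return forest_distance(f[:-1] + f_kids, g) + 1
    (f_label, f_kids), (g_label, g_kids) = f[-1], g[-1]
    delete = forest_distance(f[:-1] + f_kids, g) + 1
    insert = forest_distance(f, g[:-1] + g_kids) + 1
    # Match the two rightmost roots (relabelling if needed) and
    # align their subtrees and the remaining forests separately.
    match = (forest_distance(f[:-1], g[:-1])
             + forest_distance(f_kids, g_kids)
             + (0 if f_label == g_label else 1))
    return min(delete, insert, match)

def tree_edit_distance(t1, t2):
    return forest_distance((t1,), (t2,))

# Toy parse trees that differ only in one word (assumed example data).
s1 = tree("S", tree("NP", tree("cat")), tree("VP", tree("sleeps")))
s2 = tree("S", tree("NP", tree("cat")), tree("VP", tree("naps")))
print(tree_edit_distance(s1, s2))
```

With unit costs, swapping one word for another is charged like any other relabelling, whether or not the two words are synonyms. This is one way to see why a purely structural measure may not capture lexical paraphrase.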

Relevance:

60.00%

Abstract:

In this paper, we present a critical analysis of the state of the art in the definition and typologies of paraphrasing. This analysis shows that there exists no characterization of paraphrasing that is comprehensive, linguistically based and computationally tractable at the same time. The following sets out to define and delimit the concept on the basis of the propositional content. We present a general, inclusive and computationally oriented typology of the linguistic mechanisms that give rise to form variations between paraphrase pairs.

Relevance:

60.00%

Abstract:

Although paraphrasing is the linguistic mechanism underlying many plagiarism cases, little attention has been paid to its analysis in the framework of automatic plagiarism detection. Therefore, state-of-the-art plagiarism detectors find it difficult to detect cases of paraphrase plagiarism. In this article, we analyse the relationship between paraphrasing and plagiarism, paying special attention to which paraphrase phenomena underlie acts of plagiarism and which of them are detected by plagiarism detection systems. With this aim in mind, we created the P4P corpus, a new resource which uses a paraphrase typology to annotate a subset of the PAN-PC-10 corpus for automatic plagiarism detection. The results of the Second International Competition on Plagiarism Detection were analysed in the light of this annotation. The presented experiments show that (i) more complex paraphrase phenomena and a high density of paraphrase mechanisms make plagiarism detection more difficult, (ii) lexical substitutions are the paraphrase mechanisms used the most when plagiarising, and (iii) paraphrase mechanisms tend to shorten the plagiarized text. For the first time, the paraphrase mechanisms behind plagiarism have been analysed, providing critical insights for the improvement of automatic plagiarism detection systems.

Relevance:

60.00%

Abstract:

In this work, two television programmes are transcribed in order to determine what kind of language these media use to address their audience. It would be absurd, however, to ignore other channels for which language is essential. I am referring above all to cinema. Although it is not considered a mass medium, it is also a very important element with regard to the treatment and transmission of language, and many film productions end up being shown on television. The written press and, as a special case, the Internet also have a lot to say on the matter.

Relevance:

60.00%

Abstract:

A crucial step for understanding how lexical knowledge is represented is to describe the relative similarity of lexical items, and how it influences language processing. Previous studies of the effects of form similarity on word production have reported conflicting results, notably within and across languages. The aim of the present study was to clarify this empirical issue to provide specific constraints for theoretical models of language production. We investigated the role of phonological neighborhood density in a large-scale picture naming experiment using fine-grained statistical models. The results showed that increasing phonological neighborhood density has a detrimental effect on naming latencies, and re-analyses of independently obtained data sets provide supplementary evidence for this effect. Finally, we reviewed a large body of evidence concerning phonological neighborhood density effects in word production, and discussed the occurrence of facilitatory and inhibitory effects in accuracy measures. The overall pattern shows that phonological neighborhood generates two opposite forces, one facilitatory and one inhibitory. In cases where speech production is disrupted (e.g. certain aphasic symptoms), the facilitatory component may emerge, but inhibitory processes dominate in efficient naming by healthy speakers. These findings are difficult to accommodate in terms of monitoring processes, but can be explained within interactive activation accounts combining phonological facilitation and lexical competition.

Relevance:

60.00%

Abstract:

Prediction filters are well-known models for signal estimation in communications, control and many other areas. The classical method for deriving linear prediction coding (LPC) filters is often based on the minimization of a mean square error (MSE). Consequently, only second-order statistics are required, but the estimation is only optimal if the residue is independent and identically distributed (iid) Gaussian. In this paper, we derive the maximum likelihood (ML) estimate of the prediction filter. Relationships with robust estimation of auto-regressive (AR) processes, with blind deconvolution and with source separation based on mutual information minimization are then detailed. The algorithm, based on the minimization of a high-order statistics criterion, uses on-line estimation of the residue statistics. Experimental results highlight the interest of this approach.
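
For orientation, the classical MSE-based derivation mentioned in this abstract can be sketched with the autocorrelation method and the Levinson-Durbin recursion, which uses only second-order statistics. This is the textbook baseline, not the ML estimator the paper derives; the function names and the toy signal are assumptions made for this example.

```python
def autocorrelation(x, max_lag):
    """Biased (unnormalised) autocorrelation r[0..max_lag] of a sequence."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the MSE normal equations for the prediction-error filter.

    Returns (a, err) with a[0] == 1, so that the prediction of x[n]
    is -sum(a[j] * x[n - j] for j in 1..order), and err is the
    residual energy. Assumes r[0] > 0.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

# Toy signal (assumed data): x[n] = 0.5 * x[n-1], so a first-order
# predictor with coefficient 0.5 should explain it almost exactly.
signal = [0.5 ** n for n in range(50)]
r = autocorrelation(signal, 2)
coeffs, err = levinson_durbin(r, 2)
print(coeffs, err)
```

Because the criterion depends only on the autocorrelation values, any information in higher-order statistics of the residue is ignored. That is the limitation the abstract points to when the residue is not Gaussian.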

Relevance:

60.00%

Abstract:

In this paper we present ClInt (Clinical Interview), a bilingual Spanish-Catalan spoken corpus that contains 15 hours of clinical interviews. It consists of audio files aligned with multiple-level transcriptions comprising orthographic, phonetic and morphological information, as well as linguistic and extralinguistic encoding. This is a previously non-existent resource for these languages and it offers a wide-ranging exploitation potential in a broad variety of disciplines such as Linguistics, Natural Language Processing and related fields.

Relevance:

60.00%

Abstract:

CoCo is a collaborative web interface for the compilation of linguistic resources. In this demo we present one of its possible applications: paraphrase acquisition.

Relevance:

60.00%

Abstract:

The starting point of our investigation was the longstanding notion that bilingual individuals need effective mechanisms to prevent interference from one language while processing material in the other (e.g. Penfield and Roberts, 1959). To demonstrate how the prevention of interference is implemented in the brain we employed event-related brain potentials (ERPs; see Münte, Urbach, Düzel and Kutas, 2000, for an introductory review) and functional magnetic resonance imaging (fMRI) techniques, thus pursuing a combined temporal and spatial imaging approach. In contrast to previous investigations using neuroimaging techniques in bilinguals, which had been mainly concerned with the localization of the primary and secondary languages (e.g. Perani, Paulesu, Galles, Dupoux, Dehaene, Bettinardi, Cappa, Fazio and Mehler, 1998; Chee, Caplan, Soon, Sriram, Tan, Thiel and Weekes, 1999), our study addressed the dynamic aspects of bilingual language processing.

Relevance:

60.00%

Abstract:

An important issue in language learning is how new words are integrated in the brain representations that sustain language processing. To identify the brain regions involved in meaning acquisition and word learning, we conducted a functional magnetic resonance imaging study. Young participants were required to deduce the meaning of a novel word presented within increasingly constrained sentence contexts that were read silently during the scanning session. Inconsistent contexts were also presented in which no meaning could be assigned to the novel word. Participants showed meaning acquisition in the consistent but not in the inconsistent condition. A distributed brain network was identified comprising the left anterior inferior frontal gyrus (BA 45), the middle temporal gyrus (BA 21), the parahippocampal gyrus, and several subcortical structures (the thalamus and the striatum). Drawing on previous neuroimaging evidence, we tentatively identify the roles of these brain areas in the retrieval, selection, and encoding of the meaning.

Relevance:

30.00%

Abstract:

This paper presents the platform developed in the PANACEA project, a distributed factory that automates the stages involved in the acquisition, production, updating and maintenance of Language Resources required by Machine Translation and other Language Technologies. We adopt a set of tools that have been successfully used in the Bioinformatics field, adapt them to the needs of our field, and use them to deploy web services, which can be combined to build more complex processing chains (workflows). This paper describes the platform and its different components (web services, registry, workflows, social network and interoperability). We demonstrate the scalability of the platform by carrying out a set of massive data experiments. Finally, a validation of the platform across a set of required criteria proves its usability for different types of users (non-technical users and providers).

Relevance:

20.00%

Abstract:

In recent years, a large number of multimedia materials for language learning have been published, most of which are CD-ROMs designed as self-study courses. With these materials, students can work independently without the guidance of a teacher, and for this reason it has been claimed that they promote and facilitate autonomous learning. This relationship does not hold, however, as Phil Benson and Peter Voller (1997:10) have rightly pointed out: "(…) Such claims are often dubious, however, because of the limited range of options and roles offered to the learner. Nevertheless, technologies of education in the broadest sense can be considered to be either more or less supportive of autonomy. The question is what kind of criteria do we apply in evaluating them?" In this article we present a joint investigation that defines the criteria that can be used to evaluate multimedia materials with regard to how readily they allow autonomous learning. These criteria form the basis of a questionnaire that was used to evaluate a selection of CD-ROMs intended for language self-study. The article is structured as follows:
- An introduction to the study
- The criteria used to create the questionnaire
- The general results of the evaluation
- The conclusions drawn and their importance for multimedia instructional design

Relevance:

20.00%

Abstract:

Research on language learning strategies has shown that learners who use metacognitive strategies (planning, review and evaluation) develop more effective cognitive strategies (Anderson, 2002). This article describes the activities that 43 foreign language students at the Universitat de Vic undertook independently and infers the metacognitive strategies they used without any prior strategy training. The students completed a dossier in which they set out their learning needs, the planning and monitoring of their activities and, finally, the evaluation of the learning they had carried out independently outside class hours. The first phase of the data analysis reveals that, although the students were able to express their learning needs in general terms, they formulated few objectives and did little monitoring of their activities. The discussion centres on training foreign language students in metacognitive strategies and on integrating autonomous learning into the teaching curriculum.

Relevance:

20.00%

Abstract:

Many studies today stress the need to offer methodological and psychological support to learners who work autonomously. The aim of this support is to help them develop the skills they need to direct their learning, as well as a positive attitude towards that learning and a greater awareness of it. In short, these two types of preparation are considered essential to help learners become more autonomous and more efficient in their own learning. Nevertheless, while it is common to find studies that exemplify applications of methodological support within their programmes, mainly in strategy training or in helping learners develop a work plan, this is not the case when it comes to psychological preparation. Only rarely do we find studies that document how learners' attitudes and beliefs, also known as metacognitive knowledge (MK), are addressed in programmes that foster autonomy in learning. This work has two aims: (a) to offer a review of studies that have used different means to influence learners' MK, and (b) to describe the weaknesses and advantages of the procedures and instruments they use, as assessed in research studies, since this will allow us to establish objective criteria on how and when to use them in programmes that foster self-directed learning.