9 results for Verb phrase ellipsis

at Universidad Politécnica de Madrid


Relevance: 10.00%

Abstract:

This paper describes a preprocessing module that improves the performance of a Spanish into Spanish Sign Language (Lengua de Signos Española, LSE) translation system when training data are sparse. The module replaces Spanish words with associated tags. The list of Spanish words (vocabulary) and their tags is computed automatically by selecting, for every Spanish word, the sign with the highest probability of being its translation. This automatic tag extraction has been compared with a manual strategy and achieves almost the same improvement. The analysis also studies several alternatives for dealing with non-relevant words, that is, Spanish words not assigned to any sign. The preprocessing module has been incorporated into two well-known statistical translation architectures: a phrase-based system and a Statistical Finite State Transducer (SFST). The system has been developed for a specific application domain: the renewal of identity documents and driver's licenses. To evaluate the system, a parallel corpus of 4080 Spanish sentences and their LSE translations was used. The evaluation results revealed a significant performance improvement when the preprocessing module was included. In the phrase-based system, the module increased BLEU (Bilingual Evaluation Understudy) from 73.8% to 81.0% and the human evaluation score from 0.64 to 0.83. For the SFST, BLEU increased from 70.6% to 78.4% and the human evaluation score from 0.65 to 0.82.
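The tag-replacement idea in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the lexicon values, the `min_prob` threshold, the `NON_RELEVANT` tag and the function names are all assumptions made for the example. It covers only two of the possible strategies for non-relevant words (tagging them or deleting them).

```python
from typing import Dict, List, Optional

# Hypothetical word-to-sign translation probabilities, such as might be
# estimated from a word-aligned parallel corpus. All values are invented.
LEXICON: Dict[str, Dict[str, float]] = {
    "renovar": {"RENOVAR": 0.9, "CAMBIAR": 0.1},
    "carné": {"CARNET": 0.8, "DNI": 0.2},
    "el": {"CARNET": 0.05},
    "quiero": {"QUERER": 0.95},
}


def build_tag_table(lexicon: Dict[str, Dict[str, float]],
                    min_prob: float = 0.2) -> Dict[str, str]:
    """Assign each word the sign with the highest translation probability.

    Words whose best sign falls below min_prob (an assumed cut-off) get no
    tag and are therefore treated as non-relevant.
    """
    table: Dict[str, str] = {}
    for word, signs in lexicon.items():
        if not signs:
            continue
        sign, prob = max(signs.items(), key=lambda item: item[1])
        if prob >= min_prob:
            table[word] = sign
    return table


def preprocess(sentence: str, table: Dict[str, str],
               non_relevant_tag: Optional[str] = "NON_RELEVANT") -> List[str]:
    """Replace words with tags; tag or drop words that have no sign."""
    output: List[str] = []
    for word in sentence.lower().split():
        tag = table.get(word)
        if tag is not None:
            output.append(tag)
        elif non_relevant_tag is not None:
            output.append(non_relevant_tag)  # strategy 1: mark the word
        # strategy 2 (non_relevant_tag=None): silently delete the word
    return output


if __name__ == "__main__":
    tags = build_tag_table(LEXICON)
    print(preprocess("Quiero renovar el carné", tags))
    print(preprocess("Quiero renovar el carné", tags, non_relevant_tag=None))
```

In a real system the tagged sentence would then be passed to the phrase-based or SFST translator; that step is outside this sketch.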

Relevance: 10.00%

Abstract:

During sentence processing there is a preference to treat the first noun phrase encountered as the subject and agent, unless it is marked otherwise. This preference leads to a conflict in thematic role assignment when the syntactic structure follows a non-canonical object-before-subject pattern. Left perisylvian and fronto-parietal brain networks have been found to be engaged by increased computational demands during sentence comprehension, while event-related brain potentials have been used to study the on-line manifestation of these demands. However, evidence regarding the spatiotemporal organization of brain networks in this domain is scarce. In the current study we used magnetoencephalography to track brain activity in space and time while Spanish speakers read subject-first and object-first cleft sentences. Both kinds of sentences remained ambiguous between a subject-first and an object-first interpretation until the second argument appeared. Results show time-modulated activity in a frontal network at the disambiguation point of object-first sentences. Moreover, the time windows in which these effects took place have previously been related to thematic role integration (300–500 ms) and to sentence reanalysis and conflict resolution during processing (beyond 500 ms post-stimulus). These results point to frontal cognitive control as a putative key mechanism that may operate when a revision of sentence structure and meaning is necessary.

Relevance: 10.00%

Abstract:

This paper proposes the use of Factored Translation Models (FTMs) for improving a speech into sign language translation system. FTMs make it possible to incorporate syntactic-semantic information during the translation process, and this additional information significantly reduces the translation error rate. The paper also analyses different alternatives for dealing with non-relevant words. The speech into sign language translation system has been developed and evaluated in a specific application domain: the renewal of identity documents and driver's licenses. The translation module is a phrase-based system (Moses). The evaluation results reveal that BLEU (BiLingual Evaluation Understudy) improved from 69.1% to 73.9% and mSER (multiple-reference Sign Error Rate) fell from 30.6% to 24.8%.

Relevance: 10.00%

Abstract:

This paper describes a categorization module for improving the performance of a Spanish into Spanish Sign Language (LSE) translation system. This categorization module replaces Spanish words with associated tags. When implementing this module, several alternatives for dealing with non-relevant words have been studied. Non-relevant words are Spanish words not relevant in the translation process. The categorization module has been incorporated into a phrase-based system and a Statistical Finite State Transducer (SFST). The evaluation results reveal that the BLEU has increased from 69.11% to 78.79% for the phrase-based system and from 69.84% to 75.59% for the SFST.

Relevance: 10.00%

Abstract:

The Semantic Web aims to allow machines to make inferences using the explicit conceptualisations contained in ontologies. By pointing to ontologies, Semantic Web-based applications are able to interoperate and share common information easily. Nevertheless, multilingual semantic applications are still rare, because most online ontologies are monolingual and in English. Solving this issue requires techniques for ontology localisation and translation. However, traditional machine translation is difficult to apply to ontologies, because ontology labels tend to be quite short and linguistically different from free text. In this paper, we propose an approach to enhance machine translation of ontologies that exploits the well-structured concept descriptions contained in the ontology. In particular, our approach leverages the semantics contained in the ontology by using Cross Lingual Explicit Semantic Analysis (CLESA) for context-based disambiguation in phrase-based Statistical Machine Translation (SMT). To the best of our knowledge, CLESA has not previously been applied in SMT.

Relevance: 10.00%

Abstract:

Current advances in linguistic steganography in Spanish open new lines of research into its application to the protection and privacy of digital communications and to the marking of texts. This article examines the reordering of verb complements in existing Spanish-language texts and its usefulness for linguistic steganography and for the digital marking of texts (watermarking).

Relevance: 10.00%

Abstract:

This paper proposes an architecture, based on statistical machine translation, for developing the text normalization module of a text-to-speech conversion system. The main goal is a language-independent text normalization module that is data-driven and flexible enough to deal with all the situations that arise in this task. The proposed architecture is composed of three main modules: a tokenizer that splits the input text into a token graph (tokenization), a phrase-based translation module (token translation) and a post-processing module that removes some tokens. The paper presents initial experiments on numbers and abbreviations. The very good results obtained validate the proposed architecture.

Relevance: 10.00%

Abstract:

This paper describes the text normalization module of a fully trainable text-to-speech conversion system and its application to number transcription. The main goal is a language-independent text normalization module based on data rather than on expert rules. The paper proposes a general architecture based on statistical machine translation techniques. It is composed of three main modules: a tokenizer that splits the input text into a token graph, a phrase-based translation module for token translation, and a post-processing module that removes some tokens. The architecture has been evaluated on number transcription in several languages: English, Spanish and Romanian. Number transcription is an important part of the text normalization problem.
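The three-module architecture described above (tokenizer, token translation, post-processing) might look roughly like the toy pipeline below. It is only a sketch under strong assumptions: a hand-written lookup table with greedy longest-match stands in for the trained phrase-based translation model, the tokenizer emits a flat token list rather than a token graph, and only English integers from 1 to 99 are handled. The token format (`4_T`, `2_U`), the `<del>` marker and all function names are hypothetical.

```python
import re
from typing import Dict, List, Tuple

UNITS = ["zero", "one", "two", "three", "four",
         "five", "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty",
        6: "sixty", 7: "seventy", 8: "eighty", 9: "ninety"}
DELETE = "<del>"  # hypothetical marker removed during post-processing

PhraseTable = Dict[Tuple[str, ...], str]


def build_phrase_table() -> PhraseTable:
    """Hand-written stand-in for a phrase table learned from data."""
    table: PhraseTable = {("0_U",): DELETE}  # trailing zero, as in "40"
    for digit in range(1, 10):
        table[(f"{digit}_U",)] = UNITS[digit]
    for digit, word in TENS.items():
        table[(f"{digit}_T",)] = word
    for digit in range(10):
        table[("1_T", f"{digit}_U")] = TEENS[digit]  # two-token phrases
    return table


def tokenize(text: str) -> List[str]:
    """Split integers 1-99 into digit tokens marked as tens (T) or units (U)."""
    tokens: List[str] = []
    for word in text.split():
        if re.fullmatch(r"[1-9][0-9]?", word):
            if len(word) == 2:
                tokens.append(f"{word[0]}_T")
            tokens.append(f"{word[-1]}_U")
        else:
            tokens.append(word)  # anything else passes through unchanged
    return tokens


def translate(tokens: List[str], table: PhraseTable) -> List[str]:
    """Greedy longest-match lookup, trying two-token phrases first."""
    output: List[str] = []
    i = 0
    while i < len(tokens):
        for length in (2, 1):
            phrase = tuple(tokens[i:i + length])
            if len(phrase) == length and phrase in table:
                output.append(table[phrase])
                i += length
                break
        else:
            output.append(tokens[i])  # unknown token: copy it
            i += 1
    return output


def postprocess(tokens: List[str]) -> List[str]:
    """Remove tokens that the translation step marked for deletion."""
    return [token for token in tokens if token != DELETE]


def normalize(text: str) -> str:
    table = build_phrase_table()
    return " ".join(postprocess(translate(tokenize(text), table)))


if __name__ == "__main__":
    print(normalize("room 42 has 13 chairs and 40 desks"))
```

Because the language-specific knowledge lives in the table rather than in the code, swapping the table is what would adapt such a pipeline to another language; in the described system that table would be learned from data instead of written by hand.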

Relevance: 10.00%

Abstract:

This doctoral thesis investigates how far cinema was present in the thinking of Le Corbusier and Pierre Jeanneret when they undertook the publication and layout of the pages devoted to the Villa Stein-de Monzie in Garches in the book “Le Corbusier et Pierre Jeanneret. Oeuvre Complète 1910-1929”, and which cinematographic mechanisms they brought into play when assembling those pages. Rather than looking for cinematographic elements used directly, the thesis aims to examine in depth the mechanisms, procedures and systems for generating ideas throughout the design process, and the ways of perceiving and experiencing spaces or of observing forms. Its framework is therefore the analysis of how Le Corbusier represents the villa in L'Oeuvre Complète, his intentions and the pedagogical function of this mode of representation, as well as its differences from what was actually built: a reality elaborated on a drawing board by laying out its representation, choosing the fragments and composing the plates, in a manner close to that of a film director.
The first chapter justifies the object of study: L'Oeuvre Complète (1937), the reissue in French, German and English of the first edition in German, Ihr Gesamtes Werk (1929). It establishes Le Corbusier's intention that the book become a model treatise on modern architecture, eminently visual. The formal and geometric mechanisms used to compose the book are studied, together with the way in which L'Oeuvre Complète is meant to be read.
The following chapters (2 to 9) contain the principal research method of the thesis: a longitudinal, critical and systematic reading based on close observation of the representation of the Villa Stein-de Monzie in Garches on pages 140 to 149 of L'Oeuvre Complète. The reading proceeds in a linear, sequenced way, as if it were a film script. Each chapter describes and analyses one of the fragments, while connecting topics of interest that help to explain aspects of the villa in Garches, its conception during the intense design process (with numerous variants and proposals), and even its appearance in films. The thesis also sheds light on some little-known documents: the plates for the villa in Garches held in the collection of the Cooper-Hewitt Museum in New York.
The analysis of the presentation of the villa in L'Oeuvre Complète confirms that, for Le Corbusier, the fragment in itself has to be perfect and produce the maximum emotion. Like a conjurer, Le Corbusier manipulates the fragments, or withholds information from the spectator through the use of ellipsis in the narrative. The texts link the images and carry the thread of the story. The sketches always seek to seduce the spectator: they are drawings that brim with vitality, with a technique very close to the ligne claire of comics. The plans are a laboratory for demonstrating their hierarchy and freedom of composition: elements are removed, line weights are distorted, and some elements that were never built appear. The elevations, schematic and abstract, show the control of geometry used to guarantee emotion.
The photographs are controlled at the moment of capture (choice of viewpoint, careful mise-en-scène of the objects, composition with light, use of shadows to suggest off-screen space), but also during post-production and editing, when they are cropped, surfaces are smoothed, and elements are erased or drawn onto them. The montage likewise composes a dynamic, fragmented and multiple representation of the villa. As in cinema, the fragments only find their raison d'être once they have been re-created and assembled in the mind of the spectator. The lack of continuity (raccord) is a mechanism deliberately sought by Le Corbusier, transferring to the representation one of the fundamental characteristics of the villa in Garches: its permanent simultaneous duality. Le Corbusier deploys all these mechanisms to offer an idealized version of the villa that gathers the virtues of the different projects and incorporates the time factor.