838 results for Verb phrase ellipsis
Abstract:
This paper describes a preprocessing module for improving the performance of a Spanish into Spanish Sign Language (Lengua de Signos Española: LSE) translation system when dealing with sparse training data. This preprocessing module replaces Spanish words with associated tags. The list of Spanish words (vocabulary) and associated tags used by this module is computed automatically by selecting, for every Spanish word, the sign with the highest probability of being its translation. This automatic tag extraction has been compared to a manual strategy, achieving almost the same improvement. In this analysis, several alternatives for dealing with non-relevant words have been studied. Non-relevant words are Spanish words not assigned to any sign. The preprocessing module has been incorporated into two well-known statistical translation architectures: a phrase-based system and a Statistical Finite State Transducer (SFST). The system has been developed for a specific application domain: the renewal of Identity Documents and Driver's License. To evaluate the system, a parallel corpus of 4080 Spanish sentences and their LSE translations has been used. The evaluation results revealed a significant performance improvement when this preprocessing module is included. In the phrase-based system, the proposed module increased BLEU (Bilingual Evaluation Understudy) from 73.8% to 81.0% and the human evaluation score from 0.64 to 0.83. In the case of the SFST, BLEU increased from 70.6% to 78.4% and the human evaluation score from 0.65 to 0.82.
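The tag extraction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes word-to-sign alignment pairs are already available (with `None` standing for "aligned to no sign"), estimates probabilities by simple counting, and uses hypothetical names throughout (`build_tag_vocabulary`, `preprocess`, the `NON_RELEVANT` tag and the `min_prob` threshold).

```python
from collections import Counter, defaultdict

def build_tag_vocabulary(aligned_pairs, min_prob=0.5):
    """Map each Spanish word to its most probable sign (assumed sketch).

    aligned_pairs: iterable of (word, sign) tuples; sign is None when the
    word was aligned to no sign. Words whose best alignment is None, or
    whose best sign is less probable than min_prob, get NON_RELEVANT.
    """
    counts = defaultdict(Counter)
    for word, sign in aligned_pairs:
        counts[word][sign] += 1

    vocab = {}
    for word, sign_counts in counts.items():
        sign, n = sign_counts.most_common(1)[0]
        prob = n / sum(sign_counts.values())
        if sign is not None and prob >= min_prob:
            vocab[word] = sign
        else:
            vocab[word] = "NON_RELEVANT"
    return vocab

def preprocess(sentence, vocab):
    """Replace each known word with its tag; unknown words pass through."""
    return [vocab.get(word, word) for word in sentence.lower().split()]

# Toy alignment data, invented for illustration only.
pairs = [
    ("renovar", "RENOVAR"), ("renovar", "RENOVAR"),
    ("dni", "DNI"),
    ("el", None), ("el", None), ("el", "DNI"),
]
vocab = build_tag_vocabulary(pairs)
print(preprocess("Renovar el DNI", vocab))
```

Tagging non-relevant words explicitly is only one of the alternatives the abstract mentions; deleting them from the input would be another.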
Abstract:
During sentence processing there is a preference to treat the first noun phrase encountered as the subject and agent, unless it is marked otherwise. This preference leads to a conflict in thematic role assignment when the syntactic structure follows a non-canonical object-before-subject pattern. Left perisylvian and fronto-parietal brain networks have been found to be engaged by increased computational demands during sentence comprehension, while event-related brain potentials have been used to study the on-line manifestation of these demands. However, evidence regarding the spatiotemporal organization of brain networks in this domain is scarce. In the current study we used magnetoencephalography to track brain activity in space and time while Spanish speakers read subject-first and object-first cleft sentences. Both kinds of sentences remained ambiguous between a subject-first and an object-first interpretation until the second argument appeared. Results show the time-modulation of a frontal network at the disambiguation point of object-first sentences. Moreover, the time windows in which these effects took place have previously been related to thematic role integration (300–500 ms) and to sentence reanalysis and resolution of conflicts during processing (beyond 500 ms post-stimulus). These results point to frontal cognitive control as a putative key mechanism that may operate when a revision of the sentence structure and meaning is necessary.
Abstract:
This paper proposes the use of Factored Translation Models (FTMs) for improving a Speech into Sign Language Translation System. These FTMs make it possible to incorporate syntactic-semantic information during the translation process. This new information significantly reduces the translation error rate. This paper also analyses different alternatives for dealing with non-relevant words. The speech into sign language translation system has been developed and evaluated in a specific application domain: the renewal of Identity Documents and Driver’s License. The translation system uses a phrase-based translation system (Moses). The evaluation results reveal that BLEU (BiLingual Evaluation Understudy) has improved from 69.1% to 73.9% and mSER (multiple references Sign Error Rate) has been reduced from 30.6% to 24.8%.
Abstract:
This paper describes a categorization module for improving the performance of a Spanish into Spanish Sign Language (LSE) translation system. This categorization module replaces Spanish words with associated tags. When implementing this module, several alternatives for dealing with non-relevant words have been studied. Non-relevant words are Spanish words not relevant in the translation process. The categorization module has been incorporated into a phrase-based system and a Statistical Finite State Transducer (SFST). The evaluation results reveal that the BLEU has increased from 69.11% to 78.79% for the phrase-based system and from 69.84% to 75.59% for the SFST.
Abstract:
The Semantic Web aims to allow machines to make inferences using the explicit conceptualisations contained in ontologies. By pointing to ontologies, Semantic Web-based applications are able to inter-operate and share common information easily. Nevertheless, multilingual semantic applications are still rare, because most online ontologies are monolingual in English. To solve this issue, techniques for ontology localisation and translation are needed. However, traditional machine translation is difficult to apply to ontologies, because ontology labels tend to be quite short and linguistically different from free text. In this paper, we propose an approach to enhance machine translation of ontologies based on exploiting the well-structured concept descriptions contained in the ontology. In particular, our approach leverages the semantics contained in the ontology by using Cross Lingual Explicit Semantic Analysis (CLESA) for context-based disambiguation in phrase-based Statistical Machine Translation (SMT). To the best of our knowledge, the presented work is novel in that CLESA has not previously been applied in SMT.
Abstract:
At present, advances in linguistic steganography in Spanish are opening new lines of research into its application to the protection and privacy of digital communications and to the marking of texts. This article examines the usefulness of reordering verb complements in existing Spanish-language texts for linguistic steganography and for the digital marking of texts (watermarking).
Abstract:
This paper proposes an architecture, based on statistical machine translation, for developing the text normalization module of a text-to-speech conversion system. The main goal is to build a language-independent text normalization module that is data-driven and flexible enough to deal with all the situations that arise in this task. The proposed architecture is composed of three main modules: a tokenizer module for splitting the text input into a token graph (tokenization), a phrase-based translation module (token translation) and a post-processing module for removing some tokens. This paper presents initial experiments for numbers and abbreviations. The very good results obtained validate the proposed architecture.
Abstract:
This paper describes the text normalization module of a fully trainable text-to-speech conversion system and its application to number transcription. The main goal is to build a language-independent text normalization module based on data instead of expert rules. This paper proposes a general architecture based on statistical machine translation techniques. The proposal is composed of three main modules: a tokenizer for splitting the text input into a token graph, a phrase-based translation module for token translation, and a post-processing module for removing some tokens. This architecture has been evaluated for number transcription in several languages: English, Spanish and Romanian. Number transcription is an important part of the text normalization problem.
Abstract:
This doctoral thesis investigates how far cinema was present in the thinking of Le Corbusier and Pierre Jeanneret when they undertook the publication and layout of the pages dedicated to the Villa Stein-de Monzie in Garches in “Le Corbusier et Pierre Jeanneret. Oeuvre Complète 1910-1929”, and which cinematographic mechanisms they brought into play when assembling the pages of L'Oeuvre Complète. Rather than finding cinematographic elements used directly, the aim of this thesis is to examine in depth the mechanisms, procedures and systems for generating ideas throughout the design process, and the ways of perceiving and experiencing spaces or of observing forms. The framework is therefore the analysis of the way Le Corbusier represents the villa in L'Oeuvre Complète, his intentions and the pedagogical function of this mode of representation, as well as its differences from what was actually built: a reality elaborated on a drawing board by laying out its representation, choosing the fragments and composing the plates, in a manner close to that of a film director.
The justification of the object of study is set out in the first chapter: L'Oeuvre Complète (1937), the reissue in French, German and English of the first edition in German, Ihr Gesamtes Werk (1929). The chapter establishes Le Corbusier's intention that the book become a model Treatise on Modern Architecture, eminently visual. The formal and geometric mechanisms of the book's composition are studied, along with the way in which L'Oeuvre Complète should be read. Chapters 2 to 9 contain the main research method of this thesis, based on a longitudinal, critical and systematic reading that starts from close observation of the representation of the Villa Stein-de Monzie in Garches on pages 140 to 149 of L'Oeuvre Complète. The reading proceeds in a linear, sequenced way, as if it were a film script. Each chapter describes and analyses one of the fragments, while linking topics of interest that help to understand aspects of the villa in Garches, of its conception during the intense design work (with numerous variants and proposals), and even of its appearance in films. In addition, the thesis sheds light on some little-known documents: the plates for the villa in Garches in the collection of the Cooper-Hewitt Museum in New York.
The analysis of the presentation of the villa in Garches in L'Oeuvre Complète shows that, for Le Corbusier, the fragment, per se, must be perfect, producing the maximum emotion. Like a conjurer, Le Corbusier manipulates the fragments or withholds information from the viewer through the use of ellipsis in the narrative. The texts link the images and carry the thread of the narration. The sketches always seek to seduce the viewer: they are drawings that exude vitality, with a technique very close to the ligne claire of comics. The plans are a laboratory for demonstrating their hierarchy and freedom of composition, eliminating elements, distorting line weights and showing some elements that were never built. The elevations, schematic and abstract, demonstrate the control of geometry to guarantee emotion. The photographs are controlled at the moment of capture (choice of viewpoint, careful staging of objects, composition with light, use of shadows to suggest what lies off-screen), but also in post-production and editing, where they are cropped, surfaces are smoothed, and elements are erased or drawn onto them. The montage likewise composes a dynamic, fragmented and multiple representation of the villa. As in cinema, the fragments only find their raison d'être once they are re-created and assembled in the viewer's mind. The lack of continuity (raccord) is a mechanism deliberately sought by Le Corbusier, carrying over into the representation one of the essential characteristics of the villa in Garches: its permanent simultaneous duality. Le Corbusier deploys all these mechanisms to offer an idealised version of the villa that gathers all the virtues of the different designs and incorporates the factor of time.
Abstract:
In the Fourth Gospel Jesus presents himself through metaphors, and the object of our research is the phrase “I am the way, and the truth, and the life”, which serves as the guiding starting point in the search for the identity of the Johannine group. At the end of the first century, the Johannine group understood itself as the faithful heirs of Jesus, now followers of the disciple John (son of Zebedee), who had walked with Jesus. The group does not appear detached from the religious plurality of the period, but is attentive to the conflicts and to the divergent paths to God. This shows how central the theme is to identity. Starting from a reading of John 13.33-14.31, our dissertation examines how the Johannine group receives this message in its imaginary, externalises it and reacts to it in everyday life, as do later Gnostic groups, such as that of the Gospel of Truth from the Coptic Library of Nag Hammadi, produced from later readings that shape the imaginary symbolic world, cultivating different characteristics of belonging and generating the identity of the Johannine group.
Abstract:
In behavior reminiscent of the responsiveness of human infants to speech, young songbirds innately recognize and prefer to learn the songs of their own species. The acoustic and physiological bases for innate recognition were investigated in fledgling white-crowned sparrows lacking song experience. A behavioral test revealed that the complete conspecific song was not essential for innate recognition: songs composed of single white-crowned sparrow phrases and songs played in reverse elicited vocal responses as strongly as did normal song. In all cases, these responses surpassed those to other species’ songs. Although auditory neurons in the song nucleus HVc and the underlying neostriatum of fledglings did not prefer conspecific song over foreign song, some neurons responded strongly to particular phrase types characteristic of white-crowned sparrows and, thus, could contribute to innate song recognition.
Abstract:
Spoken language is one of the most compact and structured ways to convey information. The linguistic ability to structure individual words into larger sentence units permits speakers to express a nearly unlimited range of meanings. This ability is rooted in speakers' knowledge of syntax and in the corresponding process of syntactic encoding. Syntactic encoding is highly automatized, operates largely outside of conscious awareness, and overlaps closely in time with several other processes of language production. With the use of positron emission tomography we investigated the cortical activations during spoken language production that are related to the syntactic encoding process. In the paradigm of restrictive scene description, utterances varying in complexity of syntactic encoding were elicited. Results provided evidence that the left Rolandic operculum, caudally adjacent to Broca's area, is involved in both sentence-level and local (phrase-level) syntactic encoding during speaking.
Abstract:
The effects of practice on the functional anatomy observed in two different tasks, a verbal and a motor task, are reviewed in this paper. In the first, people practiced a verbal production task, generating an appropriate verb in response to a visually presented noun. Both practiced and unpracticed conditions utilized common regions such as visual and motor cortex. However, there was a set of regions that was affected by practice. Practice produced a shift in activity from left frontal, anterior cingulate, and right cerebellar hemisphere to activity in Sylvian-insular cortex. Similar changes were also observed in the second task, a task in a very different domain, namely the tracing of a maze. Some areas were significantly more activated during initial unskilled performance (right premotor and parietal cortex and left cerebellar hemisphere); a different region (medial frontal cortex, “supplementary motor area”) showed greater activity during skilled performance conditions. Activations were also found in regions that most likely control movement execution irrespective of skill level (e.g., primary motor cortex was related to velocity of movement). One way of interpreting these results is in a “scaffolding-storage” framework. For unskilled, effortful performance, a scaffolding set of regions is used to cope with novel task demands. Following practice, a different set of regions is used, possibly representing storage of particular associations or capabilities that allow for skilled performance. The specific regions used for scaffolding and storage appear to be task dependent.
Abstract:
The conversion of text to speech is seen as an analysis of the input text to obtain a common underlying linguistic description, followed by a synthesis of the output speech waveform from this fundamental specification. Hence, the comprehensive linguistic structure serving as the substrate for an utterance must be discovered by analysis from the text. The pronunciation of individual words in unrestricted text is determined by morphological analysis or letter-to-sound conversion, followed by specification of the word-level stress contour. In addition, many text character strings, such as titles, numbers, and acronyms, are abbreviations for normal words, which must be derived. To further refine these pronunciations and to discover the prosodic structure of the utterance, word part of speech must be computed, followed by a phrase-level parsing. From this structure the prosodic structure of the utterance can be determined, which is needed in order to specify the durational framework and fundamental frequency contour of the utterance. In discourse contexts, several factors such as the specification of new and old information, contrast, and pronominal reference can be used to further modify the prosodic specification. When the prosodic correlates have been computed and the segmental sequence is assembled, a complete input suitable for speech synthesis has been determined. Lastly, multilingual systems utilizing rule frameworks are mentioned, and future directions are characterized.
Abstract:
Introduction: Mesial temporal sclerosis (MTS) is the main cause of drug-resistant epilepsy. Patients with MTS show difficulties in semantic and phonological language processing and a higher incidence of cerebral reorganization of language (bilateral or right-sided) than the general population. Functional magnetic resonance imaging (fMRI) makes it possible to assess the cerebral reorganization of language networks by comparing brain activation patterns across brain regions. Objective: To investigate the language performance of patients with unilateral left and right MTS and the occurrence of language network reorganization with fMRI, in order to assess whether the reorganization was beneficial for language in these patients. Methods: We used clinical language tests and visual and responsive naming paradigms for fMRI, developed for this study. We evaluated 24 patients with left MTS, 22 patients with right MTS and 24 healthy controls, who underwent language tests (semantic and phonological fluency; naming of objects, verbs and proper names; responsive naming; and word comprehension) and three fMRI language paradigms (visual confrontation naming, responsive naming to reading, and word generation). Six brain regions of interest (ROIs) were selected (inferior frontal gyrus, middle frontal gyrus, superior frontal gyrus, inferior temporal gyrus, middle temporal gyrus and superior temporal gyrus). Laterality indices (LIs) were calculated with two methods: bootstrap, from the LI-Toolbox program, which is threshold-independent, and PSC, which indicates the intensity of brain activation in each voxel. Each patient group (left and right MTS) was divided into two subgroups according to performance relative to controls in the clinical language assessment. A value <= -1.5 was used as the cutoff to divide the groups into patients with good and poor language performance. The language performance of the subgroups was then compared with the bootstrap LIs.
Results: Patients with left and right MTS performed worse than controls on the clinical tests of verb naming, proper-name naming, responsive naming and verbal fluency. The fMRI activation maps showed a BOLD effect in frontal and temporoparietal language regions. The between-group activation comparison maps revealed that patients with left and right MTS show greater activation in homologous right-hemisphere regions than controls. The LIs corroborated these results, showing lower mean values for patients than for controls and, therefore, greater symmetry in language representation. The comparison between the bootstrap LI and performance on the clinical language tests indicated that, in the responsive naming to reading paradigm, functional reorganization in the middle temporal gyrus, and possibly in the inferior and superior temporal gyri, was associated with preserved performance on naming tests. Conclusion: Patients with right and left MTS show impaired naming and verbal fluency and reorganization of the cerebral language network. Functional reorganization of language in temporal regions, especially the middle temporal gyrus, was associated with preserved naming performance in patients with left MTS in the responsive naming to reading fMRI paradigm.