Abstract:
The lagoon landscape bears the marks of major human interventions that have followed one another from Roman times to the present day: from the first centuriations, through the great land-reclamation works and river diversions, up to the rapid transformation of Jesolo Lido from a 1930s heliotherapy colony into an international tourist resort. The settlement in the "Parco Pineta" area is conceived as the construction of an artificial landscape, designed for leisure use in a large open space, closely connected to the city and adjacent to clearly defined places: the pine forest, the village, the canal and the tourist residences designed by the architect Gonçalo Byrne. The project works at two scales, territorial and architectural. It aims to connect the centres of Jesolo and Cortellazzo with a system of cycle paths. The Piave river makes it possible to combine sustainable tourism with the opportunity to revisit the sites of historical memory linked to the First World War. The intention is to make Cortellazzo an integral part of a historical, artistic and naturalistic itinerary. Parco Pineta would thus become the starting point of a route that, going upriver, links the small village of Cortellazzo with San Donà and its well-known Parco della Scultura in Architettura. At the territorial scale, landscape architecture, as a material sign, is the dominant element characterising the project. The spatial articulation of the intervention follows the geometry dictated by the land-parcel organisation and the grid of the founded city, with the clear intent of building a recognisable "territorial fact". Overall unity is entrusted to the square of the Grande Pianta, within which the other spatial units are defined. The whole proposes a simple organisational scheme that, crossing the Canale Cavetta, reunites the separated parts of the territory.
The large square makes it possible to subdivide the spaces into integrated systems that can be used in different ways, so that each part of the project area has its own character and its own intended use. In this way the morphological constraints of the area make it possible to build environments with a specific character, not only functionally but above all in their architectural and landscape features. The thesis focuses in particular on the device of the platform, linked to the interpretation of the system of high and low lands that is central to Byrne's project. The auditorium and the exhibition gallery rest on top of the platform.
Abstract:
Modern embedded systems embrace many-core shared-memory designs. Due to constrained power and area budgets, most of them feature software-managed scratchpad memories instead of data caches to increase data locality. It is therefore the programmer's responsibility to explicitly manage memory transfers, and this makes programming these platforms cumbersome. Moreover, complex modern applications must be adequately parallelized before they can turn the parallel potential of the platform into actual performance. To support this, programming languages were proposed that work at a high level of abstraction and rely on a runtime whose cost hinders performance, especially in embedded systems, where resources and power budgets are constrained. This dissertation explores the applicability of the shared-memory paradigm to modern many-core systems, focusing on ease of programming. It concentrates on OpenMP, the de facto standard for shared-memory programming. In the first part, the costs of algorithms for synchronization and data partitioning are analyzed, and the algorithms are adapted to modern embedded many-cores. Then, the original design of an OpenMP runtime library is presented, which supports complex forms of parallelism such as multi-level and irregular parallelism. The second part of the thesis focuses on heterogeneous systems, where hardware accelerators are coupled to (many-)cores to implement key functional kernels with orders-of-magnitude gains in speedup and energy efficiency compared to the "pure software" version. However, three main issues arise, namely i) platform design complexity, ii) architectural scalability and iii) programmability. To tackle them, a template for a generic hardware processing unit (HWPU) is proposed, which shares the memory banks with the cores, and a template for a scalable architecture is shown, which integrates the HWPUs through the shared-memory system.
Then, a full software stack and toolchain are developed to support platform design and to let programmers exploit the accelerators of the platform. The OpenMP frontend is extended to interact with it.
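The scheduling problem such a runtime must solve can be illustrated outside OpenMP as well. The following minimal Python sketch (an analogy, not the dissertation's runtime) mimics `#pragma omp parallel for schedule(dynamic, chunk)`: idle workers repeatedly grab the next chunk of iterations from a shared counter, which balances irregular workloads across threads.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def dynamic_for(iterations, body, num_workers=4, chunk=1):
    # Mimics '#pragma omp parallel for schedule(dynamic, chunk)':
    # each idle worker atomically claims the next chunk of iterations.
    next_index = 0
    lock = threading.Lock()

    def worker():
        nonlocal next_index
        while True:
            with lock:
                start = next_index
                next_index += chunk
            if start >= iterations:
                return
            for i in range(start, min(start + chunk, iterations)):
                body(i)

    # the with-block joins all workers before returning
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for _ in range(num_workers):
            pool.submit(worker)
```

Dynamic scheduling pays a synchronization cost per chunk, which is exactly the kind of runtime overhead the dissertation analyzes on embedded many-cores.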
Abstract:
This dissertation aims to show that, for the text type under consideration, namely the contract, the use of a computer-assisted translation tool is preferable to manual translation in terms of productivity and translational consistency. The first objective is therefore a comparative analysis of the two translation approaches, highlighting the advantages of computer-assisted translation over manual translation while also analysing and assessing its possible limits. Moreover, given that SDL Trados Studio is the CAT system par excellence, but is often out of reach for many novice translators, the dissertation offers a valid alternative to it: memoQ, one of the fastest-growing computer-assisted translation programs, which is increasingly establishing itself in the translation market as a valid alternative to SDL Trados Studio.
Abstract:
Review of this book, which is the author's doctoral dissertation.
Abstract:
This paper describes the UPM system for the Spanish-English translation task at the NAACL 2012 workshop on statistical machine translation. The system is based on Moses. We have used all available free corpora, cleaning them and deleting some repetitions. We also propose a technique for selecting the sentences used to tune the system, based on their similarity to the sentences to be translated. With this approach, we improve the BLEU score from 28.37% to 28.57%. As a result of the WMT12 challenge, we obtained a BLEU score of 31.80% on the 2012 test set. Finally, we describe several experiments carried out after the competition.
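The sentence-selection idea can be sketched independently of Moses. The snippet below is a hypothetical illustration (bag-of-words cosine similarity; the paper's actual similarity measure may differ): it ranks a pool of candidate tuning sentences by similarity to the sentences to be translated and keeps the top k.

```python
from collections import Counter
import math

def bow(text):
    # bag-of-words vector as a word -> count mapping
    return Counter(text.lower().split())

def cosine(a, b):
    # cosine similarity between two bag-of-words vectors
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def select_tuning_sentences(pool, test_sentences, k):
    # rank candidate tuning sentences by similarity to the test set
    test_vec = bow(" ".join(test_sentences))
    ranked = sorted(pool, key=lambda s: cosine(bow(s), test_vec), reverse=True)
    return ranked[:k]
```

The intuition is that tuning on sentences close to the test distribution yields weights better suited to the actual translation task.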
Abstract:
This paper describes the text normalization module of a fully trainable text-to-speech conversion system and its application to number transcription. The main goal is a language-independent text normalization module based on data rather than on expert rules. The paper proposes a general architecture based on statistical machine translation techniques, composed of three main modules: a tokenizer that splits the input text into a token graph, a phrase-based translation module for token translation, and a post-processing module that removes some tokens. The architecture has been evaluated for number transcription in several languages: English, Spanish and Romanian. Number transcription is an important aspect of the text normalization problem.
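A toy version of such a pipeline can illustrate the module chain. The sketch below is not the paper's trained system: it uses a whitespace tokenizer and hand-written English number-to-words rules (0..999 only) as a stand-in for the phrase-based translation module.

```python
UNITS = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]

def number_to_words(n):
    # stand-in for the trained token-translation module
    if n < 20:
        return UNITS[n]
    if n < 100:
        return TENS[n // 10] + ("-" + UNITS[n % 10] if n % 10 else "")
    if n < 1000:
        rest = " " + number_to_words(n % 100) if n % 100 else ""
        return UNITS[n // 100] + " hundred" + rest
    raise ValueError("only 0..999 supported in this sketch")

def normalize(text):
    # tokenizer -> token translation -> (trivial) post-processing
    tokens = text.split()
    return " ".join(number_to_words(int(t)) if t.isdigit() else t
                    for t in tokens)
```

In the paper's data-driven setting the hand-written rules above would be replaced by a phrase table learned from aligned (digit, verbalization) pairs, which is what makes the module language-independent.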
Abstract:
This work develops a model to calculate the semantic distance between two sentences represented by UNL graphs. The problem arises in the context of machine translation, where different translators can generate slightly different sentences from the same original. The proposed distance measure aims to provide an objective evaluation of the quality of the text-generation process. The author surveys the state of the art on this subject, bringing together in a single work the proposed models of semantic distance between concepts, models for graph comparison, and the few proposals for computing distances between conceptual graphs. He also assesses the few resources available for experimenting with the model and presents a methodology for generating the datasets that would be needed to apply the proposal with the required scientific rigour. Building on these pieces, a novel model for comparing conceptual graphs is proposed; it can use different concept-distance algorithms and sets tolerance thresholds that allow a flexible comparison between sentences. The model is implemented in C++, fed with the resources mentioned above, and tested on a set of sentences created by the author, given the lack of other available resources.
The results show that the methodology and the implementation can lead to a distance measure between UNL graphs applicable in machine translation systems; however, the lack of resources and of labelled data with which to validate the algorithm requires a substantial prior effort before conclusive results can be offered.
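The core idea, comparing two graphs under a concept-distance threshold, can be sketched as follows. This is a hypothetical simplification of the model (the thesis implementation is in C++ and supports pluggable distance algorithms): two triples match when their relation labels agree and both concept pairs are within the tolerance.

```python
def concept_distance(a, b, synonyms):
    # toy concept distance: exact match, known synonym, or unrelated
    if a == b:
        return 0.0
    if (a, b) in synonyms or (b, a) in synonyms:
        return 0.5
    return 1.0

def graph_distance(g1, g2, synonyms, tol=0.5):
    # g1, g2: lists of (head, relation, tail) triples; greedy matching
    matched = 0
    used = set()
    for (h1, r1, t1) in g1:
        for j, (h2, r2, t2) in enumerate(g2):
            if j in used:
                continue
            if (r1 == r2
                    and concept_distance(h1, h2, synonyms) <= tol
                    and concept_distance(t1, t2, synonyms) <= tol):
                matched += 1
                used.add(j)
                break
    total = len(g1) + len(g2)
    return 1 - 2 * matched / total if total else 0.0
```

Raising `tol` makes the comparison more permissive, which is the flexibility the thesis introduces to tolerate near-synonymous generations of the same original.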
Abstract:
The purpose of this thesis is the automatic construction of ontologies from texts, within the area known as Ontology Learning. This discipline aims to automate the elaboration of domain models from structured or unstructured information sources; it originated at the turn of the millennium, as a result of the exponential growth of the volume of information accessible on the Internet. Since most information on the web takes the form of text, automatic ontology learning has focused on this type of source, drawing over the years on very diverse techniques from areas such as Information Retrieval, Information Extraction, Summarization and, in general, natural language processing. The main contribution of this thesis is that, in contrast with the majority of current techniques, the proposed method does not analyze the surface syntactic structure of language but its deep semantic level. Its objective, therefore, is to infer the domain model from the way the meanings of sentences are articulated in natural language. Since the deep semantic level is language-independent, the method can operate in multilingual scenarios, where it is necessary to combine information from texts in different languages. To access this level of language, the method uses the interlingua model. These formalisms, which come from the area of machine translation, represent the meaning of sentences independently of the language. Specifically, UNL (Universal Networking Language) is used, considered the only standardized general-purpose interlingua. The approach continues previous work carried out both by the author and by the research group of which he is part, on the use of the interlingua model for multilingual information extraction and retrieval.
Basically, the procedure tries to identify, in the UNL representation of the texts, certain regularities that allow the pieces of the domain ontology to be deduced. Since UNL is a formalism based on semantic networks, these regularities take the form of graphs, generalized into structures called linguistic patterns. In addition, UNL still preserves certain discourse-cohesion mechanisms inherited from natural languages, such as anaphora. To increase the effectiveness of expression understanding, the method also provides, as another relevant contribution, an algorithm for pronominal anaphora resolution within the interlingua model, limited to third-person personal pronouns whose antecedent is a proper noun. The proposed method rests on a formal framework, built by adapting definitions from graph theory and incorporating new ones, which situates the notions of UNL expression, linguistic pattern and pattern-matching operations at the base of the method's processes. Both the formal framework and all the method's processes have been implemented and, for experimentation, applied to an article from UNESCO's EOLSS collection, the "Encyclopedia of Life Support Systems".
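Matching a linguistic pattern against a semantic network can be sketched as follows. This is an illustrative toy, not the thesis framework: a pattern is a list of (head, relation, tail) triples in which names starting with `?` are variables, and matching returns the consistent variable bindings found in the graph.

```python
def match_pattern(pattern, graph):
    # pattern, graph: lists of (head, relation, tail) triples;
    # pattern terms starting with '?' are variables.
    def unify(term, value, bindings):
        if term.startswith("?"):
            if term in bindings:
                return bindings if bindings[term] == value else None
            new = dict(bindings)
            new[term] = value
            return new
        return bindings if term == value else None

    def search(rest, bindings):
        if not rest:
            return [bindings]
        (h, r, t), tail = rest[0], rest[1:]
        results = []
        for (gh, gr, gt) in graph:
            if gr != r:
                continue
            b1 = unify(h, gh, bindings)
            if b1 is None:
                continue
            b2 = unify(t, gt, b1)
            if b2 is not None:
                results.extend(search(tail, b2))
        return results

    return search(pattern, {})
```

In the thesis the patterns generalize recurring UNL subgraphs, so each set of bindings returned by a match would suggest a candidate piece of the domain ontology.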
Abstract:
This paper deals with the recognition of temporal expressions and the resolution of their temporal reference. We present the units we have used to approach these tasks over a restricted domain. We work with newspaper articles in Spanish, which is why every reference we use is in Spanish. The identification and recognition of temporal expressions rely on a temporal expression grammar, and their resolution on a dictionary that contains the information needed to perform date arithmetic on the recognized expressions. In the evaluation of our proposal we obtained successful results for the examples studied.
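The grammar-plus-dictionary scheme can be sketched with a few rules. The snippet below is a hypothetical English miniature of the approach (the paper works on Spanish): each rule pairs a recognition pattern with a date operation applied to a reference date, playing the roles of the grammar and the dictionary respectively.

```python
import re
from datetime import date, timedelta

# each rule: (recognition pattern, date operation on the reference date)
RULES = [
    (re.compile(r"\btoday\b"), lambda m, ref: ref),
    (re.compile(r"\byesterday\b"), lambda m, ref: ref - timedelta(days=1)),
    (re.compile(r"\btomorrow\b"), lambda m, ref: ref + timedelta(days=1)),
    (re.compile(r"\b(\d+) days ago\b"),
     lambda m, ref: ref - timedelta(days=int(m.group(1)))),
]

def resolve_temporal(text, ref):
    # recognize expressions, then resolve each against the reference date
    found = []
    for pattern, op in RULES:
        for m in pattern.finditer(text):
            found.append((m.group(0), op(m, ref)))
    return found
```

For newspaper text the natural reference date is the article's publication date, which anchors all relative expressions.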
Abstract:
In the last few years, research on textual information systems has developed widely. The goal is to improve these systems so as to allow easy localization of, treatment of and access to information stored in digital format (digital databases, documental databases, and so on). There are many applications focused on information access (for example, web-search systems like Google or Altavista). However, these applications run into problems when they must access cross-language information, or when they need to show information in a language different from that of the query. This paper explores the use of syntactic-semantic patterns as a method to access multilingual information and reviews, in the case of Information Retrieval, where it is possible and useful to employ patterns for the multilingual and interactive aspects. On the one hand, the multilingual aspects studied are those related to accessing documents in languages different from that of the query, as well as the automatic translation of the document, i.e. a machine translation system based on patterns. On the other hand, the paper examines the interactive aspects related to the reformulation of a query based on the syntactic-semantic pattern of the request.
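Pattern-based query reformulation can be sketched at its simplest. The snippet below is purely illustrative (the lexicon and the three-slot pattern are invented for the example): a query is analysed into a subject-verb-object pattern and each slot is rewritten through a bilingual lexicon, so the reformulated query can retrieve documents in the target language.

```python
# hypothetical English-Spanish lexicon, invented for this example
LEXICON_ES = {"dog": "perro", "bites": "muerde", "man": "hombre"}

def extract_pattern(query):
    # toy pattern extraction: assume the query is "subject verb object"
    subj, verb, obj = query.lower().split()
    return {"subject": subj, "verb": verb, "object": obj}

def reformulate(pattern, lexicon):
    # rewrite each slot through the lexicon, keeping unknown words as-is
    return " ".join(lexicon.get(pattern[slot], pattern[slot])
                    for slot in ("subject", "verb", "object"))
```

A real system would extract the pattern with a parser and translate slots with richer resources; the point is that the pattern, not the surface string, is what crosses the language boundary.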
Abstract:
Paper presented at the Cross-Language Evaluation Forum (CLEF 2008), Aarhus, Denmark, September 17-19, 2008.
Abstract:
Contains: Jussawalla, Feroza (2003): Chiffon Saris. Toronto: TSAR Publications, 92 pages / Reviewed by Silvia Caporale Bizzini; Fernández Álvarez, M. Pilar and Antón Teodoro Manrique (2002): Antología de la literatura nórdica antigua. Salamanca: Ediciones Universidad / Reviewed by José R. Belda; Schwarlz, Anja (2001): The (im)possibilities of machine translation. Frankfurt am Main: Peter Lang, 323 pages / Reviewed by Silvia Borrás Giner; Terttu Nevalainen and Helena Raumolin-Brunberg (2003): Historical Sociolinguistics: Language Change in Tudor and Stuart England. Great Britain: Pearson Education, 260 pages / Reviewed by Sara Ponce Serrano.
Abstract:
Statistical machine translation is a field in high demand, in which machines are still far from producing human-quality results. The main method used is a linear, segment-by-segment translation of a sentence, which makes it impossible to change parts of the sentence that have already been translated. The research in this thesis builds on the approach of Langlais, Patry and Gotti (2007), which attempts to correct a completed translation by modifying segments according to a function to be optimized. First, the exploration of new features, such as an inverse language model and a collocation model, adds a new dimension to the function to be optimized. Second, the use of different metaheuristics, such as greedy and randomized greedy algorithms, allows a deeper exploration of the search space and a greater improvement of the objective function.
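The segment-correction search can be sketched as a greedy hill-climb. The code below is an illustrative simplification (the `score` callback stands in for the thesis's feature-based objective function): at each step it collects the improving segment replacements and applies either the best one (greedy) or a random improving one (randomized greedy).

```python
def greedy_postedit(segments, alternatives, score, randomized=False, rng=None):
    """Greedy hill-climbing over segment replacements.
    segments: current translation, one string per segment.
    alternatives: dict mapping a segment index to alternative strings.
    score: maps a full segment list to a float (higher is better)."""
    current = list(segments)
    best = score(current)
    improved = True
    while improved:
        improved = False
        candidates = []
        for i, alts in alternatives.items():
            for alt in alts:
                if alt == current[i]:
                    continue
                trial = list(current)
                trial[i] = alt
                s = score(trial)
                if s > best:
                    candidates.append((s, i, alt))
        if candidates:
            if randomized and rng is not None:
                # randomized greedy: pick any improving move
                s, i, alt = rng.choice(candidates)
            else:
                s, i, alt = max(candidates)
            current[i] = alt
            best = s
            improved = True
    return current, best
```

The randomized variant trades a slower ascent for a wider exploration of the search space, which is the motivation given in the abstract for using several metaheuristics.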
Abstract:
This work focuses on Machine Translation (MT) and Speech-to-Speech Translation, two emerging technologies that allow users to automatically translate written and spoken texts. The first part of this work provides a theoretical framework for the evaluation of Google Translate and Microsoft Translator, which is at the core of this study. Chapter one focuses on Machine Translation, providing a definition of this technology and glimpses of its history. In this chapter we will also learn how MT works, who uses it, for what purpose, what its pros and cons are, and how machine translation quality can be defined and assessed. Chapter two deals with Speech-to-Speech Translation by focusing on its history, characteristics and operation, potential uses and limits deriving from the intrinsic difficulty of translating spoken language. After describing the future prospects for SST, the final part of this chapter focuses on the quality assessment of Speech-to-Speech Translation applications. The last part of this dissertation describes the evaluation test carried out on Google Translate and Microsoft Translator, two mobile translation apps also providing a Speech-to-Speech Translation service. Chapter three illustrates the objectives, the research questions, the participants, the methodology and the elaboration of the questionnaires used to collect data. The collected data and the results of the evaluation of the automatic speech recognition subsystem and the language translation subsystem are presented in chapter four and finally analysed and compared in chapter five, which provides a general description of the performance of the evaluated apps and possible explanations for each set of results. In the final part of this work suggestions are made for future research and reflections on the usability and usefulness of the evaluated translation apps are provided.
Abstract:
The objective of this thesis is to define a method for terminological research and documentation, as well as for computer-assisted translation, supported by the modern technologies available in this field (AntConc, BootCaT, Trados, etc.). The method is intended for the translation of this type of document, standards, but can also be exploited in other areas of technical-scientific translation, allowing the translator and, consequently, the client to obtain an "acceptable" and qualitatively suitable document in the target language. The path traced in this work starts from a general historical overview and then moves on to the classification of food additives by type and by use in the food sector. The methods for analysing additives and the criteria for validating the methods employed are illustrated in general terms, with reference to the relevant international standards, paying particular attention to the regulatory framework and to the bodies involved in the regulation and control of these substances in Italy, in Russia and in the rest of the world. All of this is set against events on the geopolitical and cultural planes: on the one hand the economic sanctions between the EU and Russia, on the other EXPO 2015, an opportunity for many translators and terminologists to deepen and enrich their knowledge of such an important field, food and food safety, in relation to the VOCA9 terminology-management project. The final part of the thesis presents the Russian GOST R standards and their translation into Italian, based on the documentation and terminological research necessary for translation with CAT tools and indispensable for the creation of glossaries.