928 results for Sentence alignment
Abstract:
Following the internationalization of contemporary higher education, academic institutions based in non-English-speaking countries are increasingly urged to produce content in English to address prospective international students and staff, as well as to increase their attractiveness. The demand for English translations in the institutional academic domain is consequently increasing at a rate exceeding the capacity of the translation profession. Resources for assisting non-native authors and translators in producing appropriate texts in L2 are therefore required to help academic institutions and professionals streamline their translation workload. These resources include: (i) parallel corpora to train machine translation systems and multilingual authoring tools; and (ii) translation memories for computer-aided translation tools. The purpose of this study is to create and evaluate reference resources like those mentioned in (i) and (ii) through the automatic sentence alignment of a large set of Italian and English as a Lingua Franca (ELF) institutional academic texts presented as equivalent but not necessarily parallel (i.e. translated). In this framework, a set of alignment algorithms and tools is examined to identify the most effective one(s) in terms of accuracy and of time- and cost-effectiveness. To determine the text pairs to align, a sample is selected according to document length similarity (in characters) and subsequently evaluated in terms of degree of noisiness/parallelism, alignment accuracy and content leverageability. The results of these analyses serve as the basis for the creation of an aligned bilingual corpus of academic course descriptions, which is ultimately used to create a translation memory in TMX format.
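The two mechanical steps mentioned above (selecting document pairs by character-length similarity, then exporting aligned segments as TMX) can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the function names, the 0.8 ratio threshold, the header values and the sample segment are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List, Tuple

# ElementTree serializes this qualified name as the attribute xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def length_ratio(a: str, b: str) -> float:
    """Character-length similarity in [0, 1]; 1.0 means equal length."""
    longer = max(len(a), len(b))
    return 1.0 if longer == 0 else min(len(a), len(b)) / longer


def select_similar(pairs: Iterable[Tuple[str, str]],
                   min_ratio: float = 0.8) -> List[Tuple[str, str]]:
    """Keep document pairs whose lengths are similar (threshold is hypothetical)."""
    return [(a, b) for a, b in pairs if length_ratio(a, b) >= min_ratio]


def build_tmx(segments: Iterable[Tuple[str, str]],
              src_lang: str = "it", tgt_lang: str = "en") -> str:
    """Serialize aligned segment pairs as a small TMX 1.4 document."""
    root = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(root, "header", {
        "creationtool": "example-script",   # placeholder values
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(root, "body")
    for src, tgt in segments:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Invented sample segment, for illustration only.
    print(build_tmx([("Corso di laurea", "Degree programme")]))
```

A plain length ratio is only a coarse first filter; the abstract indicates that the selected sample is then evaluated for noisiness and alignment accuracy.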
Abstract:
To enrich parallel bilingual corpus data, it can be worthwhile to work with so-called comparable corpora. In this type of corpus, even though the documents in the target language are not exact translations of those in the source language, they can contain words or sentences that are translations of one another. The free encyclopedia Wikipedia constitutes a multilingual comparable corpus of several million documents. Our work consists of finding a general, endogenous method for extracting as many parallel sentences as possible. We work with the French-English language pair, but our method, which uses no external bilingual resources, can be applied to any other language pair. It comprises two steps. The first detects the article pairs most likely to contain translations. For this we use a neural network trained on a small dataset of articles aligned at the sentence level. The second step selects sentence pairs using another neural network whose outputs are then reinterpreted by a combinatorial optimization algorithm and an extension heuristic. Adding the roughly 560,000 sentence pairs extracted from Wikipedia to the training corpus of a baseline statistical machine translation system improves the quality of the translations produced. We make the aligned data and the extracted corpus available to the scientific community.
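The abstract does not say which combinatorial optimization algorithm is applied to the network outputs. As a loose illustration of the general idea (turning a matrix of pairwise scores into a consistent set of sentence pairs), the sketch below uses a simple greedy one-to-one selection as a stand-in. The function name, the 0.5 threshold and the score values are hypothetical.

```python
from typing import List, Sequence, Tuple


def select_pairs(scores: Sequence[Sequence[float]],
                 threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Greedily pick one-to-one (source, target) sentence pairs.

    scores[i][j] is assumed to be a classifier score for source sentence i
    and target sentence j. Greedy selection is a stand-in here, not the
    optimization algorithm used in the work described above.
    """
    candidates = [
        (score, i, j)
        for i, row in enumerate(scores)
        for j, score in enumerate(row)
        if score >= threshold
    ]
    # Highest score first; ties broken by position for determinism.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))

    used_src, used_tgt = set(), set()
    chosen: List[Tuple[int, int]] = []
    for _, i, j in candidates:
        if i not in used_src and j not in used_tgt:
            chosen.append((i, j))
            used_src.add(i)
            used_tgt.add(j)
    return sorted(chosen)


if __name__ == "__main__":
    # Invented scores for a 2 x 2 example.
    print(select_pairs([[0.9, 0.2], [0.8, 0.6]]))
```

Greedy selection does not guarantee the best total score; an exact assignment solver would be the natural replacement if global optimality matters.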
Abstract:
Statistical translation requires large quantities of parallel corpora. Obtaining such corpora involves automatic alignment at the sentence level. The alignment of parallel corpora received a great deal of attention in the 1980s, and the community considers this step solved. We show in this thesis that this is not the case and propose a new aligner, which we compare with state-of-the-art algorithms. Our aligner is simple and fast, and can align very large amounts of data. It often produces better results than those of the most sophisticated aligners. We analyze the robustness of our aligner with respect to the genre of the texts to be aligned and the noise they contain. To this end, our experiments fall into two main parts. In the first part, we work on the BAF corpus, where we measure alignment quality as a function of noise, which reaches 60%. In the second part, we work on the Europarl corpus, where we revisit the alignment procedure with which the Europarl corpus was prepared and show that better performance of statistical translation systems can be obtained by using our aligner.
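The abstract gives no detail on how the proposed aligner works. For readers unfamiliar with the task, the sketch below shows a generic length-based dynamic-programming sentence aligner. It is not the aligner described above, and the bead shapes, penalty values and cost function are assumptions chosen for the illustration.

```python
from typing import Dict, List, Optional, Tuple

# Allowed bead shapes (source count, target count) with hypothetical penalties.
BEADS: Dict[Tuple[int, int], float] = {
    (1, 1): 0.0,
    (2, 1): 3.0,
    (1, 2): 3.0,
    (1, 0): 10.0,
    (0, 1): 10.0,
}

Bead = Tuple[List[int], List[int]]


def align(src: List[str], tgt: List[str]) -> List[Bead]:
    """Align two sentence lists by minimizing character-length mismatch."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back: List[List[Optional[Tuple[int, int]]]] = [
        [None] * (m + 1) for _ in range(n + 1)
    ]
    cost[0][0] = 0.0

    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in BEADS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                len_src = sum(len(s) for s in src[i:ni])
                len_tgt = sum(len(t) for t in tgt[j:nj])
                candidate = cost[i][j] + abs(len_src - len_tgt) + penalty
                if candidate < cost[ni][nj]:
                    cost[ni][nj] = candidate
                    back[ni][nj] = (i, j)

    # Walk the back-pointers from the end to recover the bead sequence.
    beads: List[Bead] = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    beads.reverse()
    return beads


if __name__ == "__main__":
    # Synthetic sentences: the second source sentence is split in the target.
    print(align(["aaaa", "bbbbbbbb"], ["aaaa", "bbbb", "bbbb"]))
```

Each returned bead lists the indices of the source and target sentences grouped together; empty lists mark insertions or deletions, which is how noise in one of the two texts shows up in the output.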
Abstract:
This paper investigates certain training methods adopted in a Statistical Machine Translation (SMT) system from English to Malayalam. In English-Malayalam SMT, word-to-word translation is determined by training on the parallel corpus. Our primary goal is to improve the alignment model by reducing the number of possible alignments for all sentence pairs present in the bilingual corpus. Incorporating morphological information into the parallel corpus with the help of a part-of-speech tagger has produced better training results with improved accuracy.
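One way to picture how part-of-speech information can shrink the space of possible word alignments is to keep only the links whose two words carry the same tag. The sketch below is a hypothetical illustration of that idea, not the paper's procedure; the tag names and the placeholder target tokens are invented.

```python
from typing import List, Tuple

Tagged = List[Tuple[str, str]]  # (token, part-of-speech tag)


def all_links(src: Tagged, tgt: Tagged) -> List[Tuple[int, int]]:
    """Every possible word-to-word link between two tagged sentences."""
    return [(i, j) for i in range(len(src)) for j in range(len(tgt))]


def pos_compatible_links(src: Tagged, tgt: Tagged) -> List[Tuple[int, int]]:
    """Keep only links whose source and target tags are identical.

    Requiring identical tags is a simplifying assumption; a real system
    would need a mapping between the two languages' tag sets.
    """
    return [
        (i, j)
        for i, (_, src_tag) in enumerate(src)
        for j, (_, tgt_tag) in enumerate(tgt)
        if src_tag == tgt_tag
    ]


if __name__ == "__main__":
    # Placeholder target tokens (t0, t1, t2) stand in for Malayalam words.
    source = [("she", "PRON"), ("reads", "VERB"), ("books", "NOUN")]
    target = [("t0", "PRON"), ("t1", "NOUN"), ("t2", "VERB")]
    print(len(all_links(source, target)), "links before filtering")
    print(pos_compatible_links(source, target))
```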
Abstract:
In Statistical Machine Translation (SMT) from English to Malayalam, an unseen English sentence is translated into its equivalent Malayalam sentence using statistical models. An English-Malayalam parallel corpus is used in the training phase. Word-to-word alignments have to be established among the sentence pairs of the source and target languages before they are submitted for training. This paper deals with certain techniques that can be adopted to improve the alignment model of SMT. Methods for incorporating part-of-speech information into the bilingual corpus have eliminated many insignificant alignments. Identifying the named entities and cognates present in the sentence pairs has also proved advantageous when setting up the alignments. The presence of Malayalam words with predictable translations has likewise helped reduce insignificant alignments. Moreover, reducing the unwanted alignments has yielded better training results. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations, and the results are verified with the F-measure, BLEU and WER evaluation metrics.
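Of the three metrics named above, word error rate (WER) is the simplest to state: the word-level edit distance between hypothesis and reference, divided by the reference length. A minimal sketch follows, assuming whitespace tokenization; the function name and the sample strings are illustrative only and do not come from the paper.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(wer("a b c d", "a x c"))
```

Lower is better, and the value can exceed 1.0 when the hypothesis is much longer than the reference.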
Abstract:
Organisations are increasingly investing in complex technological innovations such as enterprise information systems with the aim of improving business operations and thereby gaining competitive advantage. However, the implementation of technological innovations tends to focus excessively on either technology innovation effectiveness (also known as system effectiveness) or the resulting operational effectiveness; focusing on only one of them is detrimental to long-term enterprise benefits because the real value of technological innovations is not achieved. The lack of research on the dimensions and performance objectives that organisations must focus on is the main reason for this misalignment. This research uses a three-stage methodological approach that combines qualitative and quantitative methods. Initial findings suggest that factors such as quality of information from technology innovation effectiveness, and quality and speed from operational effectiveness, are important and significantly correlated factors that promote alignment between technology innovation effectiveness and operational effectiveness.
Abstract:
The objective of the project “Value Alignment Process for Project Delivery” is to provide a catalyst and tools for reform in the building and construction industry to transform business-as-usual performance into exceptional performance. The outcomes of this project will benefit not only the construction industry but also the community as a whole, because a more sophisticated industry can deliver more effective use of assets, financing, operation and maintenance of facilities to suit the community’s needs. The research project consists of a study into best practice project delivery and the development of a suite of products, resources and services to guide project teams towards the best approach for a specific project. These resources will focus on promoting the principles that underlie best practice project delivery, rather than on identifying a particular delivery system. The need for such tools and resources becomes increasingly acute as the environment within which the construction industry operates becomes more complex, and as business and political imperatives shift to encompass or represent diverse stakeholder interests. To this end, this literature review looks at why it is essential to achieve transformation in the Australian construction industry in the context of its importance to the Australian economy. It investigates the concepts of ‘alignment’ and ‘value’ as they pertain to construction industry processes and relationships. It comprehensively reviews drivers of project excellence and best practice project delivery principles and looks at how clients approach the selection of project delivery systems. It critiques existing project delivery strategies and gives an overview of recent best practice initiatives. The literature review represents a milestone against the Project Agreement and forms a foundation document for this research project.
Abstract:
This paper discusses the different perceptions that first-year accounting students have of their tutorial activities and their engagement in assessment. As the literature suggests, unless participation in learning activities forms part of graded assessment, it is often difficult to engage students in these activities. Using an action research model, this paper reports a study of first-year accounting students' responses to action-oriented learning tasks in tutorials. The paper focuses on the importance of aligning curriculum objectives, learning and teaching activities, and assessment, i.e. the notion of constructive alignment. However, as the research findings indicate, without support at the institutional level, applying constructive alignment to facilitate quality student learning outcomes is a difficult task. The impacts of policy constraints on curriculum issues are therefore also discussed, focusing on the limitations faced by tutors and their lack of involvement in curriculum development.
Abstract:
A study into best practice project delivery and the development of a suite of products, resources and services to help guide clients and project teams towards the best approach for specific projects.