1000 results for spanish language
Abstract:
This paper describes the application of language translation technologies for generating bus information in Spanish Sign Language (LSE: Lengua de Signos Española). In this work, two main systems have been developed: the first for translating text messages from information panels and the second for translating spoken Spanish into natural conversations at the information point of the bus company. Both systems are made up of a natural language translator (for converting a word sentence into a sequence of LSE signs), and a 3D avatar animation module (for playing back the signs). For the natural language translator, two technological approaches have been analyzed and integrated: an example-based strategy and a statistical translator. When translating spoken utterances, it is also necessary to incorporate a speech recognizer for decoding the spoken utterance into a word sequence, prior to the language translation module. This paper includes a detailed description of the field evaluation carried out in this domain. This evaluation has been carried out at the customer information office in Madrid involving both real bus company employees and deaf people. The evaluation includes objective measurements from the system and information from questionnaires. In the field evaluation, the whole translation presents an SER (Sign Error Rate) of less than 10% and a BLEU greater than 90%.
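As a rough illustration of how an error rate such as SER can be computed, the sketch below assumes it is defined like word error rate: the edit distance between the reference and the produced sign sequences, divided by the reference length. The abstract does not give the formula, so this definition, the function name `sign_error_rate`, and the example glosses are all assumptions.

```python
def sign_error_rate(reference, hypothesis):
    """Edit distance between sign sequences divided by reference length.

    Assumption: SER is computed like word error rate (substitutions,
    insertions and deletions all cost 1); the source does not specify this.
    """
    if not reference:
        raise ValueError("reference must contain at least one sign")
    # prev[j] is the edit distance between reference[:i-1] and hypothesis[:j]
    prev = list(range(len(hypothesis) + 1))
    for i, ref_sign in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_sign in enumerate(hypothesis, start=1):
            cost = 0 if ref_sign == hyp_sign else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical gloss sequences, for illustration only.
reference = ["AUTOBUS", "LLEGAR", "CINCO", "MINUTO"]
hypothesis = ["AUTOBUS", "LLEGAR", "MINUTO"]
ser = sign_error_rate(reference, hypothesis)  # one deleted sign out of four
```

Under this assumed definition, an SER below 10% would mean fewer than one sign-level edit per ten reference signs.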
Abstract:
The convergence process among European academic degrees pursues the exchange of graduate students and the adaptation of university programs to social demand. Within the framework of the European Higher Education Area, European universities will need to be more competitive not only by increasing or maintaining student enrolment, but also in their academic performance. Thus, reinforcing English language education within university programs might play an important role in reaching these objectives. To this end, a comprehensive survey was conducted at the Agricultural Engineering School of Madrid (ETSIA), addressing issues such as the identification of the needs for bilingual instruction at ETSIA, the identification of the resources needed, and the interest and background in English language of students and professors (San José et al., 2013). The conclusions and recommendations to promote bilingual instruction at the ETSIA, taking into account the approaches followed by other Spanish universities, are presented in this work.
Abstract:
This paper presents a dynamic LM adaptation based on the topic that has been identified in a speech segment. We use LSA and the given topic labels in the training dataset to obtain and use the topic models. We propose a dynamic language model adaptation to improve the recognition performance in a two-stage ASR system. The final stage makes use of the topic identification with two variants: the first one uses just the most probable topic, and the other depends on the relative distances of the topics that have been identified. We perform the adaptation of the LM as a linear interpolation between a background model and a topic-based LM. The interpolation weight is dynamically adapted according to different parameters. The proposed method is evaluated on the Spanish partition of the EPPS speech database. We achieved a relative reduction in WER of 11.13% over the baseline system, which uses a single background LM.
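The interpolation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the way topic scores are normalized in the "soft" variant, and all probability values are assumptions made for the example.

```python
def interpolate(p_background, p_topic, lam):
    """Linear interpolation: (1 - lam) * P_bg(w|h) + lam * P_topic(w|h)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * p_background + lam * p_topic


def soft_topic_probability(topic_probs, topic_scores):
    """Mix several topic LMs with weights proportional to their scores.

    Assumption: scores are non-negative similarities (higher = closer
    topic); the abstract only says the weights depend on relative distances.
    """
    total = sum(topic_scores.values())
    if total <= 0:
        raise ValueError("at least one topic score must be positive")
    return sum(topic_scores[t] / total * topic_probs[t] for t in topic_scores)


# Hypothetical probabilities of one word under each model.
p_bg = 0.010
topic_probs = {"economy": 0.040, "agriculture": 0.020}
topic_scores = {"economy": 3.0, "agriculture": 1.0}

# Variant 1: use only the most probable topic.
best_topic = max(topic_scores, key=topic_scores.get)
p_hard = interpolate(p_bg, topic_probs[best_topic], lam=0.5)

# Variant 2: weight every identified topic by its relative score.
p_soft = interpolate(p_bg, soft_topic_probability(topic_probs, topic_scores), lam=0.5)
```

Here `lam` is fixed at 0.5 for simplicity; in the paper the weight is adapted dynamically, by a rule the abstract does not detail.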
Abstract:
In recent years, coinciding with adjustments to the Bologna process, many European universities have attempted to improve their international profile by increasing course offerings in English. According to the Institute of International Education (IIE), Spain has notably increased its English-taught higher education programs, ranking fifth in the list of European countries by number of English-taught Master's programs in 2013. This article presents the goals and preliminary results of an on-going innovative education project (TechEnglish) that aims to promote course offerings in English at the Technical University of Madrid (Universidad Politécnica de Madrid, UPM). The UPM is the oldest and largest of all Technical Universities in Spain. It offers graduate and postgraduate programs that cover all the engineering disciplines as well as architecture. Currently, the UPM has no specific bilingual/multilingual program to promote teaching in English, although there is an Educational Model Whitepaper (with a focus on undergraduate degrees) that promotes the development of activities like an International Semester or a unique shared curriculum. The TechEnglish project is an attempt to foster courses taught in English at 7 UPM Technical Schools, including students and 80 faculty members. Four tasks were identified: (1) to design a university wide framework to increase course offerings, (2) to identify administrative difficulties, (3) to increase visibility of courses offered, and (4) to disseminate the results of the project. First, to design a program we analyzed existing programs at other Spanish universities, and other projects and efforts already under way at the UPM. 
A total of 13 plans were analyzed and classified according to their relation with students (learning), professors (teaching), administration, course offerings, other actors/institutions within the university (e.g., language departments), funds and projects, dissemination activities, mobility plans and quality control. Second, to begin to identify administrative and organizational difficulties in the implementation of teaching in English, we first estimated the current and potential course offerings at the undergraduate level at the UPM using a survey (student, teacher and administrative demand, level of English and willingness to work in English). Third, to make the course offerings more attractive for both Spanish and international students we examined the way the most prestigious universities in Spain and in Europe try to improve the visibility of their academic offerings in English. Finally, to disseminate the results of the project we created a web page and a workspace on the Moodle education platform and prepared conferences and workshops within the UPM. Preliminary results show that increasing course offerings in English is an important step to promote the internationalization of the University. The main difficulties identified at the UPM were related to how to acknowledge/certify the departments, teachers or students involved in English courses, how students should register for the courses, how departments should split and schedule the courses (Spanish and English), and the lack of qualified personnel. A concerted effort could be made to increase the visibility of English-taught programs offered on-line.
Abstract:
This paper reflects upon the increasing diversity of the United States and the subsequent necessity for mental health providers who can provide psychotherapy services in more than one language. A review of the current literature on clinicians who provide bilingual services highlights the challenges and rewards of working in a second language. The literature focuses on the experiences of clinicians who are bilingual in English and Spanish. However, there is little to no research concerning clinicians who can provide psychotherapy in three languages. This writer speaks of her experience growing up in a bilingual Vietnamese-English household in Southern California and her journey of becoming fluent in Spanish. Lastly, she provides recommendations to training programs on how to support trainees who aim to provide psychotherapy services in multiple languages.
Abstract:
Abundant research has shown that poverty has negative influences on young children's academic and psychosocial development, and unfortunately, disparities in school readiness between low- and high-income children can be seen as early as the first year of life. The largest federal early care and education intervention for these vulnerable children is Early Head Start (EHS). To diminish these disparate child outcomes, EHS seeks to provide community-based, flexible programming for infants and toddlers and their families. Given how recently these programs have been offered, little is known about the nuances of how EHS impacts infant and toddler language and psychosocial development. Using a framework of Community Based Participatory Research (CBPR), this paper had 5 goals: 1) to characterize the associations between domain-specific and cumulative risk and child outcomes, 2) to validate and explore these risk-outcome associations separately for Children of Hispanic immigrants (COHIs), 3) to explore relationships among family characteristics, multiple environmental factors, and dosage patterns in different EHS program types, 4) to examine the relationship between EHS dosage and child outcomes, and 5) to examine how EHS compliance impacts child internalizing and externalizing behaviors and emerging language abilities. Results of the current study showed that risks were differentially related to child outcomes. Poor maternal mental health was related to child internalizing and externalizing behaviors, but not related to emerging child language skills. Although child language skills were not related to maternal mental health, they were related to economic hardship. Additionally, parent-level Spanish use and heritage orientation were associated with positive child outcomes. Results also showed that these relationships differed when COHIs and children with native-born parents were examined separately.
Further, unique patterns emerged for EHS program use; for example, families who participated in home-based care were less likely to comply with EHS attendance requirements. These findings provide tangible suggestions for EHS stakeholders: namely, the need to develop effective programming that targets engagement for diverse families enrolled in EHS programs.
Abstract:
In this paper we present a whole Natural Language Processing (NLP) system for Spanish. The core of this system is the parser, which uses the grammatical formalism Lexical-Functional Grammars (LFG). Another important component of this system is the anaphora resolution module. To solve the anaphora, this module contains a method based on linguistic information (lexical, morphological, syntactic and semantic), structural information (anaphoric accessibility space in which the anaphor obtains the antecedent) and statistical information. This method is based on constraints and preferences and solves pronouns and definite descriptions. Moreover, this system fits dialogue and non-dialogue discourse features. The anaphora resolution module uses several resources, such as a lexical database (Spanish WordNet) to provide semantic information and a POS tagger providing the part of speech for each word and its root to make this resolution process easier.
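The "constraints and preferences" pattern mentioned in this abstract can be illustrated with a toy sketch. It is not the system's method, which also uses syntactic, semantic, accessibility-space and statistical information. The `Mention` class, the agreement constraint and the recency preference are simplifying assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    text: str
    gender: str    # "m" or "f"
    number: str    # "sg" or "pl"
    position: int  # token index in the discourse


def resolve(pronoun, candidates):
    """Return the preferred antecedent for a pronoun, or None.

    Constraints (hard filters): the candidate precedes the pronoun and
    agrees with it in gender and number.
    Preference (ranking): the closest surviving candidate wins.
    Both rules are illustrative assumptions, not the paper's rule set.
    """
    viable = [c for c in candidates
              if c.position < pronoun.position
              and c.gender == pronoun.gender
              and c.number == pronoun.number]
    if not viable:
        return None
    return max(viable, key=lambda c: c.position)


# Hypothetical mentions from a short Spanish discourse.
candidates = [Mention("María", "f", "sg", 0), Mention("Juan", "m", "sg", 3)]
pronoun = Mention("ella", "f", "sg", 5)
antecedent = resolve(pronoun, candidates)
```

In this example "Juan" is closer to the pronoun but fails the gender constraint, so "María" is selected.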
Abstract:
There is no question nowadays as to the international and powerful status of English on a global scale and, consequently, as to its presence in non-English speaking countries at different levels. Linguistically speaking, English is one of the languages that have most influenced Spanish throughout its history, especially since the late 1960s. In this study, the impact of English on Spanish is considered in the language of sports; in particular, sports Anglicisms and false Anglicisms are analysed. Due attention is paid to the different forms that an Anglicism may adopt and to which of those forms are more widely accepted or rejected by prescriptivists and speakers at large, in the light of a contrastive analysis of their appearance in the Nuevo diccionario de anglicismos, the Diccionario de la Real Academia Española and the Corpus de Referencia del Español Actual.
Abstract:
In this paper we describe Fénix, a data model for exchanging information between Natural Language Processing applications. The format proposed is intended to be flexible enough to cover both current and future data structures employed in the field of Computational Linguistics. The Fénix architecture is divided into four separate layers: conceptual, logical, persistence and physical. This division provides a simple interface that abstracts users from low-level implementation details, such as the programming languages and data storage employed, allowing them to focus on the concepts and processes to be modelled. The Fénix architecture is accompanied by a set of programming libraries to facilitate access to and manipulation of the structures created in this framework. We also show how this architecture has already been successfully applied in different research projects.
Abstract:
Hospitals attached to the Spanish Ministry of Health currently use the International Classification of Diseases 9 Clinical Modification (ICD9-CM) to classify health discharge records. At present, this work is done manually by experts. This paper tackles the automatic classification of real Discharge Records in Spanish following the ICD9-CM standard. The challenge is that the Discharge Records are written in spontaneous language. We explore several machine learning techniques to deal with the classification problem. Random Forest proved the most competitive, achieving an F-measure of 0.876.
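For readers unfamiliar with the metric, the F-measure is the weighted harmonic mean of precision and recall. The sketch below computes it for a single class label. The abstract does not say how scores were averaged across ICD9-CM codes, so the per-label formulation, the function name and the toy data are assumptions for illustration.

```python
def f_measure(gold, predicted, label, beta=1.0):
    """F-measure for one class label from paired gold/predicted labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0  # no correct predictions for this label
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy data: the codes serve only as example labels, not as real results.
gold = ["250.00", "401.9", "250.00", "486", "250.00"]
predicted = ["250.00", "250.00", "250.00", "486", "401.9"]
score = f_measure(gold, predicted, label="250.00")
```

With two true positives, one false positive and one false negative, precision and recall are both 2/3, and so is the F-measure.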
Abstract:
This introduction provides an overview of the state-of-the-art technology in Applications of Natural Language to Information Systems. Specifically, we analyze the need for such technologies to successfully address the new challenges of modern information systems, in which the exploitation of the Web as a main data source on business systems becomes a key requirement. We also discuss the reasons why Human Language Technologies themselves have shifted their focus onto new areas of interest very directly linked to the development of technology for the treatment and understanding of Web 2.0. These new technologies are expected to be future interfaces for the new information systems to come. Moreover, we review current topics of interest to this research community and present the selection of manuscripts that have been chosen by the program committee of the NLDB 2011 conference as representative cornerstone research works, especially highlighting their contribution to the advancement of such technologies.
Abstract:
This paper addresses the problem of the automatic recognition and classification of temporal expressions and events in human language. Efficacy in these tasks is crucial if the broader task of temporal information processing is to be successfully performed. We analyze whether the application of semantic knowledge to these tasks improves the performance of current approaches. We therefore present and evaluate a data-driven approach as part of a system: TIPSem. Our approach uses lexical semantics and semantic roles as additional information to extend classical approaches which are principally based on morphosyntax. The results obtained for English show that semantic knowledge aids in temporal expression and event recognition, achieving an error reduction of 59% and 21%, while in classification the contribution is limited. From the analysis of the results it may be concluded that the application of semantic knowledge leads to more general models and aids in the recognition of temporal entities that are ambiguous at shallower language analysis levels. We also discovered that lexical semantics and semantic roles have complementary advantages, and that it is useful to combine them. Finally, we carried out the same analysis for Spanish. The results obtained show comparable advantages. This supports the hypothesis that applying the proposed semantic knowledge may be useful for different languages.
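An "error reduction" figure of this kind is commonly computed as the drop in error relative to the baseline error. The sketch below shows that calculation. The abstract does not state the exact definition used, and the error values are hypothetical, so treat both as assumptions.

```python
def relative_error_reduction(baseline_error, new_error):
    """Fraction of the baseline error removed by the new system."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical values: error taken as 1 - F1, e.g. F1 rising from 0.80 to 0.90.
reduction = relative_error_reduction(baseline_error=0.20, new_error=0.10)
```

In this made-up case half of the baseline error is removed, which is a 50% error reduction.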
Abstract:
Natural Language Interfaces to Query Databases (NLIDBs) have been an active research field since the 1960s. However, they have not been widely adopted. This article explores some of the biggest challenges and approaches for building NLIDBs and proposes techniques to reduce implementation and adoption costs. The article describes AskMe*, a new system that leverages some of these approaches and adds an innovative feature: query-authoring services, which lower the entry barrier for end users. Advantages of these approaches are proven with experimentation. Results confirm that, even when AskMe* is automatically reconfigurable against multiple domains, its accuracy is comparable to domain-specific NLIDBs.
Abstract:
One of the main challenges to be addressed in text summarization concerns the detection of redundant information. This paper presents a detailed analysis of three methods for achieving this goal. The proposed methods rely on different levels of language analysis: lexical, syntactic and semantic. Moreover, they are also analyzed for detecting relevance in texts. The results show that semantic-based methods are able to detect up to 90% of redundancy, compared to only 19% for lexical-based ones. This is also reflected in the quality of the generated summaries, with better summaries obtained when syntactic- or semantic-based approaches are employed to remove redundancy.
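To make the limitation of lexical-level detection concrete, the sketch below uses a generic word-overlap (Jaccard) filter. This is not the paper's lexical method, which the abstract does not describe. The function names, the 0.5 threshold and the sentences are assumptions for illustration.

```python
def jaccard(a, b):
    """Word-set overlap between two sentences (0 = disjoint, 1 = identical)."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def remove_redundant(sentences, threshold=0.5):
    """Keep a sentence only if it overlaps little with every kept sentence."""
    kept = []
    for sentence in sentences:
        if all(jaccard(sentence, k) < threshold for k in kept):
            kept.append(sentence)
    return kept


# Hypothetical sentences: the second is a near copy, the third a paraphrase.
sentences = [
    "the bus arrives at noon",
    "the bus arrives at noon today",
    "the coach gets in at midday",
]
kept = remove_redundant(sentences)
```

The near copy is dropped, but the paraphrase survives because it shares few surface words with the first sentence. Syntactic or semantic analysis is meant to close this kind of gap.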
Abstract:
N.B. reproduced with permission of Peter Lang Verlag. For citation, please, use the original reference, that is Campos Pardillos, M.A. and Balteiro Fernández, I. 2009. “Building bridges… and properties aplenty: cultural problems in Spanish real estate marketing for prospective British buyers”. In: Guillén-Nieto, V., C. Marimón-Llorca and C. Vargas-Sierra. Eds. Intercultural Business Communication and Simulation and Gaming Methodology. Bern: Peter Lang. Pp. 155-174.