939 results for cross-language speaker recognition
Abstract:
This paper describes a substantial effort to build a real-time interactive multimodal dialogue system with a focus on emotional and non-verbal interaction capabilities. The work is motivated by the aim to provide technology with competences in perceiving and producing the emotional and non-verbal behaviours required to sustain a conversational dialogue. We present the Sensitive Artificial Listener (SAL) scenario as a setting which seems particularly suited for the study of emotional and non-verbal behaviour, since it requires only very limited verbal understanding on the part of the machine. This scenario allows us to concentrate on non-verbal capabilities without having to address at the same time the challenges of spoken language understanding, task modeling etc. We first summarise three prototype versions of the SAL scenario, in which the behaviour of the Sensitive Artificial Listener characters was determined by a human operator. These prototypes served the purpose of verifying the effectiveness of the SAL scenario and allowed us to collect data required for building system components for analysing and synthesising the respective behaviours. We then describe the fully autonomous integrated real-time system we created, which combines incremental analysis of user behaviour, dialogue management, and synthesis of speaker and listener behaviour of a SAL character displayed as a virtual agent. We discuss principles that should underlie the evaluation of SAL-type systems. Since the system is designed for modularity and reuse, and since it is publicly available, the SAL system has potential as a joint research tool in the affective computing research community.
Abstract:
Background and aims: Machine learning techniques for the text mining of cancer-related clinical documents have not been sufficiently explored. Here some techniques are presented for the pre-processing of free-text breast cancer pathology reports, with the aim of facilitating the extraction of information relevant to cancer staging.
Materials and methods: The first technique was implemented using the freely available software RapidMiner to classify the reports according to their general layout: ‘semi-structured’ and ‘unstructured’. The second technique was developed using the open source language engineering framework GATE and aimed at the prediction of chunks of the report text containing information pertaining to the cancer morphology, the tumour size, its hormone receptor status and the number of positive nodes. The classifiers were trained and tested respectively on sets of 635 and 163 manually classified or annotated reports, from the Northern Ireland Cancer Registry.
Results: The best result of 99.4% accuracy – which included only one semi-structured report predicted as unstructured – was produced by the layout classifier with the k-nearest neighbour algorithm, using the binary term occurrence word vector type with stopword filter and pruning. For chunk recognition, the best results were found using the PAUM algorithm with the same parameters for all cases, except for the prediction of chunks containing cancer morphology. For semi-structured reports, precision ranged from 0.97 to 0.94 and recall from 0.92 to 0.83, while for unstructured reports precision ranged from 0.91 to 0.64 and recall from 0.68 to 0.41. Poor results were found when the classifier was trained on semi-structured reports but tested on unstructured ones.
Conclusions: These results show that it is possible and beneficial to predict the layout of reports and that the accuracy of prediction of which segments of a report may contain certain information is sensitive to the report layout and the type of information sought.
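The layout classification step described above can be illustrated with a minimal sketch. It is not the RapidMiner process used in the study: the stopword list, the toy reports, the pruning threshold and the use of Hamming distance are all assumptions made for illustration. It shows binary term occurrence vectors with a stopword filter and document-frequency pruning, followed by a k-nearest neighbour vote.

```python
from collections import Counter

# Hypothetical stopword list and toy reports; none of this comes from the study.
STOPWORDS = {"the", "of", "a", "an", "is", "and", "in"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,:;()").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def build_vocabulary(docs, min_docs=2):
    """Pruning: keep only terms that occur in at least `min_docs` documents."""
    doc_freq = Counter(term for doc in docs for term in tokenize(doc))
    return sorted(t for t, n in doc_freq.items() if n >= min_docs)


def binary_vector(text, vocab):
    """Binary term occurrence: 1 if the term appears in the text, else 0."""
    terms = tokenize(text)
    return [1 if t in terms else 0 for t in vocab]


def knn_predict(train_vecs, train_labels, vec, k=3):
    """Majority vote among the k training vectors closest in Hamming distance."""
    dists = sorted(
        (sum(a != b for a, b in zip(tv, vec)), label)
        for tv, label in zip(train_vecs, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


train_docs = [
    "Tumour size: 22 mm. Nodes positive: 2. ER status: positive.",
    "Tumour size: 15 mm. Nodes positive: 0. ER status: negative.",
    "Tumour size: 30 mm. Nodes positive: 1. ER status: positive.",
    "Sections show an invasive carcinoma measuring about two centimetres.",
    "Sections show a carcinoma; several nodes were examined microscopically.",
    "Sections show invasive carcinoma measuring three centimetres across.",
]
train_labels = ["semi-structured"] * 3 + ["unstructured"] * 3

vocab = build_vocabulary(train_docs)
train_vecs = [binary_vector(d, vocab) for d in train_docs]

new_report = "Tumour size: 18 mm. Nodes positive: 3. ER status: negative."
predicted = knn_predict(train_vecs, train_labels, binary_vector(new_report, vocab))
print(predicted)
```

In this toy setting the field labels ("size", "status", "nodes") dominate the vocabulary after pruning, which is one plausible reason a simple occurrence-based representation can separate the two layouts.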
Abstract:
The goal of this thesis is twofold. Firstly, it investigates the actual, native use of spatial-deictic demonstratives in Japanese, Finnish and Swedish. Secondly, it investigates and elucidates the interlanguage of Finnish-speaking and Swedish-speaking learners of Japanese regarding their use of Japanese spatial-deictic demonstratives, in the light of the respective native use and in comparison to the descriptions of demonstratives in the teaching materials used. Thus, the present study deals with analyses of two sets of empirical data: data produced by native-speaking informants (L1 data) and data produced by language learners (L2 data). These were elicited by Discourse Completion Tasks (DCTs) designed by the author, who also collected the data and analyzed them using both quantitative and qualitative methods. The results showed that the actual use of demonstratives by the native informants was not always in accordance with the way described in grammars. The typological similarities between Japanese and Finnish were in this study not reflected in the native use of demonstratives, and some uses were not solely based on the spatial relations between the referent, the speaker and the addressee, but rather on social-interactional factors. The main findings regarding the learner data revealed some differences in the usage rate of the demonstratives between the two Finnish-speaking groups and the one Swedish-speaking learner group studied. There were, however, no particular differences found between them regarding the type of demonstrative used. It is suggested that these differences are first and foremost connected both with the teaching materials used and the more or less heterogeneous linguistic environment in which the learners reside, and only thereafter with the typological similarities or differences between their respective native languages, Finnish and Swedish, and the target language, Japanese.
It is further argued that the learners’ use of the different Japanese demonstratives, that is the type of demonstrative used, could be explained in terms of familiarity with the grammar. That is, when the situations used in the DCTs were exemplified in teaching materials and were familiar to them, the learners seemed to use Japanese demonstratives as they are described in the teaching materials and as the native Japanese speakers use them. When the situations used in the DCTs were not exemplified in the teaching materials, the learners seem to rely more on their native language. The results, thus, suggest that the learners’ interlanguage is influenced by the grammar of the target language known to the learners, but also by the number of languages (or varieties) that the learners have contact with at the time of learning. The results of the present study have implications for the teaching of Japanese in at least two ways. Firstly, the importance of grammar instruction must be emphasized since its effect on the learners’ language is apparent. Secondly, the contents of teaching materials should be revised on the basis of the native speakers’ actual use of the grammar.
Abstract:
BACKGROUND: Research shows evidence for the importance of physical and emotional closeness for the infant, the parent and the infant-parent dyad. Less is known about how, when and why parents experience emotional closeness to their infants in a neonatal unit (NU); exploring this was the aim of this study. METHODS: A qualitative study using a salutogenic approach to focus on positive health and wellbeing was undertaken in three NUs: one each in Sweden, England and Finland. An 'emotional closeness' form was devised, which asked parents to describe moments/situations when, how and why they had felt emotionally close to their infant. Data for 23 parents of preterm infants were analyzed using thematic networks analysis. RESULTS: A global theme of 'pathways for emotional closeness' emerged from the data set. This concept related to how emotional, physical, cognitive and social influences led to feelings of emotional closeness between parents and their infants. The five underpinning organising themes are: embodied recognition through the power of physical closeness; reassurance of, and contributing to, infant wellness; understanding the present and the past; feeling engaged in the day to day; and spending time and bonding as a family. CONCLUSION: These findings generate important insights into why, how and when parents feel emotionally close. This knowledge contributes to an increased awareness of how to support parents of premature infants to form positive and loving relationships with their infants. Health care staff should create a climate where parents' emotions and their emotional journey are individually supported.
Abstract:
Introduction: Reporting guidelines (e.g. CONSORT) have been developed as tools to improve quality and reduce bias in reporting research findings. Trial registration has been recommended for countering selective publication. The International Committee of Medical Journal Editors (ICMJE) encourages the implementation of reporting guidelines and trial registration as uniform requirements (URM). For the last two decades, however, biased reporting and insufficient registration of clinical trials have been identified in several literature reviews and other investigations. No study has so far investigated the extent to which author instructions in psychiatry journals encourage following reporting guidelines and trial registration. Method: Psychiatry journals were identified from the 2011 Journal Citation Report. Information given in the author instructions and during the submission procedure of all journals was assessed on whether major reporting guidelines, trial registration and the ICMJE's URM in general were mentioned and adherence recommended. Results: We included 123 psychiatry journals (English and German language) in our analysis. A minority recommend or require 1) following the URM (21%), 2) adherence to reporting guidelines such as CONSORT, PRISMA, STROBE (23%, 7%, 4%), or 3) registration of clinical trials (34%). The subsample of the top-10 psychiatry journals (ranked by impact factor) provided much better but still improvable rates. For example, 70% of the top-10 psychiatry journals do not ask for the specific trial registration number. Discussion: Under the assumption that better reported and better registered clinical research that does not lack substantial information will improve the understanding, credibility, and unbiased translation of clinical research findings, several stakeholders including readers (physicians, patients), authors, reviewers, and editors might benefit from improved author instructions in psychiatry journals.
A first step of improvement would consist in requiring adherence to the broadly accepted reporting guidelines and to trial registration.
Abstract:
Two out of three English Language Learners (ELLs) graduate from secondary schools nationwide. Of the nearly five million ELLs in public schools, Spanish is the first language of more than 70%. In order to understand and resolve this phenomenon and in an effort to increase the number of graduates, this research examined what high school Latino ELLs identified as the major external and internal factors that support or challenge them on the graduation pathway. The study used a student survey of 32 quantitative and qualitative questions, as well as student focus groups. Both the survey and the focus groups were conducted in English and Spanish. The questions considered the following factors: 1) value of education; 2) expectations in achieving their long-term goals; 3) current education levels; 4) expectations before coming to the United States; 5) family obligations; and 6) future aspirations. The survey was administered to 159 Latino ELLs enrolled in grades 9-12. Research took place at three high schools that provide English for Speakers of Other Languages (ESOL) classes in a large school system in the Mid-Atlantic region. The three schools involved in the study have more than 1,500 ELLs. Two of the schools had large ESOL instructional programs, and one school had a comparatively smaller ESOL program. The majority of students surveyed were from El Salvador (72%) and Guatemala (12.6%). Using Qualtrics, an independent facilitator and a bilingual translator administered the online survey tool to the students during their ESOL classes. Two weeks later, the researcher hosted three follow-up focus groups, totaling 37 students drawn from those who took the survey. The focus groups were conducted at the three schools by the lead researcher and the translator.
The purpose of the focus groups was to obtain deeper insight into how secondary-age Latino ELLs defined success in school, what they identified as their support factors, and how previous and present experiences helped or hindered their goals. Ten recommendations were developed from the research findings and from what the students identified; they range from suggested policy updates to cross-cultural/equity training for students and staff.
Abstract:
Spelling is an important literacy skill, and learning to spell is an important component of learning to write. Learners with strong spelling skills also exhibit greater reading, vocabulary, and orthographic knowledge than those with poor spelling skills (Ehri & Rosenthal, 2007; Ehri & Wilce, 1987; Rankin, Bruning, Timme, & Katkanant, 1993). English, being a deep orthography, has inconsistent sound-to-letter correspondences (Seymour, 2005; Ziegler & Goswami, 2005). This poses a great challenge for learners in gaining spelling fluency and accuracy. The purpose of the present study is to examine cross-linguistic transfer of English vowel spellings in Spanish-speaking adult ESL learners. The research participants were 129 Spanish-speaking adult ESL learners and 104 native English-speaking GED students enrolled in a community college located in the South Atlantic region of the United States. The adult ESL participants were in classes at three different levels of English proficiency: advanced, intermediate, and beginning. An experimental English spelling test was administered to both the native English-speaking and ESL participants. In addition, the adult ESL participants took the standardized spelling tests to rank their spelling skills in both English and Spanish. The data were analyzed using robust regression and Poisson regression procedures, Mann-Whitney test, and descriptive statistics. The study found that both Spanish spelling skills and English proficiency are strong predictors of English spelling skills. Spanish spelling is also a strong predictor of level of L1-influenced transfer. More proficient Spanish spellers made significantly fewer L1-influenced spelling errors than less proficient Spanish spellers. L1-influenced transfer of spelling knowledge from Spanish to English likely occurred in three vowel targets (/ɑɪ/ spelled as ae, ai, or ay, /ɑʊ/ spelled as au, and /eɪ/ spelled as e). 
The ESL participants and the native English-speaking participants produced highly similar error patterns of English vowel spellings when the errors did not indicate L1-influenced transfer, which implies that the two groups might follow similar trajectories of developing English spelling skills. The findings may help guide future researchers or practitioners to modify and develop instructional spelling interventions that meet the needs of adult ESL learners and help them gain English spelling competence.
Abstract:
Language provides an interesting lens to look at state-building processes because of its cross-cutting nature. For example, in addition to its symbolic value and appeal, a national language has other roles in the process, including: (a) becoming the primary medium of communication which permits the nation to function efficiently in its political and economic life, (b) promoting social cohesion, allowing the nation to develop a common culture, and (c) forming a primordial basis for self-determination. Moreover, because of its cross-cutting nature, language interventions are rarely isolated activities. Languages are adopted by speakers, taking root in and spreading between communities because they are legitimated by legislation, and then reproduced through institutions like the education and military systems. Pádraig Ó Riagáin (1997) makes a case for this, observing that “Language policy is formulated, implemented, and accomplishes its results within a complex interrelated set of economic, social, and political processes which include, inter alia, the operation of other non-language state policies” (p. 45). In the Turkish case, the foundational role of language in the formation of the Turkish nation-state, together with its linkages to human rights issues, raises interesting questions about how socio-cultural practices become reproduced through institutional infrastructure formation. This dissertation is a country-level case study looking at Turkey’s nation-state building process through the lens of its language and education policy development processes, with a focus on the early years of the Republic between 1927 and 1970. This project examines how different groups self-identified or were identified (as the case may be) in official Turkish statistical publications (e.g., the Turkish annual statistical yearbooks and the population censuses) during that time period when language and ethnicity data was made publicly available.
The overarching questions this dissertation explores include: 1. What were the geo-political conditions surrounding the development of, and influencing, the Turkish government’s language and education policies? 2. Are there any observable patterns in the geo-spatial distribution of language, literacy, and education participation rates over time? In what ways are these traditionally linked variables (language, literacy, education participation) problematic? 3. What do changes in population identifiers, e.g., language and ethnicity, suggest about the government’s approach towards nation-state building through the construction of a civic Turkish identity and institution building? Archival secondary source data was digitized and aggregated by categories relevant to this project, at national and provincial levels and over the course of time (primarily between 1927 and 2000). The data was then re-aggregated into values that could be longitudinally compared and then layered on aspatial administrative maps. This dissertation contributes to the existing body of social policy literature by taking an interdisciplinary approach in looking at the larger socio-economic contexts in which language and education policies are produced.
Abstract:
Is phraseology the third articulation of language? Fresh insights into a theoretical conundrum
Jean-Pierre Colson, University of Louvain (Louvain-la-Neuve, Belgium)
Although the notion of phraseology is now used across a wide range of linguistic disciplines, its definition and the classification of phraseological units remain a subject of intense debate. It is generally agreed that phraseology implies polylexicality, but this term is problematic as well, because it brings us back to one of the most controversial topics in modern linguistics: the definition of a word. On the other hand, another widely accepted principle of language is the double articulation or duality of patterning (Martinet 1960): the first articulation consists of morphemes and the second of phonemes. The very definition of morphemes, however, also poses several problems, and the situation becomes even more confused if we wish to take phraseology into account. In this contribution, I will take the view that a corpus-based and computational approach to phraseology may shed some new light on this theoretical conundrum. A better understanding of the basic units of meaning is necessary for more efficient language learning and translation, especially in the case of machine translation. Previous research (Colson 2011, 2012, 2013, 2014; Corpas Pastor 2000, 2007, 2008, 2013, 2015; Corpas Pastor & Leiva Rojo 2011; Leiva Rojo 2013) has shown the paramount importance of phraseology for translation. A tentative step towards a coherent explanation of the role of phraseology in language has been proposed by Mejri (2006): it is postulated that a third articulation of language intervenes at the level of words, including simple morphemes, sequences of free and bound morphemes, but also phraseological units.
I will present results from experiments with statistical associations of morphemes across several languages, and point out that (mainly) isolating languages such as Chinese are interesting for a better understanding of the interplay between morphemes and phraseological units. Named entities, in particular, are an extreme example of intertwining cultural, statistical and linguistic elements. Other examples show that the many borrowings and influences that characterize European languages tend to give a somewhat blurred vision of the interplay between morphology and phraseology. From a statistical point of view, the cpr-score (Colson 2016) provides a methodology for adapting the automatic extraction of phraseological units to the morphological structure of each language. The results obtained can therefore be used for testing hypotheses about the interaction between morphology, phraseology and culture. Experiments with the cpr-score on the extraction of Chinese phraseological units show that results depend on how the basic units of meaning are defined: a morpheme-based approach yields good results, which corroborates the claim by Beck and Mel'čuk (2011) that the association of morphemes into words may be similar to the association of words into phraseological units. A cross-linguistic experiment carried out for English, French, Spanish and Chinese also reveals that the results are quite compatible with Mejri’s hypothesis (2006) of a third articulation of language. Such findings, if confirmed, also corroborate the notion of statistical semantics in language. To illustrate this point, I will present the PhraseoRobot (Colson 2016), a computational tool for extracting phraseological associations around key words from the media, such as Brexit. 
The results confirm a previous study on the term globalization (Colson 2016): a significant part of sociolinguistic associations prevailing in the media is related to phraseology in the broad sense, and can therefore be partly extracted by means of statistical scores.
References
Beck, D. & I. Mel'čuk (2011). Morphological phrasemes and Totonacan verbal morphology. Linguistics 49/1: 175-228.
Colson, J.-P. (2011). La traduction spécialisée basée sur les corpus : une expérience dans le domaine informatique. In: Sfar, I. & S. Mejri, La traduction de textes spécialisés : retour sur des lieux communs. Synergies Tunisie n° 2. Gerflint, Agence universitaire de la Francophonie, p. 115-123.
Colson, J.-P. (2012). Traduire le figement en langue de spécialité : une expérience de phraséologie informatique. In: Mogorrón Huerta, P. & S. Mejri (dirs.), Lenguas de especialidad, traducción, fijación / Langues spécialisées, figement et traduction. Encuentros Mediterráneos / Rencontres Méditerranéennes, N°4. Universidad de Alicante, p. 159-171.
Colson, J.-P. (2013). Pratique traduisante et idiomaticité : l’importance des structures semi-figées. In: Mogorrón Huerta, P., Gallego Hernández, D., Masseau, P. & Tolosa Igualada, M. (eds.), Fraseología, Opacidad y Traducción. Studien zur romanischen Sprachwissenschaft und interkulturellen Kommunikation (Herausgegeben von Gerd Wotjak). Frankfurt am Main, Peter Lang, p. 207-218.
Colson, J.-P. (2014). La phraséologie et les corpus dans les recherches traductologiques. Communication lors du colloque international Europhras 2014, Association Européenne de Phraséologie. Université de Paris Sorbonne, 10-12 septembre 2014.
Colson, J.-P. (2016). Set phrases around globalization : an experiment in corpus-based computational phraseology. In: F. Alonso Almeida, I. Ortega Barrera, E. Quintana Toledo and M. Sánchez Cuervo (eds.), Input a Word, Analyse the World: Selected Approaches to Corpus Linguistics. Newcastle upon Tyne: Cambridge Scholars Publishing, p. 141-152.
Corpas Pastor, G. (2000). Acerca de la (in)traducibilidad de la fraseología. In: G. Corpas Pastor (ed.), Las lenguas de Europa: Estudios de fraseología, fraseografía y traducción. Granada: Comares, p. 483-522.
Corpas Pastor, G. (2007). Europäismen - von Natur aus phraseologische Äquivalente? Von blauem Blut und sangre azul. In: M. Emsel y J. Cuartero Otal (eds.), Brücken: Übersetzen und interkulturelle Kommunikationen. Festschrift für Gerd Wotjak zum 65. Geburtstag, Fráncfort: Peter Lang, p. 65-77.
Corpas Pastor, G. (2008). Investigar con corpus en traducción: los retos de un nuevo paradigma [Studien zur romanische Sprachwissenschaft und interkulturellen Kommunikation, 49], Fráncfort: Peter Lang.
Corpas Pastor, G. (2013). Detección, descripción y contraste de las unidades fraseológicas mediante tecnologías lingüísticas. In: Olza, I. & R. Elvira Manero (eds.), Fraseopragmática. Berlin: Frank & Timme, p. 335-373.
Leiva Rojo, J. (2013). La traducción de unidades fraseológicas (alemán-español/español-alemán) como parámetro para la evaluación y revisión de traducciones. In: Mellado Blanco, C., Buján, P, Iglesias N.M., Losada M.C. & A. Mansilla (eds), La fraseología del alemán y el español: lexicografía y traducción. ELS, Etudes Linguistiques / Linguistische Studien, Band 11. München: Peniope, p. 31-42.
Leiva Rojo, J. & G. Corpas Pastor (2011). Placing Italian idioms in a foreign milieu: a case study. In: Pamies Bertrán, A., Luque Nadal, L., Bretana, J. & M. Pazos (eds) (2011). Multilingual phraseography. Second Language Learning and Translation Applications. Baltmannsweiler: Schneider Verlag (Colección: Phraseologie und Parömiologie, 28), p. 289-298.
Martinet, A. (1966). Eléments de linguistique générale. Paris: Colin.
Mejri, S. (2006). Polylexicalité, monolexicalité et double articulation. Cahiers de Lexicologie 2: 209-221.
Abstract:
Forensic speaker comparison exams have complex characteristics and demand a long time for manual analysis. A method for the automatic recognition of vowels, providing feature extraction for acoustic analysis, is proposed as a support tool for these exams. The proposal is based on formant measurements obtained by LPC (Linear Predictive Coding), selected according to fundamental frequency detection, zero-crossing rate, bandwidth and continuity, with clustering done by the k-means method. Experiments using samples from three different databases have shown promising results: the regions corresponding to five of the Brazilian Portuguese vowels were successfully located, providing visualization of a speaker’s vocal tract behavior as well as detection of segments corresponding to target vowels.
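Two of the ingredients named in this abstract, the zero-crossing rate and k-means clustering of formant measurements, can be sketched in a few lines. This is only an illustration under stated assumptions: the LPC formant estimation itself is not shown, the (F1, F2) pairs below are invented toy values rather than measurements from the paper's databases, and the initial centroids are simply picked by hand.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's k-means on 2-D points, starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members; keep empty ones.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else cen
            for members, cen in zip(clusters, centroids)
        ]
    return centroids, clusters


# Invented (F1, F2) pairs in Hz for three hypothetical vowel regions.
formants = [
    (710, 1290), (690, 1310), (700, 1300),
    (290, 2310), (310, 2290), (300, 2300),
    (310, 790), (290, 810), (300, 800),
]
initial = [formants[0], formants[3], formants[6]]  # hand-picked starting points
centroids, clusters = kmeans(formants, initial)
print(centroids)
```

A low zero-crossing rate is commonly associated with voiced speech, so a frame-level filter of that kind could plausibly run before clustering; the thresholds the authors used are not given in the abstract.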
Abstract:
Staff detection and removal is one of the most important issues in optical music recognition (OMR) tasks, since common approaches for symbol detection and classification are based on this process. Due to its complexity, staff detection and removal is often inaccurate, leading to a great number of errors in subsequent stages. For this reason, this paper proposes a new approach that avoids this stage and is expected to overcome these drawbacks. The approach is put into practice in a case study focused on scores written in white mensural notation. Symbol detection is performed by using the vertical projection of the staves. The cross-correlation operator for template matching is used at the classification stage. The effectiveness of the proposal is shown in an experiment in which it attains an extraction rate of 96 % and a classification rate of 92 %, on average. These results reinforce the idea of pursuing a new line of research in OMR systems that does not require the removal of staff lines.
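The two operations named in this abstract, vertical projection for symbol detection and cross-correlation for template matching, can be sketched on a toy binary image. This is a minimal illustration, not the paper's implementation: the image, the templates, the `background` threshold and the zero-mean normalization of the correlation are all assumptions.

```python
import math


def vertical_projection(image):
    """Count foreground pixels (1s) in each column of a binary image."""
    return [sum(col) for col in zip(*image)]


def split_on_gaps(projection, background):
    """Return (start, end) column ranges whose projection exceeds `background`."""
    spans, start = [], None
    for x, value in enumerate(projection):
        if value > background and start is None:
            start = x
        elif value <= background and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, len(projection)))
    return spans


def crop(image, start, end):
    """Cut the columns start..end-1 out of every row."""
    return [row[start:end] for row in image]


def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross-correlation of two equally sized patches."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - ma) ** 2 for x in fa) * sum((y - mb) ** 2 for y in fb))
    return num / den if den else 0.0


def classify(patch, templates):
    """Pick the template label with the highest correlation score."""
    return max(templates, key=lambda name: normalized_cross_correlation(patch, templates[name]))


# Toy image: row 2 is a staff line that is deliberately left in place.
image = [
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
# Hypothetical templates, each including the staff-line row.
templates = {
    "A": [[1, 1], [1, 1], [1, 1], [0, 0], [0, 0]],
    "B": [[0, 0], [1, 1], [1, 1], [1, 1], [0, 0]],
}

projection = vertical_projection(image)
spans = split_on_gaps(projection, background=1)  # one staff line contributes 1 per column
labels = [classify(crop(image, s, e), templates) for s, e in spans]
print(projection, spans, labels)
```

Because the staff line adds a constant count to every column, the projection threshold can absorb it instead of requiring the line to be erased first, which is the general idea behind skipping staff removal.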
Abstract:
Over the past decades, English language teachers have become familiar with several terms which attempt to describe the role of English as a language of international communication. Presently, the term English as a lingua franca (ELF) seems to be one of the most favoured and adopted to depict the global use of English in the 21st century. Basically, the concept of ELF implies cross-cultural, cross-linguistic interactions involving native and non-native speakers. Consequently, the ELF paradigm suggests some changes in the language classroom concerning teachers’ and students’ goals as far as native speaker norms and cultures are concerned. Based on Kachru’s (1992) fallacies, this article identifies thirteen misconceptions in ELT regarding learning and teaching English varieties and cultures, suggesting that an ethnocentred and linguacentred approach to English should be replaced by an ELF perspective which recognizes the diversity of communicative situations involving different native and non-native cultures and varieties of English.