819 results for terminologia finanziaria, variazione linguistica, analisi corpus-based


Abstract:

Sentiment analysis is concerned with automatically identifying the sentiment or opinion expressed in a given piece of text. Most prior work either uses prior lexical knowledge, defined as the sentiment polarity of words, or views the task as a text classification problem and relies on labeled corpora to train a sentiment classifier. While lexicon-based approaches do not adapt well to different domains, corpus-based approaches require expensive manual annotation effort. In this paper, we propose a novel framework in which an initial classifier is learned by incorporating prior information extracted from an existing sentiment lexicon, with preferences on the expected sentiment labels of those lexicon words expressed using generalized expectation criteria. Documents classified with high confidence are then used as pseudo-labeled examples for automatic domain-specific feature acquisition. The word-class distributions of such self-learned features are estimated from the pseudo-labeled examples and are used to train another classifier by constraining the model's predictions on unlabeled instances. Experiments on both the movie-review data and the multi-domain sentiment dataset show that our approach attains comparable or better performance than existing weakly-supervised sentiment classification methods despite using no labeled documents.
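The pseudo-labeling step described in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: a plain lexicon vote stands in for the classifier trained with generalized expectation criteria, and the lexicon, the `min_margin` threshold, and all function names are hypothetical choices made for illustration only.

```python
from collections import Counter

# Hypothetical toy lexicon; the paper draws on an existing sentiment lexicon.
LEXICON = {"good": 1, "great": 1, "bad": -1, "awful": -1}


def lexicon_score(tokens):
    """Net polarity: positive lexicon hits minus negative lexicon hits."""
    return sum(LEXICON.get(t, 0) for t in tokens)


def pseudo_label(docs, min_margin=2):
    """Keep only documents whose lexicon score is confidently signed.

    Assumption: |score| >= min_margin stands in for "classified with
    high confidence" by the initial classifier.
    """
    labeled = []
    for doc in docs:
        tokens = doc.lower().split()
        score = lexicon_score(tokens)
        if abs(score) >= min_margin:
            labeled.append((tokens, "pos" if score > 0 else "neg"))
    return labeled


def word_class_distribution(labeled):
    """Estimate P(class | word) from the pseudo-labeled documents."""
    counts = {}
    for tokens, label in labeled:
        for t in set(tokens):  # count each word once per document
            counts.setdefault(t, Counter())[label] += 1
    return {
        word: {c: n / sum(cnt.values()) for c, n in cnt.items()}
        for word, cnt in counts.items()
    }
```

Words outside the lexicon (domain-specific features) receive class distributions only through the pseudo-labeled documents, which is the self-learning idea the abstract describes; in the paper these distributions then constrain a second classifier.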

Abstract:

Networked Learning, e-Learning and Technology Enhanced Learning have each been defined in different ways, as people's understanding about technology in education has developed. Yet each could also be considered as a terminology competing for a contested conceptual space. Theoretically this can be a ‘fertile trans-disciplinary ground for represented disciplines to affect and potentially be re-orientated by others’ (Parchoma and Keefer, 2012), as differing perspectives on terminology and subject disciplines yield new understandings. Yet when used in government policy texts to describe connections between humans, learning and technology, terms tend to become fixed in less fertile positions linguistically. A deceptively spacious policy discourse that suggests people are free to make choices conceals an economically-based assumption that implementing new technologies, in themselves, determines learning. Yet it actually narrows choices open to people as one route is repeatedly in the foreground and humans are not visibly involved in it. An impression that the effective use of technology for endless improvement is inevitable cuts off critical social interactions and new knowledge for multiple understandings of technology in people's lives. This paper explores some findings from a corpus-based Critical Discourse Analysis of UK policy for educational technology during the last 15 years, to help to illuminate the choices made. This is important when through political economy, hierarchical or dominant neoliberal logic promotes a single ‘universal model’ of technology in education, without reference to a wider social context (Rustin, 2013). Discourse matters, because it can ‘mould identities’ (Massey, 2013) in narrow, objective economically-based terms which 'colonise discourses of democracy and student-centredness' (Greener and Perriton, 2005:67). 
This undermines subjective social, political, material and relational (Jones, 2012: 3) contexts for those learning when humans are omitted. Critically confronting these structures is not considered a negative activity. Whilst deterministic discourse for educational technology may leave people unconsciously restricted, I argue that, through a close analysis, it offers a deceptively spacious theoretical tool for debate about the wider social and economic context of educational technology. Methodologically it provides insights about ways technology, language and learning intersect across disciplinary borders (Giroux, 1992), as powerful, mutually constitutive elements, ever-present in networked learning situations. In sharing a replicable approach for linguistic analysis of policy discourse I hope to contribute to visions others have for a broader theoretical underpinning for educational technology, as a developing field of networked knowledge and research (Conole and Oliver, 2002; Andrews, 2011).

Abstract:

False friends are pairs of words in two languages that are perceived as similar but have different meanings. We present an improved algorithm for acquiring false friends from a sentence-level aligned parallel corpus, based on statistical observations of word occurrences and co-occurrences in the parallel sentences. The results are compared with an entirely semantic measure of cross-lingual similarity between words that uses the Web as a corpus, analyzing the words’ local contexts extracted from the text snippets returned by Google searches. The statistical and semantic measures are further combined into an improved algorithm for the identification of false friends that achieves results almost twice as good as those of previously known algorithms. The evaluation is performed for identifying cognates between Bulgarian and Russian, but the proposed methods could be adopted for other language pairs for which parallel corpora and bilingual glossaries are available.
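The statistical idea in the abstract above, counting how often two similar-looking words occur and co-occur in aligned sentences, can be sketched as follows. The abstract does not give the paper's actual formula, so the Dice coefficient is used here as an assumed stand-in, and the function name and the English-Spanish toy data are hypothetical.

```python
def cooccurrence_dice(pairs, src_word, tgt_word):
    """Dice coefficient over aligned sentence pairs.

    pairs: iterable of (source_sentence, target_sentence) strings.
    A value near 0 for two orthographically similar words suggests they
    are rarely translations of each other, i.e. likely false friends.
    """
    n_src = n_tgt = n_both = 0
    for src_sent, tgt_sent in pairs:
        in_src = src_word in src_sent.split()
        in_tgt = tgt_word in tgt_sent.split()
        n_src += in_src            # sentences containing the source word
        n_tgt += in_tgt            # sentences containing the target word
        n_both += in_src and in_tgt  # aligned pairs containing both
    if n_src + n_tgt == 0:
        return 0.0
    return 2 * n_both / (n_src + n_tgt)
```

With a toy corpus such as `[("the actual price", "el precio real"), ("the actual cost", "el coste real"), ("the current price", "el precio actual")]`, the pair `price`/`precio` always co-occurs, while English `actual` and Spanish `actual` never do, which is the pattern a false-friend detector looks for.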

Abstract:

Technology discloses man’s mode of dealing with Nature, the process of production by which he sustains his life, and thereby also lays bare the mode of formation of his social relations, and of the mental conceptions that flow from them (Marx, 1990: 372). My thesis is a sociological analysis of UK policy discourse for educational technology during the last 15 years. My framework is a dialogue between the Marxist-based critical social theory of Lieras and a corpus-based Critical Discourse Analysis (CDA) of UK policy for Technology Enhanced Learning (TEL) in higher education. Embedded in TEL is a presupposition: a deterministic assumption that technology has enhanced learning. This conceals a necessary debate that reminds us it is humans that design learning, not technology. By omitting people, TEL provides a vehicle for strong hierarchical, or neoliberal, agendas to make simplified claims politically, in the name of technology. My research has two main aims. Firstly, I share a replicable, mixed methodological approach for linguistic analysis of the political discourse of TEL. Quantitatively, I examine patterns in my corpus to question forms of ‘use’ around technology that structure a rigid basic argument which ‘enframes’ educational technology (Heidegger, 1977: 38). In a qualitative analysis of findings, I ask to what extent policy discourse evaluates technology in one way, to support a Knowledge Based Economy (KBE) in a political economy of neoliberalism (Jessop 2004, Fairclough 2006). If technology is commodified as an external enhancement, it is expected to provide an ‘exchange value’ for learners (Marx, 1867). I therefore examine more closely what is prioritised and devalued in these texts. Secondly, I disclose a form of austerity in the discourse where technology, as an abstract force, undertakes tasks usually ascribed to humans (Lieras, 1996, Brey, 2003:2).
This risks desubjectivisation and loss of power, and limits people’s relationships with technology and with each other. A view of technology in political discourse as complete without people closes possibilities for broader dialectical (Fairclough, 2001, 2007) and ‘convivial’ (Illich, 1973) understandings of the intimate, material practice of engaging with technology in education. In opening the ‘black box’ of TEL via CDA I reveal talking points that are otherwise concealed. This allows me to be reflexive and self-critical through praxis, to confront my own assumptions about what the discourse conceals and what forms of resistance might be required. In so doing, I contribute to ongoing debates about networked learning, providing a context to explore educational technology as a technology, language and learning nexus.

Abstract:

Relatively little research on dialect variation has been based on corpora of naturally occurring language. Instead, dialect variation has been studied based primarily on language elicited through questionnaires and interviews. Eliciting dialect data has several advantages, including allowing dialectologists to select individual informants, control the communicative situation in which language is collected, elicit rare forms directly, and make high-quality audio recordings. Although far less common, a corpus-based approach to data collection also has several advantages, including allowing dialectologists to collect large amounts of data from a large number of informants, observe dialect variation across a range of communicative situations, and analyze quantitative linguistic variation in large samples of natural language. Although both approaches allow dialect variation to be observed, they provide different perspectives on language variation and change. The corpus-based approach to dialectology has therefore produced a number of new findings, many of which challenge traditional assumptions about the nature of dialect variation. Most important, this research has shown that dialect variation involves a wider range of linguistic variables and exists across a wider range of language varieties than has previously been assumed. The goal of this chapter is to introduce this emerging approach to dialectology. The first part of this chapter reviews the growing body of research that analyzes dialect variation in corpora, including research on variation across nations, regions, genders, ages, and classes, in both speech and writing, and from both a synchronic and diachronic perspective, with a focus on dialect variation in the English language. Although collections of language data elicited through interviews and questionnaires are now commonly referred to as corpora in sociolinguistics and dialectology (e.g. see Bauer 2002; Tagliamonte 2006; Kretzschmar et al. 2006; D'Arcy 2011), this review focuses on corpora of naturally occurring texts and discourse. The second part of this chapter presents the results of an analysis of variation in "not" contraction across region, gender, and time in a corpus of American English letters to the editor in order to exemplify a corpus-based approach to dialectology.
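As a rough illustration of the kind of measurement the second part of the chapter above describes, the sketch below computes a "not" contraction rate for a text: contracted forms divided by all contractible forms. The auxiliary list, the regular expressions, and the function name are assumptions made for this example; the chapter's own extraction procedure is not given in the abstract.

```python
import re

# Assumed, non-exhaustive list of auxiliaries that can host "not".
AUXILIARIES = (
    "is|are|was|were|do|does|did|have|has|had|"
    "will|would|can|could|should"
)

# Contracted forms such as "don't" or "can't" (straight apostrophe only).
CONTRACTED = re.compile(r"\b\w+n't\b", re.IGNORECASE)
# Full forms such as "do not" or "is not".
FULL = re.compile(rf"\b(?:{AUXILIARIES})\s+not\b", re.IGNORECASE)


def not_contraction_rate(text):
    """Share of contractible 'not' tokens that appear contracted.

    Returns None when the text contains no contractible forms.
    """
    contracted = len(CONTRACTED.findall(text))
    full = len(FULL.findall(text))
    total = contracted + full
    if total == 0:
        return None
    return contracted / total
```

Computing this rate separately for letters grouped by region, gender, or year would give the per-group values that a study of this kind compares.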

Abstract:

In global policy documents, the language of Technology-Enhanced Learning (TEL) now firmly structures a perception of educational technology which ‘subsumes’ terms like Networked Learning and e-Learning. Embedded in these three words though is a deterministic, economic assumption that technology has now enhanced learning, and will continue to do so. In a market-driven, capitalist society this is a ‘trouble free’, economically focused discourse which suggests there is no need for further debate about what the use of technology achieves in learning. Yet this raises a problem too: if technology achieves goals for human beings, then in education we are now simply counting on ‘use of technology’ to enhance learning. This closes the door on a necessary and ongoing critical pedagogical conversation that reminds us it is people that design learning, not technology. Furthermore, such discourse provides a vehicle for those with either strong hierarchical, or neoliberal agendas to make simplified claims politically, in the name of technology. This chapter is a reflection on our use of language in the educational technology community through a corpus-based Critical Discourse Analysis (CDA). In analytical examples that are ‘loaded’ with economic expectation, we can notice how the policy discourse of TEL narrows conversational space for learning so that people may struggle to recognise their own subjective being in this language. Through the lens of Lieras’s externality, desubjectivisation and closure (Lieras, 1996) we might examine possible effects of this discourse and seek a more emancipatory approach. A return to discussing Networked Learning is suggested, as a first step towards a more multi-directional conversation than TEL, that acknowledges the interrelatedness of technology, language and learning in people’s practice. 
Secondly, a reconsideration of how we write policy for educational technology is recommended, with a critical focus on how people learn, rather than on what technology is assumed to enhance.

Abstract:

Uncertainty text detection is important to many social-media-based applications, since more and more users utilize social media platforms (e.g., Twitter, Facebook, etc.) as an information source and produce or derive interpretations based on them. However, existing uncertainty cues are ineffective in the social media context because of its specific characteristics. In this paper, we propose a variant annotation scheme for uncertainty identification and construct the first uncertainty corpus based on tweets. We then conduct experiments on the resulting tweet corpus to study the effectiveness of different types of features for uncertainty text identification. © 2013 Association for Computational Linguistics.

Abstract:

The first study of its kind, Regional Variation in Written American English takes a corpus-based approach to map over a hundred grammatical alternation variables across the United States. A multivariate spatial analysis of these maps shows that grammatical alternation variables follow a relatively small number of common regional patterns in American English, which can be explained based on both linguistic and extra-linguistic factors. Based on this rigorous analysis of extensive data, Grieve identifies five primary modern American dialect regions, demonstrating that regional variation is far more pervasive and complex in natural language than is generally assumed. The wealth of maps and data and the groundbreaking implications of this volume make it essential reading for students and researchers in linguistics, English language, geography, computer science, sociology and communication studies.

Abstract:

This article explores powerful, constraining representations of encounters between digital technologies and the bodies of students and teachers, using corpus-based Critical Discourse Analysis (CDA). It discusses examples from a corpus of UK Higher Education (HE) policy documents, and considers how confronting such documents may strengthen arguments from educators against narrow representations of an automatically enhanced learning. Examples reveal that a promise of enhanced ‘student experience’ through information and communication technologies internalizes the ideological constructs of technology and policy makers, to reinforce a primary logic of exchange value. The identified dominant discursive patterns are closely linked to the Californian ideology. By exposing these texts, they provide a form of ‘linguistic resistance’ for educators to disrupt powerful processes that serve the interests of a neoliberal social imaginary. To mine this current crisis of education, the authors introduce productive links between a Networked Learning approach and a posthumanist perspective. The Networked Learning approach emphasises conscious choices between political alternatives, which in turn could help us reconsider ways we write about digital technologies in policy. Then, based on the works of Haraway, Hayles, and Wark, a posthumanist perspective places human digital learning encounters at the juncture of non-humans and politics. Connections between the Networked Learning approach and the posthumanist perspective are necessary in order to replace a discourse of (mis)representations with a more performative view towards the digital human body, which then becomes situated at the centre of teaching and learning. In practice, however, establishing these connections is much more complex than resorting to the typically straightforward common sense discourse encountered in the Critical Discourse Analysis, and this may yet limit practical applications of this research in policy making.

Abstract:

This chapter discusses how varied terms such as Networked Learning, e-Learning and Technology Enhanced Learning (TEL) have each become colonised to support a dominant, economically-based world view of educational technology. Critical social theory about technology, language and learning is brought into dialogue with examples from a corpus-based Critical Discourse Analysis (CDA) of UK policy texts for educational technology between 1997 and 2012. Though these policy documents offer much promise for enhancement of people’s performance via technology, the human presence to enact such innovation is missing. Given that ‘academic workload’ is a ‘silent barrier’ to the implementation of TEL strategies (Gregory and Lodge, 2015), analysis further exposes, through empirical examples, that the academic labour of both staff and students appears to be unacknowledged. Global neoliberal capitalist values have strongly territorialised the contemporary university (Hayes & Jandric, 2014), utilising existing naïve, utopian arguments about what technology alone achieves. Whilst the chapter reveals how humans are easily ‘evicted’, even from discourse about their own learning (Hayes, 2015), it also challenges staff and students to seek to re-occupy the important territory of policy to subvert the established order. We can use the very political discourse that has disguised our networked learning practices, in new explicit ways, to restore our human visibility.

Abstract:

This article suggests a new approach to teaching two grammatical structures, the reflexive passive and the impersonal “se”, in university-level classes of Spanish as a foreign language (E/LE). Specifically, it argues that both should be treated as passive constructions, on the basis of a lexical-functional analysis that draws on contrastive linguistics. For E/LE instruction as well, a contrastive approach is recommended that focuses on both metalinguistic reflection and the student's competence in the L2. In particular, the use of linguistic corpora in class forms an integral part of instruction. Using a corpus stimulates the student's curiosity, exposes the student to authentic language material, and promotes independent inductive reflection.

Abstract:

The aim of this paper is to compare the pragmatic functions of an intensifying structure in Peninsular Spanish formed with the particle venga (venga a + infinitive) with intensifying structures in English. To this end, we analyse the syntactic, semantic and pragmatic factors that the translator takes into account when using this structure in translation into Spanish. The corpus consists of excerpts from literary works retrieved from Google Books. The results show that this construction carries over into Spanish not only semantic effects (iteration) but also pragmatic ones, such as the speaker's evaluation (disagreement, surprise, etc.).

Abstract:

This study analyses the use of the diminutive suffix in an oral corpus of young people from the Dominican Republic. The material comes from the transcription of twenty oral interviews conducted in the 1990s in Santo Domingo. The study analyses the documented occurrences, their morphology, their preferences regarding the word classes taken as bases for diminutive formation, and their possible semantic and communicative values; finally, it determines the frequency of diminutive use according to the sex of the speakers.

Abstract:

The present study focuses on the frequency of phrasal verbs with the particle "up" in the context of crime and police investigative work. This research emerges from the need to enlarge McCarthy and O’Dell’s (2004) scope from purely criminal behavior to police investigative actions. To do so, we relied on a corpus of 504,124 running words made up of spoken dialogues extracted from the script of the American TV series Castle, shown on ABC since 2009. Based on Rudzka-Ostyn’s (2003) cognitive motivations for the particle "up", we have identified five different meaning extensions for our phrasal verbs. Drawing on these findings, we have designed pedagogical activities for L2 learners who study English at the Police Academy.

Abstract:

Is phraseology the third articulation of language? Fresh insights into a theoretical conundrum

Jean-Pierre Colson, University of Louvain (Louvain-la-Neuve, Belgium)

Although the notion of phraseology is now used across a wide range of linguistic disciplines, its definition and the classification of phraseological units remain a subject of intense debate. It is generally agreed that phraseology implies polylexicality, but this term is problematic as well, because it brings us back to one of the most controversial topics in modern linguistics: the definition of a word. On the other hand, another widely accepted principle of language is the double articulation or duality of patterning (Martinet 1960): the first articulation consists of morphemes and the second of phonemes. The very definition of morphemes, however, also poses several problems, and the situation becomes even more confused if we wish to take phraseology into account. In this contribution, I will take the view that a corpus-based and computational approach to phraseology may shed some new light on this theoretical conundrum. A better understanding of the basic units of meaning is necessary for more efficient language learning and translation, especially in the case of machine translation. Previous research (Colson 2011, 2012, 2013, 2014; Corpas Pastor 2000, 2007, 2008, 2013, 2015; Corpas Pastor & Leiva Rojo 2011; Leiva Rojo 2013) has shown the paramount importance of phraseology for translation. A tentative step towards a coherent explanation of the role of phraseology in language has been proposed by Mejri (2006): it is postulated that a third articulation of language intervenes at the level of words, including simple morphemes, sequences of free and bound morphemes, but also phraseological units.
I will present results from experiments with statistical associations of morphemes across several languages, and point out that (mainly) isolating languages such as Chinese are interesting for a better understanding of the interplay between morphemes and phraseological units. Named entities, in particular, are an extreme example of intertwining cultural, statistical and linguistic elements. Other examples show that the many borrowings and influences that characterize European languages tend to give a somewhat blurred vision of the interplay between morphology and phraseology. From a statistical point of view, the cpr-score (Colson 2016) provides a methodology for adapting the automatic extraction of phraseological units to the morphological structure of each language. The results obtained can therefore be used for testing hypotheses about the interaction between morphology, phraseology and culture. Experiments with the cpr-score on the extraction of Chinese phraseological units show that results depend on how the basic units of meaning are defined: a morpheme-based approach yields good results, which corroborates the claim by Beck and Mel'čuk (2011) that the association of morphemes into words may be similar to the association of words into phraseological units. A cross-linguistic experiment carried out for English, French, Spanish and Chinese also reveals that the results are quite compatible with Mejri’s hypothesis (2006) of a third articulation of language. Such findings, if confirmed, also corroborate the notion of statistical semantics in language. To illustrate this point, I will present the PhraseoRobot (Colson 2016), a computational tool for extracting phraseological associations around key words from the media, such as Brexit. 
The results confirm a previous study on the term globalization (Colson 2016): a significant part of sociolinguistic associations prevailing in the media is related to phraseology in the broad sense, and can therefore be partly extracted by means of statistical scores.

References

Beck, D. & I. Mel'čuk (2011). Morphological phrasemes and Totonacan verbal morphology. Linguistics 49/1: 175-228.
Colson, J.-P. (2011). La traduction spécialisée basée sur les corpus : une expérience dans le domaine informatique. In : Sfar, I. & S. Mejri, La traduction de textes spécialisés : retour sur des lieux communs. Synergies Tunisie n° 2. Gerflint, Agence universitaire de la Francophonie, p. 115-123.
Colson, J.-P. (2012). Traduire le figement en langue de spécialité : une expérience de phraséologie informatique. In : Mogorrón Huerta, P. & S. Mejri (dirs.), Lenguas de especialidad, traducción, fijación / Langues spécialisées, figement et traduction. Encuentros Mediterráneos / Rencontres Méditerranéennes, N°4. Universidad de Alicante, p. 159-171.
Colson, J.-P. (2013). Pratique traduisante et idiomaticité : l’importance des structures semi-figées. In : Mogorrón Huerta, P., Gallego Hernández, D., Masseau, P. & Tolosa Igualada, M. (eds.), Fraseología, Opacidad y Traducción. Studien zur romanischen Sprachwissenschaft und interkulturellen Kommunikation (Herausgegeben von Gerd Wotjak). Frankfurt am Main, Peter Lang, p. 207-218.
Colson, J.-P. (2014). La phraséologie et les corpus dans les recherches traductologiques. Communication lors du colloque international Europhras 2014, Association Européenne de Phraséologie. Université de Paris Sorbonne, 10-12 septembre 2014.
Colson, J.-P. (2016). Set phrases around globalization : an experiment in corpus-based computational phraseology. In: F. Alonso Almeida, I. Ortega Barrera, E. Quintana Toledo and M. Sánchez Cuervo (eds.), Input a Word, Analyse the World: Selected Approaches to Corpus Linguistics. Newcastle upon Tyne: Cambridge Scholars Publishing, p. 141-152.
Corpas Pastor, G. (2000). Acerca de la (in)traducibilidad de la fraseología. In: G. Corpas Pastor (ed.), Las lenguas de Europa: Estudios de fraseología, fraseografía y traducción. Granada: Comares, p. 483-522.
Corpas Pastor, G. (2007). Europäismen - von Natur aus phraseologische Äquivalente? Von blauem Blut und sangre azul. In: M. Emsel y J. Cuartero Otal (eds.), Brücken: Übersetzen und interkulturelle Kommunikationen. Festschrift für Gerd Wotjak zum 65. Geburtstag, Fráncfort: Peter Lang, p. 65-77.
Corpas Pastor, G. (2008). Investigar con corpus en traducción: los retos de un nuevo paradigma [Studien zur romanischen Sprachwissenschaft und interkulturellen Kommunikation, 49], Fráncfort: Peter Lang.
Corpas Pastor, G. (2013). Detección, descripción y contraste de las unidades fraseológicas mediante tecnologías lingüísticas. In Olza, I. & R. Elvira Manero (eds.) Fraseopragmática. Berlin: Frank & Timme, p. 335-373.
Leiva Rojo, J. (2013). La traducción de unidades fraseológicas (alemán-español/español-alemán) como parámetro para la evaluación y revisión de traducciones. In: Mellado Blanco, C., Buján, P, Iglesias N.M., Losada M.C. & A. Mansilla (eds), La fraseología del alemán y el español: lexicografía y traducción. ELS, Etudes Linguistiques / Linguistische Studien, Band 11. München: Peniope, p. 31-42.
Leiva Rojo, J. & G. Corpas Pastor (2011). Placing Italian idioms in a foreign milieu: a case study. In: Pamies Bertrán, A., Luque Nadal, L., Bretana, J. & M. Pazos (eds), (2011). Multilingual phraseography. Second Language Learning and Translation Applications. Baltmannsweiler: Schneider Verlag (Colección: Phraseologie und Parömiologie, 28), p. 289-298.
Martinet, A. (1966). Eléments de linguistique générale. Paris: Colin.
Mejri, S. (2006). Polylexicalité, monolexicalité et double articulation. Cahiers de Lexicologie 2: 209-221.