980 results for Linguistic analysis (Linguistics)
Abstract:
I report on language variation in the unresearched variety of English emerging on Kosrae, Federated States of Micronesia (FSM). English is spoken as the inter-island lingua franca throughout Micronesia and has been the official language of FSM since the country gained independence in 1986, although FSM retains close ties with the US through an economic “compact” agreement. I present an analysis of a corpus of over 90 Kosraean English speakers, compiled during a three-month fieldwork trip to the island in the Western Pacific. The 45-minute sociolinguistically sensitive recordings cover old and young speakers with varying levels of education, occupations and off-island experience. In the paper I analyse two variables. The first is the realisation of /h/, which is often subject to deletion in both L1 and L2 varieties of English. Such deletion is commonly associated with Cockney English, but is also found in Caribbean English and in the postcolonial English of Australia. For example (male, 31): “yeah I build their house their local huts and they pay me”. /h/ deletion is frequent in Kosraean English but, perhaps expectedly, occurs slightly less among people with more contact with American English through longer periods spent off island. The second feature under scrutiny is the variable epenthesis of [h] to provide a consonantal onset to vowel-initial syllables. For example (male, 31): “that guy is really hold now”. This practice is also found beyond Kosraean English. Previous studies find h-epenthesis in L1 varieties including Newfoundland and Tristan da Cunha English, while similar manifestations are identified in Francophone L2 learners of English. My variationist statistical analysis shows that [h] insertion occurs disproportionately in intervocalic position; is constrained by both speaker gender and age, with older males much more likely to epenthesise [h] in their speech; and is more likely in the onset of stressed than of unstressed syllables.
In light of these findings, I consider the relationship between h-deletion and h-epenthesis, the plausibility of hypercorrection as a motivation for the variation, and the potential influence of the substrate language, alongside sociolinguistic factors such as attitudes towards the US based on mobility. The analysis sheds light on the extent to which different varieties share this characteristic and on their comparability in terms of linguistic constraints and attributes.
Abstract:
Europarl is a large multilingual corpus containing the minutes of the debates at the European Parliament. This article presents a method to extract different corpora from Europarl: monolingual and multilingual comparable corpora, as well as parallel corpora. Using state-of-the-art measures of homogeneity, we show that these corpora are very similar. In addition, we argue that they present many advantages for research in various fields of linguistics and translation studies, and we also discuss some of their limitations. We conclude by reviewing a number of previous studies that made use of these corpora, emphasizing in each case the possibilities offered by Europarl.
Abstract:
Schoolbooks convey not only school-relevant knowledge; they also influence the development of stereotypes about different social groups. Particularly during the 1970s and 1980s, many studies analysed schoolbooks and criticised the overall predominance of male persons and of traditional role allocations. Since that time, women’s and men’s occupations and social functions have changed considerably. The present research investigated gender portrayals in schoolbooks for German and mathematics that were recently published in Germany. We examined the proportions of female and male persons in pictures and texts and categorized their activities and their occupational and parental roles. Going beyond previous studies, we added two criteria: the use of gender-fair language and the spatial arrangement of persons in pictures. Our results show that schoolbooks for German contained almost balanced depictions of girls and boys, whereas women were shown less frequently than men. In mathematics books, males outnumbered females in general. Across both types of books, female and male persons were engaged in many different activities, not only gender-typed ones; however, male persons were more often described via their profession than female persons. Gender-fair language has found its way into schoolbooks but is not used consistently. Books for German were more gender-fair in terms of linguistic forms than books for mathematics. For spatial arrangements, we found no indication of gender bias. The results are discussed with a focus on how schoolbooks can be optimized to contribute to gender equality.
Abstract:
The aim of this paper is to analyse the lexicogrammatical interpretation of the participant-process relation in English texts and to evaluate its impact on the effectiveness of Spanish summaries of those texts. Using the framework of Systemic Functional Linguistics (Halliday, 1985, 1994, 2004), a three-part analysis was carried out on ten Spanish summaries of an English newspaper report on sports, written by students of the Physical Education teacher-training programme (Profesorado de Educación Física) at the Universidad Nacional de La Plata as part of the final examination for the course Capacitación en Inglés-Nivel II. Drawing on Register and Genre Theory (Martin and Eggins, 1997), the impact of linguistic relations was evaluated at the level of Context of Situation (Register) and Context of Culture (Genre). A group of colleagues was also asked to give an additional opinion on the effectiveness of the summaries. The analysis of the target texts showed that writers who have more difficulty identifying the participant-process relation produce less effective summaries. This lower degree of effectiveness appears as a consequence of particular lexicogrammatical choices. Moreover, identifying the participant-process relation in the source text seems to bear directly on the construction of an adequate generic structure in the target text. In conclusion, identifying the participant-process relation in English texts can be considered central to producing effective summaries in Spanish.
Abstract:
The city of Malaga underwent considerable growth in the 19th and 20th centuries. The territorial expansion, paired with a massive influx of immigrants, occurred in three waves, and as a consequence the city remains divided into three different parts to this day. The differences between these three neighbourhoods lie in the type of houses, in cultural and industrial activities, in socioeconomic level and, very interestingly, also in speech. The aim of this study is therefore to examine the interrelation between speech (phonetic features) and urban space in Malaga. A combination of quantitative and qualitative analysis was used, based on two types of data: 1) production data stemming from recordings of 120 speakers; 2) perception data (salience, estimated frequency of use, attitude, spatial and social perception, imitation), which were collected from several surveys with 120 participants each. Results show that the speech production data divide the city of Malaga clearly into three different parts. This tripartition is confirmed by the analysis of the perception data. Moreover, the inhabitants of these three areas are perceived as different social types, to whom a range of social features is attributed. That is, certain linguistic features, the different neighbourhoods of the city and the social characteristics associated with them are undergoing a process of indexicalization and iconization. As a result, the linguistic features in question function as identity markers at the intraurban level.
Abstract:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of two other major areas, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly its best-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are other types of linguistic tools that are perhaps less well known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools still have some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitation stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by another higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
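As an illustration of combining annotations produced at a common level, the following minimal sketch applies a plain majority vote to the tags that several POS taggers assign to the same tokens. Majority voting is only one possible combination strategy and is an assumption here, not the mechanism defined by the model; the function name `combine_annotations` and the sample tags are hypothetical, and the sketch presupposes that all tools already share one tagset.

```python
from collections import Counter


def combine_annotations(tag_sequences):
    """Majority-vote the tags that several tools assign to the same tokens.

    tag_sequences: one list of tags per tool, all aligned to the same tokens
    and all expressed in the same (shared) tagset. This is an illustrative
    assumption, not the combination method of the original work.
    """
    if not tag_sequences:
        return []
    length = len(tag_sequences[0])
    if any(len(seq) != length for seq in tag_sequences):
        raise ValueError("all tools must annotate the same tokens")

    combined = []
    for tags_for_token in zip(*tag_sequences):
        counts = Counter(tags_for_token)
        best_count = max(counts.values())
        winners = [tag for tag, n in counts.items() if n == best_count]
        # On a tie, fall back to the first tool's tag (an arbitrary choice).
        combined.append(winners[0] if len(winners) == 1 else tags_for_token[0])
    return combined


if __name__ == "__main__":
    tool_a = ["DET", "NOUN", "VERB"]
    tool_b = ["DET", "VERB", "VERB"]
    tool_c = ["DET", "NOUN", "NOUN"]
    print(combine_annotations([tool_a, tool_b, tool_c]))
```

A vote of this kind can only reduce errors that the tools do not share, which is one reason why the annotations first have to be expressed at a common level and in a common schema.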
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem, and ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools.
Thus, to summarise, the main aim of the present work was to combine these separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
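To make the interoperability problem in (ii) concrete, the sketch below maps the ad hoc tags of two tools onto one shared vocabulary, so that their annotations become directly comparable. It is a deliberately simplified stand-in: plain dictionaries replace a real ontology, and the tool tagsets, the shared term names and the helper `to_shared` are all hypothetical.

```python
# Hypothetical ad hoc tagsets of two annotation tools, each mapped to a
# shared vocabulary. In an ontology-based setting the shared terms would be
# ontology concepts; plain strings are used here only for illustration.
TOOL_A_TO_SHARED = {"NN": "Noun", "VB": "Verb", "DT": "Determiner"}
TOOL_B_TO_SHARED = {"N": "Noun", "V": "Verb", "ART": "Determiner"}


def to_shared(tags, mapping):
    """Translate tool-specific tags into the shared vocabulary.

    Tags without a known mapping are marked "Unknown" rather than guessed.
    """
    return [mapping.get(tag, "Unknown") for tag in tags]


if __name__ == "__main__":
    from_a = to_shared(["DT", "NN", "VB"], TOOL_A_TO_SHARED)
    from_b = to_shared(["ART", "N", "V"], TOOL_B_TO_SHARED)
    # Once both outputs use the same terms they can be compared or combined.
    print(from_a == from_b)
```

Once every tool's output is expressed in the same terms, annotations from different tools can be compared, merged or checked against each other without pairwise converters.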
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Abstract:
Assets are interrelated in the risk analysis methodologies for information systems promoted by international standards. This means that an attack on one asset can propagate through the network and threaten an organization's most valuable assets. It is necessary to value all assets and the direct and indirect asset dependencies, as well as the probability of threats and the resulting asset degradation. These methodologies do not, however, consider uncertain valuations; they use precise values on different scales, usually percentages. Experts use linguistic terms to represent asset values, dependencies, and the frequency and asset degradation associated with possible threats. Computations are based on the trapezoidal fuzzy numbers associated with these linguistic terms.
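The kind of computation mentioned above can be sketched as follows. A trapezoidal fuzzy number is written as four points (a, b, c, d) with a <= b <= c <= d; addition is component-wise, and the component-wise product is a common approximation that holds only for non-negative numbers. The linguistic scale (`LOW`, `HIGH`), the class name `Trapezoid` and the example of weighting an asset value by a degree of dependency are assumptions made for illustration, not values or formulas taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy number (a, b, c, d) with a <= b <= c <= d."""
    a: float
    b: float
    c: float
    d: float

    def __post_init__(self):
        if not (self.a <= self.b <= self.c <= self.d):
            raise ValueError("points must satisfy a <= b <= c <= d")

    def membership(self, x):
        """Degree (0..1) to which x belongs to this fuzzy number."""
        if x < self.a or x > self.d:
            return 0.0
        if self.b <= x <= self.c:
            return 1.0
        if x < self.b:
            return (x - self.a) / (self.b - self.a)
        return (self.d - x) / (self.d - self.c)

    def __add__(self, other):
        return Trapezoid(self.a + other.a, self.b + other.b,
                         self.c + other.c, self.d + other.d)

    def __mul__(self, other):
        # Component-wise product: an approximation, valid for non-negative
        # trapezoidal numbers only.
        if self.a < 0 or other.a < 0:
            raise ValueError("approximation requires non-negative numbers")
        return Trapezoid(self.a * other.a, self.b * other.b,
                         self.c * other.c, self.d * other.d)


# Hypothetical linguistic scale on [0, 1]; the real scale is not given here.
LOW = Trapezoid(0.0, 0.1, 0.2, 0.3)
HIGH = Trapezoid(0.7, 0.8, 0.9, 1.0)

if __name__ == "__main__":
    # Illustrative only: a "high" asset value weighted by a "low" dependency.
    print(HIGH * LOW)
    print(HIGH.membership(0.75))
```

Keeping the result as a trapezoid, rather than collapsing it to a single number early, preserves the uncertainty of the experts' valuations through the computation.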