973 results for lexical analysis
Abstract:
Geographical proximity to, and socioeconomic dependence on, the United States brought about a deep-rooted anglicization of the Cuban Spanish lexis and social strata, especially throughout the Neocolonial period (1902–1959). This study is based on a review of a renowned newspaper of that time, Diario de la Marina, and the corresponding compilation of a corpus of English-induced loanwords. Diario de la Marina particularly targeted the upper social class, and only crónicas sociales (society-page columns) and print advertising were reviewed because of their fully descriptive texts, which encoded the ruling-class ideology and consumerism. The findings show that the sociolect in question contained a high number of lexical and cultural anglicisms, and that the sociolinguistic anglicization was openly embraced by the upper socioeconomic stratum as a distinguishing sign of sophistication and social stratification. Likewise, a number of the anglicisms collected, particularly those related to social events, are unused in contemporary Cuban Spanish, which suggests a major semantic shift in this sociolect after 1959.
Abstract:
To investigate the stability of trace reactivation in healthy older adults, 22 older volunteers with no significant neurological history participated in a cross-modal priming task. Whilst both object relative center embedded (ORC) and object relative right branching (ORR) sentences were employed, working memory load was reduced by limiting the number of words separating the antecedent from the gap for both sentence types. Analysis of the results did not reveal any significant trace reactivation for the ORC or ORR sentences. The results did reveal, however, a positive correlation between age and semantic priming at the pre-gap position and a negative correlation between age and semantic priming at the gap position for ORC sentences. In contrast, there was no correlation between age and priming effects for the ORR sentences. These results indicated that trace reactivation may be sensitive to a variety of age-related factors, including lexical activation and working memory. The implications of these results for sentence processing in the older population are discussed.
Abstract:
In this paper, we compare a well-known semantic space model, Latent Semantic Analysis (LSA), with another model, Hyperspace Analogue to Language (HAL), which is widely used in different areas, especially in automatic query refinement. We conduct this comparative analysis to prove our hypothesis that, with respect to the ability to extract lexical information from a corpus of text, LSA is quite similar to HAL. We regard HAL and LSA as black boxes. Through a Pearson's correlation analysis of the outputs of these two black boxes, we conclude that LSA correlates highly with HAL, and thus there is a justification that LSA and HAL can potentially play a similar role in facilitating automatic query refinement. This paper evaluates LSA in a new application area and contributes an effective way to compare different semantic space models.
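The black-box comparison described in this abstract can be illustrated with a minimal sketch. It assumes each model has already produced one similarity score per word pair, for the same list of pairs; the function name `pearson` and the score values are hypothetical and are not taken from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical outputs of the two black boxes for the same five word pairs.
lsa_scores = [0.82, 0.10, 0.55, 0.31, 0.74]
hal_scores = [0.78, 0.20, 0.49, 0.35, 0.70]

r = pearson(lsa_scores, hal_scores)
print(f"Pearson r between the two models: {r:.3f}")
```

A value of r close to 1 would indicate that the two models rank and scale the word pairs similarly, which is the kind of evidence the abstract appeals to.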
Abstract:
We analyze a Big Data set of geo-tagged tweets for a year (Oct. 2013–Oct. 2014) to understand regional linguistic variation in the U.S. Prior work on regional linguistic variation usually took a long time to collect data and focused on either rural or urban areas. Geo-tagged Twitter data offers an unprecedented database with rich linguistic representation of fine spatiotemporal resolution and continuity. From the one-year Twitter corpus, we extract lexical characteristics for Twitter users by summarizing the frequencies of a set of lexical alternations that each user has used. We spatially aggregate and smooth each lexical characteristic to derive county-based linguistic variables, from which orthogonal dimensions are extracted using principal component analysis (PCA). Finally, a regionalization method is used to discover hierarchical dialect regions using the PCA components. The regionalization results reveal interesting linguistic regional variations in the U.S. The discovered regions not only confirm past research findings in the literature but also provide new insights and a more detailed understanding of very recent linguistic patterns in the U.S.
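The first two steps of this pipeline (per-user frequencies of lexical alternations, then county-level aggregation) can be sketched as follows. This is a simplified illustration under stated assumptions: the "soda"/"pop" alternation, the function names, and the plain county mean are hypothetical, and the spatial smoothing, PCA, and regionalization steps are omitted.

```python
from collections import Counter, defaultdict

# Hypothetical alternation set: label -> (variant A, variant B).
ALTERNATIONS = {"soft_drink": ("soda", "pop")}

def user_profile(tokens, alternations=ALTERNATIONS):
    """Share of variant A among all uses of each alternation by one user."""
    counts = Counter(tokens)
    profile = {}
    for name, (variant_a, variant_b) in alternations.items():
        total = counts[variant_a] + counts[variant_b]
        if total:  # skip alternations the user never produced
            profile[name] = counts[variant_a] / total
    return profile

def county_means(user_profiles):
    """Average each lexical characteristic over the users of every county.

    user_profiles is an iterable of (county, profile) pairs.
    """
    sums = defaultdict(lambda: defaultdict(float))
    sizes = defaultdict(lambda: defaultdict(int))
    for county, profile in user_profiles:
        for name, value in profile.items():
            sums[county][name] += value
            sizes[county][name] += 1
    return {
        county: {name: sums[county][name] / sizes[county][name]
                 for name in sums[county]}
        for county in sums
    }
```

For example, `user_profile("i want a soda and another soda not pop".split())` gives a share of 2/3 for the "soda" variant, and `county_means` turns many such profiles into one value per county and alternation.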
Abstract:
Sentiment analysis is concerned with automatically identifying the sentiment or opinion expressed in a given piece of text. Most prior work either uses prior lexical knowledge, defined as the sentiment polarity of words, or views the task as a text classification problem and relies on labeled corpora to train a sentiment classifier. While lexicon-based approaches do not adapt well to different domains, corpus-based approaches require expensive manual annotation effort. In this paper, we propose a novel framework where an initial classifier is learned by incorporating prior information extracted from an existing sentiment lexicon, with preferences on expectations of sentiment labels of those lexicon words being expressed using generalized expectation criteria. Documents classified with high confidence are then used as pseudo-labeled examples for automatic domain-specific feature acquisition. The word-class distributions of such self-learned features are estimated from the pseudo-labeled examples and are used to train another classifier by constraining the model's predictions on unlabeled instances. Experiments on both the movie-review data and the multi-domain sentiment dataset show that our approach attains comparable or better performance than existing weakly-supervised sentiment classification methods despite using no labeled documents.
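The pseudo-labeling idea in this abstract can be illustrated with a deliberately simplified sketch. It is not the authors' method: the initial classifier here is a plain lexicon vote rather than a model trained with generalized expectation criteria, and the toy lexicon, the function names, and the confidence threshold are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy sentiment lexicon: word -> polarity (+1 or -1).
LEXICON = {"good": 1, "great": 1, "bad": -1, "awful": -1}

def lexicon_score(tokens, lexicon=LEXICON):
    """Mean polarity of the lexicon words in a document (0.0 if none occur)."""
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0

def pseudo_label(docs, threshold=1.0):
    """Keep only documents whose lexicon score is confidently polar.

    With the default threshold, every lexicon word in a kept document
    must agree on the polarity.
    """
    labeled = []
    for tokens in docs:
        score = lexicon_score(tokens)
        if abs(score) >= threshold:
            labeled.append((tokens, "pos" if score > 0 else "neg"))
    return labeled

def word_class_distributions(labeled, smoothing=1.0):
    """Estimate P(class | word) from pseudo-labeled documents (add-k smoothed)."""
    counts = {"pos": Counter(), "neg": Counter()}
    for tokens, label in labeled:
        counts[label].update(tokens)
    vocab = set(counts["pos"]) | set(counts["neg"])
    dist = {}
    for word in vocab:
        pos = counts["pos"][word] + smoothing
        neg = counts["neg"][word] + smoothing
        dist[word] = {"pos": pos / (pos + neg), "neg": neg / (pos + neg)}
    return dist
```

In this sketch, words outside the lexicon (for instance a domain-specific term that only appears in confidently positive documents) acquire their own class distribution, which is the intuition behind the self-learned features the abstract mentions.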
Abstract:
This paper presents a statistical comparison of regional phonetic and lexical variation in American English. Both the phonetic and lexical datasets were first subjected to separate multivariate spatial analyses in order to identify the most common dimensions of spatial clustering in these two datasets. The dimensions of phonetic and lexical variation extracted by these two analyses were then correlated with each other, after being interpolated over a shared set of reference locations, in order to measure the similarity of regional phonetic and lexical variation in American English. This analysis shows that regional phonetic and lexical variation are remarkably similar in Modern American English.
Abstract:
The paper presents an approach to extracting facts from the texts of documents. The approach relies on knowledge about the subject domain, a specialized dictionary, and fact schemes that describe fact structures, taking into consideration both the semantic and the syntactic compatibility of fact elements. Extracted facts combine the dictionary's lexical objects found in the text into a single structure and match them against concepts of the subject domain ontology.
Abstract:
This paper introduces a quantitative method for identifying newly emerging word forms in large time-stamped corpora of natural language and then describes an analysis of lexical emergence in American social media using this method, based on a multi-billion word corpus of Tweets collected between October 2013 and November 2014. In total, 29 emerging word forms, which represent various semantic classes, grammatical parts of speech, and word formation processes, were identified through this analysis. These 29 forms are then examined from various perspectives in order to begin to better understand the process of lexical emergence.
Abstract:
Sociolinguists have documented the substrate influence of various languages on the formation of dialects in numerous ethnic-regional settings throughout the United States. This literature shows that while phonological and grammatical influences from other languages may be instantiated as durable dialect features, lexical phenomena often fade over time as ethnolinguistic communities assimilate with contiguous dialect groups. In preliminary investigations of emerging Miami Latino English, we have observed that lexical forms based on Spanish lexical forms are ubiquitous not only in the speech of first-generation Cuban Americans but also in that of the second generation. Examples observed in fieldwork and casual observation, and studied formally in an experimental context, include “get down from the car” instead of “get out of the car,” which derives from the Spanish equivalent bajar del carro. The translation task administered to thirty-one participants showed that a variety of lexical phenomena are still maintained at equal or higher frequencies.
Abstract:
This study is a variationist sociolinguistic analysis of two speech styles, performance and interview, of a dinner theatre troupe in Ferryland on the Southern Shore of Newfoundland. Five actors and ten of their characters are analyzed to test whether their vowels change across styles. The study adopts a variationist framework with a Community of Practice model, drawing on Bell’s audience and referee design to argue that the performers’ stage conventions and identity construction are influenced by a third-person referee: the Idealized Authentic Newfoundlander (IAN). Under this view, the goal of the performer is both to communicate with and to entertain the audience, which requires different tactics when speaking. These tactics manifest phonetically and are discussed in a quantitative, statistical analysis of the acoustic measurements of the vowel tokens [variables FACE, KIT, LOT/PALM and GOAT lexical sets with Newfoundland Irish English (NIE) variants] and a qualitative discussion.
Abstract:
It has been argued that experienced learners develop high levels of metalinguistic awareness (MLA), which facilitates their learning of subsequent languages (e.g., Singleton & Aronin, 2007). Moreover, researchers in the field of third language acquisition emphasize the positive influences that previously learned languages exert on the formal learning of a foreign language (e.g., Cenoz & Gorter, 2015), and propose abandoning the traditional view, which stressed interference as the source of learners' errors, in favor of a broader and more positive view of the interaction between languages. Typological similarity and proficiency in the source language have been shown to influence all types of transfer (e.g., Ringbom, 1987, 2007). However, the methodological challenge remains of both determining when appropriate use of a target language is the result of crosslinguistic influence (e.g., Falk & Bardel, 2010) and establishing the crucial role of MLA in the conscious activation of related words or constructions across different languages. The present study aimed to meet this twofold challenge by using think-aloud protocols (TAPs) to examine positive transfer from English (L2) to German (L3) among francophone Quebecers after five weeks of formal L3 instruction. Participants completed a translation task developed for the purposes of the present study. The 42 items were selected on the basis of similarity and imageability judgments as well as word frequency from a study of German-English cognates (Friel & Kennison, 2001). Participants were asked to think aloud while translating unknown words from German (L3) into French (L1). Positive transfer was operationalized as correct translations that were based on an English cognate. MLA was measured by means of the THAM (Test d’habiletés métalinguistiques) (Pinto & El Euch, 2015) as well as through analysis of the TAPs. English proficiency levels were established on the basis of the Michigan Test (Corrigan et al., 1979), while levels of exposure to, and interest in, German language and culture were measured with a questionnaire. A fine-grained analysis of the TAPs revealed inter- and intra-individual variability in the conscious activation of L2 vocabulary, while allowing distinct levels of awareness to be identified. Two independent logistic regression models identified the two dimensions of MLA as predictors of positive transfer. The first model, in which the THAM was the sole measure of MLA, identified this reflective dimension as the main predictor, followed by English proficiency, whereas none of the other independent variables could predict positive transfer from English. In the second model, which included the THAM as well as the TAPs as complementary measures of MLA, the applied dimension of MLA, as measured by the TAPs, was by far the main predictor, followed by the reflective dimension, as measured by the THAM, whereas English proficiency was no longer among the factors with a significant influence on the response variable. Although verbalization may have influenced performance to some extent, our observations highlight the valuable contribution of introspective data as a complement to results based on purely linguistic characteristics of transfer. Our analyses underscore the complexity of metalinguistic processes and individual strategies, which reflects a dynamic perspective on multilingualism (e.g., Jessner, 2008).
Abstract:
The current study is a post-hoc analysis of data from the original randomized controlled trial of the Play and Language for Autistic Youngsters (PLAY) Home Consultation program, a parent-mediated, DIR/Floortime based early intervention program for children with ASD (Solomon, Van Egeren, Mahone, Huber, & Zimmerman, 2014). We examined 22 children from the original RCT who received the PLAY program. Children were split into two groups (high and lower functioning) based on the ADOS module administered prior to intervention. Fifteen-minute parent-child video sessions were coded through the use of CHILDES transcription software. Child and maternal language, communicative behaviors, and communicative functions were assessed in the natural language samples both pre- and post-intervention. Results demonstrated significant improvements in both child and maternal behaviors following intervention. There was a significant increase in child verbal and non-verbal initiations and verbal responses in whole group analysis. Total number of utterances, word production, and grammatical complexity all significantly improved when viewed across the whole group of participants; however, lexical growth did not reach significance. Changes in child communicative function were especially noteworthy, and demonstrated a significant increase in social interaction and a significant decrease in non-interactive behaviors. Further, mothers demonstrated an increase in responsiveness to the child’s conversational bids, increased ability to follow the child’s lead, and a decrease in directiveness. When separated for analyses within groups, trends emerged for child and maternal variables, suggesting greater gains in use of communicative function in both high and low groups over changes in linguistic structure. Additional analysis also revealed a significant inverse relationship between maternal responsiveness and child non-interactive behaviors; as mothers became more responsive, children’s non-engagement decreased. Such changes further suggest that changes in learned skills following PLAY parent training may result in improvements in child social interaction and language abilities.
Abstract:
This article examines discourses associated with a new environmental movement, “Carbon Rationing Action Groups” (CRAGs). This case study is intended to contribute to a wider investigation of the emergence of a new type of language used to debate climate change mitigation. Advice on how to reduce one's “carbon footprint,” for example, is provided almost daily. Much of this advice is framed by the use of metaphors and “carbon compounds” (lexical combinations of at least two roots) such as “carbon finance” or “low carbon diet.” The study uses a combination of tools from frame analysis and lexical pragmatics within the general framework of ecolinguistics to compare and contrast language use on the CRAGs' website with press coverage reporting on them. The analysis shows how the use of such lexical carbon compounds enables and facilitates different types of metaphorical frames such as dieting, finance and tax paying, wartime rationing, and religious imperatives in the two corpora.
Abstract:
The aim of the present study was to find out the level of lexical sophistication and the mean length of sentences used in compositions written by Finnish upper secondary school students of English. In addition, the present study investigated the possible relationship between these two variables. The study was longitudinal: as data, a set of 50 compositions was collected in 2014 from the same writers in both the first and the final year of upper secondary school, 25 in the first year and 25 in the final year. In the analysis, an internet-based program called VocabProfile was used to determine the lexical sophistication of the students' writing. To find out the mean length of sentence and the relationship between the two variables, I used Microsoft Excel. Findings of the present study include a minor decrease in the use of less frequent vocabulary and a slight increase in the use of the two thousand most frequent words of English: both of these changes were 1.99 percentage points. The mean length of sentence grew by 1.28 words during upper secondary school. As for the relationship between the two variables, no clear correlations could be found. It became, however, relatively clear that the topic of the composition might have an effect on the results. Thus, more research is needed to fully see the effect of lexical sophistication and mean length of sentence on one another. In addition, future research would benefit greatly if all investigated students wrote on the same topic.
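The two measures discussed in this abstract can be approximated with a short sketch. It is only an illustration and does not reproduce VocabProfile or the author's Excel workflow: the tokenizer, the sentence splitter, the function names, and the tiny word list in the usage note are all assumptions.

```python
import re

def tokenize(text):
    """Lowercase word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def split_sentences(text):
    """Naive sentence split on ., ! and ?; empty fragments are dropped."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]

def share_in_list(text, frequent_words):
    """Proportion of tokens that belong to a given high-frequency word list."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(token in frequent_words for token in tokens) / len(tokens)

def mean_sentence_length(text):
    """Average number of word tokens per sentence."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)
```

As a usage example, for the text "The cat sat. The feline reclined on the ottoman!" and the hypothetical frequent-word set {"the", "cat", "sat", "on"}, `share_in_list` returns 6/9 and `mean_sentence_length` returns 4.5. A lower share of frequent words would correspond to higher lexical sophistication in the sense used above.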
Abstract:
Raman spectroscopy of formamide-intercalated kaolinites treated using controlled-rate thermal analysis technology (CRTA), allowing the separation of adsorbed formamide from intercalated formamide in formamide-intercalated kaolinites, is reported. The Raman spectra of the CRTA-treated formamide-intercalated kaolinites are significantly different from those of the intercalated kaolinites, which display a combination of both intercalated and adsorbed formamide. An intense band is observed at 3629 cm⁻¹, attributed to the inner surface hydroxyls hydrogen-bonded to the formamide. Broad bands are observed at 3600 and 3639 cm⁻¹, assigned to the inner surface hydroxyls, which are hydrogen-bonded to the adsorbed water molecules. The hydroxyl-stretching band of the inner hydroxyl is observed at 3621 cm⁻¹ in the Raman spectra of the CRTA-treated formamide-intercalated kaolinites. The results of thermal analysis show that the amount of intercalated formamide between the kaolinite layers is independent of the presence of water. Significant differences are observed in the CO stretching region between the adsorbed and intercalated formamide.