31 results for Multilingual lexical
Abstract:
This article offers a critical conceptual discussion and refinement of Chomsky’s (2000, 2001, 2007, 2008) phase system, addressing many of the problematic aspects highlighted in the critique of Boeckx & Grohmann (2007) and seeking to resolve these issues, in particular the stipulative and arbitrary properties of phases and phase edges encoded in the (various versions of the) Phase Impenetrability Condition (PIC). Chomsky’s (2000) original conception of phases as lexical subarrays is demonstrated to derive these properties straightforwardly once a single assumption about the pairwise composition of phases is made, and the PIC is reduced to its necessary core under the Strong Minimalist Thesis (SMT)—namely, the provision of an edge. Finally, a comparison is undertaken of the lexical-subarray conception of phases with the feature-inheritance system of Chomsky 2007, 2008, in which phases are simply the locus of uninterpretable features (probes). Both conceptions are argued to conform to the SMT, and both converge on a pairwise composition of phases. However, the two conceptions of phases are argued to be mutually incompatible in numerous fundamental ways, with no current prospect of unification. The lexical-subarray conception of phases is then to be preferred on grounds of greater empirical adequacy.
Abstract:
Achieving a clearer picture of categorial distinctions in the brain is essential for our understanding of the conceptual lexicon, but much more fine-grained investigations are required in order for this evidence to contribute to lexical research. Here we present a collection of advanced data-mining techniques that allows the category of individual concepts to be decoded from single trials of EEG data. Neural activity was recorded while participants silently named images of mammals and tools, and category could be detected in single trials with an accuracy well above chance, both when considering data from single participants and when group-training across participants. By aggregating across all trials, single concepts could be correctly assigned to their category with an accuracy of 98%. The pattern of classifications made by the algorithm confirmed that the neural patterns identified are due to conceptual category, and not any of a series of processing-related confounds. The time intervals, frequency bands and scalp locations that proved most informative for prediction permit physiological interpretation: the widespread activation shortly after appearance of the stimulus (from 100 ms) is consistent both with accounts of multi-pass processing and with distributed representations of categories. These methods provide an alternative to fMRI for fine-grained, large-scale investigations of the conceptual lexicon.
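As a hedged illustration of the kind of single-trial decoding and per-concept aggregation this abstract describes, the sketch below trains a linear classifier on synthetic trial features and then takes a majority vote over the trials of each concept; the data shapes, signal strength, and the choice of classifier are assumptions for demonstration, not the authors' actual data-mining techniques.

```python
# Hedged sketch of single-trial category decoding and per-concept aggregation.
# All quantities below are illustrative assumptions, not the study's data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_concepts, n_reps, n_features = 40, 10, 200           # e.g. flattened time x frequency x channel features
concept_cat = rng.integers(0, 2, n_concepts)           # 0 = mammal, 1 = tool
concept_ids = np.repeat(np.arange(n_concepts), n_reps)
y = concept_cat[concept_ids]                            # category label of each trial
X = rng.normal(size=(len(y), n_features)) + 0.3 * y[:, None]  # weak class-dependent signal

# Single-trial decoding: cross-validated prediction for every trial
pred = cross_val_predict(LogisticRegression(max_iter=1000), X, y, cv=5)
print("single-trial accuracy:", (pred == y).mean())

# Aggregating across all trials of a concept by majority vote
votes = np.array([pred[concept_ids == c].mean() > 0.5 for c in range(n_concepts)])
print("per-concept accuracy:", (votes.astype(int) == concept_cat).mean())
```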
Abstract:
We present three natural language marking strategies based on fast and reliable shallow parsing techniques, and on widely available lexical resources: lexical substitution, adjective conjunction swaps, and relativiser switching. We test these techniques on a random sample of the British National Corpus. Individual candidate marks are checked for goodness of structural and semantic fit, using both lexical resources and the web as a corpus. A representative sample of marks is given to 25 human judges to evaluate for acceptability and preservation of meaning. This establishes a correlation between corpus-based felicity measures and perceived quality, and makes qualified predictions. Grammatical acceptability correlates strongly with our automatic measure (Pearson's r = 0.795, p = 0.001), allowing us to account for about two thirds of the variability in human judgements. A moderate but statistically non-significant correlation (Pearson's r = 0.422, p = 0.356) is found with judgements of meaning preservation, indicating that the contextual window of five content words used for our automatic measure may need to be extended.
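A toy sketch of the final correlation step reported here, relating an automatic corpus-based felicity measure to mean human acceptability judgements, might look as follows; the score vectors are invented placeholders, not data from the study.

```python
# Correlating an automatic felicity measure with mean human ratings (placeholder data).
import numpy as np
from scipy.stats import pearsonr

felicity = np.array([0.82, 0.40, 0.65, 0.91, 0.30, 0.77, 0.55, 0.88])  # automatic measure per marked sentence
human = np.array([4.5, 2.1, 3.6, 4.8, 1.9, 4.0, 3.0, 4.6])             # mean judge rating (1-5 scale)

r, p = pearsonr(felicity, human)
print(f"Pearson's r = {r:.3f}, p = {p:.3f}, variance accounted for (r^2) = {r*r:.2f}")
```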
Abstract:
Most studies of conceptual knowledge in the brain focus on a narrow range of concrete conceptual categories, rely on the researchers' intuitions about which objects belong to these categories, and assume a broadly taxonomic organization of knowledge. In this fMRI study, we focus on concepts with a variety of concreteness levels; we use a state-of-the-art lexical resource (WordNet 3.1) as the source for a relatively large number of category distinctions and compare a taxonomic style of organization with a domain-based model (associating concepts with scenarios). Participants mentally simulated situations associated with concepts when cued by text stimuli. Using multivariate pattern analysis, we find evidence that all Taxonomic categories and Domains can be distinguished from fMRI data and also observe a clear concreteness effect: Tools and Locations can be reliably predicted for unseen participants, but less concrete categories (e.g., Attributes, Communications, Events, Social Roles) can only be reliably discriminated within participants. A second concreteness effect relates to the interaction of Domain and Taxonomic category membership: Domain (e.g., relation to Law vs. Music) can be better predicted for less concrete categories. We repeated the analysis within anatomical regions, observing discrimination between all/most categories in the left middle occipital and temporal gyri, and more specialized discrimination for the concrete categories Tool and Location in the left precentral and fusiform gyri, respectively. Highly concrete/abstract Taxonomic categories and Domain were segregated in frontal regions. We conclude that both Taxonomic and Domain class distinctions are relevant for interpreting the neural structuring of concrete and abstract concepts.
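The "unseen participants" result corresponds to leave-one-subject-out cross-validation; the sketch below shows that evaluation scheme on synthetic data, where the voxel counts, signal strength, and classifier are assumptions rather than the study's actual MVPA setup.

```python
# Hedged sketch of cross-participant decoding via leave-one-subject-out cross-validation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(1)
n_subjects, trials_per_subject, n_voxels = 8, 60, 500
groups = np.repeat(np.arange(n_subjects), trials_per_subject)  # participant of each trial
y = rng.integers(0, 2, len(groups))                            # e.g. Tool vs. Location
X = rng.normal(size=(len(y), n_voxels)) + 0.2 * y[:, None]     # weak signal shared across participants

# Train on all but one participant, test on the held-out participant
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         groups=groups, cv=LeaveOneGroupOut())
print("accuracy per held-out participant:", np.round(scores, 2))
```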
Abstract:
University classroom talk is a collaborative struggle to make meaning. Taking the perspectival nature of interaction as central, this paper presents an investigation of the genre of spoken academic discourse, and in particular the types of activities which are orientated to the goal of collaborative ideas or tasks, such as seminars, tutorials, and workshops. The purpose of the investigation was to identify examples of dialogicality through an examination of stance-taking. The data used in this study is a spoken corpus of academic English created from recordings of a range of subject-discipline classrooms at a UK university. A frequency-based approach to recurrent word sequences (lexical bundles) was used to identify signals of epistemic and attitudinal stance and to initiate an exploration of the features of elaboration. Findings of quantitative and qualitative analyses reveal some similarities and differences between this study and those of US-based classroom contexts in relation to the use and frequency of lexical bundles. Findings also highlight the role that elaboration plays in grounding perspectives and negotiating the alignment of interactants. Elaboration seems to afford the space for the enactment of student stance in relation to the tutor's embodiment of discipline knowledge.
Abstract:
Will Kymlicka's liberal culturalism presents a tension between the idea that linguistic diversity in multilingual polities should be protected and the claim that democratic debate across linguistic boundaries is unfeasible. In this article, I resolve that tension by arguing that trans-lingual democratic deliberation in multilingual polities is necessary to legitimise those measures aimed at the protection of linguistic diversity. I conclude that my account provides a coherent normative response to the challenges faced by the European Union (EU) in the field of language policy and that an EU-wide deliberative forum is not as unfeasible as Kymlicka suggests.
Abstract:
This article discusses the relationship between three language communities in Europe with varying levels of official recognition, namely Kashub, Sorb, and Silesian, and the institutions of their host states as regards their respective use, promotion, and revitalization. Most language communities across the world campaign for recognition within a geographic/political region, or on the basis of a historic/group identity, to ensure their language's use and status. The examples discussed here illustrate that language recognition, and the policies resulting from it that promote official monolingualism, strengthen the symbolic status of the language but contribute little to the functionality of language communities outside the area. As this article illustrates, in increasingly multilingual societies such language policies cut off speakers from the political, economic, and social opportunities accessible through the medium of languages that lack official recognition locally.
Abstract:
We present the results of exploratory experiments on sentiment analysis using lexical valence extracted from the brain with electroencephalography (EEG). We selected 78 English words (36 for training and 42 for testing), presented as stimuli to 3 native English speakers. EEG signals were recorded from the subjects while they performed a mental imaging task for each word stimulus. Wavelet decomposition was employed to extract EEG features from the time-frequency domain. The extracted features were used as inputs to a sparse multinomial logistic regression (SMLR) classifier for valence classification, after univariate ANOVA feature selection. After mapping EEG signals to sentiment valences, we exploited the lexical polarity extracted from brain data for the prediction of the valence of 12 sentences taken from the SemEval-2007 shared task, and compared it against existing lexical resources.
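A hedged sketch of such a pipeline follows: wavelet features, univariate ANOVA feature selection, and an L1-penalised multinomial logistic regression standing in for SMLR. The signals are synthetic, and the wavelet choice, feature count, and channel count are assumptions; only the 36/42 train/test split follows the abstract.

```python
# Hedged sketch: wavelet features -> ANOVA feature selection -> sparse multinomial LR.
import numpy as np
import pywt                                                     # PyWavelets
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(2)
n_words, n_samples = 78, 512                                    # one EEG channel, 512 samples per word (assumed)
valence = rng.integers(0, 3, n_words)                           # negative / neutral / positive
signals = rng.normal(size=(n_words, n_samples)) + 0.1 * valence[:, None]

def wavelet_features(sig, wavelet="db4", level=4):
    """Flatten the multi-level wavelet decomposition into one feature vector."""
    return np.concatenate(pywt.wavedec(sig, wavelet, level=level))

X = np.array([wavelet_features(s) for s in signals])

model = make_pipeline(
    SelectKBest(f_classif, k=50),                                    # univariate ANOVA feature selection
    LogisticRegression(penalty="l1", solver="saga", max_iter=5000),  # sparse multinomial LR (approximates SMLR)
)
model.fit(X[:36], valence[:36])                                  # 36 training words, 42 test words
print("test accuracy:", model.score(X[36:], valence[36:]))
```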
Abstract:
This collection explores the central importance of values and evaluative concepts in cross-cultural translational encounters. Written by a group of international scholars from a diverse range of linguistic and cultural backgrounds, the chapters in this book consider what it means to translate cultures by examining core values and their relationship to key evaluative concepts (such as authenticity, clarity, home, honour, or justice) and how they influence the complex multidimensional process of translation. This book will be of interest to academics studying cross-cultural and inter-linguistic interactions, to translators and interpreters, students of translation and of modern languages, and all those dealing with multilingual and multicultural settings.
Abstract:
Over the past decade the concept of ‘resilience’ has been mobilised across an increasingly wide range of policy arenas. For example, it has featured prominently within recent discussions on the nature of warfare, the purpose of urban and regional planning, the effectiveness of development policies, the intent of welfare reform and the stability of the international financial system. The term’s origins can be traced back to the work of the ecologist Crawford S. Holling and his formulation of a science of complexity. This paper reflects on the origins of these ideas and their travels from the field of natural resource management, which it now dominates, to contemporary social practices and policy arenas. It reflects on the ways in which a lexicon of complex adaptive systems, grounded in an epistemology of limited knowledge and uncertain futures, seeks to displace ongoing ‘dependence’ on professionals by valorising self-reliance and responsibility as techniques to be applied by subjects in the making of the resilient self. In so doing, resilience is being mobilised to govern a wide range of threats and sources of uncertainty, from climate change, financial crises and terrorism, to the sustainability of development, the financing of welfare and providing for an aging population. As such, ‘resilience’ risks becoming a measure of its subjects’ ‘fitness’ to survive in what are pre-figured as natural, turbulent orders of things.
Abstract:
Research in emotion analysis of text suggests that emotion-lexicon-based features are superior to corpus-based n-gram features. However, the static nature of general-purpose emotion lexicons makes them less suited to social media analysis, where the need to adapt to changes in vocabulary usage and context is crucial. In this paper we propose a set of methods to extract a word-emotion lexicon automatically from an emotion-labelled corpus of tweets. Our results confirm that the features derived from these lexicons outperform the standard bag-of-words features when applied to an emotion classification task. Furthermore, a comparative analysis with both manually crafted lexicons and a state-of-the-art lexicon generated using point-wise mutual information shows that the lexicons generated by the proposed methods lead to significantly better classification performance.
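As a toy illustration of inducing a word-emotion lexicon from an emotion-labelled corpus, the sketch below uses point-wise mutual information (the baseline technique named above, and one plausible flavour of such extraction); the four-tweet corpus and emotion set are invented placeholders, not the paper's data or exact method.

```python
# Toy PMI-based word-emotion lexicon induction from an emotion-labelled corpus.
import math
from collections import Counter, defaultdict

corpus = [
    ("i love this sunny day", "joy"),
    ("this is so annoying and unfair", "anger"),
    ("what a lovely surprise", "joy"),
    ("i hate waiting in traffic", "anger"),
]

word_df, emo_df, joint = Counter(), Counter(), defaultdict(Counter)
for text, emo in corpus:
    emo_df[emo] += 1
    for w in set(text.split()):          # document frequencies
        word_df[w] += 1
        joint[w][emo] += 1

n_docs = len(corpus)

def pmi(word, emotion):
    p_w, p_e = word_df[word] / n_docs, emo_df[emotion] / n_docs
    p_we = joint[word][emotion] / n_docs
    return math.log2(p_we / (p_w * p_e)) if p_we else float("-inf")

# Assign each word to the emotion it is most strongly associated with
lexicon = {w: max(emo_df, key=lambda e: pmi(w, e)) for w in word_df}
print(pmi("love", "joy"), lexicon["hate"])
```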
Abstract:
Discussion forums have evolved into a dependable source of knowledge to solve common problems. However, only a minority of the posts in discussion forums are solution posts. Identifying solution posts from discussion forums, hence, is an important research problem. In this paper, we present a technique for unsupervised solution post identification leveraging a so far unexplored textual feature, that of lexical correlations between problems and solutions. We use translation models and language models to exploit lexical correlations and solution post character respectively. Our technique is designed to not rely much on structural features such as post metadata, since such features are often not uniformly available across forums. Our clustering-based iterative solution identification approach based on the EM formulation performs favorably in an empirical evaluation, beating the only unsupervised solution identification technique from the literature by a very large margin. We also show that our unsupervised technique is competitive against methods that require supervision, outperforming one such technique comfortably.
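The clustering-style iterative idea can be caricatured with a hard-EM loop over unigram language models, as in the toy sketch below; it deliberately omits the translation-model component that captures problem-solution lexical correlations, and the thread, initial labels, and smoothing are invented for illustration.

```python
# Toy hard-EM labelling of forum replies as solution vs. non-solution posts (LM component only).
import math
from collections import Counter

replies = [
    "try reinstalling the driver and reboot",
    "i have the same problem on my laptop",
    "thanks that worked for me",
    "set the environment variable and restart the service",
]
labels = [1, 0, 0, 1]                     # initial guesses, e.g. from a simple keyword heuristic

def unigram_lm(docs):
    """Add-one-smoothed unigram language model estimated from a list of posts."""
    counts = Counter(w for d in docs for w in d.split())
    total, vocab = sum(counts.values()), len(counts) + 1
    return lambda w: (counts[w] + 1) / (total + vocab)

def loglik(doc, lm):
    return sum(math.log(lm(w)) for w in doc.split())

for _ in range(5):                        # hard-EM iterations: relabel posts, then re-estimate the models
    sol_lm = unigram_lm([r for r, l in zip(replies, labels) if l == 1])
    oth_lm = unigram_lm([r for r, l in zip(replies, labels) if l == 0])
    labels = [1 if loglik(r, sol_lm) > loglik(r, oth_lm) else 0 for r in replies]

print("solution posts:", [r for r, l in zip(replies, labels) if l == 1])
```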
Abstract:
We consider the problem of segmenting text documents that have a two-part structure, such as a problem part and a solution part. Documents of this genre include incident reports, which typically involve a description of events relating to a problem followed by those pertaining to the solution that was tried. Segmenting such documents into their two component parts would render them usable in knowledge-reuse frameworks such as Case-Based Reasoning. This segmentation problem presents a hard case for traditional text segmentation due to the lexical inter-relatedness of the segments. We develop a two-part segmentation technique that can harness a corpus of similar documents to model the behavior of the two segments and their inter-relatedness using language models and translation models respectively. In particular, we use separate language models for the problem and solution segment types, whereas the inter-relatedness between segment types is modeled using an IBM Model 1 translation model. We model documents as being generated starting from the problem part, which comprises words sampled from the problem language model, followed by the solution part, whose words are sampled either from the solution language model or from a translation model conditioned on the words already chosen in the problem part. We show, through an extensive set of experiments on real-world data, that our approach outperforms the state-of-the-art text segmentation algorithms in the accuracy of segmentation, and that such improved accuracy translates well to improved usability in Case-Based Reasoning systems. We also analyze the robustness of our technique to varying amounts and types of noise and empirically illustrate that our technique is quite noise tolerant, and degrades gracefully with increasing amounts of noise.
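A minimal sketch of the generative split scoring this abstract describes follows: problem words drawn from a problem language model, solution words from a mixture of a solution language model and a translation model conditioned on the problem part, with the split chosen by maximising the log-likelihood over candidate split points. The word probabilities and translation table are toy placeholders, not models actually estimated with IBM Model 1 as in the paper.

```python
# Toy two-part split scoring with language models and a translation-model mixture.
import math

problem_lm = {"screen": 0.2, "flickers": 0.2, "after": 0.1, "update": 0.2, "driver": 0.1}
solution_lm = {"rolled": 0.2, "back": 0.2, "the": 0.1, "driver": 0.2, "fixed": 0.2}
trans = {"driver": {"driver": 0.6, "update": 0.3},   # t(solution word | problem word), placeholder values
         "fixed": {"flickers": 0.5}}

def lm_prob(word, lm, floor=1e-4):
    return lm.get(word, floor)

def solution_word_prob(word, problem_words, lam=0.5):
    # Mixture of the solution LM and the translation model averaged over problem words
    t = sum(trans.get(word, {}).get(p, 0.0) for p in problem_words) / max(len(problem_words), 1)
    return lam * lm_prob(word, solution_lm) + (1 - lam) * max(t, 1e-6)

def split_score(words, k):
    problem, solution = words[:k], words[k:]
    score = sum(math.log(lm_prob(w, problem_lm)) for w in problem)
    return score + sum(math.log(solution_word_prob(w, problem)) for w in solution)

doc = "screen flickers after driver update rolled back the driver fixed".split()
best_k = max(range(1, len(doc)), key=lambda k: split_score(doc, k))
print("problem:", doc[:best_k], "| solution:", doc[best_k:])
```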