866 results for Lexical semantics
Abstract:
Answer Set Programming (ASP) is a popular framework for modelling combinatorial problems. However, ASP cannot be used easily for reasoning about uncertain information. Possibilistic ASP (PASP) is an extension of ASP that combines possibilistic logic and ASP. In PASP a weight is associated with each rule, where this weight is interpreted as the certainty with which the conclusion can be established when the body is known to hold. As such, it allows us to model and reason about uncertain information in an intuitive way. In this paper we present new semantics for PASP in which rules are interpreted as constraints on possibility distributions. Special models of these constraints are then identified as possibilistic answer sets. In addition, since ASP is a special case of PASP in which all the rules are entirely certain, we obtain a new characterization of ASP in terms of constraints on possibility distributions. This allows us to uncover a new form of disjunction, called weak disjunction, that has not been previously considered in the literature. In addition to introducing and motivating the semantics of weak disjunction, we also pinpoint its computational complexity. In particular, while the complexity of most reasoning tasks coincides with standard disjunctive ASP, we find that brave reasoning for programs with weak disjunctions is easier.
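As a rough illustration of how rule weights can propagate certainty, the sketch below handles only negation-free rules and uses the usual possibilistic-logic reading, in which a conclusion is at least as certain as the minimum of the rule weight and the certainties of the body atoms. It does not implement the constraint-based semantics or the weak disjunction introduced in the paper. The rule encoding, the function name `necessity_fixpoint` and the toy rules are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: (head, body atoms, certainty weight in (0, 1]).
Rule = Tuple[str, List[str], float]


def necessity_fixpoint(rules: List[Rule]) -> Dict[str, float]:
    """Least fixpoint of certainty degrees for negation-free weighted rules."""
    necessity: Dict[str, float] = {}
    changed = True
    while changed:
        changed = False
        for head, body, weight in rules:
            # The conclusion is as certain as the weakest link:
            # the rule weight or the least certain body atom.
            degree = min([weight] + [necessity.get(atom, 0.0) for atom in body])
            if degree > necessity.get(head, 0.0):
                necessity[head] = degree
                changed = True
    return necessity


# Toy program (assumed, not taken from the paper).
rules = [
    ("rain", [], 0.8),
    ("wet", ["rain"], 0.6),
    ("slippery", ["wet"], 1.0),
]
result = necessity_fixpoint(rules)
```

The loop terminates because degrees only increase and can only take values among the rule weights.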
Abstract:
Research in emotion analysis of text suggests that emotion lexicon based features are superior to corpus based n-gram features. However, the static nature of general purpose emotion lexicons makes them less suited to social media analysis, where the need to adapt to changes in vocabulary usage and context is crucial. In this paper we propose a set of methods to extract a word-emotion lexicon automatically from an emotion labelled corpus of tweets. Our results confirm that the features derived from these lexicons outperform the standard Bag-of-words features when applied to an emotion classification task. Furthermore, a comparative analysis with both manually crafted lexicons and a state-of-the-art lexicon generated using Point-Wise Mutual Information shows that the lexicons generated from the proposed methods lead to significantly better classification performance.
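For orientation, the Point-Wise Mutual Information baseline mentioned above can be sketched as follows. This is a minimal illustration of PMI scoring over an emotion-labelled corpus, not the methods proposed in the paper. The function name `pmi_lexicon`, the tweet-level counting scheme and the toy corpus are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def pmi_lexicon(corpus: List[Tuple[List[str], str]]) -> Dict[Tuple[str, str], float]:
    """Score each observed (word, emotion) pair by PMI over tweet-level co-occurrence."""
    n_docs = len(corpus)
    word_count: Counter = Counter()
    emotion_count: Counter = Counter()
    joint_count: Counter = Counter()
    for tokens, emotion in corpus:
        emotion_count[emotion] += 1
        for word in set(tokens):  # count each word at most once per tweet
            word_count[word] += 1
            joint_count[(word, emotion)] += 1
    scores: Dict[Tuple[str, str], float] = {}
    for (word, emotion), joint in joint_count.items():
        p_joint = joint / n_docs
        p_word = word_count[word] / n_docs
        p_emotion = emotion_count[emotion] / n_docs
        # PMI(w, e) = log2( p(w, e) / (p(w) * p(e)) )
        scores[(word, emotion)] = math.log2(p_joint / (p_word * p_emotion))
    return scores


# Toy labelled tweets (assumed data).
toy_corpus = [
    (["so", "happy", "today"], "joy"),
    (["happy", "birthday"], "joy"),
    (["so", "angry", "today"], "anger"),
    (["angry", "traffic"], "anger"),
]
lexicon = pmi_lexicon(toy_corpus)
```

In this toy corpus a word that appears only with one emotion ("happy" with "joy") gets a positive score, while a word spread evenly across emotions ("so") scores zero.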
Abstract:
Discussion forums have evolved into a dependable source of knowledge to solve common problems. However, only a minority of the posts in discussion forums are solution posts. Identifying solution posts from discussion forums, hence, is an important research problem. In this paper, we present a technique for unsupervised solution post identification leveraging a so far unexplored textual feature, that of lexical correlations between problems and solutions. We use translation models and language models to exploit lexical correlations and solution post character respectively. Our technique is designed to not rely much on structural features such as post metadata since such features are often not uniformly available across forums. Our clustering-based iterative solution identification approach based on the EM-formulation performs favorably in an empirical evaluation, beating the only unsupervised solution identification technique from literature by a very large margin. We also show that our unsupervised technique is competitive against methods that require supervision, outperforming one such technique comfortably.
Abstract:
We consider the problem of segmenting text documents that have a
two-part structure such as a problem part and a solution part. Documents
of this genre include incident reports that typically involve
description of events relating to a problem followed by those pertaining
to the solution that was tried. Segmenting such documents
into the component two parts would render them usable in knowledge
reuse frameworks such as Case-Based Reasoning. This segmentation
problem presents a hard case for traditional text segmentation
due to the lexical inter-relatedness of the segments. We develop
a two-part segmentation technique that can harness a corpus
of similar documents to model the behavior of the two segments
and their inter-relatedness using language models and translation
models respectively. In particular, we use separate language models
for the problem and solution segment types, whereas the inter-relatedness
between segment types is modeled using an IBM Model
1 translation model. We model documents as being generated starting
from the problem part, which comprises words sampled from
the problem language model, followed by the solution part whose
words are sampled either from the solution language model or from
a translation model conditioned on the words already chosen in the
problem part. We show, through an extensive set of experiments on
real-world data, that our approach outperforms the state-of-the-art
text segmentation algorithms in the accuracy of segmentation, and
that such improved accuracy translates well to improved usability
in Case-Based Reasoning systems. We also analyze the robustness
of our technique to varying amounts and types of noise and empirically
illustrate that our technique is quite noise tolerant, and
degrades gracefully with increasing amounts of noise.
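The translation component described above can be illustrated with a textbook IBM Model 1 estimator. The sketch below runs EM to estimate t(solution word | problem word) from paired problem and solution parts. It omits the NULL source word and everything else in the paper's generative model (the language models and the segmentation itself). The function name `train_ibm_model1` and the toy pairs are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[List[str], List[str]]  # (problem words, solution words)


def train_ibm_model1(pairs: List[Pair], iterations: int = 10) -> Dict[Tuple[str, str], float]:
    """Estimate t[(solution_word, problem_word)] with EM, IBM Model 1 style."""
    solution_vocab = {word for _, solution in pairs for word in solution}
    # Uniform initialisation over the solution vocabulary.
    t = defaultdict(lambda: 1.0 / len(solution_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected (solution, problem) alignments
        total = defaultdict(float)  # expected alignments per problem word
        for problem, solution in pairs:
            for s in solution:
                norm = sum(t[(s, p)] for p in problem)
                for p in problem:
                    frac = t[(s, p)] / norm  # E-step: posterior alignment weight
                    count[(s, p)] += frac
                    total[p] += frac
        for (s, p), c in count.items():
            t[(s, p)] = c / total[p]  # M-step: renormalise per problem word
    return dict(t)


# Toy problem/solution pairs (assumed data).
pairs = [
    (["screen", "flicker"], ["replace", "cable"]),
    (["screen", "dark"], ["replace", "backlight"]),
]
table = train_ibm_model1(pairs)
```

Because "screen" and "replace" co-occur in both pairs, EM concentrates the probability mass of "screen" on "replace", which in turn frees "flicker" to explain "cable".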
Abstract:
Online forums are becoming a popular way of finding useful
information on the web. Search over forums for existing discussion
threads so far is limited to keyword-based search due
to the minimal effort required on the part of users. However,
it is often not possible to capture all the relevant context in a
complex query using a small number of keywords. Example-based
search that retrieves similar discussion threads given
one exemplary thread is an alternate approach that can help
the user provide richer context and vastly improve forum
search results. In this paper, we address the problem of
finding similar threads to a given thread. Towards this, we
propose a novel methodology to estimate similarity between
discussion threads. Our method exploits the thread structure
to decompose threads into a set of weighted overlapping
components. It then estimates pairwise thread similarities
by quantifying how well the information in the threads is
mutually contained within each other using lexical similarities
between their underlying components. We compare our
proposed methods on real datasets against state-of-the-art
thread retrieval mechanisms wherein we illustrate that our
techniques outperform others by large margins on popular
retrieval evaluation measures such as NDCG, MAP, Precision@k
and MRR. In particular, consistent improvements of
up to 10% are observed on all evaluation measures.
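The evaluation measures named above have standard definitions, sketched below for a single ranked result list. MAP and MRR are the means of `average_precision` and `reciprocal_rank` over all queries. The function names and the toy relevance list are assumptions, and `average_precision` here normalises by the relevant items present in the list, which assumes every relevant thread was retrieved.

```python
import math
from typing import List


def precision_at_k(relevance: List[int], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    return sum(1 for r in relevance[:k] if r > 0) / k


def reciprocal_rank(relevance: List[int]) -> float:
    """1 / rank of the first relevant result, or 0 if there is none."""
    for rank, r in enumerate(relevance, start=1):
        if r > 0:
            return 1.0 / rank
    return 0.0


def average_precision(relevance: List[int]) -> float:
    """Mean of precision values taken at each relevant position."""
    hits, total = 0, 0.0
    for rank, r in enumerate(relevance, start=1):
        if r > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def ndcg_at_k(relevance: List[int], k: int) -> float:
    """Discounted cumulative gain at k, normalised by the ideal ordering."""
    def dcg(scores: List[int]) -> float:
        return sum(s / math.log2(i + 2) for i, s in enumerate(scores[:k]))

    ideal = dcg(sorted(relevance, reverse=True))
    return dcg(relevance) / ideal if ideal > 0 else 0.0


# Relevance labels of one ranked list, best-ranked first (assumed data).
ranked = [0, 1, 1, 0, 1]
p3 = precision_at_k(ranked, 3)
rr = reciprocal_rank(ranked)
ap = average_precision(ranked)
ndcg5 = ndcg_at_k(ranked, 5)
```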
Abstract:
Report on supervised teaching practice, Master's in Teaching Spanish as a Foreign Language, Universidade de Lisboa, 2011
Abstract:
In the context of monolingual and bilingual retrieval, Simple Knowledge Organisation System (SKOS) datasets can play a dual role as knowledge bases for semantic annotations and as language-independent resources for translation. With no existing track of formal evaluations of these aspects for datasets in SKOS format, we describe a case study on the usage of the Thesaurus for the Social Sciences in SKOS format for a retrieval setup based on the CLEF 2004-2006 Domain-Specific Track topics, documents and relevance assessments. Results showed a mixed picture with significant system-level improvements in terms of mean average precision in the bilingual runs. Our experiments set a new and improved baseline for using SKOS-based datasets with the GIRT collection and are an example of component-based evaluation.
Abstract:
The Cappadocian variety of Ulaghátsh is unique in the Greek-speaking world in having lost the inherited preposition ‘se’. The innovation is found with both locative and allative uses and has affected both syntactic contexts in which ‘se’ was originally found, that is, as a simple preposition (1) and as the left-occurring member of circumpositions of the type ‘se’ + NP + spatial adverb (2).

(1) a. tránse ci [to meidán] en ávʝa
       see.PST.3SG COMP ART.DEF.SG.ACC yard.SG.ACC COP.3 game.PL.NOM
       ‘he saw that in the yard is some game’ (Dawkins 1916: 348)
    b. ta erʝó da qardáʃa évɣan [to qonáq]
       ART.DEF.PL.NOM two ART.DEF.PL.NOM friend.PL.NOM ascend.PST.3PL ART.DEF.SG.ACC house.SG.ACC
       ‘the two friends went up to the house’ (Dawkins 1916: 354)
(2) émi [ta qonáca mésa], kiríʃde [to ʝasdɯ́q píso]
    enter.PST.3SG ART.DEF.PL.NOM house.PL.ACC inside hide.PST.3SG ART.DEF.SG.ACC cushion.SG.ACC behind
    ‘he went into the houses and hid behind the cushions’ (Dawkins 1916: 348)

In this paper, we set out to provide (a) a diachronic account of the loss of ‘se’ in Asia Minor Greek, and (b) a synchronic analysis of its ramifications for the encoding of the semantic and grammatical functions it had prior to its loss. The diachronic development of ‘se’ is traced by comparing the Ulaghátsh data with those obtained from Cappadocian varieties that have neither lost it nor show signs of losing it and, crucially, also from varieties in which ‘se’ is in the process of being lost. The comparative analysis shows that the loss first became manifest in circumpositions in which ‘se’ was preposed to the complement to which in turn a wide range of adverbs expressing topological relations were postposed (émi sa qonáca mésa > émi ta qonáca mésa).
This finding is accounted for in terms of Sinha and Kuteva’s (1995) distributed spatial semantics framework, which accepts that the elements involved in the constructions under investigation—the verb (émi), ‘se’ and the spatial adverb (mésa)—all contribute to the expression of the spatial relational meaning but with differences in weighting. Of the three, ‘eis’ made the most minimal contribution, the bulk of it being distributed over the verb and the adverb. This allowed for it to be optionally dropped from circumpositions, a stage attested in Phloïtá Cappadocian and Silliot, and to be later completely abandoned, originally in allative and subsequently in locative contexts (earlier: évɣan so qonáq > évɣan to qonáq; later: so meidán en ávʝa > to meidán en ávʝa). The earlier loss in allative contexts is also dealt with in distributed semantics terms as verbs of motion such as έβγαν are semantically more loaded than vacuous verbs like the copula and therefore the preposition could be left out in the former context more easily than in the latter. The analysis also addresses the possibility that the loss of ‘se’ may ultimately originate in substandard forms of Medieval Greek, which according to Tachibana (1994) displayed SPATIAL ADVERB + NP constructions. Applying the semantic map model (Croft 2003, Haspelmath 2003), the synchronic analysis of the varieties that retain ‘se’ reveals that—like many other allative markers crosslinguistically—it displays a pattern of multifunctionality in expressing nine different functions (among others allative, locative, recipient, addressee, experiencer), which can be mapped against four domains, viz. the spatiotemporal, the social, the mental and the logicotextual (cf. Rice & Kabata 2007). In Ulaghátsh Cappadocian, none of these functions is overtly marked as such.
In cases like (1), the intended spatial relational meaning is arrived at through the combination of the syntax and the inherent semantics of the verb and the zero-marked NP as well as from the context. In environments of the type exemplified by (2), the adverb contributes further to the correct interpretation. The analysis additionally shows that, despite the loss of ‘se’, Ulaghátsh patterns with all other Cappadocian varieties in one important aspect: Goal and Location are expressed similarly (by zero in Ulaghátsh, by ‘se’ in the other varieties) whereas Source is kept distinct (expressed by ‘apó’ in all varieties). Goal-Location polysemy is very common across the world’s languages and, most crucially, prevails over other possible polysemies in the tripartite distinction Source—Location—Goal (Lestrade 2010, Nikitina 2009). Taking into account this empirical observation, our findings suggest that the reorganisation of spatial systems can have a local effect—in our case the loss of a member of the prepositional paradigm—but will keep the original global picture intact, thus conforming to crosslinguistically robust tendencies.

References
Croft, W. 2001. Radical Construction Grammar: Syntactic Theory in Typological Perspective. Oxford: Oxford University Press.
Dawkins, R. M. 1916. Modern Greek in Asia Minor: A Study of the Dialects of Sílli, Cappadocia and Phárasa with Grammar, Texts, Translations and Glossary. Cambridge: Cambridge University Press.
Haspelmath, M. 2003. The geometry of grammatical meaning: semantic maps and cross-linguistic comparison. In M. Tomasello (Ed.), The New Psychology of Language, Volume 2. New York: Erlbaum, 211–243.
Lestrade, S. 2010. The Space of Case. Doctoral dissertation. Radboud University Nijmegen.
Nikitina, T. 2009. Subcategorization pattern and lexical meaning of motion verbs: a study of the source/goal ambiguity. Linguistics 47, 1113–1141.
Rice, S. & K. Kabata. 2007. Cross-linguistic grammaticalization patterns of the allative. Linguistic Typology 11, 451–514.
Sinha, C. & T. Kuteva. 1995. Distributed spatial semantics. Nordic Journal of Linguistics 18:2, 167–199.
Tachibana, T. 1994. Syntactic structure of spatial expressions in the “Late Byzantine Prose Alexander Romance”. Propylaia 6, 35–51.
Abstract:
Dissertation submitted to the Escola Superior de Educação de Lisboa for the degree of Master in Didactics of the Portuguese Language in the 1st and 2nd Cycles of Basic Education
Abstract:
After a historical introduction, the bulk of the thesis concerns the study of a declarative semantics for logic programs. The main original contributions are:
- WFSX (Well-Founded Semantics with eXplicit negation), a new semantics for logic programs with explicit negation (i.e. extended logic programs), which compares favourably in its properties with other extant semantics.
- A generic characterization schema that facilitates comparisons among a diversity of semantics of extended logic programs, including WFSX.
- An autoepistemic and a default logic corresponding to WFSX, which solve existing problems of the classical approaches to autoepistemic and default logics, and clarify the meaning of explicit negation in logic programs.
- A framework for defining a spectrum of semantics of extended logic programs based on the abduction of negative hypotheses. This framework allows for the characterization of different levels of scepticism/credulity, consensuality, and argumentation. One of the semantics of abduction coincides with WFSX.
- O-semantics, a semantics that uniquely adds more CWA hypotheses to WFSX. The techniques used for doing so are applicable as well to the well-founded semantics of normal logic programs.
- By introducing explicit negation into logic programs contradiction may appear. I present two approaches for dealing with contradiction, and show their equivalence. One of the approaches consists in avoiding contradiction, and is based on restrictions in the adoption of abductive hypotheses. The other approach consists in removing contradiction, and is based on a transformation of contradictory programs into noncontradictory ones, guided by the reasons for contradiction.
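As background for the contributions listed above, the well-founded semantics of normal logic programs (without explicit negation) can be computed by the standard alternating-fixpoint construction. The sketch below covers only that base case. It is not WFSX, since it has no explicit negation and no contradiction handling. The rule encoding, the function names and the toy program are assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical encoding: (head, positive body atoms, atoms under default negation).
Rule = Tuple[str, List[str], List[str]]


def gamma(rules: List[Rule], assumed: Set[str]) -> Set[str]:
    """Least model of the reduct of `rules` with respect to `assumed`."""
    # Drop rules whose negative body is contradicted by `assumed`,
    # then drop the remaining negative literals.
    reduct = [(h, pos) for h, pos, neg in rules if not any(a in assumed for a in neg)]
    derived: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for head, pos in reduct:
            if head not in derived and all(a in derived for a in pos):
                derived.add(head)
                changed = True
    return derived


def well_founded_model(rules: List[Rule]) -> Tuple[Set[str], Set[str]]:
    """Return (true atoms, false atoms); all other atoms are undefined."""
    atoms = {a for h, pos, neg in rules for a in [h, *pos, *neg]}
    true: Set[str] = set()
    while True:
        possibly_true = gamma(rules, true)      # overestimate of the true atoms
        new_true = gamma(rules, possibly_true)  # underestimate of the true atoms
        if new_true == true:
            break
        true = new_true
    return true, atoms - possibly_true


# Toy program: a.  b :- a, not c.  p :- not q.  q :- not p.
program = [
    ("a", [], []),
    ("b", ["a"], ["c"]),
    ("p", [], ["q"]),
    ("q", [], ["p"]),
]
true_atoms, false_atoms = well_founded_model(program)
```

In the toy program `a` and `b` come out true, `c` false, and the mutually dependent `p` and `q` stay undefined, which is the three-valued behaviour that distinguishes well-founded semantics from two-valued answer sets.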
Abstract:
This dissertation was written within the master's programme in Linguistics: Language Sciences. Based on a case study of the Acquisition of Lexical Competence in Learning Portuguese as a Second Language, it aims to establish whether Angolan students who learn Portuguese as a second language acquire and develop lexical competence, taking their specific circumstances into account. The dissertation discusses the teaching of Portuguese and the consequent acquisition of lexical competence in a plurilingual setting, considering the methodologies adopted for that purpose. Since Portuguese is the language of pedagogical discourse in Angola and, at the same time, a second language for the majority of the Angolan population, who speak various (local, native) languages known as the national or African languages of Angola, there was keen interest in reflecting on its teaching and on the methodologies used, with a view to the acquisition and development of lexical competence by the students who learn it. Angola's linguistic plurality poses enormous challenges to the state, to teachers of Portuguese and to others with regard to the adoption of a language policy, both for Portuguese and for the African languages of Angola, concerning their teaching and the promotion of school success at the most varied levels of schooling. For these and other reasons, this dissertation argues for the clarification of appropriate, contextualised methodologies for teaching Portuguese in Angola, whether as a second language or as a mother tongue, with the choice of methodology based on the student's specific situation, since the learner's primary linguistic background should not be ignored if harmonious, solid and meaningful learning is to be achieved.
Abstract:
BACKGROUND: The number of nonagenarians and centenarians is rising dramatically, and many of them live in nursing homes. Very little is known about psychiatric symptoms and cognitive abilities other than memory in this population. This exploratory study focuses on anosognosia and its relationship with common psychiatric and cognitive symptoms. METHODS: Fifty-eight subjects aged 90 years or older were recruited from geriatric nursing homes and divided into five groups according to Mini-Mental State Examination scores. Assessment included the five-word test, executive clock-drawing task, lexical and categorical fluencies, Anosognosia Questionnaire-Dementia, Neuropsychiatric Inventory, and Charlson Comorbidity Index. RESULTS: Subjects had moderate cognitive impairment, with mean ± SD Mini-Mental State Examination being 15.41 ± 7.04. Anosognosia increased with cognitive impairment and was associated with all cognitive domains, as well as with apathy and agitation. Subjects with mild global cognitive decline seemed less anosognosic than subjects with the least or no impairment. Neither anosognosia nor psychopathological features were related to physical conditions. CONCLUSIONS: Anosognosia in oldest-old nursing home residents was mostly mild. It was associated with both cognitive and psychopathological changes, but whether anosognosia is causal to the observed psychopathological features requires further investigation.
Abstract:
Lexical processing among bilinguals is often affected by complex patterns of individual experience. In this paper we discuss the psychocentric perspective on language representation and processing, which highlights the centrality of individual experience in psycholinguistic experimentation. We discuss applications to the investigation of lexical processing among multilinguals and explore the advantages of using high-density experiments with multilinguals. High-density experiments are designed to co-index measures of lexical perception and production, as well as participant profiles. We discuss the challenges associated with the characterization of participant profiles and present a new data visualization technique that we term Facial Profiles. This technique is based on Chernoff faces, developed over 40 years ago. The Facial Profile technique seeks to overcome some of the challenges associated with the use of Chernoff faces, while maintaining the core insight that recoding multivariate data as facial features can engage the human face recognition system and thus enhance our ability to detect and interpret patterns within multivariate datasets. We demonstrate that Facial Profiles can code participant characteristics in lexical processing studies by recoding variables such as reading ability, speaking ability, and listening ability into iconically-related relative sizes of eye, mouth, and ear, respectively. The balance of ability in bilinguals can be captured by creating composite facial profiles or Janus Facial Profiles. We demonstrate the use of Facial Profiles and Janus Facial Profiles in the characterization of participant effects in the study of lexical perception and production.
Abstract:
This lexical decision study with eye tracking of Japanese two-kanji-character words investigated the order in which a whole two-character word and its morphographic constituents are activated in the course of lexical access, the relative contributions of the left and the right characters in lexical decision, the depth to which semantic radicals are processed, and how nonlinguistic factors affect lexical processes. Mixed-effects regression analyses of response times and subgaze durations (i.e., first-pass fixation time spent on each of the two characters) revealed joint contributions of morphographic units at all levels of the linguistic structure with the magnitude and the direction of the lexical effects modulated by readers’ locus of attention in a left-to-right preferred processing path. During the early time frame, character effects were larger in magnitude and more robust than radical and whole-word effects, regardless of the font size and the type of nonwords. Extending previous radical-based and character-based models, we propose a task/decision-sensitive character-driven processing model with a level-skipping assumption: Connections from the feature level bypass the lower radical level and link up directly to the higher character level.
Abstract:
Faculty of Medicine, University of Montreal, and the Canadian Institutes of Health Research