61 resultados para Word Sense Disambguaion, WSD, Natural Language Processing
Resumo:
PowerAqua is a Question Answering system, which takes as input a natural language query and is able to return answers drawn from relevant semantic resources found anywhere on the Semantic Web. In this paper we provide two novel contributions: First, we detail a new component of the system, the Triple Similarity Service, which is able to match queries effectively to triples found in different ontologies on the Semantic Web. Second, we provide a first evaluation of the system, which in addition to providing data about PowerAqua's competence, also gives us important insights into the issues related to using the Semantic Web as the target answer set in Question Answering. In particular, we show that, despite the problems related to the noisy and incomplete conceptualizations, which can be found on the Semantic Web, good results can already be obtained.
Resumo:
While semantic search technologies have been proven to work well in specific domains, they still have to confront two main challenges to scale up to the Web in its entirety. In this work we address this issue with a novel semantic search system that a) provides the user with the capability to query Semantic Web information using natural language, by means of an ontology-based Question Answering (QA) system [14] and b) complements the specific answers retrieved during the QA process with a ranked list of documents from the Web [3]. Our results show that ontology-based semantic search capabilities can be used to complement and enhance keyword search technologies.
Resumo:
The goal of semantic search is to improve on traditional search methods by exploiting the semantic metadata. In this paper, we argue that supporting iterative and exploratory search modes is important to the usability of all search systems. We also identify the types of semantic queries the users need to make, the issues concerning the search environment and the problems that are intrinsic to semantic search in particular. We then review the four modes of user interaction in existing semantic search systems, namely keyword-based, form-based, view-based and natural language-based systems. Future development should focus on multimodal search systems, which exploit the advantages of more than one mode of interaction, and on developing the search systems that can search heterogeneous semantic metadata on the open semantic Web.
Resumo:
The Semantic Web (SW) offers an opportunity to develop novel, sophisticated forms of question answering (QA). Specifically, the availability of distributed semantic markup on a large scale opens the way to QA systems which can make use of such semantic information to provide precise, formally derived answers to questions. At the same time the distributed, heterogeneous, large-scale nature of the semantic information introduces significant challenges. In this paper we describe the design of a QA system, PowerAqua, designed to exploit semantic markup on the web to provide answers to questions posed in natural language. PowerAqua does not assume that the user has any prior information about the semantic resources. The system takes as input a natural language query, translates it into a set of logical queries, which are then answered by consulting and aggregating information derived from multiple heterogeneous semantic sources.
Resumo:
The management and sharing of complex data, information and knowledge is a fundamental and growing concern in the Water and other Industries for a variety of reasons. For example, risks and uncertainties associated with climate, and other changes require knowledge to prepare for a range of future scenarios and potential extreme events. Formal ways in which knowledge can be established and managed can help deliver efficiencies on acquisition, structuring and filtering to provide only the essential aspects of the knowledge really needed. Ontologies are a key technology for this knowledge management. The construction of ontologies is a considerable overhead on any knowledge management programme. Hence current computer science research is investigating generating ontologies automatically from documents using text mining and natural language techniques. As an example of this, results from application of the Text2Onto tool to stakeholder documents for a project on sustainable water cycle management in new developments are presented. It is concluded that by adopting ontological representations sooner, rather than later in an analytical process, decision makers will be able to make better use of highly knowledgeable systems containing automated services to ensure that sustainability considerations are included.
Resumo:
The project “Reference in Discourse” deals with the selection of a specific object from a visual scene in a natural language situation. The goal of this research is to explain this everyday discourse reference task in terms of a concept generation process based on subconceptual visual and verbal information. The system OINC (Object Identification in Natural Communicators) aims at solving this problem in a psychologically adequate way. The system’s difficulties occurring with incomplete and deviant descriptions correspond to the data from experiments with human subjects. The results of these experiments are reported.
Resumo:
The management and sharing of complex data, information and knowledge is a fundamental and growing concern in the Water and other Industries for a variety of reasons. For example, risks and uncertainties associated with climate, and other changes require knowledge to prepare for a range of future scenarios and potential extreme events. Formal ways in which knowledge can be established and managed can help deliver efficiencies on acquisition, structuring and filtering to provide only the essential aspects of the knowledge really needed. Ontologies are a key technology for this knowledge management. The construction of ontologies is a considerable overhead on any knowledge management programme. Hence current computer science research is investigating generating ontologies automatically from documents using text mining and natural language techniques. As an example of this, results from application of the Text2Onto tool to stakeholder documents for a project on sustainable water cycle management in new developments are presented. It is concluded that by adopting ontological representations sooner, rather than later in an analytical process, decision makers will be able to make better use of highly knowledgeable systems containing automated services to ensure that sustainability considerations are included. © 2010 The authors.
Resumo:
Relatively little research on dialect variation has been based on corpora of naturally occurring language. Instead, dialect variation has been studied based primarily on language elicited through questionnaires and interviews. Eliciting dialect data has several advantages, including allowing for dialectologists to select individual informants, control the communicative situation in which language is collected, elicit rare forms directly, and make high-quality audio recordings. Although far less common, a corpus-based approach to data collection also has several advantages, including allowing for dialectologists to collect large amounts of data from a large number of informants, observe dialect variation across a range of communicative situations, and analyze quantitative linguistic variation in large samples of natural language. Although both approaches allow for dialect variation to be observed, they provide different perspectives on language variation and change. The corpus- based approach to dialectology has therefore produced a number of new findings, many of which challenge traditional assumptions about the nature of dialect variation. Most important, this research has shown that dialect variation involves a wider range of linguistic variables and exists across a wider range of language varieties than has previously been assumed. The goal of this chapter is to introduce this emerging approach to dialectology. The first part of this chapter reviews the growing body of research that analyzes dialect variation in corpora, including research on variation across nations, regions, genders, ages, and classes, in both speech and writing, and from both a synchronic and diachronic perspective, with a focus on dialect variation in the English language. Although collections of language data elicited through interviews and questionnaires are now commonly referred to as corpora in sociolinguistics and dialectology (e.g. see Bauer 2002; Tagliamonte 2006; Kretzschmar et al. 2006; D'Arcy 2011), this review focuses on corpora of naturally occurring texts and discourse. The second part of this chapter presents the results of an analysis of variation in not contraction across region, gender, and time in a corpus of American English letters to the editor in order to exemplify a corpus-based approach to dialectology.
Resumo:
The first study of its kind, Regional Variation in Written American English takes a corpus-based approach to map over a hundred grammatical alternation variables across the United States. A multivariate spatial analysis of these maps shows that grammatical alternation variables follow a relatively small number of common regional patterns in American English, which can be explained based on both linguistic and extra-linguistic factors. Based on this rigorous analysis of extensive data, Grieve identifies five primary modern American dialect regions, demonstrating that regional variation is far more pervasive and complex in natural language than is generally assumed. The wealth of maps and data and the groundbreaking implications of this volume make it essential reading for students and researchers in linguistics, English language, geography, computer science, sociology and communication studies.
Resumo:
It is well established that speech, language and phonological skills are closely associated with literacy, and that children with a family risk of dyslexia (FRD) tend to show deficits in each of these areas in the preschool years. This paper examines what the relationships are between FRD and these skills, and whether deficits in speech, language and phonological processing fully account for the increased risk of dyslexia in children with FRD. One hundred and fifty-three 4-6-year-old children, 44 of whom had FRD, completed a battery of speech, language, phonology and literacy tasks. Word reading and spelling were retested 6 months later, and text reading accuracy and reading comprehension were tested 3 years later. The children with FRD were at increased risk of developing difficulties in reading accuracy, but not reading comprehension. Four groups were compared: good and poor readers with and without FRD. In most cases good readers outperformed poor readers regardless of family history, but there was an effect of family history on naming and nonword repetition regardless of literacy outcome, suggesting a role for speech production skills as an endophenotype of dyslexia. Phonological processing predicted spelling, while language predicted text reading accuracy and comprehension. FRD was a significant additional predictor of reading and spelling after controlling for speech production, language and phonological processing, suggesting that children with FRD show additional difficulties in literacy that cannot be fully explained in terms of their language and phonological skills. It is well established that speech, language and phonological skills are closely associated with literacy, and that children with a family risk of dyslexia (FRD) tend to show deficits in each of these areas in the preschool years. This paper examines what the relationships are between FRD and these skills, and whether deficits in speech, language and phonological processing fully account for the increased risk of dyslexia in children with FRD. One hundred and fifty-three 4-6-year-old children, 44 of whom had FRD, completed a battery of speech, language, phonology and literacy tasks. © 2014 John Wiley & Sons Ltd.
Resumo:
We conducted a detailed study of a case of linguistic talent in the context of autism spectrum disorder, specifically Asperger syndrome. I.A. displays language strengths at the level of morphology and syntax. Yet, despite this grammar advantage, processing of figurative language and inferencing based on context presents a problem for him. The morphology advantage for I.A. is consistent with the weak central coherence (WCC) account of autism. From this account, the presence of a local processing bias is evident in the ways in which autistic individuals solve common problems, such as assessing similarities between objects and finding common patterns, and may therefore provide an advantage in some cognitive tasks compared to typical individuals. We extend the WCC account to language and provide evidence for a connection between the local processing bias and the acquisition of morphology and grammar.
Resumo:
Detection thresholds for two visual- and two auditory-processing tasks were obtained for 73 children and young adults who varied broadly in reading ability. A reading-disabled subgroup had significantly higher thresholds than a normal-reading subgroup for the auditory tasks only. When analyzed across the whole group, the auditory tasks and one of the visual tasks, coherent motion detection, were significantly related to word reading. These effects were largely independent of ADHD ratings; however, none of these measures accounted for significant variance in word reading after controlling for full-scale IQ. In contrast, phoneme awareness, rapid naming, and nonword repetition each explained substantial, significant word reading variance after controlling for IQ, suggesting more specific roles for these oral language skills in the development of word reading. © 2004 Elsevier Inc. All rights reserved.
Resumo:
Background: Oral anticoagulation (OAC) reduces stroke risk in patients with atrial fibrillation (AF); however it is still underutilized and sometimes refused by patients. Two inter-related studies were undertaken to understand the experiences and what influences this un- derutilisation of warfarin treatment in AF patients. These studies explored physician and patient experiences of AF and OAC treatment. The paper focuses on specific sub-themes from the study that explored patients’ experiences will be discussed. Aim: The study in question aimed to explore the experiences which influence patients’ decisions to accept, decline or discontinue OAC. Methods: Semi-structured individual interviews with patients were con- ducted. Three sub-groups of patients (n = 11) diagnosed with AF were interviewed; those who accepted, refused, and who discontinued war- farin. Interpretative phenomenological analysis (IPA) was used to examine the data. IPA is a qualitative method that focuses on how participants make sense of an experiences phenomenon Results: Three over-arching themes comprised patients’ experiences: (i)the initial consultation, (ii) life after the consultation, and (iii) patients’reflections. In the last theme, patients reflected on their perceptions ofaspirin and warfarin. Aspirin was perceived as a natural wonder-drugwhile warfarin was perceived as a dangerous drug usually given to peo-ple at the end of their life. Interestingly they perceive both drugs as‘old’. However, for aspirin it had a positive association, old meaningtried and tested. While for warfarin, old meant ‘has been around fortoo long’.Conclusion: Media had an important role in how patients’ perceptionsof these two drugs were influenced. Literature shows that framingtechniques, i.e. using certain words or phrases such as ‘rat poison’, areprocesses adopted by media to alter medical knowledge into lay per-son’s language. Patients in turn form negative cognitive schemas,between the word ‘poison’ and warfarin, leading to the negative per-ception of warfarin which could influence non-adherence to treatment.This qualitative research highlighted the potential influences of themedia on AF patient perceptions commencing OAC treatment. Theassociation between media stimuli and patient perceptions on OACshould be further explored. The influential power of lay-media couldalso be instrumental in disseminating appropriate educational materialto the public
Resumo:
It has been proposed that language impairments in children with Autism Spectrum Disorders (ASD) stem from atypical neural processing of speech and/or nonspeech sounds. However, the strength of this proposal is compromised by the unreliable outcomes of previous studies of speech and nonspeech processing in ASD. The aim of this study was to determine whether there was an association between poor spoken language and atypical event-related field (ERF) responses to speech and nonspeech sounds in children with ASD (n = 14) and controls (n = 18). Data from this developmental population (ages 6-14) were analysed using a novel combination of methods to maximize the reliability of our findings while taking into consideration the heterogeneity of the ASD population. The results showed that poor spoken language scores were associated with atypical left hemisphere brain responses (200 to 400 ms) to both speech and nonspeech in the ASD group. These data support the idea that some children with ASD may have an immature auditory cortex that affects their ability to process both speech and nonspeech sounds. Their poor speech processing may impair their ability to process the speech of other people, and hence reduce their ability to learn the phonology, syntax, and semantics of their native language.