82 results for Toponym disambiguation


Relevance:

10.00% 10.00%

Publisher:

Abstract:

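This paper proposes an imagery-based knowledge system that can represent both commonsense and linguistic knowledge. Building on a formal representation of this knowledge, it presents the disambiguation knowledge used in NLP and its representation, together with disambiguation strategies based on that knowledge. Finally, it discusses the feasibility of implementing the approach.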

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Fictitious personal names and toponyms are not infrequent in legal casenotes used for didactic purposes nowadays. There is a long tradition of fictitious names being used in the legal literature. The problem with medieval or early modern legal (here, rabbinical) responsa is that if they are used as evidence for historical purposes, as though they were chronicles, confusion may occur. Historian Eliezer Bashan showed that this is indeed the case, with particular reference to rabbinical responsa from the Ottoman empire in which Holy Land toponyms occur. He set forth several tentative rules to decide whether a toponym literally refers to the place it names or whether, instead, the name is used fictitiously. This paper formalizes the ruleset.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A formal representation is given of the situational structure, and the agents' beliefs about personal identity, in the Smemorato di Collegno amnesia case tried in 1927, in Pollenza, Italy. Another section discusses and formalizes a sample heuristic rule for conjecturing whether an individual identity other than personal, being conveyed by a toponym, was used literally or fictitiously in a given historical corpus of legal casenotes. For example, a landlocked city being named and referred to as though it were a sea port is a fairly good cue for assuming that the toponym is a disguise. Yet the interpretation is governed by other conventions when, in a play by Shakespeare, it is stated that a given scene is set on the sea coast of Bohemia. Further discussion of a situational casuistry for identification (especially individual and personal), along with more formal representations, will appear in a companion paper "nissanidentifpirandello", also at the disciplinary meet of AI formalisms and legal applications.
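The landlocked-seaport cue described in this abstract can be illustrated with a minimal sketch. This is not the paper's formalization: the `Place` record, its `is_coastal` attribute, and the `SEAPORT_CUES` word list are hypothetical names introduced only for illustration, and a real system would presumably consult a gazetteer rather than a hand-set flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    """Hypothetical gazetteer entry; is_coastal is an assumed attribute."""
    name: str
    is_coastal: bool


# Hypothetical cue words suggesting that the text treats a place as a sea port.
SEAPORT_CUES = {"port", "harbour", "harbor", "ship", "ships", "dock", "docks"}


def likely_fictitious(place: Place, context: str) -> bool:
    """Return True when a landlocked place is described with seaport vocabulary."""
    words = {w.strip(".,;:!?\"'()").lower() for w in context.split()}
    described_as_port = bool(words & SEAPORT_CUES)
    return described_as_port and not place.is_coastal
```

As the abstract itself cautions with the Bohemia example, a cue like this is only a heuristic: genre conventions can override it, so a practical rule would need to take the kind of corpus into account.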

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The seventh-century Patrician documents in the Book of Armagh, and other early sources such as Bethu Phátraic, contain references to the toponym Macha, which has been identified by the Dictionary of the Irish Language with either the ecclesiastical centre of Ard Macha or the ‘royal seat’ of Emain Macha. This article examines the evidence for the name in the sources and illustrates that Macha applies primarily to the plain in which both Ard Macha and Emain Macha are located. It is to be identified with Mag Macha ‘the plain of Macha’, familiar to us from the Dindshenchus, and further evidence of the organic potential of a given toponym is witnessed in later sources where the plain is referred to as Mag/Machaire na hE(a)mna ‘the plain of Emain’ and Machaire Aird/Arda Macha ‘the plain of Armagh’. The extent of Macha is difficult to establish with certainty, but it seems very likely that it stretched north to the River Blackwater as well as south towards Slíab Fúait.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

It is well known that ambiguity in language harms the results of Information Retrieval (IR) systems. However, research efforts to integrate Word Sense Disambiguation (WSD) techniques into IR have not borne fruit. Most studies on the subject report negative or unconvincing results. Moreover, investigations based on adding artificial ambiguity conclude that very high disambiguation accuracy would be needed to achieve a positive effect. This thesis aims to develop new, more effective and efficient approaches, focusing on the use of co-occurrence statistics to build context models. These models can then be used to discriminate senses between a query and the documents of a collection. In this two-part thesis, we first investigate the strength of the relationship between a word and the words in its context, proposing a method for learning the weight of a context word as a function of its distance from the modeled word in the document. This method rests on the idea that context models built from random samples of words in context should be similar. Experiments in English and Japanese show that the strength of the relationship as a function of distance generally follows a negative power law. The resulting weights are then used to build Naive Bayes WSD systems. Evaluations of these systems on SemEval workshop data, in English for the Semeval-2007 English Lexical Sample task and in Japanese for the Semeval-2010 Japanese WSD task, show that the systems achieve results comparable to the state of the art, even though they are much more lightweight and do not depend on linguistic tools or resources.
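To make the idea of distance-dependent context weights concrete, here is a minimal sketch of a Naive Bayes sense classifier whose context words are weighted by a negative power of their distance from the target word. It is an illustration under stated assumptions, not the thesis's system: the exponent `alpha` is a hypothetical fixed value (the thesis learns its weights from data), and using fractional weights as Naive Bayes counts is a simplification chosen for this example.

```python
import math
from collections import Counter, defaultdict


def distance_weight(distance: int, alpha: float = 1.0) -> float:
    """Power-law decay: distance ** -alpha (alpha is an assumed value)."""
    return distance ** -alpha


def weighted_context(tokens, target_index, alpha=1.0):
    """Map each context word to its summed distance weight around the target."""
    weights = defaultdict(float)
    for i, tok in enumerate(tokens):
        if i != target_index:
            weights[tok] += distance_weight(abs(i - target_index), alpha)
    return weights


class WeightedNaiveBayes:
    """Naive Bayes over context words, using distance weights as soft counts."""

    def __init__(self, alpha=1.0, smoothing=1.0):
        self.alpha = alpha
        self.smoothing = smoothing
        self.sense_counts = Counter()
        self.word_weights = defaultdict(lambda: defaultdict(float))
        self.vocab = set()

    def train(self, tokens, target_index, sense):
        self.sense_counts[sense] += 1
        for word, w in weighted_context(tokens, target_index, self.alpha).items():
            self.word_weights[sense][word] += w
            self.vocab.add(word)

    def predict(self, tokens, target_index):
        context = weighted_context(tokens, target_index, self.alpha)
        total = sum(self.sense_counts.values())
        best_sense, best_score = None, -math.inf
        for sense, count in self.sense_counts.items():
            mass = sum(self.word_weights[sense].values())
            denom = mass + self.smoothing * len(self.vocab)
            score = math.log(count / total)  # log prior of the sense
            for word, w in context.items():
                seen = self.word_weights[sense].get(word, 0.0)
                # Nearby words contribute more to the log-likelihood.
                score += w * math.log((seen + self.smoothing) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense
```

With `alpha = 1.0`, a word adjacent to the target counts twice as much as a word two positions away, so the nearest neighbours dominate the decision.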
The second part of this thesis aims to adapt the methods developed to Information Retrieval applications. These applications face the additional difficulty of not being able to rely on manually created data. We therefore propose latent-variable context models based on Latent Dirichlet Allocation (LDA). These are combined with the language-model query likelihood method. Evaluating the resulting system on three collections from the TREC (Text REtrieval Conference) conference, we observe an average relative improvement of 12% in MAP and 23% in GMAP. The gains come mainly on difficult queries, increasing the stability of the results. These experiments would be the first positive application of WSD techniques to standard IR tasks.
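The two reported measures can be sketched as follows, to show why gains on difficult queries move GMAP more than MAP. This is a generic illustration, not the thesis's evaluation code; the `epsilon` floor applied before taking logarithms is an assumption made here so that a zero score does not collapse the geometric mean.

```python
import math


def average_precision(ranked_relevance, num_relevant):
    """Average precision for one query from a ranked list of relevance flags."""
    hits, total = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / num_relevant if num_relevant else 0.0


def mean_average_precision(ap_values):
    """MAP: arithmetic mean of per-query average precision."""
    return sum(ap_values) / len(ap_values)


def geometric_map(ap_values, epsilon=1e-5):
    """GMAP: geometric mean of per-query AP, floored at an assumed epsilon."""
    logs = [math.log(max(ap, epsilon)) for ap in ap_values]
    return math.exp(sum(logs) / len(logs))
```

For two queries with average precision 0.5 and 0.02, doubling the weaker one to 0.04 raises MAP only from 0.26 to 0.27, while GMAP rises from 0.10 to about 0.141. The geometric mean rewards improvements on poorly performing queries, which fits the abstract's remark that the gains come mainly on difficult queries.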

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Grammar-checking software sometimes produces illegitimate detections (false alarms), which we call over-detections here. This study describes the experiments carried out to develop a system built to identify and silence the over-detections produced by the French grammar checker designed by Druide informatique. Several classifiers were trained in a supervised manner on 14 types of detections made by the checker, using features covering diverse linguistic information (syntactic dependencies and categories, exploration of word context, etc.) extracted from sentences with and without over-detections. Eight of the 14 classifiers developed are now integrated into the new version of a very popular commercial checker. Our experiments also showed that probabilistic language models, SVMs and word sense disambiguation improve the quality of these classifiers. This work is a successful example of deploying a machine learning approach in the service of a robust consumer language application.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In machine learning, a field that consists of using data to learn a solution to the problems we want to entrust to the machine, the Artificial Neural Network (ANN) model is a valuable tool. It was invented nearly sixty years ago, yet it is still the subject of active research today. Recently, with deep learning, it has improved the state of the art in many application areas such as computer vision, speech processing and natural language processing. The ever-growing quantity of available data and improvements in computer hardware have made it easier to train high-capacity models such as deep ANNs. However, difficulties inherent in training such models, such as local minima, still have a significant impact. Deep learning therefore seeks solutions, by regularizing or by easing optimization. Unsupervised pre-training and the "Dropout" technique are examples. The first two works presented in this thesis follow this line of research. The first studies the problems of vanishing/exploding gradients in deep architectures. It shows that simple choices, such as the activation function or the initialization of the network's weights, have a large influence. We propose normalized initialization to ease training. The second focuses on the choice of activation function and presents the rectifier, or rectified linear unit. This study was the first to emphasize piecewise-linear activation functions for deep neural networks in supervised learning. Today, this type of activation function is an essential component of deep neural networks.
The last two works presented focus on applications of ANNs to natural language processing. The first addresses domain adaptation for sentiment analysis, using Denoising Auto-Encoders. It is still the state of the art today. The second deals with learning from multi-relational data with an energy-based model, which can be used for the task of word sense disambiguation.
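The two ingredients named in this abstract, normalized initialization and the rectifier, can be sketched in a few lines. The abstract gives no formulas, so the uniform range below is an assumption: it is the form in which normalized initialization is commonly presented, with bounds of plus or minus sqrt(6 / (fan_in + fan_out)). The function names are hypothetical.

```python
import math
import random


def normalized_init(fan_in, fan_out, rng=None):
    """Draw a fan_in x fan_out weight matrix uniformly within the assumed
    normalized-initialization bounds +/- sqrt(6 / (fan_in + fan_out))."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def rectifier(x):
    """Rectified linear unit: a piecewise-linear activation, max(0, x)."""
    return max(0.0, x)
```

Scaling the range by the layer's fan-in and fan-out is intended to keep activation and gradient magnitudes roughly comparable from layer to layer, which is the concern the abstract raises about vanishing and exploding gradients.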

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This work presents a named-entity-based Question Answering System for the Malayalam language. Although a vast amount of information is available today in digital form, no effective mechanism exists to give people convenient access to it. Information Retrieval and Question Answering systems are the two mechanisms now available for information access. Information Retrieval systems typically return a long list of documents in response to a user's query, which the user must skim to determine whether they contain an answer. A Question Answering System, by contrast, allows users to state their information need as a natural language question and receive the most appropriate answer as a word, a sentence or a paragraph. This system is based on Named Entity Tagging and Question Classification. Document tagging extracts useful information from the documents, which is then used to find the answer to the question. Question Classification extracts useful information from the question to determine the type of the question and the way in which it is to be answered. Various Machine Learning methods are used to tag the documents. A rule-based approach is used for Question Classification. Malayalam belongs to the Dravidian family of languages and is one of the four major languages of this family. It is one of the 22 Scheduled Languages of India, with official language status in the state of Kerala. It is spoken by 40 million people. Malayalam is a morphologically rich agglutinative language with relatively free word order. Malayalam also has a productive morphology that allows the creation of complex words, which are often highly ambiguous. Document tagging tools such as a Parts-of-Speech Tagger, Phrase Chunker, Named Entity Tagger, and Compound Word Splitter were developed as part of this research work. No such tools were previously available for the Malayalam language.
Finite State Transducer, High Order Conditional Random Field, Artificial Immunity System Principles, and Support Vector Machines are the techniques used in the design of these document preprocessing tools. This research work describes how the Named Entity is used to represent the documents. Single-sentence questions are used to test the system. The overall Precision and Recall obtained are 88.5% and 85.9% respectively. This work can be extended in several directions. The coverage of non-factoid questions can be increased, and the system can be extended to open-domain applications. Reference Resolution and Word Sense Disambiguation techniques are suggested as future enhancements.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Sketches are commonly used in the early stages of design. Our previous system allows users to sketch mechanical systems that the computer interprets. However, some parts of the mechanical system might be too hard or too complicated to express in the sketch. Adding speech recognition to create a multimodal system would move us toward our goal of creating a more natural user interface. This thesis examines the relationship between the verbal and sketch input, particularly how to segment and align the two inputs. Toward this end, subjects were recorded while they sketched and talked. These recordings were transcribed, and a set of rules to perform segmentation and alignment was created. These rules represent the knowledge that the computer needs to perform segmentation and alignment. The rules successfully interpreted the 24 data sets that they were given.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This thesis is a compilation of the past and present toponyms of the municipality of Osor, in the county of La Selva, in the Guilleries. It gathers nearly 3600 place names collected orally or from old documents, for each of which it provides a location, a documentary record, a study of the written forms and an etymological hypothesis. In addition, it presents the geographic location (where possible) within the area studied, a study of the toponym Osor, a sample of the onomastic studies of the Girona counties, a study of the generic terms that introduce the toponyms in the study, a semantic classification of the terms collected and several location maps. It also presents the etymological approach followed, final conclusions and an extensive bibliography, as well as two appendices, one on the family names of Osor over the course of history and another with the nicknames collected through oral interviews or old documents.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Word sense disambiguation is the task of determining which sense of a word is intended from its context. Previous methods have found the lack of training data and the restrictiveness of dictionaries' choices of senses to be major stumbling blocks. A robust novel algorithm is presented that uses multiple dictionaries, the Internet, clustering and triangulation to attempt to discern the most useful senses of a given word and learn how they can be disambiguated. The algorithm is explained, and some promising sample results are given.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We monitored 8- and 10-year-old children’s eye movements as they read sentences containing a temporary syntactic ambiguity to obtain a detailed record of their online processing. Children showed the classic garden-path effect in online processing. Their reading was disrupted following disambiguation, relative to control sentences containing a comma to block the ambiguity, although the disruption occurred somewhat later than would be expected for mature readers. We also asked children questions to probe their comprehension of the syntactic ambiguity offline. They made more errors following ambiguous sentences than following control sentences, demonstrating that the initial incorrect parse of the garden-path sentence influenced offline comprehension. These findings are consistent with “good enough” processing effects seen in adults. While faster reading times and more regressions were generally associated with better comprehension, spending longer reading the question predicted comprehension success specifically in the ambiguous condition. This suggests that reading the question prompted children to reconstruct the sentence and engage in some form of processing, which in turn increased the likelihood of comprehension success. Older children were more sensitive to the syntactic function of commas, and, overall, they were faster and more accurate than younger children.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This study examines the effects of a multi-session Cognitive Bias Modification (CBM) program on interpretative biases and social anxiety in an Iranian sample. Thirty-six volunteers with a high score on social anxiety measures were recruited from a student population and randomly allocated to the experimental and control groups. In the experimental group, participants received 4 sessions of positive CBM for interpretative biases (CBM-I) over 2 weeks in the laboratory. Participants in the control condition completed a neutral task that matched the active CBM-I intervention in format and duration but did not encourage positive disambiguation of socially ambiguous scenarios. The results indicated that, after training, the positive CBM-I group exhibited more positive (and less negative) interpretations of ambiguous scenarios and fewer social anxiety symptoms relative to the control condition at both 1 week post-test and 7 weeks follow-up. It is suggested that clinical trials are required to establish the clinical efficacy of this intervention for social anxiety.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Human functional imaging provides a correlative picture of brain activity during pain. A particular set of central nervous system structures (e.g., the anterior cingulate cortex, thalamus, and insula) consistently respond to transient nociceptive stimuli causing pain. Activation of this so-called pain matrix or pain signature has been related to perceived pain intensity, both within and between individuals [1,2], and is now considered a candidate biomarker for pain in medicolegal settings and a tool for drug discovery. The pain-specific interpretation of such functional magnetic resonance imaging (fMRI) responses, although logically flawed [3,4], remains pervasive. For example, a 2015 review states that “the most likely interpretation of activity in the pain matrix seems to be pain.” [4] Demonstrating the nonspecificity of the pain matrix requires ruling out the presence of pain when highly salient sensory stimuli are presented. In this study, we administered noxious mechanical stimuli to individuals with congenital insensitivity to pain and sampled their brain activity with fMRI. Loss-of-function SCN9A mutations in these individuals abolish sensory neuron sodium channel Nav1.7 activity, resulting in pain insensitivity through an impaired peripheral drive that leaves tactile percepts fully intact [5]. This allows complete experimental disambiguation of sensory responses and painful sensations.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The main goal of this work is to clarify the central concepts involved in the study of the formalization of conditional sentences. More specifically, it carries out a comparative analysis of the two major and most traditional proposals for formalizing conditionals (Lewis 1973c and Adams 1975). These proposals were responsible for creating a mode of analysis that is still present in the current debate on the subject. This work seeks to explain the principal assumptions held within these proposals. Drawing on certain disambiguation techniques from Bennett (2003) and Lycan (2005), it tries to make explicit how these assumptions connect to the aims sought by the initial approaches. The results show that there is an undeclared presumption: the definition of the object of study of these theories, i.e., the definition of a conditional sentence. This work argues that, despite not being explicitly declared, the definition of the object of study plays a central role in the intelligibility of the debate itself.