998 results for Syntactic semantic patterns
Resumo:
In this paper we present an automatic system for the extraction of syntactic semantic patterns applied to the development of multilingual processing tools. In order to achieve optimum methods for the automatic treatment of more than one language, we propose the use of syntactic semantic patterns. These patterns are formed by a verbal head and the main arguments, and they are aligned among languages. In this paper we present an automatic system for the extraction and alignment of syntactic semantic patterns from two manually annotated corpora, and evaluate the main linguistic problems that we must deal with in the alignment process.
Resumo:
In the last few years, research on textual information systems has developed considerably. The goal is to improve these systems so that information stored in digital format (digital databases, document databases, and so on) can be easily located, processed and accessed. Many applications focus on information access (for example, Web search systems such as Google or Altavista). However, these applications have problems when they must access cross-language information, or when they need to show information in a language different from that of the query. This paper explores the use of syntactic-semantic patterns as a method for accessing multilingual information and reviews, in the case of Information Retrieval, where it is possible and useful to employ patterns with regard to the multilingual and interactive aspects. On the one hand, the multilingual aspects studied are those related to accessing documents in languages different from that of the query, as well as the automatic translation of the document, i.e. a machine translation system based on patterns. On the other hand, this paper examines in depth the interactive aspects related to reformulating a query based on the syntactic-semantic pattern of the request.
Resumo:
Most existing approaches to Twitter sentiment analysis assume that sentiment is explicitly expressed through affective words. Nevertheless, sentiment is often implicitly expressed via latent semantic relations, patterns and dependencies among words in tweets. In this paper, we propose a novel approach that automatically captures patterns of words of similar contextual semantics and sentiment in tweets. Unlike previous work on sentiment pattern extraction, our proposed approach does not rely on external and fixed sets of syntactical templates/patterns, nor does it require deep analyses of the syntactic structure of sentences in tweets. We evaluate our approach on tweet- and entity-level sentiment analysis tasks by using the extracted semantic patterns as classification features in both tasks. We use 9 Twitter datasets in our evaluation and compare the performance of our patterns against 6 state-of-the-art baselines. Results show that our patterns consistently outperform all other baselines on all datasets by 2.19% at the tweet level and 7.5% at the entity level in average F-measure.
Resumo:
Clinical text understanding (CTU) is of interest to health informatics because critical clinical information, frequently represented as unconstrained text in electronic health records, is extensively used by human experts to guide clinical practice and decision making and to document delivery of care, but is largely unusable by information systems for queries and computations. Recent initiatives advocating translational research call for the generation of technologies that can integrate structured clinical data with unstructured data, provide a unified interface to all data, and contextualize clinical information for reuse in the multidisciplinary and collaborative environment envisioned by the CTSA program. This implies that technologies for the processing and interpretation of clinical text should be evaluated not only in terms of their validity and reliability in their intended environment, but also in light of their interoperability and their ability to support information integration and contextualization in a distributed and dynamic environment. This vision adds a new layer of information representation requirements that needs to be accounted for when conceptualizing the implementation or acquisition of clinical text processing tools and technologies for multidisciplinary research. On the other hand, electronic health records frequently contain unconstrained clinical text with high variability in the use of terms and documentation practices, and without commitment to the grammatical or syntactic structure of the language (e.g. triage notes, physician and nurse notes, chief complaints, etc.). This hinders the performance of natural language processing technologies, which typically rely heavily on the syntax of the language and the grammatical structure of the text.
This document introduces our method for transforming unconstrained clinical text found in electronic health information systems into a formal (computationally understandable) representation that is suitable for querying, integration, contextualization and reuse, and is resilient to the grammatical and syntactic irregularities of clinical text. We present our design rationale, our method, and the results of an evaluation in processing chief complaints and triage notes from 8 different emergency departments in Houston, Texas. Finally, we discuss the significance of our contribution in enabling the use of clinical text in a practical bio-surveillance setting.
Resumo:
The design of interfaces to facilitate user search has become critical for search engines, e-commerce sites, and intranets. This study investigated the use of targeted instructional hints to improve search by measuring their quantitative effects on users' performance and satisfaction. The effects of syntactic, semantic and exemplar search hints on user behavior were evaluated in an empirical investigation using naturalistic scenarios. Combining the three search hint components, each with two levels of intensity, in a factorial design generated eight search engine interfaces. Eighty participants took part in the study and each completed six realistic search tasks. Results revealed that the inclusion of search hints improved user effectiveness, efficiency and confidence when using the search interfaces, but with complex interactions that require specific guidelines for search interface designers. These design guidelines will allow search designers to create more effective interfaces for a variety of search applications.
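The factorial design mentioned in this abstract (three hint components, two intensity levels each) can be enumerated in a few lines. The sketch below is illustrative only: the factor names come from the abstract, while the level labels "low" and "high" and the function name are assumptions, since the abstract does not name the two intensity levels.

```python
from itertools import product

# The three hint components named in the abstract.
FACTORS = ("syntactic", "semantic", "exemplar")
# Hypothetical labels for the two intensity levels (not given in the abstract).
LEVELS = ("low", "high")


def enumerate_interfaces(factors=FACTORS, levels=LEVELS):
    """Return every combination of hint intensities, one dict per interface."""
    return [
        dict(zip(factors, combo))
        for combo in product(levels, repeat=len(factors))
    ]


interfaces = enumerate_interfaces()
# 2 levels for each of 3 factors gives 2 ** 3 = 8 interface conditions.
print(len(interfaces))
```

Each dictionary describes one interface condition, which matches the eight interfaces reported in the study.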
Resumo:
Purpose: Increasing costs of health care, fuelled by demand for high-quality, cost-effective healthcare, have driven hospitals to streamline their patient care delivery systems. One such systematic approach is the adoption of Clinical Pathways (CP) as a tool to increase the quality of healthcare delivery. However, most organizations still rely on paper-based pathway guidelines or specifications, which have limitations in process management and as a result can influence patient safety outcomes. In this paper, we present a method for generating clinical pathways based on organizational semiotics by capturing knowledge from the syntactic, semantic and pragmatic levels to the social level. Design/methodology/approach: The proposed modeling approach to the generation of CPs adopts organizational semiotics and enables the generation of a semantically rich representation of CP knowledge. The Semantic Analysis Method (SAM) is applied to explicitly represent the semantics of the concepts, their relationships and patterns of behavior in terms of an ontology chart. The Norm Analysis Method (NAM) is adopted to identify and formally specify patterns of behavior and the rules that govern the actions identified on the ontology chart. Information collected during semantic and norm analysis is integrated to guide the generation of CPs using best practice represented in BPMN, thus enabling the automation of CPs. Findings: This research confirms the necessity of taking social aspects into consideration in designing information systems and automating CPs. The complexity of healthcare processes can best be tackled by analyzing stakeholders, which we treat as social agents, together with their goals and patterns of action within the agent network. Originality/value: Current modeling methods describe CPs from a structural aspect comprising activities, properties and interrelationships. However, these methods lack a mechanism to describe possible patterns of human behavior and the conditions under which the behavior will occur.
To overcome this weakness, a semiotic approach to the generation of clinical pathways is introduced. The CP generated from SAM together with norms will enrich the knowledge representation of the domain through ontology modeling, which allows the recognition of human responsibilities and obligations and, more importantly, of the ultimate power of decision making in exceptional circumstances.
Resumo:
This paper investigates the discourse manifestations of the grammatical relation direct object with respect to the syntactic, semantic and pragmatic properties that underlie this element. The research adopts the theoretical orientation of North American and Brazilian functionalism, inspired by Givón (1995, 2001), Hopper and Thompson (1980), Chafe (1979), Furtado da Cunha, Oliveira, Martelotta (2003), inter alia. From functionalism, the research takes the principles of iconicity, markedness and informativity, and it analyzes the categories of transitivity, grounding and animacy. The research is also anchored in the prototype model (TAYLOR 1995) and the construction grammar model (GOLDBERG 1996, 2002). Both theoretical orientations share the view that language is a malleable living organism subject to socio-cultural context. Grammar is then the result of created, maintained, and systematized linguistic patterns developed from and used for language use. According to functional and cognitive linguistics, verbs are stored in the speaker's lexicon in their most frequent syntactic-semantic frames. These frames carry information concerning obligatory and optional arguments and the semantic roles these arguments take in the clause. The analysis focuses on the semantic type of the verbs and its relationship with the argument encoded as a direct object, observing the aspectual nature of the verbs. Direct objects are classified according to their morphology (lexical or pronominal noun phrase), semantic role, informational content and animacy. This study discusses pedagogical implications concerning how the grammatical concepts touched on in this paper are treated in school textbooks. The empirical data come from Corpus Discurso & Gramática: a língua falada e escrita na cidade do Natal (FURTADO DA CUNHA, 1998). This corpus is composed of texts in spoken and written modalities.
These modalities are in turn organized according to different types: personal narratives, retold narrative, description of preferred place, procedural place, procedural description and report on argumentation. The sample data totals 40 texts produced by four language consultants of the last graduation date. The paper shows that the same syntactic structures (formed through Subject-Verb-Object) correspond to different semantic-pragmatic structures in relation to specific communicative purposes, whether the verb denotes an event, a process or a state. Argument structures are not arbitrary but are related to experience, that is, to the way humans conceptualize the world and talk about it.
Resumo:
This work analyzes deverbal nominalizations with the suffix -dor in Brazilian Portuguese from the perspective of Cognitive Linguistics, more specifically, Construction Grammar. The aim is to determine the general features of interpretation that characterize this deverbal construction and its use in formal writing. Based on the cognitive assumption that grammatical structure is motivated, explained, and determined by the structure of cognitive patterns, created from our experience in the world, and by the communicative function of language, the -dor deverbal is treated as a polysemic grammatical construction. In the composition of V+dor, the focus is on the root-suffix relation, through a characterization of the syntactic-semantic nature of the verb and the values of the suffix. Among the different values conventionally related to the X-DOR construction, the agentive is considered the prototypical sense. The relation between the other values and the prototype is explained by cognitive abilities and discourse motivations. The deverbal construction X-DOR is also interpreted as a valency noun that, like an action nominal, retains the argument structure of the deriving predicate. The work also aims to demonstrate the textual function of this deverbal construction as a device for information condensing and anaphoric recovery. The data were taken from Veja magazine and the approach is qualitative (explicative), with quantitative support.
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
The Leximancer system is a relatively new method for transforming lexical co-occurrence information from natural language into semantic patterns in an unsupervised manner. It employs two stages of co-occurrence information extraction (semantic and relational), using a different algorithm for each stage. The algorithms used are statistical, but they employ nonlinear dynamics and machine learning. This article is an attempt to validate the output of Leximancer, using a set of evaluation criteria taken from content analysis that are appropriate for knowledge discovery tasks.
Resumo:
This paper argues, on the basis of a corpus-based study of the meanings of can and may in contemporary British, American and Australian English, that a polysemy-based analysis is applicable to both modals. With may, epistemic possibility is the dominant meaning, but the dynamic and deontic possibility meanings still account for over 16.5% of tokens. By contrast the meanings of can, apart from a small percentage (1.1%) of epistemic cases, are united through the concept of potentiality. Nevertheless there are signs that the epistemic possibility meaning is becoming established, as it sheds its syntactic/semantic restriction to non-affirmative contexts.
Resumo:
This PhD project aims to study paraphrasing, initially understood as the different ways in which the same content is expressed linguistically. We will go into that concept in depth trying to define and delimit its scope more accurately. In that sense, we also aim to discover which kind of structures and phenomena it covers. Although there exist some paraphrasing typologies, the great majority of them only apply to English, and focus on lexical and syntactic transformations. Our intention is to go further into this subject and propose a paraphrasing typology for Spanish and Catalan combining lexical, syntactic, semantic and pragmatic knowledge. We apply a bottom-up methodology trying to collect evidence of this phenomenon from the data. For this purpose, we are initially using the Spanish Wikipedia as our corpus. The internal structure of this encyclopedia makes it a good resource for extracting paraphrasing examples for our investigation. This empirical approach will be complemented with the use of linguistic knowledge, and by comparing and contrasting our results to previously proposed paraphrasing typologies in order to enlarge the possible paraphrasing forms found in our corpus. The fact that the same content can be expressed in many different ways presents a major challenge for Natural Language Processing (NLP) applications. Thus, research on paraphrasing has recently been attracting increasing attention in the fields of NLP and Computational Linguistics. The results obtained in this investigation would be of great interest in many of these applications.
Resumo:
This work investigates the syntactic, semantic, and pragmatic properties of nominal Split Topicalization (ST) constructions in Standard and non-Standard German. The topic phrase denotes a property, and the MF phrase either modifies this property or picks out a specific entity. Semantically, the topic phrase will be analysed as a property-denoting expression which restricts the denotation of the verbal predicate, while the MF phrase is composed either via specify or restrict (cf. Chung and Ladusaw, 2003). Syntactically, the base position of the topic phrase is the (incorporating) verb, and the MF phrase is generated independently as the complement of the verb containing an empty pronoun. Since predicates introduce abstract discourse referents, the topic phrase can be resumed via "pro" in the MF phrase.
Resumo:
L'objectiu d'aquest article és analitzar els principals criteris que les guies d'estil recomanen per visibilitzar les dones (o per fer un ús no sexista del llenguatge) des de dos punts de vista: el sintacticosemàntic i el discursiu. Des del punt de vista sintacticosemàntic, s'estudien bàsicament els fenòmens relacionats amb la coordinació, la concordança i la repetició o elisió d'elements (per exemple, especificadors del nom), i la manera com les diferents opcions afecten el significat oracional. Des del punt de vista discursiu, s'analitzen els fenòmens relacionats amb la coreferència; és a dir, la relació entre les diferents maneres d'expressar un mateix referent per mitjà d'elements nominals al llarg del text, i l'efecte que provoquen en el text en conjunt. Amb aquest objectiu, l'estudi analitza des d'un punt de vista qualitatiu les dades proporcionades per un corpus de textos procedents de tres àmbits (polític, administratiu i educatiu) en què s'apliquen sovint aquesta mena de criteris. Paraules clau: català, llenguatge no sexista, visibilització lingüística de les dones, sintaxi, cohesió, coreferència, llenguatge androcèntric, estil. The goal of this article is to analyse the main criteria recommended by style guides for making women more visible or, in other words, for using language in a non-sexist way. I will concentrate on two main aspects: the syntactic-semantic and the discursive. From a syntactic-semantic point of view, the main elements studied are those related to coordination, agreement and the repetition or omission of elements (for instance, noun specifiers), as well as the way the different options chosen affect the meaning of the sentence. From a discursive and stylistic point of view, the elements analysed are those related to coreference, that is, the relationship between the different ways of expressing the same referent through different elements in the text, and the effect they produce in the text as a whole.
To this end, the study analyses from a qualitative point of view the data from a corpus of texts from three different areas (politics, administration and education), in which this kind of criteria is often applied. Keywords: Catalan, non-sexist language, female linguistic visibility, syntax, cohesion, co-reference, androcentric language, style.
Resumo:
Author identification is the problem of identifying, from a given set of authors, the author of an anonymous text or of a text whose authorship is in doubt. The works of different authors are strongly distinguished by quantifiable features of the text. This paper deals with attempts to identify the most likely author of a text in Malayalam from a list of authors. Malayalam is a Dravidian language of an agglutinative nature, and few successful tools have been developed to extract syntactic and semantic features of texts in this language. We have carried out a detailed study of the various stylometric features that can be used to form an author's profile and have found that the frequencies of word collocations can be used to clearly distinguish an author in a highly inflectional language such as Malayalam. In our work we try to extract the word-level and character-level features present in the text to characterize the style of an author. Our first step was to create a profile for each of the candidate authors whose texts were available to us, first from word n-gram frequencies and then from variable-length character n-gram frequencies. The profiles thus formed for the set of authors under consideration were then compared with the features extracted from the anonymous text to suggest the most likely author.
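The profile-based procedure summarized in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the n-gram length range (2 to 4), the use of relative frequencies, the cosine similarity measure, and all function names are choices made here for illustration, since the abstract does not specify how profiles are compared. The toy strings are placeholders for real author texts.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n_min=2, n_max=4):
    """Yield every character n-gram of length n_min..n_max (assumed range)."""
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            yield text[i:i + n]


def profile(text, n_min=2, n_max=4):
    """Build a relative-frequency profile of variable-length character n-grams."""
    counts = Counter(char_ngrams(text, n_min, n_max))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def cosine(p, q):
    """Cosine similarity between two sparse profiles (an assumed measure)."""
    dot = sum(v * q.get(gram, 0.0) for gram, v in p.items())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def most_likely_author(anonymous_text, known_texts):
    """Return the candidate whose profile is most similar to the anonymous text."""
    target = profile(anonymous_text)
    return max(known_texts, key=lambda a: cosine(target, profile(known_texts[a])))


# Toy placeholder data; real use would load each candidate author's texts.
known = {"author_a": "aaaa bbbb aaaa bbbb", "author_b": "xyz xyz xyz xyz"}
print(most_likely_author("aaaa bbbb", known))
```

Word n-gram profiles could be built the same way by iterating over token sequences instead of characters.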