874 results for natural language
Abstract:
Faculty of Mathematics and Computer Science: Department of Computational Linguistics and Artificial Intelligence
Abstract:
BACKGROUND: Phenotypic differences among species have long been systematically itemized and described by biologists in the process of investigating phylogenetic relationships and trait evolution. Traditionally, these descriptions have been expressed in natural language within the context of individual journal publications or monographs. As such, this rich store of phenotype data has been largely unavailable for statistical and computational comparisons across studies or integration with other biological knowledge. METHODOLOGY/PRINCIPAL FINDINGS: Here we describe Phenex, a platform-independent desktop application designed to facilitate efficient and consistent annotation of phenotypic similarities and differences using Entity-Quality syntax, drawing on terms from community ontologies for anatomical entities, phenotypic qualities, and taxonomic names. Phenex can be configured to load only those ontologies pertinent to a taxonomic group of interest. The graphical user interface was optimized for evolutionary biologists accustomed to working with lists of taxa, characters, character states, and character-by-taxon matrices. CONCLUSIONS/SIGNIFICANCE: Annotation of phenotypic data using ontologies and globally unique taxonomic identifiers will allow biologists to integrate phenotypic data from different organisms and studies, leveraging decades of work in systematics and comparative morphology.
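The Entity-Quality annotation style described above can be sketched as a simple data structure; the class name, fields, and the taxon identifier below are illustrative assumptions for this sketch, not Phenex's actual schema:

```python
from dataclasses import dataclass

@dataclass
class EQAnnotation:
    """A minimal Entity-Quality phenotype annotation (hypothetical schema)."""
    taxon_id: str  # globally unique taxonomic identifier (placeholder value below)
    entity: str    # anatomical entity term drawn from a community ontology
    quality: str   # phenotypic quality term

    def as_tuple(self):
        # A tuple form convenient for building character-by-taxon matrices
        return (self.taxon_id, self.entity, self.quality)

# Example: annotating "dorsal fin absent" for a placeholder taxon
ann = EQAnnotation(taxon_id="TAXON:0001", entity="dorsal fin", quality="absent")
print(ann.as_tuple())
```

Annotations in this form are directly comparable across studies, which is the integration benefit the abstract describes.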
Abstract:
The effects of natural language comments, meaningful variable names, and structure on the comprehensibility of Z specifications are investigated through a designed experiment conducted with a range of undergraduate and post-graduate student subjects. The times taken on three assessment questions are analysed and related to the abilities of the students as indicated by their total score, with the result that stronger students need less time than weaker students to complete the assessment. Individual question scores, and total score, are then analysed, and the influence of comments, naming, structure, and the level of the students' class is determined. In the whole experimental group, only meaningful naming significantly enhances comprehension. In contrast, for those obtaining the best score of 3/3 the only significant factor is commenting. Finally, the subjects' ratings of the five specifications used in the study in terms of their perceived comprehensibility have been analysed. Comments, naming, and structure are again found to be of importance in the group when analysed as a whole, but in the sub-group of best-performing subjects only the comments had an effect on perceived comprehensibility.
Abstract:
Dascalu, M., Stavarache, L.L., Dessus, P., Trausan-Matu, S., McNamara, D.S., & Bianco, M. (2015). ReaderBench: An Integrated Cohesion-Centered Framework. In G. Conole, T. Klobucar, C. Rensing, J. Konert & É. Lavoué (Eds.), 10th European Conf. on Technology Enhanced Learning (pp. 505–508). Toledo, Spain: Springer.
Abstract:
Nistor, N., Dascalu, M., Stavarache, L.L., Tarnai, C., & Trausan-Matu, S. (2015). Predicting Newcomer Integration in Online Knowledge Communities by Automated Dialog Analysis. In Y. Li, M. Chang, M. Kravcik, E. Popescu, R. Huang, Kinshuk & N.-S. Chen (Eds.), State-of-the-Art and Future Directions of Smart Learning (Vol. Lecture Notes in Educational Technology, pp. 13–17). Singapore: Springer.
Abstract:
In previous papers, we have presented a logic-based framework based on fusion rules for merging structured news reports. Structured news reports are XML documents, where the text entries are restricted to individual words or simple phrases, such as names and domain-specific terminology, and numbers and units. We assume structured news reports do not require natural language processing. Fusion rules are a form of scripting language that define how structured news reports should be merged. The antecedent of a fusion rule is a call to investigate the information in the structured news reports and the background knowledge, and the consequent of a fusion rule is a formula specifying an action to be undertaken to form a merged report. It is expected that a set of fusion rules is defined for any given application. In this paper we extend the approach to handling probability values, degrees of belief, or necessity measures associated with text entries in the news reports. We present the formal definition for each of these types of uncertainty and explain how they can be handled using fusion rules. We also discuss the methods of detecting inconsistencies among sources.
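The antecedent/consequent structure of a fusion rule can be sketched as follows, using plain dicts as stand-ins for the XML reports; the field names and the conflict-handling policy are invented for illustration, not taken from the paper's rule language:

```python
def same_value(field):
    """Antecedent: true when all reports agree on `field` and none omits it."""
    def check(reports):
        values = {r.get(field) for r in reports}
        return len(values) == 1 and None not in values
    return check

def fuse(reports, rules):
    """Apply each (antecedent, consequent) pair; consequents build the merged report."""
    merged = {}
    for antecedent, consequent in rules:
        if antecedent(reports):
            merged.update(consequent(reports))
    return merged

reports = [{"city": "London", "temp": "18C"},
           {"city": "London", "temp": "17C"}]
rules = [
    # Sources agree on the city, so copy it through unchanged
    (same_value("city"), lambda rs: {"city": rs[0]["city"]}),
    # Sources conflict on temperature: keep both values as alternatives
    (lambda rs: not same_value("temp")(rs),
     lambda rs: {"temp": sorted({r["temp"] for r in rs})}),
]
print(fuse(reports, rules))  # {'city': 'London', 'temp': ['17C', '18C']}
```

The real framework operates on XML and a richer rule language, but the antecedent-queries-sources, consequent-emits-merged-content shape is the same.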
Abstract:
Approximants that can be considered weaker versions of voiced fricatives (termed here ‘frictionless continuants’) are poorly served by the IPA in terms of symbolization as compared to semi-vowel approximants. In this paper we survey the central approximants and the symbols and diacritics used to transcribe them; we focus on evidence for the use of non-rhotic frictionless continuants in both natural language (by which we mean non-clinical varieties) and disordered speech; and we suggest some possible unitary symbols for those that currently require the use of a hard-to-read lowering diacritic beneath the symbol for the corresponding voiced fricative.
Abstract:
Purpose: The purpose of this paper is to engage a different notion of feminism in accounting by addressing the issues of feminism, balance, and integration as a means of understanding differently the world for which one accounts. The ideas are communicated by the sharing of experiences through myth and storytelling.
Design/methodology/approach: An alternative lens for understanding the giving of accounts is proposed, drawing on earlier feminist accounting literature as well as storytelling and myth.
Findings: Including the subjective and intersubjective approaches to experiencing and understanding the world recommends an approach whereby both the feminine-intuitive and the masculine-rational processes are integrated in constructing decision models and accounts.
Research limitations/implications: Through an expanded view of the values that can be included in reporting or recounting, a different model is seen, and different decisions are enabled. The primary limitation is having to use words to convey one’s subjective and intersubjective understandings. The written medium is not the most natural language for such an undertaking.
Practical implications: By enabling the inclusion of more feminine values, a way is opened to engage more holistically with the society in which decisions are embedded.
Originality/value: Drawing on the storytelling tradition, a holistic model is suggested that can lead to emergence of a more balanced societal reporting.
Keywords: Feminism, Integration, Accounting, Storytelling, Myths
Paper type: Research paper
Abstract:
This paper describes a data model for content representation of temporal media in an IP-based sensor network. The model is formed by introducing the idea of semantic role from linguistics into the underlying concepts of formal event representation, with the aim of developing a common event model. The architecture of a prototype system for a multi-camera surveillance system, based on the proposed model, is described. The important aspects of the proposed model are its expressiveness, its ability to model content of temporal media, and its suitability for use with a natural language interface. It also provides a platform for temporal information fusion, as well as for organizing sensor annotations with the help of ontologies.
Abstract:
Decision making is an important element throughout the life-cycle of large-scale projects. Decisions are critical as they have a direct impact upon the success/outcome of a project and are affected by many factors including the certainty and precision of information. In this paper we present an evidential reasoning framework which applies Dempster-Shafer Theory and its variant Dezert-Smarandache Theory to aid decision makers in making decisions where the knowledge available may be imprecise, conflicting and uncertain. This conceptual framework is novel as natural language based information extraction techniques are utilized in the extraction and estimation of beliefs from diverse textual information sources, rather than assuming these estimations as already given. Furthermore we describe an algorithm to define a set of maximal consistent subsets before fusion occurs in the reasoning framework. This is important as inconsistencies between subsets may produce results which are incorrect/adverse in the decision making process. The proposed framework can be applied to problems involving material selection and a Use Case based in the Engineering domain is presented to illustrate the approach.
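Dempster's rule of combination, which underlies the Dempster-Shafer fusion step described above, can be sketched as follows; the material-selection mass assignments are invented for illustration and are not from the paper's use case:

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Focal elements are frozensets; mass on empty intersections is treated
    as conflict and renormalized away.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    k = 1.0 - conflict  # normalization constant
    return {s: w / k for s, w in combined.items()}

# Two sources assign belief over candidate materials {steel, aluminium}
m1 = {frozenset({"steel"}): 0.6, frozenset({"steel", "aluminium"}): 0.4}
m2 = {frozenset({"steel"}): 0.5, frozenset({"aluminium"}): 0.3,
      frozenset({"steel", "aluminium"}): 0.2}
result = combine(m1, m2)
```

Note that classical Dempster combination assumes the sources are consistent enough to fuse; the paper's maximal-consistent-subsets step addresses exactly the case where that assumption fails.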
Abstract:
Southern Tiwa (Tanoan) exhibits agreement with up to three arguments (ergative, absolutive, dative). This agreement is subject to certain restrictions resembling the Person-Case Constraint paradigm (Bonet 1991). Moreover, there is a correlation between agreement restrictions and conditions on (the obviation of) noun-incorporation in Southern Tiwa, as explicitly and elegantly captured by Rosen (1990) in terms of a heterogeneous feature hierarchy and rules of association. We attempt to recast Rosen’s central insights in terms of Anagnostopoulou’s probe-sharing model of Person-Case Constraint effects (Anagnostopoulou 2003, 2006), to show that the full range of Southern Tiwa agreement and (non-)incorporation restrictions can be given a single, unified analysis within the probe-goal-Agree framework of Chomsky (2001). In particular, we argue that Southern Tiwa’s triple-agreement system is characterized by (a) an independent class probe located on the heads T and v, and (b) a rule that allows this class probe to be deleted in the context of local-person T-agreement. The various restrictions on agreement and non-incorporation then reduce to a single source: failure of class-valuation with DP (as opposed to NP) arguments.
Abstract:
We present three natural language marking strategies based on fast and reliable shallow parsing techniques, and on widely available lexical resources: lexical substitution, adjective conjunction swaps, and relativiser switching. We test these techniques on a random sample of the British National Corpus. Individual candidate marks are checked for goodness of structural and semantic fit, using both lexical resources, and the web as a corpus. A representative sample of marks is given to 25 human judges to evaluate for acceptability and preservation of meaning. This establishes a correlation between corpus based felicity measures and perceived quality, and makes qualified predictions. Grammatical acceptability correlates with our automatic measure strongly (Pearson's r = 0.795, p = 0.001), allowing us to account for about two thirds of variability in human judgements. A moderate but statistically insignificant (Pearson's r = 0.422, p = 0.356) correlation is found with judgements of meaning preservation, indicating that the contextual window of five content words used for our automatic measure may need to be extended.
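The Pearson correlations reported above are computed mechanically from paired scores; a minimal sketch, with made-up values standing in for the study's automatic felicity measures and human judgements:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between paired samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative (invented) paired scores: automatic felicity vs. mean human rating
auto = [0.9, 0.4, 0.7, 0.2, 0.8]
human = [4.5, 2.0, 3.8, 1.5, 4.0]
r = pearson_r(auto, human)
# r**2 is the share of variance in human judgements explained by the
# automatic measure, which is how the "about two thirds" figure arises
# from r = 0.795 in the abstract (0.795**2 is roughly 0.63).
```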
Abstract:
In most previous research on distributional semantics, Vector Space Models (VSMs) of words are built either from topical information (e.g., documents in which a word is present), or from syntactic/semantic types of words (e.g., dependency parse links of a word in sentences), but not both. In this paper, we explore the utility of combining these two representations to build VSM for the task of semantic composition of adjective-noun phrases. Through extensive experiments on benchmark datasets, we find that even though a type-based VSM is effective for semantic composition, it is often outperformed by a VSM built using a combination of topic- and type-based statistics. We also introduce a new evaluation task wherein we predict the composed vector representation of a phrase from the brain activity of a human subject reading that phrase. We exploit a large syntactically parsed corpus of 16 billion tokens to build our VSMs, with vectors for both phrases and words, and make them publicly available.
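Semantic composition of adjective-noun phrase vectors is commonly done with elementwise functions; a minimal sketch of additive and multiplicative composition, using toy vectors rather than anything drawn from the paper's 16-billion-token corpus:

```python
from math import sqrt

def add_compose(u, v):
    """Additive composition: elementwise sum of the two word vectors."""
    return [a + b for a, b in zip(u, v)]

def mult_compose(u, v):
    """Multiplicative composition: elementwise product of the two word vectors."""
    return [a * b for a, b in zip(u, v)]

def cosine(u, v):
    """Cosine similarity, the usual metric for comparing composed vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Toy vectors for "red" and "car" (invented numbers, not corpus statistics)
red = [0.8, 0.1, 0.3]
car = [0.2, 0.9, 0.4]
phrase = add_compose(red, car)
```

A composed vector like `phrase` can then be compared, via `cosine`, against an observed phrase vector or a decoded brain-activity vector, which is the shape of the evaluation tasks the abstract describes.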
Abstract:
This paper contributes a new approach for developing UML software designs from Natural Language (NL), making use of a meta-domain oriented ontology, well established software design principles and Natural Language Processing (NLP) tools. In the approach described here, banks of grammatical rules are used to assign event flows from essential use cases. A domain specific ontology is also constructed, permitting semantic mapping between the NL input and the modeled domain. Rules based on the widely-used General Responsibility Assignment Software Principles (GRASP) are then applied to derive behavioral models.
Abstract:
Identifying responsibility for classes in the object-oriented software design phase is a crucial task. This paper proposes an approach for producing high-quality and robust behavioural diagrams (e.g. Sequence Diagrams) through Class Responsibility Assignment (CRA). GRASP, or General Responsibility Assignment Software Pattern (or Principle), was used to direct the CRA process when deriving behavioural diagrams. A set of tools to support CRA was developed to provide designers and developers with a cognitive toolkit that can be used when analysing and designing object-oriented software. The tool developed is called Use Case Specification to Sequence Diagrams (UC2SD). UC2SD uses a new approach for developing Unified Modelling Language (UML) software designs from Natural Language, making use of a meta-domain oriented ontology, well established software design principles and established Natural Language Processing (NLP) tools. UC2SD generates well-formed UML sequence diagrams as output.