Digital Library

19 results for Wilks

Words and intelligence I: selected papers by Yorick Wilks

Relevance: 20.00%

Abstract:

Yorick Wilks is a central figure in the fields of Natural Language Processing and Artificial Intelligence. His influence extends to many areas and includes contributions to Machine Translation, word sense disambiguation, dialogue modeling and Information Extraction. This book celebrates the work of Yorick Wilks in the form of a selection of his papers, which are intended to reflect the range and depth of his work. The volume accompanies a Festschrift which celebrates his contribution to the fields of Computational Linguistics and Artificial Intelligence. The papers include early work carried out at Cambridge University, descriptions of groundbreaking work on Machine Translation and Preference Semantics, as well as more recent works on belief modeling and computational semantics. The selected papers reflect Yorick’s contribution to both practical and theoretical aspects of automatic language processing.

Words and intelligence II: essays in honor of Yorick Wilks

Relevance: 20.00%

Abstract:

Yorick Wilks is a central figure in the fields of Natural Language Processing and Artificial Intelligence. His influence extends to many areas of these fields and includes contributions to Machine Translation, word sense disambiguation, dialogue modeling and Information Extraction. This book celebrates the work of Yorick Wilks from the perspective of his peers. It consists of original chapters, each of which analyses an aspect of his work and links it to current thinking in that area. His work has spanned over four decades but is shown to be pertinent to recent developments in language processing such as the Semantic Web. This volume forms a two-part set together with Words and Intelligence I, Selected Works by Yorick Wilks, by the same editors.

Using HLT for acquiring, retrieving and publishing knowledge in AKT: position paper

Relevance: 10.00%

Abstract:

AKT is a major research project applying a variety of technologies to knowledge management. Knowledge is a dynamic, ubiquitous resource, which is to be found equally in an expert's head, under terabytes of data, or explicitly stated in manuals. AKT will extend knowledge management technologies to exploit the potential of the semantic web, covering the use of knowledge over its entire lifecycle, from acquisition to maintenance and deletion. In this paper we discuss how HLT will be used in AKT and how the use of HLT will affect different areas of KM, such as knowledge acquisition, retrieval and publishing.

Data driven ontology evaluation

Relevance: 10.00%

Abstract:

The evaluation of ontologies is vital for the growth of the Semantic Web. We consider a number of problems in evaluating a knowledge artifact like an ontology. We propose in this paper that one approach to ontology evaluation should be corpus or data driven. A corpus is the most accessible form of knowledge and its use allows a measure to be derived of the ‘fit’ between an ontology and a domain of knowledge. We consider a number of methods for measuring this ‘fit’ and propose a measure to evaluate structural fit, and a probabilistic approach to identifying the best ontology.
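As a rough illustration of what a corpus-driven "fit" might look like, the Python sketch below scores each candidate ontology by the share of corpus token occurrences that its labels cover, then picks the highest-scoring one. This lexical-coverage score is an assumption made purely for illustration; it is not the structural measure or the probabilistic approach that the paper proposes. The function name `lexical_fit` and the toy data are hypothetical.

```python
import re
from collections import Counter


def lexical_fit(ontology_labels, corpus_text):
    """Share of corpus token occurrences that match an ontology label.

    Hypothetical stand-in for a 'fit' score; handles single-word labels only.
    """
    tokens = re.findall(r"[a-z]+", corpus_text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    labels = {label.lower() for label in ontology_labels}
    covered = sum(n for term, n in counts.items() if term in labels)
    return covered / len(tokens)


# Toy corpus and two toy candidate ontologies (label lists only).
corpus = "The cell nucleus stores DNA. The cell membrane protects the cell."
candidates = {
    "biology": ["cell", "nucleus", "membrane", "dna"],
    "finance": ["bank", "loan", "interest"],
}

# Choose the candidate whose labels cover the most corpus occurrences.
best = max(candidates, key=lambda name: lexical_fit(candidates[name], corpus))
```

A score of this kind only reflects vocabulary overlap; it says nothing about whether the hierarchy or relations in the ontology match the domain.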

Background and foreground knowledge in dynamic ontology construction

Relevance: 10.00%

Abstract:

Ontologies have become a key component in the Semantic Web and Knowledge Management. One accepted goal is to construct ontologies from a domain-specific set of texts. An ontology reflects the background knowledge used in writing and reading a text. However, a text is an act of knowledge maintenance, in that it reinforces the background assumptions, alters links and associations in the ontology, and adds new concepts. This means that background knowledge is rarely expressed in a machine-interpretable manner. When it is, it is usually at the conceptual boundaries of the domain, e.g. in textbooks or when ideas are borrowed into other domains. We argue that a partial solution to this lies in searching external resources such as specialized glossaries and the internet. We show that a random selection of concept pairs from the Gene Ontology does not occur in a relevant corpus of texts from the journal Nature. In contrast, a significant proportion can be found on the internet. Thus, we conclude that sources external to the domain corpus are necessary for the automatic construction of ontologies.

User-centred ontology learning for knowledge management

Relevance: 10.00%

Abstract:

Automatic ontology building is a vital issue in many fields where ontologies are currently built manually. This paper presents a user-centred methodology for ontology construction based on the use of Machine Learning and Natural Language Processing. In our approach, the user selects a corpus of texts and sketches a preliminary ontology (or selects an existing one) for a domain, with a preliminary vocabulary associated with the elements in the ontology (lexicalisations). Examples of sentences involving such lexicalisations (e.g. of the ISA relation) in the corpus are automatically retrieved by the system. Retrieved examples are validated by the user and used by an adaptive Information Extraction system to generate patterns that discover other lexicalisations of the same objects in the ontology, possibly identifying new concepts or relations. New instances are added to the existing ontology or used to tune it. This process is repeated until a satisfactory ontology is obtained. The methodology largely automates the ontology construction process, and the output is an ontology with an associated trained learner to be used for further ontology modifications.
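The retrieve-and-validate step of such a loop can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the fixed regular expression `ISA_PATTERN` stands in for a seed lexicalisation of the ISA relation, the `validate` callback stands in for the user, and the adaptive Information Extraction step that learns new patterns is left out entirely. All names are hypothetical and not taken from the paper.

```python
import re

# Hypothetical seed lexicalisation of the ISA relation: "<X> is a/an <Y>".
ISA_PATTERN = re.compile(r"\b(\w+) is an? (\w+)\b", re.IGNORECASE)


def retrieve_examples(sentences, pattern=ISA_PATTERN):
    """Return (sentence, child, parent) candidates that match the pattern."""
    found = []
    for sentence in sentences:
        for child, parent in pattern.findall(sentence):
            found.append((sentence, child.lower(), parent.lower()))
    return found


def learning_round(ontology, sentences, validate):
    """One round: retrieve candidates and keep those the user accepts."""
    added = 0
    for _sentence, child, parent in retrieve_examples(sentences):
        if (child, parent) not in ontology and validate(child, parent):
            ontology.add((child, parent))
            added += 1
    return added


# Toy corpus; the lambda plays the role of the validating user.
sentences = ["A poodle is a dog.", "Paris is a city.", "This is a test."]
ontology = set()
added = learning_round(ontology, sentences, lambda child, parent: child != "this")
```

In a fuller version, rounds would be repeated with newly learned patterns until a round adds nothing the user accepts.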

Knowledge acquisition for knowledge management: position paper

Relevance: 10.00%

Abstract:

In this paper, we propose a set of techniques to largely automate the process of KA, by using technologies based on Information Extraction (IE), Information Retrieval and Natural Language Processing. We aim to reduce all the impeding factors mentioned above and thereby contribute to the wider utility of the knowledge management tools. In particular, we intend to reduce the introspection of knowledge engineers or the extended elicitations of knowledge from experts by extensive textual analysis using a variety of methods and tools, as texts are widely available and, we believe, hold most of an organization's memory.

The ontology: Chimaera or Pegasus

Relevance: 10.00%

Abstract:

In the context of the needs of the Semantic Web and Knowledge Management, we consider what the requirements of ontologies are. The ontology as an artifact of knowledge representation is in danger of becoming a Chimera. We present a series of facts concerning the foundations on which automated ontology construction must build. We discuss a number of different functions that an ontology seeks to fulfill, and also a wish list of ideal functions. Our objective is to stimulate discussion as to the real requirements of ontology engineering, and we take the view that only a selective and restricted set of requirements will enable the beast to fly.

Dynamic iterative ontology learning

Relevance: 10.00%

Abstract:

The fundamental failure of current approaches to ontology learning is to view it as a single pipeline with one or more specific inputs and a single static output. In this paper, we present a novel approach to ontology learning which takes an iterative view of knowledge acquisition for ontologies. Our approach is founded on three open-ended resources: a set of texts, a set of learning patterns and a set of ontological triples, and the system seeks to maintain these in equilibrium. As events occur which disturb this equilibrium, actions are triggered to re-establish a balance between the resources. We present a gold-standard-based evaluation of the final output of the system, the intermediate output showing the iterative process, and a comparison of performance using different seed input. The results are comparable to existing performance in the literature.

Ontologies, taxonomies, thesauri: learning from texts

Relevance: 10.00%

Abstract:

The use of ontologies as representations of knowledge is widespread but their construction, until recently, has been entirely manual. We argue in this paper for the use of text corpora and automated natural language processing methods for the construction of ontologies. We delineate the challenges and present criteria for the selection of appropriate methods. We distinguish three major steps in ontology building: associating terms, constructing hierarchies and labelling relations. A number of methods are presented for these purposes, but we conclude that the issue of data sparsity is still a major challenge. We argue for the use of resources external to the domain-specific corpus.

An incremental tri-partite approach to ontology learning

Relevance: 10.00%

Abstract:

In this paper we present a new approach to ontology learning. Its basis lies in a dynamic and iterative view of knowledge acquisition for ontologies. The Abraxas approach is founded on three resources: a set of texts, a set of learning patterns and a set of ontological triples, each of which must remain in equilibrium. As events occur which disturb this equilibrium, various actions are triggered to re-establish a balance between the resources. Such events include the acquisition of a further text from external resources such as the Web or the addition of ontological triples to the ontology. We develop the concept of a knowledge gap between the coverage of an ontology and the corpus of texts as a measure triggering actions. We present an overview of the algorithm and its functionalities.
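The idea of a knowledge gap that triggers rebalancing actions can be illustrated with a short Python sketch. It rests on simplifying assumptions that are not taken from the paper: the gap is defined here as the fraction of corpus terms that no triple mentions, the only action modelled is adding a triple, and the learning-pattern resource is omitted. The names `knowledge_gap`, `rebalance` and `propose_triple` are hypothetical.

```python
def covered_terms(triples):
    """All terms appearing as subject or object of a triple."""
    return {term for subj, _rel, obj in triples for term in (subj, obj)}


def knowledge_gap(corpus_terms, triples):
    """Fraction of corpus terms that no ontological triple mentions."""
    terms = set(corpus_terms)
    if not terms:
        return 0.0
    return len(terms - covered_terms(triples)) / len(terms)


def rebalance(corpus_terms, triples, propose_triple, threshold=0.2, max_steps=100):
    """Add triples until the gap falls to the threshold or steps run out."""
    triples = list(triples)
    for _ in range(max_steps):
        if knowledge_gap(corpus_terms, triples) <= threshold:
            break
        missing = sorted(set(corpus_terms) - covered_terms(triples))
        # Assumed action: ask a (hypothetical) learner for a triple
        # covering the first uncovered term.
        triples.append(propose_triple(missing[0]))
    return triples


# Toy data: four corpus terms, one seed triple, a trivial proposer.
corpus_terms = {"dog", "cat", "animal", "poodle"}
seed = [("dog", "ISA", "animal")]
result = rebalance(corpus_terms, seed, lambda term: (term, "ISA", "animal"))
```

In this toy run the gap starts at 0.5 and the loop stops once every corpus term appears in some triple. Acquiring a new text would enlarge `corpus_terms` and could reopen the gap, which is the kind of disturbing event the abstract describes.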

Image annotation with Photocopain

Relevance: 10.00%

Abstract:

Photo annotation is a resource-intensive task, yet is increasingly essential as image archives and personal photo collections grow in size. There is an inherent conflict in the process of describing and archiving personal experiences, because casual users are generally unwilling to expend large amounts of effort on creating the annotations which are required to organise their collections so that they can make best use of them. This paper describes the Photocopain system, a semi-automatic image annotation system which combines information about the context in which a photograph was captured with information from other readily available sources in order to generate outline annotations for that photograph that the user may further extend or amend.

Dialogue, speech and images: the companions project data set

Relevance: 10.00%

Abstract:

This paper describes part of the corpus collection efforts underway in the EC-funded Companions project. The Companions project is collecting substantial quantities of dialogue, a large part of which focuses on reminiscing about photographs. The texts are in English and Czech. We describe the context and objectives for which this dialogue corpus is being collected and the methodology being used, and make observations on the resulting data. The corpora will be made available to the wider research community through the Companions Project web site.

Natural language processing as a foundation of the semantic Web

Relevance: 10.00%

Abstract:

The main argument of this paper is that Natural Language Processing (NLP) does, and will continue to, underlie the Semantic Web (SW), including its initial construction from unstructured sources like the World Wide Web (WWW), whether its advocates realise this or not. Chiefly, we argue, such NLP activity is the only way up to a defensible notion of meaning at conceptual levels (in the original SW diagram) based on lower level empirical computations over usage. Our aim is definitely not to claim logic-bad, NLP-good in any simple-minded way, but to argue that the SW will be a fascinating interaction of these two methodologies, again like the WWW (which has been basically a field for statistical NLP research) but with deeper content. Only NLP technologies (and chiefly information extraction) will be able to provide the requisite RDF knowledge stores for the SW from existing unstructured text databases in the WWW, and in the vast quantities needed. There is no alternative at this point, since a wholly or mostly hand-crafted SW is also unthinkable, as is a SW built from scratch and without reference to the WWW. We also assume that, whatever the limitations on current SW representational power we have drawn attention to here, the SW will continue to grow in a distributed manner so as to serve the needs of scientists, even if it is not perfect. The WWW has already shown how an imperfect artefact can become indispensable.

Transcriptome analysis of a respiratory Saccharomyces cerevisiae strain suggests the expression of its phenotype is glucose insensitive and predominantly controlled by Hap4, Cat8 and Mig1

Relevance: 10.00%

Abstract:

BACKGROUND: We previously described the first respiratory Saccharomyces cerevisiae strain, KOY.TM6*P, by integrating the gene encoding a chimeric hexose transporter, Tm6*, into the genome of an hxt null yeast. Subsequently we transferred this respiratory phenotype in the presence of up to 50 g/L glucose to a yeast strain, V5 hxt1-7Delta, in which only HXT1-7 had been deleted. In this study, we compared the transcriptome of the resultant strain, V5.TM6*P, with that of its wild-type parent, V5, at different glucose concentrations. RESULTS: cDNA array analyses revealed that alterations in gene expression that occur when transitioning from a respiro-fermentative (V5) to a respiratory (V5.TM6*P) strain, are very similar to those in cells undergoing a diauxic shift. We also undertook an analysis of transcription factor binding sites in our dataset by examining previously-published biological data for Hap4 (in complex with Hap2, 3, 5), Cat8 and Mig1, and used this in combination with verified binding consensus sequences to identify genes likely to be regulated by one or more of these. Of the induced genes in our dataset, 77% had binding sites for the Hap complex, with 72% having at least two. In addition, 13% were found to have a binding site for Cat8 and 21% had a binding site for Mig1. Unexpectedly, both the up- and down-regulation of many of the genes in our dataset had a clear glucose dependence in the parent V5 strain that was not present in V5.TM6*P. This indicates that the relief of glucose repression is already operable at much higher glucose concentrations than is widely accepted and suggests that glucose sensing might occur inside the cell. CONCLUSION: Our dataset gives a remarkably complete view of the involvement of genes in the TCA cycle, glyoxylate cycle and respiratory chain in the expression of the phenotype of V5.TM6*P. Furthermore, 88% of the transcriptional response of the induced genes in our dataset can be related to the potential activities of just three proteins: Hap4, Cat8 and Mig1. Overall, our data support genetic remodelling in V5.TM6*P consistent with a respiratory metabolism which is insensitive to external glucose concentrations.
