880 resultados para Feature ontology
Resumo:
This paper proposes a novel framework of incorporating protein-protein interactions (PPI) ontology knowledge into PPI extraction from biomedical literature in order to address the emerging challenges of deep natural language understanding. It is built upon the existing work on relation extraction using the Hidden Vector State (HVS) model. The HVS model belongs to the category of statistical learning methods. It can be trained directly from un-annotated data in a constrained way whilst at the same time being able to capture the underlying named entity relationships. However, it is difficult to incorporate background knowledge or non-local information into the HVS model. This paper proposes to represent the HVS model as a conditionally trained undirected graphical model in which non-local features derived from PPI ontology through inference would be easily incorporated. The seamless fusion of ontology inference with statistical learning produces a new paradigm to information extraction.
Resumo:
Web APIs have gained increasing popularity in recent Web service technology development owing to its simplicity of technology stack and the proliferation of mashups. However, efficiently discovering Web APIs and the relevant documentations on the Web is still a challenging task even with the best resources available on the Web. In this paper we cast the problem of detecting the Web API documentations as a text classification problem of classifying a given Web page as Web API associated or not. We propose a supervised generative topic model called feature latent Dirichlet allocation (feaLDA) which offers a generic probabilistic framework for automatic detection of Web APIs. feaLDA not only captures the correspondence between data and the associated class labels, but also provides a mechanism for incorporating side information such as labelled features automatically learned from data that can effectively help improving classification performance. Extensive experiments on our Web APIs documentation dataset shows that the feaLDA model outperforms three strong supervised baselines including naive Bayes, support vector machines, and the maximum entropy model, by over 3% in classification accuracy. In addition, feaLDA also gives superior performance when compared against other existing supervised topic models.
Resumo:
Semantic Web Service, one of the most significant research areas within the Semantic Web vision, has attracted increasing attention from both the research community and industry. The Web Service Modelling Ontology (WSMO) has been proposed as an enabling framework for the total/partial automation of the tasks (e.g., discovery, selection, composition, mediation, execution, monitoring, etc.) involved in both intra- and inter-enterprise integration of Web services. To support the standardisation and tool support of WSMO, a formal model of the language is highly desirable. As several variants of WSMO have been proposed by the WSMO community, which are still under development, the syntax and semantics of WSMO should be formally defined to facilitate easy reuse and future development. In this paper, we present a formal Object-Z formal model of WSMO, where different aspects of the language have been precisely defined within one unified framework. This model not only provides a formal unambiguous model which can be used to develop tools and facilitate future development, but as demonstrated in this paper, can be used to identify and eliminate errors present in existing documentation.
Resumo:
In this poster we presented our preliminary work on the study of spammer detection and analysis with 50 active honeypot profiles implemented on Weibo.com and QQ.com microblogging networks. We picked out spammers from legitimate users by manually checking every captured user's microblogs content. We built a spammer dataset for each social network community using these spammer accounts and a legitimate user dataset as well. We analyzed several features of the two user classes and made a comparison on these features, which were found to be useful to distinguish spammers from legitimate users. The followings are several initial observations from our analysis on the features of spammers captured on Weibo.com and QQ.com. ¦The following/follower ratio of spammers is usually higher than legitimate users. They tend to follow a large amount of users in order to gain popularity but always have relatively few followers. ¦There exists a big gap between the average numbers of microblogs posted per day from these two classes. On Weibo.com, spammers post quite a lot microblogs every day, which is much more than legitimate users do; while on QQ.com spammers post far less microblogs than legitimate users. This is mainly due to the different strategies taken by spammers on these two platforms. ¦More spammers choose a cautious spam posting pattern. They mix spam microblogs with ordinary ones so that they can avoid the anti-spam mechanisms taken by the service providers. ¦Aggressive spammers are more likely to be detected so they tend to have a shorter life while cautious spammers can live much longer and have a deeper influence on the network. The latter kind of spammers may become the trend of social network spammer. © 2012 IEEE.
Resumo:
Increasingly, people's digital identities are attached to, and expressed through, their mobile devices. At the same time digital sensors pervade smart environments in which people are immersed. This paper explores different perspectives in which users' modelling features can be expressed through the information obtained by their attached personal sensors. We introduce the PreSense Ontology, which is designed to assign meaning to sensors' observations in terms of user modelling features. We believe that the Sensing Presence ( PreSense ) Ontology is a first step toward the integration of user modelling and "smart environments". In order to motivate our work we present a scenario and demonstrate how the ontology could be applied in order to enable context-sensitive services. © 2012 Springer-Verlag.
Resumo:
This work investigates the process of selecting, extracting and reorganizing content from Semantic Web information sources, to produce an ontology meeting the specifications of a particular domain and/or task. The process is combined with traditional text-based ontology learning methods to achieve tolerance to knowledge incompleteness. The paper describes the approach and presents experiments in which an ontology was built for a diet evaluation task. Although the example presented concerns the specific case of building a nutritional ontology, the methods employed are domain independent and transferrable to other use cases. © 2011 ACM.
Resumo:
Despite years of effort in building organisational taxonomies, the potential of ontologies to support knowledge management in complex technical domains is under-exploited. The authors of this chapter present an approach to using rich domain ontologies to support sense-making tasks associated with resolving mechanical issues. Using Semantic Web technologies, the authors have built a framework and a suite of tools which support the whole semantic knowledge lifecycle. These are presented by describing the process of issue resolution for a simulated investigation concerning failure of bicycle brakes. Foci of the work have included ensuring that semantic tasks fit in with users’ everyday tasks, to achieve user acceptability and support the flexibility required by communities of practice with differing local sub-domains, tasks, and terminology.
Resumo:
PowerAqua is a Question Answering system, which takes as input a natural language query and is able to return answers drawn from relevant semantic resources found anywhere on the Semantic Web. In this paper we provide two novel contributions: First, we detail a new component of the system, the Triple Similarity Service, which is able to match queries effectively to triples found in different ontologies on the Semantic Web. Second, we provide a first evaluation of the system, which in addition to providing data about PowerAqua's competence, also gives us important insights into the issues related to using the Semantic Web as the target answer set in Question Answering. In particular, we show that, despite the problems related to the noisy and incomplete conceptualizations, which can be found on the Semantic Web, good results can already be obtained.
Resumo:
Schema heterogeneity issues often represent an obstacle for discovering coreference links between individuals in semantic data repositories. In this paper we present an approach, which performs ontology schema matching in order to improve instance coreference resolution performance. A novel feature of the approach is its use of existing instance-level coreference links defined in third-party repositories as background knowledge for schema matching techniques. In our tests of this approach we obtained encouraging results, in particular, a substantial increase in recall in comparison with existing sets of coreference links.
Resumo:
The semantic web vision is one in which rich, ontology-based semantic markup will become widely available. The availability of semantic markup on the web opens the way to novel, sophisticated forms of question answering. AquaLog is a portable question-answering system which takes queries expressed in natural language and an ontology as input, and returns answers drawn from one or more knowledge bases (KBs). We say that AquaLog is portable because the configuration time required to customize the system for a particular ontology is negligible. AquaLog presents an elegant solution in which different strategies are combined together in a novel way. It makes use of the GATE NLP platform, string metric algorithms, WordNet and a novel ontology-based relation similarity service to make sense of user queries with respect to the target KB. Moreover it also includes a learning component, which ensures that the performance of the system improves over the time, in response to the particular community jargon used by end users.
Resumo:
The semantic web (SW) vision is one in which rich, ontology-based semantic markup will become widely available. The availability of semantic markup on the web opens the way to novel, sophisticated forms of question answering. AquaLog is a portable question-answering system which takes queries expressed in natural language (NL) and an ontology as input, and returns answers drawn from one or more knowledge bases (KB). AquaLog presents an elegant solution in which different strategies are combined together in a novel way. AquaLog novel ontology-based relation similarity service makes sense of user queries.
Resumo:
We show a new method for term extraction from a domain relevant corpus using natural language processing for the purposes of semi-automatic ontology learning. Literature shows that topical words occur in bursts. We find that the ranking of extracted terms is insensitive to the choice of population model, but calculating frequencies relative to the burst size rather than the document length in words yields significantly different results.
Resumo:
DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT
Resumo:
DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT