850 results for Ontology mining
Abstract:
Recently, user tagging systems have grown in popularity on the web. The tagging process is quite simple for ordinary users, which contributes to its popularity. However, free vocabulary lacks standardization and suffers from semantic ambiguity. It is possible to capture the semantics of user tagging and represent them in the form of an ontology, but the application of the learned ontology to recommendation has not yet flourished. In this paper we discuss our approach to learning a domain ontology from user tagging information and applying the extracted tag ontology in a pilot tag recommendation experiment. The initial results show that using the tag ontology to re-rank the recommended tags can improve the accuracy of tag recommendation.
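The re-ranking step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not describe the actual scoring scheme, so everything here is an assumption made for illustration: the function name `rerank_tags`, the `related` mapping standing in for the tag ontology, and the additive `boost` per ontology link.

```python
from typing import Dict, List, Set, Tuple


def rerank_tags(
    candidates: List[Tuple[str, float]],
    user_tags: Set[str],
    related: Dict[str, Set[str]],
    boost: float = 0.5,
) -> List[Tuple[str, float]]:
    """Re-rank (tag, score) candidates using a toy tag ontology.

    `related` (hypothetical) maps a tag to the tags it is linked to in the
    ontology. A candidate gains `boost` for every tag already used by the
    user that the ontology links it to.
    """
    rescored = []
    for tag, score in candidates:
        links = related.get(tag, set()) & user_tags
        rescored.append((tag, score + boost * len(links)))
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(rescored, key=lambda pair: (-pair[1], pair[0]))


# Illustrative data only; none of it comes from the paper.
candidates = [("java", 0.9), ("python", 0.8), ("coffee", 0.7)]
related = {
    "python": {"programming", "scripting"},
    "java": {"programming"},
    "coffee": {"drink"},
}
user_tags = {"programming", "scripting"}

reranked = rerank_tags(candidates, user_tags, related)
print([tag for tag, _ in reranked])
```

In this toy run "python" overtakes "java" because the ontology links it to two of the user's existing tags rather than one.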
Abstract:
Tags, or personal metadata for annotating web resources, have been widely adopted in Web 2.0 sites. However, as tags are freely chosen by users, the vocabularies are diverse, ambiguous and sometimes only meaningful to individuals. Tag recommenders may assist users during the tagging process. Their objective is to suggest relevant tags and to help consolidate the vocabulary in the system. In this paper we discuss our approach to providing personalized tag recommendation by making use of an existing domain ontology generated from a folksonomy. Specifically, we evaluated the approach in a sparse situation. The evaluation shows that the proposed ontology-based method improves the accuracy of tag recommendation in this situation.
Abstract:
Different reputation models are used on the web to generate reputation values for products from users' review data. Most current reputation models use review ratings and neglect users' textual reviews, because these are more difficult to process. However, we argue that the overall reputation score for an item does not reflect the actual reputation of all of its features, which is why the use of users' textual reviews is necessary. In our work we introduce a new reputation model that defines a new aggregation method for opinions about products' features extracted from users' text. Our model uses a feature ontology to define the general features and sub-features of a product. It also reflects the frequencies of positive and negative opinions. We provide a case study to show how our results compare with other reputation models.
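A small sketch may help to picture how opinion frequencies could be rolled up through a feature ontology. The abstract does not give the aggregation formula, so the share-of-positive-opinions score below is an assumption, and `FEATURE_ONTOLOGY`, `feature_score` and `aggregate` are hypothetical names, not the authors' model.

```python
from typing import Dict, List, Tuple

# Hypothetical feature ontology: general feature -> its sub-features.
FEATURE_ONTOLOGY: Dict[str, List[str]] = {
    "camera": ["lens", "flash"],
    "battery": ["battery life"],
}


def feature_score(pos: int, neg: int) -> float:
    """Share of positive opinions; 0.5 (neutral) when there are no opinions."""
    total = pos + neg
    return pos / total if total else 0.5


def aggregate(counts: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Roll (positive, negative) opinion counts up to each general feature."""
    scores = {}
    for general, subs in FEATURE_ONTOLOGY.items():
        pos = neg = 0
        # A general feature collects its own opinions plus its sub-features'.
        for name in [general, *subs]:
            p, n = counts.get(name, (0, 0))
            pos += p
            neg += n
        scores[general] = feature_score(pos, neg)
    return scores


# Illustrative opinion counts as (positive, negative); not data from the paper.
opinion_counts = {
    "camera": (2, 0),
    "lens": (3, 1),
    "flash": (1, 3),
    "battery life": (1, 3),
}
scores = aggregate(opinion_counts)
print(scores)
```

Reporting one score per general feature, rather than a single number for the product, is what lets such a model show that a well-rated item can still have a poorly rated feature.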
Abstract:
Online business or Electronic Commerce (EC) is becoming popular among customers today; as a result, a large number of product reviews have been posted online by customers. This information is very valuable not only for prospective customers deciding whether to buy a product but also for companies gathering information on customers’ satisfaction with their products. Opinion mining is used to capture customer reviews and separate them into subjective expressions (sentiment words) and objective expressions (no sentiment words). This paper proposes a novel, multi-dimensional model for opinion mining, which integrates customers’ characteristics and their opinions about any product. The model captures subjective expressions from product reviews and transfers them to a fact table before representing them in multiple dimensions, namely customers, products, time and location. Data warehouse techniques such as OLAP and data cubes were used to analyze opinionated sentences. A comprehensive way to calculate customers’ orientation on products’ features and attributes is presented in this paper.
Abstract:
The Semantic Web offers many possibilities for future Web technologies. There is therefore a need to find ways to bring the huge amount of unstructured documents from the current Web to the Semantic Web automatically. One big challenge in this search is how patterns can be understood by both humans and machines. To address this issue, we present an innovative model which interprets patterns as high-level concepts. These concepts can explain the patterns' meanings in a human-understandable way while improving information filtering performance. The model is evaluated by comparing it against one state-of-the-art benchmark model using a standard Reuters dataset. The results show that the proposed model is successful. The significance of this model is threefold. It gives a way to interpret text mining output, provides a technique to find concepts relevant to the whole set of patterns, which is an essential feature for understanding the topic, and to some extent overcomes the information mismatch and overload problems of existing models. This model will be very useful for knowledge-based applications.
Abstract:
This research proposes a multi-dimensional model for Opinion Mining, which integrates customers' characteristics and their opinions about products (or services). Customer opinions are valuable for companies seeking to deliver the right products or services to their customers. This research presents a comprehensive framework for evaluating opinions' orientation based on products' hierarchy attributes. It also provides an alternative way to obtain opinion summaries for different groups of customers and different categories of products.
Abstract:
The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed.
Abstract:
Time-series and sequences are important patterns in data mining. Based on an ontology of time-elements, this paper presents a formal characterization of time-series and state-sequences, where a state denotes a collection of data whose validation is dependent on time. While a time-series is formalized as a vector of time-elements temporally ordered one after another, a state-sequence is denoted as a list of states correspondingly ordered by a time-series. In general, a time-series and a state-sequence can be incomplete in various ways. This leads to the distinction between complete and incomplete time-series, and between complete and incomplete state-sequences, which allows the expression of both absolute and relative temporal knowledge in data mining.
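The structures named in this abstract can be pictured with a small data-structure sketch. It is not the paper's formalization: `TimeElement`, `State` and the helper functions are hypothetical names, time-elements are simplified to numeric intervals (a point has `start == end`), and the gap-free check is only one possible reading of "complete", assumed here for illustration.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class TimeElement:
    """A time-element: an interval, or a point when start == end."""
    start: float
    end: float

    def __post_init__(self) -> None:
        if self.end < self.start:
            raise ValueError("end must not precede start")


@dataclass(frozen=True)
class State:
    """A collection of data that holds over a given time-element."""
    data: Any
    time: TimeElement


def is_time_series(elements: List[TimeElement]) -> bool:
    """True if the elements are ordered one after another without overlap."""
    return all(a.end <= b.start for a, b in zip(elements, elements[1:]))


def is_gap_free(elements: List[TimeElement]) -> bool:
    """Assumed completeness check: each element ends where the next starts."""
    return all(a.end == b.start for a, b in zip(elements, elements[1:]))


def state_sequence(states: List[State]) -> List[State]:
    """Order states by their time-elements, which must form a time-series."""
    ordered = sorted(states, key=lambda s: (s.time.start, s.time.end))
    if not is_time_series([s.time for s in ordered]):
        raise ValueError("state times overlap and do not form a time-series")
    return ordered


series = [TimeElement(0, 1), TimeElement(1, 3), TimeElement(5, 5)]
print(is_time_series(series), is_gap_free(series))

states = [State("cooling", TimeElement(1, 3)), State("heating", TimeElement(0, 1))]
print([s.data for s in state_sequence(states)])
```

The example series is ordered but has a gap between 3 and 5, so under the assumed check it would count as incomplete, while its first two elements alone would count as complete.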
Abstract:
Social bookmark tools are rapidly emerging on the Web. In such systems users are setting up lightweight conceptual structures called folksonomies. These systems currently provide relatively little structure. In this paper, we discuss how association rule mining can be adopted to analyze and structure folksonomies, and how the results can be used for ontology learning and for supporting emergent semantics. We demonstrate our approach on a large-scale dataset stemming from an online system.
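As a rough illustration of association rule mining over a folksonomy, the sketch below projects (user, tag, resource) triples onto resources and derives tag-to-tag rules with the standard support and confidence measures. The projection chosen, the thresholds and the name `tag_rules` are assumptions made for this example; the abstract does not specify which projections or parameters the authors use.

```python
from collections import defaultdict
from itertools import permutations
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (user, tag, resource)
Rule = Tuple[str, str, float, float]  # (antecedent, consequent, support, confidence)


def tag_rules(
    triples: List[Triple],
    min_support: float = 0.5,
    min_confidence: float = 0.8,
) -> List[Rule]:
    """Mine rules "tag A -> tag B" from resources treated as transactions."""
    # Project the folksonomy onto resources: one tag set per resource.
    transactions: Dict[str, Set[str]] = defaultdict(set)
    for _user, tag, resource in triples:
        transactions[resource].add(tag)

    n = len(transactions)
    single: Dict[str, int] = defaultdict(int)
    pair: Dict[Tuple[str, str], int] = defaultdict(int)
    for tags in transactions.values():
        for tag in tags:
            single[tag] += 1
        for a, b in permutations(sorted(tags), 2):
            pair[(a, b)] += 1

    rules = []
    for (a, b), count in pair.items():
        support = count / n              # share of resources carrying both tags
        confidence = count / single[a]   # share of A-tagged resources also tagged B
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)


# Illustrative triples only; not the dataset used in the paper.
triples = [
    ("u1", "python", "r1"), ("u1", "programming", "r1"),
    ("u2", "python", "r2"), ("u2", "programming", "r2"),
    ("u3", "java", "r3"), ("u3", "programming", "r3"),
    ("u1", "java", "r4"), ("u1", "programming", "r4"),
]
rules = tag_rules(triples)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent} (support={support}, confidence={confidence})")
```

A high-confidence rule in one direction only, such as "python -> programming" here, is the kind of asymmetry that could be read as a candidate narrower/broader relation during ontology learning, subject to manual review.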
Abstract:
The increase in new electronic devices has generated a considerable increase in the acquisition of spatial data; hence these data are becoming more and more widely used. As with conventional data, spatial data need to be analyzed so that interesting information can be retrieved from them. Therefore, data clustering techniques can be used to extract clusters from a set of spatial data. However, current approaches do not consider the implicit semantics that exist between a region and an object’s attributes. This paper presents an approach that enhances the spatial data mining process so that it can use the semantics that exist within a region. A framework, OntoSDM, was developed that enables spatial data mining algorithms to communicate with ontologies in order to enhance the algorithms’ results. The experiments demonstrated semantically improved results, generating more interesting clusters and therefore reducing the manual analysis work of an expert.
Abstract:
Sensor networks are increasingly becoming one of the main sources of Big Data on the Web. However, the observations that they produce are made available with heterogeneous schemas, vocabularies and data formats, making it difficult to share and reuse these data for purposes other than those for which they were originally set up. In this thesis we address these challenges, considering how we can transform streaming raw data into rich ontology-based information that is accessible through continuous queries for streaming data. Our main contribution is an ontology-based approach for providing data access and query capabilities to streaming data sources, allowing users to express their needs at a conceptual level, independent of implementation and language-specific details. We introduce novel query rewriting and data translation techniques that rely on mapping definitions relating streaming data models to ontological concepts. Specific contributions include:
• The syntax and semantics of the SPARQLStream query language for ontology-based data access, and a query rewriting approach for transforming SPARQLStream queries into streaming algebra expressions.
• The design of an ontology-based streaming data access engine that can internally reuse an existing data stream engine, complex event processor or sensor middleware, using R2RML mappings for defining relationships between streaming data models and ontology concepts.
Concerning the sensor metadata of such streaming data sources, we have investigated how we can use raw measurements to characterize streaming data, producing enriched data descriptions in terms of ontological models. Our specific contributions are:
• A representation of sensor data time series that captures gradient information that is useful to characterize types of sensor data.
• A method for classifying sensor data time series and determining the type of data, using data mining techniques, and a method for extracting semantic sensor metadata features from the time series.
Abstract:
We describe a domain ontology development approach that extracts domain terms from folksonomies and enriches them with data and vocabularies from the Linked Open Data cloud. As a result, we obtain lightweight domain ontologies that combine the emergent knowledge of social tagging systems with formal knowledge from ontologies. In order to illustrate the feasibility of our approach, we have produced an ontology in the financial domain from tags available in Delicious, using DBpedia, OpenCyc and UMBEL as additional knowledge sources.
Abstract:
This paper proposes a novel framework for incorporating protein-protein interaction (PPI) ontology knowledge into PPI extraction from biomedical literature in order to address the emerging challenges of deep natural language understanding. It builds upon existing work on relation extraction using the Hidden Vector State (HVS) model. The HVS model belongs to the category of statistical learning methods. It can be trained directly from un-annotated data in a constrained way whilst at the same time being able to capture the underlying named entity relationships. However, it is difficult to incorporate background knowledge or non-local information into the HVS model. This paper proposes to represent the HVS model as a conditionally trained undirected graphical model in which non-local features derived from the PPI ontology through inference can be easily incorporated. The seamless fusion of ontology inference with statistical learning produces a new paradigm for information extraction.