72 resultados para TSDEAI Semantic-Web Twitter Semantic-Search WordNet LSA

em Aston University Research Archive


Relevância:

100.00% 100.00%

Publicador:

Resumo:

While semantic search technologies have been proven to work well in specific domains, they still have to confront two main challenges to scale up to the Web in its entirety. In this work we address this issue with a novel semantic search system that a) provides the user with the capability to query Semantic Web information using natural language, by means of an ontology-based Question Answering (QA) system [14] and b) complements the specific answers retrieved during the QA process with a ranked list of documents from the Web [3]. Our results show that ontology-based semantic search capabilities can be used to complement and enhance keyword search technologies.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Evaluations of semantic search systems are generally small scale and ad hoc due to the lack of appropriate resources such as test collections, agreed performance criteria and independent judgements of performance. By analysing our work in building and evaluating semantic tools over the last five years, we conclude that the growth of the semantic web led to an improvement in the available resources and the consequent robustness of performance assessments. We propose two directions for continuing evaluation work: the development of extensible evaluation benchmarks and the use of logging parameters for evaluating individual components of search systems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The goal of semantic search is to improve on traditional search methods by exploiting the semantic metadata. In this paper, we argue that supporting iterative and exploratory search modes is important to the usability of all search systems. We also identify the types of semantic queries the users need to make, the issues concerning the search environment and the problems that are intrinsic to semantic search in particular. We then review the four modes of user interaction in existing semantic search systems, namely keyword-based, form-based, view-based and natural language-based systems. Future development should focus on multimodal search systems, which exploit the advantages of more than one mode of interaction, and on developing the search systems that can search heterogeneous semantic metadata on the open semantic Web.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Formulating complex queries is hard, especially when users cannot understand all the data structures of multiple complex knowledge bases. We see a gap between simplistic but user friendly tools and formal query languages. Building on an example comparison search, we propose an approach in which reusable search components take an intermediary role between the user interface and formal query languages.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Existing semantic search tools have been primarily designed to enhance the performance of traditional search technologies but with little support for ordinary end users who are not necessarily familiar with domain specific semantic data, ontologies, or SQL-like query languages. This paper presents SemSearch, a search engine, which pays special attention to this issue by providing several means to hide the complexity of semantic search from end users and thus make it easy to use and effective.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents our Semantic Web portal infrastructure, which focuses on how to enhance knowledge access in traditional Web portals by gathering and exploiting semantic metadata. Special attention is paid to three important issues that affect the performance of knowledge access: i) high quality metadata acquisition, which concerns how to ensure high quality while gathering semantic metadata from heterogeneous data sources; ii) semantic search, which addresses how to meet the information querying needs of ordinary end users who are not necessarily familiar with the problem domain or the supported query language; and iii) semantic browsing, which concerns how to help users understand and explore the problem domain.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Automatic Term Recognition (ATR) is a fundamental processing step preceding more complex tasks such as semantic search and ontology learning. From a large number of methodologies available in the literature only a few are able to handle both single and multi-word terms. In this paper we present a comparison of five such algorithms and propose a combined approach using a voting mechanism. We evaluated the six approaches using two different corpora and show how the voting algorithm performs best on one corpus (a collection of texts from Wikipedia) and less well using the Genia corpus (a standard life science corpus). This indicates that choice and design of corpus has a major impact on the evaluation of term recognition algorithms. Our experiments also showed that single-word terms can be equally important and occupy a fairly large proportion in certain domains. As a result, algorithms that ignore single-word terms may cause problems to tasks built on top of ATR. Effective ATR systems also need to take into account both the unstructured text and the structured aspects and this means information extraction techniques need to be integrated into the term recognition process.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The expansion of the Internet has made the task of searching a crucial one. Internet users, however, have to make a great effort in order to formulate a search query that returns the required results. Many methods have been devised to assist in this task by helping the users modify their query to give better results. In this paper we propose an interactive method for query expansion. It is based on the observation that documents are often found to contain terms with high information content, which can summarise their subject matter. We present experimental results, which demonstrate that our approach significantly shortens the time required in order to accomplish a certain task by performing web searches.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Sentiment analysis over Twitter offer organisations a fast and effective way to monitor the publics' feelings towards their brand, business, directors, etc. A wide range of features and methods for training sentiment classifiers for Twitter datasets have been researched in recent years with varying results. In this paper, we introduce a novel approach of adding semantics as additional features into the training set for sentiment analysis. For each extracted entity (e.g. iPhone) from tweets, we add its semantic concept (e.g. Apple product) as an additional feature, and measure the correlation of the representative concept with negative/positive sentiment. We apply this approach to predict sentiment for three different Twitter datasets. Our results show an average increase of F harmonic accuracy score for identifying both negative and positive sentiment of around 6.5% and 4.8% over the baselines of unigrams and part-of-speech features respectively. We also compare against an approach based on sentiment-bearing topic analysis, and find that semantic features produce better Recall and F score when classifying negative sentiment, and better Precision with lower Recall and F score in positive sentiment classification.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In this paper we propose algorithms for combining and ranking answers from distributed heterogeneous data sources in the context of a multi-ontology Question Answering task. Our proposal includes a merging algorithm that aggregates, combines and filters ontology-based search results and three different ranking algorithms that sort the final answers according to different criteria such as popularity, confidence and semantic interpretation of results. An experimental evaluation on a large scale corpus indicates improvements in the quality of the search results with respect to a scenario where the merging and ranking algorithms were not applied. These collective methods for merging and ranking allow to answer questions that are distributed across ontologies, while at the same time, they can filter irrelevant answers, fuse similar answers together, and elicit the most accurate answer(s) to a question.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Most existing approaches to Twitter sentiment analysis assume that sentiment is explicitly expressed through affective words. Nevertheless, sentiment is often implicitly expressed via latent semantic relations, patterns and dependencies among words in tweets. In this paper, we propose a novel approach that automatically captures patterns of words of similar contextual semantics and sentiment in tweets. Unlike previous work on sentiment pattern extraction, our proposed approach does not rely on external and fixed sets of syntactical templates/patterns, nor requires deep analyses of the syntactic structure of sentences in tweets. We evaluate our approach with tweet- and entity-level sentiment analysis tasks by using the extracted semantic patterns as classification features in both tasks. We use 9 Twitter datasets in our evaluation and compare the performance of our patterns against 6 state-of-the-art baselines. Results show that our patterns consistently outperform all other baselines on all datasets by 2.19% at the tweet-level and 7.5% at the entity-level in average F-measure.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Social media has become an effective channel for communicating both trends and public opinion on current events. However the automatic topic classification of social media content pose various challenges. Topic classification is a common technique used for automatically capturing themes that emerge from social media streams. However, such techniques are sensitive to the evolution of topics when new event-dependent vocabularies start to emerge (e.g., Crimea becoming relevant to War Conflict during the Ukraine crisis in 2014). Therefore, traditional supervised classification methods which rely on labelled data could rapidly become outdated. In this paper we propose a novel transfer learning approach to address the classification task of new data when the only available labelled data belong to a previous epoch. This approach relies on the incorporation of knowledge from DBpedia graphs. Our findings show promising results in understanding how features age, and how semantic features can support the evolution of topic classifiers.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Lexicon-based approaches to Twitter sentiment analysis are gaining much popularity due to their simplicity, domain independence, and relatively good performance. These approaches rely on sentiment lexicons, where a collection of words are marked with fixed sentiment polarities. However, words' sentiment orientation (positive, neural, negative) and/or sentiment strengths could change depending on context and targeted entities. In this paper we present SentiCircle; a novel lexicon-based approach that takes into account the contextual and conceptual semantics of words when calculating their sentiment orientation and strength in Twitter. We evaluate our approach on three Twitter datasets using three different sentiment lexicons. Results show that our approach significantly outperforms two lexicon baselines. Results are competitive but inconclusive when comparing to state-of-art SentiStrength, and vary from one dataset to another. SentiCircle outperforms SentiStrength in accuracy on average, but falls marginally behind in F-measure. © 2014 Springer International Publishing.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present a vision and a proposal for using Semantic Web technologies in the organic food industry. This is a very knowledge intensive industry at every step from the producer, to the caterer or restauranteur, through to the consumer. There is a crucial need for a concept of environmental audit which would allow the various stake holders to know the full environmental impact of their economic choices. This is a di?erent and parallel form of knowledge to that of price. Semantic Web technologies can be used e?ectively for the calculation and transfer of this type of knowledge (together with other forms of multimedia data) which could contribute considerably to the commercial and educational impact of the organic food industry. We outline how this could be achieved as our essential ob jective is to show how advanced technologies could be used to both reduce ecological impact and increase public awareness.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The main argument of this paper is that Natural Language Processing (NLP) does, and will continue to, underlie the Semantic Web (SW), including its initial construction from unstructured sources like the World Wide Web (WWW), whether its advocates realise this or not. Chiefly, we argue, such NLP activity is the only way up to a defensible notion of meaning at conceptual levels (in the original SW diagram) based on lower level empirical computations over usage. Our aim is definitely not to claim logic-bad, NLP-good in any simple-minded way, but to argue that the SW will be a fascinating interaction of these two methodologies, again like the WWW (which has been basically a field for statistical NLP research) but with deeper content. Only NLP technologies (and chiefly information extraction) will be able to provide the requisite RDF knowledge stores for the SW from existing unstructured text databases in the WWW, and in the vast quantities needed. There is no alternative at this point, since a wholly or mostly hand-crafted SW is also unthinkable, as is a SW built from scratch and without reference to the WWW. We also assume that, whatever the limitations on current SW representational power we have drawn attention to here, the SW will continue to grow in a distributed manner so as to serve the needs of scientists, even if it is not perfect. The WWW has already shown how an imperfect artefact can become indispensable.