Biblioteca Digital

Traditional information retrieval (IR) systems respond to user queries with ranked lists of relevant documents. The separation of content and structure in XML documents allows individual XML elements to be selected in isolation. Thus, users expect XML-IR systems to return highly relevant results that are more precise than entire documents. In this paper we describe the implementation of a search engine for XML document collections. The system is keyword-based and is built upon an XML inverted file system. We describe the approach that was adopted to meet the requirements of Content Only (CO) and Vague Content and Structure (VCAS) queries in INEX 2004.
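As an illustration of the idea in this abstract, the following is a minimal sketch of an element-level inverted file: each term maps to the XML elements (not whole documents) that contain it, so a keyword query can return individual elements. It is not the system described in the paper. The function names `index_xml` and `search`, the XPath-like element paths, and the plain term-frequency scoring are all assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_xml(doc_id, xml_text, index):
    """Add every element of one XML document to the inverted index.

    `index` maps term -> {(doc_id, element_path): term frequency}.
    The path format is a hypothetical XPath-like string.
    """
    root = ET.fromstring(xml_text)

    def walk(elem, path):
        # Text that belongs directly to this element: its leading text
        # plus the tail text that follows each of its children.
        own_text = " ".join([elem.text or ""] + [c.tail or "" for c in elem])
        for term in tokenize(own_text):
            index[term][(doc_id, path)] += 1
        # Number same-tag siblings so every element gets a unique path.
        seen = defaultdict(int)
        for child in elem:
            seen[child.tag] += 1
            walk(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    walk(root, f"/{root.tag}[1]")


def search(index, query):
    """Rank elements by summed term frequency (an assumed, naive score)."""
    scores = defaultdict(int)
    for term in tokenize(query):
        for posting, tf in index.get(term, {}).items():
            scores[posting] += tf
    # Highest score first; ties are broken by (doc_id, path) for stability.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


index = defaultdict(lambda: defaultdict(int))
doc = (
    "<article><title>XML retrieval</title>"
    "<sec><p>keyword search over XML elements</p><p>ranked lists</p></sec>"
    "</article>"
)
index_xml("d1", doc, index)
results = search(index, "xml keyword")
```

Here `results` ranks the first paragraph above the title because it matches both query terms. A real XML-IR engine would also need to propagate scores from child elements to their ancestors and to handle structural constraints such as those in VCAS queries; this sketch does neither.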

K-tree: large scale document clustering

Abstract:

We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means, it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare its performance and quality to CLUTO using document collections. The K-tree has a low time complexity that makes it suitable for large document collections. Its tree structure allows for efficient disk-based implementations when space requirements exceed main memory.

Document clustering with K-tree

Abstract:

This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queensland University of Technology. We introduce the K-tree clustering algorithm in an Information Retrieval context by adapting it for document clustering. Many large-scale problems exist in document clustering. K-tree scales well with large inputs due to its low complexity. It offers promising results in terms of both efficiency and quality. Document classification was completed using Support Vector Machines.

Mid-level concept learning with visual contextual ontologies and probabilistic inference for image annotation

Abstract:

Automatic recognition of semantic information such as salient objects and mid-level concepts in images remains a challenging task. Since real-world objects tend to exist in a context within their environment, computer vision researchers have increasingly incorporated contextual information to improve object recognition. In this paper, we present a method to build a visual contextual ontology from salient object descriptions for image annotation. The ontologies include not only partOf/kindOf relations, but also spatial and co-occurrence relations. A two-step image annotation algorithm based on ontology relations and probabilistic inference is also proposed. Unlike most existing work, we specifically explore how to combine ontology representation, contextual knowledge and probabilistic inference. The experiments show that image annotation results improve on the LabelMe dataset.

Defining friendworks: communication perspective on social networks types

Abstract:

This paper introduces friendwork as a new term in social network studies. A friendwork is a network of friends. It is a specific case of an interpersonal social network. Naming this seemingly well-known and familiar group of people a friendwork helps differentiate it from the overall social network, while highlighting this subgroup's specific attributes and dynamics. The focus on one segment within social networks stimulates a wider discussion of the different subgroups within social networks. Other subgroups discussed in this paper are family-dependent, work-related, location-based and virtual acquaintance networks. This discussion informs a larger study of social media, specifically addressing the interactive communication modes in use within friendworks: direct (face-to-face) and mediated (mainly fixed telephone, internet and mobile phone). It explores the role of social media within friendworks while providing a communication perspective on social networks.

Summary and recommendations

Random Indexing K-tree

Abstract:

Random Indexing K-tree is the combination of two algorithms suited to large-scale document clustering.

936 results for Shlomo Ben Ami
