5 resultados para Multimodal retrieval

em Boston University Digital Common


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Based on our previous work in deformable shape model-based object detection, a new method is proposed that uses index trees for organizing shape features to support content-based retrieval applications. In the proposed strategy, different shape feature sets can be used in index trees constructed for object detection and shape similarity comparison respectively. There is a direct correspondence between the two shape feature sets. As a result, application-specific features can be obtained efficiently for shape-based retrieval after object detection. A novel approach is proposed that allows retrieval of images based on the population distribution of deformed shapes in each image. Experiments testing these new approaches have been conducted using an image database that contains blood cell micrographs. The precision vs. recall performance measure shows that our method is superior to previous methods.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Some WWW image engines allow the user to form a query in terms of text keywords. To build the image index, keywords are extracted heuristically from HTML documents containing each image, and/or from the image URL and file headers. Unfortunately, text-based image engines have merely retro-fitted standard SQL database query methods, and it is difficult to include images cues within such a framework. On the other hand, visual statistics (e.g., color histograms) are often insufficient for helping users find desired images in a vast WWW index. By truly unifying textual and visual statistics, one would expect to get better results than either used separately. In this paper, we propose an approach that allows the combination of visual statistics with textual statistics in the vector space representation commonly used in query by image content systems. Text statistics are captured in vector form using latent semantic indexing (LSI). The LSI index for an HTML document is then associated with each of the images contained therein. Visual statistics (e.g., color, orientedness) are also computed for each image. The LSI and visual statistic vectors are then combined into a single index vector that can be used for content-based search of the resulting image database. By using an integrated approach, we are able to take advantage of possible statistical couplings between the topic of the document (latent semantic content) and the contents of images (visual statistics). This allows improved performance in conducting content-based search. This approach has been implemented in a WWW image search engine prototype.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

An improved method for deformable shape-based image indexing and retrieval is described. A pre-computed index tree is used to improve the speed of our previously reported on-line model fitting method; simple shape features are used as keys in a pre-generated index tree of model instances. In addition, a coarse to fine indexing scheme is used at different levels of the tree to further improve speed while maintaining matching accuracy. Experimental results show that the speedup is significant, while accuracy of shape-based indexing is maintained. A method for shape population-based retrieval is also described. The method allows query formulation based on the population distributions of shapes in each image. Results of population-based image queries for a database of blood cell micrographs are shown.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Abstract—Personal communication devices are increasingly being equipped with sensors that are able to passively collect information from their surroundings – information that could be stored in fairly small local caches. We envision a system in which users of such devices use their collective sensing, storage, and communication resources to query the state of (possibly remote) neighborhoods. The goal of such a system is to achieve the highest query success ratio using the least communication overhead (power). We show that the use of Data Centric Storage (DCS), or directed placement, is a viable approach for achieving this goal, but only when the underlying network is well connected. Alternatively, we propose, amorphous placement, in which sensory samples are cached locally and informed exchanges of cached samples is used to diffuse the sensory data throughout the whole network. In handling queries, the local cache is searched first for potential answers. If unsuccessful, the query is forwarded to one or more direct neighbors for answers. This technique leverages node mobility and caching capabilities to avoid the multi-hop communication overhead of directed placement. Using a simplified mobility model, we provide analytical lower and upper bounds on the ability of amorphous placement to achieve uniform field coverage in one and two dimensions. We show that combining informed shuffling of cached samples upon an encounter between two nodes, with the querying of direct neighbors could lead to significant performance improvements. For instance, under realistic mobility models, our simulation experiments show that amorphous placement achieves 10% to 40% better query answering ratio at a 25% to 35% savings in consumed power over directed placement.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time, finding nearest neighbors accurately and efficiently can be challenging, especially when the database contains a large number of objects, and when the underlying distance measure is computationally expensive. This thesis proposes new methods for improving the efficiency and accuracy of nearest neighbor retrieval and classification in spaces with computationally expensive distance measures. The proposed methods are domain-independent, and can be applied in arbitrary spaces, including non-Euclidean and non-metric spaces. In this thesis particular emphasis is given to computer vision applications related to object and shape recognition, where expensive non-Euclidean distance measures are often needed to achieve high accuracy. The first contribution of this thesis is the BoostMap algorithm for embedding arbitrary spaces into a vector space with a computationally efficient distance measure. Using this approach, an approximate set of nearest neighbors can be retrieved efficiently - often orders of magnitude faster than retrieval using the exact distance measure in the original space. The BoostMap algorithm has two key distinguishing features with respect to existing embedding methods. First, embedding construction explicitly maximizes the amount of nearest neighbor information preserved by the embedding. Second, embedding construction is treated as a machine learning problem, in contrast to existing methods that are based on geometric considerations. The second contribution is a method for constructing query-sensitive distance measures for the purposes of nearest neighbor retrieval and classification. In high-dimensional spaces, query-sensitive distance measures allow for automatic selection of the dimensions that are the most informative for each specific query object. It is shown theoretically and experimentally that query-sensitivity increases the modeling power of embeddings, allowing embeddings to capture a larger amount of the nearest neighbor structure of the original space. The third contribution is a method for speeding up nearest neighbor classification by combining multiple embedding-based nearest neighbor classifiers in a cascade. In a cascade, computationally efficient classifiers are used to quickly classify easy cases, and classifiers that are more computationally expensive and also more accurate are only applied to objects that are harder to classify. An interesting property of the proposed cascade method is that, under certain conditions, classification time actually decreases as the size of the database increases, a behavior that is in stark contrast to the behavior of typical nearest neighbor classification systems. The proposed methods are evaluated experimentally in several different applications: hand shape recognition, off-line character recognition, online character recognition, and efficient retrieval of time series. In all datasets, the proposed methods lead to significant improvements in accuracy and efficiency compared to existing state-of-the-art methods. In some datasets, the general-purpose methods introduced in this thesis even outperform domain-specific methods that have been custom-designed for such datasets.