7 resultados para Approximate Sum Rule

em Boston University Digital Common


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper introduces BoostMap, a method that can significantly reduce retrieval time in image and video database systems that employ computationally expensive distance measures, metric or non-metric. Database and query objects are embedded into a Euclidean space, in which similarities can be rapidly measured using a weighted Manhattan distance. Embedding construction is formulated as a machine learning task, where AdaBoost is used to combine many simple, 1D embeddings into a multidimensional embedding that preserves a significant amount of the proximity structure in the original space. Performance is evaluated in a hand pose estimation system, and a dynamic gesture recognition system, where the proposed method is used to retrieve approximate nearest neighbors under expensive image and video similarity measures. In both systems, BoostMap significantly increases efficiency, with minimal losses in accuracy. Moreover, the experiments indicate that BoostMap compares favorably with existing embedding methods that have been employed in computer vision and database applications, i.e., FastMap and Bourgain embeddings.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We investigate adaptive buffer management techniques for approximate evaluation of sliding window joins over multiple data streams. In many applications, data stream processing systems have limited memory or have to deal with very high speed data streams. In both cases, computing the exact results of joins between these streams may not be feasible, mainly because the buffers used to compute the joins contain much smaller number of tuples than the tuples contained in the sliding windows. Therefore, a stream buffer management policy is needed in that case. We show that the buffer replacement policy is an important determinant of the quality of the produced results. To that end, we propose GreedyDual-Join (GDJ) an adaptive and locality-aware buffering technique for managing these buffers. GDJ exploits the temporal correlations (at both long and short time scales), which we found to be prevalent in many real data streams. We note that our algorithm is readily applicable to multiple data streams and multiple joins and requires almost no additional system resources. We report results of an experimental study using both synthetic and real-world data sets. Our results demonstrate the superiority and flexibility of our approach when contrasted to other recently proposed techniques.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Previous studies have shown that giving preferential treatment to short jobs helps reduce the average system response time, especially when the job size distribution possesses the heavy-tailed property. Since it has been shown that the TCP flow length distribution also has the same property, it is natural to let short TCP flows enjoy better service inside the network. Analyzing such discriminatory system requires modification to traditional job scheduling models since usually network traffic managers do not have detailed knowledge about individual flows such as their lengths. The Multi-Level (ML) queue, proposed by Kleinrock, can b e used to characterize such system. In an ML queueing system, the priority of a flow is reduced as the flow stays longer. We present an approximate analysis of the ML queueing system to obtain a closed-form solution of the average system response time function for general flow size distributions. We show that the response time of short flows can be significantly reduced without penalizing long flows.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present new, simple, efficient data structures for approximate reconciliation of set differences, a useful standalone primitive for peer-to-peer networks and a natural subroutine in methods for exact reconciliation. In the approximate reconciliation problem, peers A and B respectively have subsets of elements SA and SB of a large universe U. Peer A wishes to send a short message M to peer B with the goal that B should use M to determine as many elements in the set SB–SA as possible. To avoid the expense of round trip communication times, we focus on the situation where a single message M is sent. We motivate the performance tradeoffs between message size, accuracy and computation time for this problem with a straightforward approach using Bloom filters. We then introduce approximation reconciliation trees, a more computationally efficient solution that combines techniques from Patricia tries, Merkle trees, and Bloom filters. We present an analysis of approximation reconciliation trees and provide experimental results comparing the various methods proposed for approximate reconciliation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Nearest neighbor classification using shape context can yield highly accurate results in a number of recognition problems. Unfortunately, the approach can be too slow for practical applications, and thus approximation strategies are needed to make shape context practical. This paper proposes a method for efficient and accurate nearest neighbor classification in non-Euclidean spaces, such as the space induced by the shape context measure. First, a method is introduced for constructing a Euclidean embedding that is optimized for nearest neighbor classification accuracy. Using that embedding, multiple approximations of the underlying non-Euclidean similarity measure are obtained, at different levels of accuracy and efficiency. The approximations are automatically combined to form a cascade classifier, which applies the slower approximations only to the hardest cases. Unlike typical cascade-of-classifiers approaches, that are applied to binary classification problems, our method constructs a cascade for a multiclass problem. Experiments with a standard shape data set indicate that a two-to-three order of magnitude speed up is gained over the standard shape context classifier, with minimal losses in classification accuracy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper shows how knowledge, in the form of fuzzy rules, can be derived from a self-organizing supervised learning neural network called fuzzy ARTMAP. Rule extraction proceeds in two stages: pruning removes those recognition nodes whose confidence index falls below a selected threshold; and quantization of continuous learned weights allows the final system state to be translated into a usable set of rules. Simulations on a medical prediction problem, the Pima Indian Diabetes (PID) database, illustrate the method. In the simulations, pruned networks about 1/3 the size of the original actually show improved performance. Quantization yields comprehensible rules with only slight degradation in test set prediction performance.