10 results for area-based matching

in Boston University Digital Common


Relevance:

100.00%

Publisher:

Summary:

Locating hands in sign language video is challenging due to a number of factors. Hand appearance varies widely across signers due to anthropometric variations and varying levels of signer proficiency. Video can be captured under varying illumination, camera resolutions, and levels of scene clutter, e.g., high-res video captured in a studio vs. low-res video gathered by a web cam in a user’s home. Moreover, the signers’ clothing varies, e.g., skin-toned clothing vs. contrasting clothing, short-sleeved vs. long-sleeved shirts, etc. In this work, the hand detection problem is addressed in an appearance matching framework. The Histogram of Oriented Gradient (HOG) based matching score function is reformulated to allow non-rigid alignment between pairs of images to account for hand shape variation. The resulting alignment score is used within a Support Vector Machine hand/not-hand classifier for hand detection. The new matching score function yields improved performance (in ROC area and hand detection rate) over the Vocabulary Guided Pyramid Match Kernel (VGPMK) and the traditional, rigid HOG distance on American Sign Language video gestured by expert signers. The proposed match score function is computationally less expensive (for training and testing), has fewer parameters and is less sensitive to parameter settings than VGPMK. The proposed detector works well on test sequences from an inexpert signer in a non-studio setting with cluttered background.

Relevance:

40.00%

Publisher:

Summary:

Controlling the mobility pattern of mobile nodes (e.g., robots) to monitor a given field is a well-studied problem in sensor networks. In this setup, absolute control over the nodes’ mobility is assumed. Apart from the physical ones, no other constraints are imposed on planning mobility of these nodes. In this paper, we address a more general version of the problem. Specifically, we consider a setting in which mobility of each node is externally constrained by a schedule consisting of a list of locations that the node must visit at particular times. Typically, such schedules exhibit some level of slack, which could be leveraged to achieve a specific coverage distribution of a field. Such a distribution defines the relative importance of different field locations. We define the Constrained Mobility Coordination problem for Preferential Coverage (CMC-PC) as follows: given a field with a desired monitoring distribution, and a number of nodes n, each with its own schedule, we need to coordinate the mobility of the nodes in order to achieve the following two goals: 1) satisfy the schedules of all nodes, and 2) attain the required coverage of the given field. We show that the CMC-PC problem is NP-complete (by reduction to the Hamiltonian Cycle problem). Then we propose TFM, a distributed heuristic to achieve field coverage that is as close as possible to the required coverage distribution. We verify the premise of TFM using extensive simulations, as well as taxi logs from a major metropolitan area. We compare TFM to the random mobility strategy, which provides a lower bound on performance. Our results show that TFM is very successful in matching the required field coverage distribution, and that it provides at least twice the query success ratio for queries that follow the target coverage distribution of the field.

Relevance:

40.00%

Publisher:

Summary:

Establishing correspondences among object instances is still challenging in multi-camera surveillance systems, especially when the cameras’ fields of view are non-overlapping. Spatiotemporal constraints can help in solving the correspondence problem but still leave a wide margin of uncertainty. One way to reduce this uncertainty is to use appearance information about the moving objects in the site. In this paper we present the preliminary results of a new method that can capture salient appearance characteristics at each camera node in the network. A Latent Dirichlet Allocation (LDA) model is created and maintained at each node in the camera network. Each object is encoded in terms of the LDA bag-of-words model for appearance. The encoded appearance is then used to establish probable matching across cameras. Preliminary experiments are conducted on a dataset of 20 individuals and comparison against Madden’s I-MCHR is reported.

Relevance:

30.00%

Publisher:

Summary:

Modal matching is a new method for establishing correspondences and computing canonical descriptions. The method is based on the idea of describing objects in terms of generalized symmetries, as defined by each object's eigenmodes. The resulting modal description is used for object recognition and categorization, where shape similarities are expressed as the amounts of modal deformation energy needed to align the two objects. In general, modes provide a global-to-local ordering of shape deformation and thus allow for selecting which types of deformations are used in object alignment and comparison. In contrast to previous techniques, which required correspondence to be computed with an initial or prototype shape, modal matching utilizes a new type of finite element formulation that allows for an object's eigenmodes to be computed directly from available image information. This improved formulation provides greater generality and accuracy, and is applicable to data of any dimensionality. Correspondence results with 2-D contour and point feature data are shown, and recognition experiments with 2-D images of hand tools and airplanes are described.

Relevance:

30.00%

Publisher:

Summary:

Commonly, research work in routing for delay tolerant networks (DTN) assumes that node encounters are predestined, in the sense that they are the result of unknown, exogenous processes that control the mobility of these nodes. In this paper, we argue that for many applications such an assumption is too restrictive: while the spatio-temporal coordinates of the start and end points of a node's journey are determined by exogenous processes, the specific path that a node may take in space-time, and hence the set of nodes it may encounter, could be controlled in such a way as to improve the performance of DTN routing. To that end, we consider a setting in which each mobile node is governed by a schedule consisting of a list of locations that the node must visit at particular times. Typically, such schedules exhibit some level of slack, which could be leveraged for DTN message delivery purposes. We define the Mobility Coordination Problem (MCP) for DTNs as follows: Given a set of nodes, each with its own schedule, and a set of messages to be exchanged between these nodes, devise a set of node encounters that minimizes message delivery delays while satisfying all node schedules. The MCP for DTNs is general enough that it allows us to model and evaluate some of the existing DTN schemes, including data mules and message ferries. In this paper, we show that MCP for DTNs is NP-hard and propose two detour-based approaches to solve the problem. The first (DMD) is a centralized heuristic that leverages knowledge of the message workload to suggest specific detours to optimize message delivery. The second (DNE) is a distributed heuristic that is oblivious to the message workload, and which selects detours so as to maximize node encounters. We evaluate the performance of these detour-based approaches using extensive simulations based on synthetic workloads as well as real schedules obtained from taxi logs in a major metropolitan area. Our evaluation shows that our centralized, workload-aware DMD approach yields the best performance, in terms of message delay and delivery success ratio, and that our distributed, workload-oblivious DNE approach yields favorable performance when compared to approaches that require the use of data mules and message ferries.

Relevance:

30.00%

Publisher:

Summary:

The performance of different classification approaches is evaluated using a view-based approach for motion representation. The view-based approach uses computer vision and image processing techniques to register and process the video sequence. Two motion representations, called the Motion Energy Image and the Motion History Image, are then constructed. These representations collapse the temporal component in such a way that no explicit temporal analysis or sequence matching is needed. Statistical descriptions are then computed using moment-based features and dimensionality reduction techniques. For these tests, we used 7 Hu moments, which are invariant to scale and translation. Principal Components Analysis is used to reduce the dimensionality of this representation. The system is trained using different subjects performing a set of examples of every action to be recognized. Given these samples, K-nearest neighbor, Gaussian, and Gaussian mixture classifiers are used to recognize new actions. Experiments are conducted using instances of eight human actions (i.e., eight classes) performed by seven different subjects. Comparisons of the performance of these classifiers under different conditions are analyzed and reported. Our main goals are to test this dimensionality-reduced representation of actions, and more importantly to use this representation to compare the advantages of different classification approaches in this recognition task.
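To make the two motion representations concrete, the sketch below builds them from a short frame sequence. The abstract does not give the update rule, so this assumes the commonly used formulation in which a pixel's history value is reset to a duration `tau` whenever motion is detected and otherwise decays by one per frame; the function name `motion_templates`, the frame-differencing motion detector, and the `threshold` parameter are illustrative assumptions, not the authors' implementation.

```python
def motion_templates(frames, tau, threshold=0.1):
    """Return (MEI, MHI) for a list of equally sized grayscale frames.

    frames: list of 2-D lists of floats (one per video frame).
    tau: assumed temporal window, in frames, over which motion is remembered.
    threshold: assumed minimum absolute frame difference that counts as motion.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    mhi = [[0.0] * cols for _ in range(rows)]
    for prev, curr in zip(frames, frames[1:]):
        for r in range(rows):
            for c in range(cols):
                if abs(curr[r][c] - prev[r][c]) > threshold:
                    # Motion here: stamp the pixel with the full duration.
                    mhi[r][c] = float(tau)
                else:
                    # No motion: let the older motion fade by one step.
                    mhi[r][c] = max(mhi[r][c] - 1.0, 0.0)
    # The energy image marks every pixel that moved within the window.
    mei = [[value > 0 for value in row] for row in mhi]
    return mei, mhi


# A bright pixel sweeping left to right across a 1x3 image, then stopping.
frames = [[[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]], [[0, 0, 1]]]
mei, mhi = motion_templates(frames, tau=3)
```

Because both outputs are single images, moment-based features such as Hu moments can be computed on them directly, which is why no explicit sequence matching is required downstream.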

Relevance:

30.00%

Publisher:

Summary:

One of the most vexing questions facing researchers interested in the World Wide Web is why users often experience long delays in document retrieval. The Internet's size, complexity, and continued growth make this a difficult question to answer. We describe the Wide Area Web Measurement project (WAWM) which uses an infrastructure distributed across the Internet to study Web performance. The infrastructure enables simultaneous measurements of Web client performance, network performance and Web server performance. The infrastructure uses a Web traffic generator to create representative workloads on servers, and both active and passive tools to measure performance characteristics. Initial results based on a prototype installation of the infrastructure are presented in this paper.

Relevance:

30.00%

Publisher:

Summary:

An improved method for deformable shape-based image indexing and retrieval is described. A pre-computed index tree is used to improve the speed of our previously reported on-line model fitting method; simple shape features are used as keys in a pre-generated index tree of model instances. In addition, a coarse to fine indexing scheme is used at different levels of the tree to further improve speed while maintaining matching accuracy. Experimental results show that the speedup is significant, while accuracy of shape-based indexing is maintained. A method for shape population-based retrieval is also described. The method allows query formulation based on the population distributions of shapes in each image. Results of population-based image queries for a database of blood cell micrographs are shown.

Relevance:

30.00%

Publisher:

Summary:

We propose that a simple, closed-form mathematical expression--the Wedge-Dipole mapping--provides a concise approximation to the full-field, two-dimensional topographic structure of macaque V1, V2, and V3. A single map function, which we term a map complex, acts as a simultaneous descriptor of all three areas. Quantitative estimation of the Wedge-Dipole parameters is provided via 2DG data of central-field V1 topography and a publicly available data set of full-field macaque V1 and V2 topography. Good quantitative agreement is obtained between the data and the model presented here. The increasing importance of fMRI-based brain imaging motivates the development of more sophisticated two-dimensional models of cortical visuotopy, in contrast to the one-dimensional approximations that have been in common use. One reason is that topography has traditionally supplied an important aspect of "ground truth", or validation, for brain imaging, suggesting that further development of high-resolution fMRI will be facilitated by this data analysis. In addition, several important insights into the nature of cortical topography follow from this work. The presence of anisotropy in cortical magnification factor is shown to follow mathematically from the shared boundary conditions at the V1-V2 and V2-V3 borders, and therefore may not causally follow from the existence of columnar systems in these areas, as is widely assumed. An application of the Wedge-Dipole model to localizing aspects of visual processing to specific cortical areas--extending previous work in correlating V1 cortical magnification factor to retinal anatomy or visual psychophysics data--is briefly discussed.

Relevance:

30.00%

Publisher:

Summary:

This article describes neural network models for adaptive control of arm movement trajectories during visually guided reaching and, more generally, a framework for unsupervised real-time error-based learning. The models clarify how a child, or untrained robot, can learn to reach for objects that it sees. Piaget has provided basic insights with his concept of a circular reaction: As an infant makes internally generated movements of its hand, the eyes automatically follow this motion. A transformation is learned between the visual representation of hand position and the motor representation of hand position. Learning of this transformation eventually enables the child to accurately reach for visually detected targets. Grossberg and Kuperstein have shown how the eye movement system can use visual error signals to correct movement parameters via cerebellar learning. Here it is shown how endogenously generated arm movements lead to adaptive tuning of arm control parameters. These movements also activate the target position representations that are used to learn the visuo-motor transformation that controls visually guided reaching. The AVITE model presented here is an adaptive neural circuit based on the Vector Integration to Endpoint (VITE) model for arm and speech trajectory generation of Bullock and Grossberg. In the VITE model, a Target Position Command (TPC) represents the location of the desired target. The Present Position Command (PPC) encodes the present hand-arm configuration. The Difference Vector (DV) population continuously computes the difference between the PPC and the TPC. A speed-controlling GO signal multiplies DV output. The PPC integrates the (DV)·(GO) product and generates an outflow command to the arm. Integration at the PPC continues at a rate dependent on GO signal size until the DV reaches zero, at which time the PPC equals the TPC. The AVITE model explains how self-consistent TPC and PPC coordinates are autonomously generated and learned.
Learning of AVITE parameters is regulated by activation of a self-regulating Endogenous Random Generator (ERG) of training vectors. Each vector is integrated at the PPC, giving rise to a movement command. The generation of each vector induces a complementary postural phase during which ERG output stops and learning occurs. Then a new vector is generated and the cycle is repeated. This cyclic, biphasic behavior is controlled by a specialized gated dipole circuit. ERG output autonomously stops in such a way that, across trials, a broad sample of workspace target positions is generated. When the ERG shuts off, a modulator gate opens, copying the PPC into the TPC. Learning of a transformation from TPC to PPC occurs using the DV as an error signal that is zeroed due to learning. This learning scheme is called a Vector Associative Map, or VAM. The VAM model is a general-purpose device for autonomous real-time error-based learning and performance of associative maps. The DV stage serves the dual function of reading out new TPCs during performance and reading in new adaptive weights during learning, without a disruption of real-time operation. VAMs thus provide an on-line unsupervised alternative to the off-line properties of supervised error-correction learning algorithms. VAMs and VAM cascades for learning motor-to-motor and spatial-to-motor maps are described. VAM models and Adaptive Resonance Theory (ART) models exhibit complementary matching, learning, and performance properties that together provide a foundation for designing a total sensory-cognitive and cognitive-motor autonomous system.
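The VITE loop described in this summary (DV as the difference between TPC and PPC, scaled by GO and integrated at the PPC) can be illustrated with a minimal numerical sketch. This is a deliberately simplified reading of the text, assuming an instantaneous DV, a constant GO signal, and forward Euler integration; the function name `simulate_vite` and the parameters `dt` and `steps` are illustrative choices, and the adaptive AVITE/VAM learning stage is not modeled.

```python
def simulate_vite(target, start, go=1.0, dt=0.01, steps=2000):
    """Integrate a simplified VITE loop and return the final PPC.

    target: TPC, the desired position (list of floats).
    start: initial PPC, the present position (list of floats).
    go: assumed constant GO signal that scales movement speed.
    dt, steps: assumed Euler step size and number of integration steps.
    """
    ppc = list(start)
    for _ in range(steps):
        # DV: difference between target and present position commands.
        dv = [t - p for t, p in zip(target, ppc)]
        # PPC integrates the (DV)*(GO) product.
        ppc = [p + dt * go * d for p, d in zip(ppc, dv)]
    return ppc


# The PPC approaches the TPC; a larger GO signal gets there sooner.
final_position = simulate_vite(target=[1.0, -2.0], start=[0.0, 0.0])
```

In this reduced form the DV shrinks toward zero as the PPC approaches the TPC, matching the summary's statement that integration continues until the DV reaches zero.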