12 results for net radiation estimation

in Boston University Digital Common


Relevance:

30.00%

Abstract:

Estimation of 3D hand pose is useful in many gesture recognition applications, ranging from human-computer interaction to automated recognition of sign languages. In this paper, 3D hand pose estimation is treated as a database indexing problem. Given an input image of a hand, the most similar images in a large database of hand images are retrieved. The hand pose parameters of the retrieved images are used as estimates for the hand pose in the input image. Lipschitz embeddings of edge images into a Euclidean space are used to improve the efficiency of database retrieval. In order to achieve interactive retrieval times, similarity queries are initially performed in this Euclidean space. The paper describes ongoing work that focuses on how to best choose reference images, in order to improve retrieval accuracy.
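The embedding-then-retrieve idea in this abstract can be illustrated with a minimal sketch. A Lipschitz embedding maps an object to the vector of its distances to a fixed set of reference objects; candidates are first ranked by cheap Euclidean distance in that vector space and only the best few are re-ranked with the expensive original distance. Everything concrete here is an assumption for illustration: 2D points stand in for edge images, a Manhattan distance stands in for the costly image distance, and the names `lipschitz_embed` and `filter_and_refine` are hypothetical, not taken from the paper.

```python
import math
from typing import Callable, List, Sequence


def lipschitz_embed(x, references: Sequence, dist: Callable) -> List[float]:
    """Map x to the vector of its distances to each reference object."""
    return [dist(x, r) for r in references]


def filter_and_refine(query, database: Sequence, references: Sequence,
                      dist: Callable, keep: int) -> int:
    """Return the index of the best database match for `query`.

    Filter step: rank all items by Euclidean distance between embeddings.
    Refine step: re-rank the `keep` best candidates with the true distance.
    """
    q_vec = lipschitz_embed(query, references, dist)
    # In a real system the database embeddings would be precomputed offline.
    db_vecs = [lipschitz_embed(x, references, dist) for x in database]
    ranked = sorted(range(len(database)),
                    key=lambda i: math.dist(q_vec, db_vecs[i]))
    candidates = ranked[:keep]
    return min(candidates, key=lambda i: dist(query, database[i]))


def manhattan(a, b):
    """Toy stand-in for an expensive image-to-image distance (assumption)."""
    return sum(abs(p - q) for p, q in zip(a, b))


database = [(0, 0), (5, 5), (9, 1), (2, 8)]
references = [(0, 0), (10, 0)]  # hypothetical reference objects
best = filter_and_refine((4, 6), database, references, manhattan, keep=2)
```

At query time the expensive distance is evaluated only against the references and the `keep` surviving candidates, which is where the speed-up comes from. How well the filter step preserves the true ranking depends on the choice of reference objects, the question the abstract identifies as ongoing work.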

Relevance:

30.00%

Abstract:

A novel technique to detect and localize periodic movements in video is presented. The distinctive feature of the technique is that it requires neither feature tracking nor object segmentation. Intensity patterns along linear sample paths in space-time are used to estimate the period of object motion in a given sequence of frames. Sample paths are obtained by connecting (in space-time) sample points from regions of high motion magnitude in the first and last frames. Oscillations in intensity values are induced at time instants when an object intersects the sample path. The locations of peaks in intensity are determined by parameters of both cyclic object motion and orientation of the sample path with respect to object motion. The information about peaks is used in a least squares framework to obtain an initial estimate of these parameters. The estimate is further refined using the full intensity profile. The best estimate for the period of cyclic object motion is obtained by looking for consensus among estimates from many sample paths. The proposed technique is evaluated with synthetic videos where ground truth is known, and with American Sign Language videos where the goal is to detect periodic hand motions.

Relevance:

30.00%

Abstract:

Ongoing work towards appearance-based 3D hand pose estimation from a single image is presented. A large database of synthetic hand views is generated using a 3D hand model and computer graphics. The views display different hand shapes as seen from arbitrary viewpoints. Each synthetic view is automatically labeled with parameters describing its hand shape and viewing parameters. Given an input image, the system retrieves the most similar database views, and uses the shape and viewing parameters of those views as candidate estimates for the parameters of the input image. Preliminary results are presented, in which appearance-based similarity is defined in terms of the chamfer distance between edge images.
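As a rough illustration of the similarity measure named in this abstract, the sketch below computes a chamfer distance between two edge images, each represented as a list of edge-pixel coordinates. The abstract does not say which variant is used, so the symmetric form here (the sum of the two directed mean nearest-neighbour distances) is an assumption, as are the function names. The brute-force nearest-neighbour search is for clarity only; practical systems usually precompute a distance transform.

```python
import math
from typing import Sequence, Tuple

Pixel = Tuple[float, float]


def directed_chamfer(a: Sequence[Pixel], b: Sequence[Pixel]) -> float:
    """Mean distance from each edge pixel in `a` to its nearest pixel in `b`."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)


def chamfer(a: Sequence[Pixel], b: Sequence[Pixel]) -> float:
    """Symmetric chamfer distance: sum of both directed distances (assumed variant)."""
    return directed_chamfer(a, b) + directed_chamfer(b, a)


edges_a = [(0, 0), (0, 1)]
edges_b = [(0, 0), (0, 3)]
score = chamfer(edges_a, edges_b)  # 0.5 (a to b) + 1.0 (b to a)
```

A lower score means the two edge maps align more closely; identical edge sets score zero.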

Relevance:

30.00%

Abstract:

An appearance-based framework for 3D hand shape classification and simultaneous camera viewpoint estimation is presented. Given an input image of a segmented hand, the most similar matches from a large database of synthetic hand images are retrieved. The ground truth labels of those matches, containing hand shape and camera viewpoint information, are returned by the system as estimates for the input image. Database retrieval is done hierarchically, by first quickly rejecting the vast majority of all database views, and then ranking the remaining candidates in order of similarity to the input. Four different similarity measures are employed, based on edge location, edge orientation, finger location and geometric moments.

Relevance:

30.00%

Abstract:

A fundamental task of vision systems is to infer the state of the world given some form of visual observations. From a computational perspective, this often involves facing an ill-posed problem; e.g., information is lost via projection of the 3D world into a 2D image. Solution of an ill-posed problem requires additional information, usually provided as a model of the underlying process. It is important that the model be both computationally feasible and theoretically well-founded. In this thesis, a probabilistic, nonlinear supervised computational learning model is proposed: the Specialized Mappings Architecture (SMA). The SMA framework is demonstrated in a computer vision system that can estimate the articulated pose parameters of a human body or human hands, given images obtained via one or more uncalibrated cameras. The SMA consists of several specialized forward mapping functions that are estimated automatically from training data, and a possibly known feedback function. Each specialized function maps certain domains of the input space (e.g., image features) onto the output space (e.g., articulated body parameters). A probabilistic model for the architecture is first formalized. Solutions to key algorithmic problems are then derived: simultaneous learning of the specialized domains along with the mapping functions, as well as performing inference given inputs and a feedback function. The SMA employs a variant of the Expectation-Maximization algorithm and approximate inference. The approach allows the use of alternative conditional independence assumptions for learning and inference, which are derived from a forward model and a feedback model. Experimental validation of the proposed approach is conducted in the task of estimating articulated body pose from image silhouettes. Accuracy and stability of the SMA framework are tested using artificial data sets, as well as synthetic and real video sequences of human bodies and hands.

Relevance:

30.00%

Abstract:

Object detection is challenging when the object class exhibits large within-class variations. In this work, we show that foreground-background classification (detection) and within-class classification of the foreground class (pose estimation) can be jointly learned in a multiplicative form of two kernel functions. One kernel measures similarity for foreground-background classification. The other kernel accounts for latent factors that control within-class variation and implicitly enables feature sharing among foreground training samples. Detector training can be accomplished via standard SVM learning. The resulting detectors are tuned to specific variations in the foreground class. They also serve to evaluate hypotheses of the foreground state. When the foreground parameters are provided in training, the detectors can also produce parameter estimates. When the foreground object masks are provided in training, the detectors can also produce object segmentation. The advantages of our method over past methods are demonstrated on data sets of human hands and vehicles.

Relevance:

30.00%

Abstract:

We study the problem of preprocessing a large graph so that point-to-point shortest-path queries can be answered very fast. Computing shortest paths is a well-studied problem, but exact algorithms do not scale to huge graphs encountered on the web, social networks, and other applications. In this paper we focus on approximate methods for distance estimation, in particular using landmark-based distance indexing. This approach involves selecting a subset of nodes as landmarks and computing (offline) the distances from each node in the graph to those landmarks. At runtime, when the distance between a pair of nodes is needed, we can estimate it quickly by combining the precomputed distances of the two nodes to the landmarks. We prove that selecting the optimal set of landmarks is an NP-hard problem, and thus heuristic solutions need to be employed. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the suggested techniques is tested experimentally using five different real-world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach in the literature, which considers selecting landmarks at random. Finally, we study applications of our method in two problems arising naturally in large-scale networks, namely, social search and community detection.
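The offline and runtime phases described in this abstract can be sketched as follows. The abstract does not spell out how the precomputed distances are combined, so this sketch assumes the standard triangle-inequality bounds: for a landmark l, |d(u,l) - d(l,v)| <= d(u,v) <= d(u,l) + d(l,v). The graph is assumed to be unweighted, undirected, and connected, and the function names are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, List, Sequence, Tuple

Graph = Dict[Hashable, List[Hashable]]


def bfs_distances(graph: Graph, source: Hashable) -> Dict[Hashable, int]:
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def build_index(graph: Graph, landmarks: Sequence[Hashable]):
    """Offline phase: one BFS per landmark."""
    return {l: bfs_distances(graph, l) for l in landmarks}


def estimate_bounds(index, u: Hashable, v: Hashable) -> Tuple[int, int]:
    """Runtime phase: (lower, upper) bounds on d(u, v) from the index alone."""
    upper = min(d[u] + d[v] for d in index.values())
    lower = max(abs(d[u] - d[v]) for d in index.values())
    return lower, upper


# Path graph 0 - 1 - 2 - 3 - 4 with the middle node as the only landmark.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
index = build_index(path, landmarks=[2])
bounds_far = estimate_bounds(index, 0, 4)   # landmark lies on the shortest path
bounds_near = estimate_bounds(index, 0, 1)  # landmark lies off the shortest path
```

The upper bound is exact whenever some landmark lies on a shortest path between the two query nodes, which gives some intuition for why the abstract's selection strategies favour central nodes.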

Relevance:

30.00%

Abstract:

A novel approach for real-time skin segmentation in video sequences is described. The approach enables reliable skin segmentation despite wide variation in illumination during tracking. An explicit second order Markov model is used to predict evolution of the skin color (HSV) histogram over time. Histograms are dynamically updated based on feedback from the current segmentation and based on predictions of the Markov model. The evolution of the skin color distribution at each frame is parameterized by translation, scaling and rotation in color space. Consequent changes in geometric parameterization of the distribution are propagated by warping and re-sampling the histogram. The parameters of the discrete-time dynamic Markov model are estimated using Maximum Likelihood Estimation, and also evolve over time. Quantitative evaluation of the method was conducted on labeled ground-truth video sequences taken from popular movies.

Relevance:

30.00%

Abstract:

A specialized formulation of Azarbayejani and Pentland's framework for recursive recovery of motion, structure and focal length from feature correspondences tracked through an image sequence is presented. The specialized formulation addresses the case where all tracked points lie on a plane. This planarity constraint reduces the dimension of the original state vector, and consequently the number of feature points needed to estimate the state. Experiments with synthetic data and real imagery illustrate the system performance. The experiments confirm that the specialized formulation provides improved accuracy, stability to observation noise, and rate of convergence in estimation for the case where the tracked points lie on a plane.

Relevance:

30.00%

Abstract:

Standard structure from motion algorithms recover 3D structure of points. If a surface representation is desired, for example a piece-wise planar representation, then a two-step procedure typically follows: in the first step, the plane membership of points is determined manually, and in a subsequent step, planes are fitted to the sets of points thus determined, and their parameters are recovered. This paper presents an approach for automatically segmenting planar structures from a sequence of images, and simultaneously estimating their parameters. In the proposed approach the plane membership of points is determined automatically, and the planar structure parameters are recovered directly in the algorithm rather than indirectly in a post-processing stage. Simulated and real experimental results show the efficacy of this approach.
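For context, the second step of the conventional two-step procedure (fitting a plane to points already assigned to it) can be sketched as an ordinary least-squares fit. This illustrates only that baseline post-processing step, not the paper's simultaneous segmentation and estimation. The parameterization z = a*x + b*y + c is an assumption chosen for brevity; it cannot represent planes parallel to the z axis, and `fit_plane` is a hypothetical name.

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def det3(m: List[List[float]]) -> float:
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points: Sequence[Point3]) -> Tuple[float, float, float]:
    """Least-squares (a, b, c) for z = a*x + b*y + c via the normal equations."""
    n = float(len(points))
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    a_mat = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(a_mat)
    if abs(d) < 1e-12:
        raise ValueError("points are degenerate (e.g. collinear in x-y)")
    coeffs = []
    for col in range(3):  # Cramer's rule: swap in the right-hand side per column
        m = [row[:] for row in a_mat]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    return coeffs[0], coeffs[1], coeffs[2]


# Noise-free samples from the plane z = 2x - y + 3.
samples = [(0, 0, 3), (1, 0, 5), (0, 1, 2), (1, 1, 4), (2, 1, 6)]
a, b, c = fit_plane(samples)
```

In the two-step pipeline a fit like this is run once per manually labelled point set, which is the manual stage the abstract's approach aims to remove.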

Relevance:

30.00%

Abstract:

A learning-based framework is proposed for estimating human body pose from a single image. Given a differentiable function that maps from pose space to image feature space, the goal is to invert the process: estimate the pose given only image features. The inversion is an ill-posed problem as the inverse mapping is a one-to-many process. Hence multiple solutions exist, and it is desirable to restrict the solution space to a smaller subset of feasible solutions. For example, not all human body poses are feasible due to anthropometric constraints. Since the space of feasible solutions may not admit a closed form description, the proposed framework seeks to exploit machine learning techniques to learn an approximation that is smoothly parameterized over such a space. One such technique is Gaussian Process Latent Variable Modelling. Scaled conjugate gradient is then used to find the best matching pose in the space of feasible solutions when given an input image. The formulation allows easy incorporation of various constraints, e.g. temporal consistency and anthropometric constraints. The performance of the proposed approach is evaluated in the task of upper-body pose estimation from silhouettes and compared with the Specialized Mapping Architecture. The estimation accuracy of the Specialized Mapping Architecture is at least one standard deviation worse than the proposed approach in the experiments with synthetic data. In experiments with real video of humans performing gestures, the proposed approach produces qualitatively better estimation results.

Relevance:

30.00%

Abstract:

A nonparametric probability estimation procedure using the fuzzy ARTMAP neural network is described here. Because the procedure does not make a priori assumptions about underlying probability distributions, it yields accurate estimates on a wide variety of prediction tasks. Fuzzy ARTMAP is used to perform probability estimation in two different modes. In a 'slow-learning' mode, input-output associations change slowly, with the strength of each association computing a conditional probability estimate. In 'max-nodes' mode, a fixed number of categories are coded during an initial fast learning interval, and weights are then tuned by slow learning. Simulations illustrate system performance on tasks in which various numbers of clusters in the set of input vectors are mapped to a given class.