14 results for Hand tools

in Boston University Digital Common


Relevance:

60.00%

Abstract:

Modal matching is a new method for establishing correspondences and computing canonical descriptions. The method is based on the idea of describing objects in terms of generalized symmetries, as defined by each object's eigenmodes. The resulting modal description is used for object recognition and categorization, where shape similarities are expressed as the amounts of modal deformation energy needed to align the two objects. In general, modes provide a global-to-local ordering of shape deformation and thus allow for selecting which types of deformations are used in object alignment and comparison. In contrast to previous techniques, which required correspondence to be computed with an initial or prototype shape, modal matching utilizes a new type of finite element formulation that allows for an object's eigenmodes to be computed directly from available image information. This improved formulation provides greater generality and accuracy, and is applicable to data of any dimensionality. Correspondence results with 2-D contour and point feature data are shown, and recognition experiments with 2-D images of hand tools and airplanes are described.

Relevance:

20.00%

Abstract:

A method is proposed that can generate a ranked list of plausible three-dimensional hand configurations that best match an input image. Hand pose estimation is formulated as an image database indexing problem, where the closest matches for an input hand image are retrieved from a large database of synthetic hand images. In contrast to previous approaches, the system can function in the presence of clutter, thanks to two novel clutter-tolerant indexing methods. First, a computationally efficient approximation of the image-to-model chamfer distance is obtained by embedding binary edge images into a high-dimensional Euclidean space. Second, a general-purpose, probabilistic line matching method identifies those line segment correspondences between model and input images that are the least likely to have occurred by chance. The performance of this clutter-tolerant approach is demonstrated in quantitative experiments with hundreds of real hand images.

Relevance:

20.00%

Abstract:

Estimation of 3D hand pose is useful in many gesture recognition applications, ranging from human-computer interaction to automated recognition of sign languages. In this paper, 3D hand pose estimation is treated as a database indexing problem. Given an input image of a hand, the most similar images in a large database of hand images are retrieved. The hand pose parameters of the retrieved images are used as estimates for the hand pose in the input image. Lipschitz embeddings of edge images into a Euclidean space are used to improve the efficiency of database retrieval. In order to achieve interactive retrieval times, similarity queries are initially performed in this Euclidean space. The paper describes ongoing work that focuses on how to best choose reference images, in order to improve retrieval accuracy.
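The retrieval scheme in the abstract above can be sketched as a filter-and-refine search. This is only an illustration under stated assumptions: the names `embed`, `l_inf`, and `filter_and_refine`, the parameter `p`, and the choice of the L-infinity norm in the embedded space are hypothetical and are not taken from the paper. A Lipschitz embedding maps each object to its vector of distances to a few reference objects; cheap comparisons between those vectors select a shortlist, and only the shortlist is compared with the exact (expensive) distance.

```python
def embed(x, references, dist):
    """Map x to the vector of its distances to each reference object."""
    return [dist(x, r) for r in references]


def l_inf(u, v):
    """L-infinity distance between two embedded vectors (an assumed choice)."""
    return max(abs(a - b) for a, b in zip(u, v))


def filter_and_refine(query, database, references, dist, p=3):
    """Return the index of the best match among the p candidates
    that are closest to the query in the embedded space."""
    q_vec = embed(query, references, dist)
    # In a real system these vectors would be precomputed offline.
    vectors = [embed(x, references, dist) for x in database]
    # Filter step: rank all database items by the cheap embedded distance.
    shortlist = sorted(range(len(database)),
                       key=lambda i: l_inf(q_vec, vectors[i]))[:p]
    # Refine step: apply the exact distance only to the shortlist.
    return min(shortlist, key=lambda i: dist(query, database[i]))
```

When `dist` is a true metric, the triangle inequality guarantees that the embedded L-infinity distance never exceeds the exact distance. For non-metric measures such as the chamfer distance the embedded distance is only a heuristic approximation, which is one reason the choice of reference objects matters.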

Relevance:

20.00%

Abstract:

A framework for the simultaneous localization and recognition of dynamic hand gestures is proposed. At the core of this framework is a dynamic space-time warping (DSTW) algorithm that aligns a pair of query and model gestures in both space and time. For every frame of the query sequence, feature detectors generate multiple hand region candidates. Dynamic programming is then used to compute both a global matching cost, which is used to recognize the query gesture, and a warping path, which aligns the query and model sequences in time and also finds the best hand candidate region in every query frame. The proposed framework includes translation-invariant recognition of gestures, a desirable property for many HCI systems. The performance of the approach is evaluated on a dataset of hand-signed digits gestured by people wearing short-sleeve shirts, in front of a background containing other non-hand skin-colored objects. The algorithm simultaneously localizes the gesturing hand and recognizes the hand-signed digit. Although DSTW is illustrated in a gesture recognition setting, the proposed algorithm is a general method for matching time series that allows for multiple candidate feature vectors to be extracted at each time step.
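The dynamic programming idea described above can be illustrated with a small sketch. It is not the authors' implementation: the function name `dstw`, the Euclidean local cost, and the exact set of allowed transitions are assumptions made for illustration, and translation invariance is not handled. Each state is a triple (model frame, query frame, candidate), so the recurrence aligns the sequences in time and picks one candidate per query frame.

```python
import math


def dstw(model, query):
    """model: list of feature vectors, one per model frame.
    query: list of frames, each a list of candidate feature vectors.
    Returns (matching_cost, chosen) where chosen[j] is the index of the
    candidate selected for query frame j."""
    m, n = len(model), len(query)
    cost, back = {}, {}
    for i in range(m):
        for j in range(n):
            for k, cand in enumerate(query[j]):
                local = math.dist(model[i], cand)
                if i == 0 and j == 0:
                    cost[i, j, k], back[i, j, k] = local, None
                    continue
                preds = []
                if i > 0:
                    # Staying on the same query frame keeps the same candidate.
                    preds.append((i - 1, j, k))
                if j > 0:
                    # Moving to a new query frame may switch candidates.
                    for kp in range(len(query[j - 1])):
                        preds.append((i, j - 1, kp))
                        if i > 0:
                            preds.append((i - 1, j - 1, kp))
                best = min(preds, key=cost.__getitem__)
                cost[i, j, k] = local + cost[best]
                back[i, j, k] = best
    # The path must end at the last model frame and the last query frame.
    end = min(((m - 1, n - 1, k) for k in range(len(query[-1]))),
              key=cost.__getitem__)
    chosen, state = {}, end
    while state is not None:
        chosen[state[1]] = state[2]
        state = back[state]
    return cost[end], [chosen[j] for j in range(n)]
```

To recognize a gesture with this sketch, one would run `dstw` against each model gesture and pick the model with the lowest matching cost; the returned candidate indices give the hand location in every query frame.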

Relevance:

20.00%

Abstract:

In gesture and sign language video sequences, hand motion tends to be rapid, and hands frequently appear in front of each other or in front of the face. Thus, hand location is often ambiguous, and naive color-based hand tracking is insufficient. To improve tracking accuracy, some methods employ a prediction-update framework, but such methods require careful initialization of model parameters, and tend to drift and lose track in extended sequences. In this paper, a temporal filtering framework for hand tracking is proposed that can initialize and reset itself without human intervention. In each frame, simple features like color and motion residue are exploited to identify multiple candidate hand locations. The temporal filter then uses the Viterbi algorithm to select among the candidates from frame to frame. The resulting tracking system can automatically identify video trajectories of unambiguous hand motion, and detect frames where tracking becomes ambiguous because of occlusions or overlaps. Experiments on video sequences of several hundred frames in duration demonstrate the system's ability to track hands robustly, to detect and handle tracking ambiguities, and to extract the trajectories of unambiguous hand motion.
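The candidate-selection step described above can be sketched with the Viterbi algorithm as follows. This is a minimal illustration, not the paper's system: the name `viterbi_track`, the representation of candidates as (location, observation cost) pairs, and the user-supplied `transition_cost` are all assumptions, and the detection of ambiguous frames is not covered.

```python
def viterbi_track(frames, transition_cost):
    """frames: list of frames, each a list of (location, observation_cost).
    transition_cost(prev_location, location) scores a move between frames.
    Returns the lowest-cost sequence of locations, one per frame."""
    # best[k] = (cumulative cost, path) for paths ending at candidate k.
    best = [(obs_cost, [loc]) for loc, obs_cost in frames[0]]
    for frame in frames[1:]:
        new_best = []
        for loc, obs_cost in frame:
            # Cheapest way to reach this candidate from the previous frame.
            prev_cost, prev_path = min(
                ((c + transition_cost(p[-1], loc), p) for c, p in best),
                key=lambda t: t[0],
            )
            new_best.append((prev_cost + obs_cost, prev_path + [loc]))
        best = new_best
    return min(best, key=lambda t: t[0])[1]
```

Because every frame keeps one best path per candidate, a poor detection in a single frame does not force the tracker to drift: a later frame can still select a path that passed through a different candidate.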

Relevance:

20.00%

Abstract:

One- and two-dimensional cellular automata which are known to be fault-tolerant are very complex. On the other hand, only very simple cellular automata have actually been proven to lack fault-tolerance, i.e., to be mixing. The latter either have large noise probability ε or belong to the small family of two-state nearest-neighbor monotonic rules which includes local majority voting. For a certain simple automaton L called the soldiers rule, this problem has intrigued researchers for the last two decades since L is clearly more robust than local voting: in the absence of noise, L eliminates any finite island of perturbation from an initial configuration of all 0's or all 1's. The same holds for a 4-state monotonic variant of L, K, called two-line voting. We will prove that the probabilistic cellular automata Kε and Lε asymptotically lose all information about their initial state when subject to small, strongly biased noise. The mixing property trivially implies that the systems are ergodic. The finite-time information-retaining quality of a mixing system can be represented by its relaxation time Relax(⋅), which measures the time before the onset of significant information loss. This is known to grow as (1/ε)^c for noisy local voting. The impressive error-correction ability of L has prompted some researchers to conjecture that Relax(Lε) = 2^(c/ε). We prove the tight bound 2^(c1 log^2(1/ε)) < Relax(Lε) < 2^(c2 log^2(1/ε)) for a biased error model. The same holds for Kε. Moreover, the lower bound is independent of the bias assumption. The strong bias assumption makes it possible to apply sparsity/renormalization techniques, the main tools of our investigation, used earlier in the opposite context of proving fault-tolerance.
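The contrast with local voting can be made concrete with a small sketch of noise-free, one-dimensional nearest-neighbor majority voting. The function name `majority_step` and the ring (wrap-around) boundary are assumptions for illustration, and the soldiers rule L itself is not implemented here. The point is that plain majority voting erases an isolated flipped cell but leaves an island of two flipped cells untouched, whereas the abstract states that L eliminates any finite island.

```python
def majority_step(cells):
    """One synchronous step of nearest-neighbor majority voting
    on a ring of 0/1 cells (noise-free case)."""
    n = len(cells)
    return [
        1 if cells[(i - 1) % n] + cells[i] + cells[(i + 1) % n] >= 2 else 0
        for i in range(n)
    ]
```

Applying `majority_step` to `[0, 0, 1, 0, 0, 0]` gives all zeros, while `[0, 0, 1, 1, 0, 0]` is a fixed point of the rule.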

Relevance:

20.00%

Abstract:

Ongoing work towards appearance-based 3D hand pose estimation from a single image is presented. A large database of synthetic hand views is generated using a 3D hand model and computer graphics. The views display different hand shapes as seen from arbitrary viewpoints. Each synthetic view is automatically labeled with parameters describing its hand shape and viewing parameters. Given an input image, the system retrieves the most similar database views, and uses the shape and viewing parameters of those views as candidate estimates for the parameters of the input image. Preliminary results are presented, in which appearance-based similarity is defined in terms of the chamfer distance between edge images.
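The chamfer-based retrieval mentioned above can be sketched on small point sets. This is an illustration under assumptions: edge images are represented as lists of (x, y) edge pixel coordinates, the symmetric distance is taken as the sum of the two directed mean distances (one common convention, not necessarily the one used in the paper), and the names `directed_chamfer`, `chamfer`, and `retrieve` as well as the `label`/`edges` keys are hypothetical.

```python
import math


def directed_chamfer(a, b):
    """Mean distance from each point in a to its nearest point in b."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)


def chamfer(a, b):
    """Symmetric chamfer distance: sum of both directed distances."""
    return directed_chamfer(a, b) + directed_chamfer(b, a)


def retrieve(query_edges, database, k=1):
    """Return the labels of the k database views closest to the query."""
    ranked = sorted(database,
                    key=lambda view: chamfer(query_edges, view["edges"]))
    return [view["label"] for view in ranked[:k]]
```

The brute-force nearest-point search here is quadratic in the number of edge pixels; a practical system would typically precompute a distance transform of each edge image instead.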

Relevance:

20.00%

Abstract:

An appearance-based framework for 3D hand shape classification and simultaneous camera viewpoint estimation is presented. Given an input image of a segmented hand, the most similar matches from a large database of synthetic hand images are retrieved. The ground truth labels of those matches, containing hand shape and camera viewpoint information, are returned by the system as estimates for the input image. Database retrieval is done hierarchically, by first quickly rejecting the vast majority of all database views, and then ranking the remaining candidates in order of similarity to the input. Four different similarity measures are employed, based on edge location, edge orientation, finger location and geometric moments.

Relevance:

20.00%

Abstract:

Locating hands in sign language video is challenging due to a number of factors. Hand appearance varies widely across signers due to anthropometric variations and varying levels of signer proficiency. Video can be captured under varying illumination, camera resolutions, and levels of scene clutter, e.g., high-res video captured in a studio vs. low-res video gathered by a web cam in a user’s home. Moreover, the signers’ clothing varies, e.g., skin-toned clothing vs. contrasting clothing, short-sleeved vs. long-sleeved shirts, etc. In this work, the hand detection problem is addressed in an appearance matching framework. The Histogram of Oriented Gradient (HOG) based matching score function is reformulated to allow non-rigid alignment between pairs of images to account for hand shape variation. The resulting alignment score is used within a Support Vector Machine hand/not-hand classifier for hand detection. The new matching score function yields improved performance (in ROC area and hand detection rate) over the Vocabulary Guided Pyramid Match Kernel (VGPMK) and the traditional, rigid HOG distance on American Sign Language video gestured by expert signers. The proposed match score function is computationally less expensive (for training and testing), has fewer parameters and is less sensitive to parameter settings than VGPMK. The proposed detector works well on test sequences from an inexpert signer in a non-studio setting with cluttered background.

Relevance:

20.00%

Abstract:

In research areas involving mathematical rigor, there are numerous benefits to adopting a formal representation of models and arguments: reusability, automatic evaluation of examples, and verification of consistency and correctness. However, accessibility has not been a priority in the design of formal verification tools that can provide these benefits. In earlier work [30] we attempt to address this broad problem by proposing several specific design criteria organized around the notion of a natural context: the sphere of awareness a working human user maintains of the relevant constructs, arguments, experiences, and background materials necessary to accomplish the task at hand. In this report we evaluate our proposed design criteria by utilizing within the context of novel research a formal reasoning system that is designed according to these criteria. In particular, we consider how the design and capabilities of the formal reasoning system that we employ influence, aid, or hinder our ability to accomplish a formal reasoning task – the assembly of a machine-verifiable proof pertaining to the NetSketch formalism. NetSketch is a tool for the specification of constrained-flow applications and the certification of desirable safety properties imposed thereon. NetSketch is conceived to assist system integrators in two types of activities: modeling and design. It provides capabilities for compositional analysis based on a strongly-typed domain-specific language (DSL) for describing and reasoning about constrained-flow networks and invariants that need to be enforced thereupon. In a companion paper [13] we overview NetSketch, highlight its salient features, and illustrate how it could be used in actual applications. In this paper, we define using a machine-readable syntax major parts of the formal system underlying the operation of NetSketch, along with its semantics and a corresponding notion of validity. 
We then provide a proof of soundness for the formalism that can be partially verified using a lightweight formal reasoning system that simulates natural contexts. A traditional presentation of these definitions and arguments can be found in the full report on the NetSketch formalism [12].

Relevance:

20.00%

Abstract:

In work that involves mathematical rigor, there are numerous benefits to adopting a representation of models and arguments that can be supplied to a formal reasoning or verification system: reusability, automatic evaluation of examples, and verification of consistency and correctness. However, accessibility has not been a priority in the design of formal verification tools that can provide these benefits. In earlier work [Lap09a], we attempt to address this broad problem by proposing several specific design criteria organized around the notion of a natural context: the sphere of awareness a working human user maintains of the relevant constructs, arguments, experiences, and background materials necessary to accomplish the task at hand. This work expands one aspect of the earlier work by considering more extensively an essential capability for any formal reasoning system whose design is oriented around simulating the natural context: native support for a collection of mathematical relations that deal with common constructs in arithmetic and set theory. We provide a formal definition for a context of relations that can be used to both validate and assist formal reasoning activities. We provide a proof that any algorithm that implements this formal structure faithfully will necessarily converge. Finally, we consider the efficiency of an implementation of this formal structure that leverages modular implementations of well-known data structures: balanced search trees and transitive closures of hypergraphs.

Relevance:

20.00%

Abstract:

A system for recovering 3D hand pose from monocular color sequences is proposed. The system employs a non-linear supervised learning framework, the specialized mappings architecture (SMA), to map image features to likely 3D hand poses. The SMA's fundamental components are a set of specialized forward mapping functions, and a single feedback matching function. The forward functions are estimated directly from training data, which in our case are examples of hand joint configurations and their corresponding visual features. The joint angle data in the training set is obtained via a CyberGlove, a glove with 22 sensors that monitor the angular motions of the palm and fingers. In training, the visual features are generated using a computer graphics module that renders the hand from arbitrary viewpoints given the 22 joint angles. We test our system both on synthetic sequences and on sequences taken with a color camera. The system automatically detects and tracks both hands of the user, calculates the appropriate features, and estimates the 3D hand joint angles from those features. Results are encouraging given the complexity of the task.

Relevance:

20.00%

Abstract:

The therapeutic effects of playing music are being recognized increasingly in the field of rehabilitation medicine. People with physical disabilities, however, often do not have the motor dexterity needed to play an instrument. We developed a camera-based human-computer interface called "Music Maker" to provide such people with a means to make music by performing therapeutic exercises. Music Maker uses computer vision techniques to convert the movements of a patient's body part, for example, a finger, hand, or foot, into musical and visual feedback using the open software platform EyesWeb. It can be adjusted to a patient's particular therapeutic needs and provides quantitative tools for monitoring the recovery process and assessing therapeutic outcomes. We tested the potential of Music Maker as a rehabilitation tool with six subjects who responded to or created music in various movement exercises. In these proof-of-concept experiments, Music Maker has performed reliably and shown its promise as a therapeutic device.

Relevance:

20.00%

Abstract:

This paper describes a self-organizing neural model for eye-hand coordination. Called the DIRECT model, it embodies a solution of the classical motor equivalence problem. Motor equivalence computations allow humans and other animals to flexibly employ an arm with more degrees of freedom than the space in which it moves to carry out spatially defined tasks under conditions that may require novel joint configurations. During a motor babbling phase, the model endogenously generates movement commands that activate the correlated visual, spatial, and motor information that are used to learn its internal coordinate transformations. After learning occurs, the model is capable of controlling reaching movements of the arm to prescribed spatial targets using many different combinations of joints. When allowed visual feedback, the model can automatically perform, without additional learning, reaches with tools of variable lengths, with clamped joints, with distortions of visual input by a prism, and with unexpected perturbations. These compensatory computations occur within a single accurate reaching movement. No corrective movements are needed. Blind reaches using internal feedback have also been simulated. The model achieves its competence by transforming visual information about target position and end effector position in 3-D space into a body-centered spatial representation of the direction in 3-D space that the end effector must move to contact the target. The spatial direction vector is adaptively transformed into a motor direction vector, which represents the joint rotations that move the end effector in the desired spatial direction from the present arm configuration. Properties of the model are compared with psychophysical data on human reaching movements, neurophysiological data on the tuning curves of neurons in the monkey motor cortex, and alternative models of movement control.