300 resultados para EIA
Resumo:
An improved method for deformable shape-based image indexing and retrieval is described. A pre-computed index tree is used to improve the speed of our previously reported on-line model fitting method; simple shape features are used as keys in a pre-generated index tree of model instances. In addition, a coarse to fine indexing scheme is used at different levels of the tree to further improve speed while maintaining matching accuracy. Experimental results show that the speedup is significant, while accuracy of shape-based indexing is maintained. A method for shape population-based retrieval is also described. The method allows query formulation based on the population distributions of shapes in each image. Results of population-based image queries for a database of blood cell micrographs are shown.
Resumo:
The majority of the traffic (bytes) flowing over the Internet today have been attributed to the Transmission Control Protocol (TCP). This strong presence of TCP has recently spurred further investigations into its congestion avoidance mechanism and its effect on the performance of short and long data transfers. At the same time, the rising interest in enhancing Internet services while keeping the implementation cost low has led to several service-differentiation proposals. In such service-differentiation architectures, much of the complexity is placed only in access routers, which classify and mark packets from different flows. Core routers can then allocate enough resources to each class of packets so as to satisfy delivery requirements, such as predictable (consistent) and fair service. In this paper, we investigate the interaction among short and long TCP flows, and how TCP service can be improved by employing a low-cost service-differentiation scheme. Through control-theoretic arguments and extensive simulations, we show the utility of isolating TCP flows into two classes based on their lifetime/size, namely one class of short flows and another of long flows. With such class-based isolation, short and long TCP flows have separate service queues at routers. This protects each class of flows from the other as they possess different characteristics, such as burstiness of arrivals/departures and congestion/sending window dynamics. We show the benefits of isolation, in terms of better predictability and fairness, over traditional shared queueing systems with both tail-drop and Random-Early-Drop (RED) packet dropping policies. The proposed class-based isolation of TCP flows has several advantages: (1) the implementation cost is low since it only requires core routers to maintain per-class (rather than per-flow) state; (2) it promises to be an effective traffic engineering tool for improved predictability and fairness for both short and long TCP flows; and (3) stringent delay requirements of short interactive transfers can be met by increasing the amount of resources allocated to the class of short flows.
Resumo:
A system for recovering 3D hand pose from monocular color sequences is proposed. The system employs a non-linear supervised learning framework, the specialized mappings architecture (SMA), to map image features to likely 3D hand poses. The SMA's fundamental components are a set of specialized forward mapping functions, and a single feedback matching function. The forward functions are estimated directly from training data, which in our case are examples of hand joint configurations and their corresponding visual features. The joint angle data in the training set is obtained via a CyberGlove, a glove with 22 sensors that monitor the angular motions of the palm and fingers. In training, the visual features are generated using a computer graphics module that renders the hand from arbitrary viewpoints given the 22 joint angles. We test our system both on synthetic sequences and on sequences taken with a color camera. The system automatically detects and tracks both hands of the user, calculates the appropriate features, and estimates the 3D hand joint angles from those features. Results are encouraging given the complexity of the task.
Resumo:
An improved method for deformable shape-based image segmentation is described. Image regions are merged together and/or split apart, based on their agreement with an a priori distribution on the global deformation parameters for a shape template. The quality of a candidate region merging is evaluated by a cost measure that includes: homogeneity of image properties within the combined region, degree of overlap with a deformed shape model, and a deformation likelihood term. Perceptually-motivated criteria are used to determine where/how to split regions, based on the local shape properties of the region group's bounding contour. A globally consistent interpretation is determined in part by the minimum description length principle. Experiments show that the model-based splitting strategy yields a significant improvement in segmention over a method that uses merging alone.
Resumo:
A method called "SymbolDesign" is proposed that can be used to design user-centered interfaces for pen-based input devices. It can also extend the functionality of pointer input devices such as the traditional computer mouse or the Camera Mouse, a camera-based computer interface. Users can create their own interfaces by choosing single-stroke movement patterns that are convenient to draw with the selected input device and by mapping them to a desired set of commands. A pattern could be the trace of a moving finger detected with the Camera Mouse or a symbol drawn with an optical pen. The core of the SymbolDesign system is a dynamically created classifier, in the current implementation an artificial neural network. The architecture of the neural network automatically adjusts according to the complexity of the classification task. In experiments, subjects used the SymbolDesign method to design and test the interfaces they created, for example, to browse the web. The experiments demonstrated good recognition accuracy and responsiveness of the user interfaces. The method provided an easily-designed and easily-used computer input mechanism for people without physical limitations, and, with some modifications, has the potential to become a computer access tool for people with severe paralysis.
Resumo:
Nearest neighbor classification using shape context can yield highly accurate results in a number of recognition problems. Unfortunately, the approach can be too slow for practical applications, and thus approximation strategies are needed to make shape context practical. This paper proposes a method for efficient and accurate nearest neighbor classification in non-Euclidean spaces, such as the space induced by the shape context measure. First, a method is introduced for constructing a Euclidean embedding that is optimized for nearest neighbor classification accuracy. Using that embedding, multiple approximations of the underlying non-Euclidean similarity measure are obtained, at different levels of accuracy and efficiency. The approximations are automatically combined to form a cascade classifier, which applies the slower approximations only to the hardest cases. Unlike typical cascade-of-classifiers approaches, that are applied to binary classification problems, our method constructs a cascade for a multiclass problem. Experiments with a standard shape data set indicate that a two-to-three order of magnitude speed up is gained over the standard shape context classifier, with minimal losses in classification accuracy.
Resumo:
A human-computer interface (HCI) system designed for use by people with severe disabilities is presented. People that are severely paralyzed or afflicted with diseases such as ALS (Lou Gehrig's disease) or multiple sclerosis are unable to move or control any parts of their bodies except for their eyes. The system presented here detects the user's eye blinks and analyzes the pattern and duration of the blinks, using them to provide input to the computer in the form of a mouse click. After the automatic initialization of the system occurs from the processing of the user's involuntary eye blinks in the first few seconds of use, the eye is tracked in real time using correlation with an online template. If the user's depth changes significantly or rapid head movement occurs, the system is automatically reinitialized. There are no lighting requirements nor offline templates needed for the proper functioning of the system. The system works with inexpensive USB cameras and runs at a frame rate of 30 frames per second. Extensive experiments were conducted to determine both the system's accuracy in classifying voluntary and involuntary blinks, as well as the system's fitness in varying environment conditions, such as alternative camera placements and different lighting conditions. These experiments on eight test subjects yielded an overall detection accuracy of 95.3%.
Resumo:
The heterogeneity and open nature of network systems make analysis of compositions of components quite challenging, making the design and implementation of robust network services largely inaccessible to the average programmer. We propose the development of a novel type system and practical type spaces which reflect simplified representations of the results and conclusions which can be derived from complex compositional theories in more accessible ways, essentially allowing the system architect or programmer to be exposed only to the inputs and output of compositional analysis without having to be familiar with the ins and outs of its internals. Toward this end we present the TRAFFIC (Typed Representation and Analysis of Flows For Interoperability Checks) framework, a simple flow-composition and typing language with corresponding type system. We then discuss and demonstrate the expressive power of a type space for TRAFFIC derived from the network calculus, allowing us to reason about and infer such properties as data arrival, transit, and loss rates in large composite network applications.
Resumo:
This paper formally defines the operational semantic for TRAFFIC, a specification language for flow composition applications proposed in BUCS-TR-2005-014, and presents a type system based on desired safety assurance. We provide proofs on reduction (weak-confluence, strong-normalization and unique normal form), on soundness and completeness of type system with respect to reduction, and on equivalence classes of flow specifications. Finally, we provide a pseudo-code listing of a syntax-directed type checking algorithm implementing rules of the type system capable of inferring the type of a closed flow specification.
Resumo:
Nearest neighbor classifiers are simple to implement, yet they can model complex non-parametric distributions, and provide state-of-the-art recognition accuracy in OCR databases. At the same time, they may be too slow for practical character recognition, especially when they rely on similarity measures that require computationally expensive pairwise alignments between characters. This paper proposes an efficient method for computing an approximate similarity score between two characters based on their exact alignment to a small number of prototypes. The proposed method is applied to both online and offline character recognition, where similarity is based on widely used and computationally expensive alignment methods, i.e., Dynamic Time Warping and the Hungarian method respectively. In both cases significant recognition speedup is obtained at the expense of only a minor increase in recognition error.
Resumo:
Gesture spotting is the challenging task of locating the start and end frames of the video stream that correspond to a gesture of interest, while at the same time rejecting non-gesture motion patterns. This paper proposes a new gesture spotting and recognition algorithm that is based on the continuous dynamic programming (CDP) algorithm, and runs in real-time. To make gesture spotting efficient a pruning method is proposed that allows the system to evaluate a relatively small number of hypotheses compared to CDP. Pruning is implemented by a set of model-dependent classifiers, that are learned from training examples. To make gesture spotting more accurate a subgesture reasoning process is proposed that models the fact that some gesture models can falsely match parts of other longer gestures. In our experiments, the proposed method with pruning and subgesture modeling is an order of magnitude faster and 18% more accurate compared to the original CDP algorithm.
Resumo:
This paper proposes a method for detecting shapes of variable structure in images with clutter. The term "variable structure" means that some shape parts can be repeated an arbitrary number of times, some parts can be optional, and some parts can have several alternative appearances. The particular variation of the shape structure that occurs in a given image is not known a priori. Existing computer vision methods, including deformable model methods, were not designed to detect shapes of variable structure; they may only be used to detect shapes that can be decomposed into a fixed, a priori known, number of parts. The proposed method can handle both variations in shape structure and variations in the appearance of individual shape parts. A new class of shape models is introduced, called Hidden State Shape Models, that can naturally represent shapes of variable structure. A detection algorithm is described that finds instances of such shapes in images with large amounts of clutter by finding globally optimal correspondences between image features and shape models. Experiments with real images demonstrate that our method can localize plant branches that consist of an a priori unknown number of leaves and can detect hands more accurately than a hand detector based on the chamfer distance.
Resumo:
Nearest neighbor search is commonly employed in face recognition but it does not scale well to large dataset sizes. A strategy to combine rejection classifiers into a cascade for face identification is proposed in this paper. A rejection classifier for a pair of classes is defined to reject at least one of the classes with high confidence. These rejection classifiers are able to share discriminants in feature space and at the same time have high confidence in the rejection decision. In the face identification problem, it is possible that a pair of known individual faces are very dissimilar. It is very unlikely that both of them are close to an unknown face in the feature space. Hence, only one of them needs to be considered. Using a cascade structure of rejection classifiers, the scope of nearest neighbor search can be reduced significantly. Experiments on Face Recognition Grand Challenge (FRGC) version 1 data demonstrate that the proposed method achieves significant speed up and an accuracy comparable with the brute force Nearest Neighbor method. In addition, a graph cut based clustering technique is employed to demonstrate that the pairwise separability of these rejection classifiers is capable of semantic grouping.
Resumo:
Accurate head tilt detection has a large potential to aid people with disabilities in the use of human-computer interfaces and provide universal access to communication software. We show how it can be utilized to tab through links on a web page or control a video game with head motions. It may also be useful as a correction method for currently available video-based assistive technology that requires upright facial poses. Few of the existing computer vision methods that detect head rotations in and out of the image plane with reasonable accuracy can operate within the context of a real-time communication interface because the computational expense that they incur is too great. Our method uses a variety of metrics to obtain a robust head tilt estimate without incurring the computational cost of previous methods. Our system runs in real time on a computer with a 2.53 GHz processor, 256 MB of RAM and an inexpensive webcam, using only 55% of the processor cycles.
Resumo:
Facial features play an important role in expressing grammatical information in signed languages, including American Sign Language(ASL). Gestures such as raising or furrowing the eyebrows are key indicators of constructions such as yes-no questions. Periodic head movements (nods and shakes) are also an essential part of the expression of syntactic information, such as negation (associated with a side-to-side headshake). Therefore, identification of these facial gestures is essential to sign language recognition. One problem with detection of such grammatical indicators is occlusion recovery. If the signer's hand blocks his/her eyebrows during production of a sign, it becomes difficult to track the eyebrows. We have developed a system to detect such grammatical markers in ASL that recovers promptly from occlusion. Our system detects and tracks evolving templates of facial features, which are based on an anthropometric face model, and interprets the geometric relationships of these templates to identify grammatical markers. It was tested on a variety of ASL sentences signed by various Deaf native signers and detected facial gestures used to express grammatical information, such as raised and furrowed eyebrows as well as headshakes.