7 resultados para Detection and representation
em Massachusetts Institute of Technology
Resumo:
Freehand sketching is both a natural and crucial part of design, yet is unsupported by current design automation software. We are working to combine the flexibility and ease of use of paper and pencil with the processing power of a computer to produce a design environment that feels as natural as paper, yet is considerably smarter. One of the most basic steps in accomplishing this is converting the original digitized pen strokes in the sketch into the intended geometric objects using feature point detection and approximation. We demonstrate how multiple sources of information can be combined for feature detection in strokes and apply this technique using two approaches to signal processing, one using simple average based thresholding and a second using scale space.
Resumo:
In this report, a face recognition system that is capable of detecting and recognizing frontal and rotated faces was developed. Two face recognition methods focusing on the aspect of pose invariance are presented and evaluated - the whole face approach and the component-based approach. The main challenge of this project is to develop a system that is able to identify faces under different viewing angles in realtime. The development of such a system will enhance the capability and robustness of current face recognition technology. The whole-face approach recognizes faces by classifying a single feature vector consisting of the gray values of the whole face image. The component-based approach first locates the facial components and extracts them. These components are normalized and combined into a single feature vector for classification. The Support Vector Machine (SVM) is used as the classifier for both approaches. Extensive tests with respect to the robustness against pose changes are performed on a database that includes faces rotated up to about 40 degrees in depth. The component-based approach clearly outperforms the whole-face approach on all tests. Although this approach isproven to be more reliable, it is still too slow for real-time applications. That is the reason why a real-time face recognition system using the whole-face approach is implemented to recognize people in color video sequences.
Resumo:
This paper describes the successful implementation of a prototype software application that independently and proactively detects whether a mobile phone is lost or misused. When the mobile phone is detected as being lost or misused, the application takes steps to mitigate the impact of loss and to gather evidence. The goal is to aid in the recovery of the mobile phone. The prototype works regardless of the cellular infrastructure the mobile phone is operating in and makes minimum demands on the owner of the mobile phone. The prototype was developed on Nokia 6600 mobile phones that run Symbian Operating System 7.0s. Development was done using Nokia’s Series 60 Developer’s Platform 2.0.
Resumo:
This paper describes a trainable system capable of tracking faces and facialsfeatures like eyes and nostrils and estimating basic mouth features such as sdegrees of openness and smile in real time. In developing this system, we have addressed the twin issues of image representation and algorithms for learning. We have used the invariance properties of image representations based on Haar wavelets to robustly capture various facial features. Similarly, unlike previous approaches this system is entirely trained using examples and does not rely on a priori (hand-crafted) models of facial features based on optical flow or facial musculature. The system works in several stages that begin with face detection, followed by localization of facial features and estimation of mouth parameters. Each of these stages is formulated as a problem in supervised learning from examples. We apply the new and robust technique of support vector machines (SVM) for classification in the stage of skin segmentation, face detection and eye detection. Estimation of mouth parameters is modeled as a regression from a sparse subset of coefficients (basis functions) of an overcomplete dictionary of Haar wavelets.
Resumo:
All intelligence relies on search --- for example, the search for an intelligent agent's next action. Search is only likely to succeed in resource-bounded agents if they have already been biased towards finding the right answer. In artificial agents, the primary source of bias is engineering. This dissertation describes an approach, Behavior-Oriented Design (BOD) for engineering complex agents. A complex agent is one that must arbitrate between potentially conflicting goals or behaviors. Behavior-oriented design builds on work in behavior-based and hybrid architectures for agents, and the object oriented approach to software engineering. The primary contributions of this dissertation are: 1.The BOD architecture: a modular architecture with each module providing specialized representations to facilitate learning. This includes one pre-specified module and representation for action selection or behavior arbitration. The specialized representation underlying BOD action selection is Parallel-rooted, Ordered, Slip-stack Hierarchical (POSH) reactive plans. 2.The BOD development process: an iterative process that alternately scales the agent's capabilities then optimizes the agent for simplicity, exploiting tradeoffs between the component representations. This ongoing process for controlling complexity not only provides bias for the behaving agent, but also facilitates its maintenance and extendibility. The secondary contributions of this dissertation include two implementations of POSH action selection, a procedure for identifying useful idioms in agent architectures and using them to distribute knowledge across agent paradigms, several examples of applying BOD idioms to established architectures, an analysis and comparison of the attributes and design trends of a large number of agent architectures, a comparison of biological (particularly mammalian) intelligence to artificial agent architectures, a novel model of primate transitive inference, and many other examples of BOD agents and BOD development.
Resumo:
In this text, we present two stereo-based head tracking techniques along with a fast 3D model acquisition system. The first tracking technique is a robust implementation of stereo-based head tracking designed for interactive environments with uncontrolled lighting. We integrate fast face detection and drift reduction algorithms with a gradient-based stereo rigid motion tracking technique. Our system can automatically segment and track a user's head under large rotation and illumination variations. Precision and usability of this approach are compared with previous tracking methods for cursor control and target selection in both desktop and interactive room environments. The second tracking technique is designed to improve the robustness of head pose tracking for fast movements. Our iterative hybrid tracker combines constraints from the ICP (Iterative Closest Point) algorithm and normal flow constraint. This new technique is more precise for small movements and noisy depth than ICP alone, and more robust for large movements than the normal flow constraint alone. We present experiments which test the accuracy of our approach on sequences of real and synthetic stereo images. The 3D model acquisition system we present quickly aligns intensity and depth images, and reconstructs a textured 3D mesh. 3D views are registered with shape alignment based on our iterative hybrid tracker. We reconstruct the 3D model using a new Cubic Ray Projection merging algorithm which takes advantage of a novel data structure: the linked voxel space. We present experiments to test the accuracy of our approach on 3D face modelling using real-time stereo images.
Resumo:
Most psychophysical studies of object recognition have focussed on the recognition and representation of individual objects subjects had previously explicitely been trained on. Correspondingly, modeling studies have often employed a 'grandmother'-type representation where the objects to be recognized were represented by individual units. However, objects in the natural world are commonly members of a class containing a number of visually similar objects, such as faces, for which physiology studies have provided support for a representation based on a sparse population code, which permits generalization from the learned exemplars to novel objects of that class. In this paper, we present results from psychophysical and modeling studies intended to investigate object recognition in natural ('continuous') object classes. In two experiments, subjects were trained to perform subordinate level discrimination in a continuous object class - images of computer-rendered cars - created using a 3D morphing system. By comparing the recognition performance of trained and untrained subjects we could estimate the effects of viewpoint-specific training and infer properties of the object class-specific representation learned as a result of training. We then compared the experimental findings to simulations, building on our recently presented HMAX model of object recognition in cortex, to investigate the computational properties of a population-based object class representation as outlined above. We find experimental evidence, supported by modeling results, that training builds a viewpoint- and class-specific representation that supplements a pre-existing repre-sentation with lower shape discriminability but possibly greater viewpoint invariance.