931 results for Learning objects


Relevance: 30.00%

Abstract:

This thesis presents a learning-based approach for detecting classes of objects and patterns with variable image appearance but highly predictable image boundaries. It consists of two parts. In part one, we introduce our object and pattern detection approach using a concrete human face detection example. The approach first builds a distribution-based model of the target pattern class in an appropriate feature space to describe the target's variable image appearance. It then learns from examples a similarity measure for matching new patterns against the distribution-based target model. The approach makes few assumptions about the target pattern class and should therefore be fairly general, as long as the target class has predictable image boundaries. Because our object and pattern detection approach is very much learning-based, how well a system eventually performs depends heavily on the quality of training examples it receives. The second part of this thesis looks at how one can select high-quality examples for function approximation learning tasks. We propose an active learning formulation for function approximation, and show, for three specific approximation function classes, that the active example selection strategy learns its target with fewer data samples than random sampling. We then simplify the original active learning formulation, and show how it leads to a tractable example selection paradigm, suitable for use in many object and pattern detection problems.

Relevance: 30.00%

Abstract:

BoostMap is a recently proposed method for efficient approximate nearest neighbor retrieval in arbitrary non-Euclidean spaces with computationally expensive and possibly non-metric distance measures. Database and query objects are embedded into a Euclidean space, in which similarities can be rapidly measured using a weighted Manhattan distance. The key idea is formulating embedding construction as a machine learning task, where AdaBoost is used to combine simple, 1D embeddings into a multidimensional embedding that preserves a large amount of the proximity structure of the original space. This paper demonstrates that, using the machine learning formulation of BoostMap, we can optimize embeddings for indexing and classification, in ways that are not possible with existing alternatives for constructive embeddings, and without additional costs in retrieval time. First, we show how to construct embeddings that are query-sensitive, in the sense that they yield a different distance measure for different queries, so as to improve nearest neighbor retrieval accuracy for each query. Second, we show how to optimize embeddings for nearest neighbor classification tasks, by tuning them to approximate a parameter space distance measure, instead of the original feature-based distance measure.
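
The embedding-and-distance idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the BoostMap implementation: the function names are hypothetical, edit distance merely stands in for an expensive distance measure, each 1D embedding is assumed to map an object to its distance from a fixed reference object, and the weights are supplied by hand. In BoostMap the 1D embeddings and their weights are selected by AdaBoost, which is not shown here.

```python
def edit_distance(a, b):
    """Levenshtein distance; a stand-in for an expensive distance measure."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def embed(x, references, dist):
    """Stack 1D embeddings: coordinate i is the distance from x to references[i]."""
    return [dist(x, r) for r in references]


def weighted_manhattan(u, v, weights):
    """Weighted L1 distance between two embedded vectors."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, u, v))


def approximate_neighbors(query, database, references, weights, dist, k):
    """Rank database objects by weighted L1 distance in the embedded space."""
    q = embed(query, references, dist)
    # In practice the database embeddings would be precomputed offline.
    ranked = sorted(
        database,
        key=lambda x: weighted_manhattan(q, embed(x, references, dist), weights),
    )
    return ranked[:k]
```

At query time only `len(references)` expensive distances are computed for the query; all remaining comparisons use the cheap weighted Manhattan distance.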

Relevance: 30.00%

Abstract:

Object detection and recognition are important problems in computer vision. The challenges of these problems come from the presence of noise, background clutter, large within-class variations of the object class and limited training data. In addition, the computational complexity of the recognition process is also a concern in practice. In this thesis, we propose one approach to handle the problem of detecting an object class that exhibits large within-class variations, and a second approach to speed up the classification processes. In the first approach, we show that foreground-background classification (detection) and within-class classification of the foreground class (pose estimation) can be jointly solved using a multiplicative form of two kernel functions. One kernel measures similarity for foreground-background classification. The other kernel accounts for latent factors that control within-class variation and implicitly enables feature sharing among foreground training samples. For applications where explicit parameterization of the within-class states is unavailable, a nonparametric formulation of the kernel can be constructed with a proper foreground distance/similarity measure. Detector training is accomplished via standard Support Vector Machine learning. The resulting detectors are tuned to specific variations in the foreground class. They also serve to evaluate hypotheses of the foreground state. When the image masks for foreground objects are provided in training, the detectors can also produce object segmentation. Methods for generating a representative sample set of detectors are proposed that can enable efficient detection and tracking. In addition, because individual detectors verify hypotheses of foreground state, they can also be incorporated in a tracking-by-detection framework to recover foreground state in image sequences.
To run the detectors efficiently at the online stage, an input-sensitive speedup strategy is proposed to select the most relevant detectors quickly. The proposed approach is tested on data sets of human hands, vehicles and human faces. On all data sets, the proposed approach achieves improved detection accuracy over the best competing approaches. In the second part of the thesis, we formulate a filter-and-refine scheme to speed up recognition processes. The binary outputs of the weak classifiers in a boosted detector are used to identify a small number of candidate foreground state hypotheses quickly via Hamming distance or weighted Hamming distance. The approach is evaluated in three applications: face recognition on the face recognition grand challenge version 2 data set, hand shape detection and parameter estimation on a hand data set, and vehicle detection and estimation of the view angle on a multi-pose vehicle data set. On all data sets, our approach is at least five times faster than simply evaluating all foreground state hypotheses with virtually no loss in classification accuracy.
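
The filter-and-refine scheme from the second part of this abstract can be sketched as follows. This is a minimal illustration under assumptions: the function and parameter names are hypothetical, each hypothesis is assumed to carry a precomputed binary code of weak-classifier outputs, and `exact_score` stands in for the full (expensive) evaluation of a hypothesis. The weighted Hamming variant mentioned in the abstract is not shown.

```python
def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def filter_and_refine(query_code, hypotheses, exact_score, k):
    """Pick the best hypothesis while scoring only k candidates exactly.

    hypotheses  -- list of (name, binary_code) pairs
    exact_score -- expensive scoring function, called on candidate names only
    k           -- number of candidates kept by the filter step
    """
    # Filter: keep the k hypotheses whose codes are closest to the query code.
    candidates = sorted(hypotheses, key=lambda h: hamming(query_code, h[1]))[:k]
    # Refine: run the expensive evaluation only on the surviving candidates.
    return max(candidates, key=lambda h: exact_score(h[0]))
```

The speedup comes from calling `exact_score` on `k` hypotheses instead of all of them; the result is approximate, because the true best hypothesis can be discarded by the filter step.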

Relevance: 30.00%

Abstract:

Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time, finding nearest neighbors accurately and efficiently can be challenging, especially when the database contains a large number of objects, and when the underlying distance measure is computationally expensive. This thesis proposes new methods for improving the efficiency and accuracy of nearest neighbor retrieval and classification in spaces with computationally expensive distance measures. The proposed methods are domain-independent, and can be applied in arbitrary spaces, including non-Euclidean and non-metric spaces. In this thesis particular emphasis is given to computer vision applications related to object and shape recognition, where expensive non-Euclidean distance measures are often needed to achieve high accuracy. The first contribution of this thesis is the BoostMap algorithm for embedding arbitrary spaces into a vector space with a computationally efficient distance measure. Using this approach, an approximate set of nearest neighbors can be retrieved efficiently - often orders of magnitude faster than retrieval using the exact distance measure in the original space. The BoostMap algorithm has two key distinguishing features with respect to existing embedding methods. First, embedding construction explicitly maximizes the amount of nearest neighbor information preserved by the embedding. Second, embedding construction is treated as a machine learning problem, in contrast to existing methods that are based on geometric considerations. 
The second contribution is a method for constructing query-sensitive distance measures for the purposes of nearest neighbor retrieval and classification. In high-dimensional spaces, query-sensitive distance measures allow for automatic selection of the dimensions that are the most informative for each specific query object. It is shown theoretically and experimentally that query-sensitivity increases the modeling power of embeddings, allowing embeddings to capture a larger amount of the nearest neighbor structure of the original space. The third contribution is a method for speeding up nearest neighbor classification by combining multiple embedding-based nearest neighbor classifiers in a cascade. In a cascade, computationally efficient classifiers are used to quickly classify easy cases, and classifiers that are more computationally expensive and also more accurate are only applied to objects that are harder to classify. An interesting property of the proposed cascade method is that, under certain conditions, classification time actually decreases as the size of the database increases, a behavior that is in stark contrast to the behavior of typical nearest neighbor classification systems. The proposed methods are evaluated experimentally in several different applications: hand shape recognition, off-line character recognition, online character recognition, and efficient retrieval of time series. In all datasets, the proposed methods lead to significant improvements in accuracy and efficiency compared to existing state-of-the-art methods. In some datasets, the general-purpose methods introduced in this thesis even outperform domain-specific methods that have been custom-designed for such datasets.
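
The cascade idea in the third contribution can be sketched as follows. This is a minimal illustration under assumptions: the abstract does not say how a stage decides that a case is "easy", so the per-stage confidence threshold used here is hypothetical, as are all the names.

```python
def cascade_classify(x, stages, final_classifier):
    """Classify x with a cascade of increasingly expensive classifiers.

    stages           -- list of (classify, threshold) pairs, cheapest first;
                        classify(x) returns (label, confidence)
    final_classifier -- most accurate and most expensive classifier,
                        final_classifier(x) returns a label
    """
    for classify, threshold in stages:
        label, confidence = classify(x)
        if confidence >= threshold:
            # Easy case: a cheap stage is confident enough, so stop early.
            return label
    # Hard case: no cheap stage was confident; use the expensive classifier.
    return final_classifier(x)
```

Only objects that every cheap stage is unsure about reach the expensive classifier, so average classification time depends on how many inputs are resolved early.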

Relevance: 30.00%

Abstract:

How do reactive and planned behaviors interact in real time? How are sequences of such behaviors released at appropriate times during autonomous navigation to realize valued goals? Controllers for both animals and mobile robots, or animats, need reactive mechanisms for exploration, and learned plans to reach goal objects once an environment becomes familiar. The SOVEREIGN (Self-Organizing, Vision, Expectation, Recognition, Emotion, Intelligent, Goal-oriented Navigation) animat model embodies these capabilities, and is tested in a 3D virtual reality environment. SOVEREIGN includes several interacting subsystems which model complementary properties of cortical What and Where processing streams and which clarify similarities between mechanisms for navigation and arm movement control. As the animat explores an environment, visual inputs are processed by networks that are sensitive to visual form and motion in the What and Where streams, respectively. Position-invariant and size-invariant recognition categories are learned by real-time incremental learning in the What stream. Estimates of target position relative to the animat are computed in the Where stream, and can activate approach movements toward the target. Motion cues from animat locomotion can elicit head-orienting movements to bring a new target into view. Approach and orienting movements are alternately performed during animat navigation. Cumulative estimates of each movement are derived from interacting proprioceptive and visual cues. Movement sequences are stored within a motor working memory. Sequences of visual categories are stored in a sensory working memory. These working memories trigger learning of sensory and motor sequence categories, or plans, which together control planned movements. Predictively effective chunk combinations are selectively enhanced via reinforcement learning when the animat is rewarded.
Selected planning chunks effect a gradual transition from variable reactive exploratory movements to efficient goal-oriented planned movement sequences. Volitional signals gate interactions between model subsystems and the release of overt behaviors. The model can control different motor sequences under different motivational states and learns more efficient sequences to rewarded goals as exploration proceeds.

Relevance: 30.00%

Abstract:

Do humans and animals learn exemplars or prototypes when they categorize objects and events in the world? How are different degrees of abstraction realized through learning by neurons in inferotemporal and prefrontal cortex? How do top-down expectations influence the course of learning? Thirty related human cognitive experiments (the 5-4 category structure) have been used to test competing views in the prototype-exemplar debate. In these experiments, during the test phase, subjects unlearn in a characteristic way items that they had learned to categorize perfectly in the training phase. Many cognitive models do not describe how an individual learns or forgets such categories through time. Adaptive Resonance Theory (ART) neural models provide such a description, and also clarify both psychological and neurobiological data. Matching of bottom-up signals with learned top-down expectations plays a key role in ART model learning. Here, an ART model is used to learn incrementally in response to 5-4 category structure stimuli. Simulation results agree with experimental data, achieving perfect categorization in training and a good match to the pattern of errors exhibited by human subjects in the testing phase. These results show how the model learns both prototypes and certain exemplars in the training phase. ART prototypes are, however, unlike the ones posited in the traditional prototype-exemplar debate. Rather, they are critical patterns of features to which a subject learns to pay attention based on past predictive success and the order in which exemplars are experienced. Perturbations of old memories by newly arriving test items generate a performance curve that closely matches the performance pattern of human subjects. The model also clarifies exemplar-based accounts of data concerning amnesia.

Relevance: 30.00%

Abstract:

How do humans use predictive contextual information to facilitate visual search? How are consistently paired scenic objects and positions learned and used to more efficiently guide search in familiar scenes? For example, a certain combination of objects can define a context for a kitchen and trigger a more efficient search for a typical object, such as a sink, in that context. A neural model, ARTSCENE Search, is developed to illustrate the neural mechanisms of such memory-based contextual learning and guidance, and to explain challenging behavioral data on positive/negative, spatial/object, and local/distant global cueing effects during visual search. The model proposes how global scene layout at a first glance rapidly forms a hypothesis about the target location. This hypothesis is then incrementally refined by enhancing target-like objects in space as a scene is scanned with saccadic eye movements. The model clarifies the functional roles of neuroanatomical, neurophysiological, and neuroimaging data in visual search for a desired goal object. In particular, the model simulates the interactive dynamics of spatial and object contextual cueing in the cortical What and Where streams starting from early visual areas through medial temporal lobe to prefrontal cortex. After learning, model dorsolateral prefrontal cortical cells (area 46) prime possible target locations in posterior parietal cortex based on goal-modulated percepts of spatial scene gist represented in parahippocampal cortex, whereas model ventral prefrontal cortical cells (area 47/12) prime possible target object representations in inferior temporal cortex based on the history of viewed objects represented in perirhinal cortex. The model hereby predicts how the cortical What and Where streams cooperate during scene perception, learning, and memory to accumulate evidence over time to drive efficient visual search of familiar scenes.

Relevance: 30.00%

Abstract:

This paper introduces ART-EMAP, a neural architecture that uses spatial and temporal evidence accumulation to extend the capabilities of fuzzy ARTMAP. ART-EMAP combines supervised and unsupervised learning and a medium-term memory process to accomplish stable pattern category recognition in a noisy input environment. The ART-EMAP system features (i) distributed pattern registration at a view category field; (ii) a decision criterion for mapping between view and object categories which can delay categorization of ambiguous objects and trigger an evidence accumulation process when faced with a low confidence prediction; (iii) a process that accumulates evidence at a medium-term memory (MTM) field; and (iv) an unsupervised learning algorithm to fine-tune performance after a limited initial period of supervised network training. ART-EMAP dynamics are illustrated with a benchmark simulation example. Applications include 3-D object recognition from a series of ambiguous 2-D views.

Relevance: 30.00%

Abstract:

In a constantly changing world, humans are adapted to alternate routinely between attending to familiar objects and testing hypotheses about novel ones. We can rapidly learn to recognize and name novel objects without unselectively disrupting our memories of familiar ones. We can notice fine details that differentiate nearly identical objects and generalize across broad classes of dissimilar objects. This chapter describes a class of self-organizing neural network architectures, called ARTMAP, that are capable of fast, yet stable, on-line recognition learning, hypothesis testing, and naming in response to an arbitrary stream of input patterns (Carpenter, Grossberg, Markuzon, Reynolds, and Rosen, 1992; Carpenter, Grossberg, and Reynolds, 1991). The intrinsic stability of ARTMAP allows the system to learn incrementally for an unlimited period of time. System stability properties can be traced to the structure of its learned memories, which encode clusters of attended features into its recognition categories, rather than slow averages of category inputs. The level of detail in the learned attentional focus is determined moment-by-moment, depending on predictive success: an error due to over-generalization automatically focuses attention on additional input details, enough of which are learned in a new recognition category that the predictive error will not be repeated. An ARTMAP system creates an evolving map between a variable number of learned categories that compress one feature space (e.g., visual features) to learned categories of another feature space (e.g., auditory features). Input vectors can be either binary or analog. Computational properties of the networks enable them to perform significantly better in benchmark studies than alternative machine learning, genetic algorithm, or neural network models. Some of the critical problems that challenge and constrain any such autonomous learning system will next be illustrated.
Design principles that work together to solve these problems are then outlined. These principles are realized in the ARTMAP architecture, which is specified as an algorithm. Finally, ARTMAP dynamics are illustrated by means of a series of benchmark simulations.

Relevance: 30.00%

Abstract:

The concepts of declarative memory and procedural memory have been used to distinguish two basic types of learning. A neural network model suggests how such memory processes work together as recognition learning, reinforcement learning, and sensory-motor learning take place during adaptive behaviors. To coordinate these processes, the hippocampal formation and cerebellum each contain circuits that learn to adaptively time their outputs. Within the model, hippocampal timing helps to maintain attention on motivationally salient goal objects during variable task-related delays, and cerebellar timing controls the release of conditioned responses. This property is part of the model's description of how cognitive-emotional interactions focus attention on motivationally valued cues, and how this process breaks down due to hippocampal ablation. The model suggests that the hippocampal mechanisms that help to rapidly draw attention to salient cues could prematurely release motor commands were not the release of these commands adaptively timed by the cerebellum. The model hippocampal system modulates cortical recognition learning without actually encoding the representational information that the cortex encodes. These properties avoid the difficulties faced by several models that propose a direct hippocampal role in recognition learning. Learning within the model hippocampal system controls adaptive timing and spatial orientation. Model properties hereby clarify how hippocampal ablations cause amnesic symptoms and difficulties with tasks which combine task delays, novelty detection, and attention towards goal objects amid distractions. When these model recognition, reinforcement, sensory-motor, and timing processes work together, they suggest how the brain can accomplish conditioning of multiple sensory events to delayed rewards, as during serial compound conditioning.

Relevance: 30.00%

Abstract:

This article describes neural network models for adaptive control of arm movement trajectories during visually guided reaching and, more generally, a framework for unsupervised real-time error-based learning. The models clarify how a child, or untrained robot, can learn to reach for objects that it sees. Piaget has provided basic insights with his concept of a circular reaction: As an infant makes internally generated movements of its hand, the eyes automatically follow this motion. A transformation is learned between the visual representation of hand position and the motor representation of hand position. Learning of this transformation eventually enables the child to accurately reach for visually detected targets. Grossberg and Kuperstein have shown how the eye movement system can use visual error signals to correct movement parameters via cerebellar learning. Here it is shown how endogenously generated arm movements lead to adaptive tuning of arm control parameters. These movements also activate the target position representations that are used to learn the visuo-motor transformation that controls visually guided reaching. The AVITE model presented here is an adaptive neural circuit based on the Vector Integration to Endpoint (VITE) model for arm and speech trajectory generation of Bullock and Grossberg. In the VITE model, a Target Position Command (TPC) represents the location of the desired target. The Present Position Command (PPC) encodes the present hand-arm configuration. The Difference Vector (DV) population continuously computes the difference between the PPC and the TPC. A speed-controlling GO signal multiplies DV output. The PPC integrates the (DV)·(GO) product and generates an outflow command to the arm. Integration at the PPC continues at a rate dependent on GO signal size until the DV reaches zero, at which time the PPC equals the TPC. The AVITE model explains how self-consistent TPC and PPC coordinates are autonomously generated and learned.
Learning of AVITE parameters is regulated by activation of a self-regulating Endogenous Random Generator (ERG) of training vectors. Each vector is integrated at the PPC, giving rise to a movement command. The generation of each vector induces a complementary postural phase during which ERG output stops and learning occurs. Then a new vector is generated and the cycle is repeated. This cyclic, biphasic behavior is controlled by a specialized gated dipole circuit. ERG output autonomously stops in such a way that, across trials, a broad sample of workspace target positions is generated. When the ERG shuts off, a modulator gate opens, copying the PPC into the TPC. Learning of a transformation from TPC to PPC occurs using the DV as an error signal that is zeroed due to learning. This learning scheme is called a Vector Associative Map, or VAM. The VAM model is a general-purpose device for autonomous real-time error-based learning and performance of associative maps. The DV stage serves the dual function of reading out new TPCs during performance and reading in new adaptive weights during learning, without a disruption of real-time operation. VAMs thus provide an on-line unsupervised alternative to the off-line properties of supervised error-correction learning algorithms. VAMs and VAM cascades for learning motor-to-motor and spatial-to-motor maps are described. VAM models and Adaptive Resonance Theory (ART) models exhibit complementary matching, learning, and performance properties that together provide a foundation for designing a total sensory-cognitive and cognitive-motor autonomous system.
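
The VITE trajectory dynamics described in this abstract (the DV is the difference between TPC and PPC, and the PPC integrates the DV·GO product) can be sketched numerically. This is a simplified illustration under assumptions: the DV is computed instantaneously as TPC minus PPC, the GO signal is a constant, integration uses a plain Euler step, and all names and default values are hypothetical. The AVITE learning stage is not shown.

```python
def simulate_vite(tpc, ppc0, go, dt=0.01, steps=1000):
    """Euler integration of d(PPC)/dt = GO * DV, with DV = TPC - PPC.

    tpc   -- target position command (list of floats)
    ppc0  -- initial present position command (list of floats)
    go    -- constant, non-negative speed-controlling GO signal
    Returns the list of PPC vectors visited, starting with ppc0.
    """
    ppc = list(ppc0)
    trajectory = [list(ppc)]
    for _ in range(steps):
        dv = [t - p for t, p in zip(tpc, ppc)]             # difference vector
        ppc = [p + dt * go * d for p, d in zip(ppc, dv)]   # PPC integrates DV*GO
        trajectory.append(list(ppc))
    return trajectory
```

With a positive GO signal the PPC approaches the TPC as the DV shrinks toward zero, and a larger GO signal gives a faster approach; with `go=0` the PPC does not move, matching the role of GO as a gate on movement.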

Relevance: 30.00%

Abstract:

The processes by which humans and other primates learn to recognize objects have been the subject of many models. Processes such as learning, categorization, attention, memory search, expectation, and novelty detection work together at different stages to realize object recognition. In this article, Gail Carpenter and Stephen Grossberg describe one such model class (Adaptive Resonance Theory, ART) and discuss how its structure and function might relate to known neurological learning and memory processes, such as how inferotemporal cortex can recognize both specialized and abstract information, and how medial temporal amnesia may be caused by lesions in the hippocampal formation. The model also suggests how hippocampal and inferotemporal processing may be linked during recognition learning.

Relevance: 30.00%

Abstract:

This paper examines the applicability of an immersive virtual reality (VR) system to the process of organizational learning in a manufacturing context. The work focuses on the extent to which realism has to be represented in a simulated product build scenario in order to give the user an effective learning experience for an assembly task. Current technologies allow the visualization and manipulation of objects in VR systems, but physical behaviors such as contact between objects and the effects of gravity are not commonly represented in off-the-shelf simulation solutions, and the computational power required to facilitate these functions remains a challenge. This work demonstrates how physical behaviors can be coded and represented through the development of more effective mechanisms for the computer aided design (CAD) and VR interface.

Relevance: 30.00%

Abstract:

The increasing complexity of the Information Society and the Bologna Declaration in the European context have led Higher Education (HE) institutions to revise their curricula with respect to the adoption of new strategies for teaching, learning and evaluation. There has also been a growing use of eLearning and of blended learning (bLearning) in HE, since these modes of training seem to be a very convenient option for lifelong learning. In this context, quality is taken as an essential goal in the development of the European Space for Education and Training, where HE institutions compete among themselves, and where evaluation is determinant as a promoter of this quality. Considering the problems summarized above, the research developed, based on four published scientific papers, was intended to answer a set of research questions related to the evaluation of bLearning contexts in HE. The study used diverse techniques and instruments (questionnaires, document analysis, and observation mediated technologies) spanning two methodological approaches: i) a study of a descriptive and exploratory nature and ii) case studies of bLearning modules. In the first approach an evaluation model for bLearning courses was developed, where we collected and analyzed, at a national level, the opinions of teachers with bLearning experience about the model dimensions. The case studies presented are postgraduate curricular units, where bLearning teaching, learning and evaluation strategies were explored and evaluated, namely peer assessment. The main contributions of the first approach are: the process of questioning around the evaluation of bLearning courses, namely the quality assurance criteria for bLearning, as well as the model developed, providing a framework of theoretical, methodological and empirical elements that can be adapted in similar contexts.
From the case studies emerged the evaluation guidelines and the data collection instruments developed, which serve to disseminate evaluation “best practices” that may be useful for other units in similar contexts. Regarding the recommendations about the evaluation of teaching of bLearning courses we emphasize: the use of versatile evaluation objects; evaluation throughout the process and not just at the end; and the involvement of multiple evaluators, including students (whose feedback is essential to monitor the quality of teaching and learning). From the case studies we highlight: the need for discussion of evaluation frameworks to explore, and the consequent increase in the transparency of the evaluation process; the increased interaction between groups; and peer assessment as a strategy to promote active and autonomous learning. In addition to the contributions and recommendations for practice and research in the area of evaluation in bLearning contexts in HE, listed above, this study also yielded useful guidelines regarding educational evaluation in bLearning contexts, in order to improve the quality of teaching, learning and evaluation in such contexts.

Relevance: 30.00%

Abstract:

Autistic adults with limited speech and additional learning disabilities are people whose perceptions and interactions with their environment are unique, but whose experiences are under-explored in design research. This PhD by Practice investigates how people with autism experience their home environment through a collaboration with the autism charity Kingwood Trust, which gave the designer extensive access to a community of autistic adults that it supports. The PhD reflects upon a neurotypical designer’s approach to working with autistic adults to investigate their relationship with the environment. It identifies and develops collaborative design tools that allow autistic adults, their support staff and family members to be involved. The PhD presents three design studies that explore a person’s interaction with three environmental contexts of the home, i.e., garden, everyday objects and interiors. A strengths-based rather than a deficit-based approach is adopted, which draws upon an autistic person’s sensory preferences, special interests and action capabilities to unravel what discomfort and delight might mean for an autistic person; this approach is translated into three design solutions to enhance their experience at home. By working beyond the boundaries of a neurotypical culture, the PhD bridges the autistic and neurotypical worlds of experience and draws upon what the mainstream design field can learn from designing with autistic people with additional learning disabilities. It also provides insights into the subjective experiences of people who have very different ways of seeing, doing and being in the environment.