9 resultados para Invariant tests
em Boston University Digital Common
Resumo:
The performance of different classification approaches is evaluated using a view-based approach for motion representation. The view-based approach uses computer vision and image processing techniques to register and process the video sequence. Two motion representations called Motion Energy Images and Motion History Image are then constructed. These representations collapse the temporal component in a way that no explicit temporal analysis or sequence matching is needed. Statistical descriptions are then computed using moment-based features and dimensionality reduction techniques. For these tests, we used 7 Hu moments, which are invariant to scale and translation. Principal Components Analysis is used to reduce the dimensionality of this representation. The system is trained using different subjects performing a set of examples of every action to be recognized. Given these samples, K-nearest neighbor, Gaussian, and Gaussian mixture classifiers are used to recognize new actions. Experiments are conducted using instances of eight human actions (i.e., eight classes) performed by seven different subjects. Comparisons in the performance among these classifiers under different conditions are analyzed and reported. Our main goals are to test this dimensionality-reduced representation of actions, and more importantly to use this representation to compare the advantages of different classification approaches in this recognition task.
Resumo:
We introduce a view-point invariant representation of moving object trajectories that can be used in video database applications. It is assumed that trajectories lie on a surface that can be locally approximated with a plane. Raw trajectory data is first locally approximated with a cubic spline via least squares fitting. For each sampled point of the obtained curve, a projective invariant feature is computed using a small number of points in its neighborhood. The resulting sequence of invariant features computed along the entire trajectory forms the view invariant descriptor of the trajectory itself. Time parametrization has been exploited to compute cross ratios without ambiguity due to point ordering. Similarity between descriptors of different trajectories is measured with a distance that takes into account the statistical properties of the cross ratio, and its symmetry with respect to the point at infinity. In experiments, an overall correct classification rate of about 95% has been obtained on a dataset of 58 trajectories of players in soccer video, and an overall correct classification rate of about 80% has been obtained on matching partial segments of trajectories collected from two overlapping views of outdoor scenes with moving people and cars.
Resumo:
Air Force Office of Scientific Research (F49620-01-1-0397); National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)
Resumo:
Speech can be understood at widely varying production rates. A working memory is described for short-term storage of temporal lists of input items. The working memory is a cooperative-competitive neural network that automatically adjusts its integration rate, or gain, to generate a short-term memory code for a list that is independent of item presentation rate. Such an invariant working memory model is used to simulate data of Repp (1980) concerning the changes of phonetic category boundaries as a function of their presentation rate. Thus the variability of categorical boundaries can be traced to the temporal in variance of the working memory code.
Resumo:
This paper describes a self-organizing neural network that rapidly learns a body-centered representation of 3-D target positions. This representation remains invariant under head and eye movements, and is a key component of sensory-motor systems for producing motor equivalent reaches to targets (Bullock, Grossberg, and Guenther, 1993).
Resumo:
An extension to the orientational harmonic model is presented as a rotation, translation, and scale invariant representation of geometrical form in biological vision.
Resumo:
The proposed model, called the combinatorial and competitive spatio-temporal memory or CCSTM, provides an elegant solution to the general problem of having to store and recall spatio-temporal patterns in which states or sequences of states can recur in various contexts. For example, fig. 1 shows two state sequences that have a common subsequence, C and D. The CCSTM assumes that any state has a distributed representation as a collection of features. Each feature has an associated competitive module (CM) containing K cells. On any given occurrence of a particular feature, A, exactly one of the cells in CMA will be chosen to represent it. It is the particular set of cells active on the previous time step that determines which cells are chosen to represent instances of their associated features on the current time step. If we assume that typically S features are active in any state then any state has K^S different neural representations. This huge space of possible neural representations of any state is what underlies the model's ability to store and recall numerous context-sensitive state sequences. The purpose of this paper is simply to describe this mechanism.
Resumo:
This article describes two neural network modules that form part of an emerging theory of how adaptive control of goal-directed sensory-motor skills is achieved by humans and other animals. The Vector-Integration-To-Endpoint (VITE) model suggests how synchronous multi-joint trajectories are generated and performed at variable speeds. The Factorization-of-LEngth-and-TEnsion (FLETE) model suggests how outflow movement commands from a VITE model may be performed at variable force levels without a loss of positional accuracy. The invariance of positional control under speed and force rescaling sheds new light upon a familiar strategy of motor skill development: Skill learning begins with performance at low speed and low limb compliance and proceeds to higher speeds and compliances. The VITE model helps to explain many neural and behavioral data about trajectory formation, including data about neural coding within the posterior parietal cortex, motor cortex, and globus pallidus, and behavioral properties such as Woodworth's Law, Fitts Law, peak acceleration as a function of movement amplitude and duration, isotonic arm movement properties before and after arm-deafferentation, central error correction properties of isometric contractions, motor priming without overt action, velocity amplification during target switching, velocity profile invariance across different movement distances, changes in velocity profile asymmetry across different movement durations, staggered onset times for controlling linear trajectories with synchronous offset times, changes in the ratio of maximum to average velocity during discrete versus serial movements, and shared properties of arm and speech articulator movements. The FLETE model provides new insights into how spina-muscular circuits process variable forces without a loss of positional control. These results explicate the size principle of motor neuron recruitment, descending co-contractive compliance signals, Renshaw cells, Ia interneurons, fast automatic reactive control by ascending feedback from muscle spindles, slow adaptive predictive control via cerebellar learning using muscle spindle error signals to train adaptive movement gains, fractured somatotopy in the opponent organization of cerebellar learning, adaptive compensation for variable moment-arms, and force feedback from Golgi tendon organs. More generally, the models provide a computational rationale for the use of nonspecific control signals in volitional control, or "acts of will", and of efference copies and opponent processing in both reactive and adaptive motor control tasks.
Resumo:
A feedforward neural network for invariant image preprocessing is proposed that represents the position1 orientation and size of an image figure (where it is) in a multiplexed spatial map. This map is used to generate an invariant representation of the figure that is insensitive to position1 orientation, and size for purposes of pattern recognition (what it is). A multiscale array of oriented filters followed by competition between orientations and scales is used to define the Where filter.