903 results for human motion
Abstract:
This thesis explores the problem of mobile robot navigation in dense human crowds. We begin by considering a fundamental impediment to classical motion planning algorithms called the freezing robot problem: once the environment surpasses a certain level of complexity, the planner decides that all forward paths are unsafe, and the robot freezes in place (or performs unnecessary maneuvers) to avoid collisions. Since a feasible path typically exists, this behavior is suboptimal. Existing approaches have focused on reducing predictive uncertainty by employing higher fidelity individual dynamics models or heuristically limiting the individual predictive covariance to prevent overcautious navigation. We demonstrate that both the individual prediction and the individual predictive uncertainty have little to do with this undesirable navigation behavior. Additionally, we provide evidence that dynamic agents are able to navigate in dense crowds by engaging in joint collision avoidance, cooperatively making room to create feasible trajectories. We accordingly develop interacting Gaussian processes, a prediction density that captures cooperative collision avoidance, and a "multiple goal" extension that models the goal driven nature of human decision making. Navigation naturally emerges as a statistic of this distribution.
Most importantly, we empirically validate our models in the Chandler dining hall at Caltech during peak hours, and in the process, carry out the first extensive quantitative study of robot navigation in dense human crowds (collecting data on 488 runs). The multiple goal interacting Gaussian processes algorithm performs comparably with human teleoperators in crowd densities nearing 1 person/m², while a state of the art noncooperative planner exhibits unsafe behavior more than 3 times as often as the multiple goal extension, and twice as often as the basic interacting Gaussian process approach. Furthermore, a reactive planner based on the widely used dynamic window approach proves insufficient for crowd densities above 0.55 people/m². For inclusive validation purposes, we show that either our non-interacting planner or our reactive planner captures the salient characteristics of nearly any existing dynamic navigation algorithm. Based on these experimental results and theoretical observations, we conclude that a cooperation model is critical for safe and efficient robot navigation in dense human crowds.
Finally, we produce a large database of ground truth pedestrian crowd data. We make this ground truth database publicly available for further scientific study of crowd prediction models, learning from demonstration algorithms, and human robot interaction models in general.
Abstract:
Designing for all requires the adaptation and modification of current design best practices to encompass a broader range of user capabilities. This is particularly the case in the design of the human-product interface. Product interfaces exist everywhere and when designing them, there is a very strong temptation to jump to prescribing a solution with only a cursory attempt to understand the nature of the problem. This is particularly the case when attempting to adapt existing designs, optimised for able-bodied users, for use by disabled users. However, such approaches have led to numerous products that are neither usable nor commercially successful. In order to develop a successful design approach it is necessary to consider the fundamental structure of the design process being applied. A three-stage design process development strategy, which includes problem definition, solution development and solution evaluation, should be adopted. This paper describes the development of a new design approach based on the application of usability heuristics to the design of interfaces. This is illustrated by reference to a particular case study of the re-design of a computer interface for controlling an assistive device.
Abstract:
Understanding the mechanisms of enzymes is crucial for our understanding of their role in biology and for designing methods to perturb or harness their activities for medical treatments, industrial processes, or biological engineering. One aspect of enzymes that makes them difficult to fully understand is that they are in constant motion, and these motions and the conformations adopted throughout these transitions often play a role in their function.
Traditionally, it has been difficult to isolate a protein in a particular conformation to determine what role each form plays in the reaction or biology of that enzyme. A new technology, computational protein design, makes the isolation of various conformations possible, and therefore is an extremely powerful tool in enabling a fuller understanding of the role a protein conformation plays in various biological processes.
One such protein that undergoes large structural shifts during different activities is human type II transglutaminase (TG2). TG2 is an enzyme that exists in two dramatically different conformational states: (1) an open, extended form, which is adopted upon the binding of calcium, and (2) a closed, compact form, which is adopted upon the binding of GTP or GDP. TG2 possesses two separate active sites, each with a radically different activity. The open, calcium-bound form of TG2 is believed to act as a transglutaminase, where it catalyzes the formation of an isopeptide bond between the sidechain of a peptide-bound glutamine and a primary amine. The closed, GTP-bound conformation is believed to act as a GTPase. TG2 is also implicated in a variety of biological and pathological processes.
To better understand the effects of TG2’s conformations on its activities and pathological processes, we set out to design variants of TG2 isolated in either the closed or open conformations. We were able to design open-locked and closed-biased TG2 variants, and use these designs to unseat the current understanding of TG2’s activities and their associated conformations and to explore each conformation’s role in celiac disease models. This work also enabled us to help explain older confusing results regarding this enzyme and its activities. The new model for TG2 activity has immense implications for our understanding of its functional capabilities in various environments, and for our ability to understand which conformations need to be inhibited in the design of new drugs for diseases in which TG2’s activities are believed to elicit pathological effects.
Abstract:
A study of human eye movements was made in order to elucidate the nature of the control mechanism in the binocular oculomotor system.
We first examined spontaneous eye movements during monocular and binocular fixation in order to determine the corrective roles of flicks and drifts. It was found that both types of motion correct fixational errors, although flicks are somewhat more active in this respect. Vergence error is a stimulus for correction by drifts but not by flicks, while binocular vertical discrepancy of the visual axes does not trigger corrective movements.
Second, we investigated the non-linearities of the oculomotor system by examining the eye movement responses to point targets moving in two dimensions in a subjectively unpredictable manner. Such motions consisted of band-limited Gaussian random motion and also of the sum of several non-integrally related sinusoids. We found that there is no direct relationship between the phase and the gain of the oculomotor system. Delay of eye movements relative to target motion is determined by the necessity of generating a minimum afferent (input) signal at the retina in order to trigger corrective eye movements. The amplitude of the response is a function of the biological constraints of the efferent (output) portion of the system: for target motions of narrow bandwidth, the system responds preferentially to the highest frequency; for large bandwidth motions, the system distributes the available energy equally over all frequencies.
Third, the power spectra of spontaneous eye movements were compared with the spectra of tracking eye movements for Gaussian random target motions of varying bandwidths. It was found that there is essentially no difference among the various curves. The oculomotor system tracks a target, not by increasing the mean rate of impulses along the motoneurons of the extra-ocular muscles, but rather by coordinating those spontaneous impulses which propagate along the motoneurons during stationary fixation. Thus, the system operates at full output at all times.
Fourth, we examined the relative magnitude and phase of motions of the left and the right visual axes during monocular and binocular viewing. We found that the two visual axes move vertically in perfect synchronization at all frequencies for any viewing condition. This is not true for horizontal motions: the amount of vergence noise is highest for stationary fixation and diminishes for tracking tasks as the bandwidth of the target motion increases. Furthermore, movements of the occluded eye are larger than those of the seeing eye in monocular viewing. This effect is more pronounced for horizontal motions, for stationary fixation, and for lower frequencies.
Finally, we have related our findings to previously known facts about the pertinent nerve pathways in order to postulate a model for the neurological binocular control of the visual axes.
Abstract:
Atlases and statistical models play important roles in the personalization and simulation of cardiac physiology. For the study of the heart, however, the construction of comprehensive atlases and spatio-temporal models is faced with a number of challenges, in particular the need to handle large and highly variable image datasets, the multi-region nature of the heart, and the presence of complex as well as small cardiovascular structures. In this paper, we present a detailed atlas and spatio-temporal statistical model of the human heart based on a large population of 3D+time multi-slice computed tomography sequences, and the framework for its construction. It uses spatial normalization based on nonrigid image registration to synthesize a population mean image and establish the spatial relationships between the mean and the subjects in the population. Temporal image registration is then applied to resolve each subject-specific cardiac motion and the resulting transformations are used to warp a surface mesh representation of the atlas to fit the images of the remaining cardiac phases in each subject. Subsequently, we demonstrate the construction of a spatio-temporal statistical model of shape such that the inter-subject and dynamic sources of variation are suitably separated. The framework is applied to a 3D+time data set of 138 subjects. The data is drawn from a variety of pathologies, which benefits its generalization to new subjects and physiological studies. The obtained level of detail and the extendability of the atlas present an advantage over most cardiac models published previously. © 1982-2012 IEEE.
Abstract:
Traditional approaches to upper body pose estimation using monocular vision rely on complex body models and a large variety of geometric constraints. We argue that this is not ideal and somewhat inelegant as it results in large processing burdens, and instead attempt to incorporate these constraints through priors obtained directly from training data. A prior distribution covering the probability of a human pose occurring is used to incorporate likely human poses. This distribution is obtained offline, by fitting a Gaussian mixture model to a large dataset of recorded human body poses, tracked using a Kinect sensor. We combine this prior information with a random walk transition model to obtain an upper body model, suitable for use within a recursive Bayesian filtering framework. Our model can be viewed as a mixture of discrete Ornstein-Uhlenbeck processes, in that states behave as random walks, but drift towards a set of typically observed poses. This model is combined with measurements of the human head and hand positions, using recursive Bayesian estimation to incorporate temporal information. Measurements are obtained using face detection and a simple skin colour hand detector, trained using the detected face. The suggested model is designed with analytical tractability in mind and we show that the pose tracking can be Rao-Blackwellised using the mixture Kalman filter, allowing for computational efficiency while still incorporating bio-mechanical properties of the upper body. In addition, the use of the proposed upper body model allows reliable three-dimensional pose estimates to be obtained indirectly for a number of joints that are often difficult to detect using traditional object recognition strategies. Comparisons with Kinect sensor results and the state of the art in 2D pose estimation highlight the efficacy of the proposed approach.
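The "random walk that drifts towards typically observed poses" idea can be illustrated with a minimal sketch of a discrete Ornstein-Uhlenbeck update. This is not the authors' filter: it shows a single mixture component only, and the names `ou_step`, `simulate`, `pull`, and `noise_std`, as well as the toy pose values, are assumptions introduced here for illustration.

```python
import random


def ou_step(x, mean, pull, noise_std, rng):
    """One discrete Ornstein-Uhlenbeck step per coordinate.

    Each coordinate takes a random-walk step (Gaussian noise) and is
    also pulled a fraction `pull` of the way towards `mean`.
    With pull = 0 this reduces to a plain random walk.
    """
    return [
        xi + pull * (mi - xi) + rng.gauss(0.0, noise_std)
        for xi, mi in zip(x, mean)
    ]


def simulate(x0, mean, pull, noise_std, steps, seed=0):
    """Run `steps` updates from x0 and return the whole trajectory."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    x = list(x0)
    path = [x]
    for _ in range(steps):
        x = ou_step(x, mean, pull, noise_std, rng)
        path.append(x)
    return path


# Hypothetical 2-D "pose" and a typical pose it should drift towards.
typical_pose = [1.0, -2.0]
noiseless = simulate([0.0, 0.0], typical_pose, pull=0.5, noise_std=0.0, steps=20)
noisy = simulate([0.0, 0.0], typical_pose, pull=0.5, noise_std=0.1, steps=20)
```

Without noise the state converges geometrically to `typical_pose`; with noise it fluctuates around it. A mixture model in the spirit of the abstract would keep several such means, one per Gaussian mixture component.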
Abstract:
We present psychophysical experiments that measure the accuracy of perceived 3D structure derived from relative image motion. The experiments are motivated by Ullman's incremental rigidity scheme, which builds up 3D structure incrementally over an extended time. Our main conclusions are: first, the human system derives an accurate model of the relative depths of moving points, even in the presence of noise; second, the accuracy of 3D structure improves with time, eventually reaching a plateau; and third, the 3D structure currently perceived depends on previous 3D models. Through computer simulations, we relate the psychophysical observations to the behavior of Ullman's model.
Abstract:
We address the computational role that the construction of a complete surface representation may play in the recovery of 3-D structure from motion. We present a model that combines a feature-based structure-from-motion algorithm with smooth surface interpolation. This model can represent multiple surfaces in a given viewing direction, incorporates surface constraints from object boundaries, and groups image features using their 2-D image motion. Computer simulations relate the model's behavior to perceptual observations. In a companion paper, we discuss further perceptual experiments regarding the role of surface reconstruction in the human recovery of 3-D structure from motion.
Abstract:
The performance of different classification approaches is evaluated using a view-based approach for motion representation. The view-based approach uses computer vision and image processing techniques to register and process the video sequence. Two motion representations, called the Motion Energy Image and the Motion History Image, are then constructed. These representations collapse the temporal component in such a way that no explicit temporal analysis or sequence matching is needed. Statistical descriptions are then computed using moment-based features and dimensionality reduction techniques. For these tests, we used 7 Hu moments, which are invariant to scale and translation. Principal Components Analysis is used to reduce the dimensionality of this representation. The system is trained using different subjects performing a set of examples of every action to be recognized. Given these samples, K-nearest neighbor, Gaussian, and Gaussian mixture classifiers are used to recognize new actions. Experiments are conducted using instances of eight human actions (i.e., eight classes) performed by seven different subjects. Comparisons in the performance among these classifiers under different conditions are analyzed and reported. Our main goals are to test this dimensionality-reduced representation of actions, and more importantly to use this representation to compare the advantages of different classification approaches in this recognition task.
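As an illustration of how a Motion History Image collapses time into a single frame, here is a minimal sketch using the commonly cited update rule (a pixel is set to `tau` where motion is detected, otherwise decays by one). The abstract does not give its exact formulation, so the frame-differencing step, the threshold, and all function names are assumptions made for this example.

```python
def frame_difference(prev, curr, threshold):
    """Binary motion mask: True where the intensity change exceeds threshold."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def update_mhi(mhi, motion_mask, tau):
    """One Motion History Image step.

    Moving pixels are stamped with tau; all others decay by 1 down to 0,
    so brighter values mean more recent motion.
    """
    return [
        [tau if moving else max(0, h - 1) for h, moving in zip(h_row, m_row)]
        for h_row, m_row in zip(mhi, motion_mask)
    ]


def motion_energy(mhi):
    """Motion Energy Image: 1 wherever any motion is still remembered."""
    return [[1 if h > 0 else 0 for h in row] for row in mhi]


# Toy sequence: one bright pixel moving right across a 1x4 frame.
frames = [
    [[9, 0, 0, 0]],
    [[0, 9, 0, 0]],
    [[0, 0, 9, 0]],
    [[0, 0, 0, 9]],
]
tau = 3
mhi = [[0, 0, 0, 0]]
for prev, curr in zip(frames, frames[1:]):
    mask = frame_difference(prev, curr, threshold=4)
    mhi = update_mhi(mhi, mask, tau)
mei = motion_energy(mhi)
```

In this toy run the history image ends as `[[1, 2, 3, 3]]`, a ramp that encodes the direction of travel, while the energy image `[[1, 1, 1, 1]]` records only where motion occurred. Moment features such as Hu moments would then be computed on these two images.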
Abstract:
Log-polar image architectures, motivated by the structure of the human visual field, have long been investigated in computer vision for use in estimating motion parameters from an optical flow vector field. Practical problems with this approach have been: (i) dependence on assumed alignment of the visual and motion axes; (ii) sensitivity to occlusion from moving and stationary objects in the central visual field, where much of the numerical sensitivity is concentrated; and (iii) inaccuracy of the log-polar architecture (which is an approximation to the central 20°) for wide-field biological vision. In the present paper, we show that an algorithm based on a generalization of the log-polar architecture, termed the log-dipolar sensor, provides a large improvement in performance relative to the usual log-polar sampling. Specifically, our algorithm: (i) is tolerant of large misalignment of the optical and motion axes; (ii) is insensitive to significant occlusion by objects of unknown motion; and (iii) represents a more correct analogy to the wide-field structure of human vision. Using the Helmholtz-Hodge decomposition to estimate the optical flow vector field on a log-dipolar sensor, we demonstrate these advantages, using synthetic optical flow maps as well as natural image sequences.
Abstract:
A biomechanical model of the human oculomotor plant kinematics in 3-D as a function of muscle length changes is presented. It can represent a range of alternative interpretations of the data as a function of one parameter. The model is free from such deficits as singularities and the nesting of axes found in alternative formulations such as the spherical wrist (Paul, 1981). The equations of motion are defined on a quaternion-based representation of eye rotations and are compact and computationally efficient.
Abstract:
A model for self-organization of the coordinate transformations required for spatial reaching is presented. During a motor babbling phase, a mapping from spatial coordinate directions to joint motion directions is learned. After learning, the model is able to produce straight-line spatial velocity trajectories with characteristic bell-shaped spatial velocity profiles, as observed in human reaches. Simulation results are presented for transverse plane reaching using a two degree-of-freedom arm.
Abstract:
How do human observers perceive a coherent pattern of motion from a disparate set of local motion measures? Our research has examined how ambiguous motion signals along straight contours are spatially integrated to obtain a globally coherent perception of motion. Observers viewed displays containing a large number of apertures, with each aperture containing one or more contours whose orientations and velocities could be independently specified. The total pattern of the contour trajectories across the individual apertures was manipulated to produce globally coherent motions, such as rotations, expansions, or translations. For displays containing only straight contours extending to the circumferences of the apertures, observers' reports of global motion direction were biased whenever the sampling of contour orientations was asymmetric relative to the direction of motion. Performance was improved by the presence of identifiable features, such as line ends or crossings, whose trajectories could be tracked over time. The reports of our observers were consistent with a pooling process involving a vector average of measures of the component of velocity normal to contour orientation, rather than with the predictions of the intersection-of-constraints analysis in velocity space.
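The contrast between the two pooling rules named in this abstract can be made concrete with a small sketch. It assumes idealized, noise-free measurements in which each contour reports only the component of a common velocity along its unit normal. The function names and the chosen angles are illustrative and are not taken from the study.

```python
import math


def normal_components(true_v, normal_angles_deg):
    """For each contour, return (unit normal, speed along that normal)."""
    measures = []
    for angle in normal_angles_deg:
        n = (math.cos(math.radians(angle)), math.sin(math.radians(angle)))
        s = n[0] * true_v[0] + n[1] * true_v[1]
        measures.append((n, s))
    return measures


def vector_average(measures):
    """Average of the normal-velocity vectors s_i * n_i."""
    k = len(measures)
    return (
        sum(s * n[0] for n, s in measures) / k,
        sum(s * n[1] for n, s in measures) / k,
    )


def intersection_of_constraints(measures):
    """Least-squares velocity v satisfying n_i . v = s_i for all contours."""
    a11 = sum(n[0] * n[0] for n, _ in measures)
    a12 = sum(n[0] * n[1] for n, _ in measures)
    a22 = sum(n[1] * n[1] for n, _ in measures)
    b1 = sum(s * n[0] for n, s in measures)
    b2 = sum(s * n[1] for n, s in measures)
    det = a11 * a22 - a12 * a12  # nonzero when normals are not all parallel
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def direction_deg(v):
    """Direction of a 2-D vector in degrees, counter-clockwise from +x."""
    return math.degrees(math.atan2(v[1], v[0]))


true_v = (1.0, 0.0)  # rightward motion, direction 0 degrees
symmetric = normal_components(true_v, [-45, 45])
asymmetric = normal_components(true_v, [10, 60])

va_sym = vector_average(symmetric)
va_asym = vector_average(asymmetric)
ioc_asym = intersection_of_constraints(asymmetric)
```

Under these assumptions the vector average points in the true direction only when the normals are sampled symmetrically about it. With the asymmetric sample its direction is pulled towards the sampled normals, whereas the intersection-of-constraints solution still recovers the true direction. That is the qualitative bias pattern the abstract attributes to observers.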