954 results for human motion


Relevance: 70.00%

Abstract:

The direction and speed of motion of a one-dimensional (1-D) stimulus, such as a grating, presented within a circular aperture is ambiguous. This ambiguity, referred to as the Aperture Problem (Fennema & Thompson, 1979), results from (i) the inability to detect motion parallel to grating orientation, and (ii) the occlusion of border information, such as the ‘ends’ of the grating, by the surface forming the aperture. Adelson and Movshon's (1982) intersection-of-constraints (IOC) model of motion perception describes a two-stage method of disambiguating the motion of 1-D moving stimuli (e.g., gratings) to produce unambiguous motion of two-dimensional (2-D) objects (e.g., plaid patterns) made up of several 1-D components. Specifically, in the IOC model ambiguous 1-D motions extracted by Stage 1 component-selective mechanisms are integrated by Stage 2 pattern-selective mechanisms to produce unambiguous 2-D motion signals. ‘Integration’ in the context of the IOC model involves determining the single motion vector (i.e., combination of direction and speed) which is consistent with the 1-D components of a 2-D object. Since the IOC model assumes that 2-D objects undergo pure translation (i.e., without distortion, rotation, etc.), the motion vector consistent with all 1-D components describes the motion of the 2-D object itself. Adelson and Movshon (1982) propose that neural implementation of the computation underlying the IOC model is reflected in the perception of coherent 2-D plaid motion reported when two separately-moving ‘component’ gratings are superimposed. Using these plaid patterns, the present thesis assesses the IOC model in terms of its ability to account for the perception of 2-D motion in a variety of circumstances.
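The IOC computation described above reduces to a small linear system: each 1-D component constrains the 2-D velocity v to satisfy v·n = s, where n is the unit normal to the grating's orientation and s its drift speed along that normal. A minimal sketch (illustrative values, not taken from the thesis):

```python
import numpy as np

def ioc_velocity(n1, s1, n2, s2):
    """Intersection of constraints: solve v . n_i = s_i for the 2-D velocity v.

    n1, n2: unit normals to the two grating orientations.
    s1, s2: drift speeds of each grating along its normal.
    """
    A = np.array([n1, n2], dtype=float)
    b = np.array([s1, s2], dtype=float)
    return np.linalg.solve(A, b)   # singular if the components are parallel

# Two component gratings drifting at 1 deg/s, normals 45 deg either side
# of horizontal (illustrative values):
n1 = np.array([np.cos(np.radians(45)), np.sin(np.radians(45))])
n2 = np.array([np.cos(np.radians(-45)), np.sin(np.radians(-45))])
v = ioc_velocity(n1, 1.0, n2, 1.0)   # horizontal, speed 1/cos(45) ~ 1.41
```

With components 45° either side of horizontal, the unique velocity consistent with both constraints is horizontal and faster than either component, which is the classic IOC prediction for a symmetrical plaid.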
In the first series of experiments it is argued that the unambiguous motion perceived for a single grating presented within a rectangular aperture (i.e., the Barberpole illusion; Wallach, 1976) reflects application of the IOC computation to the moving 1-D grating and the stationary boundary of the aperture. While contrary to the assumption which underlies the IOC model (viz., that integration occurs between moving 1-D stimuli), evidence consistent with the involvement of the IOC computation in mediating the Barberpole illusion (in which there is only one moving stimulus) is obtained by measuring plaid coherence as a function of aperture shape. It is found that rectangular apertures which bias perceived component motions in directions consistent with plaid direction facilitate plaid coherence, while rectangular apertures which bias perceived component motions in directions inconsistent with plaid direction disrupt plaid coherence. In the second series of experiments, perceived directions of motion of type I symmetrical, type I asymmetrical, and type II plaids are measured with the aim of investigating the deviations in plaid directions reported by Ferrera and Wilson (1990) and Yo and Wilson (1992). Perceived directions of both asymmetrical and type II plaids are shown to deviate away from IOC-predicted directions and towards mean component direction. Furthermore, the magnitude of these deviations is proportional to the difference between IOC-predicted plaid direction and mean component direction. On the basis of these directional deviations, a modification to the IOC model is proposed.
In the modified IOC model it is argued that plaid perception involves (i) the activity of Stage 2 pattern-selective mechanisms (and the Stage 1 component-selective mechanisms which input into these pattern-selective mechanisms) involved in implementing the IOC computation, and (ii) component-selective mechanisms which influence plaid perception directly, and ‘extraneously’ to the IOC computation. In the third series of experiments the validity of this modified IOC model, as well as the validity of alternative one-stage models of plaid perception, are assessed in relation to perceived directions of plaid-induced MAEs as a function of both plaid direction and mean component direction. It is found that plaid-induced MAEs are shifted away from directions opposite to IOC-predicted plaid direction towards the direction opposite to mean component direction. This pattern of results is taken to be consistent with the modified IOC model, which predicts the activity and adaptation both of mechanisms signalling plaid direction (via implementation of the IOC computation) and of ‘extraneous-type’ component-selective mechanisms signalling component directions. Alternative one-stage models which predict the adaptation only of mechanisms signalling plaid direction (the feature-tracking model), or only of mechanisms signalling component directions (the distribution-of-activity model), cannot account for the directions of plaid-induced MAEs reported.
The ability of the modified IOC model to account for the perceived directions of (i) gratings in rectangular apertures, (ii) various types of plaid in circular apertures, and (iii) directions of plaid-induced MAEs, is interpreted as supporting the proposition that human motion perception is based on a parallel and distributed process involving Stage 2 pattern-selective mechanisms (and the Stage 1 component-selective mechanisms which input into these mechanisms) taken to implement the IOC computation, and component-selective mechanisms taken to provide an 'extraneous' direct contribution to motion perception.

Relevance: 70.00%

Abstract:

Opening keynote address.

Relevance: 70.00%

Abstract:

How to recognize human action efficiently and effectively from videos captured by modern cameras is a challenge in real applications. Traditional methods, which need professional analysts, are facing a bottleneck because of their shortcomings. To cope with these disadvantages, methods based on computer vision techniques, requiring little or no human intervention, have been proposed to analyse human actions in videos automatically. This paper provides a method combining the three-dimensional Scale Invariant Feature Transform (SIFT) detector and the Latent Dirichlet Allocation (LDA) model for human motion analysis. To represent videos effectively and robustly, we extract the 3D SIFT descriptor around each interest point, which is sampled densely from 3D space-time video volumes. After obtaining the representation of each video frame, the LDA model is adopted to discover the underlying structure: the categorization of human actions in the collection of videos. Publicly available standard datasets are used to test our method. The concluding part discusses the research challenges and future directions.
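The descriptor-quantization-plus-LDA pipeline can be sketched end to end. The snippet below stands in random vectors for the 3D SIFT descriptors (the detector itself is not reproduced) and uses scikit-learn's KMeans and LatentDirichletAllocation; the vocabulary size, topic count, and all data are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)

# Stand-in for 3D SIFT descriptors densely sampled from space-time volumes
# (random 64-D vectors here; a real pipeline would use actual descriptors).
descriptors_per_video = [rng.normal(size=(200, 64)) for _ in range(12)]

# Build a visual vocabulary by quantizing all descriptors.
vocab = KMeans(n_clusters=20, n_init=4, random_state=0).fit(
    np.vstack(descriptors_per_video))

# Represent each video as a bag-of-visual-words histogram.
counts = np.array([np.bincount(vocab.predict(d), minlength=20)
                   for d in descriptors_per_video])

# LDA uncovers latent "action topics" underlying the word counts.
lda = LatentDirichletAllocation(n_components=3, random_state=0)
theta = lda.fit_transform(counts)   # per-video topic mixtures
```

Each row of `theta` is a normalized mixture over latent topics; videos dominated by the same topic would be grouped as the same action category.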

Relevance: 70.00%

Abstract:

This paper describes the integration of missing observation data with hidden Markov models to create a framework that is able to segment and classify individual actions from a stream of human motion using an incomplete 3D human pose estimation. Based on this framework, a model is trained to automatically segment and classify an activity sequence into its constituent subactions during inferencing. This is achieved by introducing action labels into the observation vector and setting these labels as missing data during inferencing, thus forcing the system to infer the probability of each action label. Additionally, missing data provides recognition-level support for occlusions and imperfect silhouette segmentation, permitting the use of a fast (real-time) pose estimation that delegates the burden of handling undetected limbs onto the action recognition system. Findings show that the use of missing data to segment activities is an accurate and elegant approach. Furthermore, action recognition can be accurate even when almost half of the pose feature data is missing due to occlusions, since not all of the pose data is important all of the time.
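The core trick, marginalizing missing observation dimensions so that action labels can be inferred, can be sketched with a toy discrete HMM. All probabilities below are hypothetical, not the paper's trained model:

```python
import numpy as np

# Toy sketch of HMM filtering where any observation dimension may be
# missing: a missing dimension contributes likelihood 1, i.e. it is
# marginalized out.  Setting the action-label dimension to missing
# forces the filter to infer the label from the pose stream.
T = np.array([[0.9, 0.1],
              [0.1, 0.9]])            # state transition matrix
pi = np.array([0.5, 0.5])             # initial state distribution
B_pose = np.array([[0.8, 0.2],        # P(pose feature | state)
                   [0.2, 0.8]])
B_action = np.array([[0.95, 0.05],    # P(action label | state)
                     [0.05, 0.95]])

def filter_states(obs):
    """Forward filtering; obs is a list of (pose, action), either may be None."""
    posts, pred = [], pi
    for pose, action in obs:
        e = np.ones(2)
        if pose is not None:
            e = e * B_pose[:, pose]       # observed dimension
        if action is not None:
            e = e * B_action[:, action]   # missing -> no emission factor
        alpha = pred * e
        alpha = alpha / alpha.sum()
        posts.append(alpha)
        pred = alpha @ T
    return np.array(posts)

# Action labels set as missing: infer them from the poses alone.
posts = filter_states([(0, None), (0, None), (1, None), (1, None)])
p_action0 = posts @ B_action[:, 0]    # inferred P(action label = 0) per frame
```

The same marginalization handles occluded pose features: a `None` pose entry simply drops out of the emission product, which is why recognition can survive substantial missing pose data.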

Relevance: 70.00%

Abstract:

Segmentation of individual actions from a stream of human motion is an open problem in computer vision. This paper approaches the problem of segmenting higher-level activities into their component sub-actions using Hidden Markov Models modified to handle missing data in the observation vector. By controlling the use of missing data, action labels can be inferred from the observation vector during inferencing, thus performing segmentation and classification simultaneously. The approach is able to segment both prominent and subtle actions, even when subtle actions are grouped together. The advantage of this method over sliding windows and Viterbi state sequence interrogation is that segmentation is performed as a trainable task, and the temporal relationship between actions is encoded in the model and used as evidence for action labelling.

Relevance: 70.00%

Abstract:

In this paper we are interested in analyzing behaviour in crowded public places at the level of holistic motion. Our aim is to learn, without user input, strong scene priors or labelled data, the scope of “normal behaviour” for a particular scene and thus alert to novelty in unseen footage. The first contribution is a low-level motion model based on what we term tracklet primitives, which are scene-specific elementary motions. We propose a clustering-based algorithm for tracklet estimation from local approximations to tracks of appearance features. This is followed by two methods for motion novelty inference from tracklet primitives: (a) an approach based on a non-hierarchical ensemble of Markov chains as a means of capturing behavioural characteristics at different scales, and (b) a more flexible alternative which exhibits a higher generalizing power by accounting for constraints introduced by intentionality and goal-oriented planning of human motion in a particular scene. Evaluated on a two-hour video of a busy city marketplace, both algorithms are shown to be successful at inferring unusual behaviour, the latter model achieving better performance for novelties at a larger spatial scale.
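The first novelty-inference method, a Markov chain over tracklet primitives, can be sketched as follows; the primitive IDs and training sequences are invented for illustration (the paper's ensemble operates at multiple scales on real tracklets):

```python
import numpy as np

def fit_chain(sequences, n_symbols, alpha=1.0):
    """Laplace-smoothed first-order Markov chain over primitive IDs."""
    counts = np.full((n_symbols, n_symbols), alpha)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a, b] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

def surprise(chain, seq):
    """Mean negative log-probability of observed transitions; high = novel."""
    return -np.mean([np.log(chain[a, b]) for a, b in zip(seq, seq[1:])])

# "Normal" footage repeatedly traverses primitives 0 -> 1 -> 2 -> 0 ...
normal = [[0, 1, 2] * 4 for _ in range(5)]
chain = fit_chain(normal, n_symbols=3)
s_normal = surprise(chain, [0, 1, 2, 0, 1, 2])
s_novel = surprise(chain, [2, 1, 0, 2, 1, 0])   # reversed flow: unusual
```

Footage whose transition surprise exceeds a threshold learnt from the training scene would be flagged as unusual behaviour.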

Relevance: 70.00%

Abstract:

In this paper, we investigate how a multilinear model can be used to represent human motion data. Based on technical modes (referring to degrees of freedom and number of frames) and natural modes that typically appear in the context of a motion capture session (referring to actor, style, and repetition), the motion data is encoded in the form of a high-order tensor. This tensor is then reduced by using N-mode singular value decomposition. Our experiments show that the reduced model approximates the original motion better than previously introduced PCA-based approaches. Furthermore, we discuss how the tensor representation may be used as a valuable tool for the synthesis of new motions.

Relevance: 70.00%

Abstract:

The hallucinogenic serotonin (1A & 2A) agonist psilocybin is known for its ability to induce illusions of motion in otherwise stationary objects or textured surfaces. This study investigated the effect of psilocybin on local and global motion processing in nine human volunteers. Using a forced-choice direction-of-motion discrimination task we show that psilocybin selectively impairs coherence sensitivity for random dot patterns, likely mediated by high-level global motion detectors, but not contrast sensitivity for drifting gratings, believed to be mediated by low-level detectors. These results are in line with those observed within schizophrenic populations and are discussed in respect to the proposition that psilocybin may provide a model to investigate clinical psychosis and the pharmacological underpinnings of visual perception in normal populations.

Relevance: 70.00%

Abstract:

Humans and robots have complementary strengths in performing assembly operations. Humans are very good at perception tasks in unstructured environments. They are able to recognize and locate a part from a box of miscellaneous parts. They are also very good at complex manipulation in tight spaces. The sensory characteristics, motor abilities, knowledge and skills of humans give them the ability to react to unexpected situations and resolve problems quickly. In contrast, robots are very good at pick-and-place operations and highly repeatable in placement tasks. Robots can perform tasks at high speeds and still maintain precision in their operations. Robots can also operate for long periods of time. Robots are also very good at applying high forces and torques. Typically, robots are used in mass production. Small-batch and custom production operations predominantly use manual labor. The high labor cost is making it difficult for small and medium manufacturers to remain cost competitive in high-wage markets. These manufacturers are mainly involved in small-batch and custom production. They need to find a way to reduce the labor cost in assembly operations. Purely robotic cells will not be able to provide them the necessary flexibility. Creating hybrid cells where humans and robots can collaborate in close physical proximity is a potential solution. The underlying idea behind such cells is to decompose assembly operations into tasks such that humans and robots can collaborate by performing sub-tasks that are suitable for them. Realizing hybrid cells that enable effective human and robot collaboration is challenging. This dissertation addresses the following three computational issues involved in developing and utilizing hybrid assembly cells: - We should be able to automatically generate plans to operate hybrid assembly cells to ensure efficient cell operation.
This requires generating feasible assembly sequences and instructions for robots and human operators, respectively. Automated planning poses the following two challenges. First, generating operation plans for complex assemblies is challenging. The complexity can come from the combinatorial explosion caused by the size of the assembly or the complex paths needed to perform the assembly. Second, generating feasible plans requires accounting for robot and human motion constraints. The first objective of the dissertation is to develop the underlying computational foundations for automatically generating plans for the operation of hybrid cells. It addresses both assembly complexity and motion constraints issues. - The collaboration between humans and robots in the assembly cell will only be practical if human safety can be ensured during the assembly tasks that require collaboration between humans and robots. The second objective of the dissertation is to evaluate different options for real-time monitoring of the state of the human operator with respect to the robot and develop strategies for taking appropriate measures to ensure human safety when the planned move by the robot may compromise the safety of the human operator. In order to be competitive in the market, the developed solution will have to include considerations about cost without significantly compromising quality. - In the envisioned hybrid cell, we will be relying on human operators to bring the part into the cell. If the human operator makes an error in selecting the part or fails to place it correctly, the robot will be unable to correctly perform the task assigned to it. If the error goes undetected, it can lead to a defective product and inefficiencies in the cell operation. The reason for human error can be either confusion due to poor-quality instructions or the human operator not paying adequate attention to the instructions.
In order to ensure smooth and error-free operation of the cell, we will need to monitor the state of the assembly operations in the cell. The third objective of the dissertation is to identify and track parts in the cell and automatically generate instructions for taking corrective actions if a human operator deviates from the selected plan. Potential corrective actions may involve re-planning if it is possible to continue assembly from the current state. Corrective actions may also involve issuing warnings and generating instructions to undo the current task.
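The decomposition idea, assigning each sub-task to the agent suited for it, can be illustrated with a deliberately simple greedy scheduler. The task names and times are hypothetical, and the dissertation's actual planner handles precedence constraints and assembly feasibility, which this toy omits:

```python
# Toy illustration (not the dissertation's planner): decompose an assembly
# into sub-tasks and assign each to whichever agent would finish it sooner.
# Times are hypothetical; inf marks a task an agent cannot perform.
INF = float("inf")
tasks = {                          # (human_time_s, robot_time_s)
    "locate part in bin":       (5.0, INF),   # unstructured perception
    "place part in fixture":    (8.0, 3.0),   # repeatable pick-and-place
    "press-fit in tight space": (6.0, INF),   # dexterous manipulation
    "apply torque to bolts":    (20.0, 7.0),  # high, precise torque
}

schedule, t_human, t_robot = {}, 0.0, 0.0
for name, (th, tr) in tasks.items():
    # Greedy rule: account for each agent's accumulated workload.
    if t_human + th <= t_robot + tr:
        schedule[name], t_human = "human", t_human + th
    else:
        schedule[name], t_robot = "robot", t_robot + tr
```

Even this crude rule routes perception and tight-space manipulation to the human and repeatable, high-torque work to the robot, which is the division of labour the cell concept relies on.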

Relevance: 70.00%

Abstract:

Computer-based human motion tracking systems are widely used in medicine and sports. The accurate determination of limb lengths is crucial not only for constructing the limb motion trajectories used in the evaluation of human kinematics, but also for individually recognising human beings. Yet, in common practice, limb lengths are measured manually, which is inconvenient, time-consuming and requires professional knowledge. In this paper, the estimation of limb lengths is automated with a novel algorithm calculating curvature from inertial sensor measurements. The proposed algorithm was validated with computer simulations and experiments conducted with four healthy subjects. The experimental results show low root-mean-squared error percentages relative to measured lengths: upper arm 5.16%, upper limbs 5.09%, upper leg 2.56% and lower extremities 6.64%.
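One way such an estimate can work (a hedged sketch of the underlying kinematics, not necessarily the paper's algorithm) is via circular motion: a sensor at distance r from a fixed joint experiences centripetal acceleration a_c = ω²r during a swing, so r is recoverable by regressing accelerometer against gyroscope data:

```python
import numpy as np

# Simulated limb swing about a fixed joint (all values illustrative).
r_true = 0.30                            # limb length in metres
t = np.linspace(0, 2, 400)
theta = 0.8 * np.sin(2 * np.pi * t)      # joint angle over the swing
omega = np.gradient(theta, t)            # angular rate (gyroscope signal)
a_c = omega**2 * r_true                  # centripetal accel (accelerometer)

# Least-squares fit of a_c = r * omega^2 over samples with enough rotation.
mask = np.abs(omega) > 0.5
r_est = np.sum(a_c[mask] * omega[mask]**2) / np.sum(omega[mask]**4)
```

With real sensors the same regression would be run over gravity-compensated, noise-filtered signals; restricting to high-rotation samples (the mask) keeps the estimate well conditioned.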

Relevance: 60.00%

Abstract:

An approach for estimating 3D body pose from multiple, uncalibrated views is proposed. First, a mapping from image features to 2D body joint locations is computed using a statistical framework that yields a set of several body pose hypotheses. The concept of a "virtual camera" is introduced that makes this mapping invariant to translation, image-plane rotation, and scaling of the input. As a consequence, the calibration matrices (intrinsics) of the virtual cameras can be considered completely known, and their poses are known up to a single angular displacement parameter. Given pose hypotheses obtained in the multiple virtual camera views, the recovery of 3D body pose and camera relative orientations is formulated as a stochastic optimization problem. An Expectation-Maximization algorithm is derived that can obtain the locally most likely (self-consistent) combination of body pose hypotheses. Performance of the approach is evaluated with synthetic sequences as well as real video sequences of human motion.
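The idea of selecting the most self-consistent combination of per-view 2-D hypotheses can be illustrated without the full EM machinery: exhaustively triangulate each combination and keep the one with the lowest reprojection error. The cameras, joint position, and decoy hypotheses below are all invented for illustration:

```python
import numpy as np
from itertools import product

def triangulate(Ps, xs):
    """Linear (DLT) triangulation of one 3-D point from its 2-D projections."""
    A = []
    for P, (u, v) in zip(Ps, xs):
        A.append(u * P[2] - P[0])
        A.append(v * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.array(A))
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    p = P @ np.append(X, 1.0)
    return p[:2] / p[2]

def reproj_error(Ps, xs, X):
    return sum(np.linalg.norm(project(P, X) - x) for P, x in zip(Ps, xs))

# Two invented cameras and a true joint position.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.3, 0.2, 2.0])

# Per view: the correct 2-D hypothesis plus one decoy.
hyps = [[project(P1, X_true), project(P1, X_true) + np.array([0.20, 0.15])],
        [project(P2, X_true), project(P2, X_true) + np.array([-0.10, 0.30])]]

# Brute-force the self-consistent combination; EM scales this idea up
# to many joints, views, and the unknown camera orientation parameters.
best = min(product(*hyps),
           key=lambda c: reproj_error((P1, P2), c, triangulate((P1, P2), c)))
```

The decoy combinations violate the epipolar geometry and triangulate with a large residual, so the consistency score isolates the correct hypothesis pair.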

Relevance: 60.00%

Abstract:

The goal of this work is to learn a parsimonious and informative representation for high-dimensional time series. Conceptually, this comprises two distinct yet tightly coupled tasks: learning a low-dimensional manifold and modeling the dynamical process. These two tasks have a complementary relationship as the temporal constraints provide valuable neighborhood information for dimensionality reduction and conversely, the low-dimensional space allows dynamics to be learnt efficiently. Solving these two tasks simultaneously allows important information to be exchanged mutually. If nonlinear models are required to capture the rich complexity of time series, then the learning problem becomes harder as the nonlinearities in both tasks are coupled. The proposed solution approximates the nonlinear manifold and dynamics using piecewise linear models. The interactions among the linear models are captured in a graphical model. By exploiting the model structure, efficient inference and learning algorithms are obtained without oversimplifying the model of the underlying dynamical process. Evaluation of the proposed framework with competing approaches is conducted in three sets of experiments: dimensionality reduction and reconstruction using synthetic time series, video synthesis using a dynamic texture database, and human motion synthesis, classification and tracking on a benchmark data set. In all experiments, the proposed approach provides superior performance.

Relevance: 60.00%

Abstract:

The goal of this work is to learn a parsimonious and informative representation for high-dimensional time series. Conceptually, this comprises two distinct yet tightly coupled tasks: learning a low-dimensional manifold and modeling the dynamical process. These two tasks have a complementary relationship as the temporal constraints provide valuable neighborhood information for dimensionality reduction and conversely, the low-dimensional space allows dynamics to be learnt efficiently. Solving these two tasks simultaneously allows important information to be exchanged mutually. If nonlinear models are required to capture the rich complexity of time series, then the learning problem becomes harder as the nonlinearities in both tasks are coupled. The proposed solution approximates the nonlinear manifold and dynamics using piecewise linear models. The interactions among the linear models are captured in a graphical model. The model structure setup and parameter learning are done using a variational Bayesian approach, which enables automatic Bayesian model structure selection, hence solving the problem of over-fitting. By exploiting the model structure, efficient inference and learning algorithms are obtained without oversimplifying the model of the underlying dynamical process. Evaluation of the proposed framework with competing approaches is conducted in three sets of experiments: dimensionality reduction and reconstruction using synthetic time series, video synthesis using a dynamic texture database, and human motion synthesis, classification and tracking on a benchmark data set. In all experiments, the proposed approach provides superior performance.
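A piecewise linear approximation to nonlinear dynamics can be sketched generatively: a discrete switching variable selects which linear model drives the state at each step. This is a simplified stand-in for the paper's graphical model, with all matrices chosen for illustration:

```python
import numpy as np

# Two linear regimes approximating a nonlinear dynamical process,
# with a Markov switch choosing the active regime at each step.
rng = np.random.default_rng(1)
A = [np.array([[0.99, -0.10], [0.10, 0.99]]),   # slow-rotation regime
     np.array([[0.90,  0.00], [0.00, 0.90]])]   # decay regime
switch = np.array([[0.95, 0.05],                # regime transition probs
                   [0.05, 0.95]])

x, s = np.array([1.0, 0.0]), 0
xs, ss = [], []
for _ in range(200):
    s = rng.choice(2, p=switch[s])              # pick the linear piece
    x = A[s] @ x + rng.normal(scale=0.01, size=2)  # linear step + noise
    xs.append(x.copy())
    ss.append(s)
xs = np.array(xs)
```

Learning reverses this generative story: given only `xs`, the framework would infer the regimes, the linear pieces, and the low-dimensional coordinates jointly.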

Relevance: 60.00%

Abstract:

With a significant increase in the number of digital cameras used for various purposes, there is a pressing demand for advanced video analysis techniques that can systematically interpret and understand the semantics of video content recorded for security surveillance, intelligent transportation, health care, video retrieval and summarization. Understanding and interpreting human behaviour through video analysis faces considerable challenges due to non-rigid human motion, self- and mutual occlusions, and changes in lighting conditions. To solve these problems, advanced image and signal processing technologies such as neural networks, fuzzy logic, probabilistic estimation theory and statistical learning have been extensively investigated.