645 results for Body image, form perception
Abstract:
Recovering the motion of a non-rigid body from a set of monocular images permits the analysis of dynamic scenes in uncontrolled environments. However, the extension of factorisation algorithms for rigid structure from motion to the low-rank non-rigid case has proved challenging. This stems from the comparatively hard problem of finding a linear “corrective transform” which recovers the projection and structure matrices from an ambiguous factorisation. We show that this greater difficulty is due to the need to find multiple solutions to a non-trivial problem, and we cast a number of previous approaches as alleviating this issue by either a) introducing constraints on the basis, making the problems non-identical, or b) incorporating heuristics to encourage a diverse set of solutions, making the problems inter-dependent. While it has previously been recognised that finding a single solution to this problem is sufficient to estimate cameras, we show that it is possible to bootstrap this partial solution to find the complete transform in closed form. However, we acknowledge that our method minimises an algebraic error and is thus inherently sensitive to deviation from the low-rank model. We compare our closed-form solution for non-rigid structure with known cameras to the closed-form solution of Dai et al. [1], which we find to produce only coplanar reconstructions. We therefore recommend that 3D reconstruction error always be measured relative to a trivial reconstruction such as a planar one.
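The ambiguity that a corrective transform has to resolve can be illustrated with a small numerical sketch. The matrices are made-up toy values rather than data from the paper, and `matmul` is a hypothetical helper. The sketch only shows that any invertible G turns one factorisation W = MS into another, (MG)(G⁻¹S), which reproduces the measurements equally well.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Toy rank-2 example (assumed values): M stands in for the projection
# matrix, S for the structure matrix, W for the measurement matrix.
M = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
S = [[2.0, 0.0, 1.0], [1.0, 3.0, 0.0]]
W = matmul(M, S)

# Any invertible G yields a different factorisation of the same W.
G = [[2.0, 1.0], [1.0, 1.0]]          # determinant 1
G_inv = [[1.0, -1.0], [-1.0, 2.0]]    # inverse of G

M_alt = matmul(M, G)
S_alt = matmul(G_inv, S)
W_alt = matmul(M_alt, S_alt)

print(W_alt == W)   # same measurements
print(M_alt == M)   # different factors
```

Because the measurements alone cannot distinguish these factorisations, extra constraints are needed to pick the G that yields valid cameras and structure.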
Abstract:
The human visual system has adapted to function in different lighting environments and responds to contrast instead of the amount of light as such. On the one hand, this ensures constancy of perception: for example, white paper looks white both in bright sunlight and in dim moonlight, because contrast is invariant to changes in overall light level. On the other hand, the brightness of surfaces has to be reconstructed from the contrast signal because no signal from surfaces as such is conveyed to the visual cortex. In the visual cortex, the visual image is decomposed into local features by spatial filters that are selective for spatial frequency, orientation, and phase. Currently it is not known, however, how these features are subsequently integrated to form objects and object surfaces. In this thesis the integration mechanisms of achromatic surfaces were studied by psychophysically measuring the spatial frequency and orientation tuning of brightness perception. In addition, the effect of textures on the spread of brightness and the effect of phase of the inducing stimulus on brightness were measured. The novel findings of the thesis are that (1) a narrow spatial frequency band, independent of stimulus size and complexity, mediates brightness information, (2) figure-ground brightness illusions are narrowly tuned for orientation, (3) texture borders, without any luminance difference, are able to block the spread of brightness, and (4) edges and even- and odd-symmetric Gabors have a similar antagonistic effect on brightness. The narrow spatial frequency tuning suggests that only a subpopulation of neurons in V1 is involved in brightness perception. The independence of stimulus size and complexity indicates that the narrow tuning reflects hard-wired processing in the visual system. Further, it seems that figure-ground segregation and mechanisms integrating contrast polarities are closely related to the low-level mechanisms of brightness perception.
In conclusion, the results of the thesis suggest that a subpopulation of neurons in visual cortex selectively integrates information from different contrast polarities to reconstruct surface brightness.
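For readers unfamiliar with the even- and odd-symmetric Gabors mentioned above, the following is a minimal one-dimensional sketch. The function name `gabor` and the frequency and envelope values are illustrative assumptions and are not the stimuli used in the thesis. An even-symmetric Gabor uses a cosine carrier and an odd-symmetric one a sine carrier, both under the same Gaussian envelope.

```python
import math

def gabor(x, freq, sigma, odd=False):
    """1D Gabor: Gaussian envelope times a cosine (even) or sine (odd) carrier."""
    envelope = math.exp(-x * x / (2.0 * sigma * sigma))
    carrier = math.sin if odd else math.cos
    return envelope * carrier(2.0 * math.pi * freq * x)

# Sample both profiles on a grid that is symmetric around zero.
xs = [i / 10.0 for i in range(-20, 21)]
even_profile = [gabor(x, freq=0.5, sigma=0.8) for x in xs]
odd_profile = [gabor(x, freq=0.5, sigma=0.8, odd=True) for x in xs]
```

The even profile is mirror-symmetric about its centre, while the odd profile changes sign when mirrored, which is the phase difference the abstract refers to.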
Abstract:
Background: The preference amongst parents for heavier infants is in contrast to obesity prevention efforts worldwide. Parents are poor at identifying overweight in older children, but few studies have investigated maternal perception of weight status amongst toddlers and none in the Australian setting. Methods: Mothers (n = 290) completed a self-administered questionnaire at child age 12–16 months, defining their child's weight status as underweight, normal weight, somewhat overweight or very overweight. Weight-for-length z-score was derived from measured weight and length, and children were categorized as underweight, normal weight, at risk of overweight or obese (WHO standards). Objective classification was compared with maternal perception of weight status. Mean weight-for-length z-score was compared across categories of maternal perception using one-way ANOVA. Multinomial logistic regression was used to determine child or maternal characteristics associated with inaccurate weight perception. Results: Most children (83%) were perceived as normal weight. Twenty-nine were described as underweight, although none were. Sixty-six children were at risk of overweight, but 57 of these were perceived as normal weight. Of the 14 children who were overweight, only 4 were identified as somewhat overweight by their mother. Compared with mothers who could accurately classify their normal weight child, mothers who were older had higher odds of perceiving their normal weight child as underweight, while mothers with higher body mass index had slightly higher odds of describing their overweight/at risk child as normal weight. Conclusion: The leaner but healthy weight toddler was perceived as underweight, while only the heaviest children were recognized as overweight. Mothers unable to accurately identify children at risk are unlikely to act to prevent further excess weight gain.
Practitioners can lead a shift in attitudes towards weight in infants and young children, promoting routine growth monitoring and adequate but not rapid weight gain.
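The misclassification counts in the Results can be turned into proportions with a few lines of arithmetic. The sketch below uses only the counts stated in the abstract; the variable names are illustrative.

```python
# Counts reported in the abstract above.
at_risk_total = 66
at_risk_seen_as_normal = 57
overweight_total = 14
overweight_recognised = 4

share_at_risk_missed = at_risk_seen_as_normal / at_risk_total
share_overweight_recognised = overweight_recognised / overweight_total

print(f"At-risk children perceived as normal weight: {share_at_risk_missed:.0%}")
print(f"Overweight children recognised as overweight: {share_overweight_recognised:.0%}")
```

This works out to roughly 86% of at-risk children being perceived as normal weight and roughly 29% of overweight children being recognised as such.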
Abstract:
Scene understanding has been investigated mainly from a visual information point of view. Recently, depth has provided an extra wealth of information, allowing more geometric knowledge to be fused into scene understanding. Yet to form a holistic view, especially in robotic applications, one can create even more data by interacting with the world. In fact humans, when growing up, seem to heavily investigate the world around them by haptic exploration. We show an application of haptic exploration on a humanoid robot in cooperation with a learning method for object segmentation. The actions performed consecutively improve the segmentation of objects in the scene.
Abstract:
Human listeners can identify vowels regardless of speaker size, although the sound waves for an adult and a child speaking the ‘same’ vowel would differ enormously. The differences are mainly due to the differences in vocal tract length (VTL) and glottal pulse rate (GPR), which are both related to body size. Automatic speech recognition machines are notoriously bad at understanding children if they have been trained on the speech of an adult. In this paper, we propose that the auditory system adapts its analysis of speech sounds, dynamically and automatically, to the GPR and VTL of the speaker on a syllable-to-syllable basis. We illustrate how this rapid adaptation might be performed with the aid of a computational version of the auditory image model, and we propose that an auditory preprocessor of this form would improve the robustness of speech recognisers.
Abstract:
A fundamental task of vision systems is to infer the state of the world given some form of visual observations. From a computational perspective, this often involves facing an ill-posed problem; e.g., information is lost via projection of the 3D world into a 2D image. Solution of an ill-posed problem requires additional information, usually provided as a model of the underlying process. It is important that the model be both computationally feasible and theoretically well-founded. In this thesis, a probabilistic, nonlinear supervised computational learning model is proposed: the Specialized Mappings Architecture (SMA). The SMA framework is demonstrated in a computer vision system that can estimate the articulated pose parameters of a human body or human hands, given images obtained via one or more uncalibrated cameras. The SMA consists of several specialized forward mapping functions that are estimated automatically from training data, and a possibly known feedback function. Each specialized function maps certain domains of the input space (e.g., image features) onto the output space (e.g., articulated body parameters). A probabilistic model for the architecture is first formalized. Solutions to key algorithmic problems are then derived: simultaneous learning of the specialized domains along with the mapping functions, as well as performing inference given inputs and a feedback function. The SMA employs a variant of the Expectation-Maximization algorithm and approximate inference. The approach allows the use of alternative conditional independence assumptions for learning and inference, which are derived from a forward model and a feedback model. Experimental validation of the proposed approach is conducted in the task of estimating articulated body pose from image silhouettes. Accuracy and stability of the SMA framework are tested using artificial data sets, as well as synthetic and real video sequences of human bodies and hands.
Abstract:
A non-linear supervised learning architecture, the Specialized Mapping Architecture (SMA), and its application to articulated body pose reconstruction from single monocular images are described. The architecture is formed by a number of specialized mapping functions, each of them with the purpose of mapping certain portions (connected or not) of the input space, and a feedback matching process. A probabilistic model for the architecture is described along with a mechanism for learning its parameters. The learning problem is approached using a maximum likelihood estimation framework; we present Expectation Maximization (EM) algorithms for two different instances of the likelihood probability. Performance is characterized by estimating human body postures from low-level visual features, showing promising results.
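The idea of learning several specialized mappings jointly with EM can be sketched on a toy problem. The code below is not the SMA itself: it swaps in a plain mixture of one-dimensional linear regressors, omits the feedback function, and uses synthetic data. The function name `em_linear_experts`, the initialisation, and all parameter values are assumptions made for illustration.

```python
import math
import random

def em_linear_experts(xs, ys, k=2, iters=30):
    """EM for a mixture of k linear maps y ~ a*x + b with Gaussian noise."""
    n = len(xs)
    lo, hi = min(ys), max(ys)
    a = [0.0] * k
    b = [lo + (j + 0.5) * (hi - lo) / k for j in range(k)]  # spread intercepts
    mean_y = sum(ys) / n
    var = [sum((y - mean_y) ** 2 for y in ys) / n] * k
    pi = [1.0 / k] * k
    history = []  # log-likelihood before each M-step
    for _ in range(iters):
        # E-step: responsibility of each expert for each sample.
        resp, loglik = [], 0.0
        for x, y in zip(xs, ys):
            p = [pi[j] * math.exp(-(y - a[j] * x - b[j]) ** 2 / (2 * var[j]))
                 / math.sqrt(2 * math.pi * var[j]) for j in range(k)]
            total = sum(p)
            loglik += math.log(total)
            resp.append([pj / total for pj in p])
        history.append(loglik)
        # M-step: responsibility-weighted least squares for each expert.
        for j in range(k):
            w = [r[j] for r in resp]
            sw = sum(w)
            mx = sum(wi * x for wi, x in zip(w, xs)) / sw
            my = sum(wi * y for wi, y in zip(w, ys)) / sw
            sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
            sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
            a[j] = sxy / sxx
            b[j] = my - a[j] * mx
            res = sum(wi * (y - a[j] * x - b[j]) ** 2
                      for wi, x, y in zip(w, xs, ys))
            var[j] = max(res / sw, 1e-6)  # floor guards against collapse
            pi[j] = sw / n
    return a, b, pi, history

# Synthetic data: two interleaved linear regimes plus small Gaussian noise.
rng = random.Random(1)
xs = [i / 20.0 - 1.0 for i in range(41)]
ys = [(2 * x + 3 if i % 2 == 0 else -2 * x - 3) + rng.gauss(0, 0.1)
      for i, x in enumerate(xs)]
a, b, pi, history = em_linear_experts(xs, ys)
```

Each expert ends up responsible for a portion of the data, which loosely mirrors how the specialized functions are described as covering different domains of the input space.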
Abstract:
How do visual form and motion processes cooperate to compute object motion when each process separately is insufficient? A 3D FORMOTION model specifies how 3D boundary representations, which separate figures from backgrounds within cortical area V2, capture motion signals at the appropriate depths in MT; how motion signals in MT disambiguate boundaries in V2 via MT-to-V1-to-V2 feedback; how sparse feature tracking signals are amplified; and how a spatially anisotropic motion grouping process propagates across perceptual space via MT-MST feedback to integrate feature-tracking and ambiguous motion signals to determine a global object motion percept. Simulated data include: the degree of motion coherence of rotating shapes observed through apertures, the coherent vs. element motion percepts separated in depth during the chopsticks illusion, and the rigid vs. non-rigid appearance of rotating ellipses.
Abstract:
How do visual form and motion processes cooperate to compute object motion when each process separately is insufficient? Consider, for example, a deer moving behind a bush. Here the partially occluded fragments of motion signals available to an observer must be coherently grouped into the motion of a single object. A 3D FORMOTION model comprises five important functional interactions involving the brain’s form and motion systems that address such situations. Because the model’s stages are analogous to areas of the primate visual system, we refer to the stages by corresponding anatomical names. In one of these functional interactions, 3D boundary representations, in which figures are separated from their backgrounds, are formed in cortical area V2. These depth-selective V2 boundaries select motion signals at the appropriate depths in MT via V2-to-MT signals. In another, motion signals in MT disambiguate locally incomplete or ambiguous boundary signals in V2 via MT-to-V1-to-V2 feedback. The third functional property concerns resolution of the aperture problem along straight moving contours by propagating the influence of unambiguous motion signals generated at contour terminators or corners. Here, sparse “feature tracking signals” from, e.g., line ends, are amplified to overwhelm numerically superior ambiguous motion signals along line segment interiors. In the fourth, a spatially anisotropic motion grouping process takes place across perceptual space via MT-MST feedback to integrate veridical feature-tracking and ambiguous motion signals to determine a global object motion percept. The fifth property uses the MT-MST feedback loop to convey an attentional priming signal from higher brain areas back to V1 and V2. The model's use of mechanisms such as divisive normalization, endstopping, cross-orientation inhibition, and long-range cooperation is described.
Simulated data include: the degree of motion coherence of rotating shapes observed through apertures, the coherent vs. element motion percepts separated in depth during the chopsticks illusion, and the rigid vs. non-rigid appearance of rotating ellipses.
Abstract:
During lateral leg raising, a synergistic inclination of the supporting leg and trunk in the opposite direction to the leg movement is performed in order to preserve equilibrium. As first hypothesized by Pagano and Turvey (J Exp Psychol Hum Percept Perform, 1995, 21:1070-1087), the perception of limb orientation could be based on the orientation of the limb's inertia tensor. The purpose of this study was thus to explore whether the final upper body orientation (trunk inclination relative to vertical) depends on changes in the trunk inertia tensor. We imposed a loading condition, with a total mass of 4 kg added to the subject's trunk in either a symmetrical or asymmetrical configuration. This changed the orientation of the trunk inertia tensor while keeping the total trunk mass constant. In order to separate any effects of the inertia tensor from the effects of gravitational torque, the experiment was carried out in normo- and microgravity. The results indicated that in normogravity the same final upper body orientation was maintained irrespective of the loading condition. In microgravity, regardless of loading conditions, the same (but different from the normogravity) orientation of the upper body was achieved through different joint organizations: two joints (the hip and ankle joints of the supporting leg) in the asymmetrical loading condition, and one (hip) in the symmetrical loading condition. In order to determine whether the different orientations of the inertia tensor were perceived during the movement, the interjoint coordination was quantified by performing a principal components analysis (PCA) on the supporting and moving hips and on the supporting ankle joints. It was expected that different loading conditions would modify the principal component of the PCA. In normogravity, asymmetrical loading decreased the coupling between joints, while in microgravity a strong coupling was preserved whatever the loading condition.
It was concluded that the trunk inertia tensor did not play a role during the lateral leg raising task because in spite of the absence of gravitational torque the final upper body orientation and the interjoint coupling were not influenced.
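One common way to read interjoint coupling from a PCA is the share of total variance captured by the first principal component. The sketch below assumes that reading, which the abstract does not spell out, and uses synthetic joint-angle traces rather than the study's recordings. The function name `first_pc_share` is hypothetical, and the largest eigenvalue is obtained by simple power iteration.

```python
import math

def first_pc_share(series):
    """Fraction of total variance captured by the first principal component."""
    d, n = len(series), len(series[0])
    centred = [[v - sum(s) / n for v in s] for s in series]
    cov = [[sum(centred[i][t] * centred[j][t] for t in range(n)) / (n - 1)
            for j in range(d)] for i in range(d)]
    # Power iteration for the largest eigenvalue of the covariance matrix.
    # Assumes the all-ones start vector is not orthogonal to the first PC.
    v = [1.0] * d
    for _ in range(200):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return lam / sum(cov[i][i] for i in range(d))

ts = range(100)
# Synthetic traces: three joints driven by one common signal (tight coupling).
coupled = [[math.sin(t / 10.0) for t in ts],
           [2.0 * math.sin(t / 10.0) for t in ts],
           [-0.5 * math.sin(t / 10.0) for t in ts]]
# Synthetic traces: three joints moving at unrelated rhythms (loose coupling).
uncoupled = [[math.sin(t / 10.0) for t in ts],
             [math.cos(t / 3.0) for t in ts],
             [math.sin(t / 1.7) for t in ts]]

coupled_share = first_pc_share(coupled)
uncoupled_share = first_pc_share(uncoupled)
```

A share close to 1 indicates that the joints move as a single unit, while a lower share indicates weaker coupling.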
Propagation and antennas considerations for internetworking BANs to form body-to-body networks (BBN)
Abstract:
Despite its importance in social interactions, laughter remains little studied in affective computing. Intelligent virtual agents are often blind to users’ laughter and unable to produce convincing laughter themselves. Respiratory, auditory, and facial laughter signals have been investigated but laughter-related body movements have received less attention. The aim of this study is threefold. First, to probe human laughter perception by analyzing patterns of categorisations of natural laughter animated on a minimal avatar. Results reveal that a low-dimensional space can describe perception of laughter “types”. Second, to investigate observers’ perception of laughter (hilarious, social, awkward, fake, and non-laughter) based on animated avatars generated from natural and acted motion-capture data. Significant differences in torso and limb movements are found between animations perceived as laughter and those perceived as non-laughter. Hilarious laughter also differs from social laughter. Different body movement features were indicative of laughter in sitting and standing avatar postures. Third, to investigate automatic recognition of laughter to the same level of certainty as observers’ perceptions. Results show that recognition rates of the Random Forest model approach human rating levels. Classification comparisons and feature importance analyses indicate an improvement in recognition of social laughter when localized features and nonlinear models are used.
Abstract:
Laughter is a ubiquitous social signal in human interactions, yet it remains understudied from a scientific point of view. The need to understand laughter and its role in human interactions has become more pressing as the ability to create conversational agents capable of interacting with humans has come closer to reality. This paper reports on three aspects of the human perception of laughter when context has been removed and only the body information from the laughter episode remains. We report on the ability to categorise the laugh type and the sex of the laugher; the relationship between personality factors and laughter categorisation and perception; and finally the importance of intensity in the perception and categorisation of laughter.
More than just a problem with faces: Altered body perception in a group of congenital prosopagnosics
Abstract:
It has been estimated that one out of forty people in the general population suffers from congenital prosopagnosia (CP), a neurodevelopmental disorder characterized by difficulty identifying people by their faces. CP involves impairment in recognising faces, although the perception of non-face stimuli may also be impaired. Given that social interaction depends not only on face processing but also on the processing of bodies, it is of theoretical importance to ascertain whether CP is also characterised by body perception impairments. Here, we tested eleven CPs and eleven matched control participants on the Body Identity Recognition Task (BIRT), a forced-choice match-to-sample task, using stimuli that require processing of body-specific, not clothing-specific, features. Results indicated that the group of CPs was as accurate as controls on the BIRT, which is in line with the lack of body perception complaints by CPs. However, the CPs were slower than controls, and when accuracy and response times were combined into inverse efficiency scores (IES), the group of CPs was impaired, suggesting that the CPs could be using more effortful cognitive mechanisms to be as accurate as controls. In conclusion, our findings demonstrate that CP may not generally be limited to face processing difficulties, but may also extend to body perception.
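The inverse efficiency score mentioned above is conventionally computed as mean response time divided by the proportion of correct responses, so that equal accuracy with slower responses yields a worse (higher) score. The sketch below assumes that conventional definition. The response times and accuracies are made-up illustrative values, not results from the study, and `inverse_efficiency` is a hypothetical helper name.

```python
def inverse_efficiency(mean_rt_ms, proportion_correct):
    """Inverse efficiency score: mean response time / proportion correct."""
    if not 0.0 < proportion_correct <= 1.0:
        raise ValueError("proportion_correct must be in (0, 1]")
    return mean_rt_ms / proportion_correct

# Hypothetical groups with equal accuracy but different speed.
control_ies = inverse_efficiency(mean_rt_ms=1200.0, proportion_correct=0.90)
slower_ies = inverse_efficiency(mean_rt_ms=1600.0, proportion_correct=0.90)

print(f"control IES: {control_ies:.1f} ms")
print(f"slower group IES: {slower_ies:.1f} ms")
```

With accuracy held equal, the slower group receives the higher score, which is how a speed difference alone can produce a group difference in IES.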