885 resultados para SIFT,Computer Vision,Python,Object Recognition,Feature Detection,Descriptor Computation


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Automatic gender classification has many security and commercial applications. Various modalities have been investigated for gender classification with face-based classification being the most popular. In some real-world scenarios the face may be partially occluded. In these circumstances a classification based on individual parts of the face known as local features must be adopted. We investigate gender classification using lip movements. We show for the first time that important gender specific information can be obtained from the way in which a person moves their lips during speech. Furthermore our study indicates that the lip dynamics during speech provide greater gender discriminative information than simply lip appearance. We also show that the lip dynamics and appearance contain complementary gender information such that a model which captures both traits gives the highest overall classification result. We use Discrete Cosine Transform based features and Gaussian Mixture Modelling to model lip appearance and dynamics and employ the XM2VTS database for our experiments. Our experiments show that a model which captures lip dynamics along with appearance can improve gender classification rates by between 16-21% compared to models of only lip appearance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Laughter is a frequently occurring social signal and an important part of human non-verbal communication. However it is often overlooked as a serious topic of scientific study. While the lack of research in this area is mostly due to laughter’s non-serious nature, it is also a particularly difficult social signal to produce on demand in a convincing manner; thus making it a difficult topic for study in laboratory settings. In this paper we provide some techniques and guidance for inducing both hilarious laughter and conversational laughter. These techniques were devised with the goal of capturing mo- tion information related to laughter while the person laughing was either standing or seated. Comments on the value of each of the techniques and general guidance as to the importance of atmosphere, environment and social setting are provided.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The OSCAR test, a clinical device that uses counterphase flicker photometry, is believed to be sensitive to the relative numbers of long-wavelength and middle-wavelength cones in the retina, as well as to individual variations in the spectral positions of the photopigments. As part of a population study of individual variations in perception, we obtained OSCAR settings from 1058 participants. We report the distribution characteristics for this cohort. A randomly selected subset of participants was tested twice at an interval of at least one week: the test-retest reliability (Spearman's rho) was 0.80. In a whole-genome association analysis we found a provisional association with a single nucleotide polymorphism (rs16844995). This marker is close to the gene RXRG, which encodes a nuclear receptor, retinoid X receptor γ. This nuclear receptor is already known to have a role in the differentiation of cones during the development of the eye, and we suggest that polymorphisms in or close to RXRG influence the relative probability with which long-wave and middle-wave opsin genes are expressed in human cones.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a new type of Flexible Macroblock Ordering (FMO) type for the H.264 Advanced Video Coding (AVC) standard, which can more efficiently flag the position and shape of regions of interest (ROIs) in each frame. In H.264/AVC, 7 types of FMO have been defined, all of which are designed for error resilience. Most previous work related to ROI processing has adopted Type-2 (foreground & background), or Type-6 (explicit), to flag the position and shape of the ROI. However, only rectangular shapes are allowed in Type-2 and for non-rectangular shapes, the non-ROI macroblocks may be wrongly flagged as being within the ROI, which could seriously affect subsequent processing of the ROI. In Type-6, each macroblock in a frame uses fixed-length bits to indicate to its slice group. In general, each ROI is assigned to one slice group identity. Although this FMO type can more accurately flag the position and shape of the ROI, it incurs a significant bitrate overhead. The proposed new FMO type uses the smallest rectangle that covers the ROI to indicate its position and a spiral binary mask is employed within the rectangle to indicate the shape of the ROI. This technique can accurately flag the ROI and provide significantly savings in the bitrate overhead. Compared with Type-6, an 80% to 90% reduction in the bitrate overhead can be obtained while achieving the same accuracy.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In order to use virtual reality as a sport analysis tool, we need to be sure that an immersed athlete reacts realistically in a virtual environment. This has been validated for a real handball goalkeeper facing a virtual thrower. However, we currently ignore which visual variables induce a realistic motor behavior of the immersed handball goalkeeper. In this study, we used virtual reality to dissociate the visual information related to the movements of the player from the visual information related to the trajectory of the ball. Thus, the aim is to evaluate the relative influence of these different visual information sources on the goalkeeper's motor behavior. We tested 10 handball goalkeepers who had to predict the final position of the virtual ball in the goal when facing the following: only the throwing action of the attacking player (TA condition), only the resulting ball trajectory (BA condition), and both the throwing action of the attacking player and the resulting ball trajectory (TB condition). Here we show that performance was better in the BA and TB conditions, but contrary to expectations, performance was substantially worse in the TA condition. A significant effect of ball landing zone does, however, suggest that the relative importance between visual information from the player and the ball depends on the targeted zone in the goal. In some cases, body-based cues embedded in the throwing actions may have a minor influence on the ball trajectory and vice versa. Kinematics analysis was then combined with these results to determine why such differences occur depending on the ball landing zone and consequently how it can clarify the role of different sources of visual information on the motor behavior of an athlete immersed in a virtual environment.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Ce mémoire est composé de trois articles et présente les résultats de travaux de recherche effectués dans le but d'améliorer les techniques actuelles permettant d'utiliser des données associées à certaines tâches dans le but d'aider à l'entraînement de réseaux de neurones sur une tâche différente. Les deux premiers articles présentent de nouveaux ensembles de données créés pour permettre une meilleure évaluation de ce type de techniques d'apprentissage machine. Le premier article introduit une suite d'ensembles de données pour la tâche de reconnaissance automatique de chiffres écrits à la main. Ces ensembles de données ont été générés à partir d'un ensemble de données déjà existant, MNIST, auquel des nouveaux facteurs de variation ont été ajoutés. Le deuxième article introduit un ensemble de données pour la tâche de reconnaissance automatique d'expressions faciales. Cet ensemble de données est composé d'images de visages qui ont été collectées automatiquement à partir du Web et ensuite étiquetées. Le troisième et dernier article présente deux nouvelles approches, dans le contexte de l'apprentissage multi-tâches, pour tirer avantage de données pour une tâche donnée afin d'améliorer les performances d'un modèle sur une tâche différente. La première approche est une généralisation des neurones Maxout récemment proposées alors que la deuxième consiste en l'application dans un contexte supervisé d'une technique permettant d'inciter des neurones à apprendre des fonctions orthogonales, à l'origine proposée pour utilisation dans un contexte semi-supervisé.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Catadioptric sensors are combinations of mirrors and lenses made in order to obtain a wide field of view. In this paper we propose a new sensor that has omnidirectional viewing ability and it also provides depth information about the nearby surrounding. The sensor is based on a conventional camera coupled with a laser emitter and two hyperbolic mirrors. Mathematical formulation and precise specifications of the intrinsic and extrinsic parameters of the sensor are discussed. Our approach overcomes limitations of the existing omni-directional sensors and eventually leads to reduced costs of production