941 results for Visual Speaker Recognition, Visual Speech Recognition, Cascading Appearance-Based Features


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The goal of the current study was to compare the quality of esophageal speech and voice to videofluoroscopic features of the esophagus and pharyngoesophageal (PE) segment. The speech and voice characteristics of 30 laryngectomized patients were rated by 5 speech-language pathologists. Based on these ratings, patients were divided into 3 categories: fluent (n = 9), moderately fluent (n = 10) and nonfluent (n = 11). Videofluoroscopy of the PE region was then performed during both swallowing and voice production. An insufflation test and percutaneous pharyngeal plexus block were required in 9 patients to determine the etiology of poor esophageal voice production. The strongest videofluoroscopic indicators of nonfluent speakers were: (1) small or absent air reservoir and (2) lack of a vibrating PE segment. Fluent speakers presented with shorter PE segments (1.17 mm) compared to moderately fluent speakers (17.1-29.9 mm). Perceptually, fluent speakers presented with a predominantly rough vocal quality. In contrast, moderately fluent speakers presented with a tense quality. In addition, stoma blast noise was reduced in fluent speakers. Videofluoroscopic findings highly correlated with the quality of esophageal speech. Copyright (C) 2009 S. Karger AG, Basel

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Additional neurological features have recently been described in seven families transmitting pathogenic mutations in OPA1, the most common cause of autosomal dominant optic atrophy. However, the frequency of these syndromal 'dominant optic atrophy plus' variants and the extent of neurological involvement have not been established. In this large multi-centre study of 104 patients from 45 independent families, including 60 new cases, we show that extra-ocular neurological complications are common in OPA1 disease, and affect up to 20% of all mutational carriers. Bilateral sensorineural deafness beginning in late childhood and early adulthood was a prominent manifestation, followed by a combination of ataxia, myopathy, peripheral neuropathy and progressive external ophthalmoplegia from the third decade of life onwards. We also identified novel clinical presentations with spastic paraparesis mimicking hereditary spastic paraplegia, and a multiple sclerosis-like illness. In contrast to initial reports, multi-system neurological disease was associated with all mutational subtypes, although there was an increased risk with missense mutations (odds ratio = 3.06, 95% confidence interval = 1.44-6.49; P = 0.0027), and mutations located within the guanosine triphosphatase region (odds ratio = 2.29, 95% confidence interval = 1.08-4.82; P = 0.0271). Histochemical and molecular characterization of skeletal muscle biopsies revealed the presence of cytochrome c oxidase-deficient fibres and multiple mitochondrial DNA deletions in the majority of patients harbouring OPA1 mutations, even in those with isolated optic nerve involvement. However, the cytochrome c oxidase-deficient load was over four times higher in the dominant optic atrophy + group compared to the pure optic neuropathy group, implicating a causal role for these secondary mitochondrial DNA defects in disease pathophysiology.
Individuals with dominant optic atrophy plus phenotypes also had significantly worse visual outcomes, and careful surveillance is therefore mandatory to optimize the detection and management of neurological disability in a group of patients who already have significant visual impairment.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Emotions play a central role in our daily lives, influencing the way we think and act, our health and our sense of well-being, and film is par excellence the art form that exploits our affective, perceptual and intellectual activity, holding the potential for a significant impact. Video is becoming a dominant and pervasive medium, and online video a growing entertainment activity on the web and iTV, mainly due to technological developments and the trend towards media convergence. In addition, improved techniques for gathering emotional information about videos, whether through content analysis or through implicit user feedback from physiological signals complemented by manual labeling from users, are revealing new ways of exploring emotional information in videos, films or TV series, and bring out new perspectives for enriching and personalizing video access. In this work, we reflect on the power that emotions have in our lives, on the emotional impact of movies, and on how to address this emotional dimension in the way we classify and access movies, by exploring and evaluating the design of iFelt in its different ways to classify, access, browse and visualize movies based on their emotional impact.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We live in an era of ever-increasing technological advances across many fields. What was considered practically impossible a few years ago has, in many cases, already become reality. We all use technologies such as the Internet, smartphones and GPS devices as a matter of course. This proliferation of technology has allowed both ordinary citizens and organizations to use it in increasingly creative and simple ways. Moreover, new businesses and startups emerge every day, which demonstrates the dynamism this growth has brought to the industry. This dissertation focuses on two fast-growing areas, Facial Recognition and Business Intelligence (BI), and on combining the two in order to create a new module for an existing product. Since these are two distinct areas, each is first studied separately. Business Intelligence is aimed at organizations and deals with collecting information about a company's business and subsequently analysing it. Its main purpose is to support the decision-making process of the analysts and managers of these organizations. Facial Recognition, in turn, is more present in society. Having first appeared in science fiction, the technology is implemented by more and more companies and has evolved over the years to the point of being used by end consumers, for example in smartphones. Its applications are therefore quite diverse, ranging from security solutions to simple entertainment. Both areas are studied through a survey of publications by authors in the respective field.
From usage scenarios to more specific aspects of each area, this knowledge is conveyed to the reader to give a better understanding of the aspects relating to the development of the solution. With the study of the two areas complete, the problem is placed in the context of the company's field of activity, together with the possible approaches. The whole analysis and design process is also described, as is the development itself, from a more technical perspective, of the implemented solution. Finally, some examples of results obtained after the solution was implemented are presented.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Hand gesture recognition (HGR) is currently an important field of research because of the variety of situations in which it is necessary to communicate through signs, such as communication between people who use sign language and those who do not. This project presents a real-time hand gesture recognition method using the Kinect sensor for Microsoft Xbox, implemented in a Linux (Ubuntu) environment in the Python programming language and using the OpenCV computer vision library to process the data on a conventional laptop. Because the Kinect sensor can capture depth data from a scene, the positions and trajectories of objects can be determined in three dimensions, which makes possible a complete real-time analysis of an image or a sequence of images. The proposed recognition procedure is based on segmenting the image so as to work only with the hand, then detecting the contours and obtaining the convex hull and the convexity defects, which are finally used to determine the number of fingers and arrive at the interpretation of the gesture; the final result is the transcription of its meaning in a window that serves as the interface with the interlocutor. The application can recognize the numbers from 0 to 5 (since only one hand is analysed), some popular gestures and some of the letters of the fingerspelling alphabet of Catalan Sign Language. The project is thus the gateway to the field of gesture recognition and the basis for a future sign language recognition system capable of transcribing both dynamic signs and the fingerspelling alphabet.
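The hull-and-defects step of the procedure can be illustrated with a minimal sketch. It is not the project's code: the project uses OpenCV (which provides `cv2.convexHull` and `cv2.convexityDefects` for this purpose), whereas the version below is plain Python so that it runs without dependencies. The function names, the `min_depth` threshold, the synthetic `hand` contour and the "defects + 1" finger rule are all assumptions made for illustration, and the contour is assumed to be a simple polygon with distinct vertices.

```python
import math

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; collinear points are dropped from the hull."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def convexity_defects(contour):
    """For each hull edge, return the contour point lying deepest inside it."""
    hull = set(convex_hull(contour))
    if len(hull) < 3:
        return []
    hull_idx = [i for i, p in enumerate(contour) if p in hull]
    n = len(contour)
    defects = []
    for k, i in enumerate(hull_idx):
        j = hull_idx[(k + 1) % len(hull_idx)]
        a, b = contour[i], contour[j]
        edge_len = math.hypot(b[0] - a[0], b[1] - a[1])
        deepest, depth = None, 0.0
        idx = (i + 1) % n
        while idx != j:  # walk the contour between two hull vertices
            d = abs(cross(a, b, contour[idx])) / edge_len
            if d > depth:
                deepest, depth = contour[idx], d
            idx = (idx + 1) % n
        if deepest is not None:
            defects.append((a, b, deepest, depth))
    return defects

def count_fingers(contour, min_depth):
    """Heuristic (assumed): n deep valleys between fingers imply n + 1 fingers."""
    deep = [d for d in convexity_defects(contour) if d[3] >= min_depth]
    return len(deep) + 1 if deep else 0

# Synthetic contour: a palm with three pointed fingers and two valleys.
hand = [(0, 0), (10, 0), (10, 3), (9, 9), (7, 4), (5, 11), (3, 4), (1, 9), (0, 3)]
print(count_fingers(hand, min_depth=2.0))
```

Note that this heuristic alone cannot tell a closed fist from a single raised finger, since neither produces a deep defect; a complete recognizer would need an additional cue for that case.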

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present a method to detect patterns in defocused scenes by means of a joint transform correlator. We describe analytically the correlation plane, and we also introduce an original procedure to recognize the target by postprocessing the correlation plane. The performance of the methodology when the defocused images are corrupted by additive noise is also considered.
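As a rough illustration of the joint transform correlator principle only, the sketch below works in one dimension: the reference and the scene are placed side by side in a joint input, the joint power spectrum is computed, and a second Fourier transform yields the correlation plane, whose off-centre peak sits at the separation between the two patterns. The defocus is modelled here by a small blur kernel, which is an assumption; the analytical description and the postprocessing procedure of the paper are not reproduced. All names and sample values are hypothetical.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2)), adequate for short signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def blur(signal, kernel=(0.25, 0.5, 0.25)):
    """Full convolution with a small kernel; a crude stand-in for defocus."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def jtc_correlation(reference, scene, n, ref_pos, scene_pos):
    """Correlation plane of a 1-D joint transform correlator."""
    joint = [0.0] * n
    for i, v in enumerate(reference):
        joint[ref_pos + i] += v
    for i, v in enumerate(scene):
        joint[scene_pos + i] += v
    power = [abs(s) ** 2 for s in dft(joint)]      # joint power spectrum
    return [c.real for c in dft(power, inverse=True)]

def cross_peak(corr, min_lag):
    """Lag of the strongest value outside the central autocorrelation terms."""
    return max(range(min_lag, len(corr) // 2 + 1), key=lambda lag: corr[lag])

pattern = [1.0, 3.0, 2.0]
sharp = jtc_correlation(pattern, pattern, n=64, ref_pos=5, scene_pos=25)
# The blurred target is one sample wider on each side, so it starts at 24.
defocused = jtc_correlation(pattern, blur(pattern), n=64, ref_pos=5, scene_pos=24)
print(cross_peak(sharp, min_lag=5), cross_peak(defocused, min_lag=5))
```

In this toy setting the cross-correlation peak stays at the same lag when the target is blurred, but its height drops, which is the kind of degradation a recognition criterion has to tolerate.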

Relevance:

100.00% 100.00%

Publisher:

Abstract:

OBJECTIVES: To describe the clinical features of idiopathic chiasmal neuritis in a large cohort of patients and to report their visual and neurologic outcomes. DESIGN: A retrospective medical record review of consecutive patients with chiasmal neuritis at a single institution. Patients with clinical or radiographic evidence of inflammation involving the intraorbital optic nerve and patients with a systemic inflammatory or neoplastic disorder were excluded. RESULTS: Twenty patients were identified (14 female, 6 male; mean age, 37 years). Visual acuity at initial examination ranged from 20/15 to light perception. Progressive visual loss beyond 1 month was documented in 1 patient. Twelve of 15 patients who underwent magnetic resonance imaging demonstrated chiasmal enlargement and/or enhancement; 6 patients had 1 or more white matter lesions. Follow-up time ranged from 2 weeks to 22 years, with a mean of 5.7 years. The final median visual acuity was 20/20 (range, 20/15-20/50) and visual fields were normal or improved. Of 15 patients with a minimum follow-up interval of 1 year, 6 developed multiple sclerosis. CONCLUSIONS: The demographic and clinical features of idiopathic chiasmal neuritis resemble those of idiopathic optic neuritis. Visual prognosis is excellent. In this series, 40% of patients subsequently developed multiple sclerosis.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The usage of digital content, such as video clips and images, has increased dramatically during the last decade. Local image features have been applied increasingly in various image and video retrieval applications. This thesis evaluates local features and applies them to image and video processing tasks. The results of the study show that 1) the performance of different local feature detector and descriptor methods varies significantly in object class matching, 2) local features can be applied in image alignment with results superior to the state of the art, 3) the local feature based shot boundary detection method produces promising results, and 4) the local feature based hierarchical video summarization method shows a promising new research direction. In conclusion, this thesis presents local features as a powerful tool in many applications, and immediate future work should concentrate on improving the quality of the local features.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Feature extraction is the part of pattern recognition in which the sensor data is transformed into a form more suitable for the machine to interpret. The purpose of this step is also to reduce the amount of information passed to the next stages of the system while preserving the information essential for discriminating the data into different classes. For instance, in image analysis the actual image intensities are vulnerable to various environmental effects, such as lighting changes, and feature extraction can be used as a means of detecting features that are invariant to certain types of illumination changes. Finally, classification tries to make decisions based on the previously transformed data. The main focus of this thesis is on developing new methods for embedded feature extraction based on local non-parametric image descriptors. Feature analysis is also carried out for the selected image features. Low-level Local Binary Pattern (LBP) based features play a main role in the analysis. In the embedded domain, the pattern recognition system must usually meet strict performance constraints, such as high speed, compact size and low power consumption. The characteristics of the final system can be seen as a trade-off between these metrics, which is largely affected by the decisions made during the implementation phase. The implementation alternatives of LBP based feature extraction are explored in the embedded domain in the context of focal-plane vision processors. In particular, the thesis demonstrates LBP extraction with the MIPA4k massively parallel focal-plane processor IC. Higher-level processing is also incorporated, by means of a framework for implementing a single-chip face recognition system. Furthermore, a new method for determining optical flow based on LBPs, designed in particular for the embedded domain, is presented.
Inspired by some of the principles observed through the feature analysis of the Local Binary Patterns, an extension to the well-known non-parametric rank transform is proposed, and its performance is evaluated in face recognition experiments with a standard dataset. Finally, an a priori model where the LBPs are seen as combinations of n-tuples is also presented.
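To make the two descriptors named above concrete, here is a minimal sketch of the basic 3x3 Local Binary Pattern and the basic rank transform. It is a textbook-style illustration, not the thesis's MIPA4k implementation, and it does not reproduce the proposed rank transform extension. The bit ordering (clockwise from the top-left neighbour), the `>=` threshold convention and the sample image are assumptions.

```python
# Eight neighbours of a pixel, clockwise from the top-left (assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(img, r, c):
    """8-bit LBP: one bit per neighbour, set when neighbour >= centre."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def rank_transform(img, r, c):
    """Rank transform: number of neighbours strictly darker than the centre."""
    center = img[r][c]
    return sum(1 for dr, dc in OFFSETS if img[r + dr][c + dc] < center)

def lbp_image(img):
    """LBP codes for every interior pixel (the border is skipped)."""
    return [[lbp_code(img, r, c) for c in range(1, len(img[0]) - 1)]
            for r in range(1, len(img) - 1)]

img = [[5, 9, 1, 4],
       [4, 6, 7, 2],
       [2, 3, 8, 6],
       [7, 1, 5, 3]]

# A monotonically increasing intensity change (gain 2, offset 50) models a
# global illumination change; the codes depend only on pixel ordering.
brighter = [[2 * v + 50 for v in row] for row in img]

print(lbp_code(img, 1, 1), rank_transform(img, 1, 1))
print(lbp_image(img) == lbp_image(brighter))
```

With these conventions the rank value equals eight minus the number of set bits in the LBP code, which shows how closely the two non-parametric descriptors are related: both discard absolute intensities and keep only local ordering.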

Relevance:

100.00% 100.00%

Publisher:

Abstract:

One of the various functions of proteins in biological systems is the transport of small molecules. For this purpose, proteins have naturally evolved special mechanisms to allow both ligand binding and its subsequent release to a target site, a process fundamental to many biological processes. Transport of vitamin E (α-tocopherol), a lipid-soluble antioxidant, to membranes helps protect polyunsaturated fatty acids against peroxidative damage. In this research, the ligand binding characteristics of several members of the CRAL-TRIO family of lipid binding proteins were examined: the recombinant human α-Tocopherol Transfer Protein (α-TTP), Supernatant Protein Factor (SPF)/Tocopherol Associated Protein (TAP), Cellular Retinaldehyde Binding Protein (CRALBP) and the phosphatidylinositol transfer protein from S. cerevisiae, Sec14p. Recombinant Sec14p was expressed and purified from E. coli for comparison of tocopherol binding to the two other recombinant proteins postulated to traffic α-tocopherol. Competitive binding assays using [3H]-α-tocopherol and Lipidex-1000 resin allowed determination of the dissociation constants (Kd) of the CRAL-TRIO proteins for α-tocopherol and ~20 hydrophobic ligands, for evaluation of the possible biological relevance of the binding interactions observed. The Kd values (nM) for RRR-α-tocopherol are: α-TTP: 25.0, Sec14p: 373, CRALBP: 528 and SPF/TAP: 615. This indicates that all proteins recognize tocopherol but not with the same affinity. Sec14p bound its native ligand PI with a Kd of 381, whereas SPF/TAP bound PI (216) and γ-tocopherol (268) similarly, in contrast to the preferential binding of RRR-α-tocopherol by α-TTP.
Efforts to adequately represent biologically active SPF/TAP involved investigation of tocopherol binding for several different recombinant proteins derived from different constructs and in the presence of different potential modulators (Ca2+, Mg2+, GTP and GDP); none of these conditions enhanced or inhibited α-tocopherol binding to SPF. This work suggests that only α-TTP serves as the physiological mediator of α-tocopherol, yet structural homology between proteins allows common recognition of similar ligand features. In addition, several photo-affinity analogs of α-tocopherol were evaluated for their potential utility in further elucidation of α-TTP function or identification of novel tocopherol binding proteins.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The grouping of neurons with similar properties gives rise to modules that optimize the analysis of information. As a consequence, functional maps are present in the primary visual cortex of some mammals for many parameters, such as orientation, direction of motion and stimulus position (visuotopy). The first part of this thesis characterizes the modular organization in the primary visual cortex for a fundamental parameter, centre/surround suppression, and beyond the primary visual cortex (in area 21a) for orientation and direction. All studies were carried out using optical imaging of intrinsic signals in the visual cortex of the anaesthetized cat. Quantifying the modulation by stimulus size revealed modules of strong and weak surround suppression in the primary visual cortex (areas 17 and 18). This type of organization had previously been observed only in a hierarchically higher area in the primate. A modular organization for orientation, similar to that observed in the primary visual cortex, was revealed in area 21a. In contrast to area 18, however, area 21a did not appear to be organized into direction domains. Together, these results add to knowledge of the anatomical and functional organization of the cat visual cortex and help clarify the factors that determine the presence of a modular organization. The second part of this thesis addresses improving the quantitative aspect provided by temporal analysis in optical imaging of intrinsic signals. This new approach, based on Fourier analysis, considerably increased the signal-to-noise ratio of the recordings.
Until now, however, this analysis has relied on quantifying a single harmonic, which has limited its use to mapping orientation and retinotopy. By exploiting the higher harmonics, a model was proposed to estimate receptive field size and direction selectivity. This model was then validated with conventional approaches in the primary visual cortex.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present an unsupervised learning algorithm that acquires a natural-language lexicon from raw speech. The algorithm is based on the optimal encoding of symbol sequences in an MDL framework, and uses a hierarchical representation of language that overcomes many of the problems that have stymied previous grammar-induction procedures. The forward mapping from symbol sequences to the speech stream is modeled using features based on articulatory gestures. We present results on the acquisition of lexicons and language models from raw speech, text, and phonetic transcripts, and demonstrate that our algorithm compares very favorably to other reported results with respect to segmentation performance and statistical efficiency.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Several recent hypotheses, including sensory drive and sensory exploitation, suggest that receiver biases may drive selection of biological signals in the context of sexual selection. Here we suggest that a similar mechanism may have led to convergence of patterns in flowers, stingless bee nest entrances, and pitchers of insectivorous plants. A survey of these non-related visual stimuli shows that they share features such as stripes, dark centre, and peripheral dots. Next, we experimentally show that in stingless bees the close-up approach to a flower is guided by dark centre preference. Moreover, in the approach towards their nest entrance, they have a spontaneous preference for entrance patterns containing a dark centre and disrupted ornamentation. Together with existing empirical evidence on the honeybee's and other insects' orientation to flowers, this suggests that the signal receivers of the natural patterns we examined, mainly Hymenoptera, have spontaneous preferences for radiating stripes, dark centres, and peripheral dots. These receiver biases may have evolved in other behavioural contexts in the ancestors of Hymenoptera, but our findings suggest that they have triggered the convergent evolution of visual stimuli in floral guides, stingless bee nest entrances, and insectivorous pitchers.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we discuss current work concerning appearance-based and CAD-based vision, two opposing vision strategies. CAD-based vision is geometry based and relies on having complete object-centred models. Appearance-based vision builds view-dependent models from training images. Existing CAD-based vision systems that work with intensity images have all used one- and zero-dimensional features, for example lines, arcs, points and corners. We describe a system we have developed for combining these two strategies. Geometric models are extracted from a commercial CAD library of industry standard parts. Surface appearance characteristics are then learnt automatically by observing actual object instances. This information is combined with geometric information and is used in hypothesis evaluation. This augmented description improves the system's robustness to texture, specularities and other artifacts which are hard to model with geometry alone, whilst maintaining the advantages of a geometric description.