984 results for Audiovisual speech recognition


Relevance: 40.00%

Abstract:

Perception and recognition of faces are fundamental cognitive abilities that form a basis for our social interactions. Research has investigated face perception using a variety of methodologies across the lifespan. Habituation, novelty preference, and visual paired comparison paradigms are typically used to investigate face perception in young infants. Storybook recognition tasks and eyewitness lineup paradigms are generally used to investigate face perception in young children. These methodologies have introduced systematic differences including the use of linguistic information for children but not infants, greater memory load for children than infants, and longer exposure times to faces for infants than for older children, making comparisons across age difficult. Thus, research investigating infant and child perception of faces using common methods, measures, and stimuli is needed to better understand how face perception develops. According to predictions of the Intersensory Redundancy Hypothesis (IRH; Bahrick & Lickliter, 2000, 2002), in early development, perception of faces is enhanced in unimodal visual (i.e., silent dynamic face) rather than bimodal audiovisual (i.e., dynamic face with synchronous speech) stimulation. The current study investigated the development of face recognition across children of three ages: 5 – 6 months, 18 – 24 months, and 3.5 – 4 years, using the novelty preference paradigm and the same stimuli for all age groups. It also assessed the role of modality (unimodal visual versus bimodal audiovisual) and memory load (low versus high) on face recognition. It was hypothesized that face recognition would improve across age and would be enhanced in unimodal visual stimulation with a low memory load. Results demonstrated a developmental trend (F(2, 90) = 5.00, p = 0.009) with older children showing significantly better recognition of faces than younger children. 
In contrast to predictions, no differences were found as a function of modality of presentation (bimodal audiovisual versus unimodal visual) or memory load (low versus high). This study was the first to demonstrate a developmental improvement in face recognition from infancy through childhood using common methods, measures and stimuli consistent across age.

Relevance: 30.00%

Abstract:

Profound hearing loss is a disability that affects personality, and when it affects teenagers who lost their hearing before language acquisition, the resulting bio-psychosocial conflicts can be exacerbated, so candidates for cochlear implantation require careful evaluation and selection. Aim: To evaluate speech perception in adolescents with profound hearing loss who use cochlear implants. Study Design: Prospective. Materials and Methods: Twenty-five individuals with severe or profound pre-lingual hearing loss who underwent cochlear implantation during adolescence, between 10 years and 17 years 11 months of age, took speech perception tests before the implant and 2 years after device activation. For comparison and analysis we used the results of the four-choice test, the vowel recognition test, and the sentence recognition test in closed and open settings. Results: The average percentage of correct answers in the four-choice test was 46.9% before the implant and rose to 86.1% after 24 months of device use. In the vowel recognition test the average rose from 45.13% to 83.13%, and in the sentence recognition test it rose from 19.3% to 60.6% in the closed setting and from 1.08% to 20.47% in the open setting. Conclusion: All patients, although with mixed results, achieved statistically significant improvement in all the speech tests employed.

Relevance: 30.00%

Abstract:

In research on Silent Speech Interfaces (SSI), different sources of information (modalities) have been combined, aiming at obtaining better performance than the individual modalities. However, when combining these modalities, the dimensionality of the feature space rapidly increases, yielding the well-known "curse of dimensionality". As a consequence, in order to extract useful information from this data, one has to resort to feature selection (FS) techniques to lower the dimensionality of the learning space. In this paper, we assess the impact of FS techniques for silent speech data, in a dataset with 4 non-invasive and promising modalities, namely: video, depth, ultrasonic Doppler sensing, and surface electromyography. We consider two supervised (mutual information and Fisher's ratio) and two unsupervised (mean-median and arithmetic mean geometric mean) FS filters. The evaluation was made by assessing the classification accuracy (word recognition error) of three well-known classifiers (k-nearest neighbors, support vector machines, and dynamic time warping). The key results of this study show that both unsupervised and supervised FS techniques improve the classification accuracy on both individual and combined modalities. For instance, on the video component, we attain relative performance gains of 36.2% in error rates. FS is also useful as pre-processing for feature fusion. Copyright © 2014 ISCA.
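As an illustration of one of the supervised filters named in this abstract, the following is a minimal sketch of feature ranking by Fisher's ratio. It assumes the common two-class form (squared difference of class means divided by the sum of class variances); the abstract does not say which variant the authors used, and the function names and toy data are hypothetical.

```python
from statistics import mean, pvariance


def fisher_ratio(feature_values, labels):
    """Two-class Fisher's ratio for one feature: (mu_a - mu_b)^2 / (var_a + var_b)."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    a = [v for v, y in zip(feature_values, labels) if y == classes[0]]
    b = [v for v, y in zip(feature_values, labels) if y == classes[1]]
    spread = pvariance(a) + pvariance(b)
    if spread == 0.0:
        # Degenerate case: no within-class variance at all.
        return float("inf") if mean(a) != mean(b) else 0.0
    return (mean(a) - mean(b)) ** 2 / spread


def select_top_k(samples, labels, k):
    """Return the indices of the k features with the highest Fisher's ratio."""
    n_features = len(samples[0])
    scores = [
        fisher_ratio([row[j] for row in samples], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Made-up toy data: feature 0 separates the two classes, feature 1 does not.
samples = [[0.1, 5.0], [0.2, 1.0], [0.9, 4.0], [1.0, 2.0]]
labels = [0, 0, 1, 1]
top_features = select_top_k(samples, labels, k=1)
```

A filter of this kind scores each feature independently of any classifier, which is why it can be applied as a pre-processing step before classification or feature fusion.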

Relevance: 30.00%

Abstract:

It gives me great pleasure to accept the invitation to address this conference on “Meeting the Challenges of Cultural Diversity in the Irish Healthcare Sector” which is being organised by the Irish Health Services Management Institute in partnership with the National Consultative Committee on Racism and Interculturalism. The conference provides an important opportunity to develop our knowledge and understanding of the issues surrounding cultural diversity in the health sector from the twin perspectives of patients and staff. Cultural diversity has over recent years become an increasingly visible aspect of Irish society bringing with it both opportunities and challenges. It holds out great possibilities for the enrichment of all who live in Ireland but it also challenges us to adapt creatively to the changes required to realise this potential and to ensure that the experience is a positive one for all concerned but particularly for those in the minority ethnic groups. In the last number of years in particular, the focus has tended to be on people coming to this country either as refugees, asylum seekers or economic migrants. Government figures estimate that as many as 340,000 immigrants are expected in the next six years. However ethnic and cultural diversity are not new phenomena in Ireland. Travellers have a long history as an indigenous minority group in Ireland with a strong culture and identity of their own. The changing experience and dynamics of their relationship with the wider society and its institutions over time can, I think, provide some valuable lessons for us as we seek to address the more numerous and complex issues of cultural diversity which have arisen for us in the last decade. 
Turning more specifically to the health sector which is the focus of this conference, culture and identity have particular relevance to health service policy and provision in that The first requirement is that we in the health service acknowledge cultural diversity and the differences in behaviours and in the less obvious areas of values and beliefs that this often implies. Only by acknowledging these differences in a respectful way and informing ourselves of them can we address them. Our equality legislation – The Employment Equality Act, 1998 and the Equal Status Act, 2000 – prohibits discrimination on nine grounds including race and membership of the Traveller community. The Equal Status Act prohibits discrimination on an individual basis in relation to the nine grounds while for groups it provides for the promotion of equality of opportunity. The Act applies to the provision of services including health services. I will speak first about cultural diversity in relation to the patient. In this respect it is worth mentioning that the recognition of cultural diversity and appropriate responses to it were issues which were strongly emphasised in the public consultation process which we held earlier this year in the context of developing National Anti-Poverty targets for the health sector and also our new national health strategy. Awareness and sensitivity training for staff is a key requirement for adapting to a culturally diverse patient population. The focus of this training should be the development of the knowledge and skills to provide services sensitive to cultural diversity. Such training can often be most effectively delivered in partnership with members of the minority groups themselves. I am aware that the Traveller community, for example, is involved in in-service training for health care workers. 
I am also aware that the National Consultative Committee on Racism and Interculturalism has been involved in training with the Eastern Regional Health Authority. We need to have more such initiatives. A step beyond the sensitivity training for existing staff is the training of members of the minority communities themselves as workers in our health services. Again the Traveller community has set an example in this area with its Primary Health Care Project for Travellers. The Primary Health Care for Travellers Project was established in 1994 as a joint partnership initiative with the Eastern Health Board and Pavee Point, with ongoing technical assistance being provided from the Department of Community Health and General Practice, Trinity College, Dublin. This project was the first of its kind in the country and has facilitated The project included a training course which concentrated on skills development, capacity building and the empowerment of Travellers. This confidence and skill allowed the Community Health Workers to go out and conduct a baseline survey to identify and articulate Travellers’ health needs. This was the first time that Travellers were involved in this process; in the past their needs were assumed. The results of the survey were fed back to the community and they prioritised their needs and suggested changes to the health services which would facilitate their access and utilisation. Ongoing monitoring and data collection demonstrate a big improvement in levels of satisfaction and uptake and utilisation of health services by Travellers in the pilot area. This Primary Health Care for Travellers initiative is being replicated in three other areas around the country and funding has been approved for a further 9 new projects. This pilot project was the recipient of a WHO 50th anniversary commemorative award in 1998. The project is developing as a model of good practice which could inspire further initiatives of this type for other minority groups. 
Access to information has been identified in numerous consultative processes as a key factor in enabling people to take a proactive approach to managing their own health and that of their families and in facilitating their access to health services. Honouring our commitment to equity in these areas requires that information is provided in culturally appropriate formats. The National Health Promotion Strategy 2000-2005, for example, recognises that there exist within our society many groups with different requirements which need to be identified and accommodated when planning and implementing health promotion interventions. These groups include Travellers, refugees and asylum seekers, people with intellectual, physical or sensory disability and the gay and lesbian community. The Strategy acknowledges the challenge involved in being sensitive to the potential differences in patterns of poor health among these different groups. The Strategic aim is to promote the physical, mental and social well-being of individuals from these groups. The objectives of the Strategy on these issues are: While our long term aim may be to mainstream responses so that our health service is truly multicultural, we must recognise the need at this point in time for very specific focused responses particularly for groups with poor health status such as Travellers and also for refugees and asylum seekers. In the case of refugees and asylum seekers examples of targeted services are screening for communicable diseases – offered on a voluntary basis – and psychological support services for those who have suffered trauma before coming here. The two approaches of targeting and mainstreaming are not mutually exclusive. A combination of both is required at this point in time but the balance between them must be kept under constant review in the light of changing needs. A major requirement if we are to meet the challenge of cultural diversity is an appropriate data and research base. 
I think it is important that we build up our information and research data base in partnership with the minority groups themselves. We must establish what the health needs of diverse groups are; we must monitor uptake of services and how well we are responding to needs and we must monitor outcomes and health status. We must also examine the impact of the policies in other sectors on the health of minority groups. The National Health Information Strategy, currently being developed, and the recently published National Strategy for Health Research – Making Knowledge Work for Health provide important frameworks within which we can improve our data and research base. A culturally diverse health sector workforce – challenges and opportunities The Irish health service can benefit greatly from successful international recruitment. There has been a strong non-national representation amongst the medical profession for more than 30 years. More recently there have been significant increases in other categories of health service workers from overseas. The Department recognises the enormous value that overseas recruitment brings over a wide range of services and supports the development of effective and appropriate recruitment strategies in partnership with health service employers. These changes have made cultural diversity an important issue for all health service organisations. Diversity in the workplace is primarily about creating a culture that seeks, respects, values and harnesses difference. This includes all the differences that when added together make each person unique. So instead of the focus being on particular groups, diversity is about all of us. Change is not about helping “them” to join “us” but about critically looking at “us” and rooting out all aspects of our culture that inappropriately exclude people and prevent us from being inclusive in the way we relate to employees, potential employees and clients of the health service. 
International recruitment benefits consumers, Irish employees and the overseas personnel alike. Regardless of whether they are employed by the health service, members of minority groups will be clients of our service and consequently we need to be flexible in order to accommodate different cultural needs. For staff, we recognise that coming from other cultures can be a difficult transition. Consequently health service employers have made strong efforts to assist them during this period. Many organisations provide induction courses, religious facilities (such as prayer rooms) and help in finding suitable accommodation. The Health Service Employers Agency (HSEA) is developing an equal opportunities/diversity strategy and action plans as well as training programmes to support their implementation, to ensure that all health service employment policies and practices promote the equality/diversity agenda to continue the development of a culturally diverse health service. The management of this new environment is extremely important for the health service as it offers an opportunity to go beyond set legal requirements and to strive for an acceptance and nurturing of cultural differences. Workforce cultural diversity affords us the opportunity to learn from the working practices and perspectives of others by allowing personnel to present their ideas and experience through teamwork, partnership structures and other appropriate fora, leading to further improvement in the services we provide. It is important to ensure that both personnel units and line managers communicate directly with their staff and demonstrate by their actions that they intend to create an inclusive work place which doesn't demand that minority staff fit. Contented, valued employees who feel that there is a place for them in the organisation will deliver a high quality health service. 
Your conference here today has two laudable aims – to heighten awareness and assist health care staff to work effectively with their colleagues from different cultural backgrounds and to gain a greater understanding of the diverse needs of patients from minority ethnic backgrounds. There is a synergy in these aims and in the tasks to which they give rise in the management of our health service. The creative adaptations required for one have the potential to feed into the other. I would like to commend both organisations which are hosting this conference for their initiative in making this event happen, particularly at this time – Racism in the Workplace Week. I look forward very much to hearing the outcome of your deliberations. Thank you.

Relevance: 30.00%

Abstract:

It has been demonstrated in earlier studies that patients with a cochlear implant have increased abilities for audio-visual integration because the crude information transmitted by the cochlear implant requires the persistent use of the complementary speech information from the visual channel. The brain network for these abilities needs to be clarified. We used an independent components analysis (ICA) of the activation H₂¹⁵O positron emission tomography data to explore occipito-temporal brain activity in post-lingually deaf patients with unilaterally implanted cochlear implants at several months post-implantation (T1), shortly after implantation (T0) and in normal hearing controls. In between-group analysis, patients at T1 had greater blood flow in the left middle temporal cortex as compared with T0 and normal hearing controls. In within-group analysis, patients at T0 had a task-related ICA component in the visual cortex, and patients at T1 had one task-related ICA component in the left middle temporal cortex and the other in the visual cortex. The time courses of temporal and visual activities during the positron emission tomography examination at T1 were highly correlated, meaning that synchronized integrative activity occurred. The greater involvement of the visual cortex and its close coupling with the temporal cortex at T1 confirm the importance of audio-visual integration in more experienced cochlear implant subjects at the cortical level.

Relevance: 30.00%

Abstract:

The physiological basis of human cerebral asymmetry for language remains mysterious. We have used simultaneous physiological and anatomical measurements to investigate the issue. Concentrating on neural oscillatory activity in speech-specific frequency bands and exploring interactions between gestural (motor) and auditory-evoked activity, we find, in the absence of language-related processing, that left auditory, somatosensory, articulatory motor, and inferior parietal cortices show specific, lateralized, speech-related physiological properties. With the addition of ecologically valid audiovisual stimulation, activity in auditory cortex synchronizes with left-dominant input from the motor cortex at frequencies corresponding to syllabic, but not phonemic, speech rhythms. Our results support theories of language lateralization that posit a major role for intrinsic, hardwired perceptuomotor processing in syllabic parsing and are compatible both with the evolutionary view that speech arose from a combination of syllable-sized vocalizations and meaningful hand gestures and with developmental observations suggesting phonemic analysis is a developmentally acquired process.

Relevance: 30.00%

Abstract:

In this paper we propose the inversion of nonlinear distortions in order to improve the recognition rates of a speaker recognizer system. We study the effect of saturations on the test signals, trying to take into account real situations where the training material has been recorded in a controlled situation but the testing signals present some mismatch with the input signal level (saturations). The experimental results for speaker recognition show that a combination of several strategies can improve the recognition rates with saturated test sentences from 80% to 89.39%, while the result with clean speech (without saturation) is 87.76% for one microphone. For speaker identification, the same combination can reduce the minimum detection cost function with saturated test sentences from 6.42% to 4.15%, while the results with clean speech (without saturation) are 5.74% for one microphone and 7.02% for the other.
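To make the notion of saturation concrete, here is a minimal sketch of hard clipping applied to a synthetic signal. It only illustrates the distortion itself, not the inversion strategies the abstract proposes; the clipping level, the sine test signal, and the function names are all assumptions chosen for illustration.

```python
import math


def saturate(signal, limit):
    """Hard-clip every sample to the range [-limit, +limit]."""
    return [max(-limit, min(limit, s)) for s in signal]


def clipped_fraction(signal, limit):
    """Fraction of samples whose magnitude reaches or exceeds the clipping level."""
    return sum(1 for s in signal if abs(s) >= limit) / len(signal)


# Hypothetical test signal: one second of a 100 Hz sine sampled at 8 kHz.
sample_rate = 8000
clean = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(sample_rate)]

# Simulate a recording whose input level exceeded the converter range.
clipped = saturate(clean, limit=0.5)
affected = clipped_fraction(clean, limit=0.5)
```

The fraction of affected samples gives a rough idea of how severe the mismatch between training and test conditions is for a given input level.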

Relevance: 30.00%

Abstract:

We describe a series of experiments in which we start with English to French and English to Japanese versions of an Open Source rule-based speech translation system for a medical domain, and bootstrap corresponding statistical systems. Comparative evaluation reveals that the rule-based systems are still significantly better than the statistical ones, despite the fact that considerable effort has been invested in tuning both the recognition and translation components; also, a hybrid system only marginally improved recall at the cost of a loss in precision. The result suggests that rule-based architectures may still be preferable to statistical ones for safety-critical speech translation tasks.

Relevance: 30.00%

Abstract:

In this paper we propose the inversion of nonlinear distortions in order to improve the recognition rates of a speaker recognizer system. We study the effect of saturations on the test signals, trying to take into account real situations where the training material has been recorded in a controlled situation but the testing signals present some mismatch with the input signal level (saturations). The experimental results show that a combination of several strategies can improve the recognition rates with saturated test sentences from 80% to 89.39%, while the result with clean speech (without saturation) is 87.76% for one microphone.

Relevance: 30.00%

Abstract:

The purpose of our project is to contribute to earlier diagnosis of Alzheimer's disease (AD) and better estimates of its severity by using automatic analysis performed through new biomarkers extracted from non-invasive intelligent methods. The methods selected in this case are speech biomarkers oriented to Spontaneous Speech and Emotional Response Analysis. Thus the main goal of the present work is a feature search in Spontaneous Speech, oriented to pre-clinical evaluation, for defining a test for AD diagnosis with a one-class classifier. One-class classification differs from multi-class classification in one essential aspect: it is assumed that only information about one of the classes, the target class, is available. In this work we explore the problem of imbalanced datasets, which is particularly crucial in applications where the goal is to maximize recognition of the minority class, as in medical diagnosis. The use of information about outliers and of Fractal Dimension features improves the system performance.
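The one-class setting described in this abstract can be sketched as follows. This is only a toy stand-in, assuming a per-feature z-score threshold fitted on target-class samples; the abstract does not specify the classifier actually used, and the class name, threshold, and data below are hypothetical.

```python
from statistics import mean, pstdev


class OneClassThreshold:
    """Accepts a sample when its largest absolute z-score, measured
    against the target class only, stays below a threshold."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold
        self.means = []
        self.stds = []

    def fit(self, target_samples):
        # Only target-class data is used: no outlier examples are needed.
        columns = list(zip(*target_samples))
        self.means = [mean(c) for c in columns]
        # Fall back to 1.0 when a feature is constant to avoid dividing by zero.
        self.stds = [pstdev(c) or 1.0 for c in columns]
        return self

    def score(self, sample):
        return max(
            abs(x - m) / s for x, m, s in zip(sample, self.means, self.stds)
        )

    def predict(self, sample):
        """True means 'looks like the target class', False means outlier."""
        return self.score(sample) <= self.threshold


# Made-up two-dimensional feature vectors for the target class.
target = [[1.0, 10.0], [1.2, 11.0], [0.8, 9.0], [1.1, 10.5], [0.9, 9.5]]
model = OneClassThreshold(threshold=3.0).fit(target)
inlier_ok = model.predict([1.05, 10.2])
outlier_ok = model.predict([3.0, 10.0])
```

Because the model is fitted without any examples of the other class, it sidesteps the class imbalance that a two-class classifier would face, at the cost of relying entirely on how well the target class is described.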

Relevance: 30.00%

Abstract:

Top-down contextual influences play a major part in speech understanding, especially in hearing-impaired patients with deteriorated auditory input. Those influences are most obvious in difficult listening situations, such as listening to sentences in noise but can also be observed at the word level under more favorable conditions, as in one of the most commonly used tasks in audiology, i.e., repeating isolated words in silence. This study aimed to explore the role of top-down contextual influences and their dependence on lexical factors and patient-specific factors using standard clinical linguistic material. Spondaic word perception was tested in 160 hearing-impaired patients aged 23-88 years with a four-frequency average pure-tone threshold ranging from 21 to 88 dB HL. Sixty spondaic words were randomly presented at a level adjusted to correspond to a speech perception score ranging between 40 and 70% of the performance intensity function obtained using monosyllabic words. Phoneme and whole-word recognition scores were used to calculate two context-influence indices (the j factor and the ratio of word scores to phonemic scores) and were correlated with linguistic factors, such as the phonological neighborhood density and several indices of word occurrence frequencies. Contextual influence was greater for spondaic words than in similar studies using monosyllabic words, with an overall j factor of 2.07 (SD = 0.5). For both indices, context use decreased with increasing hearing loss once the average hearing loss exceeded 55 dB HL. In right-handed patients, significantly greater context influence was observed for words presented in the right ears than for words presented in the left, especially in patients with many years of education. 
The correlations between raw word scores (and context influence indices) and word occurrence frequencies showed a significant age-dependent effect, with a stronger correlation between perception scores and word occurrence frequencies when the occurrence frequencies were based on the years corresponding to the patients' youth, showing a "historic" word frequency effect. This effect was still observed for patients with few years of formal education, but recent occurrence frequencies based on current word exposure had a stronger influence for those patients, especially for younger ones.
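The two context-influence indices mentioned in this abstract can be computed as in the sketch below. It assumes the usual definition of the j factor, in which the whole-word recognition probability equals the phoneme recognition probability raised to the power j; the function names and example scores are hypothetical and not taken from the study.

```python
import math


def j_factor(word_score, phoneme_score):
    """j = log(p_word) / log(p_phoneme), with scores given as proportions."""
    if not (0.0 < word_score < 1.0 and 0.0 < phoneme_score < 1.0):
        raise ValueError("scores must be strictly between 0 and 1")
    return math.log(word_score) / math.log(phoneme_score)


def word_to_phoneme_ratio(word_score, phoneme_score):
    """Second index from the abstract: ratio of word score to phoneme score."""
    return word_score / phoneme_score


# Illustrative scores only: 80% of phonemes and 64% of whole words correct.
j = j_factor(0.64, 0.80)
ratio = word_to_phoneme_ratio(0.64, 0.80)
```

Under this definition, j can be read as the effective number of independently perceived units per word, so a lower j indicates a stronger reliance on context.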

Relevance: 30.00%

Abstract:

A speech by Sean O'Sullivan, given in the House of Commons, "For the Recognition of the Beaver as a Symbol of the Sovereignty of the Dominion of Canada".

Relevance: 30.00%

Abstract:

The perception of movement is associated with increased excitability of the human motor cortex. This so-called "mirror" system is thought to underlie our ability to understand the gestures made by another person, since it is involved in the recognition, understanding, and imitation of those gestures. In this study, we examine how this mirror system is engaged and lateralized during the perception of singing and speech. Single-pulse transcranial magnetic stimulation (TMS) was applied over the mouth representation of the motor cortex in 11 participants. The resulting motor response was measured as motor evoked potentials (MEPs) recorded from the mouth muscle. These were compared during the perception of singing and of speech, in each cerebral hemisphere. To examine the activation of this motor system over time, TMS pulses were delivered randomly within 7 time windows (500-3500 ms). The stimuli for the singing perception task were 4-second videos in which a singer produced an ascending two-note interval that participants had to judge as matching or not matching a written interval. For the speech perception task, participants watched 4-second videos showing a person explaining a proverb and had to judge whether the explanation matched a written proverb. The results show that MEP amplitudes collected in the singing perception task were larger after stimulation of the right hemisphere than of the left hemisphere, especially when the pulse was delivered between 1000 and 1500 ms. No significant effect emerged in the speech perception condition. These results suggest that the mirror system of the right hemisphere is more strongly activated than that of the left hemisphere after an audiovisual motor presentation.

Relevance: 30.00%

Abstract:

Biometrics uses the physiological and behavioral characteristics of an individual to establish identity. Fingerprint-based authentication is the most advanced biometric authentication technology. The minutiae-based fingerprint identification method offers a reasonable identification rate. However, the minutiae feature map consists of about 70-100 minutia points, and matching accuracy drops as the size of the database grows. Hence the fingerprint feature code needs to be as small as possible so that identification becomes easier. In this research, a novel global singularity-based fingerprint representation is proposed. The fingerprint baseline, which is the joint line between the distal and intermediate phalanges in the fingerprint, is taken as the reference line. A polygon is formed from the singularities and the fingerprint baseline. The feature vector consists of the polygon's angles, sides, area, and type, and the ridge counts between the singularities. A 100% recognition rate is achieved with this method. The method is compared with the conventional minutiae-based recognition method in terms of computation time, receiver operating characteristic (ROC), and feature vector length. Speech is a behavioral biometric modality and can be used to identify a speaker. In this work, MFCCs of text-dependent speech are computed and clustered using the k-means algorithm. A backpropagation-based artificial neural network is trained to identify the clustered speech code. The performance of the neural network classifier is compared with that of the VQ-based Euclidean minimum classifier. Biometric systems that use a single modality are usually affected by problems such as noisy sensor data, non-universality and/or lack of distinctiveness of the biometric trait, unacceptable error rates, and spoof attacks. Multi-finger feature-level fusion for fingerprint recognition is developed, and its performance is measured in terms of the ROC curve. Score-level fusion of the fingerprint and speech-based recognition systems is performed, and 100% accuracy is achieved for a considerable range of matching thresholds.
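Score-level fusion, as mentioned at the end of this abstract, can be sketched as below. The sketch assumes min-max normalization followed by a weighted sum, which is one common scheme; the abstract does not state which normalization or fusion rule the authors used, and the weights, threshold, and scores here are made up.

```python
def min_max_normalize(scores):
    """Rescale raw matcher scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(fingerprint_scores, speech_scores, fingerprint_weight=0.5):
    """Weighted sum of the normalized scores from the two matchers."""
    f = min_max_normalize(fingerprint_scores)
    s = min_max_normalize(speech_scores)
    w = fingerprint_weight
    return [w * a + (1.0 - w) * b for a, b in zip(f, s)]


def accept(fused_scores, threshold):
    """Accept a candidate when its fused score reaches the matching threshold."""
    return [score >= threshold for score in fused_scores]


# Hypothetical raw scores for three candidates, on different scales per matcher.
fingerprint_scores = [40.0, 90.0, 10.0]
speech_scores = [0.2, 0.9, 0.4]
fused = fuse_scores(fingerprint_scores, speech_scores)
decisions = accept(fused, threshold=0.5)
```

Normalizing before combining matters because the two matchers report scores on unrelated scales; without it, the matcher with the larger numeric range would dominate the fused decision.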

Relevance: 30.00%

Abstract:

The Theory of Human Action developed by Antanas Mockus starts from the principle that human action is multi-motivated and multi-regulated. Among these multiple regulations are the Social Regulation Systems of Law, Morality, and Culture, which to some extent explain social action and interaction. However, since contemporary society is situated in an intercultural context of permanent contact and friction between different cultural perspectives in varying degrees of hybridization, a divorce between the regulatory systems seems inevitable. The Habermasian Public Space is understood here as no longer reserved solely for institutional actors but open to civil society and the mass media. It is in this new deliberative space that the Cultural Amphibian operates in order to resolve the problem of the divorce between the regulatory systems. Drawing on chameleon-like and translating tools, the amphibian seeks, through dialogue, a reduction, even if only partial, to acceptable minimums of discussion and cultural interrelation. The image, linked to a cultural mechanism, on the one hand reconstructs realities and on the other feeds the cycle of production and acceptance of collective imaginaries. Accordingly, this research sets out to demonstrate Mockus's concepts in the development of an audiovisual production, and also makes a conceptual contribution to the theory by including the Dialectic of Recognition as an effective tool for generating agreements. It also sets out to show how the analysis of an audiovisual production reveals the promotion of values, culture, ideas, and customs that encourage both the divorce and the harmonization between the regulatory systems.