983 results for Digit speech recognition


Relevance:

80.00%

Publisher:

Abstract:

This work focuses on the formal and technical analysis of some aspects of a constructed language. In the first part of the work, a possible coding for the language will be studied, with emphasis on prefix coding, for which an extension of the Huffman algorithm from binary to n-ary will be implemented. Because the frequency of use of words in the language cannot be known a priori, a study will be carried out and several strategies will be proposed for an open word system, after first analyzing the number of words in current natural languages. As a possible improvement to the coding, we will also look at the synchronization loss problem and at its solution, self-synchronization: a study of T-codes and the number of possible words they allow for the language, as well as other alternatives. Finally, from a less formal perspective, several applications for the language have been developed: a voice synthesizer, a speech recognition system, and a system font for using the language in text processors. For each of these applications, the construction process will be detailed, along with the problems encountered and those still to be solved.
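As an illustration of the n-ary extension of Huffman coding mentioned in this abstract, here is a minimal sketch in Python. It is not the thesis's implementation: the function name `nary_huffman`, the padding with zero-weight dummy leaves, and the sample frequencies are all assumptions made for the example.

```python
import heapq
from itertools import count


def nary_huffman(freqs, n=3):
    """Build an n-ary prefix code; returns {symbol: string of digits 0..n-1}."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if not freqs:
        return {}
    if len(freqs) == 1:
        # A single symbol still needs a non-empty codeword.
        return {next(iter(freqs)): "0"}

    tie = count()  # tie-breaker so the heap never compares dicts
    heap = [(weight, next(tie), {sym: ""}) for sym, weight in freqs.items()]

    # Pad with zero-weight dummy leaves so that every merge can take
    # exactly n nodes: we need (leaves - 1) % (n - 1) == 0.
    while (len(heap) - 1) % (n - 1) != 0:
        heap.append((0, next(tie), {}))
    heapq.heapify(heap)

    while len(heap) > 1:
        total = 0
        merged = {}
        # Merge the n lightest nodes; child number `digit` gets that digit
        # prepended to every codeword below it.
        for digit in range(n):
            weight, _, codes = heapq.heappop(heap)
            total += weight
            for sym, code in codes.items():
                merged[sym] = str(digit) + code
        heapq.heappush(heap, (total, next(tie), merged))

    return heap[0][2]


if __name__ == "__main__":
    # Hypothetical word frequencies, for illustration only.
    sample = {"a": 5, "b": 2, "c": 1, "d": 1, "e": 1}
    for sym, code in sorted(nary_huffman(sample, n=3).items()):
        print(sym, code)
```

With `n=2` the same routine reduces to ordinary binary Huffman coding. The dummy leaves matter only when the number of symbols does not fit a full n-ary tree.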

Relevance:

80.00%

Publisher:

Abstract:

Early intervention is the key to spoken language for hearing-impaired children. A diagnosis of severe hearing loss in a young child raises the urgent question of the optimal type of hearing aid device. As there are no recent data comparing selection criteria for a specific hearing aid device, the goal of the Hearing Evaluation of Auditory Rehabilitation Devices (hEARd) project (Coninx & Vermeulen, 2012) evolved to collect and analyze interlingually comparable normative data on the speech perception performance of children with hearing aids and children with cochlear implants (CI). METHOD: In various institutions for hearing rehabilitation in Belgium, Germany and the Netherlands, the Adaptive Auditory Speech Test (AAST) was used in the hEARd project to determine speech perception abilities in kindergarten and school-aged hearing-impaired children. Results of the speech audiometric procedures were matched to the unaided hearing loss values of children using hearing aids and compared to the results of children using CI. In total, 277 data sets of hearing-impaired children were analyzed. Results of children using hearing aids were grouped according to their unaided hearing loss values. The grouping followed the World Health Organization's (WHO) grading of hearing impairment: mild (25–40 dB HL), moderate (41–60 dB HL), severe (61–80 dB HL) and profound (80 dB HL and higher). RESULTS: AAST speech recognition results in quiet showed significantly better performance for the CI group than for the groups of profoundly and severely impaired hearing aid users. However, the CI users' performance in speech perception in noise did not differ from that of the hearing aid users. Within the collected data, analyses showed that children with a CI perform equivalently on speech perception in quiet to children using hearing aids with a "moderate" hearing impairment.

Relevance:

80.00%

Publisher:

Abstract:

This thesis examines the state of audiovisual translation (AVT) in the aftermath of the COVID-19 emergency, highlighting new trends in the implementation of AI technologies as well as their strengths, constraints, and ethical implications. It starts with an overview of the current AVT landscape, focusing on projections of its future evolution and on its critical aspects, such as the worsening working conditions lamented by AVT professionals (especially freelancers) in recent years and how these might be affected by the advent of AI technologies in the industry. The second chapter delves into the history and development of three AI technologies that are used in combination with neural machine translation in automatic AVT tools: automatic speech recognition, speech synthesis, and deepfakes (voice cloning and visual deepfakes for lip syncing), including real examples of start-up companies that use them, or plan to do so, to localize audiovisual content automatically or semi-automatically. The third chapter explores the many ethical concerns around these innovative technologies, which extend far beyond the field of translation; at the same time, it attempts to reaffirm their potential to bring about immense progress in accessibility and international cooperation, provided that their use is properly regulated. Lastly, the fourth chapter describes two experiments testing the efficacy of currently available tools for automatic subtitling and automatic dubbing respectively, in order to take a closer look at their advantages and limitations compared to more traditional approaches. This analysis aims to help distinguish legitimate concerns from unfounded speculation about the AI technologies that are entering the field of AVT; the intention behind it is to humbly suggest a constructive and optimistic view of the technological transformations that appear to be underway, while also acknowledging their potential risks.

Relevance:

80.00%

Publisher:

Abstract:

Throughout the years, technology has had an undeniable impact on the AVT field. It has revolutionized the way audiovisual content is consumed by allowing audiences to easily access it at any time and on any device. Especially since the introduction of OTT streaming platforms such as Netflix, Amazon Prime Video, Disney+, Apple TV+, and HBO Max, which offer a vast catalog of national and international products, the consumption of audiovisual products has been constantly rising and, consequently, so has the demand for localized content. In turn, the AVT industry resorts to new technologies and practices to handle the ever-growing workload and the faster turnaround times. Because of its numerous implications for the industry, technological advancement can be considered an area of research of particular interest for AVT studies. However, in the case of dubbing, research and discussion on the topic are lagging behind because of the more limited impact that technology has had on the very conservative dubbing industry. Therefore, the aim of the dissertation is to offer an overview of some of the latest technological innovations and practices that have already been implemented (i.e. cloud dubbing and DeepDub technology) or that are still under development and research (i.e. automatic speech recognition and respeaking, machine translation and post-editing, audio-based and visual-based dubbing techniques, text-based editing of talking-head videos, and automatic dubbing), to discuss their reception by industry professionals, and to make assumptions about their future implementation in the dubbing field.

Relevance:

80.00%

Publisher:

Abstract:

The thesis presented here arose from a collaboration with the Macao Polytechnic (Politecnico di Macao); the contacts are Prof. Rita Tse, Prof. Marcus Im, and Prof. Su-Kit Tang. The aim is to build an Italian-Chinese machine translation model and to observe its behavior, in order to determine whether the undertaking is feasible. The thesis explores the topic known as Natural Language Processing (NLP), and thus falls within the field of machine translation. These are services that, with the aid of artificial intelligence, can process natural language and then interpret and translate it. NLP is a branch of computing that combines computer science, artificial intelligence, and the study of languages. From a research standpoint, the greatest challenges in this area involve speech recognition, natural-language understanding, and natural-language generation. The current state of the art was defined by the paper "Attention Is All You Need" (Vaswani et al., 2017), presented in 2017 by researchers at Google and the University of Toronto. The best-known and most widely used machine translation models at present are Neural Machine Translators (NMT), that is, models that use deep artificial neural networks to produce translations or predictions. The quality of the translations is particularly good, to the point of nearly matching that of a human translation. The work will therefore concentrate largely on the study and use of NMT, with the aim of proposing a functional model that performs as well as possible in translating from Italian to Chinese and vice versa.

Relevance:

80.00%

Publisher:

Abstract:

The recording and processing of voice data raise increasing privacy concerns for users and service providers. One way to address these issues is to move processing onto the edge device, closer to the recording, so that potentially identifiable information is not transmitted over the internet. However, this is often not possible due to hardware limitations. An interesting alternative is the development of voice anonymization techniques that remove individual speaker characteristics while preserving linguistic and acoustic information in the data. In this work, a state-of-the-art approach to sequence-to-sequence speech conversion, initially based on x-vectors and bottleneck features for automatic speech recognition, is explored to disentangle these two kinds of acoustic information using different pre-trained speech and speaker representations. Furthermore, different strategies for selecting target speech representations are analyzed. Results on public datasets in terms of equal error rate and word error rate show that good privacy is achieved with limited impact on converted speech quality relative to the original method.
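The two metrics named in this abstract can be sketched as follows. This is a generic illustration rather than the evaluation code used in the work: the function names, the threshold sweep used for the equal error rate, and the sample inputs are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER: sweep thresholds and return the error rate at the
    point where false acceptance and false rejection are closest."""
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    print(equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.6, 0.4, 0.2, 0.1]))
```

In an anonymization setting, a higher equal error rate for a speaker verifier means speakers are harder to re-identify, while a low word error rate indicates that the linguistic content survived the conversion.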

Relevance:

40.00%

Publisher:

Abstract:

This paper studies the relationship between consonant duration and the recognition of these consonants by listeners with high-frequency hearing loss.

Relevance:

40.00%

Publisher:

Abstract:

Garment information tracking is required for clean room garment management. In this paper, we present a robust camera-based system that uses Optical Character Recognition (OCR) techniques to recognize garment labels. In the system, a camera is used for image capture; an adaptive thresholding algorithm is employed to generate binary images; Connected Component Labelling (CCL) is then applied to the binary image for object detection, as part of finding the ROI (Region of Interest); Artificial Neural Networks (ANNs) with the BP (Back Propagation) learning algorithm are used for digit recognition; and finally the result is verified against a system database. The system has been tested. The results show that it can cope with variations in lighting, digit twisting, background complexity, and font orientation. The system's performance in terms of digit recognition rate has met the design requirement. It achieved real-time and error-free garment information tracking during testing.
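The first two stages of the pipeline described here, adaptive thresholding followed by connected component labelling, can be sketched in plain Python. The paper's actual algorithms and parameters are not given in the abstract, so the local-mean threshold, the 4-connectivity, the window size, and all function names below are assumptions.

```python
from collections import deque


def adaptive_threshold(img, window=3, offset=0):
    """Mark a pixel as foreground (1) when it is darker than the mean of
    its local window minus `offset`. `img` is a list of rows of grey levels."""
    h, w = len(img), len(img[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            mean = sum(vals) / len(vals)
            out[y][x] = 1 if img[y][x] < mean - offset else 0
    return out


def label_components(binary):
    """4-connected component labelling by breadth-first flood fill.
    Returns (label image, number of components); background stays 0."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not labels[y][x]:
                current += 1
                labels[y][x] = current
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return labels, current


def bounding_boxes(labels):
    """Return {label: (top, left, bottom, right)} as candidate ROIs."""
    boxes = {}
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab:
                top, left, bottom, right = boxes.get(lab, (y, x, y, x))
                boxes[lab] = (min(top, y), min(left, x),
                              max(bottom, y), max(right, x))
    return boxes
```

Each bounding box would then be cropped and passed to the digit classifier. Thresholding against a local mean rather than one global value is what gives some tolerance to uneven lighting.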

Relevance:

40.00%

Publisher:

Abstract:

We describe a method of recognizing handwritten digits by fitting generative models that are built from deformable B-splines with Gaussian "ink generators" spaced along the length of the spline. The splines are adjusted using a novel elastic matching procedure based on the Expectation Maximization (EM) algorithm that maximizes the likelihood of the model generating the data. This approach has many advantages. (1) After identifying the model most likely to have generated the data, the system produces not only a classification of the digit but also a rich description of the instantiation parameters, which can yield information such as the writing style. (2) During the process of explaining the image, generative models can perform recognition-driven segmentation. (3) The method involves a relatively small number of parameters, and hence training is relatively easy and fast. (4) Unlike many other recognition schemes, it does not rely on some form of pre-normalization of input images, but can handle arbitrary scalings, translations and a limited degree of image rotation. We have demonstrated that our method of fitting models to images does not get trapped in poor local minima. The main disadvantage of the method is that it requires much more computation than more standard OCR techniques.

Relevance:

30.00%

Publisher:

Abstract:

Profound hearing loss is a disability that affects personality, and when it involves teenagers who lost their hearing before language acquisition, these bio-psychosocial conflicts can be exacerbated, requiring careful evaluation and selection of candidates for cochlear implantation. Aim: To evaluate speech perception by adolescents with profound hearing loss who use cochlear implants. Study Design: Prospective. Materials and Methods: Twenty-five individuals with severe or profound pre-lingual hearing loss who underwent cochlear implantation during adolescence, between 10 years and 17 years and 11 months of age, took speech perception tests before the implant and 2 years after device activation. For comparison and analysis we used the results from the four-choice test, the vowel recognition test, and the sentence recognition test in closed and open settings. Results: The average percentage of correct answers in the four-choice test was 46.9% before the implant and rose to 86.1% after 24 months of device use. In the vowel recognition test, the average rose from 45.13% to 83.13%. In the sentence recognition test, it rose from 19.3% to 60.6% in the closed setting and from 1.08% to 20.47% in the open setting. Conclusion: All patients, although with mixed results, achieved statistical improvement in all speech tests that were employed.

Relevance:

30.00%

Publisher:

Abstract:

In research on Silent Speech Interfaces (SSI), different sources of information (modalities) have been combined with the aim of obtaining better performance than the individual modalities achieve. However, when these modalities are combined, the dimensionality of the feature space increases rapidly, leading to the well-known "curse of dimensionality". As a consequence, in order to extract useful information from this data, one has to resort to feature selection (FS) techniques to lower the dimensionality of the learning space. In this paper, we assess the impact of FS techniques on silent speech data, in a dataset with four non-invasive and promising modalities, namely video, depth, ultrasonic Doppler sensing, and surface electromyography. We consider two supervised (mutual information and Fisher's ratio) and two unsupervised (mean-median and arithmetic mean-geometric mean) FS filters. The evaluation was made by assessing the classification accuracy (word recognition error) of three well-known classifiers (k-nearest neighbors, support vector machines, and dynamic time warping). The key results of this study show that both unsupervised and supervised FS techniques improve classification accuracy on both individual and combined modalities. For instance, on the video component, we attain relative performance gains of 36.2% in error rates. FS is also useful as pre-processing for feature fusion. Copyright © 2014 ISCA.
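To make the idea of a supervised FS filter concrete, here is a small sketch of ranking features by a Fisher-style ratio. The abstract does not give the exact formula used, so the multi-class form below (between-class scatter divided by within-class scatter, computed per feature), the function names, and the toy data are assumptions.

```python
def fisher_scores(X, y):
    """Per-feature Fisher-style score: between-class scatter divided by
    within-class scatter. X is a list of feature rows, y the class labels."""
    classes = sorted(set(y))
    scores = []
    for f in range(len(X[0])):
        column = [row[f] for row in X]
        overall_mean = sum(column) / len(column)
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(column, y) if label == c]
            class_mean = sum(vals) / len(vals)
            between += len(vals) * (class_mean - overall_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in vals)
        if within > 0:
            scores.append(between / within)
        else:
            # No spread inside any class: perfectly separable or constant.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_top_k(X, y, k):
    """Keep the k highest-scoring features.
    Returns (kept feature indices, reduced data)."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda f: scores[f], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[f] for f in keep] for row in X]


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 does not.
    X = [[0.0, 5.0], [0.2, 1.0], [1.0, 4.0], [1.2, 2.0]]
    y = [0, 0, 1, 1]
    print(fisher_scores(X, y))
    print(select_top_k(X, y, 1))
```

A filter like this scores each feature independently of the classifier, so the reduced feature set can be passed unchanged to any of the classifiers listed in the abstract.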

Relevance:

30.00%

Publisher:

Abstract:

It gives me great pleasure to accept the invitation to address this conference on “Meeting the Challenges of Cultural Diversity in the Irish Healthcare Sector” which is being organised by the Irish Health Services Management Institute in partnership with the National Consultative Committee on Racism and Interculturalism. The conference provides an important opportunity to develop our knowledge and understanding of the issues surrounding cultural diversity in the health sector from the twin perspectives of patients and staff. Cultural diversity has over recent years become an increasingly visible aspect of Irish society bringing with it both opportunities and challenges. It holds out great possibilities for the enrichment of all who live in Ireland but it also challenges us to adapt creatively to the changes required to realise this potential and to ensure that the experience is a positive one for all concerned but particularly for those in the minority ethnic groups. In the last number of years in particular, the focus has tended to be on people coming to this country either as refugees, asylum seekers or economic migrants. Government figures estimate that as many as 340,000 immigrants are expected in the next six years. However ethnic and cultural diversity are not new phenomena in Ireland. Travellers have a long history as an indigenous minority group in Ireland with a strong culture and identity of their own. The changing experience and dynamics of their relationship with the wider society and its institutions over time can, I think, provide some valuable lessons for us as we seek to address the more numerous and complex issues of cultural diversity which have arisen for us in the last decade. 
Turning more specifically to the health sector which is the focus of this conference, culture and identity have particular relevance to health service policy and provision in that The first requirement is that we in the health service acknowledge cultural diversity and the differences in behaviours and in the less obvious areas of values and beliefs that this often implies. Only by acknowledging these differences in a respectful way and informing ourselves of them can we address them. Our equality legislation – The Employment Equality Act, 1998 and the Equal Status Act, 2000 – prohibits discrimination on nine grounds including race and membership of the Traveller community. The Equal Status Act prohibits discrimination on an individual basis in relation to the nine grounds while for groups it provides for the promotion of equality of opportunity. The Act applies to the provision of services including health services. I will speak first about cultural diversity in relation to the patient. In this respect it is worth mentioning that the recognition of cultural diversity and appropriate responses to it were issues which were strongly emphasised in the public consultation process which we held earlier this year in the context of developing National Anti-Poverty targets for the health sector and also our new national health strategy. Awareness and sensitivity training for staff is a key requirement for adapting to a culturally diverse patient population. The focus of this training should be the development of the knowledge and skills to provide services sensitive to cultural diversity. Such training can often be most effectively delivered in partnership with members of the minority groups themselves. I am aware that the Traveller community, for example, is involved in in-service training for health care workers. 
I am also aware that the National Consultative Committee on Racism and Interculturalism has been involved in training with the Eastern Regional Health Authority. We need to have more such initiatives. A step beyond the sensitivity training for existing staff is the training of members of the minority communities themselves as workers in our health services. Again the Traveller community has set an example in this area with its Primary Health Care Project for Travellers. The Primary Health Care for Travellers Project was established in 1994 as a joint partnership initiative with the Eastern Health Board and Pavee Point, with ongoing technical assistance being provided from the Department of Community Health and General Practice, Trinity College, Dublin. This project was the first of its kind in the country and has facilitated The project included a training course which concentrated on skills development, capacity building and the empowerment of Travellers. This confidence and skill allowed the Community Health Workers to go out and conduct a baseline survey to identify and articulate Travellers’ health needs. This was the first time that Travellers were involved in this process; in the past their needs were assumed. The results of the survey were fed back to the community and they prioritised their needs and suggested changes to the health services which would facilitate their access and utilisation. Ongoing monitoring and data collection demonstrate a big improvement in levels of satisfaction and uptake and utilisation of health services by Travellers in the pilot area. This Primary Health Care for Travellers initiative is being replicated in three other areas around the country and funding has been approved for a further 9 new projects. This pilot project was the recipient of a WHO 50th anniversary commemorative award in 1998. The project is developing as a model of good practice which could inspire further initiatives of this type for other minority groups.
Access to information has been identified in numerous consultative processes as a key factor in enabling people to take a proactive approach to managing their own health and that of their families and in facilitating their access to health services. Honouring our commitment to equity in these areas requires that information is provided in culturally appropriate formats. The National Health Promotion Strategy 2000-2005, for example, recognises that there exist within our society many groups with different requirements which need to be identified and accommodated when planning and implementing health promotion interventions. These groups include Travellers, refugees and asylum seekers, people with intellectual, physical or sensory disability and the gay and lesbian community. The Strategy acknowledges the challenge involved in being sensitive to the potential differences in patterns of poor health among these different groups. The Strategic aim is to promote the physical, mental and social well-being of individuals from these groups. The objective of the Strategy on these issues are: While our long term aim may be to mainstream responses so that our health service is truly multicultural, we must recognise the need at this point in time for very specific focused responses particularly for groups with poor health status such as Travellers and also for refugees and asylum seekers. In the case of refugees and asylum seekers examples of targeted services are screening for communicable diseases – offered on a voluntary basis – and psychological support services for those who have suffered trauma before coming here. The two approaches of targeting and mainstreaming are not mutually exclusive. A combination of both is required at this point in time but the balance between them must be kept under constant review in the light of changing needs. A major requirement if we are to meet the challenge of cultural diversity is an appropriate data and research base.
I think it is important that we build up our information and research data base in partnership with the minority groups themselves. We must establish what the health needs of diverse groups are; we must monitor uptake of services and how well we are responding to needs and we must monitor outcomes and health status. We must also examine the impact of the policies in other sectors on the health of minority groups. The National Health Information Strategy, currently being developed, and the recently published National Strategy for Health Research – Making Knowledge Work for Health provide important frameworks within which we can improve our data and research base.
A culturally diverse health sector workforce – challenges and opportunities
The Irish health service can benefit greatly from successful international recruitment. There has been a strong non-national representation amongst the medical profession for more than 30 years. More recently there have been significant increases in other categories of health service workers from overseas. The Department recognises the enormous value that overseas recruitment brings over a wide range of services and supports the development of effective and appropriate recruitment strategies in partnership with health service employers. These changes have made cultural diversity an important issue for all health service organisations. Diversity in the workplace is primarily about creating a culture that seeks, respects, values and harnesses difference. This includes all the differences that when added together make each person unique. So instead of the focus being on particular groups, diversity is about all of us. Change is not about helping “them” to join “us” but about critically looking at “us” and rooting out all aspects of our culture that inappropriately exclude people and prevent us from being inclusive in the way we relate to employees, potential employees and clients of the health service.
International recruitment benefits consumers, Irish employees and the overseas personnel alike. Regardless of whether they are employed by the health service, members of minority groups will be clients of our service and consequently we need to be flexible in order to accommodate different cultural needs. For staff, we recognise that coming from other cultures can be a difficult transition. Consequently health service employers have made strong efforts to assist them during this period. Many organisations provide induction courses, religious facilities (such as prayer rooms) and help in finding suitable accommodation. The Health Service Employers Agency (HSEA) is developing an equal opportunities/diversity strategy and action plans as well as training programmes to support their implementation, to ensure that all health service employment policies and practices promote the equality/diversity agenda to continue the development of a culturally diverse health service. The management of this new environment is extremely important for the health service as it offers an opportunity to go beyond set legal requirements and to strive for an acceptance and nurturing of cultural differences. Workforce cultural diversity affords us the opportunity to learn from the working practices and perspectives of others by allowing personnel to present their ideas and experience through teamwork, partnership structures and other appropriate fora, leading to further improvement in the services we provide. It is important to ensure that both personnel units and line managers communicate directly with their staff and demonstrate by their actions that they intend to create an inclusive work place which doesn't demand that minority staff fit. Contented, valued employees who feel that there is a place for them in the organisation will deliver a high quality health service.
Your conference here today has two laudable aims – to heighten awareness and assist health care staff to work effectively with their colleagues from different cultural backgrounds and to gain a greater understanding of the diverse needs of patients from minority ethnic backgrounds. There is a synergy in these aims and in the tasks to which they give rise in the management of our health service. The creative adaptations required for one have the potential to feed into the other. I would like to commend both organisations which are hosting this conference for their initiative in making this event happen, particularly at this time – Racism in the Workplace Week. I look forward very much to hearing the outcome of your deliberations. Thank you.

Relevance:

30.00%

Publisher:

Abstract:

In this paper we propose the inversion of nonlinear distortions in order to improve the recognition rates of a speaker recognition system. We study the effect of saturation on the test signals, to account for real situations where the training material has been recorded under controlled conditions but the test signals present some mismatch in input signal level (saturation). The experimental results show that, for speaker recognition, a combination of several strategies can improve the recognition rate with saturated test sentences from 80% to 89.39%, while the result with clean speech (without saturation) is 87.76% for one microphone. For speaker identification, it can reduce the minimum detection cost function with saturated test sentences from 6.42% to 4.15%, while the results with clean speech (without saturation) are 5.74% for one microphone and 7.02% for the other.
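The abstract does not say which distortion model or inversion method the authors use. Purely as an illustration of what inverting a saturation nonlinearity can mean, the sketch below assumes a memoryless soft saturation `y = tanh(gain * x)` with a known gain and undoes it with the inverse hyperbolic tangent. The model, the function names, and the `gain` parameter are all assumptions.

```python
import math


def saturate(samples, gain):
    """Assumed distortion model: memoryless soft saturation y = tanh(gain * x)."""
    return [math.tanh(gain * x) for x in samples]


def invert_saturation(samples, gain, eps=1e-12):
    """Undo the assumed model: x = atanh(y) / gain.
    Samples are clamped just inside (-1, 1) because atanh diverges at +/-1,
    so fully saturated samples cannot be recovered exactly."""
    limit = 1.0 - eps
    restored = []
    for y in samples:
        y = max(-limit, min(limit, y))
        restored.append(math.atanh(y) / gain)
    return restored


if __name__ == "__main__":
    clean = [0.0, 0.1, -0.3, 0.5, -0.7]
    distorted = saturate(clean, gain=3.0)
    print(distorted)
    print(invert_saturation(distorted, gain=3.0))
```

In practice the gain and the shape of the nonlinearity would have to be estimated from the signal. Hard clipping in particular destroys information that no pointwise inverse can restore, which is presumably why the paper combines several strategies.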