930 resultados para Java speech
Resumo:
Dissertação apresentada na Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa para obtenção do grau de Mestre em Engenharia Informática
Resumo:
Nos últimos anos começaram a ser vulgares os computadores dotados de multiprocessadores e multi-cores. De modo a aproveitar eficientemente as novas características desse hardware começaram a surgir ferramentas para facilitar o desenvolvimento de software paralelo, através de linguagens e frameworks, adaptadas a diferentes linguagens. Com a grande difusão de redes de alta velocidade, tal como Gigabit Ethernet e a última geração de redes Wi-Fi, abre-se a oportunidade de, além de paralelizar o processamento entre processadores e cores, poder em simultâneo paralelizá-lo entre máquinas diferentes. Ao modelo que permite paralelizar processamento localmente e em simultâneo distribuí-lo para máquinas que também têm capacidade de o paralelizar, chamou-se “modelo paralelo distribuído”. Nesta dissertação foram analisadas técnicas e ferramentas utilizadas para fazer programação paralela e o trabalho que está feito dentro da área de programação paralela e distribuída. Tendo estes dois factores em consideração foi proposta uma framework que tenta aplicar a simplicidade da programação paralela ao conceito paralelo distribuído. A proposta baseia-se na disponibilização de uma framework em Java com uma interface de programação simples, de fácil aprendizagem e legibilidade que, de forma transparente, é capaz de paralelizar e distribuir o processamento. Apesar de simples, existiu um esforço para a tornar configurável de forma a adaptar-se ao máximo de situações possível. Nesta dissertação serão exploradas especialmente as questões relativas à execução e distribuição de trabalho, e a forma como o código é enviado de forma automática pela rede, para outros nós cooperantes, evitando assim a instalação manual das aplicações em todos os nós da rede. Para confirmar a validade deste conceito e das ideias defendidas nesta dissertação foi implementada esta framework à qual se chamou DPF4j (Distributed Parallel Framework for JAVA) e foram feitos testes e retiradas métricas para verificar a existência de ganhos de performance em relação às soluções já existentes.
Resumo:
Dissertação apresentada na Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa para obtenção do grau de Mestre em Engenharia Informática
Resumo:
Trabalho apresentado no âmbito do Mestrado em Engenharia Informática, como requisito parcial para obtenção do grau de Mestre em Engenharia Informática
Resumo:
Dissertação apresentada na Faculdade de Ciências e Tecnologias da Universidade Nova de Lisboa para a obtenção do Grau de Mestre em Engenharia Informática
Resumo:
As the wireless cellular market reaches competitive levels never seen before, network operators need to focus on maintaining Quality of Service (QoS) a main priority if they wish to attract new subscribers while keeping existing customers satisfied. Speech Quality as perceived by the end user is one major example of a characteristic in constant need of maintenance and improvement. It is in this topic that this Master Thesis project fits in. Making use of an intrusive method of speech quality evaluation, as a means to further study and characterize the performance of speech codecs in second-generation (2G) and third-generation (3G) technologies. Trying to find further correlation between codecs with similar bit rates, along with the exploration of certain transmission parameters which may aid in the assessment of speech quality. Due to some limitations concerning the audio analyzer equipment that was to be employed, a different system for recording the test samples was sought out. Although the new designed system is not standard, after extensive testing and optimization of the system's parameters, final results were found reliable and satisfactory. Tests include a set of high and low bit rate codecs for both 2G and 3G, where values were compared and analysed, leading to the outcome that 3G speech codecs perform better, under the approximately same conditions, when compared with 2G. Reinforcing the idea that 3G is, with no doubt, the best choice if the costumer looks for the best possible listening speech quality. Regarding the transmission parameters chosen for the experiment, the Receiver Quality (RxQual) and Received Energy per Chip to the Power Density Ratio (Ec/N0), these were subject to speech quality correlation tests. Final results of RxQual were compared to those of prior studies from different researchers and, are considered to be of important relevance. Leading to the confirmation of RxQual as a reliable indicator of speech quality. As for Ec/N0, it is not possible to state it as a speech quality indicator however, it shows clear thresholds for which the MOS values decrease significantly. The studied transmission parameters show that they can be used not only for network management purposes but, at the same time, give an expected idea to the communications engineer (or technician) of the end-to-end speech quality consequences. With the conclusion of the work new ideas for future studies come to mind. Considering that the fourth-generation (4G) cellular technologies are now beginning to take an important place in the global market, as the first all-IP network structure, it seems of great relevance that 4G speech quality should be subject of evaluation. Comparing it to 3G, not only in narrowband but also adding wideband scenarios with the most recent standard objective method of speech quality assessment, POLQA. Also, new data found on Ec/N0 tests, justifies further research studies with the intention of validating the assumptions made in this work.
Resumo:
Java! Uma palavra nova e misteriosa que consegue conjugar imagens, movimento, internet, programação, café quente e o nome de uma das 10,000 ilhas dos nossos amigos da Indonésia. No entanto, é sobre os quatro primeiros conceitos que se trata este projecto de “JAVA Applets, um comboio em início de marcha”. Se o HTML vem ligar imagens à internet, o JAVA tenta dar uma certa interactividade utilizando programação mas, tentando também, que os utilizadores finais a possam utilizar sem grande esforço.
Resumo:
In this work an adaptive modeling and spectral estimation scheme based on a dual Discrete Kalman Filtering (DKF) is proposed for speech enhancement. Both speech and noise signals are modeled by an autoregressive structure which provides an underlying time frame dependency and improves time-frequency resolution. The model parameters are arranged to obtain a combined state-space model and are also used to calculate instantaneous power spectral density estimates. The speech enhancement is performed by a dual discrete Kalman filter that simultaneously gives estimates for the models and the signals. This approach is particularly useful as a pre-processing module for parametric based speech recognition systems that rely on spectral time dependent models. The system performance has been evaluated by a set of human listeners and by spectral distances. In both cases the use of this pre-processing module has led to improved results.
Resumo:
Speech interfaces for Assistive Technologies are not common and are usually replaced by others. The market they are targeting is not considered attractive and speech technologies are still not well spread. Industry still thinks they present some performance risks, especially Speech Recognition systems. As speech is the most elemental and natural way for communication, it has strong potential for enhancing inclusion and quality of life for broader groups of users with special needs, such as people with cerebral palsy and elderly staying at their homes. This work is a position paper in which the authors argue for the need to make speech become the basic interface in assistive technologies. Among the main arguments, we can state: speech is the easiest way to interact with machines; there is a growing market for embedded speech in assistive technologies, since the number of disabled and elderly people is expanding; speech technology is already mature to be used but needs adaptation to people with special needs; there is still a lot of R&D to be done in this area, especially when thinking about the Portuguese market. The main challenges are presented and future directions are proposed.
Resumo:
In this paper, a rule-based automatic syllabifier for Danish is described using the Maximal Onset Principle. Prior success rates of rule-based methods applied to Portuguese and Catalan syllabification modules were on the basis of this work. The system was implemented and tested using a very small set of rules. The results gave rise to 96.9% and 98.7% of word accuracy rate, contrary to our initial expectations, being Danish a language with a complex syllabic structure and thus difficult to be rule-driven. Comparison with data-driven syllabification system using artificial neural networks showed a higher accuracy rate of the former system.
Resumo:
The recent developments on Hidden Markov Models (HMM) based speech synthesis showed that this is a promising technology fully capable of competing with other established techniques. However some issues still lack a solution. Several authors report an over-smoothing phenomenon on both time and frequencies which decreases naturalness and sometimes intelligibility. In this work we present a new vowel intelligibility enhancement algorithm that uses a discrete Kalman filter (DKF) for tracking frame based parameters. The inter-frame correlations are modelled by an autoregressive structure which provides an underlying time frame dependency and can improve time-frequency resolution. The system’s performance has been evaluated using objective and subjective tests and the proposed methodology has led to improved results.
Resumo:
In this work an adaptive filtering scheme based on a dual Discrete Kalman Filtering (DKF) is proposed for Hidden Markov Model (HMM) based speech synthesis quality enhancement. The objective is to improve signal smoothness across HMMs and their related states and to reduce artifacts due to acoustic model's limitations. Both speech and artifacts are modelled by an autoregressive structure which provides an underlying time frame dependency and improves time-frequency resolution. Themodel parameters are arranged to obtain a combined state-space model and are also used to calculate instantaneous power spectral density estimates. The quality enhancement is performed by a dual discrete Kalman filter that simultaneously gives estimates for the models and the signals. The system's performance has been evaluated using mean opinion score tests and the proposed technique has led to improved results.
Resumo:
As plataformas com múltiplos núcleos tornaram a programação paralela/concorrente num tópico de interesse geral. Diversos modelos de programação têm vindo a ser propostos, facilitando aos programadores a identificação de regiões de código potencialmente paralelizáveis, deixando ao sistema operativo a tarefa de as escalonar dinamicamente em tempo de execução, explorando o maior grau possível de paralelismo. O Java não foge a esta tendência, disponibilizando ao programador um número crescente de bibliotecas de mecanismos de sincronização e paralelização de código. Neste contexto, esta tese apresenta e discute um conjunto de resultados obtidos através de testes intensivos à eficiência de algoritmos de ordenação implementados com recurso aos mecanismos de concorrência da API do Java 8 (Threads, Threadpools, ExecutorService, CountdownLach, ExecutorCompletionService e ForkJoinPools) em sistemas com um número de núcleos variável. Para cada um dos mecanismos, são apresentadas conclusões sobre o seu funcionamento e discutidos os cenários em que o seu uso pode ser rentabilizado de modo a serem obtidos melhores tempos de execução.
Resumo:
Dissertação para obtenção do Grau de Mestre em Engenharia Informática
Resumo:
Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children’s speech production and limit their intelligibility. We describe the development of a scale to classify children’s speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and parents classified the speech of 139 children with cerebral palsy (85 boys, 54 girls; mean age 6.03 years, SD 1.09) from observation and previous knowledge of the children. Another group of health professionals rated children’s speech from information in their medical notes. With the exception of parents, raters reclassified children’s speech at least four weeks after their initial classification. Raters were asked to rate how easy the scale was to use and how well the scale described the child’s speech production using Likert scales. Inter-rater reliability was moderate to substantial (k > .58 for all comparisons). Test–retest reliability was substantial to almost perfect for all groups (k > .68). Over 74% of raters found the scale easy or very easy to use; 66% of parents and over 70% of health care professionals judged the scale to describe children’s speech well or very well. We conclude that the Viking Speech Scale is a reliable tool to describe the speech performance of children with cerebral palsy, which can be applied through direct observation of children or through case note review.