871 results for Speech enhancement


Relevance:

20.00% 20.00%

Publisher:

Abstract:

The first and second authors gratefully acknowledge the support of the PhD grants with references SFRH/BD/28817/2006 and SFRH/PROTEC/49517/2009, respectively, from Fundação para a Ciência e Tecnologia (FCT). This work was partially done in the scope of the project “Methodologies to Analyze Organs from Complex Medical Images – Applications to Female Pelvic Cavity”, with reference PTDC/EEA-CRO/103320/2008, financially supported by FCT.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The mechanisms of speech production are complex and have attracted attention from researchers in both the medical and computer vision fields. Within speech production, studying the articulators is difficult because they have a high degree of freedom throughout the process; the tongue in particular is hard to control and observe. This work automatically characterizes the shape of the tongue during the articulation of the oral vowels of European Portuguese by applying statistical modeling to MR images. A point distribution model is built from a set of images collected during artificially sustained articulations of European Portuguese sounds, and it can extract the main characteristics of the motion of the tongue. The model allows a clearer understanding of the dynamic speech events involved in sustained articulations. The tongue shape model can also be useful for speech rehabilitation, specifically for recognizing the compensatory movements of the articulators during speech production.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: In Portugal, the routine clinical practice of speech and language therapists (SLTs) in treating children with all types of speech sound disorder (SSD) continues to be articulation therapy (AT). There is limited use of phonological therapy (PT) or phonological awareness training in Portugal. Additionally, at an international level there is a focus on collecting information on and differentiating between the effectiveness of PT and AT for children with different types of phonologically based SSD, as well as on the role of phonological awareness in remediating SSD. It is important to collect more evidence for the most effective and efficient type of intervention approach for different SSDs and for these data to be collected from diverse linguistic and cultural perspectives. Aims: To evaluate the effectiveness of a PT and AT approach for treatment of 14 Portuguese children, aged 4.0–6.7 years, with a phonologically based SSD. Methods & Procedures: The children were randomly assigned to one of the two treatment approaches (seven children in each group). All children were treated by the same SLT, blind to the aims of the study, over three blocks of a total of 25 weekly sessions of intervention. Outcome measures of phonological ability (percentage of consonants correct (PCC), percentage occurrence of different phonological processes and phonetic inventory) were taken before and after intervention. A qualitative assessment of intervention effectiveness from the perspective of the parents of participants was included. Outcomes & Results: Both treatments were effective in improving the participants’ speech, with the children receiving PT showing a more significant improvement in PCC score than those receiving AT. Children in the PT group also showed greater generalization to untreated words than those receiving AT. Parents reported both intervention approaches to be equally effective in improving their children’s speech.
Conclusions & Implications: The PT (combination of expressive phonological tasks, phonological awareness, listening and discrimination activities) proved to be an effective integrated method of improving phonological SSD in children. These findings provide some evidence for Portuguese SLTs to employ PT with children with phonologically based SSD.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The relation between automatic auditory discrimination, measured with the mismatch negativity (MMN), and the type of stimulus has not been well established in the literature, despite its importance as an electrophysiological measure of central sound representation. In this study, the MMN response was elicited with pure-tone and speech stimuli in a binaural passive auditory oddball paradigm, in a group of 8 normal young adult subjects, at the same intensity level (75 dB SPL). In the pure-tone oddball the frequency difference was 100 Hz (standard = 1 000 Hz; deviant = 1 100 Hz; same duration = 100 ms); in the speech oddball (standard /ba/; deviant /pa/; same duration = 175 ms) the Portuguese phonemes are both bilabial plosives, in order to maintain a narrow frequency band. Differences between speech and pure-tone stimuli were found across electrode locations. Compared with pure tones, speech elicited larger MMN amplitude, longer duration and longer latency at Cz and Fz, as well as significant differences in latency and amplitude between mastoids. The results suggest that speech may be processed differently from non-speech, and that this processing may occur at a later stage because of overlapping processes, since speech processing requires more neural resources.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In research on Silent Speech Interfaces (SSI), different sources of information (modalities) have been combined with the aim of obtaining better performance than the individual modalities. However, when these modalities are combined, the dimensionality of the feature space increases rapidly, yielding the well-known "curse of dimensionality". As a consequence, in order to extract useful information from this data, one has to resort to feature selection (FS) techniques to lower the dimensionality of the learning space. In this paper, we assess the impact of FS techniques on silent speech data, in a dataset with 4 non-invasive and promising modalities, namely: video, depth, ultrasonic Doppler sensing, and surface electromyography. We consider two supervised (mutual information and Fisher's ratio) and two unsupervised (mean-median and arithmetic mean-geometric mean) FS filters. The evaluation was made by assessing the classification accuracy (word recognition error) of three well-known classifiers (k-nearest neighbors, support vector machines, and dynamic time warping). The key results of this study show that both unsupervised and supervised FS techniques improve the classification accuracy on both individual and combined modalities. For instance, on the video component, we attain relative performance gains of 36.2% in error rates. FS is also useful as pre-processing for feature fusion. Copyright © 2014 ISCA.
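
To illustrate what a supervised FS filter such as Fisher's ratio does, the following is a minimal sketch, not the authors' implementation. It assumes a common two-class form of the ratio (squared difference of class means divided by the sum of class variances), whereas the study deals with multi-class word recognition; the function names `fisher_ratio` and `select_top_features` are hypothetical.

```python
from statistics import mean, pvariance


def fisher_ratio(values_a, values_b):
    """Two-class Fisher's ratio for one feature (assumed form):
    (difference of class means)^2 / (sum of class variances)."""
    spread = pvariance(values_a) + pvariance(values_b)
    gap = (mean(values_a) - mean(values_b)) ** 2
    if spread == 0:
        # Degenerate case: no within-class variation at all.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def select_top_features(samples, labels, keep):
    """Score every feature column and return the indices of the `keep` best ones."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        a = [row[j] for row, y in zip(samples, labels) if y == classes[0]]
        b = [row[j] for row, y in zip(samples, labels) if y == classes[1]]
        scores.append(fisher_ratio(a, b))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


# Made-up data: feature 0 separates the classes, feature 1 does not.
samples = [[0.0, 5.0], [0.1, 1.0], [1.0, 4.0], [1.1, 2.0]]
labels = [0, 0, 1, 1]
print(select_top_features(samples, labels, keep=1))
```

A filter of this kind scores each feature independently of the classifier, which is why it can be applied as pre-processing before any of the classifiers named in the abstract.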

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The effect of monopolar and bipolar shaped pulses on the additional yield of apple juice extraction is evaluated. The applied electric field strength, pulse width, and number of pulses are assessed for both pulse types, and divergences are analyzed. The electric field strength ranges from 100 to 1300 V/cm, the pulse width from 20 to 300 µs, and the number of pulses from 10 to 200, at a frequency of 200 Hz. Two pulse trains separated by 1 s are applied to apple cubes. Results are plotted against reference untreated samples for all assays. Specific energy consumption is calculated for each experiment, as well as qualitative indicators for apple juice: total soluble dry matter and absorbance at 390-nm wavelength. Bipolar pulses demonstrated higher efficiency, and specific energy consumption has a threshold above which higher inputs of energy do not result in higher juice extraction when the electric field is varied. Total soluble dry matter and absorbance results do not show significant differences between the application of monopolar and bipolar pulses, but all values are within the limits proposed for apple juice intended for human consumption.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Undesirable void formation during the injection phase of the liquid composite molding process can be understood as a consequence of the non-uniformity of the flow front progression, caused by the dual porosity of the fiber preform. Therefore the best examination of the void formation physics can be provided by a mesolevel analysis, where the characteristic dimension is given by the fiber tow diameter. In mesolevel analysis, liquid impregnation along two different scales (inside fiber tows and within the spaces between them) must be considered, and the coupling between these flow regimes must be addressed. In such a case, it is extremely important to account correctly for the surface tension effects, which can be modeled as capillary pressure applied at the flow front. When the continuous Galerkin method is used, exploiting elements with velocity components and pressure as nodal variables, strong numerical implementation of such boundary conditions makes the problem ill-posed, in terms of the weak classical as well as the stabilized formulation. As a consequence, an error in mass conservation accumulates, especially along the free flow front. This article presents a numerical procedure that was formulated and implemented in the existing Free Boundary Program in order to significantly reduce this error.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Undesirable void formation during the injection phase of the liquid composite moulding process can be understood as a consequence of the non-uniformity of the flow front progression, caused by the dual porosity of the fibre preform. Therefore the best examination of the void formation physics can be provided by a mesolevel analysis, where the characteristic dimension is given by the fibre tow diameter. In mesolevel analysis, liquid impregnation along two different scales (inside fibre tows and within the open spaces between them) must be considered, and the coupling between these flow regimes must be addressed. In such a case, it is extremely important to account correctly for the surface tension effects, which can be modelled as capillary pressure applied at the flow front. Numerical implementation of such boundary conditions makes the problem ill-posed, in terms of the weak classical as well as the stabilized formulation. As a consequence, an error in mass conservation accumulates, especially along the free flow front. This contribution presents a numerical procedure that was formulated and implemented in the existing Free Boundary Program in order to significantly reduce this error.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In a time of fierce competition between regions, an image serves as a basis for developing a strong sense of community, which fosters trust and cooperation that can be mobilized for regional growth. A positive image and reputation could be used in the promotional activities of the region, benefiting all the stakeholders as a whole. Mega cultural events are frequently used to attract tourists and investments to a region, but also to enhance the city’s image. This study adopts a marketing/communication perspective of a city’s image and intends to explain how the image of the city is perceived by its residents. Specifically, we intend to compare the perceptions of residents who actively participated in the Guimarães European Capital of Culture (ECOC) 2012 (engaged residents) with those of residents who only attended the event (attendees). Several significant findings are reported and their implications for event managers and public policy administrators presented, along with the limitations of the study.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

As the wireless cellular market reaches unprecedented levels of competition, network operators need to keep Quality of Service (QoS) a main priority if they wish to attract new subscribers while keeping existing customers satisfied. Speech quality as perceived by the end user is one major example of a characteristic in constant need of maintenance and improvement, and it is the topic of this Master's thesis project. The project uses an intrusive method of speech quality evaluation to further study and characterize the performance of speech codecs in second-generation (2G) and third-generation (3G) technologies. It seeks further correlation between codecs with similar bit rates and explores certain transmission parameters that may aid in the assessment of speech quality. Because of limitations in the audio analyzer equipment that was to be employed, a different system for recording the test samples was sought. Although the newly designed system is not standard, after extensive testing and optimization of its parameters the final results were found reliable and satisfactory. The tests cover a set of high and low bit rate codecs for both 2G and 3G; comparing and analysing the values led to the outcome that 3G speech codecs perform better than 2G codecs under approximately the same conditions. This reinforces the idea that 3G is undoubtedly the best choice if the customer looks for the best possible listening speech quality. The transmission parameters chosen for the experiment, Receiver Quality (RxQual) and Received Energy per Chip to the Power Density Ratio (Ec/N0), were subject to speech quality correlation tests. The final RxQual results were compared with those of prior studies by other researchers and are considered to be of significant relevance, confirming RxQual as a reliable indicator of speech quality.
As for Ec/N0, it cannot be stated to be a speech quality indicator; however, it shows clear thresholds at which the MOS values decrease significantly. The studied transmission parameters can be used not only for network management purposes but also to give the communications engineer (or technician) an idea of the expected end-to-end speech quality consequences. The conclusion of the work suggests new ideas for future studies. Considering that fourth-generation (4G) cellular technologies are now beginning to take an important place in the global market, as the first all-IP network structure, it seems highly relevant that 4G speech quality should be evaluated and compared with 3G, not only in narrowband but also in wideband scenarios, with the most recent standard objective method of speech quality assessment, POLQA. In addition, new data found in the Ec/N0 tests justify further research to validate the assumptions made in this work.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Speech interfaces for Assistive Technologies are not common and are usually replaced by other interfaces. The market they target is not considered attractive, and speech technologies are not yet widespread. Industry still thinks they present some performance risks, especially Speech Recognition systems. As speech is the most elemental and natural means of communication, it has strong potential for enhancing inclusion and quality of life for broader groups of users with special needs, such as people with cerebral palsy and elderly people living at home. This work is a position paper in which the authors argue for the need to make speech the basic interface in assistive technologies. Among the main arguments, we can state: speech is the easiest way to interact with machines; there is a growing market for embedded speech in assistive technologies, since the number of disabled and elderly people is expanding; speech technology is already mature enough to be used but needs adaptation to people with special needs; there is still a lot of R&D to be done in this area, especially when thinking about the Portuguese market. The main challenges are presented and future directions are proposed.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper, a rule-based automatic syllabifier for Danish, based on the Maximal Onset Principle, is described. Prior success rates of rule-based methods applied to Portuguese and Catalan syllabification modules were the basis for this work. The system was implemented and tested using a very small set of rules. The results were word accuracy rates of 96.9% and 98.7%, contrary to our initial expectations, given that Danish is a language with a complex syllabic structure and thus difficult to handle with rules. Comparison with a data-driven syllabification system using artificial neural networks showed a higher accuracy rate for the former system.
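
The Maximal Onset Principle assigns as many intervocalic consonants as possible to the onset of the following syllable, provided the resulting onset is legal in the language. The sketch below is a toy orthographic illustration of that idea, not the system described in the abstract: the vowel set, the `LEGAL_ONSETS` inventory, and the function name `syllabify` are all assumptions, the onset list is deliberately incomplete, and every vowel letter is treated as its own nucleus.

```python
# Assumption: each of these letters is a syllable nucleus on its own.
VOWELS = set("aeiouyæøå")

# Hypothetical, incomplete inventory of permitted onsets ("" = no onset).
LEGAL_ONSETS = {
    "", "b", "d", "f", "g", "h", "j", "k", "l", "m", "n", "p", "r", "s", "t", "v",
    "bl", "br", "dr", "fl", "fr", "gl", "gr", "kl", "kr", "pl", "pr",
    "sk", "sl", "sp", "st", "tr", "skr", "spr", "str",
}


def syllabify(word):
    """Split a word into syllables using the Maximal Onset Principle."""
    word = word.lower()
    nuclei = [i for i, ch in enumerate(word) if ch in VOWELS]
    if not nuclei:
        return [word]

    boundaries = []
    for left, right in zip(nuclei, nuclei[1:]):
        cluster = word[left + 1:right]  # consonants between two nuclei
        # Try the longest candidate onset first; "" is always legal,
        # so the loop is guaranteed to find a split point.
        for k in range(len(cluster) + 1):
            if cluster[k:] in LEGAL_ONSETS:
                boundaries.append(left + 1 + k)
                break

    pieces, start = [], 0
    for b in boundaries:
        pieces.append(word[start:b])
        start = b
    pieces.append(word[start:])
    return pieces


print(syllabify("kalender"))
print(syllabify("andre"))
```

In this sketch the cluster "nd" in "kalender" is split because "nd" is not in the onset inventory while "d" is, so only "d" moves to the following syllable.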

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children’s speech production and limit their intelligibility. We describe the development of a scale to classify children’s speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and parents classified the speech of 139 children with cerebral palsy (85 boys, 54 girls; mean age 6.03 years, SD 1.09) from observation and previous knowledge of the children. Another group of health professionals rated children’s speech from information in their medical notes. With the exception of parents, raters reclassified children’s speech at least four weeks after their initial classification. Raters were asked to rate how easy the scale was to use and how well the scale described the child’s speech production using Likert scales. Inter-rater reliability was moderate to substantial (k > .58 for all comparisons). Test–retest reliability was substantial to almost perfect for all groups (k > .68). Over 74% of raters found the scale easy or very easy to use; 66% of parents and over 70% of health care professionals judged the scale to describe children’s speech well or very well. We conclude that the Viking Speech Scale is a reliable tool to describe the speech performance of children with cerebral palsy, which can be applied through direct observation of children or through case note review.
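
The reliability figures above are reported as k values. As a rough illustration of how such an agreement statistic is computed, the sketch below assumes unweighted Cohen's kappa between two raters; the abstract does not say which variant was used, and the function name `cohen_kappa` and the ratings are made up.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(ratings_a)

    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Made-up classifications of four children by two raters.
rater_1 = [1, 1, 1, 2]
rater_2 = [1, 1, 2, 2]
print(cohen_kappa(rater_1, rater_2))
```

Kappa corrects raw agreement for the agreement two raters would reach by chance, which is why it is lower than the simple proportion of matching ratings.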

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Dissertation submitted for the degree of Doctor in Chemical and Biochemical Engineering