988 results for Speech Rate
Abstract:
Hybrid system representations have been applied to many challenging modeling situations. In these hybrid system representations, a mixture of continuous and discrete states is used to capture the dominating behavioural features of a nonlinear, possibly uncertain, model under approximation. Unfortunately, the problem of how best to design a suitable hybrid system model has not yet been fully addressed. This paper proposes a new approach to this design problem based on the joint state-measurement relative entropy rate. Design examples and simulation studies are presented which highlight the benefits of our proposed design approaches.
Abstract:
In a context where over-indebtedness and financial exclusion have been recognised as problems in Australia, it is undesirable that those who can least afford it pay a high cost for short-term consumer credit. Evidence points to an increase in consumer debt in Australia and consequential over-indebtedness, which has been shown to lead to a wide range of social problems. There is also evidence of financial exclusion, where consumers suffer a lack of access to mainstream financial services, and in Australia this is particularly the case with regard to access to safe and affordable credit. Financial exclusion can only exacerbate over-indebtedness, given that financially excluded, predominantly low-income consumers have been shown to turn to high-cost credit to meet their short-term credit needs. This is a problem that has been explored most recently in the Victorian Consumer Credit Review...
Abstract:
The performance of automatic speech recognition systems deteriorates in the presence of noise. One known solution is to incorporate video information with an existing acoustic speech recognition system. We investigate the performance of the individual acoustic and visual sub-systems and then examine different ways in which the integration of the two systems may be performed. The system is to be implemented in real time on a Texas Instruments' TMS320C80 DSP.
Abstract:
This paper investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. It has been previously shown in our own work, and in the work of others, that features extracted from a speaker's moving lips hold speaker dependencies which are complementary to speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either sub-system. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
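The weighted fusion described in this abstract can be illustrated with a minimal sketch. It assumes that each sub-system emits a score on a comparable scale and that fusion is a convex combination controlled by a single weight. The function names, the grid search and the fixed decision threshold are hypothetical illustrations; they are not the weighting technique proposed in the paper, which the abstract does not specify.

```python
def fuse_scores(speech_score, lip_score, speech_weight):
    """Convex combination of two modality scores (assumed comparable in scale)."""
    if not 0.0 <= speech_weight <= 1.0:
        raise ValueError("speech_weight must lie in [0, 1]")
    return speech_weight * speech_score + (1.0 - speech_weight) * lip_score


def count_errors(trials, speech_weight, threshold):
    """Count (false acceptances, false rejections) at a fixed threshold.

    Each trial is a tuple (speech_score, lip_score, is_target_speaker).
    """
    false_accepts = 0
    false_rejects = 0
    for speech_score, lip_score, is_target in trials:
        accepted = fuse_scores(speech_score, lip_score, speech_weight) >= threshold
        if accepted and not is_target:
            false_accepts += 1
        elif not accepted and is_target:
            false_rejects += 1
    return false_accepts, false_rejects


def best_weight(trials, threshold, steps=10):
    """Hypothetical grid search: the weight giving the fewest total errors."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda w: sum(count_errors(trials, w, threshold)))
```

In a sketch like this, acoustic noise that degrades the speech scores would push the selected weight towards the lip modality, which is the qualitative behaviour the abstract reports.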
Abstract:
This paper investigates the use of temporal lip information, in conjunction with speech information, for robust, text-dependent speaker identification. We propose that significant speaker-dependent information can be obtained from moving lips, enabling speaker recognition systems to be highly robust in the presence of noise. The fusion structure for the audio and visual information is based around the use of multi-stream hidden Markov models (MSHMM), with audio and visual features forming two independent data streams. Recent work with multi-modal MSHMMs has been performed successfully for the task of speech recognition. The use of temporal lip information for speaker identification has been performed previously (T.J. Wark et al., 1998); however, this has been restricted to output fusion via single-stream HMMs. We present an extension to this previous work, and show that an MSHMM is a valid structure for multi-modal speaker identification.
Abstract:
The use of visual features in the form of lip movements to improve the performance of acoustic speech recognition has been shown to work well, particularly in noisy acoustic conditions. However, whether this technique can outperform speech recognition incorporating well-known acoustic enhancement techniques, such as spectral subtraction or multi-channel beamforming, is not known. This is an important question to be answered, especially in an automotive environment, for the design of an efficient human-vehicle computer interface. We perform a variety of speech recognition experiments on a challenging automotive speech dataset, and results show that synchronous HMM-based audio-visual fusion can outperform traditional single-channel as well as multi-channel acoustic speech enhancement techniques. We also show that further improvement in recognition performance can be obtained by fusing speech-enhanced audio with the visual modality, demonstrating the complementary nature of the two robust speech recognition approaches.
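For readers unfamiliar with spectral subtraction, the enhancement baseline named in this abstract, the following is a minimal sketch of its basic magnitude-domain form applied to a single frame. The abstract does not say which variant the authors used, so the over-subtraction factor, the spectral floor and the function name are assumptions made for illustration.

```python
def spectral_subtract(noisy_magnitudes, noise_magnitudes,
                      over_subtraction=1.0, floor=0.01):
    """Subtract a noise magnitude estimate from each frequency bin of one frame.

    A bin that would drop below `floor` times its noisy magnitude is clamped
    to that floor, so no bin ends up with a negative magnitude.
    """
    if len(noisy_magnitudes) != len(noise_magnitudes):
        raise ValueError("both spectra must have the same number of bins")
    return [
        max(noisy - over_subtraction * noise, floor * noisy)
        for noisy, noise in zip(noisy_magnitudes, noise_magnitudes)
    ]
```

The noise estimate would typically be averaged over frames judged to contain no speech; that step is omitted here.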
Abstract:
This paper proposes the use of eigenvoice modeling techniques with the Cross Likelihood Ratio (CLR) as a criterion for speaker clustering within a speaker diarization system. The CLR has previously been shown to be a robust decision criterion for speaker clustering using Gaussian Mixture Models. Recently, eigenvoice modeling techniques have become increasingly popular, due to their ability to adequately represent a speaker based on sparse training data, as well as their improved capture of differences in speaker characteristics. This paper hence proposes that it would be beneficial to capitalize on the advantages of eigenvoice modeling in a CLR framework. Results obtained on the 2002 Rich Transcription (RT-02) Evaluation dataset show an improved clustering performance, resulting in a 35.1% relative improvement in the overall Diarization Error Rate (DER) compared to the baseline system.
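The two quantities named in this abstract can be sketched as follows. The CLR function uses the commonly cited definition, in which each cluster's frames are scored against the other cluster's model and normalised by a background model; whether the paper uses exactly this form is an assumption. The inputs are per-frame log-likelihoods assumed to be computed elsewhere, and all function and parameter names are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def cross_likelihood_ratio(ll_i_given_j, ll_i_given_bg,
                           ll_j_given_i, ll_j_given_bg):
    """CLR between clusters i and j from per-frame log-likelihoods.

    ll_i_given_j:  log p(x | model of cluster j) for each frame x of cluster i
    ll_i_given_bg: log p(x | background model)   for each frame x of cluster i
    ll_j_given_i and ll_j_given_bg are the same quantities with roles swapped.
    A larger value suggests the two clusters belong to the same speaker.
    """
    term_i = mean(ll_i_given_j) - mean(ll_i_given_bg)
    term_j = mean(ll_j_given_i) - mean(ll_j_given_bg)
    return term_i + term_j


def relative_improvement(baseline_der, new_der):
    """Relative reduction in DER as a fraction of the baseline DER."""
    return (baseline_der - new_der) / baseline_der
```

As a purely illustrative calculation, a baseline DER of 20.0% falling to 12.98% would give a relative improvement of 35.1%. These absolute values are hypothetical; the abstract reports only the relative figure.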
Abstract:
A healthy human would be expected to show periodic blinks, making a brief closure of the eyelids. Most blinks are spontaneous, occurring regularly with no external stimulus. However, a reflex blink can occur in response to external stimuli such as a bright light, a sudden loud noise, or an object approaching the eyes. A voluntary or forced blink is another type of blink in which the person deliberately closes the eyes and the lower eyelid rises to meet the upper eyelid. A complete blink, in which the upper eyelid touches the lower eyelid, contributes to the health of the ocular surface by providing a fresh layer of tears as well as maintaining optical integrity by providing a smooth tear film over the cornea. The rate of blinking and its completeness vary depending on the task undertaken during blink assessment, the direction of gaze, the emotional state of the subjects and the method by which the blink was measured. It is also well known that wearing contact lenses (both rigid and soft lenses) can induce significant changes in blink rate and completeness. It has been established that efficient blinking plays an important role in ocular surface health during contact lens wear and in improving contact lens performance and comfort. Inefficient blinking during contact lens wear may be related to a low blink rate or incomplete blinking and can often be a reason for dry eye symptoms or ocular surface staining. It has previously been shown that upward gaze can affect blink rate, causing it to become faster. In the first experiment, it was decided to expand on previous studies in this area by examining the effect of various gaze directions (i.e. upward gaze, primary gaze, downward gaze and lateral gaze) as well as head angle (recumbent position) on normal subjects’ blink rate and completeness through the use of filming with a high-speed camera.
The results of this experiment showed that as the open palpebral aperture (and exposed ocular surface area) increased from downward gaze to upward gaze, the number of blinks significantly increased (p<0.04). Also, the size of the closed palpebral aperture significantly increased from downward gaze to upward gaze (p<0.005). A weak positive correlation (R² = 0.18) between the blink rate and ocular surface area was found in this study. Also, it was found that the subjects showed 81% complete blinks, 19% incomplete blinks and 2% twitch blinks in primary gaze, consistent with previous studies. The difference in the percentage of incomplete blinks between upward gaze and downward gaze was significant (p<0.004), showing more incomplete blinks in upward gaze. The findings of this experiment suggest that while blink rate becomes slower in downward gaze, the completeness of blinking is typically better, thereby potentially reducing the risk of tear instability. On the other hand, in upward gaze, while the completeness of blinking becomes worse, this is potentially offset by increased blink frequency. In addition, blink rate and completeness were not affected by lateral gaze or head angle, possibly because the open palpebral aperture in these conditions is similar in size to that in primary gaze. In the second experiment, an investigation into the changes in blink rate and completeness was carried out in primary gaze and downward gaze with soft and rigid contact lenses in unadapted wearers. Not surprisingly, rigid lens wear caused a significant increase in the blink rate in both primary (p<0.001) and downward gaze (p<0.02). After fitting rigid contact lenses, the closed palpebral aperture (blink completeness) did not show any changes but the open palpebral aperture showed a significant narrowing (p<0.04). This might result from the subjects’ attempt to avoid interaction between the upper eyelid and the edge of the lens to minimize discomfort.
After applying topical anaesthetic eye drops in the eye fitted with rigid lenses, the increased blink rate dropped to values similar to those before lens insertion and the open palpebral aperture returned to baseline values, suggesting that corneal and/or lid margin sensitivity was mediating the increased blink rate and narrowed palpebral aperture. We also investigated the changes in the blink rate and completeness with soft contact lenses, including a soft sphere, a double slab-off toric design and a periballast toric design. Soft contact lenses did not cause any significant changes in the blink rate, closed palpebral aperture, open palpebral aperture or the percentage of incomplete blinks in either primary gaze or downward gaze. After applying anaesthetic eye drops, the blink rate reduced in both primary gaze and downward gaze; however, this difference was not statistically significant. The size of the closed palpebral aperture and open palpebral aperture did not show any significant changes after applying anaesthetic eye drops. However, it should be noted that the effects of rigid and soft contact lenses that we observed in these studies were only the immediate reaction to contact lenses, and in the longer term it is likely that these responses will vary as the eye adapts to the presence of the lenses.
Abstract:
Background: High-flow nasal cannulae (HFNC) create positive oropharyngeal airway pressure but it is unclear how their use affects lung volume. Electrical impedance tomography (EIT) allows assessment of changes in lung volume by measuring changes in lung impedance. Primary objectives were to investigate the effects of HFNC on airway pressure (Paw) and end-expiratory lung volume (EELV), and to identify any correlation between the two. Secondary objectives were to investigate the effects of HFNC on respiratory rate (RR), dyspnoea, tidal volume and oxygenation; and the interaction between body mass index (BMI) and EELV. Methods: Twenty patients prescribed HFNC post-cardiac surgery were investigated. Impedance measures, Paw, PaO2/FiO2 ratio, RR and modified Borg scores were recorded first on low flow oxygen (nasal cannula or Hudson face mask) and then on HFNC. Results: A strong and significant correlation existed between Paw and end-expiratory lung impedance (EELI) (r=0.7, p<0.001). Compared with low flow oxygen, HFNC significantly increased EELI by 25.6% (95% CI 24.3, 26.9) and Paw by 3.0 cmH2O (95% CI 2.4, 3.7). RR reduced by 3.4 breaths per minute (95% CI 1.7, 5.2) with HFNC use, tidal impedance variation increased by 10.5% (95% CI 6.1, 18.3) and PaO2/FiO2 ratio improved by 30.6 mmHg (95% CI 17.9, 43.3). HFNC improved subjective dyspnoea scoring (p=0.023). Increases in EELI were significantly influenced by BMI, with larger increases associated with higher BMIs (p<0.001). Conclusions: This study suggests that HFNC improve dyspnoea and oxygenation by increasing both EELV and tidal volume, and are most beneficial in patients with higher BMIs.
Abstract:
PURPOSE: To examine the visual predictors of falls and injurious falls among older adults with glaucoma. METHODS: Prospective falls data were collected for 71 community-dwelling adults with primary open-angle glaucoma, mean age 73.9 ± 5.7 years, for one year using monthly falls diaries. Baseline assessment of central visual function included high-contrast visual acuity and Pelli-Robson contrast sensitivity. Binocular integrated visual fields were derived from monocular Humphrey Field Analyser plots. Rate ratios (RR) for falls and injurious falls with 95% confidence intervals (CIs) were based on negative binomial regression models. RESULTS: During the one-year follow-up, 31 (44%) participants experienced at least one fall and 22 (31%) experienced falls that resulted in an injury. Greater visual impairment was associated with an increased falls rate, independent of age and gender. In a multivariate model, more extensive field loss in the inferior region was associated with a higher rate of falls (RR 1.57, 95% CI 1.06, 2.32) and falls with injury (RR 1.80, 95% CI 1.12, 2.98), adjusted for all other vision measures and potential confounding factors. Visual acuity, contrast sensitivity, and superior field loss were not associated with the rate of falls; topical beta-blocker use was also not associated with increased falls risk. CONCLUSIONS: Falls are common among older adults with glaucoma and occur more frequently in those with greater visual impairment, particularly in the inferior field region. This finding highlights the importance of the inferior visual field region in falls risk and assists in identifying older adults with glaucoma at risk of future falls, for whom potential interventions should be targeted. KEY WORDS: glaucoma, visual field, visual impairment, falls, injury
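The rate ratios and confidence intervals reported in this abstract follow the usual pattern for count regression with a log link: the rate ratio is the exponential of a model coefficient, and the interval is symmetric on the log scale. The sketch below shows that relationship. It assumes a Wald-type interval, which the abstract does not state explicitly, and the function names are hypothetical.

```python
import math

# Standard normal quantile for a two-sided 95% interval.
Z_95 = 1.959963984540054


def rate_ratio_ci(beta, std_error, z=Z_95):
    """Rate ratio and Wald interval from a log-scale coefficient and its SE."""
    return (
        math.exp(beta),
        math.exp(beta - z * std_error),
        math.exp(beta + z * std_error),
    )


def std_error_from_ci(lower, upper, z=Z_95):
    """Recover the log-scale standard error implied by a reported interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Reported inferior-field result for falls: RR 1.57, 95% CI 1.06 to 2.32.
beta = math.log(1.57)
se = std_error_from_ci(1.06, 2.32)
rr, lower, upper = rate_ratio_ci(beta, se)
```

Under these assumptions the implied log-scale standard error is about 0.20, and the reconstructed interval matches the reported one to within rounding.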