295 results for Medical Speech
Abstract:
This study aims to stimulate thought, debate and action for change on the question of more vigorous philanthropic funding of Australian health and medical research (HMR). It sharpens the argument with facts and ideas about HMR funding from overseas sources, and it reports informed opinions from those working, giving and innovating in this area. It pinpoints the range of attitudes to HMR giving, both positive and negative. The study includes some aspects of Government funding as part of the equation, viewing Government as a major HMR giver with a particular ability to partner, leverage and create incentives. Stimulating new philanthropy takes active outreach. The opportunity to build more dialogue between the HMR industry and the wider community is timely, given the ‘licence to practice’ issues and questioned trust that currently apply, to some extent, to both science and the charitable sector. This interest in improving HMR philanthropy also coincides with the launch last year by the Federal Government of Nonprofit Australia Limited (NAL), a group currently assessing infrastructure improvements to the charitable sector. History suggests no one will create this change if Research Australia does not; however, interest in change exists in various quarters. For Research Australia to successfully change the culture of Australian HMR giving, the process will drive the outcomes. Stakeholder buy-in and partners will clearly be needed, and the ultimate blueprint for greater philanthropic HMR funding here will not be this document. Instead it will be the one that bears the handprint and ‘mindprint’ of the many architects and implementers interested in promoting HMR philanthropy, from philanthropists to nonprofit peak bodies to government policy arms. As the African proverb says, ‘If you want to go fast, go alone; but if you want to go far, go with others’.
Abstract:
Acoustically, car cabins are extremely noisy and, as a consequence, audio-only in-car voice recognition systems perform poorly. As the visual modality is immune to acoustic noise, using the visual lip information from the driver is a viable strategy for circumventing this problem through audio-visual automatic speech recognition (AVASR). However, implementing AVASR requires a system able to accurately locate and track the driver’s face and lip area in real time. In this paper we present such an approach using the Viola-Jones algorithm. Using the AVICAR [1] in-car database, we show that the Viola-Jones approach is a suitable method of locating and tracking the driver’s lips for an audio-visual speech recognition system, despite visual variability in illumination and head pose.
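The real-time speed of Viola-Jones rests on the integral image, which lets any rectangular Haar-like feature be summed in constant time. A minimal sketch of that core idea (not the paper’s implementation; a deployed system would use an off-the-shelf cascade detector such as OpenCV’s):

```python
def integral_image(img):
    # ii[y][x] holds the sum of all pixels above and to the left of (x, y),
    # exclusive, so ii has one extra row and column of zeros
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    # sum of the rectangle with top-left (x, y), width w, height h,
    # computed from four integral-image lookups
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]
```

Any Haar feature (a difference of two or three such rectangle sums) can then be evaluated in a handful of array lookups regardless of rectangle size, which is what makes exhaustive sliding-window search tractable.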
Abstract:
DIRECTOR’S OVERVIEW by Professor Mark Pearcy This report for 2009 is the first full-year report for MERF. The development of our activities in 2009 has been remarkable and is testament to the commitment of the staff to the vision of MERF as a premier training and research facility. The journey from the beginnings in 2003, when a need was identified for specialist research and training facilities to enable close collaboration between researchers and clinicians, to the realisation of that vision in 2009 has been amazing. However, we have learnt that there is much more that can be achieved, and the emphasis will be on working with the university, government and external partners to realise the full potential of MERF through further development of the Facility. In 2009 we conducted 28 workshops in the Anatomical and Surgical Skills Laboratory, providing training for surgeons in the latest techniques. This was an excellent achievement for the first full year, as our reputation for delivering first-class facilities and support grows. The highlight, perhaps, was a course run via our video link by a surgeon in the USA directing the participants in MERF. In addition, we have continued to run a small number of workshops in the operating theatre, and this promises to be an avenue of growing interest. Final approval was granted for the QUT Body Bequest Program late in 2009, following the granting of an Anatomical Accepting Licence. This will enable us to expand our capabilities by providing better material for the workshops. The QUT Body Bequest Program will be launched early in 2010. The Biological Research Facility (BRF) conducted over 270 procedures in 2009. This is a wonderful achievement considering fewer than 40 were performed in 2008. The staff of the BRF worked very hard to improve the state of the old animal house, and this resulted in approval for expanded use by the ethics committees of both QUT and the University of Queensland.
An external agency conducted an Occupational Health and Safety audit of MERF in 2009. While there were a number of small issues requiring attention, the auditor congratulated the staff of MERF on achieving a good result, particularly at such an early stage in the development of MERF. The journey from the commissioning of MERF in 2008 to the full implementation of its activities in 2009 has demonstrated the potential of this facility, and 2010 will be an exciting year as its activities are recognised and further building development is pursued.
Abstract:
Non-driving-related cognitive load and variations in emotional state may impair a driver’s capability to control a vehicle and introduce driving errors. The availability of reliable cognitive load and emotion detection in drivers would benefit the design of active safety systems and other intelligent in-vehicle interfaces. In this study, speech produced by 68 subjects while driving in urban areas is analyzed. A particular focus is on speech production differences in two secondary cognitive tasks, interactions with a co-driver and calls to automated spoken dialog systems (SDS), and two emotional states during the SDS interactions, neutral and negative. A number of speech parameters are found to vary across the cognitive/emotion classes. The suitability of selected cepstral- and production-based features for automatic cognitive task/emotion classification is investigated. A fusion of GMM/SVM classifiers yields an accuracy of 94.3% in cognitive task classification and 81.3% in emotion classification.
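Classifier fusion of this kind is often done at the score level, combining per-class scores from the two classifiers with a weighting factor. The abstract does not specify the fusion rule, so the following is an illustrative sketch; the weight `alpha` and the class labels are assumptions, not values from the paper:

```python
def fuse_scores(gmm_scores, svm_scores, alpha=0.5):
    # late (score-level) fusion: per-class weighted sum of the two
    # classifiers' scores; alpha weights the GMM against the SVM
    return {c: alpha * gmm_scores[c] + (1 - alpha) * svm_scores[c]
            for c in gmm_scores}

def classify(gmm_scores, svm_scores, alpha=0.5):
    # decide for the class with the highest fused score
    fused = fuse_scores(gmm_scores, svm_scores, alpha)
    return max(fused, key=fused.get)
```

In practice the raw GMM log-likelihoods and SVM margins would first be normalised to a comparable range, and `alpha` tuned on held-out data.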
Abstract:
Acoustically, car cabins are extremely noisy and, as a consequence, existing audio-only speech recognition systems for voice-based control of vehicle functions, such as GPS-based navigation, perform poorly. Audio-only systems fail to make use of the visual modality of speech (e.g. lip movements). As the visual modality is immune to acoustic noise, utilising this visual information in conjunction with an audio-only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as Audio-Visual Automatic Speech Recognition (AVASR). Continuous research in the AVASR field has been ongoing for the past twenty-five years, with notable progress being made. However, the practical deployment of AVASR systems in real-world applications has not yet emerged, mainly because most research to date has neglected to address variabilities in the visual domain, such as illumination and viewpoint, in the design of the visual front-end. In this paper we present an AVASR system in a real-world car environment using the AVICAR database [1], a publicly available in-car database, and we show that using visual speech in conjunction with the audio modality improves the robustness and effectiveness of voice-only recognition systems in car cabin environments.
Abstract:
• At common law, a competent adult can refuse life-sustaining medical treatment, either contemporaneously or through an advance directive which will operate at a later time when the adult’s capacity is lost.
• Legislation in most Australian jurisdictions also provides for a competent adult to complete an advance directive that refuses life-sustaining medical treatment.
• At common law, a court exercising its parens patriae jurisdiction can consent to, or authorise, the withdrawal or withholding of life-sustaining medical treatment from an adult or child who lacks capacity if that is in the best interests of the person. A court may also declare that the withholding or withdrawal of treatment is lawful.
• Guardianship legislation in most jurisdictions allows a substitute decision-maker, in an appropriate case, to refuse life-sustaining medical treatment for an adult who lacks capacity.
• In terms of children, a parent may refuse life-sustaining medical treatment for his or her child if doing so is in the child’s best interests.
• While a refusal of life-sustaining medical treatment by a competent child may be valid, the decision can be overturned by a court.
• At common law, and generally under guardianship statutes, doctors need not comply with a demand for futile treatment.
Abstract:
We propose a digital rights management approach for sharing electronic health records for research purposes and argue the advantages of the approach. We outline our implementation, discuss the challenges we faced, and indicate future directions.
Abstract:
What really changed for Australian Aboriginal and Torres Strait Islander people between Paul Keating’s Redfern Park Speech (Keating 1992) and Kevin Rudd’s Apology to the stolen generations (Rudd 2008)? What will change between the Apology and the next speech of an Australian Prime Minister? The two speeches were intricately linked, and they were both personal and political. But do they really signify change at the political level? This paper reflects my attempt to turn the gaze away from Aboriginal and Torres Strait Islander people, and back to where the speeches originated: the Australian Labor Party (ALP). I question whether the changes foreshadowed in the two speeches – including changes by the Australian public and within Australian society – are evident in the internal mechanisms of the ALP. I also seek to understand why non-Indigenous women seem to have given in to the existing ways of the ALP instead of challenging the status quo which keeps Aboriginal and Torres Strait Islander peoples marginalised. I believe that, without a thorough examination and a change in the ALP’s practices, the domination and subjugation of Indigenous peoples will continue – within the Party, through the Australian political process and, therefore, through governments.
Abstract:
While close-talking microphones give the best signal quality and produce the highest accuracy from current Automatic Speech Recognition (ASR) systems, the speech signal enhanced by a microphone array has been shown to be an effective alternative in a noisy environment. The use of microphone arrays, in contrast to close-talking microphones, alleviates the user’s feeling of discomfort and distraction. For this reason, microphone arrays are popular and have been used in a wide range of applications such as teleconferencing, hearing aids, speaker tracking, and as the front-end to speech recognition systems. With advances in sensor and sensor-network technology, there is considerable potential for applications that employ ad-hoc networks of microphone-equipped devices collaborating as a virtual microphone array. By allowing such devices to be distributed throughout the user’s environment, the microphone positions are no longer constrained to traditional fixed geometrical arrangements. This flexibility in data acquisition allows different audio scenes to be captured to give a complete picture of the working environment. In such ad-hoc deployments of microphone sensors, however, the lack of information about the location of devices and active speakers poses technical challenges for array signal processing algorithms which must be addressed before deployment in real-world applications. While not an ad-hoc sensor network, conditions approaching this have in effect been imposed in recent National Institute of Standards and Technology (NIST) ASR evaluations on distant-microphone recordings of meetings. The NIST evaluation data comes from multiple sites, each with different and often loosely specified distant-microphone configurations. This research investigates how microphone array methods can be applied to ad-hoc microphone arrays.
A particular focus is on devising methods that are robust to unknown microphone placements in order to improve the overall speech quality and recognition performance provided by the beamforming algorithms. In ad-hoc situations, microphone positions and likely source locations are not known, and beamforming must be achieved blindly. There are two general approaches to blindly estimating the steering vector for beamforming. The first is direct estimation without regard to the microphone and source locations. The alternative is to first determine the unknown microphone positions through array calibration methods and then use the traditional geometrical formulation for the steering vector. Following these two major approaches investigated in this thesis, a novel clustered approach is proposed, which clusters the microphones and selects clusters based on their proximity to the speaker. Novel experiments demonstrate that the proposed method of automatically selecting a cluster of microphones (i.e., a subarray) located close both to each other and to the desired speech source may in fact provide more robust speech enhancement and recognition than the full array.
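The two building blocks described here can be sketched simply: pick the subarray closest to the speaker, then combine its channels by delay-and-sum beamforming. The sketch below makes strong simplifying assumptions (integer sample delays that are already known, and per-channel energy as a crude proxy for proximity to the active speaker); the thesis’s actual blind-estimation and calibration methods are far more involved:

```python
def select_cluster(clusters):
    # pick the cluster whose channels carry the most average energy,
    # a crude proxy for proximity to the active speaker
    def energy(ch):
        return sum(x * x for x in ch)
    return max(clusters, key=lambda cl: sum(energy(ch) for ch in cl) / len(cl))

def delay_and_sum(channels, delays):
    # align each channel by its (assumed known) integer sample delay,
    # then average the aligned samples to reinforce the target signal
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
            for i in range(n)]
```

In a real system the delays would themselves be estimated blindly (e.g. by cross-correlation between channels), which is precisely where unknown microphone placement makes the problem hard.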
Abstract:
Traditional speech enhancement methods optimise signal-level criteria such as signal-to-noise ratio, but these approaches are sub-optimal for noise-robust speech recognition. Likelihood-maximising (LIMA) frameworks are an alternative that optimises the parameters of enhancement algorithms based on state sequences generated for utterances with known transcriptions. Previous reports of LIMA frameworks have shown significant promise for improving speech recognition accuracy under additive background noise for a range of speech enhancement techniques. In this paper we discuss the drawbacks of the LIMA approach when multiple layers of acoustic mismatch are present, namely background noise and speaker accent. Experimentation using LIMA-based Mel-filterbank noise subtraction on American and Australian English in-car speech databases supports this discussion, demonstrating that inferior speech recognition performance occurs when a second layer of mismatch is encountered during evaluation.
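Filterbank noise subtraction of the kind used here can be sketched as a per-band subtraction of a noise estimate with a spectral floor; in a LIMA framework the oversubtraction factor would be tuned to maximise recogniser likelihood rather than fixed. The parameter names below are assumptions for illustration, not the paper’s:

```python
def filterbank_subtract(frame_fbank, noise_fbank, alpha=1.0, floor=0.01):
    # subtract a scaled noise estimate from each Mel-filterbank energy;
    # the floor keeps a fraction of the original energy so no band goes
    # negative. In a LIMA framework, alpha would be optimised against the
    # recogniser's likelihood on utterances with known transcriptions.
    return [max(f - alpha * n, floor * f)
            for f, n in zip(frame_fbank, noise_fbank)]
```

The accent-mismatch problem the paper identifies is not visible in this sketch: the parameters are optimised for one acoustic condition, so a second, unseen layer of mismatch (a different accent) can leave them poorly matched at evaluation time.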
Abstract:
Traditional speech enhancement methods optimise signal-level criteria such as signal-to-noise ratio, but such approaches are sub-optimal for noise-robust speech recognition. Likelihood-maximising (LIMA) frameworks, on the other hand, optimise the parameters of speech enhancement algorithms based on state sequences generated by a speech recogniser for utterances with known transcriptions. Previous applications of LIMA frameworks have generated a set of global enhancement parameters for all model states without taking into account the distribution of model occurrence, making optimisation susceptible to favouring frequently occurring models, in particular silence. In this paper, we demonstrate the existence of highly disproportionate phonetic distributions on two corpora with distinct speech tasks, and propose normalising the influence of each phone based on a priori occurrence probabilities. Likelihood analysis and speech recognition experiments verify this approach for improving ASR performance in noisy environments.
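One simple way to realise the proposed normalisation is to weight each phone’s contribution to the optimisation inversely to its a priori occurrence probability, so that frequent phones such as silence no longer dominate. A sketch under that assumption (the paper’s exact weighting scheme may differ):

```python
def phone_weights(phone_counts):
    # weight each phone inversely to its a priori occurrence probability,
    # estimated here from raw counts; rare phones get large weights,
    # dominant phones (e.g. silence) get weights near 1
    total = sum(phone_counts.values())
    return {p: total / c for p, c in phone_counts.items()}
```

Each phone’s likelihood term would then be multiplied by its weight during parameter optimisation, equalising the influence of the phone inventory.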
Abstract:
In recent times, the improved levels of accuracy obtained by Automatic Speech Recognition (ASR) technology have made it viable for use in a number of commercial products. Unfortunately, these applications are limited to only a few of the world’s languages, primarily because ASR development relies on the availability of large amounts of language-specific resources. This motivates the need for techniques which reduce this language-specific resource dependency. Ideally, these approaches should generalise across languages, thereby providing scope for the rapid creation of ASR capabilities for resource-poor languages. Cross-lingual ASR emerges as a means of addressing this need. Underpinning this approach is the observation that sound production is largely influenced by the physiological construction of the vocal tract and, accordingly, is human- rather than language-specific. As a result, a common inventory of sounds exists across languages; a property which is exploitable, as sounds from a resource-poor target language can be recognised using models trained on resource-rich source languages. One of the initial impediments to the commercial uptake of ASR technology was its fragility in more challenging environments, such as conversational telephone speech. Subsequent improvements in these environments have gained consumer confidence. Pragmatically, if cross-lingual techniques are to be considered a viable alternative when resources are limited, they need to perform under the same types of conditions. Accordingly, this thesis evaluates cross-lingual techniques in two speech environments: clean read speech and conversational telephone speech. The languages used in the evaluations are German, Mandarin, Japanese and Spanish. Results highlight that previously proposed approaches provide respectable results for simpler environments such as read speech, but degrade significantly in the more taxing conversational environment.
Two separate approaches for addressing this degradation are proposed. The first is based on deriving a better target-language lexical representation in terms of the source-language model set. The second, and ultimately more successful, approach focuses on improving the classification accuracy of context-dependent (CD) models by catering for the adverse influence of language-specific phonotactic properties. While the primary research goal of this thesis is directed towards improving cross-lingual techniques, the catalyst for investigating their use was expressed interest from several organisations in an Indonesian ASR capability. The fact that, in Indonesia alone, there are over 200 million speakers of some Malay variant provides further impetus and commercial justification for speech-related research on this language. Unfortunately, at the beginning of the candidature, limited research had been conducted on the Indonesian language in the field of speech science, and virtually no resources existed. This thesis details the investigative and development work dedicated to obtaining an ASR system with a 10,000-word recognition vocabulary for the Indonesian language.
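Deriving a target-language lexical representation from a source-language model set is often bootstrapped with a phone-to-phone mapping: each target phone is rewritten as its closest source-language model or models, and the mapped pronunciations populate the lexicon. A toy sketch with an entirely hypothetical phone map (not the thesis’s actual mapping or phone sets):

```python
# hypothetical target-to-source phone map, for illustration only;
# a real map would be built by phonetic knowledge or data-driven distance
PHONE_MAP = {"ny": "n y", "sy": "sh"}

def map_pronunciation(phones, phone_map):
    # rewrite each target-language phone as its source-language model(s);
    # a target phone may map to several source phones, and unmapped
    # phones pass through unchanged
    mapped = []
    for p in phones:
        mapped.extend(phone_map.get(p, p).split())
    return mapped
```

The thesis’s point is that such mappings, while adequate for read speech, leave language-specific phonotactics unmodelled, which is what its context-dependent modelling approach addresses.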