405 results for Patrick Rothfus
Abstract:
The use of visual features in the form of lip movements to improve the performance of acoustic speech recognition has been shown to work well, particularly in noisy acoustic conditions. However, whether this technique can outperform speech recognition incorporating well-known acoustic enhancement techniques, such as spectral subtraction or multi-channel beamforming, is not known. This is an important question to answer, especially in an automotive environment, for the design of an efficient human-vehicle computer interface. We perform a variety of speech recognition experiments on a challenging automotive speech dataset, and the results show that synchronous HMM-based audio-visual fusion can outperform traditional single-channel as well as multi-channel acoustic speech enhancement techniques. We also show that further improvement in recognition performance can be obtained by fusing speech-enhanced audio with the visual modality, demonstrating the complementary nature of the two robust speech recognition approaches.
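As a rough illustration of the spectral subtraction baseline named in this abstract, the sketch below applies magnitude-domain subtraction to toy spectra. It is not the authors' implementation: the function names, the over-subtraction factor `alpha`, the `floor` value, and the assumption that the first frames contain only noise are all hypothetical choices made for the example.

```python
def estimate_noise(frames, n_noise_frames):
    """Average the first n_noise_frames magnitude spectra as a noise estimate."""
    n_bins = len(frames[0])
    return [
        sum(frame[k] for frame in frames[:n_noise_frames]) / n_noise_frames
        for k in range(n_bins)
    ]


def spectral_subtract(frames, noise, alpha=1.0, floor=0.01):
    """Subtract a scaled noise magnitude from each frame, bin by bin.

    alpha (over-subtraction factor) and floor (fraction of the original
    magnitude kept as a lower bound) are assumed, illustrative defaults.
    """
    cleaned = []
    for frame in frames:
        cleaned.append([
            max(mag - alpha * n, floor * mag)
            for mag, n in zip(frame, noise)
        ])
    return cleaned


# Toy magnitude spectra: two noise-only frames followed by one speech frame.
frames = [
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
    [5.0, 1.0, 3.0],
]
noise = estimate_noise(frames, n_noise_frames=2)
cleaned = spectral_subtract(frames, noise)
```

In a full pipeline the magnitude spectra would come from a short-time Fourier transform of the audio, and the cleaned magnitudes would be recombined with the original phase before resynthesis or feature extraction.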
Abstract:
Urban expansion continues to encroach on once isolated sewerage infrastructure. In this context, legislation and guidelines provide limited direction to the amenity allocation of appropriate buffer distances for land use planners and infrastructure providers. Topography, wind speed and direction, temperature, humidity, existing land uses and vegetation profiles are some of the factors that require investigation in analytically determining a basis for buffer separations. This paper discusses the compilation and analysis of six years of Logan sewerage odour complaint data. Graphically, relationships between the complaints, topographical features and meteorological data are presented. Application of a buffer sizing process could assist planners and infrastructure designers alike, whilst automatically providing extra green spaces. Establishing a justifiable criterion for buffer zone allocations can only assist in promoting manageable growth for healthier and more sustainable communities.
Abstract:
In late 2009, Health Libraries Australia (HLA) received a small grant to undertake a national research project to determine the future requirements for health librarians in the workforce in Australia and develop a structured, modular education framework (post-graduate qualification and continuing professional development structure) to meet these requirements. The main objective was to consider the education and professional development framework that would ensure that health librarians have a clearly defined scope of practice and the specific competency-based knowledge and skills that enable them to contribute to the design and delivery of high-quality health services in this country. The final report presents a detailed discussion of the changing Australian healthcare environment and the resulting impact on the health library sector, as well as an overview of international trends in health libraries and the implications for Australian health librarianship education. The research methodology is outlined, followed by an analysis of the findings from the two surveys with health librarians and health library managers and the semi-structured interviews conducted with employers. The Medical Library Association (MLA) in the United States had developed a policy document detailing the competencies required by health librarians. It was found that the MLA competencies represented an accepted professional framework of skills which could be used objectively in the survey instrument to measure the areas of professional knowledge and responsibilities that were relevant in the current workplace, and to identify how these requirements might change in the next three to five years. The research results underscore the imperative for health librarians to engage in regular, relevant professional development activities that will enable them to stay abreast of the rapid contextual changes impacting on their practice.
In order to be accepted as key members of the multi-disciplinary health professional team, it is strongly believed that health librarians should commit to establishing the mechanisms for specialist certification maintained through compulsory CPD in an ongoing three-year cycle of revalidation. This development would align ALIA and health librarians with other health sector professional associations which are responsible for the self-regulation of entry to and continuation in their profession.
Abstract:
In automatic facial expression recognition, an increasing number of techniques that exploit the temporal nature of facial expressions have been proposed in the literature. As all facial expressions are known to evolve over time, it is crucially important for a classifier to be capable of modelling their dynamics. We establish that the sparse representation (SR) classifier is a suitable candidate for this purpose, and subsequently propose a framework for expression dynamics to be efficiently incorporated into its current formulation. We additionally show that, for the SR method to be applied effectively, a certain threshold on image dimensionality must be enforced (unlike in facial recognition problems). Thirdly, we determine that recognition rates may be significantly influenced by the size of the projection matrix \Phi. To demonstrate these findings, a battery of experiments was conducted on the CK+ dataset for the recognition of the seven prototypic expressions - anger, contempt, disgust, fear, happiness, sadness and surprise - and comparisons are made between the proposed temporal-SR framework, the static-SR framework and a state-of-the-art support vector machine.
Abstract:
Visual activity detection of lip movements can be used to overcome the poor performance of voice activity detection based solely in the audio domain, particularly in noisy acoustic conditions. However, most of the research conducted in visual voice activity detection (VVAD) has neglected variabilities in the visual domain such as viewpoint variation. In this paper we investigate the effectiveness of the visual information from the speaker’s frontal and profile views (i.e., left and right side views) for the task of VVAD. As far as we are aware, our work constitutes the first real attempt to study this problem. We describe our visual front-end approach and the Gaussian mixture model (GMM) based VVAD framework, and report the experimental results using the freely available CUAVE database. The experimental results show that VVAD is indeed possible from profile views, and we give a quantitative comparison of VVAD based on frontal and profile views. The results presented are useful in the development of multi-modal Human Machine Interaction (HMI) using a single camera, where the speaker’s face may not always be frontal.
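One common way to build a GMM-based activity detector is to score each frame under a speech model and a non-speech model and threshold the log-likelihood ratio. The sketch below illustrates that idea only. The abstract does not state the decision rule, the features, or the model sizes, so the two-dimensional lip features, the mixture parameters, and the zero threshold are all hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian(x, m, v) for w, m, v in gmm]
    peak = max(terms)
    # Log-sum-exp over the components for numerical stability.
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def is_speech(x, speech_gmm, silence_gmm, threshold=0.0):
    """Label a frame as speech when the log-likelihood ratio exceeds threshold."""
    llr = gmm_log_likelihood(x, speech_gmm) - gmm_log_likelihood(x, silence_gmm)
    return llr > threshold


# Hypothetical models over 2-D lip features (assumed, not from the paper).
speech_gmm = [(0.5, [2.0, 1.0], [1.0, 1.0]), (0.5, [3.0, 2.0], [1.0, 1.0])]
silence_gmm = [(1.0, [0.0, 0.0], [1.0, 1.0])]

frames = [[2.5, 1.5], [0.1, -0.2]]
labels = [is_speech(f, speech_gmm, silence_gmm) for f in frames]
```

In practice the mixture parameters would be trained on labelled speech and non-speech frames, with separate models per viewpoint if frontal and profile views are to be compared.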