989 results for Recognition ethics
Abstract:
Midwives are involved in a very dynamic profession. As they face their everyday tasks they encounter many different situations and a variety of people which results in a vast number of interactions. This narrative research project sought to identify some of the ‘ordinary’ encounters and interactions that midwives working in a hospital environment experience in their daily work and explore them from an ethical perspective. It found that many ethical decisions have to be made ‘on-the-run’, with no time to contemplate or decide what the best course of action might be. As ethics is embedded within every encounter a midwife has, it is essential that all midwives have an awareness and understanding of their own value systems, professional ethical codes and ethical principles that can act as guides when they have to make choices in these situations, which are frequently challenging.
Abstract:
Acoustically, car cabins are extremely noisy, and as a consequence audio-only in-car voice recognition systems perform poorly. As the visual modality is immune to acoustic noise, using visual lip information from the driver is seen as a viable strategy for circumventing this problem through audio-visual automatic speech recognition (AVASR). However, implementing AVASR requires a system that can accurately locate and track the driver’s face and lip area in real time. In this paper we present such an approach using the Viola-Jones algorithm. Using the AVICAR [1] in-car database, we show that the Viola-Jones approach is a suitable method of locating and tracking the driver’s lips for an audio-visual speech recognition system, despite the visual variability of illumination and head pose.
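Viola-Jones detectors evaluate rectangular Haar-like features in constant time using an integral image (summed-area table). The following is a minimal sketch of that building block only, not the paper's face and lip tracker; the function names are hypothetical and the image is assumed to be a list of rows of grayscale values.

```python
def integral_image(img):
    """Return a summed-area table padded with one leading row and column of zeros."""
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            # Sum of everything above (previous table row) plus the current row so far.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left corner (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Haar-like edge feature: left half minus right half (w assumed even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

A full detector would combine many such features in a boosted cascade and scan it over the image at several scales; that part is omitted here.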
Abstract:
Note: see later edition of this work at http://eprints.qut.edu.au/47632/ This chapter introduces you to the basic ethical principles that underpin public health practice. The themes to be considered in this chapter include: the characteristics of ‘ethics’, the justification for reflecting on ethics and values, the foundations of public health ethics, whether and how we can incorporate ethics and values into our practice and the nature of some of the potential ethical complications of public health practice.
Abstract:
Several approaches have been proposed to recognize handwritten Bengali characters using different curve-fitting algorithms and curvature analysis. In this paper, a new algorithm (Curve-fitting Algorithm) to identify the various strokes of a handwritten character is developed. The curve-fitting algorithm helps to recognize strokes of different patterns (line, quadratic curve) precisely, which greatly reduces the burden of error elimination. Implementation of this Modified Syntactic Method demonstrates significant improvement in the recognition of Bengali handwritten characters.
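One way to picture the distinction between line and quadratic-curve strokes is to fit a straight line by least squares and check how well it explains the sampled points. The sketch below is an illustration under stated assumptions, not the paper's Curve-fitting Algorithm: the function names and the `tol` threshold are hypothetical, and a stroke is assumed to be given as x and y samples where y can be treated as a function of x (so vertical strokes are not handled).

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations; returns [c0, c1, ...]."""
    n = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def mean_squared_residual(xs, ys, coeffs):
    """Average squared vertical distance between the samples and the fitted curve."""
    total = 0.0
    for x, y in zip(xs, ys):
        fitted = sum(c * x ** k for k, c in enumerate(coeffs))
        total += (y - fitted) ** 2
    return total / len(xs)


def classify_stroke(xs, ys, tol=0.5):
    """Label a stroke 'line' if a straight line fits within tol, else 'quadratic curve'."""
    line = polyfit(xs, ys, 1)
    if mean_squared_residual(xs, ys, line) <= tol:
        return "line"
    return "quadratic curve"
```

For example, `classify_stroke([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])` labels the samples a line, while samples taken from a parabola exceed the threshold and are labelled a quadratic curve.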
Abstract:
Acoustically, car cabins are extremely noisy, and as a consequence existing audio-only speech recognition systems for voice-based control of vehicle functions, such as the GPS-based navigator, perform poorly. Audio-only speech recognition systems fail to make use of the visual modality of speech (e.g. lip movements). As the visual modality is immune to acoustic noise, utilising this visual information in conjunction with an audio-only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as Audio Visual Speech Recognition (AVSR). Research in the AVSR field has been ongoing for the past twenty-five years, with notable progress being made. However, the practical deployment of AVSR systems in a variety of real-world applications has not yet emerged. The main reason is that most research to date has neglected to address variabilities in the visual domain, such as illumination and viewpoint, in the design of the visual front-end of the AVSR system. In this paper we present an AVSR system in a real-world car environment using the AVICAR database [1], a publicly available in-car database, and we show that using visual speech in conjunction with the audio modality improves the robustness and effectiveness of voice-only recognition systems in car cabin environments.
Abstract:
When classifying a signal, we ideally want the classifier to trigger a large response when it encounters a positive example and little to no response for all other examples. Unfortunately, in practice this does not occur: responses fluctuate, often causing false alarms. There are many reasons for this, most notably the failure to incorporate the dynamics of the signal into the classification. In facial expression recognition, this has been highlighted as a major research question. In this paper we present a novel technique that incorporates the dynamics of the signal, producing a strong response when the peak expression is found and suppressing all other responses as much as possible. The ability to automatically and accurately recognize the facial expressions of drivers is highly relevant to the automobile. For example, the early recognition of “surprise” could indicate that an accident is about to occur, and various safeguards could immediately be deployed to avoid or minimize injury and damage. We conducted preliminary experiments on the extended Cohn-Kanade (CK+) database, which show the benefits of the technique.
Abstract:
Providing water infrastructure in times of accelerating climate change presents interesting new problems. Expanding demands must be met or managed in contexts of increasingly constrained sources of supply, raising ethical questions of equity and participation. Loss of agricultural land and natural habitats, the coastal impacts of desalination plants and concerns over re-use of waste water must be weighed with demand management issues of water rationing, pricing mechanisms and inducing behaviour change. This case study examines how these factors impact on infrastructure planning in South East Queensland, Australia: a region with one of the developed world’s most rapidly growing populations, which has recently experienced the most severe drought in its recorded history. Proposals to match forecast demands and potential supplies for water over a 20-year period are reviewed by applying ethical principles to evaluate practical plans to meet the water needs of the region’s activities and settlements.
Abstract:
The paper documents the development of an ethical framework for my current PhD project. I am a practice-led researcher with a background in creative writing. My project involves conducting a number of oral history interviews with individuals living in Brisbane, Queensland, Australia. I use the interviews to inform a novel set in Brisbane. In doing so, I hope to provide a lens into a cultural and historical space by creating a rich, textured and vivid narrative while still retaining some of the essential aspects of the oral history. While developing a methodology for fictionalising these oral histories, I have encountered a diverse range of ethical issues. In particular, I have had to confront my role as a writer and researcher working with other people’s stories. In order to grapple with the complex ethics of such an engagement, I examine the devices and strategies employed by other creative practitioners working in similar fields. I focus chiefly on Miguel Barnet’s Biography of a Runaway Slave (published in English in 1968) and Dave Eggers’ What is the what: The autobiography of Valentino Achak Deng, a novel (2005) in order to understand the complex processes of mediation involved in the artful shaping of oral histories. The paper explores how I have confronted and resolved ethical considerations in my theoretical and creative work.
Abstract:
While close-talking microphones give the best signal quality and produce the highest accuracy from current Automatic Speech Recognition (ASR) systems, speech enhanced by a microphone array has been shown to be an effective alternative in noisy environments. Using microphone arrays instead of close-talking microphones alleviates the discomfort and distraction felt by the user. For this reason, microphone arrays are popular and have been used in a wide range of applications such as teleconferencing, hearing aids, speaker tracking, and as the front-end to speech recognition systems. With advances in sensor and sensor network technology, there is considerable potential for applications that employ ad-hoc networks of microphone-equipped devices collaboratively as a virtual microphone array. By allowing such devices to be distributed throughout the users’ environment, the microphone positions are no longer constrained to traditional fixed geometrical arrangements. This flexibility in the means of data acquisition allows different audio scenes to be captured to give a complete picture of the working environment. In such ad-hoc deployments of microphone sensors, however, the lack of information about the location of devices and active speakers poses technical challenges for array signal processing algorithms, which must be addressed to allow deployment in real-world applications. While not an ad-hoc sensor network, conditions approaching this have in effect been imposed in recent National Institute of Standards and Technology (NIST) ASR evaluations on distant microphone recordings of meetings. The NIST evaluation data comes from multiple sites, each with different and often loosely specified distant microphone configurations. This research investigates how microphone array methods can be applied to ad-hoc microphone arrays.
A particular focus is on devising methods that are robust to unknown microphone placements in order to improve the overall speech quality and recognition performance provided by the beamforming algorithms. In ad-hoc situations, microphone positions and likely source locations are not known and beamforming must be achieved blindly. There are two general approaches that can be employed to blindly estimate the steering vector for beamforming. The first is direct estimation without regard to the microphone and source locations. The alternative is to first determine the unknown microphone positions through array calibration methods and then to use the traditional geometrical formulation for the steering vector. Following the investigation of these two major approaches, the thesis proposes a novel clustered approach, which clusters the microphones and selects clusters based on their proximity to the speaker. Novel experiments are conducted to demonstrate that the proposed method of automatically selecting clusters of microphones (i.e., a subarray) located close both to each other and to the desired speech source may in fact provide more robust speech enhancement and recognition than the full array could.
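The geometrical formulation of the steering vector and the idea of choosing a subarray near the speaker can be sketched as follows. This is an illustration only, not the thesis's method: it assumes microphone and source positions are already known (for instance after calibration), uses a simple near-field propagation-delay model with an assumed speed of sound, and all function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # metres per second; assumed value for room-temperature air


def steering_vector(mic_positions, source_position, frequency_hz):
    """Per-microphone phase terms for one frequency, relative to the first microphone."""
    delays = [math.dist(p, source_position) / SPEED_OF_SOUND for p in mic_positions]
    reference = delays[0]
    return [
        cmath.exp(-2j * math.pi * frequency_hz * (d - reference))
        for d in delays
    ]


def select_closest_cluster(clusters, source_position):
    """Pick the cluster (list of positions) with the smallest mean distance to the source."""
    def mean_distance(cluster):
        return sum(math.dist(p, source_position) for p in cluster) / len(cluster)

    return min(clusters, key=mean_distance)
```

In a blind setting the positions passed to these functions would themselves be estimates, so the quality of the resulting subarray depends on the calibration step.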
Abstract:
Traditional speech enhancement methods optimise signal-level criteria such as signal-to-noise ratio, but these approaches are sub-optimal for noise-robust speech recognition. Likelihood-maximising (LIMA) frameworks are an alternative that optimises the parameters of enhancement algorithms based on state sequences generated for utterances with known transcriptions. Previous reports of LIMA frameworks have shown significant promise for improving speech recognition accuracy under additive background noise for a range of speech enhancement techniques. In this paper we discuss the drawbacks of the LIMA approach when multiple layers of acoustic mismatch are present, namely background noise and speaker accent. Experimentation using LIMA-based Mel-filterbank noise subtraction on American and Australian English in-car speech databases supports this discussion, demonstrating that inferior speech recognition performance occurs when a second layer of mismatch is seen during evaluation.
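To make the idea of a parameterised enhancement step concrete, the sketch below shows a generic noise subtraction over filterbank energies. It is not the paper's LIMA-based Mel-filterbank method: the `alpha` (over-subtraction) and `floor` parameters and the function name are hypothetical, and in a LIMA-style framework such parameters would be tuned to maximise recogniser likelihood rather than fixed by hand.

```python
def subtract_noise(filterbank_energies, noise_estimate, alpha=1.0, floor=0.01):
    """Subtract a scaled noise estimate from each filterbank energy.

    Each output is kept at or above floor * original energy so that
    energies never become zero or negative.
    """
    return [
        max(energy - alpha * noise, floor * energy)
        for energy, noise in zip(filterbank_energies, noise_estimate)
    ]
```

With `alpha=1.0` and `floor=0.1`, an energy of 10.0 with a noise estimate of 4.0 becomes 6.0, while an energy of 2.0 with a noise estimate of 3.0 is clamped to the floor.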
Abstract:
In recent times, the improved levels of accuracy obtained by Automatic Speech Recognition (ASR) technology have made it viable for use in a number of commercial products. Unfortunately, these types of applications are limited to only a few of the world’s languages, primarily because ASR development is reliant on the availability of large amounts of language-specific resources. This motivates the need for techniques which reduce this language-specific resource dependency. Ideally, these approaches should generalise across languages, thereby providing scope for rapid creation of ASR capabilities for resource-poor languages. Cross-lingual ASR emerges as a means of addressing this need. Underpinning this approach is the observation that sound production is largely influenced by the physiological construction of the vocal tract and, accordingly, is human specific, not language specific. As a result, a common inventory of sounds exists across languages, a property which is exploitable, as sounds from a resource-poor target language can be recognised using models trained on resource-rich source languages. One of the initial impediments to the commercial uptake of ASR technology was its fragility in more challenging environments, such as conversational telephone speech. Subsequent improvements in these environments have gained consumer confidence. Pragmatically, if cross-lingual techniques are to be considered a viable alternative when resources are limited, they need to perform under the same types of conditions. Accordingly, this thesis evaluates cross-lingual techniques using two speech environments: clean read speech and conversational telephone speech. The languages used in evaluations are German, Mandarin, Japanese and Spanish. Results highlight that previously proposed approaches provide respectable results for simpler environments such as read speech, but degrade significantly in the more taxing conversational environment.
Two separate approaches for addressing this degradation are proposed. The first is based on deriving a better target language lexical representation in terms of the source language model set. The second, and ultimately more successful, approach focuses on improving the classification accuracy of context-dependent (CD) models by catering for the adverse influence of language-specific phonotactic properties. Whilst the primary research goal in this thesis is directed towards improving cross-lingual techniques, the catalyst for investigating their use was expressed interest from several organisations in an Indonesian ASR capability. In Indonesia alone, there are over 200 million speakers of some Malay variant, which provides further impetus and commercial justification for speech-related research on this language. Unfortunately, at the beginning of the candidature, limited research had been conducted on the Indonesian language in the field of speech science, and virtually no resources existed. This thesis details the investigative and development work dedicated towards obtaining an ASR system with a 10,000-word recognition vocabulary for the Indonesian language.
Abstract:
This thesis provides a behavioural perspective on the problem of collusive tendering in the construction market by examining the decision-making factors of individuals potentially involved in such agreements, using marketing ethics theory and techniques. The findings of a cross-disciplinary literature review were synthesised into a model of factors theoretically expected to determine the individual's behavioural intent towards a set of collusive tendering agreements and the means of reaching them. The factors were grouped as internal cognitive (the individuals' value systems) and affective (demographic and psychographic characteristics) as well as external environmental (legal, industrial and organisational codes and norms) and situational (company, market and economic conditions). The model was tested using empirical data collected through a questionnaire survey of estimators employed in the largest Australian construction firms. All forms of explicit collusive tendering agreements were considered to have a prohibitive moral content by the majority of respondents, who also clearly differentiated between agreements and discussions of contract terms (which they found to be a moral concern but not prohibitive) or of prices. Comparisons between the respondents who would never participate in a collusive agreement and the potential offenders clearly showed two distinctly different groups. The law-abiding estimators are less reliant on situational factors, happier and more comfortable in their work environments, and they live according to personal value and belief systems. The potential offenders, on the other hand, are mistrustful of colleagues, feel their values are not respected and put company priorities above principles, and none of them is religious or a member of a professional body.
The research results indicate that Australian estimators are, overall, law-abiding and principled, and accept the existing codification of collusion as morally defensible and binding. Professional bodies' and organisational codes of conduct, as well as personal value and belief systems that guide one's own conduct, appear to be deterrents to collusive tendering intent, as do moral comfort and work satisfaction. These observations are potential indicators of areas where intervention and behaviour modification can increase individuals' resistance to collusion.