283 results for "free speech"
Abstract:
The use of animal sera for the culture of therapeutically important cells impedes the clinical use of the cells. We sought to characterize the functional response of human mesenchymal stem cells (hMSCs) to specific proteins known to exist in bone tissue, with a view to eliminating the requirement for animal sera. Insulin-like growth factor-I (IGF-I), via IGF binding protein-3 or -5 (IGFBP-3 or -5), and transforming growth factor-beta 1 (TGF-β1) are known to associate with the extracellular matrix (ECM) protein vitronectin (VN) and elicit functional responses in a range of cell types in vitro. We found that specific combinations of VN, IGFBP-3 or -5, and IGF-I or TGF-β1 could stimulate initial functional responses in hMSCs, and that IGF-I or TGF-β1 induced hMSC aggregation, with VN concentration modulating this effect. We speculated that the aggregation effect may be due to endogenous protease activity, although we found that neither IGF-I nor TGF-β1 affected the functional expression of matrix metalloproteinase-2 or -9, two common proteases expressed by hMSCs. In summary, combinations of the ECM proteins and growth factors described herein may form the basis of defined cell culture media supplements, although the effect of endogenous protease expression on the function of such proteins requires investigation.
Abstract:
This paper studies the incentives for credence goods experts to invest effort in diagnosis if effort is both costly and unobservable, and if they face competition by discounters who are not able to perform a diagnosis. The unobservability of diagnosis effort and the credence characteristic of the good induce experts to choose incentive compatible tariff structures. This makes them vulnerable to competition by discounters. We explore the conditions under which honestly diagnosing experts survive competition by discounters; we identify situations in which experts misdiagnose consumers in order to prevent them from free-riding on experts' advice; and we discuss policy options to solve the free-riding consumers–cheating experts problem.
Abstract:
In an automotive environment, the performance of a speech recognition system is affected by environmental noise if the speech signal is acquired directly from a microphone. Speech enhancement techniques are therefore necessary to improve the speech recognition performance. In this paper, a field-programmable gate array (FPGA) implementation of dual-microphone delay-and-sum beamforming (DASB) for speech enhancement is presented. As the first step towards a cost-effective solution, the implementation described in this paper uses a relatively high-end FPGA device to facilitate the verification of various design strategies and parameters. Experimental results show that the proposed design can produce output waveforms close to those generated by a theoretical (floating-point) model with modest usage of FPGA resources. Speech recognition experiments are also conducted on enhanced in-car speech waveforms produced by the FPGA in order to compare recognition performance with the floating-point representation running on a PC.
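As a point of reference, the following is a minimal floating-point sketch of dual-microphone delay-and-sum beamforming in Python. The integer steering delay, the synthetic signals and the equal channel weighting are illustrative assumptions; the paper's fixed-point FPGA design is not reproduced here.

```python
import numpy as np

def delay_and_sum(mic1, mic2, delay_samples):
    """Align mic2 to mic1 by an integer sample delay, then average."""
    aligned = np.roll(mic2, delay_samples)
    if delay_samples > 0:
        aligned[:delay_samples] = 0.0   # clear wrapped-around samples
    elif delay_samples < 0:
        aligned[delay_samples:] = 0.0
    return 0.5 * (mic1 + aligned)       # equal-weight sum of the two channels

# Synthetic check: the target reaches mic 2 three samples later than mic 1,
# so advancing mic 2 by three samples aligns the two channels.
rng = np.random.default_rng(0)
target = rng.standard_normal(16000)
mic1 = target + 0.5 * rng.standard_normal(16000)
mic2 = np.roll(target, 3) + 0.5 * rng.standard_normal(16000)
enhanced = delay_and_sum(mic1, mic2, delay_samples=-3)
```

In an actual in-car deployment the steering delay would be derived from the microphone spacing and the assumed direction of the driver's speech, and fractional delays would require interpolation.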
Abstract:
Secondary tasks such as cell phone calls or interaction with automated speech dialog systems (SDSs) increase the driver's cognitive load as well as the probability of driving errors. This study analyzes speech production variations due to cognitive load and emotional state of drivers in real driving conditions. Speech samples were acquired from 24 female and 17 male subjects (approximately 8.5 h of data) while talking to a co-driver and communicating with two automated call centers, with emotional states (neutral, negative) and the number of necessary SDS query repetitions also labeled. A consistent shift in a number of speech production parameters (pitch, first formant center frequency, spectral center of gravity, spectral energy spread, and duration of voiced segments) was observed when comparing SDS interaction against co-driver interaction; further shifts were observed when considering negative emotion segments and the number of requested SDS query repetitions. A mel-frequency cepstral coefficient (MFCC) based Gaussian mixture classifier trained on 10 male and 10 female sessions provided 91% accuracy in the open test set task of distinguishing co-driver interactions from SDS interactions, suggesting, together with the acoustic analysis, that it is possible to monitor the level of driver distraction directly from their speech.
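A minimal sketch of the two-class MFCC/GMM decision described above follows, assuming one Gaussian mixture model per interaction type and classifying an utterance by total frame log-likelihood. The 32-component order, diagonal covariances and the synthetic stand-in features are assumptions; the paper's exact configuration may differ.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-ins for 13-dimensional MFCC frames from each condition.
codriver_frames = rng.normal(0.0, 1.0, size=(2000, 13))
sds_frames = rng.normal(0.3, 1.1, size=(2000, 13))

# One GMM per interaction type.
gmm_codriver = GaussianMixture(n_components=32, covariance_type="diag",
                               random_state=0).fit(codriver_frames)
gmm_sds = GaussianMixture(n_components=32, covariance_type="diag",
                          random_state=0).fit(sds_frames)

def classify(frames):
    """Label an utterance by the GMM with the higher total log-likelihood."""
    ll_codriver = gmm_codriver.score_samples(frames).sum()
    ll_sds = gmm_sds.score_samples(frames).sum()
    return "co-driver" if ll_codriver > ll_sds else "SDS"

test_utterance = rng.normal(0.3, 1.1, size=(300, 13))  # unseen SDS-like frames
print(classify(test_utterance))  # expected: "SDS"
```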
Abstract:
The purpose of this chapter is to describe the use of caricatured contrasting scenarios (Bødker, 2000) and how they can be used to consider potential designs for disruptive technologies. The disruptive technology in this case is Automatic Speech Recognition (ASR) software in workplace settings. The particular workplace is the Magistrates Court of the Australian Capital Territory.

Caricatured contrasting scenarios are ideally suited to exploring how ASR might be implemented in a particular setting because they allow potential implementations to be “sketched” quickly and with little effort. This sketching of potential interactions, and the emphasis on both positive and negative outcomes, allows the benefits and pitfalls of design decisions to become apparent.

A brief description of the Court is given, describing the reasons for choosing the Court for this case study. The work of the Court is framed as taking place in two modes: front of house, where the courtroom itself is, and backstage, where documents are processed and the business of the court is recorded and encoded into various systems.

Caricatured contrasting scenarios describing the introduction of ASR to the front of house are presented and then analysed. These scenarios show that the introduction of ASR to the court would be highly problematic.

The final section describes how ASR could be re-imagined in order to make it useful for the court. A final scenario is presented that describes how this re-imagined ASR could be integrated into both the front of house and backstage of the court in a way that could strengthen both processes.
Abstract:
Acoustically, car cabins are extremely noisy, and as a consequence audio-only, in-car voice recognition systems perform poorly. As the visual modality is immune to acoustic noise, using the visual lip information from the driver is seen as a viable strategy for circumventing this problem through audio-visual automatic speech recognition (AVASR). However, implementing AVASR requires a system that can accurately locate and track the driver's face and lip area in real time. In this paper we present such an approach using the Viola-Jones algorithm. Using the AVICAR [1] in-car database, we show that the Viola-Jones approach is a suitable method for locating and tracking the driver's lips, despite the visual variability of illumination and head pose, for an audio-visual speech recognition system.
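The detection step of such a pipeline can be sketched with OpenCV's stock Viola-Jones cascade, as below. The lower-third-of-face heuristic for the lip region and the video file name are illustrative assumptions; the paper's lip tracker is more elaborate than this.

```python
import cv2

# Stock Viola-Jones frontal-face cascade shipped with OpenCV.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture("driver.avi")  # hypothetical in-car recording
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        # Crude lip ROI: the lower third of the detected face box.
        lip_roi = frame[y + 2 * h // 3 : y + h, x : x + w]
        cv2.rectangle(frame, (x, y + 2 * h // 3), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("lip region", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```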
Abstract:
Non-driving-related cognitive load and variations in emotional state may impact a driver's capability to control a vehicle and introduce driving errors. Availability of reliable cognitive load and emotion detection in drivers would benefit the design of active safety systems and other intelligent in-vehicle interfaces. In this study, speech produced by 68 subjects while driving in urban areas is analyzed. A particular focus is on speech production differences in two secondary cognitive tasks, interactions with a co-driver and calls to automated spoken dialog systems (SDS), and two emotional states during the SDS interactions (neutral/negative). A number of speech parameters are found to vary across the cognitive/emotion classes. The suitability of selected cepstral- and production-based features for automatic cognitive task/emotion classification is investigated. A fusion of GMM/SVM classifiers yields an accuracy of 94.3% in cognitive task classification and 81.3% in emotion classification.
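One plausible reading of the GMM/SVM fusion is score-level (late) fusion: a GMM log-likelihood ratio and an SVM margin combined as a weighted sum. The sketch below uses synthetic utterance-level features and equal weights; both are assumptions, as the abstract does not specify the fusion rule.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X_a = rng.normal(0.0, 1.0, size=(500, 10))  # class A (e.g. co-driver) features
X_b = rng.normal(0.6, 1.0, size=(500, 10))  # class B (e.g. SDS) features
X, y = np.vstack([X_a, X_b]), np.array([0] * 500 + [1] * 500)

gmm_a = GaussianMixture(n_components=4, random_state=0).fit(X_a)
gmm_b = GaussianMixture(n_components=4, random_state=0).fit(X_b)
svm = SVC(kernel="rbf").fit(X, y)

def fused_score(x, w=0.5):
    """Weighted sum of GMM log-likelihood ratio and SVM margin (>0 => class B)."""
    llr = gmm_b.score_samples(x) - gmm_a.score_samples(x)
    return w * llr + (1.0 - w) * svm.decision_function(x)

test = rng.normal(0.6, 1.0, size=(10, 10))
print((fused_score(test) > 0).astype(int))  # expect mostly 1s
```

In practice the two score streams would typically be variance-normalised before fusion so that neither dominates the weighted sum.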
Abstract:
Acoustically, car cabins are extremely noisy and, as a consequence, existing audio-only speech recognition systems for voice-based control of vehicle functions, such as GPS-based navigation, perform poorly. Audio-only speech recognition systems fail to make use of the visual modality of speech (e.g., lip movements). As the visual modality is immune to acoustic noise, utilising this visual information in conjunction with an audio-only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as audio-visual speech recognition (AVSR). Research in the AVSR field has been ongoing for the past twenty-five years, with notable progress being made. However, the practical deployment of AVSR systems in a variety of real-world applications has not yet emerged, mainly because most research to date has neglected to address variabilities in the visual domain, such as illumination and viewpoint, in the design of the visual front-end of the AVSR system. In this paper we present an AVSR system for a real-world car environment using the AVICAR database [1], a publicly available in-car database, and we show that using visual speech in conjunction with the audio modality improves the robustness and effectiveness of voice-only recognition systems in car cabin environments.
Abstract:
Objective: To assess the effect of graded increases in exercise-induced energy expenditure (EE) on appetite, energy intake (EI), total daily EE and body weight in men living in their normal environment and consuming their usual diets. Design: Within-subject, repeated measures design. Six men (mean (s.d.) age 31.0 (5.0) y; weight 75.1 (15.96) kg; height 1.79 (0.10) m; body mass index (BMI) 23.3 (2.4) kg/m2) were each studied three times during a 9 day protocol, corresponding to prescriptions of no exercise (control) (Nex; 0 MJ/day), a medium exercise level (Mex; ~1.6 MJ/day) and a high exercise level (Hex; ~3.2 MJ/day). On days 1-2 subjects were given a medium fat (MF) maintenance diet (1.6 × resting metabolic rate (RMR)). Measurements: On days 3-9 subjects self-recorded dietary intake using a food diary and self-weighed intake. EE was assessed by continual heart rate (HR) monitoring, using the modified FLEX method. Subjects' HR was individually calibrated against submaximal VO2 during incremental exercise tests at the beginning and end of each 9 day study period. Respiratory exchange was measured by indirect calorimetry. Subjects completed hourly hunger ratings during waking hours to record subjective sensations of hunger and appetite. Body weight was measured daily. Results: EE amounted to 11.7, 12.9 and 16.8 MJ/day (F(2,10)=48.26; P<0.001 (s.e.d.=0.55)) on the Nex, Mex and Hex treatments, respectively. The corresponding values for EI were 11.6, 11.8 and 11.8 MJ/day (F(2,10)=0.10; P=0.910 (s.e.d.=0.10)), respectively. There were no significant treatment effects on hunger, appetite or body weight, although there was evidence of weight loss on the Hex treatment. Conclusion: Increasing EE did not lead to compensation of EI over 7 days. However, total daily EE tended to decrease over time on the two exercise treatments. Lean men appear able to tolerate a considerable negative energy balance, induced by exercise, over 7 days without invoking compensatory increases in EI.
Abstract:
What really changed for Australian Aboriginal and Torres Strait Islander people between Paul Keating’s Redfern Park Speech (Keating 1992) and Kevin Rudd’s Apology to the stolen generations (Rudd 2008)? What will change between the Apology and the next speech of an Australian Prime Minister? The two speeches were intricately linked, and they were both personal and political. But do they really signify change at the political level? This paper reflects my attempt to turn the gaze away from Aboriginal and Torres Strait Islander people, and back to where the speeches originated: the Australian Labor Party (ALP). I question whether the changes foreshadowed in the two speeches – including changes by the Australian public and within Australian society – are evident in the internal mechanisms of the ALP. I also seek to understand why non-Indigenous women seem to have given in to the existing ways of the ALP instead of challenging the status quo which keeps Aboriginal and Torres Strait Islander peoples marginalised. I believe that, without a thorough examination and a change in the ALP’s practices, the domination and subjugation of Indigenous peoples will continue – within the Party, through the Australian political process and, therefore, through governments.
Abstract:
While close-talking microphones give the best signal quality and produce the highest accuracy from current Automatic Speech Recognition (ASR) systems, speech enhanced by a microphone array has been shown to be an effective alternative in noisy environments. In contrast to close-talking microphones, microphone arrays alleviate the user's feeling of discomfort and distraction. For this reason, microphone arrays are popular and have been used in a wide range of applications such as teleconferencing, hearing aids, speaker tracking, and as the front-end to speech recognition systems. With advances in sensor and sensor network technology, there is considerable potential for applications that employ ad-hoc networks of microphone-equipped devices collaboratively as a virtual microphone array. By allowing such devices to be distributed throughout the users' environment, the microphone positions are no longer constrained to traditional fixed geometrical arrangements. This flexibility in data acquisition allows different audio scenes to be captured to give a complete picture of the working environment. In such ad-hoc deployments of microphone sensors, however, the lack of information about the location of devices and active speakers poses technical challenges for array signal processing algorithms, which must be addressed to allow deployment in real-world applications. While not an ad-hoc sensor network, conditions approaching this have in effect been imposed in recent National Institute of Standards and Technology (NIST) ASR evaluations on distant microphone recordings of meetings. The NIST evaluation data comes from multiple sites, each with different and often loosely specified distant microphone configurations. This research investigates how microphone array methods can be applied to ad-hoc microphone arrays. A particular focus is on devising methods that are robust to unknown microphone placements in order to improve the overall speech quality and recognition performance provided by the beamforming algorithms. In ad-hoc situations, microphone positions and likely source locations are not known, and beamforming must be achieved blindly. There are two general approaches that can be employed to blindly estimate the steering vector for beamforming. The first is direct estimation without regard to the microphone and source locations. The alternative is to first determine the unknown microphone positions through array calibration methods and then use the traditional geometrical formulation for the steering vector. Following these two major approaches investigated in this thesis, a novel clustered approach is proposed, which clusters the microphones and selects clusters based on their proximity to the speaker. Novel experiments demonstrate that the proposed method of automatically selecting clusters of microphones (i.e., a subarray) closely located both to each other and to the desired speech source may in fact provide more robust speech enhancement and recognition than the full array.
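The geometrical steering-vector formulation mentioned above can be sketched briefly: with known microphone and source positions, per-microphone propagation delays follow from Euclidean distances and the speed of sound. The positions, the 1 kHz narrowband bin and the 1.5 m proximity radius below are illustrative assumptions, not the thesis's configuration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def steering_vector(mic_positions, source, freq_hz):
    """Narrowband steering vector for mics at `mic_positions` (M x 3)."""
    distances = np.linalg.norm(mic_positions - source, axis=1)
    delays = (distances - distances.min()) / SPEED_OF_SOUND  # relative delays
    return np.exp(-2j * np.pi * freq_hz * delays)

mics = np.array([[0.0, 0.0, 0.0],
                 [0.1, 0.0, 0.0],
                 [2.0, 1.0, 0.0]])   # an ad-hoc, non-uniform layout
speaker = np.array([1.0, 2.0, 0.5])
d = steering_vector(mics, speaker, freq_hz=1000.0)

# Proximity-based subarray selection in the spirit of the clustered
# approach: keep only microphones within 1.5 m of the estimated speaker.
subarray = mics[np.linalg.norm(mics - speaker, axis=1) < 1.5]
```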
Abstract:
Traditional speech enhancement methods optimise signal-level criteria such as signal-to-noise ratio, but these approaches are sub-optimal for noise-robust speech recognition. Likelihood-maximising (LIMA) frameworks are an alternative that optimise parameters of enhancement algorithms based on state sequences generated for utterances with known transcriptions. Previous reports of LIMA frameworks have shown significant promise for improving speech recognition accuracies under additive background noise for a range of speech enhancement techniques. In this paper we discuss the drawbacks of the LIMA approach when multiple layers of acoustic mismatch are present – namely background noise and speaker accent. Experimentation using LIMA-based Mel-filterbank noise subtraction on American and Australian English in-car speech databases supports this discussion, demonstrating that inferior speech recognition performance occurs when a second layer of mismatch is seen during evaluation.
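As context for the enhancement technique named above, the following is a minimal sketch of Mel-filterbank noise subtraction. The per-band subtraction factors `alpha` are the kind of parameter a LIMA framework would tune against recogniser state sequences; the leading-frame noise estimate, the spectral floor, the 24-band filterbank and the file name are all assumptions.

```python
import numpy as np
import librosa

y, sr = librosa.load("incar_utterance.wav", sr=16000)        # hypothetical file
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=24)  # (24 bands, T frames)

noise = mel[:, :10].mean(axis=1, keepdims=True)  # noise estimate: leading frames
alpha = np.ones((24, 1))                         # per-band factors (LIMA-tunable)
floor = 0.01 * noise                             # spectral floor avoids negatives

enhanced = np.maximum(mel - alpha * noise, floor)
```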