931 results for Volitive modality
Abstract:
Surveillance systems such as object tracking and abandoned object detection systems typically rely on a single modality of colour video for their input. These systems work well in controlled conditions but often fail when low lighting, shadowing, smoke, dust or unstable backgrounds are present, or when the objects of interest are a similar colour to the background. Thermal images are not affected by lighting changes or shadowing, and are not overtly affected by smoke, dust or unstable backgrounds. However, thermal images lack colour information, which makes distinguishing between different people or objects of interest within the same scene difficult. By using modalities from both the visible and thermal infrared spectra, we are able to obtain more information from a scene and overcome the problems associated with using either modality individually. We evaluate four approaches for fusing visual and thermal images for use in a person tracking system (two early fusion methods, one mid fusion and one late fusion method), in order to determine the most appropriate method for fusing multiple modalities. We also evaluate two of these approaches for use in abandoned object detection, and propose an abandoned object detection routine that utilises multiple modalities. To aid in the tracking and fusion of the modalities we propose a modified condensation filter that can dynamically change the particle count and features used according to the needs of the system. We compare tracking and abandoned object detection performance for the proposed fusion schemes and the visual and thermal domains on their own. Testing is conducted using the OTCBVS database to evaluate object tracking, and data captured in-house to evaluate the abandoned object detection. Our results show that significant improvement can be achieved, and that a middle fusion scheme is most effective.
Abstract:
Acoustically, car cabins are extremely noisy, and as a consequence audio-only in-car voice recognition systems perform poorly. As the visual modality is immune to acoustic noise, using visual lip information from the driver is seen as a viable strategy for circumventing this problem through audio-visual automatic speech recognition (AVASR). However, implementing AVASR requires a system that can accurately locate and track the driver's face and lip area in real time. In this paper we present such an approach using the Viola-Jones algorithm. Using the AVICAR [1] in-car database, we show that the Viola-Jones approach is a suitable method of locating and tracking the driver's lips for an audio-visual speech recognition system, despite the visual variability of illumination and head pose.
Abstract:
Acoustically, car cabins are extremely noisy, and as a consequence existing audio-only speech recognition systems for voice-based control of vehicle functions, such as the GPS-based navigator, perform poorly. Audio-only speech recognition systems fail to make use of the visual modality of speech (e.g. lip movements). As the visual modality is immune to acoustic noise, utilising this visual information in conjunction with an audio-only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as Audio Visual Speech Recognition (AVSR). Research in the AVSR field has been ongoing for the past twenty-five years, with notable progress being made. However, the practical deployment of AVSR systems in a variety of real-world applications has not yet emerged. The main reason is that most research to date has neglected to address variabilities in the visual domain, such as illumination and viewpoint, in the design of the visual front-end of the AVSR system. In this paper we present an AVSR system in a real-world car environment using the AVICAR database [1], a publicly available in-car database, and we show that using visual speech in conjunction with the audio modality improves the robustness and effectiveness of voice-only recognition systems in car cabin environments.
Abstract:
The detection of voice activity is a challenging problem, especially when the level of acoustic noise is high. Most current approaches only utilise the audio signal, making them susceptible to acoustic noise. An obvious approach to overcome this is to use the visual modality. The current state-of-the-art visual feature extraction technique is one that uses a cascade of visual features (i.e. 2D-DCT, feature mean normalisation, interstep LDA). In this paper, we investigate the effectiveness of this technique for the task of visual voice activity detection (VAD), and analyse each stage of the cascade and quantify the relative improvement in performance gained by each successive stage. The experiments were conducted on the CUAVE database and our results highlight that the dynamics of the visual modality can be used to good effect to improve visual voice activity detection performance.
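The first two stages of the cascade named in this abstract (2D-DCT followed by feature mean normalisation) can be illustrated with a minimal sketch. It is written under assumptions and is not the authors' implementation: the function names, the orthonormal DCT-II scaling and the choice to keep a top-left block of low-frequency coefficients are all hypothetical, and the LDA stage is omitted.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a small grayscale block (list of rows)."""
    n_rows, n_cols = len(block), len(block[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for u in range(n_rows):
        for v in range(n_cols):
            total = 0.0
            for x in range(n_rows):
                for y in range(n_cols):
                    total += (
                        block[x][y]
                        * math.cos(math.pi * (2 * x + 1) * u / (2 * n_rows))
                        * math.cos(math.pi * (2 * y + 1) * v / (2 * n_cols))
                    )
            # Scaling factors that make the transform orthonormal.
            cu = math.sqrt((1 if u == 0 else 2) / n_rows)
            cv = math.sqrt((1 if v == 0 else 2) / n_cols)
            out[u][v] = cu * cv * total
    return out


def low_frequency_features(block, keep=2):
    """Flatten the top-left keep x keep coefficients (an assumed selection rule)."""
    coeffs = dct_2d(block)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


def mean_normalise(frames):
    """Subtract the per-dimension mean taken over all frames of one utterance."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(frame[d] for frame in frames) / n for d in range(dim)]
    return [[frame[d] - means[d] for d in range(dim)] for frame in frames]
```

In use, each mouth-region image would be passed through `low_frequency_features`, and the resulting per-frame vectors for an utterance through `mean_normalise`, before any later stage of the cascade.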
Abstract:
Objective: To determine the effect of zinc supplementation on taste perception in a group of hemodialysis patients. Design and Setting: Double-blind randomized placebo-controlled study in a teaching hospital dialysis unit. Patients: Fifteen stable hemodialysis patients randomized to placebo (6 male, 2 female; median age, 67; range, 30 to 72 years) or treatment (5 male, 2 female; median age, 60; range, 31 to 76 years). Intervention: Treatment group received zinc sulfate 220 mg per day for 6 weeks, and the placebo group received an apparently identical dummy pill. Main Outcome Measures: Taste scores by visual analogue scales, normalized protein catabolic rate and plasma, whole blood and red cell zinc levels. Results: At baseline, sweet and salt tastes were identified correctly by both groups. Sour was often confused with salt. Sour solutions of different concentrations were not distinguishable. Taste scores were not different after 6 weeks for either group. There was no significant increment in zinc levels or normalized protein catabolic rate for either group. Conclusion: We found a disturbance of taste perception in hemodialysis patients, particularly for the sour modality, which was not corrected by this regimen of zinc supplementation. These results cast doubts on the conclusions of earlier studies that indicated an improvement in taste after zinc supplementation.
Abstract:
Intelligent surveillance systems typically use a single visual spectrum modality for their input. These systems work well in controlled conditions, but often fail when lighting is poor, or when environmental effects such as shadows, dust or smoke are present. Thermal spectrum imagery is not as susceptible to environmental effects; however, thermal imaging sensors are more sensitive to noise and produce only grayscale images, making it difficult to distinguish between objects. Several approaches to combining the visual and thermal modalities have been proposed; however, they are limited by the assumption that both modalities are performing equally well. When one modality fails, existing approaches are unable to detect the drop in performance and disregard the underperforming modality. In this paper, a novel middle fusion approach for combining visual and thermal spectrum images for object tracking is proposed. Motion and object detection is performed on each modality, and the object detection results for each modality are fused based on the current performance of each modality. Modality performance is determined by comparing the number of objects tracked by the system with the number detected by each mode, with a small allowance made for objects entering and exiting the scene. The tracking performance of the proposed fusion scheme is compared with the performance of the visual and thermal modes individually, and with a baseline middle fusion scheme. Improvement in tracking performance using the proposed fusion approach is demonstrated. The proposed approach is also shown to be able to detect the failure of an individual modality and disregard its results, ensuring performance is not degraded in such situations.
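The performance measure described in this abstract (tracked object count versus per-modality detection count, with a small allowance) can be sketched as follows. This is an illustration only: the abstract gives no formula, so the scoring function, the `allowance` and `floor` parameters and all names here are hypothetical.

```python
def agreement_score(num_tracked, num_detected, allowance=1):
    """Score in (0, 1]; 1.0 when the detection count is within `allowance`
    of the number of objects currently tracked (assumed scoring rule)."""
    discrepancy = max(0, abs(num_detected - num_tracked) - allowance)
    return 1.0 / (1.0 + discrepancy)


def fusion_weights(num_tracked, detections, allowance=1, floor=0.25):
    """Map each modality name to a fusion weight; the weights sum to 1.

    `detections` maps a modality name to its detected object count.
    Modalities scoring below `floor` are disregarded (weight 0.0),
    unless every modality falls below it.
    """
    scores = {
        name: agreement_score(num_tracked, count, allowance)
        for name, count in detections.items()
    }
    kept = {name: s for name, s in scores.items() if s >= floor}
    if not kept:
        # Every modality looks unreliable: fall back to using all of them.
        kept = scores
    total = sum(kept.values())
    return {name: kept.get(name, 0.0) / total for name in scores}
```

For example, with three tracked objects, a visual channel reporting twelve detections (say, under heavy shadowing) would be given zero weight, while a thermal channel reporting three would carry the full weight.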
Abstract:
Purpose: The aim was to construct and advise on the use of a cost-per-wear model based on contact lens replacement frequency, to form an equitable basis for cost comparison. Methods: The annual cost of professional fees, contact lenses and solutions when wearing daily, two-weekly and monthly replacement contact lenses is determined in the context of the Australian market for spherical, toric and multifocal prescription types. This annual cost is divided by the number of times lenses are worn per year, resulting in a ‘cost-per-wear’. The model is presented graphically as the cost-per-wear versus the number of times lenses are worn each week for daily replacement and reusable (two-weekly and monthly replacement) lenses. Results: The cost-per-wear for two-weekly and monthly replacement spherical lenses is almost identical but decreases with increasing frequency of wear. The cost-per-wear of daily replacement spherical lenses is lower than for reusable spherical lenses when worn from one to four days per week, but higher when worn six or seven days per week. The point at which the cost-per-wear is virtually the same for all three spherical lens replacement frequencies (approximately AUD$3.00) is five days of lens wear per week. A similar but upwardly displaced (higher cost) pattern is observed for toric lenses, with the cross-over point occurring between three and four days of wear per week (AUD$4.80). Multifocal lenses have the highest price, with cross-over points for daily versus two-weekly replacement lenses at between four and five days of wear per week (AUD$5.00) and for daily versus monthly replacement lenses at three days per week (AUD$5.50). Conclusions: This cost-per-wear model can be used to assist practitioners and patients in making an informed decision in relation to the cost of contact lens wear as one of many considerations that must be taken into account when deciding on the most suitable lens replacement modality.
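The cost-per-wear calculation described in the Methods (annual cost divided by the number of wears per year) can be sketched as below. The function names and the 52-week year are assumptions, as is the split of annual cost into fees, lenses and solutions for each lens type; no prices from the study are reproduced.

```python
WEEKS_PER_YEAR = 52  # assumed; the abstract does not state the figure used


def cost_per_wear(annual_cost, wears_per_week):
    """Annual cost divided by the number of times lenses are worn per year."""
    if not 1 <= wears_per_week <= 7:
        raise ValueError("wears_per_week must be between 1 and 7")
    return annual_cost / (wears_per_week * WEEKS_PER_YEAR)


def annual_cost_daily(fees, price_per_pair, wears_per_week):
    """Daily replacement: lens cost grows with every wear (no solutions assumed)."""
    return fees + price_per_pair * wears_per_week * WEEKS_PER_YEAR


def annual_cost_reusable(fees, lenses_per_year, solutions_per_year):
    """Reusable lenses: assumed to be replaced on schedule however often worn."""
    return fees + lenses_per_year + solutions_per_year
```

Under these assumptions the annual cost of reusable lenses is fixed, so their cost-per-wear falls as wearing frequency rises, while the cost-per-wear of daily lenses tends towards the price of a pair. That is the mechanism behind the cross-over points reported in the Results.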
Abstract:
Extended wear has long been the ‘holy grail’ of contact lenses by virtue of the increased convenience and freedom of lifestyle which they accord; however, this modality enjoyed only limited market success during the last quarter of the 20th century. The introduction of silicone hydrogel materials into the market at the beginning of this century heralded the promise of successful extended wear due to the superior oxygen performance of this lens type. To assess patterns of contact lens fitting, including extended wear, over the past decade, up to 1000 survey forms were sent to contact lens fitters in Australia, Canada, Japan, the Netherlands, Norway, the UK and the USA each year between 2000 and 2009. Practitioners were asked to record data relating to the first 10 contact lens fits or refits performed after receiving the survey form. Analysis of returned forms revealed that, averaged over this period, 9% of all soft lenses prescribed were for extended wear, with national figures ranging from 2% in Japan to 17% in Norway. The trend over the past decade has been for an increase from about 5% of all soft lens fits in 2000 to a peak of between 9 and 12% between 2002 and 2007, followed by a decline to around 7% in 2009. A person receiving extended wear lenses is likely to be an older female who is being refitted with silicone hydrogel lenses for full-time wear. Although extended wear has yet again failed to fulfil the promise of being the dominant contact lens wearing modality, it is still a viable option for many people.
Abstract:
Cryopreservation plays a significant role in tissue banking and will assume even greater importance as more and more tissue-engineered products routinely enter the clinical arena. The most common concept underlying tissue engineering is to combine a scaffold (cellular solids) or matrix (hydrogels) with living cells to form a tissue-engineered construct (TEC) to promote the repair and regeneration of tissues. The scaffold and matrix are expected to support cell colonization, migration, growth and differentiation, and to guide the development of the required tissue. The promises of tissue engineering, however, depend on the ability to physically distribute the products to patients in need. For this reason, the ability to cryogenically preserve not only cells, but also TECs, and one day even whole laboratory-produced organs, may be indispensable. Cryopreservation can be achieved by conventional freezing and by vitrification (ice-free cryopreservation). In this publication we try to define the needs versus the desires of vitrifying TECs, with particular emphasis on the cryoprotectant properties, suitable materials and morphology. It is concluded that the formation of ice, through both direct and indirect effects, is probably fundamental to these difficulties, and this is why vitrification seems to be the most promising modality of cryopreservation.
Abstract:
Interacting with technology within a vehicle environment using a voice interface can greatly reduce the effects of driver distraction. Most current approaches to this problem only utilise the audio signal, making them susceptible to acoustic noise. An obvious approach to circumvent this is to use the visual modality in addition. However, capturing, storing and distributing audio-visual data in a vehicle environment is very costly and difficult. One current dataset available for such research is the AVICAR [1] database. Unfortunately this database is largely unusable due to timing mismatch between the two streams and in addition, no protocol is available. We have overcome this problem by re-synchronising the streams on the phone-number portion of the dataset and established a protocol for further research. This paper presents the first audio-visual results on this dataset for speaker-independent speech recognition. We hope this will serve as a catalyst for future research in this area.
Abstract:
Background: There is a growing trend for individuals to seek health information from online sources. Alcohol and other drug (AOD) use is a significant health problem worldwide, but access and use of AOD websites is poorly understood. Objective: To investigate content and functionality preferences for AOD and other health websites. Methods: An anonymous online survey examined general Internet and AOD-specific usage and search behaviors, valued features of AOD and health-related websites (general and interactive website features), indicators of website trustworthiness, valued AOD website tools or functions, and treatment modality preferences. Results: Surveys were obtained from 1214 drug (n = 766) and alcohol website users (n = 448) (mean age 26.2 years, range 16-70). There were no significant differences between alcohol and drug groups on demographic variables, Internet usage, indicators of website trustworthiness, or on preferences for AOD website functionality. A robust website design/navigation, open access, and validated content provision were highly valued by both groups. While attractiveness and pictures or graphics were also valued, high-cost features (videos, animations, games) were minority preferences. Almost half of respondents in both groups were unable to readily access the information they sought. Alcohol website users placed greater importance on several AOD website tools and functions than did those accessing other drug websites: online screening tools (χ²₂ = 15.8, P < .001, n = 985); prevention programs (χ²₂ = 27.5, P < .001, n = 981); tracking functions (χ²₂ = 11.5, P = .003, n = 983); self-help treatment programs (χ²₂ = 8.3, P = .02, n = 984); downloadable fact sheets for friends (χ²₂ = 11.6, P = .003, n = 981); or family (χ²₂ = 12.7, P = .002, n = 983). The most preferred online treatment option for both user groups was an Internet site with email therapist support. Explorations of demographic differences were also performed. While gender did not affect survey responses, younger respondents were more likely to value interactive and social networking features, whereas downloading of credible information was most highly valued by older respondents. Conclusions: Significant deficiencies in the provision of accessible information on AOD websites were identified, an important problem since information seeking was the most common reason for accessing these websites, and, therefore, may be a key avenue for engaging website users in behaviour change. The few differences between AOD website users suggested that both types of websites may have similar features, although alcohol website users may more readily be engaged in screening, prevention and self-help programs, tracking change, and may value fact sheets more highly. While the sociodemographic differences require replication and clarification, these differences support the notion that the design and features of AOD websites should target specific audiences to have maximal impact.
Abstract:
The Australian e-Health Research Centre in collaboration with the Queensland University of Technology's Paediatric Spine Research Group is developing software for visualisation and manipulation of large three-dimensional (3D) medical image data sets. The software allows the extraction of anatomical data from individual patients for use in preoperative planning. State-of-the-art computer technology makes it possible to slice through the image dataset at any angle, or manipulate 3D representations of the data instantly. Although the software was initially developed to support planning for scoliosis surgery, it can be applied to any dataset whether obtained from computed tomography, magnetic resonance imaging or any other imaging modality.
Abstract:
Visual noise insensitivity is important to audio visual speech recognition (AVSR). Visual noise can take on a number of forms such as varying frame rate, occlusion, lighting or speaker variabilities. The use of a high dimensional secondary classifier on the word likelihood scores from both the audio and video modalities is investigated for the purposes of adaptive fusion. Preliminary results are presented demonstrating performance above the catastrophic fusion boundary for our confidence measure irrespective of the type of visual noise presented to it. Our experiments were restricted to small vocabulary applications.
Abstract:
This paper investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. It has been previously shown in our own work, and in the work of others, that features extracted from a speaker's moving lips hold speaker dependencies which are complementary to speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either sub-system. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
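The idea of weighting each modality before fusion can be illustrated with a generic weighted-sum sketch. The abstract does not describe the paper's weighting technique, so the plain grid search below is a stand-in, not the authors' method; the function names, the fixed decision threshold and the use of the summed error rate as the selection criterion are all assumptions.

```python
def fuse(speech_score, lip_score, w):
    """Weighted-sum fusion; w is the weight given to the speech modality."""
    return w * speech_score + (1.0 - w) * lip_score


def error_rates(scores, labels, threshold):
    """Return (false acceptance rate, false rejection rate).

    labels[i] is True for a genuine speaker and False for an impostor.
    """
    n_genuine = sum(1 for y in labels if y)
    n_impostor = len(labels) - n_genuine
    if n_genuine == 0 or n_impostor == 0:
        raise ValueError("need at least one genuine and one impostor trial")
    false_accepts = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
    false_rejects = sum(1 for s, y in zip(scores, labels) if y and s < threshold)
    return false_accepts / n_impostor, false_rejects / n_genuine


def best_weight(speech, lip, labels, threshold=0.5, steps=11):
    """Grid-search w in [0, 1] for the lowest summed error on held-out trials."""
    best_w, best_err = None, None
    for i in range(steps):
        w = i / (steps - 1)
        fused = [fuse(s, l, w) for s, l in zip(speech, lip)]
        far, frr = error_rates(fused, labels, threshold)
        if best_err is None or far + frr < best_err:
            best_w, best_err = w, far + frr
    return best_w, best_err
```

When the speech scores are barely separable, as might happen in heavy background noise, a search of this kind shifts the weight towards the lip scores.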