929 results for deontic modality
Abstract:
Visual noise insensitivity is important to audio-visual speech recognition (AVSR). Visual noise can take a number of forms, such as varying frame rate, occlusion, lighting or speaker variability. The use of a high-dimensional secondary classifier on the word likelihood scores from both the audio and video modalities is investigated for the purposes of adaptive fusion. Preliminary results are presented demonstrating performance above the catastrophic fusion boundary for our confidence measure irrespective of the type of visual noise presented to it. Our experiments were restricted to small-vocabulary applications.
Abstract:
This paper investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. It has been previously shown in our own work, and in the work of others, that features extracted from a speaker's moving lips hold speaker dependencies which are complementary with speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either sub-system. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
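The abstract does not describe how the modality weighting is determined. As a rough illustration only, the sketch below assumes a linear fusion of one speech score and one lip score per trial, and uses a plain grid search as a stand-in for the paper's technique. The names `fuse`, `error_rates` and `best_weight`, the trial format and the fixed threshold are all hypothetical.

```python
def fuse(speech_score, lip_score, alpha):
    """Weighted sum of the two modality scores; alpha is the speech weight in [0, 1]."""
    return alpha * speech_score + (1.0 - alpha) * lip_score


def error_rates(trials, alpha, threshold):
    """Return (false acceptance rate, false rejection rate) for one weight.

    Each trial is a tuple (speech_score, lip_score, is_genuine).
    """
    false_accepts = false_rejects = genuine = impostor = 0
    for speech, lip, is_genuine in trials:
        accepted = fuse(speech, lip, alpha) >= threshold
        if is_genuine:
            genuine += 1
            false_rejects += not accepted
        else:
            impostor += 1
            false_accepts += accepted
    return false_accepts / max(impostor, 1), false_rejects / max(genuine, 1)


def best_weight(trials, threshold=0.0, steps=11):
    """Grid-search the speech weight that minimises FAR + FRR on held-out trials."""
    candidates = [i / (steps - 1) for i in range(steps)]
    return min(candidates, key=lambda a: sum(error_rates(trials, a, threshold)))
```

With trials recorded in heavy acoustic noise, where the speech score barely separates genuine speakers from impostors, a search of this kind would be expected to shift most of the weight to the lip modality.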
Abstract:
This paper investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. We have previously shown (Int. Conf. on Acoustics, Speech and Signal Proc., vol. 6, pp. 3693-3696, May 1998) that features extracted from a speaker's moving lips hold speaker dependencies which are complementary with speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either subsystem individually. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
Abstract:
The use of visual features in the form of lip movements to improve the performance of acoustic speech recognition has been shown to work well, particularly in noisy acoustic conditions. However, whether this technique can outperform speech recognition incorporating well-known acoustic enhancement techniques, such as spectral subtraction or multi-channel beamforming, is not known. This is an important question, especially in an automotive environment, for the design of an efficient human-vehicle computer interface. We perform a variety of speech recognition experiments on a challenging automotive speech dataset, and the results show that synchronous HMM-based audio-visual fusion can outperform traditional single-channel as well as multi-channel acoustic speech enhancement techniques. We also show that further improvement in recognition performance can be obtained by fusing speech-enhanced audio with the visual modality, demonstrating the complementary nature of the two robust speech recognition approaches.
Abstract:
RÉSUMÉ. Information retrieval systems such as those found on the Web generally accommodate communication disorders through interfaces that use modalities not involving reading and writing. Few applications exist to help users who struggle with the textual modality. We propose taking phonological awareness into account to assist users who have difficulty writing queries (dysorthographia) or reading documents (dyslexia). First, we propose a system that rewrites and interprets the queries typed by the user: based on the causes of dysorthographia and on the examples available to us, a system combining an editorial approach (of the spelling-checker type) with an oral approach (an automatic transcription system) proved more appropriate. Second, a machine learning method uses specific criteria, such as grapho-phonemic cohesion, to estimate the readability of a sentence and then of a text. ABSTRACT. Most applications intend to help disabled users in the information retrieval process by proposing non-textual modalities. This paper introduces specific parameters linked to phonological awareness in the textual modality. This will enhance the ability of systems to deal with orthographic issues and to adapt results to the reader, for example when the reader is dyslexic. We propose a phonology-based sentence-level rewriting system that combines spelling correction, speech synthesis and automatic speech recognition. This has been evaluated on a corpus of questions collected from dyslexic children. We propose a specific sentence readability measure that involves phonetic parameters such as grapho-phonemic cohesion. This has been learned on a corpus of reading times of sentences read by dyslexic children.
Abstract:
Radiotherapy is a cancer treatment modality in which a dose of ionising radiation is delivered to a tumour. The accurate calculation of the dose to the patient is very important in the design of an effective therapeutic strategy. This study aimed to systematically examine the accuracy of the radiotherapy dose calculations performed by clinical treatment planning systems by comparison against Monte Carlo simulations of the treatment delivery. A suite of software tools known as MCDTK (Monte Carlo DICOM ToolKit) was developed for this purpose, and is capable of:
• Importing DICOM-format radiotherapy treatment plans and producing Monte Carlo simulation input files (allowing simple simulation of complex treatments), and calibrating the results;
• Analysing the predicted doses of and deviations between the Monte Carlo simulation results and treatment planning system calculations in regions of interest (tumours and organs-at-risk) and generating dose-volume histograms, so that conformity with dose prescriptions can be evaluated.
The code has been tested against various treatment planning systems, linear accelerator models and treatment complexities. Six clinical head and neck cancer treatments were simulated and the results analysed using this software. The deviations were greatest where the treatment volume encompassed tissues on both sides of an air cavity. This was likely due to the method the planning system used to model low-density media.
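To make the dose-volume histogram step concrete, the following is a minimal sketch of a cumulative DVH for a single region of interest. It is not MCDTK code: the function names are hypothetical, and it assumes the doses are given as a flat list with one value per voxel and that all voxels have equal volume.

```python
def volume_fraction_at(doses_gy, level_gy):
    """Fraction of the region's voxels that receive at least level_gy."""
    if not doses_gy:
        raise ValueError("region of interest contains no voxels")
    return sum(1 for dose in doses_gy if dose >= level_gy) / len(doses_gy)


def cumulative_dvh(doses_gy, bin_width_gy=1.0):
    """Return [(dose level, volume fraction receiving >= that level), ...].

    Levels start at 0 Gy and stop one bin above the maximum dose,
    so the curve runs from full coverage down to zero.
    """
    if not doses_gy:
        raise ValueError("region of interest contains no voxels")
    n_levels = int(max(doses_gy) // bin_width_gy) + 2
    return [
        (i * bin_width_gy, volume_fraction_at(doses_gy, i * bin_width_gy))
        for i in range(n_levels)
    ]
```

Under these assumptions, a deviation between a planning-system calculation and a Monte Carlo simulation could be summarised by evaluating `volume_fraction_at` on both dose lists at the prescription dose and taking the difference.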
Abstract:
The availability and use of online counseling approaches has increased rapidly over the last decade. While research has suggested a range of potential affordances and limitations of online counseling modalities, very few studies have offered detailed examinations of how counselors and clients manage asynchronous email counseling exchanges. In this paper we examine email exchanges involving clients and counselors through Kids Helpline, a national Australian counseling service that offers free online, email and telephone counseling for young people up to the age of 25. We employ tools from the traditions of ethnomethodology and conversation analysis to analyze the ways in which counselors from Kids Helpline request that their clients call them, and hence change the modality of their counseling relationship, from email to telephone counseling. This paper shows the counselors’ three multi-layered approaches in these emails as they negotiate the potentially delicate task of requesting and persuading a client to change the trajectory of their counseling relationship from text to talk without placing that relationship in jeopardy.
Abstract:
PURPOSE: The purpose of this study is to identify risk factors for developing complications following treatment of refractory glaucoma with transscleral diode laser cyclophotocoagulation (cyclodiode), to improve the safety profile of this treatment modality. METHOD: A retrospective analysis of 72 eyes from 70 patients who were treated with cyclodiode. RESULTS: The mean pre-treatment intraocular pressure (IOP) was 37.0 mmHg (SD 11.0), with a mean post-treatment reduction in IOP of 19.8 mmHg, and a mean IOP at last follow-up of 17.1 mmHg (SD 9.7). Mean total energy delivered during treatment was 156.8 Joules (SD 82.7) over a mean of 1.3 treatments (SD 0.6). Sixteen eyes (22.2% of eyes) developed complications from the treatment, with the most common being hypotony, occurring in 6 patients, including 4 with neovascular glaucoma. A higher pre-treatment IOP and a higher mean total energy delivery were also associated with a higher rate of complications. CONCLUSIONS: Cyclodiode is an effective treatment option for glaucoma that is refractory to other treatment options. By identifying risk factors for potential complications, cyclodiode can be modified accordingly for each patient to improve safety and efficacy.
Abstract:
The interactive effects of emotion and attention on attentional startle modulation were investigated in two experiments. Participants performed a discrimination and counting task with two visual stimuli during which acoustic eyeblink startle-eliciting probes were presented at long lead intervals. In Experiment 1, this task was combined with aversive Pavlovian conditioning. In Group Attend CS+, the attended stimulus was followed by an aversive unconditional stimulus (US) and the ignored stimulus was presented alone whereas the ignored stimulus was paired with the US in Group Attend CS−. In Experiment 2, a non-aversive reaction time task US replaced the aversive US. Regardless of the conditioning manipulation and consistent with a modality non-specific account of attentional startle modulation, startle magnitude was larger during attended than ignored stimuli in both experiments. Blink latency shortening was differentially affected by the conditioning manipulations suggesting additive effects of conditioning and discrimination and counting task on blink startle.
Abstract:
Accurate and efficient thermal-infrared (IR) camera calibration is important for advancing computer vision research within the thermal modality. This paper presents an approach for geometrically calibrating individual and multiple cameras in both the thermal and visible modalities. The proposed technique can be used to correct for lens distortion and to simultaneously reference both visible and thermal-IR cameras to a single coordinate frame. The most popular existing approach for the geometric calibration of thermal cameras uses a printed chessboard heated by a flood lamp and is comparatively inaccurate and difficult to execute. Additionally, software toolkits provided for calibration either are unsuitable for this task or require substantial manual intervention. A new geometric mask with high thermal contrast and not requiring a flood lamp is presented as an alternative calibration pattern. Calibration points on the pattern are then accurately located using a clustering-based algorithm which utilizes the maximally stable extremal region detector. This algorithm is integrated into an automatic end-to-end system for calibrating single or multiple cameras. The evaluation shows that using the proposed mask achieves a mean reprojection error up to 78% lower than that using a heated chessboard. The effectiveness of the approach is further demonstrated by using it to calibrate two multiple-camera multiple-modality setups. Source code and binaries for the developed software are provided on the project Web site.
Abstract:
Volume measurements are useful in many branches of science and medicine. They are usually accomplished by acquiring a sequence of cross-sectional images through the object using an appropriate scanning modality, for example x-ray computed tomography (CT), magnetic resonance (MR) or ultrasound (US). In the cases of CT and MR, a dividing cubes algorithm can be used to describe the surface as a triangle mesh. However, such algorithms are not suitable for US data, especially when the image sequence is multiplanar (as it usually is). This problem may be overcome by manually tracing regions of interest (ROIs) on the registered multiplanar images and connecting the points into a triangular mesh. In this paper we describe and evaluate a new discrete form of Gauss’ theorem which enables the calculation of the volume of any enclosed surface described by a triangular mesh. The volume is calculated by summing the vector product of the centroid, area and normal of each surface triangle. The algorithm was tested on computer-generated objects, US-scanned balloons, livers and kidneys and CT-scanned clay rocks. The results, expressed as the mean percentage difference ± one standard deviation, were 1.2 ± 2.3, 5.5 ± 4.7, 3.0 ± 3.2 and −1.2 ± 3.2% for balloons, livers, kidneys and rocks respectively. The results compare favourably with other volume estimation methods such as planimetry and tetrahedral decomposition.
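One common reading of a discrete Gauss (divergence) theorem for a closed triangle mesh is V = (1/3) Σ A_i (c_i · n_i), where c_i is the centroid, A_i the area and n_i the outward unit normal of triangle i. The sketch below implements that reading; it is an assumption about the formula rather than the paper's exact formulation, and the helper names are hypothetical. It assumes the mesh is closed and every triangle lists its vertices counter-clockwise when seen from outside.

```python
def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def mesh_volume(triangles):
    """Volume enclosed by a closed, outward-oriented triangle mesh.

    triangles: iterable of (a, b, c), each vertex an (x, y, z) tuple.
    """
    total = 0.0
    for a, b, c in triangles:
        centroid = tuple((a[i] + b[i] + c[i]) / 3.0 for i in range(3))
        # The cross product of two edges equals 2 * area * unit normal.
        area_normal = _cross(_sub(b, a), _sub(c, a))
        total += _dot(centroid, area_normal)
    # Factor 1/3 from the theorem and 1/2 from the edge cross product.
    return total / 6.0
```

For example, the tetrahedron with vertices at the origin and at unit distance along each axis has volume 1/6, and the result does not change if the whole mesh is translated, because the area-weighted normals of a closed surface sum to zero.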
Abstract:
The chief challenge facing persistent robotic navigation using vision sensors is the recognition of previously visited locations under different lighting conditions. The majority of successful approaches to outdoor robot navigation use active sensors such as LIDAR, but the associated weight and power draw of these systems make them unsuitable for widespread deployment on mobile robots. In this paper we investigate methods to combine representations for visible and long-wave infrared (LWIR) thermal images with time information to combat the time-of-day-based limitations of each sensing modality. We calculate appearance-based match likelihoods using the state-of-the-art FAB-MAP [1] algorithm to analyse loop closure detection reliability across different times of day. We present preliminary results on a dataset of 10 successive traverses of a combined urban-parkland environment, recorded at 2-hour intervals from before dawn to after dusk. Improved location recognition throughout an entire day is demonstrated using the combined system compared with methods which use visible or thermal sensing alone.
Abstract:
Normal thoracic kyphosis Cobb angle for T5-T12 is most commonly reported as a range of 20-40° [1]. Patients with adolescent idiopathic scoliosis (AIS) exhibit a reduced thoracic kyphosis or hypokyphosis [2] accompanying the coronal and rotary distortion components. As a result, surgical restoration of the thoracic kyphosis while maintaining lumbar lordosis and overall sagittal balance is a critical aspect of achieving good clinical outcomes in AIS patients. Previous studies report an increase in thoracic kyphosis after anterior surgical approaches [3] and a flattening of sagittal contours following posterior approaches [4]. Difficulties with measuring sagittal parameters on radiographs are avoided with reformatted sagittal CT reconstructions due to the superior endplate clarity afforded by this imaging modality and are the subject of analysis in this study.
Abstract:
Background. Previous studies report an increase in thoracic kyphosis after anterior approaches and a flattening of sagittal contours following posterior approaches. Difficulties with measuring sagittal parameters on radiographs are avoided with reformatted sagittal CT reconstructions due to the superior endplate clarity afforded by this imaging modality. Methods. A prospective study of 30 Lenke 1 adolescent idiopathic scoliosis (AIS) patients receiving selective thoracoscopic anterior spinal fusion (TASF) was performed. Participants had ethically approved low-dose CT scans at minimum 24 months after surgery in addition to their standard care following surgery. The change in sagittal contours on supine CT was compared to standing radiographic measurements of the same patients and with previous studies. Inter-observer variability was assessed as well as whether hypokyphotic and normokyphotic patient groups responded differently to the thoracoscopic anterior approach. Results. Mean T5-12 kyphosis Cobb angle increased by 11.8 degrees and lumbar lordosis increased by 5.9 degrees on standing radiographs two years after surgery. By comparison, CT measurements of kyphosis and lordosis increased by 12.3 degrees and 7.0 degrees respectively. 95% confidence intervals for inter-observer variability of sagittal contour measurements on supine CT ranged from 5 to 8 degrees. TASF had a slightly greater corrective effect on patients who were hypokyphotic before surgery compared with those who were normokyphotic. Conclusions. Restoration of sagittal profile is an important goal of scoliosis surgery, but reliable measurement with radiographs suffers from poor endplate clarity. TASF significantly improves thoracic kyphosis and lumbar lordosis while preserving proximal and distal junctional alignment in thoracic AIS patients. Supine CT allows greater endplate clarity for sagittal Cobb measurements and linear relationships were found between supine CT and standing radiographic measurements. In this study, improvements in sagittal kyphosis and lordosis following surgery were in agreement with prior anterior surgery studies, and add to the current evidence suggesting that anterior correction is more capable than posterior approaches of addressing the sagittal component of both the instrumented and adjacent non-instrumented segments following surgical correction of progressive Lenke 1 idiopathic scoliosis.
Abstract:
Thermal-infrared imagery is relatively robust to many of the failure conditions of visual and laser-based SLAM systems, such as fog, dust and smoke. The ability to use thermal-infrared video for localization is therefore highly appealing for many applications. However, operating in thermal-infrared is beyond the capacity of existing SLAM implementations. This paper presents the first known monocular SLAM system designed and tested for hand-held use in the thermal-infrared modality. The implementation includes a flexible feature detection layer able to achieve robust feature tracking in high-noise, low-texture thermal images. A novel approach for structure initialization is also presented. The system is robust to irregular motion and capable of handling the unique mechanical shutter interruptions common to thermal-infrared cameras. The evaluation demonstrates promising performance of the algorithm in several environments.