863 results for Score metric


Relevance:

10.00%

Publisher:

Abstract:

This paper investigates the use of the FAB-MAP appearance-only SLAM algorithm as a method for performing visual data association for RatSLAM, a semi-metric full SLAM system. While both systems have shown the ability to map large (60-70 km) outdoor locations of approximately the same scale, for either larger areas or across longer time periods both algorithms encounter difficulties with false positive matches. By combining these algorithms using a mapping between appearance and pose space, both false positives and false negatives generated by FAB-MAP are significantly reduced during outdoor mapping using a forward-facing camera. The hybrid FAB-MAP-RatSLAM system developed demonstrates the potential for successful SLAM over long periods of time.

Relevance:

10.00%

Publisher:

Abstract:

This paper presents a vision-based method of vehicle localisation that has been developed and tested on a large forklift-type robotic vehicle which operates in a mainly outdoor industrial setting. The localiser uses a sparse 3D edge map of the environment and a particle filter to estimate the pose of the vehicle. The vehicle operates in dynamic and non-uniform outdoor lighting conditions, an issue that is addressed by using knowledge of the scene to intelligently adjust the camera exposure and hence improve the quality of the information in the image. Results from the industrial vehicle are shown and compared to a separate laser-based localiser which acts as a ground truth. An improved likelihood metric, using per-edge calculation, is presented and has been shown to be 40% more accurate in estimating rotation. Visual localisation results from the vehicle driving an arbitrary 1.5 km path during a bright sunny period show an average position error of 0.44 m and rotation error of 0.62 deg.

Relevance:

10.00%

Publisher:

Abstract:

Background: Older adults may find it problematic to attend hospital appointments due to the difficulty associated with travelling to, within and from a hospital facility for the purpose of a face-to-face assessment. This study aims to investigate equivalence between telephone and face-to-face administration for the Frenchay Activities Index (FAI) and the Euroqol-5D (EQ-5D) generic health-related quality of life instrument amongst an older adult population. Methods: Patients aged >65 (n = 53) who had been discharged to the community following an acute hospital admission underwent telephone administration of the FAI and EQ-5D instruments seven days prior to attending a hospital outpatient appointment where they completed a face-to-face administration of these instruments. Results: Overall, 40 subjects' datasets were complete for both assessments and included in analysis. The FAI items had high levels of agreement between the two modes of administration (item kappas ranged from 0.73 to 1.00) as did the EQ-5D (item kappas ranged from 0.67 to 0.83). For the FAI, EQ-5D VAS and EQ-5D utility score, intraclass correlation coefficients were 0.94, 0.58 and 0.82 respectively, with paired t-tests indicating no significant systematic difference (p = 0.100, p = 0.690 and p = 0.290 respectively). Conclusion: Telephone administration of the FAI and EQ-5D instruments provides comparable results to face-to-face administration amongst older adults deemed to have cognitive functioning intact at a basic level, indicating that this is a suitable alternative approach for collection of this information.
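
The item-level agreement reported in this abstract is a kappa statistic. As a minimal sketch of how Cohen's kappa can be computed for one item rated under two administration modes (the function name and the example ratings are hypothetical and are not data from the study):

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of subjects given the same rating in both modes.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both modes used one identical category throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical ratings of one questionnaire item for ten subjects.
telephone = [0, 1, 2, 2, 1, 0, 3, 3, 1, 2]
face_to_face = [0, 1, 2, 1, 1, 0, 3, 3, 1, 2]
kappa = cohen_kappa(telephone, face_to_face)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it can be close to zero even when many ratings coincide.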

Relevance:

10.00%

Publisher:

Abstract:

Background: The Functional Capacity Index (FCI) was designed to predict physical function 12 months after injury. We report a validation study of the FCI. Methods: This was a consecutive case series registered in the Queensland Trauma Registry who consented to the prospective 12-month telephone-administered follow-up study. FCI scores measured at 12 months were compared with those originally predicted. Results: Complete Abbreviated Injury Scale score information was available for 617 individuals, of whom 587 (95%) could be assigned at least one FCI score (range, 1-17). Agreement between the largest predicted FCI and observed FCI score was poor (κ = 0.05; 95% confidence interval, 0.00 to 0.10) and explained only 1% of the variability in observed FCI. Using an encompassing model that included all FCI assignments, agreement remained poor (κ = 0.05; 95% confidence interval, -0.02 to 0.12), and the model explained only 9% of the variability in observed FCI. Conclusion: The predicted functional capacity poorly agrees with actual functional outcomes. Further research should consider including other (noninjury) explanatory factors in predicting FCI at 12 months.

Relevance:

10.00%

Publisher:

Abstract:

At a time when global uncertainty is paramount and when a new form or re-form of curriculum is emerging (with content displaced by skills, and knowledge acquisition by learning), assessment, too, begins to take on a new form or re-form. The focus for assessment has shifted to that which engages and promotes learning as a process, rather than assessment that focuses solely on measuring and reporting learning as product or score. The use of the portfolio for assessment offers the potential for the process and progress integral to learning to be included.

Relevance:

10.00%

Publisher:

Abstract:

Objectives: To evaluate the validity, reliability and responsiveness of EDC using the WOMAC® NRS 3.1 Index on Motorola V3 mobile phones. Methods: Patients with osteoarthritis (OA) undergoing primary unilateral hip or knee joint replacement surgery were assessed pre-operatively and 3-4 months post-operatively. Patients completed the WOMAC® Index in paper (p-WOMAC®) and electronic (m-WOMAC®) format in random order. Results: 24 men and 38 women with hip and knee OA participated and successfully completed the m-WOMAC® questionnaire. Pearson correlations between the summated total index scores for the p-WOMAC® and m-WOMAC® pre- and post-surgery were 0.98 and 0.99 (p<0.0001). There was no clinically important or statistically significant between-method difference in the adjusted total summated scores, pre- and post-surgery (adjusted mean difference = 4.44, p = 0.474 and 1.73, p = 0.781). Internal consistency estimates of m-WOMAC® reliability were 0.87 to 0.98. The m-WOMAC® detected clinically important, statistically significant (p<0.0001) improvements in pain, stiffness, function and total index score. Conclusions: Sixty-two patients with hip and knee OA successfully completed EDC by Motorola V3 mobile phone using the m-WOMAC® NRS 3.1 Index, with completion times averaging only 1-1.5 minutes longer than for the p-WOMAC® Index. Data were successfully and securely transmitted from patients in Australia to a server in the USA. There was close agreement and no significant differences between m-WOMAC® and p-WOMAC® scores. This study confirms the validity, reliability and responsiveness of the Exco InTouch engineered, Java-based m-WOMAC® Index application. EDC with the m-WOMAC® Index provides unique opportunities for using quantitative measurement in clinical research and practice.

Relevance:

10.00%

Publisher:

Abstract:

Emerging data streaming applications in Wireless Sensor Networks require reliable and energy-efficient Transport Protocols. Our recent Wireless Sensor Network deployment in the Burdekin delta, Australia, for water monitoring [T. Le Dinh, W. Hu, P. Sikka, P. Corke, L. Overs, S. Brosnan, Design and deployment of a remote robust sensor network: experiences from an outdoor water quality monitoring network, in: Second IEEE Workshop on Practical Issues in Building Sensor Network Applications (SenseApp 2007), Dublin, Ireland, 2007] is one such example. This application involves periodically streaming sensed data such as pressure, water flow rate, and salinity from many scattered sensors to the sink node, which in turn relays them via an IP network to a remote site for archiving, processing, and presentation. While latency is not a primary concern in this class of application (the sampling rate is usually in terms of minutes or hours), energy-efficiency is. Continuous long-term operation and reliable delivery of the sensed data to the sink are also desirable. This paper proposes ERTP, an Energy-efficient and Reliable Transport Protocol for Wireless Sensor Networks. ERTP is designed for data streaming applications, in which sensor readings are transmitted from one or more sensor sources to a base station (or sink). ERTP uses a statistical reliability metric which ensures the number of data packets delivered to the sink exceeds the defined threshold. Our extensive discrete event simulations and experimental evaluations show that ERTP is significantly more energy-efficient than current approaches, reducing energy consumption by more than 45%. Consequently, sensor nodes are more energy-efficient and the lifespan of the unattended WSN is increased.

Relevance:

10.00%

Publisher:

Abstract:

Objective: The Brief Michigan Alcoholism Screening Test (bMAST) is a 10-item test derived from the 25-item Michigan Alcoholism Screening Test (MAST). It is widely used in the assessment of alcohol dependence. In the absence of previous validation studies, the principal aim of this study was to assess the validity and reliability of the bMAST as a measure of the severity of problem drinking. Method: A total of 6,594 patients (4,854 men, 1,740 women) who had been referred for alcohol-use disorders to a hospital alcohol and drug service voluntarily participated in this study. Results: An exploratory factor analysis defined a two-factor solution, consisting of Perception of Current Drinking and Drinking Consequences factors. Structural equation modeling confirmed that the fit of a nine-item, two-factor model was superior to the original one-factor model. Concurrent validity was assessed through simultaneous administration of the Alcohol Use Disorders Identification Test (AUDIT) and associations with alcohol consumption and clinically assessed features of alcohol dependence. The two-factor bMAST model showed moderate correlations with the AUDIT. The two-factor bMAST and AUDIT were similarly associated with quantity of alcohol consumption and clinically assessed dependence severity features. No differences were observed between the existing weighted scoring system and the proposed simple scoring system. Conclusions: In this study, both the existing bMAST total score and the two-factor model identified were as effective as the AUDIT in assessing problem drinking severity. There are additional advantages of employing the two-factor bMAST in the assessment and treatment planning of patients seeking treatment for alcohol-use disorders. (J. Stud. Alcohol Drugs 68: 771-779, 2007)

Relevance:

10.00%

Publisher:

Abstract:

Background: Malnutrition is common among dialysis patients and is associated with an adverse outcome. One cause of this is a persistent reduction in nutrient intake, suggesting an abnormality of appetite regulation. Methods: We used a novel technique to describe the appetite profile in 46 haemodialysis (HD) patients and 40 healthy controls. The Electronic Appetite Rating System (EARS) employs a palmtop computer to collect hourly ratings of motivation to eat and mood. We collected data on hunger, desire to eat, fullness, and tiredness. HD subjects were monitored on the dialysis day and the interdialytic day. Controls were monitored for 1 or 2 days. Results: Temporal profiles of motivation to eat for the controls were similar on both days. Temporal profiles of motivation to eat for the HD group were lower on the dialysis day. Mean HD scores were not significantly different from controls. Dietary records indicated that dialysis patients consumed less food than controls. Conclusions: Our data indicate that the EARS can be used to monitor subjective appetite states continuously in a group of HD patients. An HD session reduces hunger and desire to eat. Patients feel more tired after dialysis. This does not correlate with their hunger score, but does correlate with their fullness rating. Nutrient intake is reduced, suggesting a resetting of appetite control for the HD group. The EARS may be useful for intervention studies.

Relevance:

10.00%

Publisher:

Abstract:

The high morbidity and mortality associated with atherosclerotic coronary vascular disease (CVD) and its complications are being lessened by the increased knowledge of risk factors, effective preventative measures and proven therapeutic interventions. However, significant CVD morbidity remains and sudden cardiac death continues to be a presenting feature for some subsequently diagnosed with CVD. Coronary vascular disease is also the leading cause of anaesthesia related complications. Stress electrocardiography/exercise testing is predictive of 10 year risk of CVD events and the cardiovascular variables used to score this test are monitored peri-operatively. Similar physiological time-series datasets are being subjected to data mining methods for the prediction of medical diagnoses and outcomes. This study aims to find predictors of CVD using anaesthesia time-series data and patient risk factor data. Several pre-processing and predictive data mining methods are applied to this data. Physiological time-series data related to anaesthetic procedures are subjected to pre-processing methods for removal of outliers, calculation of moving averages as well as data summarisation and data abstraction methods. Feature selection methods of both wrapper and filter types are applied to derived physiological time-series variable sets alone and to the same variables combined with risk factor variables. The ability of these methods to identify subsets of highly correlated but non-redundant variables is assessed. The major dataset is derived from the entire anaesthesia population and subsets of this population are considered to be at increased anaesthesia risk based on their need for more intensive monitoring (invasive haemodynamic monitoring and additional ECG leads). 
Because of the unbalanced class distribution in the data, majority class under-sampling and the Kappa statistic, together with misclassification rate and area under the ROC curve (AUC), are used for evaluation of models generated using different prediction algorithms. The performance of models derived from feature-reduced datasets reveals the filter method, Cfs subset evaluation, to be most consistently effective, although Consistency-derived subsets tended to slightly increase accuracy but markedly increase complexity. The use of misclassification rate (MR) for model performance evaluation is influenced by class distribution. This influence could be eliminated by consideration of the AUC or Kappa statistic, as well as by evaluation of subsets with an under-sampled majority class. The noise and outlier removal pre-processing methods produced models with MR ranging from 10.69 to 12.62, with the lowest value being for data from which both outliers and noise were removed (MR 10.69). For the raw time-series dataset, MR is 12.34. Feature selection results in reduction in MR to 9.8 to 10.16, with time segmented summary data (dataset F) MR being 9.8 and raw time-series summary data (dataset A) being 9.92. However, for all time-series only based datasets, the complexity is high. For most pre-processing methods, Cfs could identify a subset of correlated and non-redundant variables from the time-series alone datasets, but models derived from these subsets are of one leaf only. MR values are consistent with class distribution in the subset folds evaluated in the n-fold cross-validation method. For models based on Cfs selected time-series derived and risk factor (RF) variables, the MR ranges from 8.83 to 10.36, with dataset RF_A (raw time-series data and RF) being 8.85 and dataset RF_F (time segmented time-series variables and RF) being 9.09. 
The models based on counts of outliers and counts of data points outside normal range (Dataset RF_E) and derived variables based on time series transformed using Symbolic Aggregate Approximation (SAX) with associated time-series pattern cluster membership (Dataset RF_G) perform the least well, with MR of 10.25 and 10.36 respectively. For coronary vascular disease prediction, nearest neighbour (NNge) and the support vector machine based method, SMO, have the highest MR of 10.1 and 10.28, while logistic regression (LR) and the decision tree (DT) method, J48, have MR of 8.85 and 9.0 respectively. DT rules are most comprehensible and clinically relevant. The predictive accuracy increase achieved by addition of risk factor variables to time-series variable based models is significant. The addition of time-series derived variables to models based on risk factor variables alone is associated with a trend to improved performance. Data mining of feature-reduced anaesthesia time-series variables together with risk factor variables can produce compact and moderately accurate models able to predict coronary vascular disease. Decision tree analysis of time-series data combined with risk factor variables yields rules which are more accurate than models based on time-series data alone. The limited additional value provided by electrocardiographic variables when compared to use of risk factors alone is similar to recent suggestions that exercise electrocardiography (exECG) under standardised conditions has limited additional diagnostic value over risk factor analysis and symptom pattern. The pre-processing used in this study had limited effect when time-series variables and risk factor variables are used as model input. 
In the absence of risk factor input, the use of time-series variables after outlier removal and time-series variables based on physiological variable values being outside the accepted normal range is associated with some improvement in model performance.
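
The remark that one-leaf models give MR values consistent with the class distribution can be illustrated with a small sketch. It assumes a one-leaf tree simply predicts the majority class for every case; the labels, function names and the 10% positive rate are hypothetical and are not the thesis data.

```python
import random
from collections import Counter


def misclassification_rate(y_true, y_pred):
    """Fraction of cases whose predicted label differs from the true label."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def one_leaf_predictions(y_true):
    """A one-leaf tree predicts the majority class for every case."""
    majority_label = Counter(y_true).most_common(1)[0][0]
    return [majority_label] * len(y_true)


def undersample_majority(y_true, seed=0):
    """Randomly drop cases until every class has as many cases as the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y_true)
    minority_size = min(counts.values())
    kept = []
    for label in counts:
        indices = [i for i, y in enumerate(y_true) if y == label]
        kept.extend(rng.sample(indices, minority_size))
    return [y_true[i] for i in sorted(kept)]


# Hypothetical labels: 10 positive cases, 90 negative cases.
labels = [1] * 10 + [0] * 90
mr_full = misclassification_rate(labels, one_leaf_predictions(labels))
balanced = undersample_majority(labels)
mr_balanced = misclassification_rate(balanced, one_leaf_predictions(balanced))
```

On the unbalanced labels the trivial model's MR equals the minority-class fraction, so it looks acceptable. After under-sampling the same model is wrong half the time, which is the kind of distortion the abstract says under-sampling, AUC and Kappa are used to avoid.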

Relevance:

10.00%

Publisher:

Abstract:

Transport regulators consider that, with respect to pavement damage, heavy vehicles (HVs) are the riskiest vehicles on the road network. That HV suspension design contributes to road and bridge damage has been recognised for some decades. This thesis deals with some aspects of HV suspension characteristics, particularly (but not exclusively) air suspensions. This is in the areas of developing low-cost in-service HV suspension testing, the effects of larger-than-industry-standard longitudinal air lines and the characteristics of on-board mass (OBM) systems for HVs. All these areas, whilst seemingly disparate, seek to inform the management of HVs, reduce their impact on the network asset and/or provide a measurement mechanism for worn HV suspensions. A number of project management groups at the State and National level in Australia have been, and will be, presented with the results of the project that resulted in this thesis. This should serve to inform their activities applicable to this research. A number of HVs were tested for various characteristics. These tests were used to form a number of conclusions about HV suspension behaviours. Wheel forces from road test data were analysed. A “novel roughness” measure was developed and applied to the road test data to determine dynamic load sharing, amongst other research outcomes. Further, it was proposed that this approach could inform future development of pavement models incorporating roughness and peak wheel forces. Left/right variations in wheel forces and wheel force variations for different speeds were also presented. This led on to some conclusions regarding suspension and wheel force frequencies, their transmission to the pavement and repetitive wheel loads in the spatial domain. An improved method of determining dynamic load sharing was developed and presented. It used the correlation coefficient between two elements of a HV to determine dynamic load sharing. 
This was validated against a mature dynamic load-sharing metric, the dynamic load sharing coefficient (de Pont, 1997). This was the first time that the technique of measuring correlation between elements on a HV has been used for a test case vs. a control case for two different sized air lines. That dynamic load sharing was improved at the air springs was shown for the test case of the large longitudinal air lines. The statistically significant improvement in dynamic load sharing at the air springs from larger longitudinal air lines varied from approximately 30 percent to 80 percent. Dynamic load sharing at the wheels was improved only for low air line flow events for the test case of larger longitudinal air lines. Statistically significant improvements to some suspension metrics across the range of test speeds and “novel roughness” values were evident from the use of larger longitudinal air lines, but these were not uniform. Of note were improvements to suspension metrics involving peak dynamic forces ranging from below the error margin to approximately 24 percent. Abstract models of HV suspensions were developed from the results of some of the tests. Those models were used to propose further development of, and future directions of research into, further gains in HV dynamic load sharing. This was from alterations to currently available damping characteristics combined with implementation of large longitudinal air lines. In-service testing of HV suspensions was found to be possible within a documented range from below the error margin to an error of approximately 16 percent. These results were in comparison with either the manufacturer’s certified data or test results replicating the Australian standard for “road-friendly” HV suspensions, Vehicle Standards Bulletin 11. 
OBM accuracy testing and development of tamper evidence from OBM data were detailed for over 2000 individual data points across twelve test and control OBM systems from eight suppliers installed on eleven HVs. The results indicated that 95 percent of contemporary OBM systems available in Australia are accurate to +/- 500 kg. The total variation in OBM linearity, after three outliers in the data were removed, was 0.5 percent. A tamper indicator and other OBM metrics that could be used by jurisdictions to determine tamper events were developed and documented. That OBM systems could be used as one vector for in-service testing of HV suspensions was one of a number of synergies between the seemingly disparate streams of this project.
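
The abstract states that dynamic load sharing was assessed with the correlation coefficient between two elements of a HV. As a minimal sketch of that building block only, the following computes a Pearson correlation between two force series. The function name, the choice of Pearson correlation and the sample values are assumptions; how the thesis maps the coefficient to a load-sharing judgement is not given in the abstract and is left open here.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical force samples (kN) at two elements of the same axle group.
front_element = [40.0, 42.0, 44.0, 41.0, 39.0]
rear_element = [41.0, 43.0, 45.0, 42.0, 40.0]  # same fluctuation, constant offset
r = pearson_correlation(front_element, rear_element)
```

Because the coefficient ignores constant offsets and scale, it captures how closely the two force signals move together rather than how large either force is.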

Relevance:

10.00%

Publisher:

Abstract:

Relatively little information has been reported about foot and ankle problems experienced by nurses, despite anecdotal evidence which suggests they are common ailments. The purpose of this study was to improve knowledge about the prevalence of foot and ankle musculoskeletal disorders (MSDs) and to explore relationships between these MSDs and proposed risk factors. A review of the literature relating to work-related MSDs, MSDs in nursing, foot and lower-limb MSDs, screening for work-related MSDs, foot discomfort, footwear and the prevalence of foot problems in the community was undertaken. Based on the review, theoretical risk factors were proposed that pertained to the individual characteristics of the nurses, their work activity or their work environment. Three studies were then undertaken. A cross-sectional survey of 304 nurses, working in a large tertiary paediatric hospital, established the prevalence of foot and ankle MSDs. The survey collected information about self-reported risk factors of interest. The second study involved the clinical examination of a subgroup of 40 nurses, to examine changes in body discomfort, foot discomfort and postural sway over the course of a single work shift. Objective measurements of additional risk factors, such as individual foot posture (arch index) and the hardness of shoe midsoles, were performed. A final study was used to confirm the test-retest reliability of important aspects of the survey and key clinical measurements. Foot and ankle problems were the most common MSDs experienced by nurses in the preceding seven days (42.7% of nurses). They were the second most common MSDs to cause disability in the last 12 months (17.4% of nurses), and the third most common MSDs experienced by nurses in the last 12 months (54% of nurses). Substantial foot discomfort (Visual Analogue Scale (VAS) score of 50 mm or more) was experienced by 48.5% of nurses at some time in the last 12 months. 
Individual risk factors, such as obesity and the number of self-reported foot conditions (e.g., callouses, curled toes, flat feet), were strongly associated with the likelihood of experiencing foot problems in the last seven days or during the last 12 months. These risk factors showed consistent associations with disabling foot conditions and substantial foot discomfort. Some of these associations were dependent upon work-related risk factors, such as the location within the hospital and the average hours worked per week. Working in the intensive care unit was associated with higher odds of experiencing foot problems within the last seven days, foot problems in the last 12 months and foot problems that impaired activity in the last 12 months. Changes in foot discomfort experienced within a day showed large individual variability. Fifteen of the forty nurses experienced moderate/substantial foot discomfort at the end of their shift (VAS 25+ mm). Analysis of the association between risk factors and moderate/substantial foot discomfort revealed that foot discomfort was less likely for nurses who were older, had greater BMI or had lower foot arches, as indicated by higher arch index scores. The nurses’ postural sway decreased over the course of the work shift, suggesting improved body balance by the end of the day. These findings were unexpected. Further clinical studies examining individual nurses on several work shifts are needed to confirm these results, particularly due to the small sample size and the single measurement occasion. There are more than 280,000 nurses registered to practice in Australia. The nursing workforce is ageing and the prevalence of foot problems will increase. If the prevalence estimates from this study are extrapolated to the profession generally, more than 70,000 hospital nurses have experienced substantial foot discomfort and 25-30,000 hospital nurses have been limited in their activity due to foot problems during the last 12 months. 
Nurses with underlying foot conditions were more likely to report having foot problems at work. Strategies to prevent or manage foot conditions exist and they should be disseminated to nurses. Obesity is a significant risk factor for foot and ankle MSDs and these nurses may need particular assistance to manage foot problems. The risk of foot problems for particular groups of nurses, e.g. obese nurses, may vary depending upon the location within the hospital. Further research is needed to confirm the findings of this study. Similar studies should be conducted in other occupational groups that require workers to stand for prolonged periods.

Relevance:

10.00%

Publisher:

Abstract:

This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples at once, in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these product-code vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach using a statistical model for the vector codebook indices for subsequent lossless compression is introduced. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of low-rate spectral compression methods on the task of automatic speaker identification. 
The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.
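
The coupling of product-code VQ with lossless coding of the codebook indices described in this abstract can be sketched as follows. The sketch assumes split VQ (each sub-vector quantized with its own codebook) as the product-code structure and uses the empirical entropy of an index stream as a stand-in for the thesis's statistical model; the tiny codebooks, the function names and the example values are all hypothetical.

```python
import math
from collections import Counter


def nearest_codeword(vector, codebook):
    """Index of the codeword with the smallest squared Euclidean distance."""
    best_index, best_distance = 0, float("inf")
    for index, codeword in enumerate(codebook):
        distance = sum((v - c) ** 2 for v, c in zip(vector, codeword))
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


def split_vq_encode(vector, codebooks, split_sizes):
    """Quantize each sub-vector with its own codebook and return one index per split."""
    indices, start = [], 0
    for codebook, size in zip(codebooks, split_sizes):
        indices.append(nearest_codeword(vector[start:start + size], codebook))
        start += size
    return indices


def fixed_rate_bits(codebooks):
    """Bits per frame when every index is sent with a fixed-length code."""
    return sum(math.ceil(math.log2(len(codebook))) for codebook in codebooks)


def empirical_entropy_bits(index_stream):
    """Average bits per index achievable by an ideal lossless coder (zeroth order)."""
    counts = Counter(index_stream)
    total = len(index_stream)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical codebooks: a 2-entry codebook for the first two coefficients,
# a 4-entry codebook for the third coefficient.
codebooks = [[[0.0, 0.0], [1.0, 1.0]], [[0.0], [1.0], [2.0], [3.0]]]
frame_indices = split_vq_encode([0.9, 1.2, 2.6], codebooks, split_sizes=[2, 1])
bits = fixed_rate_bits(codebooks)
entropy = empirical_entropy_bits([0, 0, 0, 1])
```

When some indices occur much more often than others, the entropy falls below the fixed-length cost of the index, which is the source of the additional lossless gain the abstract describes.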

Relevance:

10.00%

Publisher:

Abstract:

Speaker verification is the process of verifying the identity of a person by analysing their speech. There are several important applications for automatic speaker verification (ASV) technology including suspect identification, tracking terrorists and detecting a person’s presence at a remote location in the surveillance domain, as well as person authentication for phone banking and credit card transactions in the private sector. Telephones and telephony networks provide a natural medium for these applications. The aim of this work is to improve the usefulness of ASV technology for practical applications in the presence of adverse conditions. In a telephony environment, background noise, handset mismatch, channel distortions, room acoustics and restrictions on the available testing and training data are common sources of errors for ASV systems. Two research themes were pursued to overcome these adverse conditions: modelling mismatch and modelling uncertainty. To address the performance degradation incurred through mismatched conditions, it was proposed to model this mismatch directly. Feature mapping was evaluated for combating handset mismatch and was extended through the use of a blind clustering algorithm to remove the need for accurate handset labels for the training data. Mismatch modelling was then generalised by explicitly modelling the session conditions as a constrained offset of the speaker model means. This session variability modelling approach enabled the modelling of arbitrary sources of mismatch, including handset type, and halved the error rates in many cases. Methods to model the uncertainty in speaker model estimates and verification scores were developed to address the difficulties of limited training and testing data. The Bayes factor was introduced to account for the uncertainty of the speaker model estimates in testing by applying Bayesian theory to the verification criterion, with improved performance in matched conditions. 
Modelling the uncertainty in the verification score itself met with significant success. Estimating a confidence interval for the "true" verification score enabled an order of magnitude reduction in the average quantity of speech required to make a confident verification decision based on a threshold. The confidence measures developed in this work may also have significant applications for forensic speaker verification tasks.
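
The idea of stopping as soon as a confidence interval for the "true" verification score clears the threshold can be sketched as below. This is not the thesis's estimator: it assumes the score is a mean of per-frame scores, treats those frame scores as independent (real speech frames are correlated) and uses a normal approximation with z = 1.96. All names and example values are hypothetical.

```python
import math


def mean_confidence_interval(scores, z=1.96):
    """Normal-approximation confidence interval for the mean of the scores."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least 2 scores to estimate a variance")
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)
    half_width = z * math.sqrt(variance / n)
    return mean - half_width, mean + half_width


def sequential_decision(frame_scores, threshold, z=1.96, min_frames=2):
    """Accept or reject as soon as the interval lies wholly on one side of the threshold."""
    for n in range(min_frames, len(frame_scores) + 1):
        lower, upper = mean_confidence_interval(frame_scores[:n], z)
        if lower > threshold:
            return "accept", n
        if upper < threshold:
            return "reject", n
    return "undecided", len(frame_scores)


# Hypothetical per-frame scores for three trials, with a decision threshold of 0.
clear_target = sequential_decision([1.0, 1.2, 0.8, 1.1, 0.9, 1.0], threshold=0.0)
clear_impostor = sequential_decision([-1.0, -1.2, -0.9, -1.1], threshold=0.0)
ambiguous = sequential_decision([1.0, -1.0, 1.0, -1.0], threshold=0.0)
```

Trials whose scores sit far from the threshold are decided after very few frames, while ambiguous trials consume all available speech, which is how an interval-based stopping rule can reduce the average amount of speech needed.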