918 results for Supervised classifiers
Abstract:
Reliability of the performance of biometric identity verification systems remains a significant challenge. Individual biometric samples of the same person (identity class) are not identical at each presentation, and performance degradation arises from intra-class variability and inter-class similarity. These limitations lead to false accepts and false rejects that are dependent; it is therefore difficult to reduce the rate of one type of error without increasing the other. The focus of this dissertation is to investigate a method based on classifier fusion techniques to better control the trade-off between the verification errors, using text-dependent speaker verification as the test platform. A sequential classifier fusion architecture that integrates multi-instance and multi-sample fusion schemes is proposed. This fusion method enables a controlled trade-off between false alarms and false rejects. For statistically independent classifier decisions, analytical expressions for each type of verification error are derived using base classifier performances. As this assumption may not always be valid, these expressions are modified to incorporate the correlation between statistically dependent decisions from clients and impostors. The architecture is empirically evaluated on text-dependent speaker verification, using Hidden Markov Model based digit-dependent speaker models in each stage, with multiple attempts for each digit utterance. The trade-off between the verification errors is controlled using two parameters, the number of decision stages (instances) and the number of attempts at each decision stage (samples), fine-tuned on an evaluation/tuning set. The derived expressions for the error estimates are statistically validated on test data. The performance of the sequential method is further demonstrated to depend on the order of the combination of digits (instances) and the nature of repetitive attempts (samples).
The false rejection and false acceptance rates for the proposed fusion are estimated using the base classifier performances, the variance in correlation between classifier decisions, and the sequence of classifiers with favourable dependence selected using the 'Sequential Error Ratio' criterion. The error rates are better estimated by incorporating user-dependent information (such as speaker-dependent thresholds and speaker-specific digit combinations) and class-dependent information (such as client-impostor dependent favourable combinations and class-error based threshold estimation). The proposed architecture is desirable for most speaker verification applications, such as remote authentication and telephone and internet shopping. The tuning of parameters (the number of instances and samples) serves both the security and user convenience requirements of speaker-specific verification. The architecture investigated here is applicable to verification using other biometric modalities such as handwriting, fingerprints and keystrokes.
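The trade-off described in this abstract can be illustrated with a minimal sketch for the statistically independent case. The abstract does not state the fusion rules, so the following is an assumption for illustration only: a stage accepts if any of its attempts is accepted (OR over samples), and the claimant is accepted only if every stage accepts (AND over instances). The function name `fused_error_rates` and the equal per-classifier error rates are hypothetical, and the dissertation's actual expressions may differ.

```python
def fused_error_rates(far, frr, n_stages, n_attempts):
    """Estimate fused error rates under independence (illustrative assumption).

    far, frr   : false accept / false reject rate of one base classifier decision
    n_stages   : number of decision stages (instances), combined with AND
    n_attempts : number of attempts per stage (samples), combined with OR
    """
    # A stage falsely accepts if at least one attempt is falsely accepted.
    stage_far = 1.0 - (1.0 - far) ** n_attempts
    # A stage falsely rejects only if every attempt is falsely rejected.
    stage_frr = frr ** n_attempts
    # Overall acceptance requires all stages to accept.
    total_far = stage_far ** n_stages
    total_frr = 1.0 - (1.0 - stage_frr) ** n_stages
    return total_far, total_frr


if __name__ == "__main__":
    for stages, attempts in [(1, 1), (2, 1), (1, 2), (3, 2)]:
        fa, fr = fused_error_rates(0.1, 0.1, stages, attempts)
        print(f"stages={stages} attempts={attempts} FAR={fa:.4f} FRR={fr:.4f}")
```

Under these assumed rules, adding stages lowers the false accept rate at the cost of more false rejects, while adding attempts does the opposite, which is the kind of controllable trade-off the abstract refers to.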
Abstract:
Graduated licensing has been identified as the most promising approach to reducing the crash risk of novice drivers. However, research suggests that the effectiveness of graduated licensing appears to differ between urban and rural novice drivers and according to race or ethnicity. Extensive supervised driving practice as a learner driver is an important component of graduated licensing systems in Australia and many other countries. Earlier CARRS-Q research identified that falsification of logbooks was more common among particular demographic groups. The factors underlying this are not well understood. It is unclear whether this reflects a lack of understanding of the importance of supervised practice (given that it is not a licensing requirement in many countries of origin), or it reflects lack of access to vehicles and supervising drivers, or whether there is less respect for driver licensing requirements among some groups. It is possible that the importance of these factors may differ across ethnic groups, depending on socioeconomic factors and cultural attitudes to road safety. In an attempt to better understand these issues, this study presents some preliminary results of focus groups examining the experience of the Queensland Graduated Driver Licensing System by Korean-Australian novice drivers and their parents.
Abstract:
BACKGROUND/OBJECTIVES A decline in resting energy expenditure (REE) beyond that predicted from changes in body composition has been noted following dietary-induced weight loss. However, it is unknown whether a compensatory downregulation in REE also accompanies exercise (EX)-induced weight loss, or whether this adaptive metabolic response influences energy intake (EI). SUBJECTS/METHODS Thirty overweight and obese women (body mass index (BMI)=30.6±3.6 kg/m²) completed 12 weeks of supervised aerobic EX. Body composition, metabolism, EI and metabolic-related hormones were measured at baseline, week 6 and post intervention. The metabolic adaptation (MA), that is, the difference between predicted and measured REE, was also calculated post intervention (MApost), with REE predicted using a regression equation generated in an independent sample of 66 overweight and obese women (BMI=31.0±3.9 kg/m²). RESULTS Although mean predicted and measured REE did not differ post intervention, 43% of participants experienced a greater-than-expected decline in REE (−102.9±77.5 kcal per day). MApost was associated with the change in leptin (r=0.47; P=0.04), and the change in resting fat (r=0.52; P=0.01) and carbohydrate oxidation (r=−0.44; P=0.02). Furthermore, MApost was also associated with the change in EI following EX (r=−0.44; P=0.01). CONCLUSIONS Marked variability existed in the adaptive metabolic response to EX. Importantly, those who experienced a downregulation in REE also experienced an upregulation in EI, indicating that the adaptive metabolic response to EX influences both physiological and behavioural components of energy balance.
Abstract:
The detection and correction of defects remains among the most time consuming and expensive aspects of software development. Extensive automated testing and code inspections may mitigate their effect, but some code fragments are necessarily more likely to be faulty than others, and automated identification of fault prone modules helps to focus testing and inspections, thus limiting wasted effort and potentially improving detection rates. However, software metrics data is often extremely noisy, with enormous imbalances in the size of the positive and negative classes. In this work, we present a new approach to predictive modelling of fault proneness in software modules, introducing a new feature representation to overcome some of these issues. This rank sum representation offers improved or at worst comparable performance to earlier approaches for standard data sets, and readily allows the user to choose an appropriate trade-off between precision and recall to optimise inspection effort to suit different testing environments. The method is evaluated using the NASA Metrics Data Program (MDP) data sets, and performance is compared with existing studies based on the Support Vector Machine (SVM) and Naïve Bayes (NB) Classifiers, and with our own comprehensive evaluation of these methods.
Abstract:
Field robots often rely on laser range finders (LRFs) to detect obstacles and navigate autonomously. Despite recent progress in sensing technology and perception algorithms, adverse environmental conditions, such as the presence of smoke, remain a challenging issue for these robots. In this paper, we investigate the possibility to improve laser-based perception applications by anticipating situations when laser data are affected by smoke, using supervised learning and state-of-the-art visual image quality analysis. We propose to train a k-nearest-neighbour (kNN) classifier to recognise situations where a laser scan is likely to be affected by smoke, based on visual data quality features. This method is evaluated experimentally using a mobile robot equipped with LRFs and a visual camera. The strengths and limitations of the technique are identified and discussed, and we show that the method is beneficial if conservative decisions are the most appropriate.
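As a rough illustration of the kNN idea in this abstract, the sketch below classifies a camera frame as "smoke" or "clear" from two visual quality features. The feature choice (contrast and sharpness scores), the toy training values, and the names `knn_smoke_fraction` and `is_smoke` are hypothetical and not taken from the paper. The `min_fraction` parameter is an assumed way of making the decision more conservative by flagging smoke even when only a minority of neighbours agree.

```python
import math


def knn_smoke_fraction(train, query, k=3):
    """Return the fraction of the k nearest training samples labelled as smoke.

    train : list of (feature_tuple, is_smoke) pairs
    query : feature tuple for the frame to classify
    """
    # Sort training samples by Euclidean distance to the query and keep k.
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    return sum(1 for _, label in nearest if label) / k


def is_smoke(train, query, k=3, min_fraction=0.5):
    """Flag smoke when at least min_fraction of the neighbours are smoke."""
    return knn_smoke_fraction(train, query, k) >= min_fraction


# Toy data: (contrast, sharpness) scores in [0, 1]; values are made up.
TRAIN = [
    ((0.90, 0.80), False),
    ((0.85, 0.90), False),
    ((0.80, 0.70), False),
    ((0.20, 0.30), True),
    ((0.25, 0.20), True),
    ((0.30, 0.35), True),
]

if __name__ == "__main__":
    print(is_smoke(TRAIN, (0.28, 0.30)))  # low-quality frame
    print(is_smoke(TRAIN, (0.88, 0.85)))  # high-quality frame
```

Lowering `min_fraction` trades more false smoke alarms for fewer missed smoke events, which is one way to express a preference for conservative decisions.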
Abstract:
The vast majority of current robot mapping and navigation systems require specific well-characterized sensors that may require human-supervised calibration and are applicable only in one type of environment. Furthermore, if a sensor degrades in performance, either through damage to itself or changes in environmental conditions, the effect on the mapping system is usually catastrophic. In contrast, the natural world presents robust, reasonably well-characterized solutions to these problems. Using simple movement behaviors and neural learning mechanisms, rats calibrate their sensors for mapping and navigation in an incredibly diverse range of environments and then go on to adapt to sensor damage and changes in the environment over the course of their lifetimes. In this paper, we introduce similar movement-based autonomous calibration techniques that calibrate place recognition and self-motion processes as well as methods for online multisensor weighting and fusion. We present calibration and mapping results from multiple robot platforms and multisensory configurations in an office building, university campus, and forest. With moderate assumptions and almost no prior knowledge of the robot, sensor suite, or environment, the methods enable the bio-inspired RatSLAM system to generate topologically correct maps in the majority of experiments.
Abstract:
The purpose of this study was to contrast the role of parental and non-parental (sibling, other family and non-family) supervisors in the supervision of learner drivers in graduated driver licensing systems. The sample consisted of 522 supervisors from the Australian states of Queensland (n = 204, 39%) and New South Wales (n = 318, 61%). The learner licence requirements in these two states are similar, although learners in Queensland are required to accrue 100 h of supervision in a log book while those in New South Wales are required to accrue 120 h. Approximately 50 per cent of the sample (n = 255) were parents of the learner driver while the remainder of the sample were either siblings (n = 72, 13.8%), other family members (n = 153, 29.3%) or non-family (n = 114, 21.8%). Parents were more likely than siblings, other family or non-family members to be the primary supervisor of the learner driver. Siblings provided fewer hours of practice when compared with other supervisor types while the median and mode suggest that parents provided the most hours of practice to learner drivers. This study demonstrates that non-parental supervisors, such as siblings, other family members and non-family, at least in jurisdictions that require 100 or 120 h of practice, are important in facilitating learner drivers to accumulate sufficient supervised driving practice.
Abstract:
Next Generation Sequencing (NGS) has revolutionised molecular biology, resulting in an explosion of data sets and an increasing role in clinical practice. Such applications necessarily require rapid identification of the organism as a prelude to annotation and further analysis. NGS data consist of a substantial number of short sequence reads, given context through downstream assembly and annotation, a process requiring reads consistent with the assumed species or species group. Highly accurate results have been obtained for restricted sets using SVM classifiers, but such methods are difficult to parallelise and success depends on careful attention to feature selection. This work examines the problem at very large scale, using a mix of synthetic and real data with a view to determining the overall structure of the problem and the effectiveness of parallel ensembles of simpler classifiers (principally random forests) in addressing the challenges of large scale genomics.
Abstract:
Objective Evaluate the effectiveness and robustness of Anonym, a tool for de-identifying free-text health records based on conditional random fields classifiers informed by linguistic and lexical features, as well as features extracted by pattern matching techniques. De-identification of personal health information in electronic health records is essential for the sharing and secondary usage of clinical data. De-identification tools that adapt to different sources of clinical data are attractive as they would require minimal intervention to guarantee high effectiveness. Methods and Materials The effectiveness and robustness of Anonym are evaluated across multiple datasets, including the widely adopted Integrating Biology and the Bedside (i2b2) dataset, used for evaluation in a de-identification challenge. The datasets used here vary in type of health records, source of data, and their quality, with one of the datasets containing optical character recognition errors. Results Anonym identifies and removes up to 96.6% of personal health identifiers (recall) with a precision of up to 98.2% on the i2b2 dataset, outperforming the best system proposed in the i2b2 challenge. The effectiveness of Anonym across datasets is found to depend on the amount of information available for training. Conclusion Findings show that Anonym compares to the best approach from the 2006 i2b2 shared task. It is easy to retrain Anonym with new datasets; if retrained, the system is robust to variations of training size, data type and quality in presence of sufficient training data.
Abstract:
Background Cancer monitoring and prevention relies on the timely notification of cancer cases. However, the abstraction and classification of cancer from the free text of pathology reports and other relevant documents, such as death certificates, are complex and time-consuming activities. Aims In this paper, approaches for the automatic detection of notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries are investigated. Method A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer related codes. Results Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved the best performance, with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs and is the most robust classifier, with a variance of 0.001141, half the variance of the other classifiers. Conclusion The selection of features had the greatest influence on the performance of the classifiers, although the type of classifier employed also affects performance. In contrast, the feature weighting schema had a negligible effect on performance.
Specifically, it is found that stemmed tokens, with or without SNOMED CT concepts, form the most effective feature set when combined with an SVM classifier.
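The F-measure and false negative rate reported in the abstract above follow from standard confusion-matrix counts. The sketch below shows the usual definitions (the balanced F1 form of the F-measure is assumed, since the abstract does not state the weighting). The function name `classification_scores` and the example counts are hypothetical and are not the study's data.

```python
def classification_scores(tp, fp, fn):
    """Compute precision, recall, F-measure (F1) and false negative rate.

    tp : notifiable cancer certificates correctly flagged
    fp : certificates flagged incorrectly
    fn : notifiable cancer certificates that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        f_measure = 0.0
    else:
        # Harmonic mean of precision and recall.
        f_measure = 2 * precision * recall / (precision + recall)
    # Share of true cancer cases that the classifier missed (1 - recall).
    false_negative_rate = fn / (tp + fn) if (tp + fn) else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "false_negative_rate": false_negative_rate,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(classification_scores(tp=90, fp=10, fn=10))
```

For cancer notification, the false negative rate is the figure to watch, because a missed case is a notifiable cancer that never reaches the registry.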
Abstract:
Objective While many jurisdictions internationally now require learner drivers to complete a specified number of hours of supervised driving practice before being able to drive unaccompanied, very few require learner drivers to complete a log book to record this practice and then present it to the licensing authority. Learner drivers in most Australian jurisdictions must complete a log book that records their practice thereby confirming to the licensing authority that they have met the mandated hours of practice requirement. These log books facilitate the management and enforcement of minimum supervised hours of driving requirements. Method Parents of learner drivers in two Australian states, Queensland and New South Wales, completed an online survey assessing a range of factors, including their perceptions of the accuracy of their child’s learner log book and the effectiveness of the log book system. Results The study indicates that the large majority of parents believe that their child’s learner log book is accurate. However, they generally report that the log book system is only moderately effective as a system to measure the number of hours of supervised practice a learner driver has completed. Conclusions The results of this study suggest the presence of a paradox with many parents possibly believing that others are not as diligent in the use of log books as they are or that the system is too open to misuse. Given that many parents report that their child’s log book is accurate, this study has important implications for the development and ongoing monitoring of hours of practice requirements in graduated driver licensing systems.
Abstract:
A high-level relation between Karl Popper’s ideas on “falsifiability of scientific theories” and the notion of “overfitting” in statistical learning theory can be easily traced. However, it was pointed out that at the level of technical details the two concepts are significantly different. One possible explanation that we suggest is that the process of falsification is an active process, whereas statistical learning theory is mainly concerned with supervised learning, which is a passive process of learning from examples arriving from a stationary distribution. We show that concepts that are closer (although still distant) to Karl Popper’s definitions of falsifiability can be found in the domain of learning using membership queries, and derive relations between Popper’s dimension, exclusion dimension, and the VC-dimension.
Abstract:
Clinical experience, or experience in the ‘real world’ of practice, is a fundamental component of many health professional courses. It often involves students undertaking practical experience in clinical workplace settings, typically referred to as clinical placements, under the supervision of health professionals. Broadly speaking, the role of clinical supervisors, or teachers, is aimed at assisting students to integrate the theoretical and skills based components of the curriculum within the context of patient/client care (Erstzen et al 2009). Clinical experience also provides students with the opportunity to assimilate the attitudes, values and skills which they require to become appropriately skilled professionals in the environments in which they will eventually practise. However, clinical settings are particularly challenging learning environments for students. Unlike classroom learning, students in the clinical setting frequently find themselves involved in unplanned and often complex activities with patients and other health care providers, being supervised by a variety of clinical staff who have very different methods and styles of teaching, and negotiating bureaucratic or hierarchical structures in busy clinical workplaces where they may only be spending a limited amount of time. Kilminster et al (2007) also draw attention to tensions that may exist between the learning needs of students and the provision of quality care or need to prevent harm to the patient (e.g. Elkind et al 2007). All of these factors complicate the realisation of clinical education goals and underscore the need for effective clinical teaching practices that maximise student learning in clinical environments. This report provides a summary of work that has been achieved in relation to ALTC projects and fellowships associated with clinical teaching, and a review of scholarly publications relevant to this field. 
The report also makes recommendations based on issues identified and/or where further work is indicated. The projects and fellowships reviewed cover a range of discipline areas including Biology, Paramedic Practice, Clinical Exercise Physiology, Occupational Therapy, Speech Pathology, Physiotherapy, Pharmacy, Nursing and Veterinary Science. The main areas of focus cover issues related to curriculum, particularly in relation to industry expectations of ‘work-ready’ graduates and the implications for theoretical and practical, or clinical preparation; development of competency assessment tools that are nationally applicable across discipline-specific courses; and improvement of clinical learning through strategies targeting the clinical learning environment, building the teaching capacity of clinical supervisors and/or enhancing the clinical learning/teaching process.
Abstract:
Objectives To review the effects of physical activity on health and behavior outcomes and develop evidence-based recommendations for physical activity in youth. Study design A systematic literature review identified 850 articles; additional papers were identified by the expert panelists. Articles in the identified outcome areas were reviewed, evaluated and summarized by an expert panelist. The strength of the evidence, conclusions, key issues, and gaps in the evidence were abstracted in a standardized format and presented and discussed by panelists and organizational representatives. Results Most intervention studies used supervised programs of moderate to vigorous physical activity of 30 to 45 minutes duration 3 to 5 days per week. The panel believed that a greater amount of physical activity would be necessary to achieve similar beneficial effects on health and behavioral outcomes in ordinary daily circumstances (typically intermittent and unsupervised activity). Conclusion School-age youth should participate daily in 60 minutes or more of moderate to vigorous physical activity that is developmentally appropriate, enjoyable, and involves a variety of activities.
Abstract:
Accurate and detailed measurement of an individual's physical activity is a key requirement for helping researchers understand the relationship between physical activity and health. Accelerometers have become the method of choice for measuring physical activity due to their small size, low cost, convenience and their ability to provide objective information about physical activity. However, interpreting accelerometer data once it has been collected can be challenging. In this work, we applied machine learning algorithms to the task of physical activity recognition from triaxial accelerometer data. We employed a simple but effective approach of dividing the accelerometer data into short non-overlapping windows, converting each window into a feature vector, and treating each feature vector as an i.i.d training instance for a supervised learning algorithm. In addition, we improved on this simple approach with a multi-scale ensemble method that did not need to commit to a single window size and was able to leverage the fact that physical activities produced time series with repetitive patterns and discriminative features for physical activity occurred at different temporal scales.
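The windowing step described in this abstract can be sketched as follows. The abstract does not list the features used, so the per-axis mean and standard deviation here are an assumed, deliberately simple choice, and the function name `window_features` is hypothetical.

```python
from statistics import mean, pstdev


def window_features(samples, window_size):
    """Turn triaxial accelerometer samples into one feature vector per window.

    samples     : list of (x, y, z) readings in time order
    window_size : number of readings per non-overlapping window
    Returns a list of [mean_x, std_x, mean_y, std_y, mean_z, std_z] vectors.
    A trailing partial window is dropped.
    """
    features = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        vector = []
        # zip(*window) regroups the readings into one sequence per axis.
        for axis_values in zip(*window):
            vector.extend([mean(axis_values), pstdev(axis_values)])
        features.append(vector)
    return features


if __name__ == "__main__":
    readings = [(0, 0, 1), (2, 0, 1), (0, 0, 1), (2, 0, 1), (5, 5, 5)]
    for vec in window_features(readings, window_size=2):
        print(vec)
```

Each vector would then be paired with an activity label and passed to a supervised learner as an independent training instance. Calling the function with several `window_size` values is one simple way to obtain features at multiple temporal scales, though the paper's multi-scale ensemble is not specified here.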