918 results for Supervised classifiers
Abstract:
This paper seeks to re-conceptualize the research supervision relationship. The literature has tended to view doctoral study in four ways: (i) as an exercise in self-management; (ii) as a research experience; (iii) as training for research; or (iv) as an instance of student-centred learning. Although each of these approaches has its merits, each also suffers from conceptual weaknesses. This paper seeks to harness the merits, and minimize the disadvantages, by re-conceptualizing doctoral research as a ‘writing journey’. The paper draws on the insights of new rhetoric in linguistic theory to defend a writing-centred conception of supervised research and offers some practical strategies for putting it into effect.
Abstract:
Flos Chrysanthemum is a generic name for a particular group of edible plants that also have medicinal properties. There are, in fact, twenty to thirty different cultivars, which are commonly used in beverages and for medicinal purposes. In this work, four Flos Chrysanthemum cultivars, Hangju, Taiju, Gongju, and Boju, were collected, and chromatographic fingerprints were used to distinguish and assess these cultivars for quality control purposes. Chromatographic fingerprints contain chemical information but also often show baseline drifts and peak shifts, which complicate data processing; adaptive iteratively reweighted penalized least squares and correlation optimized warping were therefore applied to correct the baselines and align the fingerprint peaks. The adjusted data were submitted to unsupervised and supervised pattern recognition methods. Principal component analysis was used to qualitatively differentiate the Flos Chrysanthemum cultivars. Partial least squares, continuum power regression, and K-nearest neighbors were used to predict the unknown samples. Finally, the elliptic joint confidence region method was used to evaluate the prediction ability of these models. The partial least squares and continuum power regression methods were shown to best represent the experimental results.
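A minimal sketch of the pattern-recognition step described above, assuming the fingerprints have already been baseline-corrected and aligned. It uses scikit-learn's PCA, PLS regression on one-hot labels (a common PLS-DA formulation) and K-nearest neighbors, with placeholder data standing in for the real chromatograms; continuum power regression has no off-the-shelf scikit-learn implementation and is omitted.

```python
# Sketch only: X holds baseline-corrected, peak-aligned fingerprints
# (samples x retention-time points); y holds cultivar labels.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelBinarizer

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 500))          # placeholder fingerprints
y = np.repeat(["Hangju", "Taiju", "Gongju", "Boju"], 10)

# Unsupervised exploration: project onto the first two principal components.
scores = PCA(n_components=2).fit_transform(X)

# Supervised prediction: PLS-DA (PLS regression on one-hot labels) and KNN.
lb = LabelBinarizer()
pls = PLSRegression(n_components=5).fit(X, lb.fit_transform(y))
pls_pred = lb.classes_[pls.predict(X).argmax(axis=1)]

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
knn_pred = knn.predict(X)
```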
Abstract:
Debates on gene patents have necessitated the analysis of patents that disclose and reference human sequences. In this study, we built an automated classifier that assigns sequences to one of nine predefined categories according to their functional roles in patent claims, applying natural language processing and supervised learning techniques. To improve its performance, we experimented with various feature mappings, achieving a maximum accuracy of 79%.
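As an illustration only (not the authors' system), the sketch below shows the general shape of such a pipeline: n-gram features extracted from the claim text surrounding each sequence mention, fed to a linear classifier. The category names, example snippets and feature choices are hypothetical.

```python
# Illustrative sketch of claim-text classification with bag-of-words /
# n-gram features and a linear model; not the feature mappings from the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

contexts = ["... a primer comprising SEQ ID NO: 1 for amplifying ...",
            "... a vector encoding the polypeptide of SEQ ID NO: 2 ..."]
roles = ["primer", "encoded_protein"]   # two of nine hypothetical categories

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
    LogisticRegression(max_iter=1000),
)
clf.fit(contexts, roles)
print(clf.predict(["... a probe hybridising to SEQ ID NO: 3 ..."]))
```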
Abstract:
Within online learning communities, receiving timely and meaningful insights into the quality of learning activities is an important part of an effective educational experience. Commonly adopted methods, such as the Community of Inquiry framework, rely on manual coding of online discussion transcripts, which is a costly and time-consuming process. Several efforts are underway to enable the automated classification of online discussion messages using supervised machine learning, which would enable real-time analysis of the interactions occurring within online learning communities. This paper investigates the importance of incorporating features that utilise the structure of online discussions for the classification of "cognitive presence", the central dimension of the Community of Inquiry framework, which focuses on the quality of students' critical thinking within online learning communities. We implemented a Conditional Random Field classification solution that incorporates structural features of the discussions, which may increase classification performance over other implementations. Our approach leads to an improvement in classification accuracy of 5.8% over existing techniques when tested on the same dataset, with a precision and recall of 0.630 and 0.504, respectively.
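A minimal sketch of how structural features can sit alongside content features in a linear-chain CRF over the messages of a thread, using the sklearn-crfsuite package. The specific features (relative position, reply depth) and the tiny example thread are illustrative assumptions, not the paper's feature set.

```python
# Each discussion thread is a sequence of messages; each message gets a
# feature dict mixing bag-of-words content features with structural features.
import sklearn_crfsuite

def message_features(thread, i):
    msg = thread[i]
    feats = {"bow:" + w: 1.0 for w in set(msg["text"].lower().split())}
    feats["position"] = i / max(len(thread) - 1, 1)   # relative position in thread
    feats["reply_depth"] = float(msg["depth"])        # structural feature
    feats["is_first"] = float(i == 0)
    return feats

thread = [{"text": "What does the assignment ask?", "depth": 0},
          {"text": "I think we need to compare the two models", "depth": 1},
          {"text": "Here is my attempt at integrating both ideas", "depth": 2}]
labels = ["triggering_event", "exploration", "integration"]  # cognitive presence phases

X = [[message_features(thread, i) for i in range(len(thread))]]
y = [labels]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X, y)
print(crf.predict(X))
```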
Abstract:
Companies such as NeuroSky and Emotiv Systems are selling non-medical EEG devices for human-computer interaction. These devices are significantly more affordable than their medical counterparts and are mainly used to measure levels of engagement, focus, relaxation and stress. This information is sought after for marketing research and games. However, these EEG devices have the potential to enable users to interact with their surrounding environment using thoughts only, without activating any muscles. In this paper, we present preliminary results demonstrating that, despite reduced voltage and time sensitivity compared to medical-grade EEG systems, the quality of the signals of the Emotiv EPOC neuroheadset is sufficiently good to allow discrimination between imaging events. We collected streams of raw EEG data and trained different types of classifiers to discriminate between three states (rest and two imaging events). We achieved a generalisation error of less than 2% for two types of non-linear classifiers.
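The abstract does not specify the classifiers or features used. As a hedged illustration only, the sketch below computes per-channel band-power features from windowed raw EEG (the EPOC samples 14 channels at 128 Hz) and trains a non-linear RBF-kernel SVM to separate rest from two imagery classes, with synthetic data standing in for the recorded streams.

```python
# Illustrative sketch, not the authors' pipeline: band-power features per
# channel, then a non-linear classifier over the three states.
import numpy as np
from scipy.signal import welch
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

FS = 128  # Emotiv EPOC sampling rate (Hz)

def band_power(window, fs=FS, band=(8, 30)):
    """Mean spectral power in `band` for each channel of a (channels, samples) window."""
    freqs, psd = welch(window, fs=fs, nperseg=fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[:, mask].mean(axis=1)

rng = np.random.default_rng(0)
windows = rng.normal(size=(90, 14, 2 * FS))          # 90 two-second windows, 14 channels
labels = np.repeat(["rest", "imagery_1", "imagery_2"], 30)

X = np.array([band_power(w) for w in windows])
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
print(cross_val_score(clf, X, labels, cv=5).mean())
```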
Abstract:
This study investigated the development and operation of Learner Driver Mentor Programs (LDMPs). LDMPs are used throughout Australia to assist young learner drivers to gain supervised on-road driving experience through coordinated access to vehicles and supervisors. There is a significant lack of research regarding these programs. In this study, 41 stakeholders, including representatives from existing or ceased LDMPs as well as representatives of other groups, completed a questionnaire in either survey or interview format. The questionnaire sought information about the objectives of LDMPs and any social problems that were targeted, as well as the characteristics of an ideal program and what could be done to improve such programs. Stakeholders indicated that LDMPs were targeted at local communities and, therefore, there should be a clear local need for the program as well as community ownership of and involvement in the program. Additionally, the program needed to be accessible and provide clear positive outcomes for mentees. The most common suggestion for improving LDMPs related to the provision of greater funding and sponsorship, particularly in relation to the vehicles used within the programs. LDMPs appear to have an important role in helping young learner drivers acquire the appropriate number of supervised hours of driving practice. However, while a number of factors appear related to a successful program, a program must remain flexible and suitable for its local community. There is a clear need to complete evaluations of existing programs to ensure that future LDMPs and modifications to existing programs are evidence-based.
Abstract:
Learner Driver Mentor Programs (LDMPs) assist disadvantaged learner drivers to gain supervised on-road driving experience by providing access to vehicles and volunteer mentors. In the absence of existing research investigating the implementation of Best Practice principles in LDMPs, this case study examines successful program operation in the context of a rural town setting. The study is based on an existing Best Practice model for LDMPs and triangulation of data from a mentor focus group (n = 7), interviews with program stakeholders (n = 9), and an in-depth interview with the site-based program development officer. The data presented are based upon selected findings of the broader evaluation study. Preliminary findings regarding driving session management, support of mentors and mentees, and building and maintaining relationships with program stakeholders are discussed. Key findings relate to the importance of relationships in engagement with the program and of collaborating across sectors to achieve a range of positive outcomes for learners. The findings highlight the need for the program to be relevant and responsive to the requirements of the population and the context in which it operates.
Abstract:
Although robotics research has seen advances over the last decades, robots are still not in widespread use outside industrial applications. Yet a range of proposed scenarios have robots working together, helping and coexisting with humans in daily life. In all of these scenarios, a clear need arises to deal with a more unstructured, changing environment. I herein present a system that aims to overcome the limitations of highly complex robotic systems in terms of autonomy and adaptation. The main focus of this research is to investigate the use of visual feedback for improving the reaching and grasping capabilities of complex robots. To facilitate this, an integration of computer vision and machine learning techniques is employed. From a robot vision point of view, combining domain knowledge from both image processing and machine learning can expand the capabilities of robots. I present a novel framework called Cartesian Genetic Programming for Image Processing (CGP-IP). CGP-IP can be trained to detect objects in incoming camera streams and has been successfully demonstrated on many different problem domains. The approach is fast, scalable and robust, and requires only small training sets (it was tested with 5 to 10 images per experiment). Additionally, it can generate human-readable programs that can be further customized and tuned. While CGP-IP is a supervised-learning technique, I show an integration on the iCub that allows for the autonomous learning of object detection and identification. Finally, this dissertation includes two proofs of concept that integrate the motion and action sides. First, reactive reaching and grasping is shown: it allows the robot to avoid obstacles detected in the visual stream while reaching for the intended target object. Furthermore, the integration enables the robot to be used in non-static environments, i.e. the reaching is adapted on-the-fly from the visual feedback received, e.g. when an obstacle is moved into the trajectory. The second integration highlights the capabilities of these frameworks by using object manipulation actions to improve visual detection.
Abstract:
The development of techniques for scaling up classifiers so that they can be applied to problems with large datasets of training examples is one of the objectives of data mining. Recently, AdaBoost has become popular among the machine learning community thanks to its promising results across a variety of applications. However, training AdaBoost on large datasets is a major problem, especially when the dimensionality of the data is very high. This paper discusses the effect of high dimensionality on the training process of AdaBoost. Two preprocessing options to reduce dimensionality, namely principal component analysis and random projection, are briefly examined. Random projection subject to a probabilistic length-preserving transformation is explored further as a computationally light preprocessing step. The experimental results obtained demonstrate the effectiveness of the proposed training process for handling high-dimensional large datasets.
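As an illustrative sketch (not the paper's exact setup), random projection can be applied as a light preprocessing step before boosting; the dataset, dimensions and parameters below are placeholders.

```python
# Reduce a high-dimensional dataset with a length-preserving Gaussian random
# projection, then train AdaBoost (default decision-stump base learners)
# on the projected features.  All sizes here are illustrative only.
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5000))                 # synthetic high-dimensional data
y = (X[:, :50].sum(axis=1) > 0).astype(int)       # synthetic labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

proj = GaussianRandomProjection(n_components=300, random_state=0)
X_tr_p, X_te_p = proj.fit_transform(X_tr), proj.transform(X_te)

ada = AdaBoostClassifier(n_estimators=200, random_state=0)
ada.fit(X_tr_p, y_tr)
print(ada.score(X_te_p, y_te))
```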
Abstract:
Consumer risk assessment is a crucial step in the regulatory approval of pesticide use on food crops. Recently, an additional hurdle has been added to the formal consumer risk assessment process with the introduction of short-term intake or exposure assessment and a comparable short-term toxicity reference, the acute reference dose. Exposure to residues during one meal or over one day is what is relevant for short-term or acute intake. Exposure in the short term can be substantially higher than average because the consumption of a food on a single occasion can be very large compared with typical long-term or mean consumption, and the food may have a much larger residue than average. Furthermore, the residue level in a single unit of a fruit or vegetable may be higher than the average residue in the lot by a factor (defined as the variability factor, which we have shown to be typically ×3 for the 97.5th percentile unit). Available marketplace data and supervised residue trial data are examined in an investigation of the variability of residues in units of fruit and vegetables. A method is described for estimating the 97.5th percentile value from sets of unit residue data. Variability appears to be generally independent of the pesticide, the crop, the crop unit size and the residue level. The deposition of pesticide on the individual unit during application is probably the most significant factor. The diets used in the calculations should ideally come from individual and household surveys with enough consumers of each specific food to determine large portion sizes. The diets should distinguish the different forms of a food consumed, e.g. canned, frozen or fresh, because the residue levels associated with the different forms may be quite different. Dietary intakes may be calculated by a deterministic method or a probabilistic method. In the deterministic method, the intake is estimated under the assumptions of large portion consumption of a ‘high residue’ food (high residue in the sense that the pesticide was used at the highest recommended label rate, the crop was harvested at the smallest interval after treatment and the residue in the edible portion was the highest found in any of the supervised trials in line with these use conditions). The deterministic calculation also includes a variability factor for those foods consumed as units (e.g. apples, carrots) to allow for the elevated residue in some single units, which may not be seen in composited samples. In the probabilistic method, the distribution of dietary consumption and the distribution of possible residues are combined in repeated probabilistic calculations to yield a distribution of possible residue intakes. Additional information, such as the percentage of the commodity treated and the combination of residues from multiple commodities, may be incorporated into probabilistic calculations. The IUPAC Advisory Committee on Crop Protection Chemistry has made 11 recommendations relating to acute dietary exposure.
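As a hedged worked example of the deterministic calculation (all numbers invented, not taken from the paper), the short-term intake for a unit crop can be estimated from the large portion size, the highest residue found in supervised trials and the variability factor, then compared with the acute reference dose:

```python
# Illustrative deterministic short-term intake calculation for a unit crop
# such as apples; every input value below is a made-up example.
large_portion_g = 500.0        # 97.5th percentile single-day consumption (g)
unit_weight_g = 200.0          # typical weight of one crop unit (g)
highest_residue = 0.8          # HR from supervised trials (mg/kg)
variability_factor = 3.0       # ~97.5th percentile unit vs composite mean
body_weight_kg = 60.0
acute_reference_dose = 0.05    # ARfD (mg/kg bw/day), illustrative

# One unit is assumed to carry the elevated residue; the remainder of the
# large portion is assumed to carry the composite-level residue.
intake_mg = (unit_weight_g / 1000.0) * highest_residue * variability_factor \
          + ((large_portion_g - unit_weight_g) / 1000.0) * highest_residue
intake_mg_per_kg_bw = intake_mg / body_weight_kg

print(f"Estimated short-term intake: {intake_mg_per_kg_bw:.4f} mg/kg bw/day")
print("Exceeds ARfD" if intake_mg_per_kg_bw > acute_reference_dose else "Below ARfD")
```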
Abstract:
Loads that miss in the L1 or L2 caches and wait for their data at the head of the ROB cause significant slowdowns in the form of commit stalls. We identify that most of these commit stalls are caused by a small set of loads, referred to as LIMCOS (Loads Incurring Majority of COmmit Stalls). We propose simple history-based classifiers that track the commit stalls suffered by loads to help identify this small set of loads. We study an application of these classifiers to prefetching: the classifiers are used to train the prefetcher to focus on the misses suffered by LIMCOS. This approach, referred to as focused prefetching, results in a 9.8% gain in IPC over a naive GHB-based delta correlation prefetcher, along with a 20.3% reduction in memory traffic, for a set of 17 memory-intensive SPEC2000 benchmarks. Another important impact of focused prefetching is a 61% improvement in the accuracy of prefetches. We demonstrate that the proposed classification criterion performs better than existing criteria such as criticality and delinquent loads. We also show that the criterion of focusing on commit stalls is robust across cache levels and can be applied to any prefetcher without modifications to the prefetcher.
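The abstract does not give the classifier's structure; the following conceptual sketch shows one plausible realisation, a table of per-load-PC saturating counters that records commit stalls and gates prefetcher training so that only LIMCOS misses are used. Table size, counter width and threshold are illustrative assumptions, and the prefetcher object is hypothetical.

```python
# Conceptual sketch of a history-based LIMCOS-style classifier.
class CommitStallClassifier:
    def __init__(self, table_size=1024, max_count=15, threshold=8):
        self.counters = [0] * table_size
        self.table_size = table_size
        self.max_count = max_count
        self.threshold = threshold

    def _index(self, load_pc):
        return load_pc % self.table_size

    def record_commit_stall(self, load_pc):
        """Called when a load blocks commit at the head of the ROB."""
        i = self._index(load_pc)
        self.counters[i] = min(self.counters[i] + 1, self.max_count)

    def is_limcos(self, load_pc):
        """Loads with enough recorded stalls are classified as stall-causing."""
        return self.counters[self._index(load_pc)] >= self.threshold

classifier = CommitStallClassifier()

def on_cache_miss(load_pc, miss_addr, prefetcher):
    # Focused prefetching: only misses of LIMCOS loads train the prefetcher.
    if classifier.is_limcos(load_pc):
        prefetcher.train(load_pc, miss_addr)   # `prefetcher` is hypothetical
```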
Abstract:
Gaussian processes (GPs) are promising Bayesian methods for classification and regression problems. Designing a GP classifier and making predictions with it are, however, computationally demanding, especially when the training set size is large. Sparse GP classifiers are known to overcome this limitation. In this letter, we propose and study a validation-based method for sparse GP classifier design. The proposed method uses a negative log predictive (NLP) loss measure, which is easy to compute for GP models. We use this measure for both basis vector selection and hyperparameter adaptation. Experimental results on several real-world benchmark data sets show better or comparable generalization performance over existing methods.
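For reference, a minimal sketch of the NLP loss on a validation set, computed from predictive class probabilities; the helper name and placeholder values are illustrative, and the probabilities would come from the GP classifier's predictive distribution.

```python
# Negative log predictive (NLP) loss: the average negative log of the
# predictive probability assigned to the true class on held-out data.
import numpy as np

def negative_log_predictive(predictive_probs, y_true, eps=1e-12):
    """predictive_probs: (n, n_classes) probabilities; y_true: (n,) class indices."""
    p_true = predictive_probs[np.arange(len(y_true)), y_true]
    return -np.mean(np.log(np.clip(p_true, eps, 1.0)))

# Candidate basis-vector sets / hyperparameters would be compared by this
# validation loss; lower is better.
probs = np.array([[0.9, 0.1], [0.3, 0.7], [0.6, 0.4]])
y = np.array([0, 1, 0])
print(negative_log_predictive(probs, y))
```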
Abstract:
Over 1 billion ornamental fish, comprising more than 4000 freshwater and 1400 marine species, are traded internationally each year, with 8-10 million imported into Australia alone. Compared to other commodities, the pathogens and disease translocation risks associated with this pattern of trade have been poorly documented. The aim of this study was to conduct an appraisal of the effectiveness of risk analysis and quarantine controls as they are applied according to the Sanitary and Phytosanitary (SPS) agreement in Australia. Ornamental fish originate from about 100 countries and hazards are mostly unknown; since 2000 there have been 16-fold fewer scientific publications on ornamental fish disease compared to farmed fish disease, and 470 fewer compared to disease in terrestrial species (cattle). The import quarantine policies of a range of countries were reviewed and classified as stringent or non-stringent based on the levels of pre-border and border controls. Australia has a stringent policy which includes pre-border health certification and a mandatory quarantine period at the border of 1-3 weeks in registered quarantine premises supervised by government quarantine staff. Despite these measures there have been many disease incursions, as well as establishment of significant exotic viral, bacterial, fungal, protozoal and metazoan pathogens from ornamental fish in farmed native Australian fish and free-living introduced species. Recent examples include Megalocytivirus and an atypical strain of Aeromonas salmonicida. In 2006, there were 22 species of alien ornamental fish with established breeding populations in Australian waterways, and freshwater plants and molluscs have also been introduced, providing a direct transmission pathway for establishment of pathogens in native fish species. Australia's stringent quarantine policies for imported ornamental fish are based on import risk analysis under the SPS agreement but have not provided an acceptable level of protection (ALOP) consistent with government objectives to prevent the introduction of pests and diseases, promote the development of future aquaculture industries or maintain biodiversity. It is concluded that the risk analysis process described by the Office International des Epizooties under the SPS agreement cannot be used in a meaningful way for current patterns of ornamental fish trade. Transboundary disease incursions will continue and exotic pathogens will become established in new regions as a result of the ornamental fish trade, and this will be an international phenomenon. Ornamental fish represent a special case in live animal trade where OIE guidelines for risk analysis need to be revised. Alternatively, for countries such as Australia with an implied very high ALOP, the number of species traded and the number of sources permitted need to be dramatically reduced to facilitate hazard identification, risk assessment and import quarantine controls. Lead papers of the eleventh symposium of the International Society for Veterinary Epidemiology and Economics (ISVEE), Cairns, Australia.
Abstract:
With improving survival rates following HSCT in children, QOL and the management of short- and long-term effects need to be considered. Exercise may help mitigate fatigue and declines in fitness and strength. The aims of this study were to assess the feasibility of an inpatient exercise intervention for children undergoing HSCT and to observe the changes in physical and psychological health. Fourteen patients were recruited, with a mean age of 10 yr. A 6MWT, isometric upper- and lower-body strength, balance, fatigue, and QOL were assessed prior to Tx and six wk post-Tx. A supervised exercise program was offered five days per week during the inpatient period, and feasibility was assessed through the uptake rate. The study had 100% program completion and a 60% uptake rate of exercise sessions. The mean (±s.d.) weekly activity was 117.5 (±79.3) minutes. Younger children performed significantly more minutes of exercise than adolescents. At reassessment, strength and fatigue had stabilized, while aerobic fitness and balance had decreased. QOL showed a non-significant trend towards improvement. No exercise-related adverse events were reported. A supervised inpatient exercise program is safe and feasible, with potential physiological and psychosocial benefits.
Abstract:
Objective: Death certificates provide an invaluable source for cancer mortality statistics; however, this value can only be realised if accurate, quantitative data can be extracted from certificates, an aim hampered by both the volume and the variable nature of certificates written in natural language. This paper proposes an automatic classification system for identifying cancer-related causes of death from death certificates. Methods: Detailed features, including terms, n-grams and SNOMED CT concepts, were extracted from a collection of 447,336 death certificates. These features were used to train Support Vector Machine classifiers (one classifier for each cancer type). The classifiers were deployed in a cascaded architecture: the first level identified the presence of cancer (i.e., binary cancer/no cancer) and the second level identified the type of cancer (according to the ICD-10 classification system). A held-out test set was used to evaluate the effectiveness of the classifiers according to precision, recall and F-measure. In addition, detailed feature analysis was performed to reveal the characteristics of a successful cancer classification model. Results: The system was highly effective at identifying cancer as the underlying cause of death (F-measure 0.94). The system was also effective at determining the type of cancer for common cancers (F-measure 0.7). Rare cancers, for which there was little training data, were difficult to classify accurately (F-measure 0.12). Factors influencing performance were the amount of training data and certain ambiguous cancers (e.g., those in the stomach region). The feature analysis revealed that a combination of features was important for cancer type classification, with SNOMED CT concept and oncology-specific morphology features proving the most valuable. Conclusion: The system proposed in this study provides automatic identification and characterisation of cancers from large collections of free-text death certificates. This allows organisations such as Cancer Registries to monitor and report on cancer mortality in a timely and accurate manner. In addition, the methods and findings are generally applicable beyond cancer classification and to other sources of medical text besides death certificates.
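A hedged sketch of the cascaded architecture follows: a first-level binary cancer/no-cancer classifier and a second level assigning an ICD-10 code, here collapsed into a single multiclass classifier for brevity (the paper trains one classifier per cancer type) and using only word n-gram features; SNOMED CT concept features are omitted and all example texts are invented.

```python
# Two-level cascade over free-text death certificates: level 1 detects
# cancer vs. no cancer; level 2 assigns an ICD-10 cancer code.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

certificates = ["metastatic adenocarcinoma of the stomach",
                "ischaemic heart disease, hypertension",
                "carcinoma of the left lung with pleural effusion"]
is_cancer = [1, 0, 1]
cancer_type = ["C16", "C34"]     # ICD-10 codes for the two cancer cases above

level1 = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
level1.fit(certificates, is_cancer)

cancer_only = [t for t, c in zip(certificates, is_cancer) if c]
level2 = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
level2.fit(cancer_only, cancer_type)

def classify(text):
    if level1.predict([text])[0]:
        return level2.predict([text])[0]
    return "no cancer"

print(classify("small cell carcinoma of the lung"))
```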