101 results for ensemble classifiers
Abstract:
Diffusion weighted magnetic resonance imaging is a powerful tool that can be employed to study white matter microstructure by examining the 3D displacement profile of water molecules in brain tissue. By applying diffusion-sensitized gradients along a minimum of six directions, second-order tensors (represented by three-by-three positive definite matrices) can be computed to model dominant diffusion processes. However, conventional diffusion tensor imaging (DTI) is not sufficient to resolve more complicated white matter configurations, e.g., crossing fiber tracts. Recently, a number of high-angular resolution schemes with more than six gradient directions have been employed to address this issue. In this article, we introduce the tensor distribution function (TDF), a probability function defined on the space of symmetric positive definite matrices. Using the calculus of variations, we solve for the TDF that optimally describes the observed data. Here, fiber crossing is modeled as an ensemble of Gaussian diffusion processes with weights specified by the TDF. Once this optimal TDF is determined, the orientation distribution function (ODF) can easily be computed by analytic integration of the resulting displacement probability function. Moreover, a tensor orientation distribution function (TOD) may also be derived from the TDF, allowing for the estimation of principal fiber directions and their corresponding eigenvalues.
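As a rough illustration of the forward model underlying the TDF approach, the sketch below computes the diffusion-weighted signal for a discrete mixture of Gaussian tensors, with the mixture weights playing the role of a discretized TDF. The function name, the discretization and the example values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def multi_tensor_signal(gradients, b, tensors, weights, s0=1.0):
    """Diffusion-weighted signal for a discrete mixture of Gaussian tensors.

    gradients : (N, 3) unit gradient directions
    b         : b-value
    tensors   : (K, 3, 3) symmetric positive definite diffusion tensors
    weights   : (K,) mixture weights (a discretized TDF)
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    # quadratic form g^T D g for every gradient/tensor pair -> (N, K)
    q = np.einsum('ni,kij,nj->nk', gradients, tensors, gradients)
    return s0 * (np.exp(-b * q) @ w)

# Two crossing fiber populations modeled as rotated copies of one tensor.
D = np.diag([1.7e-3, 0.3e-3, 0.3e-3])
R = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # 90 degree rotation
g = np.eye(3)                                   # three example gradient directions
signal = multi_tensor_signal(g, 1000.0, np.stack([D, R @ D @ R.T]), [0.5, 0.5])
```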
Abstract:
Companies such as NeuroSky and Emotiv Systems are selling non-medical EEG devices for human computer interaction. These devices are significantly more affordable than their medical counterparts, and are mainly used to measure levels of engagement, focus, relaxation and stress. This information is sought after for marketing research and games. However, these EEG devices have the potential to enable users to interact with their surrounding environment using thoughts only, without activating any muscles. In this paper, we present preliminary results that demonstrate that despite reduced voltage and time sensitivity compared to medical-grade EEG systems, the signal quality of the Emotiv EPOC neuroheadset is good enough to allow discrimination between imaging events. We collected streams of raw EEG data and trained different types of classifiers to discriminate between three states (rest and two imaging events). We achieved a generalisation error of less than 2% for two types of non-linear classifiers.
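The abstract does not name the non-linear classifiers used; as one plausible sketch, the snippet below trains an RBF-kernel SVM on crude windowed-variance features and estimates a generalisation error by cross-validation. The feature choice, window length and synthetic data are assumptions for illustration only.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def window_features(eeg, win=128):
    """Split a (samples, channels) EEG stream into windows and use
    per-channel variance as a crude feature (illustrative only)."""
    n_win = eeg.shape[0] // win
    segs = eeg[:n_win * win].reshape(n_win, win, eeg.shape[1])
    return segs.var(axis=1)

# Synthetic stand-in for 14-channel EPOC-like data; labels cover
# three states (rest and two imaging events).
X = window_features(np.random.randn(128 * 300, 14))
y = np.repeat([0, 1, 2], 100)

clf = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0, gamma='scale'))
print(1.0 - cross_val_score(clf, X, y, cv=5).mean())   # estimated generalisation error
```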
Abstract:
This book is a collection of three large-cast plays written in response to a very specific problem. My work as a teacher of drama often required me to locate a script that would somehow miraculously work for a cast of unknown number and gender, and most likely uneven skills and enthusiasm, who I hadn’t even met yet. It’s a familiar dilemma for teachers and students of drama in education contexts, at whatever level you’re teaching. I’d first addressed this creative problem with scripts such as Gate 38 (2010). I had tried using scripts that already existed, but found they required such extensive editing to suit the parameters of cast and performance duration that I may as well have been writing them myself. Even in the setting of a closed studio, in altering these plays I felt I was bending the vision of the playwright, and certainly their narrative structure, out of shape. Everyone who’s attempted to stage a performance with a large cast of students in an educational setting knows it takes time to truly connect with a play, its social contexts, themes and characters. It also takes a lot of time to get on top of the practicalities of learning, rehearsing, directing and running a performance with young people. Often the curtain goes up on something unfinished and unstable. I was looking for ways to reduce the complexity of staging a script, while maintaining the potential of this process as a site of rich, enjoyable learning. Two of the plays (Duty Free and Please Be Seated) are composed of multiple monologues, combined with music-driven ensemble sequences. The monologues enable individuals to develop and polish their own performances, work in small groups, and cut down on the laborious detail of directing naturalistic scenes based in character interaction. The third (Australian Drama) involves a lot of duologues, meaning that its rehearsal process can happily employ that mainstay of the drama classroom: small group work. There’s plenty of room to move in terms of gender-blind casting as well; Please Be Seated is mainly young women. The scripts also contain ensemble-based interludes which are non-verbal, music-driven, with a choreographic element. They have also springboarded further explorations in form: the ethical and aesthetic complexities of verbatim works; the interaction between music and theatre; and meta-concerns related to the performing of performance, that is, how the act of acting can itself be ‘acted’. The narratives of all three of these plays are deliberately open, enabling the flexible casting and on-the-hop editing that large-group, time-poor processes sometimes necessitate. Duty Free is about the overseas ‘adventures’ of young people. Please Be Seated is based in verbatim text about young people falling in and out of love. Australian Drama is about young people in a drama classroom trying to connect with each other and put their own shine on dull fragments of the theatrical canon. The plays were published as a collection in hardcopy and digital editions by Playlab Press in 2015. Please Be Seated was co-written with a large group; these co-authors’ names are listed in the publication, and below in ‘additional information’.
Abstract:
Objective Death certificates provide an invaluable source for cancer mortality statistics; however, this value can only be realised if accurate, quantitative data can be extracted from certificates – an aim hampered by both the volume and the variable nature of certificates written in natural language. This paper proposes an automatic classification system for identifying cancer-related causes of death from death certificates. Methods Detailed features, including terms, n-grams and SNOMED CT concepts, were extracted from a collection of 447,336 death certificates. These features were used to train Support Vector Machine classifiers (one classifier for each cancer type). The classifiers were deployed in a cascaded architecture: the first level identified the presence of cancer (i.e., binary cancer/no-cancer) and the second level identified the type of cancer (according to the ICD-10 classification system). A held-out test set was used to evaluate the effectiveness of the classifiers according to precision, recall and F-measure. In addition, a detailed feature analysis was performed to reveal the characteristics of a successful cancer classification model. Results The system was highly effective at identifying cancer as the underlying cause of death (F-measure 0.94). The system was also effective at determining the type of cancer for common cancers (F-measure 0.7). Rare cancers, for which there was little training data, were difficult to classify accurately (F-measure 0.12). Factors influencing performance were the amount of training data and certain ambiguous cancers (e.g., those in the stomach region). The feature analysis revealed that a combination of features was important for cancer type classification, with SNOMED CT concept and oncology-specific morphology features proving the most valuable. Conclusion The system proposed in this study provides automatic identification and characterisation of cancers from large collections of free-text death certificates. This allows organisations such as Cancer Registries to monitor and report on cancer mortality in a timely and accurate manner. In addition, the methods and findings are generally applicable beyond cancer classification and to other sources of medical text besides death certificates.
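A minimal sketch of a two-level cascade of the kind described (binary cancer/no-cancer followed by cancer-type classification) is shown below. It uses only term n-gram features; the SNOMED CT concept features and the paper's exact architecture are not reproduced, and the function names are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Stage 1 decides cancer vs. no-cancer; stage 2 assigns an ICD-10
# cancer type to the positives only.
stage1 = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
stage2 = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())

def fit_cascade(texts, is_cancer, cancer_type):
    stage1.fit(texts, is_cancer)
    pos = [t for t, c in zip(texts, is_cancer) if c]
    pos_types = [ty for ty, c in zip(cancer_type, is_cancer) if c]
    stage2.fit(pos, pos_types)

def predict_cascade(texts):
    flags = stage1.predict(texts)
    return [stage2.predict([t])[0] if f else 'no-cancer'
            for t, f in zip(texts, flags)]
```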
Abstract:
Environmental changes have put great pressure on biological systems, leading to the rapid decline of biodiversity. To monitor this change and protect biodiversity, animal vocalizations have been widely explored with the aid of acoustic sensors deployed in the field. Consequently, large volumes of acoustic data are collected. However, traditional manual methods that require ecologists to physically visit sites to collect biodiversity data are both costly and time consuming. Therefore, it is essential to develop new semi-automated and automated methods to identify species in the collected audio recordings. In this study, a novel feature extraction method based on wavelet packet decomposition is proposed for frog call classification. After syllable segmentation, each syllable of the frog advertisement call is represented by a spectral peak track, from which track duration, dominant frequency and oscillation rate are calculated. Then, a k-means clustering algorithm is applied to the dominant frequency, and the centroids of the clustering results are used to generate the frequency scale for wavelet packet decomposition (WPD). Next, a new feature set named adaptive frequency scaled wavelet packet decomposition sub-band cepstral coefficients is extracted by performing WPD on the windowed frog calls. Furthermore, the statistics of all feature vectors over each windowed signal are calculated to produce the final feature set. Finally, two well-known classifiers, a k-nearest neighbour classifier and a support vector machine classifier, are used for classification. In our experiments, we use two different datasets from Queensland, Australia (18 frog species from commercial recordings, and field recordings of 8 frog species from James Cook University). The weighted classification accuracy with our proposed method is 99.5% and 97.4% for the 18 frog species and 8 frog species respectively, which outperforms all other comparable methods.
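The sketch below illustrates two of the steps described: clustering dominant frequencies with k-means to derive a frequency scale, and classifying final feature vectors with a k-nearest-neighbour classifier. The synthetic data, cluster count and feature dimensionality are placeholders, and the wavelet packet decomposition itself is omitted.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Cluster the dominant frequencies of segmented syllables; the sorted
# centroids then define the frequency scale used for the adaptive WPD.
dominant_freqs = rng.uniform(500, 6000, size=(200, 1))   # Hz, synthetic stand-in
centroids = np.sort(KMeans(n_clusters=8, n_init=10)
                    .fit(dominant_freqs).cluster_centers_.ravel())

# Final stage: classify per-syllable feature vectors (e.g. WPD sub-band
# cepstral coefficient statistics) with a k-nearest-neighbour classifier.
X_train, y_train = rng.normal(size=(150, 20)), rng.integers(0, 8, 150)
X_test = rng.normal(size=(50, 20))
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
pred = knn.predict(X_test)
```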
Abstract:
In competitive combat sporting environments like boxing, the statistics on a boxer's performance, including the number and type of punches thrown, provide a valuable source of data and feedback which is routinely used for coaching and performance improvement purposes. This paper presents a robust framework for the automatic classification of a boxer's punches. Overhead depth imagery is employed to alleviate challenges associated with occlusions, and robust body-part tracking is developed for the noisy time-of-flight sensors. Punch recognition is addressed with both multi-class SVM and Random Forest classifiers. A coarse-to-fine hierarchical SVM classifier is presented based on prior knowledge of boxing punches. This framework has been applied to shadow boxing image sequences of 8 elite boxers taken at the Australian Institute of Sport. Results demonstrate the effectiveness of the proposed approach, with the hierarchical SVM classifier yielding 96% accuracy, signifying its suitability for analysing athletes' punches in boxing bouts.
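One way to read the coarse-to-fine hierarchy is sketched below: a coarse SVM predicts a punch family and a per-family SVM refines the label. The family grouping and interface are assumptions for illustration, not the classifier structure used in the paper.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical coarse-to-fine hierarchy; the family taxonomy is assumed.
FAMILIES = ('straight', 'hook', 'uppercut')
coarse = SVC(kernel='rbf')
fine = {fam: SVC(kernel='rbf') for fam in FAMILIES}

def fit_hierarchy(X, family_labels, punch_labels):
    """X: (n, d) array of body-part tracking features per punch."""
    X = np.asarray(X)
    family_labels = np.asarray(family_labels)
    coarse.fit(X, family_labels)
    for fam, clf in fine.items():
        mask = family_labels == fam
        clf.fit(X[mask], np.asarray(punch_labels)[mask])

def predict_hierarchy(x):
    fam = coarse.predict([x])[0]
    return fine[fam].predict([x])[0]   # e.g. 'left_hook' vs. 'right_hook'
```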
Abstract:
Background The Global Burden of Diseases, Injuries, and Risk Factors (GBD) study used the disability-adjusted life year (DALY) to quantify the burden of diseases, injuries, and risk factors. This paper provides an overview of injury estimates from the 2013 update of GBD, with detailed information on incidence, mortality, DALYs and rates of change from 1990 to 2013 for 26 causes of injury, globally, by region and by country. Methods Injury mortality was estimated using the extensive GBD mortality database, corrections for ill-defined causes of death and the cause of death ensemble modelling tool. Morbidity estimation was based on inpatient and outpatient data sets, 26 cause-of-injury and 47 nature-of-injury categories, and seven follow-up studies with patient-reported long-term outcome measures. Results In 2013, 973 million (uncertainty interval (UI) 942 to 993) people sustained injuries that warranted some type of healthcare and 4.8 million (UI 4.5 to 5.1) people died from injuries. Between 1990 and 2013, the global age-standardised injury DALY rate decreased by 31% (UI 26% to 35%). The rate of decline in DALY rates was significant for 22 cause-of-injury categories, including all the major injuries. Conclusions Injuries continue to be an important cause of morbidity and mortality in the developed and developing world. The decline in rates for almost all injuries is so prominent that it warrants a general statement that the world is becoming a safer place to live in. However, the patterns vary widely by cause, age, sex, region and time, and there are still large improvements that need to be made.
Abstract:
Generating discriminative input features is a key requirement for achieving highly accurate classifiers. The process of generating features from raw data is known as feature engineering, and it can take significant manual effort. In this paper we propose automated feature engineering to derive a suite of additional features from a given set of basic features, with the aim of both improving classifier accuracy through discriminative features and assisting data scientists through automation. Our implementation is specific to HTTP computer network traffic. To measure the effectiveness of our proposal, we compare the performance of a supervised machine learning classifier built with automated feature engineering against one built with human-guided features. The classifier addresses a problem in computer network security, namely the detection of HTTP tunnels. We use Bro to process network traffic into base features and then apply automated feature engineering to calculate a larger set of derived features. The derived features are calculated without favour to any base feature and include entropy, length and N-grams for all string features, and counts and averages over time for all numeric features. Feature selection is then used to find the most relevant subset of these features. Testing showed that both classifiers achieved a detection rate above 99.93% at a false positive rate below 0.01%. For our datasets, we conclude that automated feature engineering can increase classifier development speed and reduce development difficulty by removing manual feature engineering, while maintaining classification accuracy.
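As an illustration of the kind of derived features described (entropy, length and n-grams for string fields), the snippet below computes them for a single HTTP field. The field name and helper functions are hypothetical and do not reflect the paper's actual Bro-based implementation.

```python
import math
from collections import Counter

def shannon_entropy(s):
    """Character entropy of a string feature (bits per character)."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values()) if n else 0.0

def char_ngrams(s, n=2):
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def derive_string_features(name, value):
    """Derived features for one string field (e.g. an HTTP URI or header)."""
    return {
        f'{name}_len': len(value),
        f'{name}_entropy': shannon_entropy(value),
        f'{name}_bigrams': char_ngrams(value, 2),
    }

# Example on a single HTTP field value:
print(derive_string_features('uri', '/index.php?id=AAAA'))
```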
Abstract:
BACKGROUND Polygenic risk scores comprising established susceptibility variants have been shown to be informative classifiers for several complex diseases, including prostate cancer. For prostate cancer, it is unknown whether inclusion of genetic markers that have so far not been associated with prostate cancer risk at a genome-wide significant level will improve disease prediction. METHODS We built polygenic risk scores in a large training set comprising over 25,000 individuals. Initially, 65 established prostate cancer susceptibility variants were selected. After LD pruning, additional variants were prioritized based on their association with prostate cancer. Six-fold cross validation was performed to assess genetic risk scores and optimize the number of additional variants to be included. The final model was evaluated in an independent study population including 1,370 cases and 1,239 controls. RESULTS The polygenic risk score with 65 established susceptibility variants provided an area under the curve (AUC) of 0.67. Adding 68 novel variants significantly increased the AUC to 0.68 (P = 0.0012) and improved the net reclassification index by 0.21 (P = 8.5E-08). All novel variants were located in genomic regions established as associated with prostate cancer risk. CONCLUSIONS Inclusion of additional genetic variants from established prostate cancer susceptibility regions improves disease prediction.
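A polygenic risk score of the kind evaluated here is typically a weighted sum of risk-allele counts; the sketch below computes such a score and its AUC on synthetic genotypes. The weights and data are placeholders and bear no relation to the study's variants or results.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def polygenic_risk_score(genotypes, betas):
    """genotypes: (individuals, variants) allele counts in {0, 1, 2};
    betas: per-variant weights (e.g. log odds ratios) from a training set."""
    return genotypes @ betas

rng = np.random.default_rng(1)
G = rng.integers(0, 3, size=(2609, 65)).astype(float)   # synthetic test set
betas = rng.normal(0, 0.1, 65)                          # placeholder weights
y = rng.integers(0, 2, 2609)                            # case/control labels
scores = polygenic_risk_score(G, betas)
print(roc_auc_score(y, scores))    # AUC of the risk score as a classifier
```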
Abstract:
This investigation aimed to quantify metabolic rate when wearing an explosive ordnance disposal (EOD) ensemble (~33 kg) during standing and locomotion, and to determine whether the Pandolf load carriage equation accurately predicts metabolic rate when wearing an EOD ensemble during standing and locomotion. Ten males completed 8 trials with metabolic rate measured through indirect calorimetry. Metabolic rate when walking in the EOD ensemble at 2.5, 4.0 and 5.5 km·h⁻¹ was significantly (p < 0.05) greater than in matched trials without the EOD ensemble, by 49% (127 W), 65% (213 W) and 78% (345 W), respectively. Mean bias (95% limits of agreement) between predicted and measured metabolic rate during standing and walking at 2.5, 4.0 and 5.5 km·h⁻¹ was 47 W (19 to 75 W), −111 W (−172 to −49 W), −122 W (−189 to −54 W) and −158 W (−245 to −72 W), respectively. The Pandolf equation significantly underestimated measured metabolic rate during locomotion. These findings have practical implications for EOD technicians during training and operation and should be considered when developing maximum workload duration models and work-rest schedules.
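For readers unfamiliar with the Pandolf load carriage equation referenced above, a sketch of its commonly cited form (Pandolf et al., 1977) follows. The example body mass and terrain factor are illustrative assumptions; only the 33 kg load comes from the abstract.

```python
def pandolf_metabolic_rate(body_mass, load_mass, speed, grade=0.0, terrain=1.0):
    """Commonly cited form of the Pandolf load-carriage equation, in watts.

    body_mass : kg; load_mass : kg; speed : m/s; grade : %;
    terrain   : terrain factor (1.0 for treadmill/firm surface)
    """
    W, L, V, G, n = body_mass, load_mass, speed, grade, terrain
    return (1.5 * W
            + 2.0 * (W + L) * (L / W) ** 2
            + n * (W + L) * (1.5 * V ** 2 + 0.35 * V * G))

# e.g. an assumed 80 kg technician carrying a 33 kg EOD ensemble at
# 4 km/h (1.11 m/s) on level ground
print(pandolf_metabolic_rate(80, 33, 4 / 3.6))
```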
Abstract:
In this paper we focus on the challenging problem of place categorization and semantic mapping on a robot without environment-specific training. Motivated by the ongoing success of convolutional networks in various visual recognition tasks, we build our system upon a state-of-the-art convolutional network. We overcome its closed-set limitations by complementing the network with a series of one-vs-all classifiers that can learn to recognize new semantic classes online. Prior domain knowledge is incorporated by embedding the classification system into a Bayesian filter framework that also ensures temporal coherence. We evaluate the classification accuracy of the system on a robot that maps a variety of places on our campus in real time. We show how semantic information can boost robotic object detection performance and how the semantic map can be used to modulate the robot’s behaviour during navigation tasks. The system is made available to the community as a ROS module.
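A minimal sketch of the temporal-coherence idea, assuming a simple discrete Bayes filter over place categories with a self-transition model; the transition probabilities and interface are illustrative and not taken from the paper.

```python
import numpy as np

def bayes_filter_update(belief, likelihood, stay_prob=0.9):
    """One temporal-coherence update over K place categories.

    belief     : (K,) prior probability over classes
    likelihood : (K,) per-frame classifier scores treated as P(obs | class)
    stay_prob  : assumed probability the place category persists between frames
    """
    K = belief.size
    # simple transition model: stay with stay_prob, otherwise switch uniformly
    transition = np.full((K, K), (1 - stay_prob) / (K - 1))
    np.fill_diagonal(transition, stay_prob)
    predicted = transition.T @ belief
    posterior = predicted * likelihood
    return posterior / posterior.sum()

belief = np.full(5, 0.2)                           # uniform prior over 5 classes
frame_scores = np.array([0.1, 0.6, 0.1, 0.1, 0.1])  # one frame's classifier output
belief = bayes_filter_update(belief, frame_scores)
```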