195 results for Naive Bayes classifier
Abstract:
Analysis of the particulate size and number concentration emissions from a fleet of inner city medium duty CNG buses was conducted using the newly available Diffusion Size Classifier (DiSC) in comparison with more traditional SMPSs and CPCs. Studies were conducted at both steady state and transient driving modes on a vehicle dynamometer utilising a CVS dilution system. Comparative analysis of the results showed that the DiSC provided equivalent information during steady state conditions and was able to provide additional information during transient conditions, namely, the modal diameter of the particle size distribution.
Abstract:
The graft-versus-myeloma (GVM) effect represents a powerful form of immune attack exerted by alloreactive T cells against multiple myeloma cells, which leads to clinical responses in multiple myeloma transplant recipients. Whether myeloma cells are themselves able to induce alloreactive T cells capable of the GVM effect is not defined. Using adoptive transfer of T naive cells into myeloma-bearing mice (established by transplantation of human RPMI8226-TGL myeloma cells into CD122(+) cell-depleted NOD/SCID hosts), we found that myeloma cells induced alloreactive T cells that suppressed myeloma growth and prolonged survival of T cell recipients. Myeloma-induced alloreactive T cells arising in the myeloma-infiltrated bones exerted cytotoxic activity against resident myeloma cells, but limited activity against control myeloma cells obtained from myeloma-bearing mice that did not receive T naive cells. These myeloma-induced alloreactive T cells were derived through multiple CD8(+) T cell divisions and enriched in double-positive (DP) T cells coexpressing the CD8αα and CD4 coreceptors. MHC class I expression on myeloma cells and contact with T cells were required for CD8(+) T cell divisions and DP-T cell development. DP-T cells present in myeloma-infiltrated bones contained a higher proportion of cells expressing cytotoxic mediators IFN-γ and/or perforin compared with single-positive CD8(+) T cells, acquired the capacity to degranulate as measured by CD107 expression, and contributed to an elevated perforin level seen in the myeloma-infiltrated bones. These observations suggest that myeloma-induced alloreactive T cells arising in myeloma-infiltrated bones are enriched with DP-T cells equipped with cytotoxic effector functions that are likely to be involved in the GVM effect.
Abstract:
We propose a method of representing audience behavior through facial and body motions from a single video stream, and use these features to predict the rating for feature-length movies. This is a very challenging problem because: i) the movie viewing environment is dark and contains views of people at different scales and viewpoints; ii) the duration of feature-length movies is long (80-120 mins), so tracking people uninterrupted for this length of time is still an unsolved problem; and iii) expressions and motions of audience members are subtle, short and sparse, making labeling of activities unreliable. To circumvent these issues, we use an infrared illuminated test-bed to obtain a visually uniform input. We then utilize motion-history features which capture the subtle movements of a person within a pre-defined volume, and then form a group representation of the audience by a histogram of pair-wise correlations over a small window of time. Using this group representation, we learn our movie rating classifier from crowd-sourced ratings collected by rottentomatoes.com and show our prediction capability on audiences from 30 movies across 250 subjects (> 50 hours).
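The group representation described in this abstract, a histogram of pair-wise correlations over a short time window, can be illustrated with a minimal sketch. The per-person motion values, the number of bins, and the function names below are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def correlation_histogram(signals, bins=4):
    """Histogram of pairwise correlations in [-1, 1], normalised to sum to 1."""
    counts = [0] * bins
    pairs = list(combinations(signals, 2))
    for a, b in pairs:
        r = pearson(a, b)
        # Map r in [-1, 1] to a bin index; clamp r == 1.0 into the last bin.
        idx = min(int((r + 1) / 2 * bins), bins - 1)
        counts[idx] += 1
    return [c / len(pairs) for c in counts]

# Hypothetical per-person motion energy over one short time window.
window = [
    [0.1, 0.4, 0.9, 0.3],  # person A
    [0.2, 0.5, 1.0, 0.4],  # person B moves in sync with A
    [0.9, 0.5, 0.1, 0.6],  # person C moves roughly opposite to A
]
hist = correlation_histogram(window)
print(hist)
```

Because the histogram has a fixed length regardless of audience size, it can serve as a feature vector for a downstream classifier even when the number of tracked people varies between screenings.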
Abstract:
Tobacco smoking, alcohol drinking, and occupational exposures to polycyclic aromatic hydrocarbons are the major proven risk factors for human head and neck squamous-cell cancer (HNSCC). Major research focus on gene-environment interactions concerning HNSCC has been on genes encoding enzymes of metabolism for tobacco smoke constituents and repair enzymes. To investigate the role of genetically determined individual predispositions in enzymes of xenobiotic metabolism and in repair enzymes under the exogenous risk factor tobacco smoke in the carcinogenesis of HNSCC, we conducted a case-control study on 312 cases and 300 noncancer controls. We focused on the impact of 22 sequence variations in CYP1A1, CYP1B1, CYP2E1, ERCC2/XPD, GSTM1, GSTP1, GSTT1, NAT2, NQO1, and XRCC1. To assess relevant main and interactive effects of polymorphic genes on the susceptibility to HNSCC we used statistical models such as logic regression and a Bayesian version of logic regression. In subgroup analysis of nonsmokers, main effects in ERCC2 (Lys751Gln) C/C genotype and combined ERCC2 (Arg156Arg) C/A and A/A genotypes were predominant. When stratifying for smokers, the data revealed main effects on combined CYP1B1 (Leu432Val) C/G and G/G genotypes, followed by CYP1B1 (Leu432Val) G/G genotype and CYP2E1 (-70G>T) G/T genotype. When fitting logistic regression models including relevant main effects and interactions in smokers, we found relevant associations of CYP1B1 (Leu432Val) C/G genotype and CYP2E1 (-70G>T) G/T genotype (OR, 10.84; 95% CI, 1.64-71.53) as well as CYP1B1 (Leu432Val) G/G genotype and GSTM1 null/null genotype (OR, 11.79; 95% CI, 2.18-63.77) with HNSCC. The findings underline the relevance of genotypes of polymorphic CYP1B1 combined with exposures to tobacco smoke.
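The wide confidence intervals reported in this abstract (for example 1.64-71.53) are typical of genotype combinations observed in only a few subjects. The sketch below shows how an unadjusted odds ratio and its Wald 95% interval follow from a 2x2 table. The counts are hypothetical, and the abstract's own estimates come from fitted logistic regression models rather than from this simpler calculation.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: cases with the genotype combination, b: controls with it,
    c: cases without it, d: controls without it.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR); small cells make this large.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts, chosen only to illustrate a sparse cell (b = 2).
or_, lo, hi = odds_ratio_ci(a=12, b=2, c=60, d=110)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f} to {hi:.2f}")
```

The 1/b term dominates the standard error when only a handful of controls carry the combination, which is why the interval stretches far above the point estimate.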
Abstract:
Bayesian experimental design is a fast growing area of research with many real-world applications. As computational power has increased over the years, so has the development of simulation-based design methods, which involve a number of algorithms, such as Markov chain Monte Carlo, sequential Monte Carlo and approximate Bayes methods, facilitating more complex design problems to be solved. The Bayesian framework provides a unified approach for incorporating prior information and/or uncertainties regarding the statistical model with a utility function which describes the experimental aims. In this paper, we provide a general overview on the concepts involved in Bayesian experimental design, and focus on describing some of the more commonly used Bayesian utility functions and methods for their estimation, as well as a number of algorithms that are used to search over the design space to find the Bayesian optimal design. We also discuss other computational strategies for further research in Bayesian optimal design.
Abstract:
BACKGROUND: Effective diagnosis of malaria is a major component of case management. Rapid diagnostic tests (RDTs) based on Plasmodium falciparum histidine-rich protein 2 (PfHRP2) are popular for diagnosis of this most virulent malaria infection. However, concerns have been raised about the longevity of the PfHRP2 antigenaemia following curative treatment in endemic regions. METHODS: A model of PfHRP2 production and decay was developed to mimic the kinetics of PfHRP2 antigenaemia during infections. Data from two human infection studies were used to fit the model and to investigate PfHRP2 kinetics. Four malaria RDTs were assessed in the laboratory to determine the minimum detectable concentration of PfHRP2. RESULTS: Fitting of the PfHRP2 dynamics model indicated that in malaria-naive hosts, P. falciparum parasites of the 3D7 strain produce 1.4 × 10⁻¹³ g of PfHRP2 per parasite per replication cycle. The four RDTs had minimum detection thresholds between 6.9 and 27.8 ng/mL. Combining these detection thresholds with the kinetics of PfHRP2, it is predicted that as few as 8 parasites/μL may be required to maintain a positive RDT in a chronic infection. CONCLUSIONS: The results of the model indicate that good quality PfHRP2-based RDTs should be able to detect parasites on the first day of symptoms, and that the persistence of the antigen will cause the tests to remain positive for at least seven days after treatment. The duration of a positive test result following curative treatment is dependent on the duration and density of parasitaemia prior to treatment and the presence and affinity of anti-PfHRP2 antibodies.
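A back-of-envelope unit conversion helps relate the figures in this abstract. The sketch below converts the reported per-parasite production into a concentration added per replication cycle. It assumes, purely for illustration, that all antigen stays in the sampled volume and that nothing decays within the cycle; the abstract's actual production-and-decay model and its fitted decay parameters are not given here.

```python
# Value reported in the abstract: grams of PfHRP2 per parasite per replication cycle.
G_PER_PARASITE_PER_CYCLE = 1.4e-13
UL_PER_ML = 1_000   # microlitres per millilitre
NG_PER_G = 1e9      # nanograms per gram

def antigen_per_cycle_ng_per_ml(parasites_per_ul):
    """Antigen added per cycle in ng/mL, ignoring decay (illustrative assumption)."""
    return parasites_per_ul * UL_PER_ML * G_PER_PARASITE_PER_CYCLE * NG_PER_G

per_cycle = antigen_per_cycle_ng_per_ml(8)  # 8 parasites/uL, as in the abstract
lowest_threshold = 6.9                      # ng/mL, most sensitive RDT in the abstract
print(f"{per_cycle:.2f} ng/mL per cycle vs threshold {lowest_threshold} ng/mL")
```

Under these simplifying assumptions a single cycle at 8 parasites/μL adds about 1.12 ng/mL, which is below the lowest reported detection threshold. That is consistent with the abstract framing this density in terms of a chronic infection, where antigen can accumulate over successive cycles.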
Abstract:
Environmental monitoring has become increasingly important due to the significant impact of human activities and climate change on biodiversity. Environmental sound sources such as rain and insect vocalizations are a rich and underexploited source of information in environmental audio recordings. This paper is concerned with the classification of rain within acoustic sensor recordings. We present the novel application of a set of features for classifying environmental acoustics: acoustic entropy, the acoustic complexity index, spectral cover, and background noise. In order to improve the performance of the rain classification system, we automatically classify segments of environmental recordings into the classes of heavy rain or non-rain. A decision tree classifier is experimentally compared with other classifiers. The experimental results show that our system is effective in classifying segments of environmental audio recordings, with an accuracy of 93% for the binary classification of heavy rain/non-rain.
Abstract:
This paper introduces a new method to automate the detection of marine species in aerial imagery using a machine learning approach. Our proposed system has at its core a convolutional neural network. We compare this trainable classifier to a handcrafted classifier based on color features, entropy and shape analysis. Experiments demonstrate that the convolutional neural network outperforms the handcrafted solution. We also introduce a negative training example-selection method for situations where the original training set consists of a collection of labeled images in which the objects of interest (positive examples) have been marked by a bounding box. We show that picking random rectangles from the background is not necessarily the best way to generate useful negative examples with respect to learning.
Abstract:
A description of a patient's injuries is recorded in narrative text form by hospital emergency departments. For statistical reporting, this text data needs to be mapped to pre-defined codes. Existing research in this field uses the Naïve Bayes probabilistic method to build classifiers for mapping. In this paper, we focus on providing guidance on the selection of a classification method. We build a number of classifiers belonging to different classification families such as decision tree, probabilistic, neural networks, and instance-based, ensemble-based and kernel-based linear classifiers. Extensive pre-processing is carried out to ensure the quality of the data and, hence, the quality of the classification outcome. The records with a null entry in injury description are removed. The misspelling correction process is carried out by finding and replacing the misspelt word with a soundlike word. Meaningful phrases have been identified and kept, instead of removing part of a phrase as a stop word. The abbreviations appearing in many forms of entry are manually identified and only one form of each abbreviation is used. Clustering is utilised to discriminate between non-frequent and frequent terms. This process reduced the number of text features dramatically from about 28,000 to 5000. The medical narrative text injury dataset under consideration is composed of many short documents. The data can be characterized as high-dimensional and sparse, i.e., few features are irrelevant but features are correlated with one another. Therefore, matrix factorization techniques such as Singular Value Decomposition (SVD) and Non Negative Matrix Factorization (NNMF) have been used to map the processed feature space to a lower-dimensional feature space. Classifiers have been built on this reduced feature space. In experiments, a set of tests is conducted to determine which classification method is best for medical text classification.
The Non Negative Matrix Factorization with Support Vector Machine method can achieve 93% precision, which is higher than all the tested traditional classifiers. We also found that TF/IDF weighting, which works well for long text classification, is inferior to binary weighting in short document classification. Another finding is that the Top-n terms should be removed in consultation with medical experts, as their removal affects the classification performance.
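This abstract names the Naïve Bayes method as the existing baseline and reports that binary term weighting suits short documents. A minimal sketch of a Bernoulli Naïve Bayes classifier over binary (presence/absence) features is shown below. The toy narratives, labels, and function names are hypothetical, and this is not the pipeline evaluated in the paper, which also involves spelling correction, clustering, and matrix factorization.

```python
from collections import defaultdict
from math import log

def train_bernoulli_nb(docs, labels):
    """Fit class priors and per-class word-presence probabilities."""
    vocab = sorted({w for d in docs for w in d.split()})
    classes = sorted(set(labels))
    n_docs = {c: 0 for c in classes}
    doc_freq = {c: defaultdict(int) for c in classes}
    for text, c in zip(docs, labels):
        n_docs[c] += 1
        for w in set(text.split()):  # binary weighting: presence only
            doc_freq[c][w] += 1
    prior = {c: log(n_docs[c] / len(docs)) for c in classes}
    # Laplace smoothing keeps every probability strictly between 0 and 1.
    p_word = {
        c: {w: (doc_freq[c][w] + 1) / (n_docs[c] + 2) for w in vocab}
        for c in classes
    }
    return vocab, prior, p_word

def predict(model, text):
    """Return the class with the highest log-posterior score."""
    vocab, prior, p_word = model
    present = set(text.split())
    scores = {}
    for c in prior:
        score = prior[c]
        for w in vocab:
            p = p_word[c][w]
            # Absent words also contribute evidence in the Bernoulli model.
            score += log(p) if w in present else log(1 - p)
        scores[c] = score
    return max(scores, key=scores.get)

# Hypothetical short injury narratives and codes.
docs = [
    "fell off ladder broke wrist",
    "cut finger kitchen knife",
    "fell down stairs hurt ankle",
    "knife cut hand cooking",
]
labels = ["fall", "cut", "fall", "cut"]
model = train_bernoulli_nb(docs, labels)
print(predict(model, "fell from ladder"))
print(predict(model, "knife cut"))
```

Words outside the training vocabulary, such as "from" here, are simply ignored at prediction time.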
Abstract:
Problem addressed: Wrist-worn accelerometers are associated with greater compliance. However, validated algorithms for predicting activity type from wrist-worn accelerometer data are lacking. This study compared the activity recognition rates of an activity classifier trained on acceleration signals collected on the wrist and hip. Methodology: 52 children and adolescents (mean age 13.7 ± 3.1 years) completed 12 activity trials that were categorized into 7 activity classes: lying down, sitting, standing, walking, running, basketball, and dancing. During each trial, participants wore an ActiGraph GT3X+ tri-axial accelerometer on the right hip and the non-dominant wrist. Features were extracted from 10-s windows and input into a regularized logistic regression model using R (Glmnet + L1). Results: Classification accuracy for the hip and wrist was 91.0% ± 3.1% and 88.4% ± 3.0%, respectively. The hip model exhibited excellent classification accuracy for sitting (91.3%), standing (95.8%), walking (95.8%), and running (96.8%); acceptable classification accuracy for lying down (88.3%) and basketball (81.9%); and modest accuracy for dance (64.1%). The wrist model exhibited excellent classification accuracy for sitting (93.0%), standing (91.7%), and walking (95.8%); acceptable classification accuracy for basketball (86.0%); and modest accuracy for running (78.8%), lying down (74.6%) and dance (69.4%). Potential Impact: Both the hip and wrist algorithms achieved acceptable classification accuracy, allowing researchers to use either placement for activity recognition.
Abstract:
This research aims to explore and identify political risks on a large infrastructure project in an exaggerated environment to ascertain whether sufficient objective information can be gathered by project managers to utilise risk modelling techniques. During the study, the author proposes a new definition of political risk; performs a detailed project study of the Neelum Jhelum Hydroelectric Project in Pakistan; implements a probabilistic model using the principle of decomposition and Bayes' theorem; and answers the question: was it possible for project managers to obtain all the relevant objective data to implement a probabilistic model?
Abstract:
We present a framework and first set of simulations for evolving a language for communicating about space. The framework comprises two components: (1) An established mobile robot platform, RatSLAM, which has a "brain" architecture based on rodent hippocampus with the ability to integrate visual and odometric cues to create internal maps of its environment. (2) A language learning system based on a neural network architecture that has been designed and implemented with the ability to evolve generalizable languages which can be learned by naive learners. A study using visual scenes and internal maps streamed from the simulated world of the robots to evolve languages is presented. This study investigated the structure of the evolved languages, showing that with these inputs, expressive languages can effectively categorize the world. Ongoing studies are extending these investigations to evolve languages that use the full power of the robots' representations in populations of agents.
Abstract:
Summary 1. Acoustic methods are used increasingly to survey and monitor bat populations. However, the use of acoustic methods at continental scales can be hampered by the lack of standardized and objective methods to identify all species recorded. This makes comparable continent-wide monitoring difficult, impeding progress towards developing biodiversity indicators, transboundary conservation programmes and monitoring species distribution changes. 2. Here we developed a continental-scale classifier for acoustic identification of bats, which can be used throughout Europe to ensure objective, consistent and comparable species identifications. We selected 1350 full-spectrum reference calls from a set of 15 858 calls of 34 European species, from EchoBank, a global echolocation call library. We assessed 24 call parameters to evaluate how well they distinguish between species and used the 12 most useful to train a hierarchy of ensembles of artificial neural networks to distinguish the echolocation calls of these bat species. 3. Calls are first classified to one of five call-type groups, with a median accuracy of 97·6%. The median species-level classification accuracy is 83·7%, providing robust classification for most European species, and an estimate of classification error for each species. 4. These classifiers were packaged into an online tool, iBatsID, which is freely available, enabling anyone to classify European calls in an objective and consistent way, allowing standardized acoustic identification across the continent. 5. Synthesis and applications. iBatsID is the first freely available and easily accessible continental-scale bat call classifier, providing the basis for standardized, continental acoustic bat monitoring in Europe. This method can provide key information to managers and conservation planners on distribution changes and changes in bat species activity through time.
Abstract:
Background: Although the detrimental impact of major depressive disorder (MDD) at the individual level has been described, its global epidemiology remains unclear given limitations in the data. Here we present the modelled epidemiological profile of MDD, dealing with heterogeneity in the data, enforcing internal consistency between epidemiological parameters and making estimates for world regions with no empirical data. These estimates were used to quantify the burden of MDD for the Global Burden of Disease Study 2010 (GBD 2010). Method: Analyses drew on data from our existing literature review of the epidemiology of MDD. DisMod-MR, the latest version of the generic disease modelling system redesigned as a Bayesian meta-regression tool, derived prevalence by age, year and sex for 21 regions. Prior epidemiological knowledge and study- and country-level covariates adjusted sub-optimal raw data. Results: There were over 298 million cases of MDD globally at any point in time in 2010, with the highest proportion of cases occurring between 25 and 34 years. Global point prevalence was very similar across time (4.4% (95% uncertainty: 4.2–4.7%) in 1990, 4.4% (4.1–4.7%) in 2005 and 2010), but higher in females (5.5% (5.0–6.0%)) than in males (3.2% (3.0–3.6%)) in 2010. Regions in conflict had higher prevalence than those with no conflict. The annual incidence of an episode of MDD followed a similar age and regional pattern to prevalence but was about one and a half times higher, consistent with an average duration of 37.7 weeks. Conclusion: We were able to integrate available data, including those from high quality surveys and sub-optimal studies, into a model adjusting for known methodological sources of heterogeneity. We were also able to estimate the epidemiology of MDD in regions with no available data. This informed GBD 2010 and the public health field, with a clearer understanding of the global distribution of MDD.
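The link this abstract draws between annual incidence, point prevalence, and an average episode duration of 37.7 weeks can be checked with the common steady-state approximation, prevalence ≈ incidence × mean duration. The sketch below applies that approximation only; it is an assumption used for illustration and not the DisMod-MR model, which enforces consistency between parameters in a more detailed way.

```python
WEEKS_PER_YEAR = 365.25 / 7  # about 52.18

def incidence_to_prevalence_ratio(duration_weeks):
    """Ratio of annual incidence to point prevalence under the steady-state
    approximation prevalence = incidence * duration (duration in years)."""
    return WEEKS_PER_YEAR / duration_weeks

ratio = incidence_to_prevalence_ratio(37.7)  # duration reported in the abstract
print(f"annual incidence is about {ratio:.2f} times point prevalence")
```

With a 37.7-week mean duration the ratio comes out at roughly 1.4, in the same range as the "about one and a half times" stated in the abstract.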