9 results for Classifiers
in University of Queensland eSpace - Australia
Abstract:
Promiscuous human leukocyte antigen (HLA) binding peptides are ideal targets for vaccine development. Existing computational models for prediction of promiscuous peptides used hidden Markov models and artificial neural networks as prediction algorithms. We report a system based on support vector machines that outperforms previously published methods. Preliminary testing showed that it can predict peptides binding to HLA-A2 and -A3 super-type molecules with excellent accuracy, even for molecules where no binding data are currently available.
Abstract:
There has been an abundance of literature on the modelling of hydrocyclones over the past 30 years. However, in the comminution area at least, the more popular commercially available packages (e.g. JKSimMet, Limn, MODSIM) use the models developed by Nageswararao and Plitt in the 1970s, either as published at that time, or with minor modification. With the benefit of 30 years of hindsight, this paper discusses the assumptions and approximations used in developing these models. Differences in model structure and the choice of dependent and independent variables are also considered. Redundancies are highlighted and an assessment made of the general applicability of each of the models, their limitations and the sources of error in their model predictions. This paper provides the latest version of the Nageswararao model based on the above analysis, in a form that can readily be implemented in any suitable programming language, or within a spreadsheet. The Plitt model is also presented in similar form. (C) 2004 Elsevier Ltd. All rights reserved.
Abstract:
Classifications of perinatal deaths have been undertaken for surveillance of causes of death, but also for auditing individual deaths to identify suboptimal care at any level, so that preventive strategies may be implemented. This paper describes the history and development of the paired obstetric and neonatal Perinatal Society of Australia and New Zealand (PSANZ) classifications in the context of other classifications. The PSANZ Perinatal Death Classification is based on obstetric antecedent factors that initiated the sequence of events leading to the death, and was developed largely from the Aberdeen and Whitfield classifications. The PSANZ Neonatal Death Classification is based on fetal and neonatal factors associated with the death. The classifications, accessible on the PSANZ website (http://www.psanz.org), have definitions and guidelines for use, a high level of agreement between classifiers, and are now being used in nearly all Australian states and New Zealand.
Abstract:
The difficulties associated with slurry transportation in autogenous (ag) and semi-autogenous (sag) grinding mills have become more apparent in recent years with the increasing trend to build larger diameter mills for grinding high tonnages. This is particularly noticeable when ag/sag mills are run in closed circuit with classifiers such as fine screens/cyclones. Extensive test work carried out on the slurry removal mechanism in grate discharge mills (ag/sag) has shown that the conventional pulp lifters (radial and curved) have inherent drawbacks. They allow short-circuiting of the slurry from the pulp lifters into the grinding chamber, leading to slurry pool formation. The slurry pool absorbs part of the impact, thus inhibiting the grinding process. The Twin Chamber Pulp Lifter (TCPL), an efficient pulp lifter design developed by the authors, overcomes the inherent drawbacks of the conventional pulp lifters. Extensive testing in both laboratory and pilot scale mills has shown that the TCPL completely blocks the flow-back process, thus allowing the mill to operate close to its design flow capacity. The TCPL performance is also found to be independent of variations in charge volume and grate design, whereas these significantly affect the performance of conventional pulp lifters (radial and curved). (c) 2006 Elsevier B.V. All rights reserved.
Abstract:
Document classification is a supervised machine learning process in which predefined category labels are assigned to documents based on a hypothesis derived from a training set of labelled documents. Documents cannot be directly interpreted by a computer system unless they have been modelled as a collection of computable features. Rogati and Yang [M. Rogati and Y. Yang, Resource selection for domain-specific cross-lingual IR, in SIGIR 2004: Proceedings of the 27th annual international conference on Research and Development in Information Retrieval, ACM Press, Sheffield, United Kingdom, pp. 154-161.] pointed out that the effectiveness of a document classification system may vary across domains. This implies that the quality of the document model contributes to the effectiveness of document classification. Conventionally, model evaluation is accomplished by comparing the effectiveness scores of classifiers on model candidates. However, this kind of evaluation method may encounter either under-fitting or over-fitting problems, because the effectiveness scores are restricted by the learning capacities of the classifiers. We propose a model fitness evaluation method to determine whether a model is sufficient to distinguish positive and negative instances while still being able to provide satisfactory effectiveness with a small feature subset. Our experiments demonstrate how the fitness of models is assessed. The results of our work contribute to research on feature selection, dimensionality reduction and document classification.
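As a rough illustration of what "modelled as a collection of computable features" can mean, the sketch below turns short texts into bag-of-words count vectors. This is an assumed, generic document model, not the one evaluated in the paper; the function names and the toy documents are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Collect every distinct token across the corpus, in sorted order."""
    return sorted({token for doc in documents for token in tokenize(doc)})


def to_vector(text, vocabulary):
    """Represent one document as term counts over the fixed vocabulary."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocabulary]


# Hypothetical two-document corpus, used only to define the feature space.
docs = ["the mill grinds ore", "the classifier labels documents"]
vocabulary = build_vocabulary(docs)
vector = to_vector("the mill the ore", vocabulary)
print(vocabulary)
print(vector)
```

Each position in the vector is one feature; a feature-selection step would keep only a subset of these positions.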
Abstract:
We consider the statistical problem of catalogue matching from a machine learning perspective with the goal of producing probabilistic outputs, and using all available information. A framework is provided that unifies two existing approaches to producing probabilistic outputs in the literature, one based on combining distribution estimates and the other based on combining probabilistic classifiers. We apply both of these to the problem of matching the HI Parkes All Sky Survey radio catalogue with large positional uncertainties to the much denser SuperCOSMOS catalogue with much smaller positional uncertainties. We demonstrate the utility of probabilistic outputs by a controllable completeness and efficiency trade-off and by identifying objects that have high probability of being rare. Finally, possible biasing effects in the output of these classifiers are also highlighted and discussed.
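The completeness and efficiency trade-off mentioned above can be sketched by thresholding match probabilities. The definitions used here are assumptions, since the abstract does not state them: completeness is taken as the fraction of true matches that are accepted, and efficiency as the fraction of accepted matches that are true. The probabilities and truth labels are made-up illustrative values, not survey data.

```python
def completeness_efficiency(probs, is_true_match, threshold):
    """Accept candidates with probability >= threshold and score the result.

    completeness: accepted true matches / all true matches (assumed definition)
    efficiency:   accepted true matches / all accepted candidates (assumed definition)
    """
    accepted = [truth for p, truth in zip(probs, is_true_match) if p >= threshold]
    true_total = sum(is_true_match)
    true_accepted = sum(accepted)
    completeness = true_accepted / true_total if true_total else 0.0
    # With nothing accepted there are no false acceptances, so report 1.0.
    efficiency = true_accepted / len(accepted) if accepted else 1.0
    return completeness, efficiency


# Hypothetical match probabilities and ground truth for five candidate pairs.
probs = [0.95, 0.80, 0.60, 0.40, 0.10]
truth = [True, True, False, True, False]

for threshold in (0.3, 0.7):
    print(threshold, completeness_efficiency(probs, truth, threshold))
```

Raising the threshold in this toy example trades completeness for efficiency, which is the kind of control that probabilistic outputs make possible.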
Abstract:
The Tree Augmented Naïve Bayes (TAN) classifier relaxes the sweeping independence assumptions of the Naïve Bayes approach by taking account of conditional probabilities. It does this in a limited sense, by incorporating the conditional probability of each attribute given the class and (at most) one other attribute. The method of boosting has previously proven very effective in improving the performance of Naïve Bayes classifiers and in this paper, we investigate its effectiveness on application to the TAN classifier.
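The difference between the two models can be written as a factorisation of the joint distribution. The notation is an assumption introduced here for illustration: C is the class, A_1, ..., A_n are the attributes, and \pi(i) indexes the single attribute parent of A_i in the tree (the root attribute has none, so its factor conditions on C alone).

```latex
% Naive Bayes: every attribute depends only on the class.
P(C, A_1, \dots, A_n) = P(C) \prod_{i=1}^{n} P(A_i \mid C)

% TAN: every attribute depends on the class and at most one other attribute.
P(C, A_1, \dots, A_n) = P(C) \prod_{i=1}^{n} P\bigl(A_i \mid C, A_{\pi(i)}\bigr)
```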
Abstract:
In this paper we demonstrate that it is possible to gradually improve the performance of support vector machine (SVM) classifiers by using a genetic algorithm to select a sequence of training subsets from the available data. Performance improvement is possible because the SVM solution generally lies some distance away from the Bayes optimal in the space of learning parameters. We illustrate performance improvements on a number of benchmark data sets.
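The general idea of evolving training subsets can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: a nearest-centroid classifier stands in for the SVM so the example needs only the standard library, the data are synthetic, and the population size, crossover and mutation scheme are arbitrary choices.

```python
import random


def centroid_fit(samples):
    """Train a nearest-centroid classifier (a stand-in for an SVM)."""
    sums, counts = {}, {}
    for (x, y), label in samples:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {lab: (sums[lab][0] / counts[lab], sums[lab][1] / counts[lab]) for lab in sums}


def centroid_predict(model, point):
    """Return the label whose centroid is closest to the point."""
    return min(model, key=lambda lab: (point[0] - model[lab][0]) ** 2
               + (point[1] - model[lab][1]) ** 2)


def fitness(mask, train, val):
    """Validation accuracy of a classifier trained on the masked subset."""
    chosen = [sample for keep, sample in zip(mask, train) if keep]
    if len({label for _, label in chosen}) < 2:
        return 0.0  # a subset missing a class cannot train a two-class model
    model = centroid_fit(chosen)
    return sum(centroid_predict(model, p) == lab for p, lab in val) / len(val)


def evolve(train, val, generations=20, pop_size=12, seed=0):
    """Evolve boolean masks over the training set; the best half survives."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in train] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, train, val), reverse=True)
        history.append(fitness(pop[0], train, val))
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(train))   # one-point crossover
            child = a[:cut] + b[cut:]
            flip = rng.randrange(len(child))     # single-bit mutation
            child[flip] = not child[flip]
            children.append(child)
        pop = parents + children
    pop.sort(key=lambda m: fitness(m, train, val), reverse=True)
    return pop[0], history


def make_data(rng, n):
    """Synthetic two-class data: Gaussian blobs centred at (0, 0) and (2, 2)."""
    data = []
    for _ in range(n):
        label = rng.choice([0, 1])
        centre = 0.0 if label == 0 else 2.0
        data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    return data


data_rng = random.Random(1)
train, val = make_data(data_rng, 30), make_data(data_rng, 30)
best_mask, history = evolve(train, val, generations=10)
print(sum(best_mask), "training points kept; best accuracy per generation:", history)
```

Because the best half of each generation is carried over unchanged, the best validation accuracy never decreases from one generation to the next. A real experiment would score the final subset on held-out test data rather than on the validation set used for selection.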
Abstract:
Conventionally, document classification research focuses on improving the learning capabilities of classifiers. Nevertheless, according to our observation, the effectiveness of classification is limited by the suitability of the document representation. Intuitively, the more features that are used in a representation, the more comprehensively documents are represented. However, if a representation contains too many irrelevant features, the classifier suffers not only from the curse of high dimensionality but also from overfitting. To address this problem of the suitability of document representations, we present a classifier-independent approach to measuring the effectiveness of document representations. Our approach utilises a labelled document corpus to estimate the distribution of documents in the feature space. By looking at documents in this way, we can clearly identify the contributions made by different features toward document classification. Some experiments have been performed to show how the effectiveness is evaluated. Our approach can be used as a tool to assist feature selection, dimensionality reduction and document classification.