967 results for Crossed Classification Models


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background: The structure of proteins may change as a result of the inherent flexibility of some protein regions. We develop and explore probabilistic machine learning methods for predicting a continuum secondary structure, i.e. assigning probabilities to the conformational states of a residue. We train our methods using data derived from high-quality NMR models. Results: Several probabilistic models not only successfully estimate the continuum secondary structure, but also provide a categorical output on par with models directly trained on categorical data. Importantly, models trained on the continuum secondary structure are also better than their categorical counterparts at identifying the conformational state for structurally ambivalent residues. Conclusion: Cascaded probabilistic neural networks trained on the continuum secondary structure exhibit better accuracy in structurally ambivalent regions of proteins, while sustaining an overall classification accuracy on par with standard, categorical prediction methods.
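As a rough illustration of what a "continuum secondary structure" target might look like, the sketch below turns per-model state assignments from an ensemble into per-residue probabilities and then collapses them to a categorical string. The three-state alphabet, the frequency-counting rule, and the function names are assumptions made for illustration; the abstract does not state how the targets were derived from the NMR models.

```python
from collections import Counter

# Assumed three-state alphabet: helix (H), strand (E), coil (C).
STATES = ("H", "E", "C")

def continuum_structure(assignments):
    """Per-residue state frequencies across an ensemble of models.

    `assignments` is a list of equal-length strings, one per model.
    """
    n_models = len(assignments)
    profile = []
    for pos in range(len(assignments[0])):
        counts = Counter(model[pos] for model in assignments)
        profile.append({s: counts[s] / n_models for s in STATES})
    return profile

def categorical(profile):
    """Collapse a probability profile to the most probable state per residue."""
    return "".join(max(STATES, key=lambda s: p[s]) for p in profile)
```

A residue that is helical in two of three models gets probability 2/3 for H, which preserves the ambivalence that a single categorical label would discard.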

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Traditional vegetation mapping methods use high-cost, labour-intensive aerial photography interpretation. This approach can be subjective and is limited by factors such as the extent of remnant vegetation, and the differing scale and quality of aerial photography over time. An alternative approach is proposed which integrates a data model, a statistical model and an ecological model using sophisticated Geographic Information Systems (GIS) techniques and rule-based systems to support fine-scale vegetation community modelling. This approach is based on a more realistic representation of vegetation patterns with transitional gradients from one vegetation community to another. Arbitrary, though often unrealistic, sharp boundaries can be imposed on the model by the application of statistical methods. This GIS-integrated multivariate approach is applied to the problem of vegetation mapping in the complex vegetation communities of the Innisfail Lowlands in the Wet Tropics bioregion of Northeastern Australia. The paper presents the full cycle of this vegetation modelling approach, including sampling sites, variable selection, model selection, model implementation, internal model assessment, model prediction assessments, integration of discrete vegetation community models to generate a composite pre-clearing vegetation map, independent data set model validation and scale assessments of model predictions. An accurate pre-clearing vegetation map of the Innisfail Lowlands was generated (r² = 0.83) through GIS integration of 28 separate statistical models. This modelling approach has good potential for wider application, including the provision of vital information for conservation planning and management, a scientific basis for rehabilitation of disturbed and cleared areas, and a viable method for the production of adequate vegetation maps for conservation and forestry planning of poorly-studied areas. (c) 2006 Elsevier B.V. All rights reserved.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Racing algorithms have recently been proposed as a general-purpose method for performing model selection in machine learning algorithms. In this paper, we present an empirical study of the Hoeffding racing algorithm for selecting the k parameter in a simple k-nearest neighbor classifier. Fifteen widely-used classification datasets from UCI are used and experiments conducted across different confidence levels for racing. The results reveal a significant amount of sensitivity of the k-nn classifier to its model parameter value. The Hoeffding racing algorithm also varies widely in its performance, in terms of the computational savings gained over an exhaustive evaluation. While in some cases the savings gained are quite small, the racing algorithm proved to be highly robust to the possibility of erroneously eliminating the optimal models. All results were strongly dependent on the datasets used.
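The idea of racing candidate values of k can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes leave-one-out 0/1 errors as the racing statistic, squared Euclidean distance, and the usual two-sided Hoeffding bound with error range 1. The names `hoeffding_epsilon` and `race_k` are hypothetical.

```python
import math
import random
from collections import Counter

def hoeffding_epsilon(n, delta, value_range=1.0):
    """Half-width of the Hoeffding confidence interval after n observations."""
    return value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))

def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def race_k(data, candidates, delta=0.05, seed=0):
    """Race candidate k values on leave-one-out errors over non-empty `data`.

    `data` is a list of (features, label) pairs. Returns the surviving
    candidates and their mean error over all visited points.
    """
    order = list(range(len(data)))
    random.Random(seed).shuffle(order)
    errors = {k: 0 for k in candidates}
    alive = set(candidates)
    for n, i in enumerate(order, start=1):
        x, y = data[i]
        rest = data[:i] + data[i + 1:]
        neighbours = sorted(rest, key=lambda p: dist2(p[0], x))
        for k in alive:
            votes = Counter(label for _, label in neighbours[:k])
            errors[k] += int(votes.most_common(1)[0][0] != y)
        eps = hoeffding_epsilon(n, delta)
        # Drop any candidate whose lower bound exceeds the best upper bound.
        best_upper = min(errors[k] / n + eps for k in alive)
        alive = {k for k in alive if errors[k] / n - eps <= best_upper}
    return alive, {k: errors[k] / len(order) for k in alive}
```

Eliminated candidates are no longer evaluated, which is where the computational saving over exhaustive evaluation comes from. When the intervals shrink slowly relative to the gaps between candidates, few are eliminated and the saving is small.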

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We describe a method of recognizing handwritten digits by fitting generative models that are built from deformable B-splines with Gaussian "ink generators" spaced along the length of the spline. The splines are adjusted using a novel elastic matching procedure based on the Expectation Maximization (EM) algorithm that maximizes the likelihood of the model generating the data. This approach has many advantages. (1) After identifying the model most likely to have generated the data, the system not only produces a classification of the digit but also a rich description of the instantiation parameters which can yield information such as the writing style. (2) During the process of explaining the image, generative models can perform recognition-driven segmentation. (3) The method involves a relatively small number of parameters and hence training is relatively easy and fast. (4) Unlike many other recognition schemes, it does not rely on some form of pre-normalization of input images, but can handle arbitrary scalings, translations and a limited degree of image rotation. We have demonstrated that our method of fitting models to images does not get trapped in poor local minima. The main disadvantage of the method is that it requires much more computation than more standard OCR techniques.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Radial Basis Function networks with linear outputs are often used in regression problems because they can be substantially faster to train than Multi-layer Perceptrons. For classification problems, the use of linear outputs is less appropriate as the outputs are not guaranteed to represent probabilities. In this paper we show how RBFs with logistic and softmax outputs can be trained efficiently using algorithms derived from Generalised Linear Models. This approach is compared with standard non-linear optimisation algorithms on a number of datasets.
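To make the idea concrete, here is a minimal sketch of fitting a logistic output layer on top of fixed Gaussian basis functions with iteratively reweighted least squares (IRLS), the standard fitting procedure for Generalised Linear Models. The abstract does not say which GLM-derived algorithm is used, so IRLS is an assumption here, as are the scalar inputs, the fixed centres, the small ridge term added for numerical stability, and all function names.

```python
import math

def rbf_features(x, centres, width):
    """Gaussian basis activations for a scalar input, preceded by a bias term."""
    return [1.0] + [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centres]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_irls(xs, ys, centres, width, ridge=1e-2, iterations=15):
    """Newton/IRLS updates for the output weights; the basis stays fixed."""
    phi = [rbf_features(x, centres, width) for x in xs]
    d = len(phi[0])
    w = [0.0] * d
    for _ in range(iterations):
        p = [sigmoid(sum(wj * fj for wj, fj in zip(w, row))) for row in phi]
        # Gradient: Phi^T (p - y) + ridge * w
        grad = [sum(row[j] * (pi - yi) for row, pi, yi in zip(phi, p, ys))
                + ridge * w[j] for j in range(d)]
        # Hessian: Phi^T R Phi + ridge * I, with R = diag(p * (1 - p))
        hess = [[sum(row[j] * row[k] * pi * (1.0 - pi) for row, pi in zip(phi, p))
                 + (ridge if j == k else 0.0) for k in range(d)] for j in range(d)]
        step = solve(hess, grad)
        w = [wj - sj for wj, sj in zip(w, step)]
    return w

def predict_proba(x, w, centres, width):
    """Probability of class 1; always lies strictly between 0 and 1."""
    return sigmoid(sum(wj * fj for wj, fj in zip(w, rbf_features(x, centres, width))))
```

Because the output passes through the logistic function, predictions can be read as class probabilities, which a linear output layer does not guarantee. A softmax output generalises this to more than two classes.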

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The scaling problems which afflict attempts to optimise neural networks (NNs) with genetic algorithms (GAs) are disclosed. A novel GA-NN hybrid is introduced, based on the bumptree, a little-used connectionist model. As well as being computationally efficient, the bumptree is shown to be more amenable to genetic coding than other NN models. A hierarchical genetic coding scheme is developed for the bumptree and shown to have low redundancy, as well as being complete and closed with respect to the search space. When applied to optimising bumptree architectures for classification problems, the GA discovers bumptrees which significantly out-perform those constructed using a standard algorithm. The fields of artificial life, control and robotics are identified as likely application areas for the evolutionary optimisation of NNs. An artificial life case-study is presented and discussed. Experiments are reported which show that the GA-bumptree is able to learn simulated pole balancing and car parking tasks using only limited environmental feedback. A simple modification of the fitness function allows the GA-bumptree to learn mappings which are multi-modal, such as robot arm inverse kinematics. The dynamics of the 'geographic speciation' selection model used by the GA-bumptree are investigated empirically and the convergence profile is introduced as an analytical tool. The relationships between the rate of genetic convergence and the phenomena of speciation, genetic drift and punctuated equilibrium are discussed. The importance of genetic linkage to GA design is discussed and two new recombination operators are introduced. The first, linkage mapped crossover (LMX), is shown to be a generalisation of existing crossover operators. LMX provides a new framework for incorporating prior knowledge into GAs. Its adaptive form, ALMX, is shown to be able to infer linkage relationships automatically during genetic search.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This thesis presents a thorough and principled investigation into the application of artificial neural networks to the biological monitoring of freshwater. It contains original ideas on the classification and interpretation of benthic macroinvertebrates, and aims to demonstrate their superiority over the biotic systems currently used in the UK to report river water quality. The conceptual basis of a new biological classification system is described, and a full review and analysis of a number of river data sets is presented. The biological classification is compared to the common biotic systems using data from the Upper Trent catchment. These data contained 292 expertly classified invertebrate samples identified to mixed taxonomic levels. The neural network experimental work concentrates on the classification of the invertebrate samples into biological class, where only a subset of the sample is used to form the classification. Other experimentation is conducted into the identification of novel input samples, the classification of samples from different biotopes and the use of prior information in the neural network models. The biological classification is shown to provide an intuitive interpretation of a graphical representation, generated without reference to the class labels, of the Upper Trent data. The selection of key indicator taxa is considered using three different approaches: one novel, one from information theory and one from classical statistical methods. Good indicators of quality class based on these analyses are found to be in good agreement with those chosen by a domain expert. The change in information associated with different levels of identification and enumeration of taxa is quantified. The feasibility of using neural network classifiers and predictors to develop numeric criteria for the biological assessment of sediment contamination in the Great Lakes is also investigated.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The traditional method of classifying neurodegenerative diseases is based on the original clinico-pathological concept supported by 'consensus' criteria and data from molecular pathological studies. This review discusses first, current problems in classification resulting from the coexistence of different classificatory schemes, the presence of disease heterogeneity and multiple pathologies, the use of 'signature' brain lesions in diagnosis, and the existence of pathological processes common to different diseases. Second, three models of neurodegenerative disease are proposed: (1) that distinct diseases exist ('discrete' model), (2) that relatively distinct diseases exist but exhibit overlapping features ('overlap' model), and (3) that distinct diseases do not exist and neurodegenerative disease is a 'continuum' in which there is continuous variation in clinical/pathological features from one case to another ('continuum' model). Third, to distinguish between models, the distribution of the most important molecular 'signature' lesions across the different diseases is reviewed. Such lesions often have poor 'fidelity', i.e., they are not unique to individual disorders but are distributed across many diseases consistent with the overlap or continuum models. Fourth, the question of whether the current classificatory system should be rejected is considered and three alternatives are proposed, viz., objective classification, classification for convenience (a 'dissection'), or analysis as a continuum.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper presents a comparative study of three closely related Bayesian models for unsupervised document-level sentiment classification, namely, the latent sentiment model (LSM), the joint sentiment-topic (JST) model, and the Reverse-JST model. Extensive experiments have been conducted on two corpora, the movie review dataset and the multi-domain sentiment dataset. It has been found that while all three models achieve either better or comparable performance on these two corpora when compared to the existing unsupervised sentiment classification approaches, both JST and Reverse-JST are able to extract sentiment-oriented topics. In addition, Reverse-JST always performs worse than JST, suggesting that the JST model is more appropriate for joint sentiment topic detection.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Organisations have been approaching servitisation in an unstructured fashion. This is partially because there is insufficient understanding of the different types of Product-Service offerings. Therefore, a more detailed understanding of Product-Service types might advance the collective knowledge and assist organisations that are considering a servitisation strategy. Current models discuss specific aspects on the basis of few (or sometimes single) dimensions. In this paper, we develop a comprehensive model for classifying traditional and green Product-Service offerings, thus combining business and green offerings in a single model. We describe the model building process and its practical application in a case study. The model reveals the various traditional and green options available to companies and identifies how to compete between services; it allows servitisation positions to be identified such that a company may track its journey over time. Finally, it fosters the introduction of innovative Product-Service Systems as promising business models to address environmental and social challenges. © 2013 Elsevier Ltd. All rights reserved.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A major drawback of artificial neural networks is their black-box character. Rule extraction algorithms are therefore becoming increasingly important for explaining what a trained network has learned. In this paper, we use a method for symbolic knowledge extraction from neural networks once they have been trained on the desired function. The method is based on the weights of the trained network. It allows knowledge extraction from neural networks with continuous inputs and output, as well as rule extraction. An example application is shown, based on extracting the average load demand of a power plant.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Prognostic procedures can be based on ranked linear models. Ranked regression-type models are designed on the basis of feature vectors combined with a set of relations defined on selected pairs of these vectors. Feature vectors are composed of numerical results of measurements on particular objects or events. Ranked relations defined on selected pairs of feature vectors represent additional knowledge and can reflect experts' opinions about the objects considered. Ranked models have the form of linear transformations of feature vectors onto a line which preserve a given set of relations in the best manner possible. Ranked models can be designed through the minimization of a special type of convex and piecewise linear (CPL) criterion function. Some sets of ranked relations cannot be well represented by one ranked model. Decomposing the global model into a family of local ranked models could improve the representation. A procedure for decomposing ranked models is described in this paper.
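A ranked linear model of this kind can be sketched as follows. The abstract does not give the exact CPL criterion or its minimisation procedure, so this example assumes a hinge-style penalty on each ordered pair and substitutes plain subgradient descent for whatever algorithm the paper uses. The function names, margin and step size are illustrative.

```python
def score(w, x):
    """Project a feature vector onto the ranking line."""
    return sum(wj * xj for wj, xj in zip(w, x))

def cpl_criterion(w, xs, pairs, margin=1.0):
    """Sum of convex, piecewise linear penalties over ranked pairs.

    Each pair (lo, hi) states that xs[hi] should score above xs[lo]
    by at least `margin`.
    """
    return sum(max(0.0, margin - (score(w, xs[hi]) - score(w, xs[lo])))
               for lo, hi in pairs)

def fit_ranked_model(xs, pairs, steps=200, lr=0.05, margin=1.0):
    """Minimise the criterion by subgradient descent (an assumed optimiser)."""
    w = [0.0] * len(xs[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for lo, hi in pairs:
            diff = [b - a for a, b in zip(xs[lo], xs[hi])]
            if score(w, diff) < margin:
                # A violated pair contributes -diff to the subgradient.
                for j, dj in enumerate(diff):
                    grad[j] -= dj
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w
```

If no single weight vector can drive the criterion close to zero, the relations cannot all be represented on one line, which is the situation that motivates decomposing the global model into local ones.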

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Mathematics Subject Classification: 26A33, 47B06, 47G30, 60G50, 60G52, 60G60.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Mathematics Subject Classification: 26A33, 45K05, 60J60, 60G50, 65N06, 80-99.