930 results for selection methods


Relevance: 70.00%

Abstract:

This article presents a new method for predicting viral resistance to seven protease inhibitors from the HIV-1 genotype, and for identifying the positions in the protease gene at which the specific nature of the mutation affects resistance. The neural network Analog ARTMAP predicts protease inhibitor resistance from viral genotypes. A feature selection method detects genetic positions that contribute to resistance both alone and through interactions with other positions. This method has identified positions 35, 37, 62, and 77, where traditional feature selection methods have not detected a contribution to resistance. At several positions in the protease gene, mutations confer differing degrees of resistance, depending on the specific amino acid to which the sequence has mutated. To find these positions, an Amino Acid Space is introduced to represent genes in a vector space that captures the functional similarity between amino acid pairs. Feature selection identifies several new positions, including 36, 37, and 43, with amino acid-specific contributions to resistance. Analog ARTMAP networks applied to inputs that represent specific amino acids at these positions perform better than networks that use only mutation locations.

Relevance: 70.00%

Abstract:

Although many feature selection methods for classification have been developed, there is a need to identify genes in high-dimensional data with censored survival outcomes. Traditional methods for gene selection in classification problems have several drawbacks. First, the majority of the gene selection approaches for classification are single-gene based. Second, many of the gene selection procedures are not embedded within the algorithm itself. The technique of random forests has been found to perform well in high-dimensional data settings with survival outcomes. It also has an embedded feature to identify variables of importance. Therefore, it is an ideal candidate for gene selection in high-dimensional data with survival outcomes. In this paper, we develop a novel method based on random forests to identify a set of prognostic genes. We compare our method with several machine learning methods and various node split criteria using several real data sets. Our method performed well in both simulations and real data analysis. Additionally, we have shown the advantages of our approach over single-gene-based approaches. Our method incorporates multivariate correlations in microarray data for survival outcomes. The described method allows us to better utilize the information available from microarray data with survival outcomes.

Relevance: 70.00%

Abstract:

A number of medicine selection methods have been used worldwide for formulary purposes. In Northern Ireland, integrated medicines management is being developed, and related projects have been carried out. This paper deals with the description of the STEPS (Safe Therapeutic Economic Pharmaceutical Selection) programme. The paper outlines the development of STEPS and its application as an element of a cost-effective medicines-management process in Northern Ireland.

Relevance: 70.00%

Abstract:

A number of neural networks can be formulated as linear-in-the-parameters models. Training such networks can be transformed into a model selection problem in which a compact model is selected from all the candidates using subset selection algorithms. Forward selection methods are popular fast subset selection approaches. However, they may produce only suboptimal models and can become trapped in a local minimum. More recently, a two-stage fast recursive algorithm (TSFRA) combining forward selection and backward model refinement has been proposed to improve the compactness and generalization performance of the model. This paper proposes unified two-stage orthogonal least squares methods instead of the fast recursive-based methods. In contrast to the TSFRA, this paper derives a new simplified relationship between the forward and the backward stages to avoid repetitive computations using the inherent orthogonal properties of the least squares methods. Furthermore, a new term exchanging scheme for backward model refinement is introduced to reduce computational demand. Finally, given the error reduction ratio criterion, effective and efficient forward and backward subset selection procedures are proposed. Extensive examples are presented to demonstrate the improved compactness of models constructed by the proposed technique in comparison with some popular methods.
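
As an illustration of the forward stage only, the following is a minimal sketch of classical forward subset selection driven by the error reduction ratio (ERR), using Gram-Schmidt orthogonalisation. It is not the two-stage method proposed in the paper: the backward refinement and the simplified forward/backward relationship are omitted, and the function name `forward_select_err` and its parameters are hypothetical.

```python
import numpy as np

def forward_select_err(P, y, n_terms):
    """Greedy forward selection by error reduction ratio (classical OLS).

    P: (n_samples, n_candidates) matrix of candidate regressors.
    y: (n_samples,) target vector.
    Returns the selected column indices and the ERR of each selected term.
    """
    P = np.asarray(P, dtype=float)
    y = np.asarray(y, dtype=float)
    yy = y @ y
    selected, errs = [], []
    W = []  # orthogonalised versions of the columns selected so far
    for _ in range(n_terms):
        best_j, best_err, best_w = None, -1.0, None
        for j in range(P.shape[1]):
            if j in selected:
                continue
            # Gram-Schmidt: remove the part of column j explained by earlier terms
            w = P[:, j].copy()
            for q in W:
                w -= (q @ P[:, j]) / (q @ q) * q
            ww = w @ w
            if ww < 1e-12:  # column is (numerically) redundant
                continue
            err = (w @ y) ** 2 / (ww * yy)  # share of y's energy explained
            if err > best_err:
                best_j, best_err, best_w = j, err, w
        if best_j is None:
            break
        selected.append(best_j)
        errs.append(best_err)
        W.append(best_w)
    return selected, errs
```

Because each step commits to the single best remaining term, the result can be suboptimal, which is the limitation that the backward refinement stage described in the abstract is meant to address.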

Relevance: 70.00%

Abstract:

This paper investigates the gene selection problem for microarray data with small samples and variant correlation. Most existing algorithms require expensive computational effort, especially when thousands of genes are involved. The main objective of this paper is to effectively select the most informative genes from microarray data while keeping the computational expense affordable. This is achieved by proposing a novel forward gene selection algorithm (FGSA). To overcome the small-sample problem, the augmented data technique is first employed to produce an augmented data set. Taking inspiration from other gene selection methods, the L2-norm penalty is then introduced into the recently proposed fast regression algorithm to achieve the group selection ability. Finally, by defining a proper regression context, the proposed method can be implemented efficiently in software, which significantly reduces the computational burden. Both computational complexity analysis and simulation results confirm the effectiveness of the proposed algorithm in comparison with other approaches.

Relevance: 70.00%

Abstract:

This article provides an analysis of the leadership selection methods adopted by Northern Ireland's five main parties. Drawing on data from interviews with party elites and internal party documents, it sheds light on an important element of intra-party organisation in the region and constitutes a rare case-study of leadership selection in a consociational democracy. By accounting for instances of organisational reform, this article also reveals the extent to which Northern Ireland's parties align with the wider comparative trend of leadership ‘democratisation’. In terms of ‘who’ selects party leaders, the analysis finds a substantial degree of organisational heterogeneity and a reasonably high rate of democratisation. Northern Ireland's parties also prove rather exceptional in their universal adoption of short fixed terms for party leaders and, in the case of three of the parties, their preference for high candidacy thresholds.

Relevance: 70.00%

Abstract:

Building on the instrumental model of group conflict (IMGC), the present experiment investigates support for discriminatory and meritocratic university selection methods in a sample of local and immigrant students. Results showed that a larger proportion of local students supported the selection method that favors them over immigrants than the method that selects the best applicants regardless of origin. Supporting the assumption of the IMGC, this effect was stronger for locals who perceived immigrants as competing for resources. Immigrant students supported the meritocratic selection method more strongly than the one that discriminated against them. However, contrary to the assumption of the IMGC, this effect was present only in students who perceived immigrants as weakly competing for locals' resources. Results demonstrate that selection methods used at university can be perceived differently depending on students' origin. Further, they suggest that the mechanisms underlying the perception of discriminatory and meritocratic selection methods differ between local and immigrant students. Hence, the present experiment makes a theoretical contribution to the IMGC by delimiting its assumptions to the ingroup facing a competitive situation with a relevant outgroup. Practical implications for university recruitment policies are discussed.

Relevance: 70.00%

Abstract:

Nonlinear adjustment toward long-run price equilibrium relationships in the sugar-ethanol-oil nexus in Brazil is examined. We develop generalized bivariate error correction models that allow for cointegration between sugar, ethanol, and oil prices, where dynamic adjustments are potentially nonlinear functions of the disequilibrium errors. A range of models are estimated using Bayesian Monte Carlo Markov Chain algorithms and compared using Bayesian model selection methods. The results suggest that the long-run drivers of Brazilian sugar prices are oil prices and that there are nonlinearities in the adjustment processes of sugar and ethanol prices to oil price but linear adjustment between ethanol and sugar prices.

Relevance: 70.00%

Abstract:

1. Species-based indices are frequently employed as surrogates for wider biodiversity health and measures of environmental condition. Species selection is crucial in determining an indicator's metric value and hence the validity of the interpretation of ecosystem condition and function it provides, yet an objective process to identify appropriate indicator species is frequently lacking. 2. An effective indicator needs to (i) be representative, reflecting the status of wider biodiversity; (ii) be reactive, acting as an early-warning system for detrimental changes in environmental conditions; (iii) respond to change in a predictable way. We present an objective, niche-based approach for species' selection, founded on a coarse categorisation of species' niche space and key resource requirements, which ensures the resultant indicator has these key attributes. 3. We use UK farmland birds as a case study to demonstrate this approach, identifying an optimal indicator set containing 12 species. In contrast to the 19 species included in the farmland bird index (FBI), a key UK biodiversity indicator that contributes to one of the UK Government's headline indicators of sustainability, the niche space occupied by these species fully encompasses that occupied by the wider community of 62 species. 4. We demonstrate that the response of these 12 species to land-use change is a strong correlate to that of the wider farmland bird community. Furthermore, the temporal dynamics of the index based on their population trends closely match the population dynamics of the wider community. However, in both analyses, the magnitude of the change in our indicator was significantly greater, allowing this indicator to act as an early-warning system. 5. Ecological indicators are embedded in environmental management, sustainable development and biodiversity conservation policy and practice, where they act as metrics against which progress towards national, regional and global targets can be measured. Adopting this niche-based approach for objective selection of indicator species will facilitate the development of sensitive and representative indices for a range of taxonomic groups, habitats and spatial scales.

Relevance: 70.00%

Abstract:

Approximate Bayesian computation (ABC) methods make use of comparisons between simulated and observed summary statistics to overcome the problem of computationally intractable likelihood functions. As the practical implementation of ABC requires computations based on vectors of summary statistics, rather than full data sets, a central question is how to derive low-dimensional summary statistics from the observed data with minimal loss of information. In this article we provide a comprehensive review and comparison of the performance of the principal methods of dimension reduction proposed in the ABC literature. The methods are split into three nonmutually exclusive classes consisting of best subset selection methods, projection techniques and regularization. In addition, we introduce two new methods of dimension reduction. The first is a best subset selection method based on Akaike and Bayesian information criteria, and the second uses ridge regression as a regularization procedure. We illustrate the performance of these dimension reduction techniques through the analysis of three challenging models and data sets.
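
To make the role of summary statistics concrete, here is a minimal sketch of the basic ABC rejection sampler, in which a parameter draw is kept only when the summary of its simulated data lies close to the summary of the observed data. It does not implement any of the dimension reduction methods reviewed in the article; the function name `abc_rejection` and all of its parameters are hypothetical, and the Euclidean distance is an assumed choice.

```python
import numpy as np

def abc_rejection(observed, simulate, summarise, prior_sample, n_draws, tol, rng):
    """Basic ABC rejection sampling on summary statistics.

    simulate(theta, rng) -> simulated data set for parameter theta
    summarise(data)      -> low-dimensional summary (scalar or vector)
    prior_sample(rng)    -> one draw from the prior
    Returns the accepted parameter draws as an array.
    """
    s_obs = np.atleast_1d(summarise(observed))
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        s_sim = np.atleast_1d(summarise(simulate(theta, rng)))
        # Accept when simulated and observed summaries are within tolerance
        if np.linalg.norm(s_sim - s_obs) <= tol:
            accepted.append(theta)
    return np.array(accepted)
```

The acceptance rate falls quickly as the summary vector grows, which is why the choice of a low-dimensional but informative `summarise` function matters so much in practice.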

Relevance: 70.00%

Abstract:

Resistance to ivermectin (IVM) in field populations of Rhipicephalus microplus in Brazil has been observed since 2001. In this work, four selection methods (infestations with: (1) IVM-treated larvae, (2) larvae from IVM-treated adult female ticks, (3) larvae from IVM-treated adult female ticks on an IVM-treated host, and (4) larvae obtained from IVM-treated females that produced eggs with a high eclosion rate) were used on a field population with an initial IVM resistance ratio at LC50 (RR50) of 1.37, with the objective of experimentally obtaining a highly resistant strain. After ten generations, using these methods combined, the final RR50 was 8.06. This work shows for the first time that it was possible to increase IVM resistance in R. microplus under laboratory conditions. The establishment of a drug-resistant R. microplus strain is a fundamental first step for further research into the mechanisms of ivermectin resistance in R. microplus and, potentially, methods to control this resistance. (C) 2009 Elsevier B.V. All rights reserved.

Relevance: 70.00%

Abstract:

This paper proposes a filter-based algorithm for feature selection. The filter is based on the partitioning of the set of features into clusters. The number of clusters, and consequently the cardinality of the subset of selected features, is automatically estimated from data. The computational complexity of the proposed algorithm is also investigated. A variant of this filter that considers feature-class correlations is also proposed for classification problems. Empirical results involving ten datasets illustrate the performance of the developed algorithm, which in general has obtained competitive results in terms of classification accuracy when compared to state-of-the-art algorithms that find clusters of features. We show that, if computational efficiency is an important issue, then the proposed filter may be preferred over its counterparts, thus becoming eligible to join a pool of feature selection algorithms to be used in practice. As an additional contribution of this work, a theoretical framework is used to formally analyze some properties of feature selection methods that rely on finding clusters of features. (C) 2011 Elsevier Inc. All rights reserved.

Relevance: 70.00%

Abstract:

The Generalized Estimating Equations (GEE) method is one of the most commonly used statistical methods for the analysis of longitudinal data in epidemiological studies. This method requires a working correlation structure to be specified for the repeated measures of a subject's outcome variable. However, statistical criteria for selecting the best correlation structure and the best subset of explanatory variables in GEE have only recently become available, because the GEE method was developed on the basis of quasi-likelihood theory. Maximum likelihood based model selection methods, such as the widely used Akaike Information Criterion (AIC), are not directly applicable to GEE. Pan (2001) proposed a selection method called QIC, which can be used to select the best correlation structure and the best subset of explanatory variables. Based on the QIC method, we developed a computing program to calculate the QIC value for a range of different distributions, link functions and correlation structures. This program was written in Stata. In this article, we introduce this program and demonstrate, through several representative examples, how to use it to select the most parsimonious model in GEE analyses of longitudinal data.
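
For orientation, the criterion attributed to Pan (2001) is commonly written as follows. The notation is an assumption introduced here and does not come from the abstract: R is the working correlation structure, I is the independence structure, Q is the quasi-likelihood, \hat{\Omega}_I is the inverse of the model-based covariance estimate under independence, and \hat{V}_R is the robust (sandwich) covariance estimate under R.

```latex
\mathrm{QIC}(R) \;=\; -2\, Q\bigl(\hat{\beta}(R);\, I\bigr) \;+\; 2\, \operatorname{trace}\bigl(\hat{\Omega}_I\, \hat{V}_R\bigr)
```

As with AIC, smaller values are preferred: the first term rewards fit and the trace term acts as a penalty that plays the role of the parameter count.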

Relevance: 70.00%

Abstract:

The purpose of instance selection is to identify which instances (examples, patterns) in a large dataset should be selected as representatives of the entire dataset, without significant loss of information. When a machine learning method is applied to the reduced dataset, the accuracy of the model should not be significantly worse than if the same method were applied to the entire dataset. The reducibility of any dataset, and hence the success of instance selection methods, depends on the characteristics of the dataset as well as on the machine learning method. This paper adopts a meta-learning approach, via an empirical study of 112 classification datasets from the UCI Repository [1], to explore the relationship between data characteristics, machine learning methods, and the success of instance selection methods.

Relevance: 70.00%

Abstract:

In this paper, we investigate parameter selection for Eigenfaces. Our focus is on the eigenvector and threshold selection issues. We propose a systematic approach to selecting the eigenvectors based on the relative errors of the eigenvalues of the covariance matrix. In addition, we propose a method for selecting the classification threshold that utilizes the information obtained from the training data set. Experiments were conducted on two benchmark face databases, ORL and AMP, with results indicating that the proposed automatic eigenvector and threshold selection methods produce better recognition performance in terms of precision and recall rates. Furthermore, we show that the eigenvector selection method outperforms the energy and stretching dimension methods in terms of the number of selected eigenvectors and computation cost.
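
For context, the "energy" baseline mentioned above is usually understood as keeping the smallest number of leading eigenvectors whose eigenvalues account for a chosen fraction of the total variance. The sketch below assumes that reading; it is not the relative-error method proposed in the paper, and the function name `select_by_energy` and the default threshold of 0.95 are hypothetical.

```python
import numpy as np

def select_by_energy(X, energy=0.95):
    """Keep the leading covariance eigenvectors that retain `energy` of the variance.

    X: (n_samples, n_features) data matrix, e.g. flattened face images.
    Returns (eigenvectors as columns, corresponding eigenvalues).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                # centre the data
    cov = Xc.T @ Xc / (X.shape[0] - 1)     # sample covariance matrix
    vals, vecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]         # reorder to descending
    vals = np.clip(vals[order], 0.0, None) # guard against tiny negative values
    vecs = vecs[:, order]
    ratio = np.cumsum(vals) / vals.sum()   # cumulative share of total variance
    # Smallest k whose cumulative share reaches the requested energy
    k = min(int(np.searchsorted(ratio, energy)) + 1, len(vals))
    return vecs[:, :k], vals[:k]
```

For real face images the feature dimension is far larger than the number of samples, so practical Eigenfaces code typically works with the smaller Gram matrix instead; the direct covariance form is used here only for clarity.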