36 results for model selection in binary regression
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
Given $n$ independent replicates of a jointly distributed pair $(X,Y)\in {\cal R}^d \times {\cal R}$, we wish to select from a fixed sequence of model classes ${\cal F}_1, {\cal F}_2, \ldots$ a deterministic prediction rule $f: {\cal R}^d \to {\cal R}$ whose risk is small. We investigate the possibility of empirically assessing the {\em complexity} of each model class, that is, the actual difficulty of the estimation problem within each class. The estimated complexities are in turn used to define an adaptive model selection procedure, which is based on complexity-penalized empirical risk. The available data are divided into two parts. The first is used to form an empirical cover of each model class, and the second is used to select a candidate rule from each cover based on empirical risk. The covering radii are determined empirically to optimize a tight upper bound on the estimation error. An estimate is chosen from the list of candidates in order to minimize the sum of class complexity and empirical risk. A distinguishing feature of the approach is that the complexity of each model class is assessed empirically, based on the size of its empirical cover. Finite sample performance bounds are established for the estimates, and these bounds are applied to several non-parametric estimation problems. The estimates are shown to achieve a favorable tradeoff between approximation and estimation error, and to perform as well as if the distribution-dependent complexities of the model classes were known beforehand. In addition, it is shown that the estimate can be consistent, and even possess near-optimal rates of convergence, when each model class has an infinite VC or pseudo dimension. For regression estimation with squared loss we modify our estimate to achieve a faster rate of convergence.
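The final selection step described in this abstract can be sketched in symbols. The notation is assumed for illustration and does not come from the abstract: $\hat f_k$ denotes the candidate rule chosen from the empirical cover of ${\cal F}_k$, $\hat L_n(\hat f_k)$ its empirical risk on the second part of the data, and $\hat C_n(k)$ the empirically estimated complexity of ${\cal F}_k$.

$$\hat k = \arg\min_{k \ge 1} \left[ \hat L_n(\hat f_k) + \hat C_n(k) \right], \qquad \hat f = \hat f_{\hat k}.$$

The exact form of the complexity term, which the abstract ties to the size of the empirical cover, is not given there and is left unspecified here.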
Abstract:
In this paper a novel rank estimation technique for trajectory motion segmentation within the Local Subspace Affinity (LSA) framework is presented. This technique, called Enhanced Model Selection (EMS), is based on the relationship between the estimated rank of the trajectory matrix and the affinity matrix built by LSA. The results on synthetic and real data show that, without any a priori knowledge, EMS automatically provides an accurate and robust rank estimation, improving the accuracy of the final motion segmentation.
Abstract:
The effectiveness of decision rules depends on characteristics of both rules and environments. A theoretical analysis of environments specifies the relative predictive accuracies of the lexicographic rule 'take-the-best' (TTB) and other simple strategies for binary choice. We identify three factors: how the environment weights variables; characteristics of choice sets; and error. For cases involving from three to five binary cues, TTB is effective across many environments. However, hybrids of equal weights (EW) and TTB models are more effective as environments become more compensatory. In the presence of error, TTB and similar models do not predict much better than a naïve model that exploits dominance. We emphasize psychological implications and the need for more complete theories of the environment that include the role of error.
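To make the two rules named in this abstract concrete, here is a minimal sketch of take-the-best (TTB) and equal weights (EW) for a choice between two alternatives described by binary cues. The function names, the tuple representation, and the "tie" return value are assumptions made for illustration; the abstract does not specify how ties are resolved.

```python
from typing import Sequence


def take_the_best(a: Sequence[int], b: Sequence[int],
                  cue_order: Sequence[int]) -> str:
    """Lexicographic rule: inspect cues in the given order and decide on
    the first cue whose values differ between the two alternatives."""
    for i in cue_order:
        if a[i] != b[i]:
            return "a" if a[i] > b[i] else "b"
    return "tie"  # no cue discriminates (tie handling is an assumption)


def equal_weights(a: Sequence[int], b: Sequence[int]) -> str:
    """Compensatory rule: add up all cue values with unit weights."""
    total_a, total_b = sum(a), sum(b)
    if total_a == total_b:
        return "tie"
    return "a" if total_a > total_b else "b"


if __name__ == "__main__":
    a, b = (1, 0, 0), (0, 1, 1)
    # Cue 0 is inspected first and favours a, so TTB stops there.
    print(take_the_best(a, b, cue_order=[0, 1, 2]))  # prints "a"
    # EW lets the two later cues outvote the first one.
    print(equal_weights(a, b))  # prints "b"
```

The example pair shows where the rules disagree: which one is right depends on how strongly the environment weights the first cue relative to the others.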
Abstract:
When can a single variable be more accurate in binary choice than multiple sources of information? We derive analytically the probability that a single variable (SV) will correctly predict one of two choices when both criterion and predictor are continuous variables. We further provide analogous derivations for multiple regression (MR) and equal weighting (EW) and specify the conditions under which the models differ in expected predictive ability. Key factors include variability in cue validities, intercorrelation between predictors, and the ratio of predictors to observations in MR. Theory and simulations are used to illustrate the differential effects of these factors. Results directly address why and when one-reason decision making can be more effective than analyses that use more information. We thus provide analytical backing to intriguing empirical results that, to date, have lacked theoretical justification. There are predictable conditions for which one should expect less to be more.
Abstract:
Alpine tree-line ecotones are characterized by marked changes at small spatial scales that may result in a variety of physiognomies. A set of alternative individual-based models was tested with data from four contrasting Pinus uncinata ecotones in the central Spanish Pyrenees to reveal the minimal subset of processes required for tree-line formation. A Bayesian approach combined with Markov chain Monte Carlo methods was employed to obtain the posterior distribution of model parameters, allowing the use of model selection procedures. The main features of real tree lines emerged only in models considering nonlinear responses in individual rates of growth or mortality with respect to the altitudinal gradient. Variation in tree-line physiognomy reflected mainly changes in the relative importance of these nonlinear responses, while other processes, such as dispersal limitation and facilitation, played a secondary role. Different nonlinear responses also determined the presence or absence of krummholz, in agreement with recent findings highlighting a different response of diffuse and abrupt or krummholz tree lines to climate change. The method presented here can be widely applied in individual-based simulation models and will turn model selection and evaluation in this type of model into a more transparent, effective, and efficient exercise.
Abstract:
Peer-reviewed
Abstract:
The aim of this paper is to verify, for the Spanish case, whether the internal democracy of the major political parties (PSOE, AP/PP, PCE/IU, PNV and CDC) increased between 1977 and 2008. To do this, we focus on their leadership selection processes, one of the key elements associated with intra-party democracy. The paper introduces data on four different dimensions of leadership selection: the certification process, the voting procedure, the inclusiveness of the selectorate and, finally, the degree of competitiveness. The results show that there have been few changes in the leadership selection processes of the Spanish political parties since 1977. However, the results of the Spanish case are also used to suggest some preliminary links between the four dimensions.
Abstract:
"See the abstract at the beginning of the document in the attached file"
Abstract:
The performance of the SAOP potential for the calculation of NMR chemical shifts was evaluated. SAOP results show considerable improvement with respect to previous potentials, like VWN or BP86, at least for the carbon, nitrogen, oxygen, and fluorine chemical shifts. Furthermore, a few NMR calculations carried out on third-period atoms (S, P, and Cl) improved when using the SAOP potential.
Abstract:
The human olfactory receptor repertoire is reduced in comparison to other mammals and to other non-human primates. Nonetheless, this olfactory decline opens an opportunity for evolutionary innovation and improvement. In the present study, we focus on an olfactory receptor gene, OR5I1, which had previously been shown to present an excess of amino acid replacement substitutions between humans and chimpanzees. We analyze the genetic variation in OR5I1 in a large worldwide human panel and find an excess of derived alleles segregating at relatively high frequencies in all populations. Additional evidence for selection includes departures from neutrality in allele frequency spectra tests but no unusually extended haplotype structure. Moreover, molecular structural inference suggests that one of the nonsynonymous polymorphisms defining the presumably adaptive protein form of OR5I1 may alter the functional binding properties of the olfactory receptor. These results are compatible with positive selection having modeled the pattern of variation found in the OR5I1 gene and with a relatively ancient, mild selective sweep predating the “Out of Africa” expansion of modern humans.
Abstract:
Poor understanding of the spliceosomal mechanisms to select intronic 3' ends (3'ss) is a major obstacle to deciphering eukaryotic genomes. Here, we discern the rules for global 3'ss selection in yeast. We show that, in contrast to the uniformity of yeast splicing, the spliceosome uses all available 3'ss within a distance window from the intronic branch site (BS), and that in 70% of all possible 3'ss this is likely to be mediated by pre-mRNA structures. Our results reveal that one of these RNA folds acts as an RNA thermosensor, modulating alternative splicing in response to heat shock by controlling alternate 3'ss availability. Thus, our data point to a deeper role for the pre-mRNA in the control of its own fate, and to a simple mechanism for some alternative splicing.
Abstract:
Background: The human FOXI1 gene codes for a transcription factor involved in the physiology of the inner ear, testis, and kidney. Using three interspecies comparisons, it has been suggested that this may be a gene under human-specific selection. We sought to confirm this finding by using an extended set of orthologous sequences. Additionally, we explored for signals of natural selection within humans by sequencing the gene in 20 Europeans, 20 East Asians and 20 Yorubas and by analysing SNP variation in a 2 Mb region centered on FOXI1 in 39 worldwide human populations from the HGDP-CEPH diversity panel. Results: The genome sequences recently available from other primate and non-primate species showed that FOXI1 divergence patterns are compatible with neutral evolution. Sequence-based neutrality tests were not significant in Europeans, East Asians or Yorubas. However, the Long Range Haplotype (LRH) test, as well as the iHS and XP-Rsb statistics, revealed significantly extended tracks of homozygosity around FOXI1 in Africa, suggesting a recent episode of positive selection acting on this gene. A functionally relevant SNP, as well as several SNPs either on the putatively selected core haplotypes or with significant iHS or XP-Rsb values, displayed allele frequencies strongly correlated with the absolute geographical latitude of the populations sampled. Conclusions: We present evidence for recent positive selection in the FOXI1 gene region in Africa. Climate might be related to this recent adaptive event in humans. Of the multiple functions of FOXI1, its role in kidney-mediated water-electrolyte homeostasis is the most obvious candidate for explaining a climate-related adaptation.
Abstract:
Several studies have reported high performance of simple decision heuristics in multi-attribute decision making. In this paper, we focus on situations where attributes are binary and analyze the performance of Deterministic-Elimination-By-Aspects (DEBA) and similar decision heuristics. We consider non-increasing weights and two probabilistic models for the attribute values: one where attribute values are independent Bernoulli random variables; the other one where they are binary random variables with inter-attribute positive correlations. Using these models, we show that good performance of DEBA is explained by the presence of cumulative as opposed to simple dominance. We therefore introduce the concepts of cumulative dominance compliance and fully cumulative dominance compliance and show that DEBA satisfies those properties. We derive a lower bound on the probability with which cumulative dominance compliant heuristics will choose a best alternative and show that, even with many attributes, this is not small. We also derive an upper bound for the expected loss of fully cumulative dominance compliant heuristics and show that this is moderate even when the number of attributes is large. Both bounds are independent of the values of the weights.
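The heuristic and the dominance notion in this abstract can be illustrated with a small sketch. It rests on assumptions the abstract does not spell out: attributes are listed in order of non-increasing weight, DEBA drops every alternative lacking the current attribute whenever at least one remaining alternative has it, and x cumulatively dominates y when every prefix sum of x's attribute values is at least the matching prefix sum of y's. The function names are hypothetical.

```python
from itertools import accumulate
from typing import List, Sequence


def deba(alternatives: Sequence[Sequence[int]]) -> List[int]:
    """Return the indices of the alternatives that survive elimination.

    Assumes a non-empty list of equal-length 0/1 tuples whose attributes
    are ordered by non-increasing weight (an assumption of this sketch).
    """
    remaining = list(range(len(alternatives)))
    for j in range(len(alternatives[0])):
        having = [i for i in remaining if alternatives[i][j] == 1]
        if having:                 # eliminate only if someone has attribute j
            remaining = having
        if len(remaining) == 1:    # a single survivor: stop early
            break
    return remaining


def cumulatively_dominates(x: Sequence[int], y: Sequence[int]) -> bool:
    """True if each prefix sum of x is >= the matching prefix sum of y."""
    return all(cx >= cy for cx, cy in zip(accumulate(x), accumulate(y)))


if __name__ == "__main__":
    alts = [(1, 0, 1), (1, 1, 0), (0, 1, 1)]
    print(deba(alts))                                # prints [1]
    print(cumulatively_dominates(alts[1], alts[0]))  # prints True
    print(cumulatively_dominates(alts[0], alts[1]))  # prints False
```

In the example, the survivor (1, 1, 0) cumulatively dominates both rivals, so it has the highest weighted sum under any non-increasing, non-negative weights, for instance 5 against 4 and 3 with weights (3, 2, 1).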