4 results for Dirichlet-multinomial
Abstract:
There has been increasing interest in the development of new methods that use Pareto optimality to deal with multi-objective criteria (for example, accuracy and time complexity). Once one has developed an approach to a problem of interest, the question is then how to compare it with the state of the art. In machine learning, algorithms are typically evaluated by comparing their performance on different data sets by means of statistical tests. Standard tests used for this purpose can consider neither multiple performance measures jointly nor multiple competitors at once. The aim of this paper is to resolve these issues by developing statistical procedures that account for multiple competing measures at the same time and compare multiple algorithms together. In particular, we develop two tests: a frequentist procedure based on the generalized likelihood-ratio test and a Bayesian procedure based on a multinomial-Dirichlet conjugate model. We further extend them by discovering conditional independences among measures to reduce the number of parameters of such models, as the number of studied cases is usually very small in such comparisons. Data from a comparison among general-purpose classifiers are used to show a practical application of our tests.
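The multinomial-Dirichlet conjugate model mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's procedure: the four outcome categories, the counts, the uniform prior, and the function names below are all assumptions chosen for illustration. The sketch only shows the conjugate update (prior pseudo-counts plus observed counts) and a Monte Carlo estimate of a posterior probability.

```python
import random

def dirichlet_posterior(prior, counts):
    """Conjugate update: a Dirichlet(prior) with multinomial counts
    gives a Dirichlet(prior + counts) posterior."""
    return [a + n for a, n in zip(prior, counts)]

def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) by
    normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def prob_category_is_largest(alpha, index, n_samples=20000, seed=0):
    """Monte Carlo estimate of the posterior probability that the
    category at `index` has the largest multinomial probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        theta = sample_dirichlet(alpha, rng)
        if theta[index] == max(theta):
            hits += 1
    return hits / n_samples

# Hypothetical data: number of data sets on which algorithm A is
# [better on both measures, better on accuracy only,
#  better on time only, worse on both] compared with algorithm B.
counts = [12, 4, 3, 1]
prior = [1.0, 1.0, 1.0, 1.0]  # assumed uniform Dirichlet prior

posterior = dirichlet_posterior(prior, counts)
posterior_mean = [a / sum(posterior) for a in posterior]
p_dominates = prob_category_is_largest(posterior, index=0)

print("posterior parameters:", posterior)
print("posterior mean:", [round(m, 3) for m in posterior_mean])
print("P('better on both' is the most likely outcome):", p_dominates)
```

With more measures the number of joint categories grows exponentially, which is consistent with the abstract's motivation for using conditional independences to reduce the number of parameters.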
Abstract:
The article analyzes political and electoral change in León, Guanajuato, tracing how the displacement of the Institutional Revolutionary Party (PRI) by the National Action Party (PAN) in this municipal government took shape from 1988 until the shift in the balance of power in 2012. This provides a basis for analyzing the scenarios that could characterize this year's upcoming elections. To this end, the authors propose a statistical model for the study: the Dirichlet regression model, which allows the nature of electoral data to be taken into account.
Abstract:
Tests for dependence of continuous, discrete and mixed continuous-discrete variables are ubiquitous in science. The goal of this paper is to derive Bayesian alternatives to frequentist null hypothesis significance tests for dependence. In particular, we present three Bayesian tests for dependence of binary, continuous and mixed variables. These tests are nonparametric and based on the Dirichlet process, which allows us to use the same prior model for all of them. The tests are therefore "consistent" with one another, in the sense that the probabilities of dependence they compute are commensurable across the different types of variables being tested. By means of simulations with artificial data, we show the effectiveness of the new tests.
Abstract:
Genome-wide association studies (GWAS) of schizophrenia have yielded more than 100 common susceptibility variants, and strongly support a substantial polygenic contribution of a large number of small allelic effects. It has been hypothesized that familial schizophrenia is largely a consequence of inherited rather than environmental factors. We investigated the extent to which familiality of schizophrenia is associated with enrichment for common risk variants detectable in a large GWAS. We analyzed single nucleotide polymorphism (SNP) data for cases reporting a family history of psychotic illness (N = 978), cases reporting no such family history (N = 4,503), and unscreened controls (N = 8,285) from the Psychiatric Genomics Consortium (PGC1) study of schizophrenia. We used a multinomial logistic regression approach with model-fitting to detect allelic effects specific to either family history subgroup. We also considered a polygenic model, in which we tested whether family history positive subjects carried more schizophrenia risk alleles than family history negative subjects, on average. Several individual SNPs attained suggestive but not genome-wide significant association with either family history subgroup. Comparison of genome-wide polygenic risk scores based on GWAS summary statistics indicated a significant enrichment for SNP effects among family history positive compared to family history negative cases (Nagelkerke's R² = 0.0021; P = 0.00331; P-value threshold < 0.4). Estimates of variability in disease liability attributable to the aggregate effect of genome-wide SNPs were significantly greater for family history positive compared to family history negative cases (0.32 and 0.22, respectively; P = 0.031). We found suggestive evidence of allelic effects detectable in large GWAS of schizophrenia that might be specific to particular family history subgroups. However, consideration of a polygenic risk score indicated a significant enrichment among family history positive cases for common allelic effects. Familial illness might, therefore, represent a more heritable form of schizophrenia, as suggested by previous epidemiological studies.
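As a rough illustration of the polygenic risk score idea in this abstract, the sketch below computes a score as a weighted sum of risk-allele dosages, keeping only SNPs whose discovery P-value falls below a threshold. This is a generic textbook form, not the study's pipeline: the SNP identifiers, effect sizes, P-values, dosages, and the function name are all hypothetical.

```python
def polygenic_risk_score(dosages, effects, p_values, p_threshold=0.4):
    """Sum of effect size times risk-allele dosage over SNPs whose
    discovery P-value is below `p_threshold`.

    dosages:  SNP id -> number of risk alleles carried (0, 1 or 2)
    effects:  SNP id -> per-allele effect size (e.g. log odds ratio)
    p_values: SNP id -> discovery GWAS P-value
    """
    score = 0.0
    for snp, beta in effects.items():
        if p_values[snp] < p_threshold:
            score += beta * dosages.get(snp, 0)
    return score

# Hypothetical summary statistics for three SNPs.
effects = {"snp1": 0.10, "snp2": -0.05, "snp3": 0.20}
p_values = {"snp1": 0.01, "snp2": 0.30, "snp3": 0.70}

# Hypothetical genotype of one individual.
dosages = {"snp1": 2, "snp2": 1, "snp3": 2}

# snp3 is excluded because its P-value is above the 0.4 threshold.
score = polygenic_risk_score(dosages, effects, p_values, p_threshold=0.4)
print("polygenic risk score:", score)
```

Scores computed this way for each subject can then be compared between groups, for instance with a regression of family-history status on the score.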