941 results for Multiple Hypothesis Testing


Relevance: 50.00%

Abstract:

Background: Research in epistasis or gene-gene interaction detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved translation efforts of statistical epistasis to biological epistasis, and attempts to integrate different omics information sources into the epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems. In this context, the maxT algorithm is one technique to control the false-positive rate. However, the memory needed by this algorithm rises linearly with the number of hypothesis tests. Gene-gene interaction studies require memory proportional to the squared number of SNPs, so a genome-wide epistasis search would require terabytes of memory. Hence, cache problems are likely to occur, increasing the computation time. In this work we present a new version of maxT that requires an amount of memory independent of the number of genetic effects to be investigated. This algorithm was implemented in C++ in our epistasis screening software MBMDR-3.0.3. We evaluate the new implementation in terms of memory efficiency and speed using simulated data. The software is illustrated on real-life data for Crohn's disease.

Results: In the case of a binary (affected/unaffected) trait, the parallel workflow of MBMDR-3.0.3 analyzes all gene-gene interactions in a dataset of 100,000 SNPs typed on 1,000 individuals within 4 days and 9 hours, using 999 permutations of the trait to assess statistical significance, on a cluster of 10 blades, each containing four Quad-Core AMD Opteron(tm) Processor 2352 2.1 GHz. In the case of a continuous trait, a similar run takes 9 days. Our program found 14 SNP-SNP interactions with a multiple-testing corrected p-value below 0.05 on real-life Crohn's disease (CD) data.

Conclusions: Our software is the first implementation of the MB-MDR methodology able to solve large-scale SNP-SNP interaction problems within a few days, without using much memory, while adequately controlling the type I error rate. A new implementation to reach genome-wide epistasis screening is under construction. In the context of Crohn's disease, MBMDR-3.0.3 could identify epistasis involving regions that are well known in the field and could be explained from a biological point of view. This demonstrates the power of our software to find relevant higher-order phenotype-genotype associations.
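The memory trick described above can be illustrated generically. The sketch below shows a single-step maxT permutation correction in Python; it is not the MBMDR-3.0.3 implementation (nor necessarily its exact new maxT variant), and `stat_fn`, `top_k` and the other names are illustrative assumptions. The point is that only one maximum per permutation and the top-k observed statistics are ever kept, so memory does not grow with the number of SNP pairs tested.

```python
# Hedged sketch of memory-light single-step maxT; NOT the MB-MDR algorithm itself.
import heapq
import numpy as np

def maxt_adjusted_pvalues(stat_fn, y, n_tests, n_perm=999, top_k=1000, rng=None):
    """stat_fn(y, j) -> test statistic for hypothesis j given trait vector y (assumed)."""
    rng = np.random.default_rng(rng)

    # Streaming top-k of observed statistics: memory O(top_k), never O(n_tests).
    heap = []  # min-heap of (statistic, test index)
    for j in range(n_tests):
        t = stat_fn(y, j)
        if len(heap) < top_k:
            heapq.heappush(heap, (t, j))
        elif t > heap[0][0]:
            heapq.heapreplace(heap, (t, j))
    top = sorted(heap, reverse=True)                 # largest observed statistics first
    top_stats = np.array([t for t, _ in top])
    top_idx = np.array([j for _, j in top])

    # One maximum per permutation of the trait: memory independent of n_tests.
    perm_max = np.empty(n_perm)
    for b in range(n_perm):
        y_perm = rng.permutation(y)
        perm_max[b] = max(stat_fn(y_perm, j) for j in range(n_tests))

    # Single-step maxT adjusted p-values for the retained top hits.
    adj_p = (1 + (perm_max[None, :] >= top_stats[:, None]).sum(axis=1)) / (n_perm + 1)
    return top_idx, adj_p
```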

Relevance: 50.00%

Abstract:

Most panel unit root tests are designed to test the joint null hypothesis of a unit root for each individual series in a panel. After a rejection, it will often be of interest to identify which series can be deemed stationary and which can be deemed nonstationary. Researchers will sometimes carry out this classification using n individual (univariate) unit root tests at some ad hoc significance level. In this paper, we demonstrate how to use the false discovery rate (FDR) in evaluating I(1)/I(0) classifications based on individual unit root tests when the size of the cross section (n) and time series (T) dimensions are large. We report results from a simulation experiment and illustrate the methods on two data sets.
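As a concrete, hedged illustration of the idea (not the authors' exact procedure): run an individual ADF unit-root test on each series and let a Benjamini-Hochberg FDR rule, rather than an ad hoc significance level, decide which series are deemed I(0). Function and variable names below are assumptions.

```python
# Hedged sketch: FDR-based I(0)/I(1) classification from individual ADF tests.
import numpy as np
from statsmodels.tsa.stattools import adfuller
from statsmodels.stats.multitest import multipletests

def classify_panel(panel, q=0.05):
    """panel: array of shape (n_series, T). Returns a boolean array, True = deemed I(0)."""
    pvals = np.array([adfuller(series)[1] for series in panel])   # ADF p-value per series
    reject, _, _, _ = multipletests(pvals, alpha=q, method="fdr_bh")
    return reject  # rejected unit-root null -> classified stationary, i.e. I(0)
```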

Relevance: 50.00%

Abstract:

In this work we propose a new approach for preliminary epidemiological studies on Standardized Mortality Ratios (SMR) collected across many spatial regions. A preliminary study on SMRs aims to formulate hypotheses to be investigated via individual epidemiological studies that avoid the bias carried by aggregated analyses. Starting from observed disease counts and expected disease counts calculated from reference population disease rates, an SMR is derived in each area as the MLE under a Poisson assumption on each observation. Such estimators have high standard errors in small areas, i.e. where the expected count is low either because of the small population underlying the area or the rarity of the disease under study. Disease mapping models and other techniques for screening disease rates across the map, aiming to detect anomalies and possible high-risk areas, have been proposed in the literature under both the classical and the Bayesian paradigm.

Our proposal approaches this issue with a decision-oriented method focused on multiple testing control, without however leaving the preliminary-study perspective that an analysis of SMR indicators is required to keep. We implement control of the FDR, a quantity widely used to address multiple comparison problems in the field of microarray data analysis but not usually employed in disease mapping. Controlling the FDR means providing an estimate of the FDR for a set of rejected null hypotheses. The small-area issue raises difficulties in applying traditional methods for FDR estimation, which are usually based only on knowledge of the p-values (Benjamini and Hochberg, 1995; Storey, 2003). Tests evaluated by a traditional p-value have weak power in small areas, where the expected number of disease cases is small. Moreover, the tests cannot be assumed independent when spatial correlation between SMRs is expected, nor are they identically distributed when the population underlying the map is heterogeneous. The Bayesian paradigm offers a way to overcome the inappropriateness of p-value based methods.

Another peculiarity of the present work is to propose a hierarchical fully Bayesian model for FDR estimation when testing many null hypotheses of absence of risk. We use concepts from Bayesian disease mapping, referring in particular to the Besag, York and Mollié model (1991), often used in practice for its flexible prior assumption on the distribution of risks across regions. The borrowing of strength between prior and likelihood, typical of a hierarchical Bayesian model, has the advantage of evaluating a single test (i.e. a test in a single area) by means of all observations in the map under study, rather than just the single observation. This improves the power of the test in small areas and addresses more appropriately the spatial correlation issue, which suggests that relative risks are closer in spatially contiguous regions. The proposed model estimates the FDR by means of the MCMC-estimated posterior probabilities b_i of the null hypothesis (absence of risk) for each area. An estimate of the expected FDR conditional on the data (FDR-hat) can be calculated for any set of b_i's corresponding to areas declared at high risk (where the null hypothesis is rejected) by averaging the b_i's themselves. FDR-hat can be used to provide an easy decision rule for selecting high-risk areas, i.e. selecting as many areas as possible such that FDR-hat does not exceed a prefixed value; we call these FDR-hat based decision (or selection) rules.

The sensitivity and specificity of such a rule depend on the accuracy of the FDR estimate: over-estimation of the FDR causes a loss of power, while under-estimation produces a loss of specificity. Moreover, our model retains the interesting feature of providing an estimate of the relative risk values, as in the Besag, York and Mollié model (1991). A simulation study was set up to evaluate the model's performance in terms of accuracy of FDR estimation, sensitivity and specificity of the decision rule, and quality of the relative risk estimates. We chose a real map from which we generated several spatial scenarios whose disease counts vary according to the degree of spatial correlation, the size of the areas, the number of areas where the null hypothesis is true, and the risk level in the remaining areas. In summarizing the simulation results we always consider the FDR estimated in sets consisting of all b_i's below a threshold t. We show graphs of FDR-hat and of the true FDR (known by simulation) plotted against the threshold t to assess the FDR estimation. Varying the threshold, we can learn which FDR values can be accurately estimated by a practitioner willing to apply the model (from the closeness between FDR-hat and the true FDR). By plotting the calculated sensitivity and specificity (both known by simulation) against FDR-hat, we can check the sensitivity and specificity of the corresponding FDR-hat based decision rules. To investigate the over-smoothing of the relative risk estimates, we compare box-plots of these estimates in high-risk areas (known by simulation) obtained by both our model and the classic Besag, York and Mollié model. All the summary tools are worked out for all simulated scenarios (54 scenarios in total).

Results show that the FDR is well estimated (in the worst case we get an over-estimation, hence a conservative FDR control) in scenarios with small areas, low risk levels and spatially correlated risks, which are our primary aim. In such scenarios we obtain good estimates of the FDR for all values less than or equal to 0.10. The sensitivity of FDR-hat based decision rules is generally low but their specificity is high, so an FDR-hat = 0.05 or FDR-hat = 0.10 based selection rule can be suggested. In cases where the number of true alternative hypotheses (the number of truly high-risk areas) is small, FDR values up to 0.15 are also well estimated, and an FDR-hat = 0.15 based decision rule gains power while maintaining a high specificity. On the other hand, in scenarios with non-small areas and non-small risk levels the FDR is under-estimated except for very small values (much lower than 0.05); this results in a loss of specificity of an FDR-hat = 0.05 based decision rule. In such scenarios an FDR-hat = 0.05 or, even worse, an FDR-hat = 0.10 based decision rule cannot be suggested because the true FDR is actually much higher. As regards relative risk estimation, our model achieves almost the same results as the classic Besag, York and Mollié model. For this reason, our model is interesting for its ability to perform both the estimation of relative risk values and FDR control, except in scenarios with non-small areas and large risk levels. A case study is finally presented to show how the method can be used in epidemiology.
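The FDR-hat decision rule described above reduces to a simple computation once the posterior null probabilities b_i are available from the MCMC output. The sketch below is a minimal illustration under that assumption; the hierarchical Besag-York-Mollié-type model itself is not shown.

```python
# Hedged sketch: select high-risk areas by the FDR-hat rule, given posterior null probabilities b_i.
import numpy as np

def select_high_risk_areas(b, fdr_target=0.05):
    """b: posterior probabilities of the null (absence of risk), one per area."""
    b = np.asarray(b, dtype=float)
    order = np.argsort(b)                                          # most convincing areas first
    running_fdr = np.cumsum(b[order]) / np.arange(1, b.size + 1)   # FDR-hat of each top-k rejection set
    k = np.searchsorted(running_fdr, fdr_target, side="right")     # largest k with FDR-hat <= target
    if k == 0:
        return np.array([], dtype=int), np.nan
    return order[:k], running_fdr[k - 1]                           # selected areas and their FDR-hat
```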

Relevance: 50.00%

Abstract:

In the modern genomic era, the amount of data generated by genetic sequencing has become extremely large. The analysis of genomic data requires statistical significance methods to quantify the robustness of the correlations identified in the data. Statistical significance allows us to understand whether the relationships in the data we are analyzing actually carry statistical weight, i.e. whether the event we are observing happened "by chance" or whether it is reasonable to think it occurs with a meaningful probability. Regardless of the statistical test used, in the presence of multiple tests ("Multiple Hypothesis Testing") it is necessary to use methods for correcting statistical significance ("Multiple Testing Correction"). The aim of this thesis is to make available implementations of the best-known multiple testing correction methods. A collection of these methods was created in the form of a library, precisely because nothing of the kind was found in the modern bioinformatics landscape.
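As a hedged sketch of what such a library might contain (function names are illustrative, not taken from the thesis), here are plain implementations of three standard corrections, each returning adjusted p-values.

```python
# Hedged sketch of a small multiple testing correction library.
import numpy as np

def bonferroni(pvals):
    """Bonferroni-adjusted p-values."""
    p = np.asarray(pvals, dtype=float)
    return np.minimum(p * p.size, 1.0)

def holm(pvals):
    """Holm step-down adjusted p-values."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Multiply sorted p-values by m, m-1, ..., 1 and enforce monotonicity.
    adj_sorted = np.minimum(1.0, np.maximum.accumulate((m - np.arange(m)) * p[order]))
    adj = np.empty(m)
    adj[order] = adj_sorted
    return adj

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg FDR-adjusted p-values (q-values in the BH sense)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    # Step-up: each adjusted value is the minimum of itself and all later ones.
    adj_sorted = np.minimum(1.0, np.minimum.accumulate(ranked[::-1])[::-1])
    adj = np.empty(m)
    adj[order] = adj_sorted
    return adj
```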

Relevance: 50.00%

Abstract:

When we study the variables that affect survival time, we usually estimate their effects with the Cox regression model. In biomedical research, the effects of the covariates are often modified by a biomarker variable, which leads to covariate-biomarker interactions. Here the biomarker is an objective measurement of patient characteristics at baseline. Liu et al. (2015) built a local partial likelihood bootstrap model to estimate and test this interaction effect between covariates and biomarker, but the R code developed by Liu et al. (2015) can only handle one variable and one interaction term and cannot fit the model with adjustment for nuisance variables. In this project, we expand the model to allow adjustment for nuisance variables, expand the R code to take any chosen interaction terms, and set up many parameters for users to customize their analyses. We also build an R package called "lplb" to integrate the complex computations into a simple interface. We conduct numerical simulations to show that the new method has excellent finite sample properties under both the null and alternative hypotheses. We also apply the method to analyze data from a prostate cancer clinical trial with the acid phosphatase (AP) biomarker.
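The R package "lplb" itself is not reproduced here; as a hedged illustration of the underlying model only, the Python sketch below fits a Cox regression with a treatment-by-biomarker interaction term and adjustment for a nuisance covariate. Column names are assumptions, and the local partial likelihood bootstrap test is not implemented.

```python
# Hedged sketch: Cox model with a covariate-by-biomarker interaction (not the lplb method).
import pandas as pd
from lifelines import CoxPHFitter

def fit_interaction_cox(df: pd.DataFrame) -> CoxPHFitter:
    """df is assumed to contain: 'time', 'event', 'treatment', 'biomarker', 'age' (nuisance)."""
    df = df.copy()
    df["treatment_x_biomarker"] = df["treatment"] * df["biomarker"]  # interaction term
    cph = CoxPHFitter()
    cph.fit(df[["time", "event", "treatment", "biomarker", "treatment_x_biomarker", "age"]],
            duration_col="time", event_col="event")
    return cph  # cph.summary holds coefficients and Wald p-values, incl. the interaction
```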

Relevance: 50.00%

Abstract:

Master's in Finance

Relevance: 40.00%

Abstract:

Environmental effects on the concentration of photosynthetic pigments in micro-algae can be explained by dynamics of photosystem synthesis and deactivation. A model that couples photosystem losses to the relative cellular rates of energy harvesting (light absorption) and assimilation predicts optimal concentrations of light-harvesting pigments and balanced energy flow under environmental conditions that affect light availability and metabolic rates. Effects of light intensity, nutrient supply and temperature on growth rate and pigment levels were similar to general patterns observed across diverse micro-algal taxa. Results imply that dynamic behaviour associated with photophysical stress, and independent of gene regulation, might constitute one mechanism for photo-acclimation of photosynthesis.

Relevance: 40.00%

Abstract:

Recent empirical studies have found significant evidence of departures from competition in the input side of the Australian bread, breakfast cereal and margarine end-product markets. For example, Griffith (2000) found that firms in some parts of the processing and marketing sector exerted market power when purchasing grains and oilseeds from farmers. As noted at the time, this result accorded well with the views of previous regulatory authorities (p.358). In the mid-1990s, the Prices Surveillance Authority (PSA 1994) determined that the markets for products contained in the Breakfast Cereals and Cooking Oils and Fats indexes were "not effectively competitive" (p.14). The PSA consequently maintained price surveillance on the major firms in this product group. The Griffith result is also consistent with the large number of legal judgements against firms in this sector over the past decade for price fixing or other types of non-competitive behaviour. For example, bread manufacturer George Weston was fined twice during 2000 for non-competitive conduct and the ACCC has also recently pursued and won cases against retailer Safeway in grains and oilseeds product lines.

Relevance: 40.00%

Abstract:

Recent developments in evolutionary physiology have seen many of the long-held assumptions within comparative physiology receive rigorous experimental analysis. Studies of the adaptive significance of physiological acclimation exemplify this new evolutionary approach. The beneficial acclimation hypothesis (BAH) was proposed to describe the assumption that all acclimation changes enhance the physiological performance or fitness of an individual organism. To the surprise of most physiologists, all empirical examinations of the BAH have rejected its generality. However, we suggest that these examinations are neither direct nor complete tests of the functional benefit of acclimation. We consider them to be elegant analyses of the adaptive significance of developmental plasticity, a type of phenotypic plasticity that is very different from the traditional concept of acclimation that is used by comparative physiologists.

Relevance: 40.00%

Abstract:

The aim of this study was to assess the variation between neuropathologists in the diagnosis of common dementia syndromes when multiple published protocols are applied. Fourteen of 18 Australian neuropathologists participated in diagnosing 20 cases (16 cases of dementia, 4 age-matched controls) using consensus diagnostic methods. Diagnostic criteria, clinical synopses and slides from multiple brain regions were sent to participants, who were asked for case diagnoses. Diagnostic sensitivity, specificity, predictive value, accuracy and variability were determined using percentage agreement and kappa statistics. Using CERAD criteria, there was high inter-rater agreement for cases with probable and definite Alzheimer's disease but low agreement for cases with possible Alzheimer's disease. Braak staging and the application of criteria for dementia with Lewy bodies also resulted in high inter-rater agreement. There was poor agreement for the diagnosis of frontotemporal dementia and for identifying small vessel disease. Participants rarely diagnosed more than one disease in any case. To improve efficiency when applying multiple diagnostic criteria, several simplifications were proposed and tested on 5 of the original 20 cases. Inter-rater reliability for the diagnosis of Alzheimer's disease and dementia with Lewy bodies improved significantly. Further development of simple and accurate methods to identify small vessel lesions and diagnose frontotemporal dementia is warranted.
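As a hedged sketch of the agreement statistics mentioned above (the data layout, one row per rater and one column per case, is an assumption), pairwise percentage agreement and Cohen's kappa between raters can be computed as follows.

```python
# Hedged sketch: pairwise percentage agreement and Cohen's kappa between raters.
from itertools import combinations
import numpy as np
from sklearn.metrics import cohen_kappa_score

def pairwise_agreement(diagnoses):
    """diagnoses: array of shape (n_raters, n_cases) holding categorical diagnosis labels."""
    results = []
    for a, b in combinations(range(len(diagnoses)), 2):
        pct = np.mean(np.asarray(diagnoses[a]) == np.asarray(diagnoses[b]))  # % agreement
        kappa = cohen_kappa_score(diagnoses[a], diagnoses[b])                # chance-corrected
        results.append((a, b, pct, kappa))
    return results  # list of (rater_i, rater_j, percentage agreement, Cohen's kappa)
```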

Relevance: 40.00%

Abstract:

Liberal-Institutionalist and Structural Realist expectations about international organizations are confronted by examining whether and how US-controlled international aid is granted, and in particular whether it is related to political affinity and to United Nations Security Council (UNSC) non-permanent membership. A preliminary assessment suggests that these relations hold only for the Cold War period and, even then, only when the UNSC non-permanent membership falls in years in which the Security Council was deemed very important.

Relevance: 40.00%

Abstract:

This paper addresses the challenging task of computing multiple roots of a system of nonlinear equations. A repulsion algorithm that invokes the Nelder-Mead (N-M) local search method and uses a penalty-type merit function based on the error function, known as 'erf', is presented. In the N-M algorithm context, different strategies are proposed to enhance the quality of the solutions and improve the overall efficiency. The main goal of this paper is to use a two-level factorial design of experiments to analyze the statistical significance of the observed differences in selected performance criteria produced when testing different strategies in the N-M based repulsion algorithm.
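As a hedged sketch of the kind of repulsion strategy described (the exact merit function and the constants beta and delta used in the paper may differ), the code below penalizes neighbourhoods of already-found roots with an erf-based term and restarts Nelder-Mead from random points.

```python
# Hedged sketch: erf-based repulsion merit function with Nelder-Mead restarts.
import numpy as np
from scipy.optimize import minimize
from scipy.special import erf

def find_multiple_roots(F, dim, bounds, n_starts=50, tol=1e-6, beta=10.0, delta=0.5, rng=None):
    """F: callable returning the residual vector of the nonlinear system at x.
    bounds: (lower, upper) arrays of length dim defining the search box."""
    rng = np.random.default_rng(rng)
    roots = []

    def merit(x):
        value = np.sum(np.asarray(F(x)) ** 2)               # zero exactly at a root of F
        for r in roots:                                      # erf repulsion around known roots
            value += beta * (1.0 - erf(np.linalg.norm(x - r) / delta))
        return value

    lo, hi = np.asarray(bounds[0]), np.asarray(bounds[1])
    for _ in range(n_starts):
        x0 = rng.uniform(lo, hi, size=dim)
        res = minimize(merit, x0, method="Nelder-Mead")
        # Accept only genuine roots that are not duplicates of ones already found.
        if np.sum(np.asarray(F(res.x)) ** 2) < tol and \
           all(np.linalg.norm(res.x - r) > delta for r in roots):
            roots.append(res.x)
    return roots
```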