174 results for Statistical Tolerance Analysis
Abstract:
The discovery of several genes that affect the risk for Alzheimer's disease ignited a worldwide search for single-nucleotide polymorphisms (SNPs), common genetic variants that affect the brain. Genome-wide search of all possible SNP-SNP interactions is challenging and rarely attempted because of the complexity of conducting approximately 10¹¹ pairwise statistical tests. However, recent advances in machine learning, for example, iterative sure independence screening, make it possible to analyze data sets with vastly more predictors than observations. Using an implementation of the sure independence screening algorithm (called EPISIS), we performed a genome-wide interaction analysis testing all possible SNP-SNP interactions affecting regional brain volumes measured on magnetic resonance imaging and mapped using tensor-based morphometry. We identified a significant SNP-SNP interaction between rs1345203 and rs1213205 that explains 1.9% of the variance in temporal lobe volume. We mapped the whole-brain, voxelwise effects of the interaction in the Alzheimer's Disease Neuroimaging Initiative data set and separately in an independent replication data set of healthy twins (Queensland Twin Imaging). Each additional loading in the interaction effect was associated with approximately 5% greater regional brain volume (a protective effect) in both the Alzheimer's Disease Neuroimaging Initiative and Queensland Twin Imaging samples.
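As a rough check on the scale quoted in this abstract, the number of unordered SNP pairs among n markers is n(n-1)/2. The sketch below assumes a hypothetical panel of 450,000 SNPs; the abstract does not state the panel size, so that figure is chosen only because it gives a count of the order of 10¹¹.

```python
from math import comb


def pairwise_tests(n_snps: int) -> int:
    """Number of unordered SNP-SNP pairs among n_snps markers: n(n-1)/2."""
    return comb(n_snps, 2)


# Hypothetical panel size (not taken from the abstract), for illustration only.
n_snps = 450_000
n_pairs = pairwise_tests(n_snps)
print(f"{n_snps} SNPs give {n_pairs:.2e} pairwise tests")
```

A count this large is why a screening step that ranks candidate pairs before formal testing is attractive.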
Abstract:
Though difficult, the study of gene-environment interactions in multifactorial diseases is crucial for interpreting the relevance of non-heritable factors and helps avoid overlooking genetic associations with small but measurable effects. We propose a "candidate interactome" (i.e. a group of genes whose products are known to physically interact with environmental factors that may be relevant for disease pathogenesis) analysis of genome-wide association data in multiple sclerosis. We looked for statistical enrichment of associations among interactomes that, at the current state of knowledge, may be representative of gene-environment interactions of potential, uncertain or unlikely relevance for multiple sclerosis pathogenesis: Epstein-Barr virus, human immunodeficiency virus, hepatitis B virus, hepatitis C virus, cytomegalovirus, HHV8-Kaposi sarcoma, H1N1-influenza, JC virus, human innate immunity interactome for type I interferon, autoimmune regulator, vitamin D receptor, aryl hydrocarbon receptor and a panel of proteins targeted by 70 innate immune-modulating viral open reading frames from 30 viral species. Interactomes were either obtained from the literature or were manually curated. The P values of all single nucleotide polymorphisms mapping to a given interactome were obtained from the latest genome-wide association study of the International Multiple Sclerosis Genetics Consortium and the Wellcome Trust Case Control Consortium 2. The interaction between genotype and Epstein-Barr virus emerges as relevant for multiple sclerosis etiology. However, in line with recent data on the coexistence of common and unique strategies used by viruses to perturb the human molecular system, other viruses also have a similar potential, though they are probably less relevant in epidemiological terms. © 2013 Mechelli et al.
Abstract:
Some statistical procedures already available in the literature are employed in developing the water quality index (WQI). The complexity and interdependency of the physical and chemical processes in water could be explained more easily if statistical approaches were applied to water quality indexing. The most popular statistical method used in developing the WQI is principal component analysis (PCA). In the literature, WQI development based on classical PCA has mostly used water quality data that have been transformed and normalized. Outliers may be included in or eliminated from the analysis. However, the classical mean and sample covariance matrix used in the classical PCA methodology are not reliable if outliers exist in the data. Since the presence of outliers may affect the computation of the principal components, robust principal component analysis (RPCA) should be used. Focusing on the Langat River, this study introduces the RPCA-WQI for the first time to re-calculate the DOE-WQI. Results show that the RPCA-WQI is capable of capturing a distribution similar to that of the existing DOE-WQI.
Abstract:
Aims Elevated dynamic plantar pressures are a consistent finding in diabetes patients with peripheral neuropathy, with implications for plantar foot ulceration. This meta-analysis aimed to compare the plantar pressures of diabetes patients that had peripheral neuropathy and those with neuropathy with active or previous foot ulcers. Methods Published articles were identified from Medline via OVID, CINAHL, SCOPUS, INFORMIT, Cochrane Central, EMBASE via OVID and Web of Science via ISI Web of Knowledge bibliographic databases. Observational studies reporting barefoot dynamic plantar pressure in adults with diabetic peripheral neuropathy, where at least one group had a history of plantar foot ulcers, were included. Interventional studies, shod plantar pressure studies and studies not published in English were excluded. Overall mean peak plantar pressure (MPP) and pressure time integral (PTI) were primary outcomes. The six secondary outcomes were MPP and PTI at the rear foot, mid foot and fore foot. The protocol of the meta-analysis was published with PROSPERO (registration number CRD42013004310). Results Eight observational studies were included. Overall MPP and PTI were greater in diabetic peripheral neuropathy patients with foot ulceration compared to those without ulceration (standardised mean difference 0.551, 95% CI 0.290–0.811, p<0.001; and 0.762, 95% CI 0.303–1.221, p = 0.001, respectively). Sub-group analyses demonstrated no significant difference in MPP for those with neuropathy with active ulceration compared to those without ulcers. A significant difference in MPP was found for those with neuropathy with a past history of ulceration compared to those without ulcers (0.467, 95% CI 0.181–0.753, p = 0.001). Statistical heterogeneity between studies was moderate.
Conclusions Plantar pressures appear to be significantly higher in patients with diabetic peripheral neuropathy with a history of foot ulceration compared to those with diabetic neuropathy without a history of ulceration. More homogeneous data are needed to confirm these findings.
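The reported effect sizes and confidence intervals can be checked against the reported p-values. The sketch below assumes a normal approximation, with the 95% interval taken as the estimate plus or minus 1.96 standard errors; the abstract does not say how the intervals were computed, and the function name is illustrative.

```python
from math import erfc, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def z_and_p_from_ci(estimate, lower, upper):
    """Recover the standard error, z statistic and two-sided p-value
    from a point estimate and its 95% CI (normal approximation assumed)."""
    se = (upper - lower) / (2 * Z_95)
    z = estimate / se
    p = erfc(abs(z) / sqrt(2))  # two-sided tail probability of N(0, 1)
    return se, z, p


# Overall MPP and PTI standardised mean differences quoted in the abstract.
for label, est, lo, hi in [("MPP", 0.551, 0.290, 0.811),
                           ("PTI", 0.762, 0.303, 1.221)]:
    se, z, p = z_and_p_from_ci(est, lo, hi)
    print(f"{label}: SE={se:.3f}, z={z:.2f}, p={p:.4f}")
```

Under this approximation both results fall in the same range as the p-values given in the abstract.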
Abstract:
Giant Cell Arteritis (GCA) is the most common vasculitis affecting the elderly. Archived formalin-fixed paraffin-embedded (FFPE) temporal artery biopsy (TAB) specimens potentially represent a valuable resource for large-scale genetic analysis of this disease. FFPE TAB samples were obtained from 12 patients with GCA. Extracted TAB DNA was assessed by real-time PCR before restoration using the Illumina HD FFPE Restore Kit. Paired FFPE-blood samples were genotyped on the Illumina OmniExpress FFPE microarray. The FFPE samples that passed stringent quality control measures had a mean genotyping success of >97%. When compared with their matching peripheral blood DNA, the mean discordant heterozygote and homozygote single nucleotide polymorphism calls were 0.0028 and 0.0003, respectively, which is within the accepted tolerance of reproducibility. This work demonstrates that it is possible to successfully obtain high-quality microarray-based genotypes from FFPE TAB samples and that these data are similar to those obtained from peripheral blood.
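A discordance rate of this kind can be computed by comparing paired calls at SNPs that were successfully genotyped in both samples. The following is a minimal sketch: the genotype encoding ("AA", "AB", "BB", with "NC" for a no-call) and the function name are assumptions, and it reports a single overall rate, not the separate heterozygote and homozygote figures given in the abstract.

```python
def discordance_rate(calls_a, calls_b, missing="NC"):
    """Fraction of SNPs whose genotype calls differ between two paired samples.

    SNPs with a no-call in either sample are excluded from the comparison.
    Allele order is ignored, so "AB" and "BA" count as the same genotype.
    """
    if len(calls_a) != len(calls_b):
        raise ValueError("call lists must have the same length")
    compared = 0
    discordant = 0
    for a, b in zip(calls_a, calls_b):
        if a == missing or b == missing:
            continue
        compared += 1
        if sorted(a) != sorted(b):
            discordant += 1
    if compared == 0:
        raise ValueError("no SNPs called in both samples")
    return discordant / compared


# Made-up calls for illustration only.
ffpe = ["AA", "AB", "BB", "NC", "AB"]
blood = ["AA", "BB", "BB", "AB", "BA"]
print(discordance_rate(ffpe, blood))
```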
Abstract:
Background Forearm fractures affect 1.7 million individuals worldwide each year and most occur earlier in life than hip fractures. While the heritability of forearm bone mineral density (BMD) and fracture is high, their genetic determinants are largely unknown. Aim To identify genetic variants associated with forearm BMD and forearm fractures. Methods BMD at distal radius, measured by dual-energy x-ray absorptiometry, was tested for association with common genetic variants. We conducted a meta-analysis of genome-wide association studies for BMD in 5866 subjects of European descent and then selected the variants for replication in 715 Mexican American samples. Gene-based association was carried out to supplement the single-nucleotide polymorphism (SNP) association test. We then tested the BMD-associated SNPs for association with forearm fracture in 2023 cases and 3740 controls. Results We found that five SNPs in the introns of MEF2C were associated with forearm BMD at a genome-wide significance level (p<5×10⁻⁸) in meta-analysis (lead SNP, rs11951031[T] -0.20 SDs per allele, p=9.01×10⁻⁹). The gene-based association test suggested an association between MEF2C and forearm BMD (p=0.003). The association between MEF2C variants and risk of fracture did not achieve statistical significance (SNP rs12521522[A]: OR=1.14 (95% CI 0.92 to 1.35), p=0.14). Meta-analysis also revealed two genome-wide suggestive loci at CTNNA2 and 6q23.2. Conclusions These findings demonstrate that variants at MEF2C were associated with forearm BMD, implicating this gene in the determination of BMD at forearm.
Abstract:
Genome-wide association studies (GWASs) have been successful at identifying single-nucleotide polymorphisms (SNPs) highly associated with common traits; however, a great deal of the heritable variation associated with common traits remains unaccounted for within the genome. Genome-wide complex trait analysis (GCTA) is a statistical method that applies a linear mixed model to estimate phenotypic variance of complex traits explained by genome-wide SNPs, including those not associated with the trait in a GWAS. We applied GCTA to 8 cohorts containing 7096 case and 19 455 control individuals of European ancestry in order to examine the missing heritability present in Parkinson's disease (PD). We meta-analyzed our initial results to produce robust heritability estimates for PD types across cohorts. Our results identify 27% (95% CI 17-38, P = 8.08E-08) phenotypic variance associated with all types of PD, 15% (95% CI -0.2 to 33, P = 0.09) phenotypic variance associated with early-onset PD and 31% (95% CI 17-44, P = 1.34E-05) phenotypic variance associated with late-onset PD. This is a substantial increase from the genetic variance identified by top GWAS hits alone (between 3 and 5%) and indicates there are substantially more risk loci to be identified. Our results suggest that although GWASs are a useful tool in identifying the most common variants associated with complex disease, a great many common variants of small effect remain to be discovered. © Published by Oxford University Press 2012.
Abstract:
Objective: Ankylosing spondylitis (AS) is a debilitating chronic inflammatory condition with a high degree of familiality (λs=82) and heritability (>90%) that primarily affects spinal and sacroiliac joints. Whole genome scans for linkage to AS phenotypes have been conducted, although results have been inconsistent between studies and all have had modest sample sizes. One potential solution to these issues is to combine data from multiple studies in a retrospective meta-analysis. Methods: The International Genetics of Ankylosing Spondylitis Consortium combined data from three whole genome linkage scans for AS (n=3744 subjects) to determine chromosomal markers that show evidence of linkage with disease. Linkage markers typed in different centres were integrated into a consensus map to facilitate effective data pooling. We performed a weighted meta-analysis to combine the linkage results, and compared them with the three individual scans and a combined pooled scan. Results: In addition to the expected region surrounding the HLA-B27 gene on chromosome 6, we determined that several marker regions showed significant evidence of linkage with disease status. Regions on chromosome 10q and 16q achieved 'suggestive' evidence of linkage, and regions on chromosomes 1q, 3q, 5q, 6q, 9q, 17q and 19q showed at least nominal linkage in two or more scans and in the weighted meta-analysis. Regions previously associated with AS on chromosome 2q (the IL-1 gene cluster) and 22q (CYP2D6) exhibited nominal linkage in the meta-analysis, providing further statistical support for their involvement in susceptibility to AS. Conclusion: These findings provide a useful guide for future studies aiming to identify the genes involved in this highly heritable condition. Published by on behalf of the British Society for Rheumatology.
Abstract:
Objective: The aim of this study was to explore whether there is a relationship between the degree of MR-defined inflammation using ultra small super-paramagnetic iron oxide (USPIO) particles, and biomechanical stress using finite element analysis (FEA) techniques, in carotid atheromatous plaques. Methods and Results: 18 patients with angiographically proven carotid stenoses underwent multi-sequence MR imaging before and 36 h after USPIO infusion. T2*-weighted images were manually segmented into quadrants and the signal change in each quadrant, normalised to adjacent muscle, was calculated after USPIO administration. Plaque geometry was obtained from the rest of the multi-sequence dataset and used within an FEA model to predict maximal stress concentration within each slice. Subsequently, a new statistical model was developed to explicitly investigate the form of the relationship between biomechanical stress and signal change. The Spearman's rank correlation coefficient for USPIO-enhanced signal change and maximal biomechanical stress was -0.60 (p = 0.009). Conclusions: There is an association between biomechanical stress and USPIO-enhanced MR-defined inflammation within carotid atheroma, both known risk factors for plaque vulnerability. This underlines the complex interaction between physiological processes and biomechanical mechanisms in the development of carotid atheroma. However, these are preliminary data that will need validation in a larger cohort of patients.
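Spearman's rank correlation, the statistic reported in this abstract, is the Pearson correlation of the ranks of the two variables. The sketch below is a plain-Python illustration with tied values given their average rank; the sample values are made up and are not the study's data.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)


# Made-up example: signal change tends to fall as stress rises.
stress = [1, 2, 3, 4, 5]
signal_change = [2, 1, 4, 3, 5]
print(spearman_rho(stress, signal_change))
```

Because it uses ranks only, the coefficient captures monotone association without assuming a linear relationship.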
Abstract:
With a growing population and fast urbanization in Australia, maintaining our water quality is a challenging task. It is essential to develop an appropriate statistical methodology for analyzing water quality data in order to draw valid conclusions and hence provide useful advice for water management. This paper develops robust rank-based procedures for analyzing nonnormally distributed data collected over time at different sites. To take account of temporal correlations of the observations within sites, we consider the optimally combined estimating functions proposed by Wang and Zhu (Biometrika, 93:459-464, 2006), which lead to more efficient parameter estimation. Furthermore, we apply the induced smoothing method to reduce the computational burden. Smoothing leads to easy calculation of the parameter estimates and their variance-covariance matrix. Analysis of water quality data from Total Iron and Total Cyanophytes shows the differences between the traditional generalized linear mixed models and rank regression models. Our analysis also demonstrates the advantages of the rank regression models for analyzing nonnormal data.
Abstract:
Water temperature measurements from Wivenhoe Dam offer a unique opportunity for studying fluctuations of temperatures in a subtropical dam as a function of time and depth. Cursory examination of the data indicates a complicated structure across both time and depth. We propose simplifying the task of describing these data by breaking the time series at each depth into physically meaningful components that individually capture daily, subannual, and annual (DSA) variations. Precise definitions for each component are formulated in terms of a wavelet-based multiresolution analysis. The DSA components are approximately pairwise uncorrelated within a given depth and between different depths. They also satisfy an additive property in that their sum is exactly equal to the original time series. Each component is based upon a set of coefficients that decomposes the sample variance of each time series exactly across time and that can be used to study both time-varying variances of water temperature at each depth and time-varying correlations between temperatures at different depths. Each DSA component is amenable to studying a certain aspect of the relationship between the series at different depths. The daily component in general is weakly correlated between depths, including those that are adjacent to one another. The subannual component quantifies seasonal effects and in particular isolates phenomena associated with the thermocline, thus simplifying its study across time. The annual component can be used for a trend analysis. The descriptive analysis provided by the DSA decomposition is a useful precursor to a more formal statistical analysis.
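The additive property described in this abstract can be illustrated with the simplest multiresolution analysis, based on the Haar wavelet. This is only a sketch under stated assumptions: the abstract does not name the wavelet filter or transform used, and the function names and sample series below are hypothetical. With the Haar wavelet, the level-j smooth is the mean over blocks of 2^j samples, and each detail is the difference between successive smooths.

```python
def block_means(x, size):
    """Replace each consecutive block of `size` samples by its mean."""
    out = []
    for start in range(0, len(x), size):
        block = x[start:start + size]
        mean = sum(block) / len(block)
        out.extend([mean] * len(block))
    return out


def haar_mra(x, levels):
    """Haar multiresolution analysis of x.

    Returns (details, smooth) where details[j-1] is the level-j detail
    series and smooth is the level-`levels` smooth. By construction,
    x equals the sum of all details plus the smooth.
    """
    if len(x) % (2 ** levels) != 0:
        raise ValueError("series length must be a multiple of 2**levels")
    details = []
    previous = list(x)
    for j in range(1, levels + 1):
        smooth = block_means(x, 2 ** j)
        details.append([p - s for p, s in zip(previous, smooth)])
        previous = smooth
    return details, previous


# Made-up series of length 8, decomposed into 3 detail levels and a smooth.
series = [1, 3, 5, 7, 2, 4, 6, 8]
details, smooth = haar_mra(series, 3)
rebuilt = [sum(d[t] for d in details) + smooth[t] for t in range(len(series))]
print(rebuilt)
```

In a DSA-style analysis, adjacent detail levels would be summed into bands such as daily or subannual, with the coarsest smooth playing the role of the slowly varying component; how levels map to those bands depends on the sampling interval and is not specified here.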
Abstract:
In analysis of longitudinal data, the variance matrix of the parameter estimates is usually estimated by the 'sandwich' method, in which the variance for each subject is estimated by its residual products. We propose smooth bootstrap methods by perturbing the estimating functions to obtain 'bootstrapped' realizations of the parameter estimates for statistical inference. Our extensive simulation studies indicate that the variance estimators by our proposed methods can not only correct the bias of the sandwich estimator but also improve the confidence interval coverage. We applied the proposed method to a data set from a clinical trial of antibiotics for leprosy.
Abstract:
Robust methods are useful in making reliable statistical inferences when there are small deviations from the model assumptions. The widely used method of the generalized estimating equations can be "robustified" by replacing the standardized residuals with the M-residuals. If the Pearson residuals are assumed to be unbiased from zero, parameter estimators from the robust approach are asymptotically biased when error distributions are not symmetric. We propose a distribution-free method for correcting this bias. Our extensive numerical studies show that the proposed method can reduce the bias substantially. Examples are given for illustration.
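The source of the bias described in this abstract can be seen with a common choice of M-residual, the Huber function, which truncates a standardized residual at plus or minus c. The sketch below assumes Huber's function with the conventional tuning constant c = 1.345; the abstract does not say which function the authors use, and the residual values are made up. The residuals average to zero, but their truncated versions do not when the distribution is skewed.

```python
def huber_psi(r, c=1.345):
    """Huber's psi: the residual itself inside [-c, c], truncated outside."""
    return max(-c, min(c, r))


# Made-up, right-skewed standardized residuals with mean exactly zero.
residuals = [-0.5] * 8 + [2.0] * 2

mean_raw = sum(residuals) / len(residuals)
mean_psi = sum(huber_psi(r) for r in residuals) / len(residuals)

print(f"mean of residuals:   {mean_raw:.3f}")
print(f"mean of M-residuals: {mean_psi:.3f}")
```

The nonzero mean of the truncated residuals is what makes the robust estimating equations biased under asymmetric errors; a correction has to account for this expectation in some way.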
Abstract:
Prostate cancer is the second most common malignancy among men worldwide. Genome-wide association studies have identified 100 risk variants for prostate cancer, which can explain approximately 33% of the familial risk of the disease. We hypothesized that a comprehensive analysis of genetic variations found within the 3' untranslated region of genes predicted to affect miRNA binding (miRSNP) can identify additional prostate cancer risk variants. We investigated the association between 2,169 miRSNPs and prostate cancer risk in a large-scale analysis of 22,301 cases and 22,320 controls of European ancestry from 23 participating studies. Twenty-two miRSNPs were associated (P<2.3×10⁻⁵) with risk of prostate cancer, 10 of which were within 7 genes not previously mapped by GWAS. Further, using miRNA mimics and reporter gene assays, we showed that miR-3162-5p has specific affinity for the KLK3 rs1058205 miRSNP T-allele, whereas miR-370 has greater affinity for the VAMP8 rs1010 miRSNP A-allele, validating their functional role. SIGNIFICANCE Findings from this large association study suggest that a focus on miRSNPs, including functional evaluation, can identify candidate risk loci below currently accepted statistical levels of genome-wide significance. Studies of miRNAs and their interactions with SNPs could provide further insights into the mechanisms of prostate cancer risk.
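The significance threshold quoted in this abstract is numerically consistent with a Bonferroni correction of a 0.05 family-wise error rate across the 2,169 miRSNPs tested. The abstract does not state how the threshold was derived, so treating it as a Bonferroni bound is an assumption; the sketch below only shows the arithmetic.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise
    error rate at or below alpha across n_tests tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# 2,169 miRSNPs are reported in the abstract; alpha = 0.05 is assumed.
threshold = bonferroni_threshold(0.05, 2169)
print(f"{threshold:.3e}")
```

The same arithmetic with one million independent tests gives the conventional genome-wide threshold of 5×10⁻⁸, which is why a targeted panel can flag loci that fall short of genome-wide significance.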
Abstract:
The article describes a generalized estimating equations approach that was used to investigate the impact of technology on vessel performance in a trawl fishery during 1988-96, while accounting for spatial and temporal correlations in the catch-effort data. Robust estimation of parameters in the presence of several levels of clustering depended more on the choice of cluster definition than on the choice of correlation structure within the cluster. Models with smaller cluster sizes produced stable results, while models with larger cluster sizes, that may have had complex within-cluster correlation structures and that had within-cluster covariates, produced estimates sensitive to the correlation structure. The preferred model arising from this dataset assumed that catches from a vessel were correlated in the same years and the same areas, but independent in different years and areas. The model that assumed catches from a vessel were correlated in all years and areas, equivalent to a random effects term for vessel, produced spurious results. This was an unexpected finding that highlighted the need to adopt a systematic strategy for modelling. The article proposes a modelling strategy of selecting the best cluster definition first, and the working correlation structure (within clusters) second. The article discusses the selection and interpretation of the model in the light of background knowledge of the data and utility of the model, and the potential for this modelling approach to apply in similar statistical situations.