598 results for gene selection


Abstract:

We investigate methods for data-based selection of working covariance models in the analysis of correlated data with generalized estimating equations. We study two selection criteria: Gaussian pseudolikelihood and a geodesic distance based on discrepancy between model-sensitive and model-robust regression parameter covariance estimators. The Gaussian pseudolikelihood is found in simulation to be reasonably sensitive for several response distributions and noncanonical mean-variance relations for longitudinal data. Application is also made to a clinical dataset. Assessment of adequacy of both correlation and variance models for longitudinal data should be routine in applications, and we describe open-source software supporting this practice.

Abstract:

A modeling paradigm is proposed for covariate, variance and working correlation structure selection for longitudinal data analysis. Appropriate selection of covariates is pertinent to correct variance modeling, and selecting the appropriate covariates and variance function is vital to correlation structure selection. This leads to a stepwise model selection procedure that deploys a combination of different model selection criteria. Although these criteria share a common theoretical root in approximating the Kullback-Leibler distance, they are designed to address different aspects of model selection and have different merits and limitations. For example, the extended quasi-likelihood information criterion (EQIC) with a covariance penalty performs well for covariate selection even when the working variance function is misspecified, but EQIC contains little information on correlation structures. The proposed model selection strategies are outlined and a Monte Carlo assessment of their finite sample properties is reported. Two longitudinal studies are used for illustration.

Abstract:

Consider a general regression model with an arbitrary and unknown link function and a stochastic selection variable that determines whether the outcome variable is observable or missing. The paper proposes U-statistics that are based on kernel functions as estimators for the directions of the parameter vectors in the link function and the selection equation, and shows that these estimators are consistent and asymptotically normal.

Abstract:

Efficiency of analysis using generalized estimating equations is enhanced when intracluster correlation structure is accurately modeled. We compare two existing criteria (a quasi-likelihood information criterion, and the Rotnitzky-Jewell criterion) to identify the true correlation structure via simulations with Gaussian or binomial response, covariates varying at cluster or observation level, and exchangeable or AR(1) intracluster correlation structure. Rotnitzky and Jewell's approach performs better when the true intracluster correlation structure is exchangeable, while the quasi-likelihood criterion performs better for an AR(1) structure.
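
The two intracluster correlation structures named in this abstract can be illustrated with a minimal sketch. The function names and the example values (cluster size 3, correlation 0.5) are assumptions chosen for illustration and do not come from the study; the code only builds the matrices and does not implement either selection criterion.

```python
def exchangeable_corr(n, rho):
    """n x n exchangeable structure: 1 on the diagonal, rho everywhere else."""
    return [[1.0 if j == k else rho for k in range(n)] for j in range(n)]


def ar1_corr(n, rho):
    """n x n first-order autoregressive structure: entry (j, k) is rho ** |j - k|."""
    return [[rho ** abs(j - k) for k in range(n)] for j in range(n)]


# Hypothetical cluster of 3 observations with correlation parameter 0.5.
for name, matrix in (("exchangeable", exchangeable_corr(3, 0.5)),
                     ("AR(1)", ar1_corr(3, 0.5))):
    print(name)
    for row in matrix:
        print("  ", row)
```

Under the exchangeable structure every pair of observations in a cluster is equally correlated, whereas under AR(1) the correlation decays with the distance between observations, which is why the two structures can favor different selection criteria.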

Abstract:

Breast cancer incidence and mortality rates are increasing despite our current knowledge of the disease. Ninety-five percent of breast cancer cases correspond to sporadic forms of the disease and are believed to involve an interaction between environmental and genetic determinants. The microRNA 17–92 cluster host gene (MIR17HG) has been shown to regulate expression of genes involved in breast cancer development and progression. Study of single-nucleotide polymorphisms (SNPs) located in this cluster gene could help provide a further understanding of its role in breast cancer. Therefore, this study investigated six SNPs in MIR17HG using two independent Australian Caucasian case–control populations (GRC-BC and GU-CCQ BB populations) to determine association with breast cancer susceptibility. Genotyping was undertaken using chip-based matrix-assisted laser desorption ionisation time-of-flight (MALDI-TOF) mass spectrometry (MS). We found a significant association between rs4824505 and breast cancer at the allelic level in both study cohorts (GRC-BC p = 0.01 and GU-CCQ BB p = 0.03). Furthermore, haplotypic analysis of results from our combined population determined a significant association between rs4824505/rs7336610 and breast cancer susceptibility (p = 5 × 10⁻⁴). Our study is the first to show that the A allele of rs4824505 and the AC haplotype of rs4824505/rs7336610 are associated with risk of breast cancer development. However, definitive validation of this finding requires larger cohorts or populations of different ethnic backgrounds. Finally, functional studies of these SNPs could provide a deeper understanding of the role that MIR17HG plays in the pathophysiology of breast cancer.

Abstract:

Movement of tephritid flies underpins their survival, reproduction, and ability to establish in new areas and is thus of importance when designing effective management strategies. Much of the knowledge currently available on tephritid movement throughout landscapes comes from the use of direct or indirect methods that rely on the trapping of individuals. Here, we review published experimental designs and methods from mark-release-recapture (MRR) studies, as well as other methods, that have been used to estimate movement of the four major tephritid pest genera (Bactrocera, Ceratitis, Anastrepha, and Rhagoletis). In doing so, we aim to illustrate the theoretical and practical considerations needed to study tephritid movement. MRR studies make use of traps to directly estimate the distance that tephritid species can move within a generation and to evaluate the ecological and physiological factors that influence dispersal patterns. MRR studies, however, require careful planning to ensure that the results obtained are not biased by the methods employed, including marking methods, trap properties, trap spacing, and spatial extent of the trapping array. Despite these obstacles, MRR remains a powerful tool for determining tephritid movement, with data particularly required for understudied species that affect developing countries. To ensure that future MRR studies are successful, we suggest that site selection be carefully considered and sufficient resources be allocated to achieve optimal spacing and placement of traps in line with the stated aims of each study. An alternative to MRR is to make use of indirect methods for determining movement, or more correctly, gene flow, which have become widely available with the development of molecular tools. 
Key to these methods is the trapping and sequencing of a suitable number of individuals to represent the genetic diversity of the sampled population and investigate population structuring using nuclear genomic markers or non-recombinant mitochondrial DNA markers. Microsatellites are currently the preferred marker for detecting recent population displacement and provide genetic information that may be used in assignment tests for the direct determination of contemporary movement. Neither MRR nor molecular methods, however, are able to monitor fine-scale movements of individual flies. Recent developments in the miniaturization of electronics offer the tantalising possibility of tracking individual movements of insects using harmonic radar. Computer vision and radio frequency identification tags may also permit the tracking of fine-scale movements by tephritid flies through automated resampling, although these methods come with the same problems as traditional traps used in MRR studies. Although all methods described in this chapter have limitations, the value of a better understanding of tephritid movement far outweighs the drawbacks of the individual methods, because this information is needed to manage tephritid populations.

Abstract:

The identification of molecular networks at the system level in mammals is accelerated by next-generation mammalian genetics without crossing, which requires both the efficient production of whole-body biallelic knockout (KO) mice in a single generation and high-performance phenotype analyses. Here, we show that the triple targeting of a single gene using the CRISPR/Cas9 system achieves almost perfect KO efficiency (96%–100%). In addition, we developed a respiration-based, fully automated, noninvasive sleep phenotyping system, the Snappy Sleep Stager (SSS), for high-performance (95.3% accuracy) sleep/wake staging. Using the triple-target CRISPR and SSS in tandem, we reliably obtained sleep/wake phenotypes, even in double-KO mice. By using this system to comprehensively analyze all of the N-methyl-D-aspartate (NMDA) receptor family members, we identified Nr3a as a short-sleeper gene, which was verified by an independent set of triple-target CRISPR. These results demonstrate the application of mammalian reverse genetics without crossing to organism-level systems biology in sleep research.

Abstract:

The quality of species distribution models (SDMs) relies to a large degree on the quality of the input data, from bioclimatic indices to environmental and habitat descriptors (Austin, 2002). Recent reviews of SDM techniques have sought to optimize predictive performance (e.g. Elith et al., 2006). In general, SDMs employ one of three approaches to variable selection. The simplest approach relies on the expert to select the variables, as in environmental niche models (Nix, 1986) or a generalized linear model without variable selection (Miller and Franklin, 2002). A second approach explicitly incorporates variable selection into model fitting, which allows examination of particular combinations of variables. Examples include generalized linear or additive models with variable selection (Hastie et al., 2002), or classification trees with complexity or model-based pruning (Breiman et al., 1984; Zeileis, 2008). A third approach uses model averaging to summarize the overall contribution of a variable, without considering particular combinations. Examples include neural networks, boosted or bagged regression trees and Maximum Entropy, as compared in Elith et al. (2006). Typically, users of SDMs will either consider a small number of variable sets, via the first approach, or else supply all of the candidate variables (often numbering more than a hundred) to the second or third approaches. Bayesian SDMs exist, with several methods for eliciting and encoding priors on model parameters (see review in Low Choy et al., 2010). However, few methods have been published for informative variable selection; one example is Bayesian trees (O’Leary, 2008). Here we report an elicitation protocol that helps make explicit a priori expert judgements on the quality of candidate variables. This protocol can be flexibly applied to any of the three approaches to variable selection described above, Bayesian or otherwise. 
We demonstrate how this information can be obtained then used to guide variable selection in classical or machine learning SDMs, or to define priors within Bayesian SDMs.

Abstract:

Introduction: It is unclear whether patients diagnosed according to International Classification of Headache Disorders criteria for migraine with aura (MA) and migraine without aura (MO) experience distinct disorders or whether their migraine subtypes are genetically related. Aim: Using a novel gene-based (statistical) approach, we aimed to identify individual genes and pathways associated with both MA and MO. Methods: Gene-based tests were performed using genome-wide association summary statistic results from the most recent International Headache Genetics Consortium study comparing 4505 MA cases with 34,813 controls and 4038 MO cases with 40,294 controls. After accounting for non-independence of gene-based test results, we examined the significance of the proportion of shared genes associated with MA and MO. Results: We found a significant overlap in genes associated with MA and MO. Of the total 1514 genes with a nominally significant gene-based p value (gene-based p ≤ 0.05) in the MA subgroup, 107 also produced a gene-based p ≤ 0.05 in the MO subgroup. The proportion of overlapping genes is almost double the empirically derived null expectation, producing significant evidence of gene-based overlap (pleiotropy) (binomial test p = 1.5 × 10⁻⁴). Combining results across MA and MO, six genes produced genome-wide significant gene-based p values. Four of these genes (TRPM8, UFL1, FHL5 and LRP1) were located in close proximity to previously reported genome-wide significant SNPs for migraine, while two genes, TARBP2 and NPFF, separated by just 259 bp on chromosome 12q13.13, represent a novel risk locus. The genes overlapping in both migraine types were enriched for functions related to inflammation, the cardiovascular system and connective tissue. 
Conclusions: Our results provide novel insight into the likely genes and biological mechanisms that underlie both MA and MO, and when combined with previous data, highlight the neuropeptide FF-amide peptide encoding gene (NPFF) as a novel candidate risk gene for both types of migraine.
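
The overlap arithmetic reported in this abstract can be sketched as a one-sided binomial test. Only the counts 107 and 1514 come from the abstract. The null proportion `null_prop` is a hypothetical placeholder, because the abstract does not state the empirically derived expectation. The sketch also treats genes as independent, whereas the study adjusted for non-independence, so it is not expected to reproduce the reported p value.

```python
import math


def binom_log_pmf(k, n, p):
    """Log of the binomial probability of exactly k successes in n trials."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def binom_upper_tail(k_obs, n, p):
    """One-sided tail probability P(X >= k_obs) for X ~ Binomial(n, p)."""
    return sum(math.exp(binom_log_pmf(k, n, p)) for k in range(k_obs, n + 1))


n_ma_genes = 1514   # genes with gene-based p <= 0.05 in MA (from the abstract)
n_shared = 107      # of those, genes also with gene-based p <= 0.05 in MO
null_prop = 0.04    # HYPOTHETICAL null overlap proportion, not from the study

observed_prop = n_shared / n_ma_genes
p_value = binom_upper_tail(n_shared, n_ma_genes, null_prop)

print(f"observed overlap proportion: {observed_prop:.4f}")
print(f"one-sided binomial p under the assumed null: {p_value:.3g}")
```

The probabilities are summed in log space because the binomial coefficients for 1514 trials are too large to convert to floating point directly.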

Abstract:

Background: Leucocyte telomere length (LTL), which is fashioned by multiple genes, has been linked to a host of human diseases, including sporadic melanoma. A number of genes associated with LTL have already been identified through genome-wide association studies. The main aim of this study was to establish whether DCAF4 (DDB1 and CUL4-associated factor 4) is associated with LTL. In addition, using ingenuity pathway analysis (IPA), we examined whether LTL-associated genes in the general population might partially explain the inherently longer LTL in patients with sporadic melanoma, the risk for which is increased with ultraviolet radiation (UVR). Results: Genome-wide association (GWA) meta-analysis and de novo genotyping of 20 022 individuals revealed a novel association (p = 6.4 × 10⁻¹⁰) between LTL and rs2535913, which lies within DCAF4. Notably, eQTL analysis showed that rs2535913 is associated with a decline in DCAF4 expression in both lymphoblastoid cells and sun-exposed skin (p = 4.1 × 10⁻³ and 2 × 10⁻³, respectively). Moreover, IPA revealed that LTL-associated genes, derived from GWA meta-analysis (N = 9190), are over-represented among genes engaged in melanoma pathways. Meeting increasingly stringent p value thresholds (p < 0.05, < 0.01, < 0.005, < 0.001) in the LTL-GWA meta-analysis, these genes were jointly over-represented for melanoma at p values ranging from 1.97 × 10⁻¹⁶⁹ to 3.42 × 10⁻²⁴. Conclusions: We uncovered a new locus associated with LTL in the general population. We also provided preliminary findings that suggest a link of LTL through genetic mechanisms with UVR and melanoma in the general population.

Abstract:

There is evidence across several species for genetic control of phenotypic variation of complex traits [1–4], such that the variance among phenotypes is genotype dependent. Understanding genetic control of variability is important in evolutionary biology, agricultural selection programmes and human medicine, yet for complex traits, no individual genetic variants associated with variance, as opposed to the mean, have been identified. Here we perform a meta-analysis of genome-wide association studies of phenotypic variation using ~170,000 samples on height and body mass index (BMI) in human populations. We report evidence that the single nucleotide polymorphism (SNP) rs7202116 at the FTO gene locus, which is known to be associated with obesity (as measured by mean BMI for each rs7202116 genotype) [5–7], is also associated with phenotypic variability. We show that the results are not due to scale effects or other artefacts, and find no other experiment-wise significant evidence for effects on variability, either at loci other than FTO for BMI or at any locus for height. The difference in variance for BMI among individuals with opposite homozygous genotypes at the FTO locus is approximately 7%, corresponding to a difference of ~0.5 kilograms in the standard deviation of weight. Our results indicate that genetic variants can be discovered that are associated with variability, and that between-person variability in obesity can partly be explained by the genotype at the FTO locus. The results are consistent with reported FTO by environment interactions for BMI [8], possibly mediated by DNA methylation [9, 10]. Our BMI results for other SNPs and our height results for all SNPs suggest that most genetic variants, including those that influence mean height or mean BMI, are not associated with phenotypic variance, or that their effects on variability are too small to detect even with sample sizes greater than 100,000.