5 results for gene selection

at Duke University


Relevance:

100.00%

Publisher:

Abstract:

Although many feature selection methods for classification have been developed, there is a need to identify genes in high-dimensional data with censored survival outcomes. Traditional methods for gene selection in classification problems have several drawbacks. First, the majority of the gene selection approaches for classification are single-gene based. Second, many of the gene selection procedures are not embedded within the algorithm itself. The technique of random forests has been found to perform well in high-dimensional data settings with survival outcomes. It also has an embedded feature to identify variables of importance. Therefore, it is an ideal candidate for gene selection in high-dimensional data with survival outcomes. In this paper, we develop a novel method based on random forests to identify a set of prognostic genes. We compare our method with several machine learning methods and various node split criteria using several real data sets. Our method performed well in both simulations and real data analysis. Additionally, we have shown the advantages of our approach over single-gene-based approaches. Our method incorporates multivariate correlations in microarray data for survival outcomes. The described method allows us to better utilize the information available from microarray data with survival outcomes.

Relevance:

60.00%

Publisher:

Abstract:

Although it has recently been shown that A/J mice are highly susceptible to Staphylococcus aureus sepsis as compared to C57BL/6J, the specific genes responsible for this differential phenotype are unknown. Using chromosome substitution strains (CSS), we found that loci on chromosomes 8, 11, and 18 influence susceptibility to S. aureus sepsis in A/J mice. We then used two candidate gene selection strategies to identify genes on these three chromosomes associated with S. aureus susceptibility, and targeted genes identified by both gene selection strategies. First, we used whole genome transcription profiling to identify 191 (56 on chr. 8, 100 on chr. 11, and 35 on chr. 18) genes on our three chromosomes of interest that are differentially expressed between S. aureus-infected A/J and C57BL/6J. Second, we identified two significant quantitative trait loci (QTL) for survival post-infection on chr. 18 using N(2) backcross mice (F(1) [C18A]xC57BL/6J). Ten genes on chr. 18 (March3, Cep120, Chmp1b, Dcp2, Dtwd2, Isoc1, Lman1, Spire1, Tnfaip8, and Seh1l) mapped to the two significant QTL regions and were also identified by the expression array selection strategy. Using real-time PCR, 6 of these 10 genes (Chmp1b, Dtwd2, Isoc1, Lman1, Tnfaip8, and Seh1l) showed significantly different expression levels between S. aureus-infected A/J and C57BL/6J. For two (Tnfaip8 and Seh1l) of these 6 genes, siRNA-mediated knockdown of gene expression in S. aureus-challenged RAW264.7 macrophages induced significant changes in the cytokine response (IL-1 beta and GM-CSF) compared to negative controls. These cytokine response changes were consistent with those seen in S. aureus-challenged peritoneal macrophages from CSS 18 mice (which contain A/J chromosome 18 but are otherwise C57BL/6J), but not C57BL/6J mice. These findings suggest that two genes, Tnfaip8 and Seh1l, may contribute to susceptibility to S. aureus in A/J mice, and represent promising candidates for human genetic susceptibility studies.

Relevance:

30.00%

Publisher:

Abstract:

Eukaryotic genomes are mostly composed of noncoding DNA whose role is still poorly understood. Studies in several organisms have shown correlations between the length of the intergenic and genic sequences of a gene and the expression of its corresponding mRNA transcript. Some studies have found a positive relationship between intergenic sequence length and expression diversity between tissues, and concluded that genes under greater regulatory control require more regulatory information in their intergenic sequences. Other reports found a negative relationship between expression level and gene length, and the interpretation was that there is selection pressure for highly expressed genes to remain small. However, a correlation between gene sequence length and expression diversity, opposite to that observed for intergenic sequences, has also been reported, and to date there is no testable explanation for this observation. To shed light on these varied and sometimes conflicting results, we performed a thorough study of the relationships between sequence length and gene expression using cell-type (tissue) specific microarray data in Arabidopsis thaliana. We measured median gene expression across tissues (expression level), expression variability between tissues (expression pattern uniformity), and expression variability between replicates (expression noise). We found that intergenic (upstream and downstream) and genic (coding and noncoding) sequences have generally opposite relationships with respect to expression, whether it is tissue variability, median, or expression noise. To explain these results, we propose a model in which the lengths of the intergenic and genic sequences have opposite effects on the ability of the transcribed region of the gene to be epigenetically regulated for differential expression. These findings could shed light on the role and influence of noncoding sequences on gene expression.
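
The three expression measures named in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how variability was quantified, so the coefficient of variation used here is an assumption, as are the function names and the example values, which are hypothetical and not data from the study.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumes a positive mean)."""
    return stdev(values) / mean(values)


def expression_summary(replicates_by_tissue):
    """Summarize one gene given {tissue: [replicate expression values]}.

    The choice of coefficient of variation is an assumption for illustration;
    the abstract only names the three quantities.
    """
    tissue_means = [mean(reps) for reps in replicates_by_tissue.values()]
    replicate_cvs = [
        coefficient_of_variation(reps) for reps in replicates_by_tissue.values()
    ]
    return {
        # Expression level: median across tissues of the per-tissue mean.
        "level": median(tissue_means),
        # Expression pattern uniformity: variability between tissue means.
        "tissue_variability": coefficient_of_variation(tissue_means),
        # Expression noise: average variability between replicates of a tissue.
        "noise": mean(replicate_cvs),
    }


# Hypothetical values for a single gene.
example_gene = {
    "root": [10.0, 12.0, 11.0],
    "leaf": [20.0, 22.0, 21.0],
    "flower": [30.0, 33.0, 27.0],
}
summary = expression_summary(example_gene)
```

In this toy example the between-tissue variability is much larger than the replicate noise, which is the pattern expected for a gene whose expression differs between tissues.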

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: While effective population size (Ne) and life history traits such as generation time are known to impact substitution rates, their potential effects on base composition evolution are less well understood. GC content increases with decreasing body mass in mammals, consistent with recombination-associated GC biased gene conversion (gBGC) more strongly impacting these lineages. However, shifts in chromosomal architecture and recombination landscapes between species may complicate the interpretation of these results. In birds, interchromosomal rearrangements are rare and the recombination landscape is conserved, suggesting that this group is well suited to assess the impact of life history on base composition. RESULTS: Employing data from 45 newly and 3 previously sequenced avian genomes covering a broad range of taxa, we found that lineages with large populations and short generations exhibit higher GC content. The effect extends to both coding and non-coding sites, indicating that it is not due to selection on codon usage. Consistent with recombination driving base composition, GC content and heterogeneity were positively correlated with the rate of recombination. Moreover, we observed ongoing increases in GC in the majority of lineages. CONCLUSIONS: Our results provide evidence that gBGC may drive patterns of nucleotide composition in avian genomes and are consistent with more effective gBGC in large populations and a greater number of meioses per unit time; that is, a shorter generation time. Thus, in accord with theoretical predictions, base composition evolution is substantially modulated by species life history.
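
GC content, the quantity compared across lineages in this abstract, is the fraction of G and C bases in a sequence. The sketch below is a minimal illustration in Python; the function name and the decision to ignore ambiguous bases such as N are assumptions, not the procedure used in the study.

```python
def gc_content(sequence):
    """Return the fraction of G and C among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored, which is an assumption
    made for this illustration.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / (gc + at)
```

Applying the same function separately to coding and non-coding sequences is one simple way to check whether a difference in GC content appears in both classes of sites.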

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: The wide range of complex photic systems observed in birds exemplifies one of their key evolutionary adaptations, a well-developed visual system. However, genomic approaches have yet to be used to disentangle the evolutionary mechanisms that govern evolution of avian visual systems. RESULTS: We performed comparative genomic analyses across 48 avian genomes that span extant bird phylogenetic diversity to assess evolutionary changes in the 17 representatives of the opsin gene family and five plumage coloration genes. Our analyses suggest modern birds have maintained a repertoire of up to 15 opsins. Synteny analyses indicate that PARA and PARIE pineal opsins were lost, probably in conjunction with the degeneration of the parietal organ. Eleven of the 15 avian opsins evolved in a non-neutral pattern, confirming the adaptive importance of vision in birds. Visual conopsins sw1, sw2 and lw evolved under negative selection, while the dim-light RH1 photopigment diversified. The evolutionary patterns of sw1 and of violet/ultraviolet sensitivity in birds suggest that avian ancestors had violet-sensitive vision. Additionally, we demonstrate an adaptive association between the RH2 opsin and the MC1R plumage color gene, suggesting that plumage coloration has been photic mediated. At the intra-avian level we observed some unique adaptive patterns. For example, barn owl showed early signs of pseudogenization in RH2, perhaps in response to nocturnal behavior, and penguins had amino acid deletions in RH2 sites responsible for the red shift and retinal binding. These patterns in the barn owl and penguins were convergent with adaptive strategies in nocturnal and aquatic mammals, respectively. CONCLUSIONS: We conclude that birds have evolved diverse opsin adaptations through gene loss, adaptive selection and coevolution with plumage coloration, and that differentiated selective patterns at the species level suggest novel photic pressures to influence evolutionary patterns of more-recent lineages.