994 results for gene trees


Relevance:

20.00%

Publisher:

Abstract:

Decision trees need training samples in the training data set to derive classification rules. If the training set is too small, important information may be missed and the model cannot explain the classification rules of the data. However, a large training set does not guarantee a good model. This paper analyzes the relationship between decision trees and the scale of the training data. We use nine decision tree algorithms to evaluate the accuracy, complexity and robustness of decision tree algorithms. Some results are presented.
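The relationship between training-set size and test accuracy described in this abstract can be illustrated with a small learning-curve experiment. The sketch below is only an illustration under stated assumptions: it uses synthetic one-dimensional data with 10% label noise and a one-split tree (a decision stump), not any of the nine algorithms or the data sets used in the paper. The names `fit_stump`, `make_data` and `learning_curve` are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split decision tree on 1-D data by minimizing training errors."""
    best = None
    for t in sorted(set(xs)):
        for left_label in (0, 1):
            # Points with x <= t get left_label, the rest get the other label.
            errors = sum(
                (left_label if x <= t else 1 - left_label) != y
                for x, y in zip(xs, ys)
            )
            if best is None or errors < best[0]:
                best = (errors, t, left_label)
    _, threshold, left_label = best
    return threshold, left_label

def predict(stump, x):
    threshold, left_label = stump
    return left_label if x <= threshold else 1 - left_label

def make_data(n, rng, noise=0.1):
    """Synthetic data (assumption): true class is 1 when x > 0.5, labels flipped with probability `noise`."""
    xs = [rng.random() for _ in range(n)]
    ys = []
    for x in xs:
        y = 1 if x > 0.5 else 0
        if rng.random() < noise:
            y = 1 - y
        ys.append(y)
    return xs, ys

def learning_curve(sizes, seed=0, n_test=2000):
    """Return {training size: test accuracy} for a stump trained at each size."""
    rng = random.Random(seed)
    test_x, test_y = make_data(n_test, rng)
    curve = {}
    for n in sizes:
        train_x, train_y = make_data(n, rng)
        stump = fit_stump(train_x, train_y)
        hits = sum(predict(stump, x) == y for x, y in zip(test_x, test_y))
        curve[n] = hits / n_test
    return curve

if __name__ == "__main__":
    for n, acc in learning_curve([5, 20, 100, 500]).items():
        print(f"train size {n:4d}: test accuracy {acc:.3f}")
```

With label noise of 10%, accuracy cannot exceed roughly 0.9 however large the training set is, which mirrors the abstract's point that more data alone does not guarantee a better model.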

Relevance:

20.00%

Publisher:

Abstract:

CD6 has recently been identified and validated as a risk gene for multiple sclerosis (MS), based on the association of a single nucleotide polymorphism (SNP), rs17824933, located in intron 1. CD6 is a cell surface scavenger receptor involved in T-cell activation and proliferation, as well as in thymocyte differentiation. In this study, we performed a haptag SNP screen of the CD6 gene locus using a total of thirteen tagging SNPs, of which three were non-synonymous SNPs, and replicated the recently reported GWAS SNP rs650258 in a Spanish-Basque collection of 814 controls and 823 cases. Validation of the six most strongly associated SNPs was performed in an independent collection of 2265 MS patients and 2600 healthy controls. We identified association of haplotypes composed of two non-synonymous SNPs [rs11230563 (R225W) and rs2074225 (A257V)] in the 2nd SRCR domain with susceptibility to MS (Pmax(T) permutation = 1 × 10⁻⁴). The effect of these haplotypes on CD6 surface expression and cytokine secretion was also tested. The analysis showed significantly different CD6 expression patterns in the distinct cell subsets, i.e., CD4+ naïve cells, P = 0.0001; CD8+ naïve cells, P < 0.0001; CD4+ and CD8+ central memory cells, P = 0.01 and 0.05, respectively; and natural killer T (NKT) cells, P = 0.02; with the protective haplotype (RA) showing higher expression of CD6. However, no significant changes were observed in natural killer (NK) cells, effector memory and terminally differentiated effector memory T cells. Our findings reveal that this new MS-associated CD6 risk haplotype significantly modifies expression of CD6 on CD4+ and CD8+ T cells.

Relevance:

20.00%

Publisher:

Abstract:

Progress report for the Trees and Tweets project, Digging into Data Challenge round 3.

Relevance:

20.00%

Publisher:

Abstract:

Background: Gene expression technologies have opened up new ways to diagnose and treat cancer and other diseases. Clustering algorithms are a useful approach with which to analyze genome expression data. They attempt to partition the genes into groups exhibiting similar patterns of variation in expression level. An important problem in gene classification is to determine whether the clustering process can find a relevant partition, and to identify new gene classes. There are two key aspects to classification: the estimation of the number of clusters, and the decision as to whether a new unit (gene, tumor sample, ...) belongs to one of these previously identified clusters or to a new group. Results: ICGE is a user-friendly R package which provides many functions related to this problem: estimating the number of clusters using mixed variables, which applied biomedical researchers commonly encounter; detecting whether the data have a cluster structure; identifying whether a new unit belongs to one of the pre-identified clusters or to a novel group; and classifying new units into the corresponding cluster. The functions in the ICGE package are accompanied by help files and easy examples to facilitate its use. Conclusions: We demonstrate the utility of ICGE by analyzing simulated and real data sets. The results show that ICGE could be very useful to a broad research community.
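The decision the abstract describes, whether a new unit belongs to a previously identified cluster or to a novel group, can be sketched with a simple nearest-centroid rule. This is not the ICGE API (ICGE is an R package) and not the statistical procedure the package implements; it is a swapped-in heuristic for illustration only. The function names, the Euclidean distance, and the `factor` threshold are all assumptions.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of equal-length numeric tuples."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]

def assign_or_flag(clusters, unit, factor=2.0):
    """Return the label of the nearest cluster, or None if the unit looks novel.

    Heuristic (assumption): the unit is treated as novel when its distance to
    the nearest centroid exceeds `factor` times that cluster's radius, where
    the radius is the largest member-to-centroid distance.
    """
    best_label, best_dist, best_radius = None, math.inf, 0.0
    for label, points in clusters.items():
        c = centroid(points)
        d = math.dist(unit, c)
        if d < best_dist:
            radius = max(math.dist(p, c) for p in points)
            best_label, best_dist, best_radius = label, d, radius
    return best_label if best_dist <= factor * best_radius else None

if __name__ == "__main__":
    # Toy two-dimensional "expression profiles" (made-up values).
    clusters = {
        "a": [(0, 0), (1, 0), (0, 1), (1, 1)],
        "b": [(10, 10), (11, 10), (10, 11), (11, 11)],
    }
    print(assign_or_flag(clusters, (0.6, 0.4)))  # close to cluster "a"
    print(assign_or_flag(clusters, (5, 5)))      # far from both clusters
```

A fixed distance threshold is a crude stand-in for a proper significance test; it is used here only to make the "existing cluster or new group" decision concrete.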