2 resultados para Two-Locus Inheritance, Pest Resistance, Grain, Genetics
em Collection Of Biostatistics Research Archive
Resumo:
The advances in computational biology have made simultaneous monitoring of thousands of features possible. The high throughput technologies not only bring about a much richer information context in which to study various aspects of gene functions but they also present challenge of analyzing data with large number of covariates and few samples. As an integral part of machine learning, classification of samples into two or more categories is almost always of interest to scientists. In this paper, we address the question of classification in this setting by extending partial least squares (PLS), a popular dimension reduction tool in chemometrics, in the context of generalized linear regression based on a previous approach, Iteratively ReWeighted Partial Least Squares, i.e. IRWPLS (Marx, 1996). We compare our results with two-stage PLS (Nguyen and Rocke, 2002A; Nguyen and Rocke, 2002B) and other classifiers. We show that by phrasing the problem in a generalized linear model setting and by applying bias correction to the likelihood to avoid (quasi)separation, we often get lower classification error rates.
Resumo:
The etiology of complex diseases is heterogeneous. The presence of risk alleles in one or more genetic loci affects the function of a variety of intermediate biological pathways, resulting in the overt expression of disease. Hence, there is an increasing focus on identifying the genetic basis of disease by sytematically studying phenotypic traits pertaining to the underlying biological functions. In this paper we focus on identifying genetic loci linked to quantitative phenotypic traits in experimental crosses. Such genetic mapping methods often use a one stage design by genotyping all the markers of interest on the available subjects. A genome scan based on single locus or multi-locus models is used to identify the putative loci. Since the number of quantitative trait loci (QTLs) is very likely to be small relative to the number of markers genotyped, a one-stage selective genotyping approach is commonly used to reduce the genotyping burden, whereby markers are genotyped solely on individuals with extreme trait values. This approach is powerful in the presence of a single quantitative trait locus (QTL) but may result in substantial loss of information in the presence of multiple QTLs. Here we investigate the efficiency of sequential two stage designs to identify QTLs in experimental populations. Our investigations for backcross and F2 crosses suggest that genotyping all the markers on 60% of the subjects in Stage 1 and genotyping the chromosomes significant at 20% level using additional subjects in Stage 2 and testing using all the subjects provides an efficient approach to identify the QTLs and utilizes only 70% of the genotyping burden relative to a one stage design, regardless of the heritability and genotyping density. Complex traits are a consequence of multiple QTLs conferring main effects as well as epistatic interactions. We propose a two-stage analytic approach where a single-locus genome scan is conducted in Stage 1 to identify promising chromosomes, and interactions are examined using the loci on these chromosomes in Stage 2. We examine settings under which the two-stage analytic approach provides sufficient power to detect the putative QTLs.