4 results for permutation
in Helda - Digital Repository of University of Helsinki
Abstract:
Knowing the chromosomal regions or actual genes that affect the traits under selection would add information to selection decisions and could potentially lead to a higher genetic response. The first objective of this study was to map quantitative trait loci (QTL) affecting economically important traits in the Finnish Ayrshire population. The second objective was to investigate the effects of using QTL information in marker-assisted selection (MAS) on the genetic response and the linkage disequilibrium between the different parts of the genome. Whole genome scans were carried out on a grand-daughter design with 12 half-sib families and a total of 493 sons. Twelve different traits were studied: milk yield, protein yield, protein content, fat yield, fat content, somatic cell score (SCS), mastitis treatments, other veterinary treatments, days open, fertility treatments, non-return rate, and calf mortality. The average spacing of the typed markers was 20 cM, with 2 to 14 markers per chromosome. Associations between markers and traits were analyzed with multiple marker regression. Significance was determined by permutation, and genome-wise P-values were obtained by Bonferroni correction. The benefits from MAS were investigated by simulation: a conventional progeny testing scheme was compared to a scheme where QTL information was used within families to select among full-sibs in the male path. Two QTL on different chromosomes were modelled. The effects of different starting frequencies of the favourable alleles and different sizes of the QTL effects were evaluated. A large number of QTL, 48 in total, were detected at the 5% chromosome-wise significance level or better. QTL for milk production were found on 8 chromosomes, for SCS on 6, for mastitis treatments on 1, for other veterinary treatments on 5, for days open on 7, for fertility treatments on 7, for calf mortality on 6, and for non-return rate on 2 chromosomes.
In the simulation study, the total genetic response was faster with MAS than with conventional selection, and the advantage of MAS persisted over the studied generations. The rate of response and the difference between the selection schemes clearly reflected the changes in allele frequencies of the favourable QTL. The disequilibrium between the polygenes and QTL was always negative, and it was larger with larger QTL size. The disequilibrium between the two QTL was larger with QTL of large effect, and it was somewhat larger with MAS for scenarios with starting frequencies below 0.5 for QTL of moderate size and below 0.3 for large QTL. In conclusion, several QTL affecting economically important traits of dairy cattle were detected. Further studies are needed to verify these QTL, check their presence in the present breeding population, look for pleiotropy, and fine-map the most interesting QTL regions. The results of the simulation studies show that using MAS together with embryo transfer to pre-select young bulls within families is a useful approach to increase the genetic merit of the AI bulls compared to conventional selection.
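The significance procedure mentioned in the abstract above (permutation within a chromosome, then Bonferroni correction across the genome) can be illustrated with a small sketch. This is not the thesis's multiple marker regression: the test statistic here is a simple difference in mean trait value between the two allele groups, family structure is ignored, and the function names and the number of tests passed to the correction are assumptions made for illustration only.

```python
import random

def max_abs_mean_diff(genotypes, trait):
    """Largest absolute difference in mean trait between allele groups,
    taken over all markers on one chromosome (a stand-in statistic)."""
    best = 0.0
    for marker in genotypes:  # marker: one 0/1 allele code per son
        g0 = [t for g, t in zip(marker, trait) if g == 0]
        g1 = [t for g, t in zip(marker, trait) if g == 1]
        if g0 and g1:
            diff = abs(sum(g1) / len(g1) - sum(g0) / len(g0))
            best = max(best, diff)
    return best

def chromosome_wise_p(genotypes, trait, n_perm=1000, seed=0):
    """Permutation P-value: shuffle trait values among sons and count how
    often the permuted statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = max_abs_mean_diff(genotypes, trait)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_abs_mean_diff(genotypes, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed arrangement as one permutation.
    return (hits + 1) / (n_perm + 1)

def bonferroni(p, n_tests):
    """Genome-wise P-value from a chromosome-wise one (n_tests is assumed
    to be the number of chromosomes scanned)."""
    return min(1.0, p * n_tests)
```

Taking the maximum over markers inside each permutation is what makes the resulting P-value chromosome-wise rather than marker-wise; the Bonferroni step then only has to account for the number of chromosomes.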
Abstract:
This thesis studies the juvenile diversity of tree species in cacao (Theobroma cacao L.) based agroforestry and in primary forest in the natural conservation forest environment of Lore Lindu National Park, Sulawesi, Indonesia. The adult species composition in Lore Lindu National Park is relatively well studied, but less is known about tree species diversity in seedling communities, particularly in the frequently disturbed environment of cacao agroforestry fields. Cacao production forms a potentially serious threat to keeping the conservation areas in Sulawesi pristine and forested. The impacts of cacao production on the natural environment are directly linked to the diversity and abundance of the shade trees used. The study aims to compare the species composition of cacao agroforestry and the surrounding natural forest in the seedling and sapling size categories. The study was carried out in two parts: a biodiversity inventory of seedlings and saplings was combined with a social survey based on farmer interviews. The aim of the survey was to gain knowledge of the cacao fields and of farmers’ observations and choices regarding tree species associated with cacao. Data were collected in summer 2008. The impact of the environmental factors (solar radiation, weeding frequency, cacao tree planting density, distance to forest, distance to the main park road, and type of habitat) on seedling and sapling composition was assessed with Non-metric Multidimensional Scaling (NMS). Outlier analysis was used to identify variables distorting the NMS, and Multi-Response Permutation Procedures (MRPP) analysis to differentiate the impact of categorical variables. Sampling success was estimated with rarefaction curves and a jackknife estimate of species richness. The inventory found 135 species of trees and shrubs. Only a few agroforestry-related species were dominant. Sapling communities in forest habitat were the most species-rich.
NMS showed a generally low linear correlation between variation in species composition and the environmental variables. Solar radiation was the most significant explanatory variable. Cacao and forest habitats were the most clearly separated in the ordination. The results of the seedling and sapling inventory coincided only partly with farmers’ knowledge of the tree species occurring on their fields. More research with frequent assessment of seedling cohorts is needed because of the natural variability of cohorts and the high mortality rate of seedlings.
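As an illustration of the jackknife estimate of species richness mentioned above, the following is a minimal sketch of the first-order jackknife for incidence data. The abstract does not say which jackknife order was used, so the first-order form is an assumption, and the function name and the example plots are hypothetical.

```python
def jackknife1_richness(plots):
    """First-order jackknife richness estimate from presence/absence data.

    plots: one collection of species names per sample plot.
    Estimate = S_obs + Q1 * (m - 1) / m, where Q1 is the number of species
    found in exactly one plot and m is the number of plots.
    """
    m = len(plots)
    occurrences = {}
    for plot in plots:
        for species in set(plot):  # count each species once per plot
            occurrences[species] = occurrences.get(species, 0) + 1
    s_obs = len(occurrences)
    uniques = sum(1 for count in occurrences.values() if count == 1)
    return s_obs + uniques * (m - 1) / m

# Hypothetical inventory: four plots, four observed species,
# two of which (C and D) occur in a single plot only.
example_plots = [{"A", "B"}, {"A", "C"}, {"A", "B", "D"}, {"A"}]
print(jackknife1_richness(example_plots))  # 4 + 2 * 3/4 = 5.5
```

The gap between the observed count and the estimate grows with the number of species seen in only one plot, which is why the estimate is often read as an indicator of sampling completeness.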
Abstract:
This work is a case study of applying nonparametric statistical methods to corpus data. We show how to use ideas from permutation testing to answer linguistic questions related to morphological productivity and type richness. In particular, we study the use of the suffixes -ity and -ness in the 17th-century part of the Corpus of Early English Correspondence within the framework of historical sociolinguistics. Our hypothesis is that the productivity of -ity, as measured by type counts, is significantly low in letters written by women. To test such hypotheses, and to facilitate exploratory data analysis, we take the approach of computing accumulation curves for types and hapax legomena. We have developed an open source computer program which uses Monte Carlo sampling to compute the upper and lower bounds of these curves for one or more levels of statistical significance. By comparing the type accumulation from women’s letters with the bounds, we are able to confirm our hypothesis.
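The idea of comparing a subcorpus against randomly ordered texts can be sketched as follows. This is a simplified illustration, not the program described in the abstract: it evaluates the type count at a single corpus size instead of computing full accumulation curves with bounds, it stops accumulating just before the token target would be exceeded, and all function names are hypothetical. Each text is assumed to be given as a list of the relevant word tokens (for example, the words ending in -ity found in one letter).

```python
import random

def types_at_size(texts, order, target_tokens):
    """Number of distinct types seen when whole texts are added in the
    given order until the next text would exceed target_tokens."""
    seen = set()
    tokens = 0
    for i in order:
        if tokens + len(texts[i]) > target_tokens:
            break
        seen.update(texts[i])
        tokens += len(texts[i])
    return len(seen)

def low_type_count_p(texts, subset_idx, n_perm=1000, seed=0):
    """Monte Carlo P-value for 'the subset has few types for its size'.

    Whole texts are permuted, so clumping of types inside a text is kept.
    """
    rng = random.Random(seed)
    target = sum(len(texts[i]) for i in subset_idx)
    observed = len(set().union(*(texts[i] for i in subset_idx)))
    order = list(range(len(texts)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        if types_at_size(texts, order, target) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Permuting whole texts rather than individual tokens is the design choice that matters here: words are not spread evenly across letters, and shuffling single tokens would tend to understate the natural variation in type counts.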
Abstract:
Bayesian networks are compact, flexible, and interpretable representations of a joint distribution. When the network structure is unknown but observational data are at hand, one can try to learn the structure from the data. This is called structure discovery. This thesis contributes to two areas of structure discovery in Bayesian networks: space-time tradeoffs and learning ancestor relations. The fastest exact algorithms for structure discovery in Bayesian networks are based on dynamic programming and use excessive amounts of space. Motivated by the space usage, several schemes for trading space against time are presented. These schemes are presented in a general setting for a class of computational problems called permutation problems; structure discovery in Bayesian networks is seen as a challenging variant of the permutation problems. The main contribution in the area of space-time tradeoffs is the partial order approach, in which the standard dynamic programming algorithm is extended to run over partial orders. In particular, a certain family of partial orders called parallel bucket orders is considered. A partial order scheme that provably yields an optimal space-time tradeoff within parallel bucket orders is presented. Practical issues concerning parallel bucket orders are also discussed. Learning ancestor relations, that is, directed paths between nodes, is motivated by the need for robust summaries of the network structures when there are unobserved nodes at work. Ancestor relations are nonmodular features, and hence learning them is more difficult than learning modular features. A dynamic programming algorithm is presented for computing posterior probabilities of ancestor relations exactly. Empirical tests suggest that ancestor relations can be learned from observational data almost as accurately as arcs even in the presence of unobserved nodes.
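To make the space problem concrete, here is a minimal sketch of the widely used dynamic programming over variable subsets for finding an optimal network score. It is a generic textbook-style formulation, not the thesis's partial order scheme, and it assumes a decomposable score supplied by a hypothetical function local_score(v, parent_mask), where parent_mask is a bitmask of candidate parents. Every table is indexed by subsets of the n variables, which is where the exponential space usage comes from.

```python
def best_network_score(n, local_score):
    """Maximum total score of a DAG over variables 0..n-1.

    local_score(v, parent_mask) is an assumed, user-supplied local score
    of variable v with the parent set encoded in parent_mask.
    """
    full = (1 << n) - 1

    # best_parents[v][mask]: best local score of v with parents chosen
    # from within mask (v itself is never in mask).
    best_parents = []
    for v in range(n):
        table = {}
        for mask in range(full + 1):
            if (mask >> v) & 1:
                continue
            best = local_score(v, mask)
            for u in range(n):
                if (mask >> u) & 1:
                    # Smaller masks were filled in earlier iterations.
                    best = max(best, table[mask & ~(1 << u)])
            table[mask] = best
        best_parents.append(table)

    # f[mask]: best score of a DAG on the variables in mask, built by
    # choosing the last variable (a sink) of some ordering of mask.
    f = [0.0] * (full + 1)
    for mask in range(1, full + 1):
        f[mask] = max(
            f[mask & ~(1 << v)] + best_parents[v][mask & ~(1 << v)]
            for v in range(n)
            if (mask >> v) & 1
        )
    return f[full]
```

Choosing the sink of each subset amounts to searching over orderings (permutations) of the variables, which is the sense in which structure discovery fits the permutation-problem setting; the tables of size 2^n are what a space-time tradeoff would aim to shrink.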