940 results for models, genetic
Abstract:
The factors affecting the strategic decisions of non-industrial private forest (NIPF) landowners in management planning are studied. A genetic algorithm is used to induce a set of rules predicting the potential cut of the landowners' preferred timber management strategies. The rules are based on variables describing the characteristics of the landowners and their forest holdings. The predictive ability of the genetic algorithm is compared to linear regression analysis using identical data sets. The data are cross-validated seven times applying both genetic algorithm and regression analyses in order to examine the data-sensitivity and robustness of the generated models. The optimal rule set derived from genetic algorithm analyses included the following variables: mean initial volume, landowner's positive price expectations for the next eight years, landowner being classified as farmer, and preference for the recreational use of forest property. When tested with previously unseen test data, the optimal rule set resulted in a relative root mean square error of 0.40. In the regression analyses, the optimal regression equation consisted of the following variables: mean initial volume, proportion of forestry income, intention to cut extensively in future, and positive price expectations for the next two years. The R² of the optimal regression equation was 0.34 and the relative root mean square error obtained from the test data was 0.38. In both models, mean initial volume and positive stumpage price expectations were entered as significant predictors of the potential cut of the preferred timber management strategy. When tested with the complete data set of 201 observations, both the optimal rule set and the optimal regression model achieved the same level of accuracy.
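The evaluation described above (seven-fold cross-validation scored by relative root mean square error) can be sketched as follows. The abstract does not define "relative RMSE", so this sketch assumes RMSE divided by the mean of the observed values; the function names and the interleaved fold assignment are illustrative choices, not the authors' procedure.

```python
import math


def relative_rmse(observed, predicted):
    """RMSE divided by the mean observed value (assumed definition)."""
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return math.sqrt(mse) / (sum(observed) / n)


def k_fold_indices(n, k=7):
    """Yield (train, test) index lists for k roughly equal, interleaved folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        # Training set is every index that is not in the held-out fold.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test
```

Each model (rule set or regression) would be fitted on the `train` indices and scored with `relative_rmse` on the `test` indices, giving seven error estimates per method.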
Abstract:
BACKGROUND: The rate of emergence of human pathogens is steadily increasing; most of these novel agents originate in wildlife. Bats, remarkably, are the natural reservoirs of many of the most pathogenic viruses in humans. There are two bat genome projects currently underway, a circumstance that promises to speed the discovery of host factors important in the coevolution of bats with their viruses. These genomes, however, are not yet assembled and one of them will provide only low coverage, making the inference of most genes of immunological interest error-prone. Many more wildlife genome projects are underway and intend to provide only shallow coverage. RESULTS: We have developed a statistical method for the assembly of gene families from partial genomes. The method takes full advantage of the quality scores generated by base-calling software, incorporating them into a complete probabilistic error model, to overcome the limitation inherent in the inference of gene family members from partial sequence information. We validated the method by inferring the human IFNA genes from the genome trace archives, and used it to infer 61 type-I interferon genes, and single type-II interferon genes in the bats Pteropus vampyrus and Myotis lucifugus. We confirmed our inferences by direct cloning and sequencing of IFNA, IFNB, IFND, and IFNK in P. vampyrus, and by demonstrating transcription of some of the inferred genes by known interferon-inducing stimuli. CONCLUSION: The statistical trace assembler described here provides a reliable method for extracting information from the many available and forthcoming partial or shallow genome sequencing projects, thereby facilitating the study of a wider variety of organisms with ecological and biomedical significance to humans than would otherwise be possible.
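As a rough illustration of how base-calling quality scores can feed a probabilistic error model, the sketch below converts a Phred quality score Q to an error probability using the standard definition p = 10^(-Q/10). The abstract does not specify the authors' error model; the assumption that a miscall is equally likely to be any of the three other bases, and both function names, are hypothetical.

```python
def phred_to_error_prob(q):
    """Standard Phred definition: probability that the base call is wrong."""
    return 10 ** (-q / 10)


def base_likelihoods(called_base, q):
    """P(observed call | true base) for each candidate true base.

    Assumes, for illustration only, that errors are spread uniformly
    over the three bases other than the true one.
    """
    p_err = phred_to_error_prob(q)
    return {
        base: (1 - p_err) if base == called_base else p_err / 3
        for base in "ACGT"
    }
```

Multiplying such per-base likelihoods across overlapping trace reads is one simple way to weigh candidate gene-family members when coverage is shallow.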
Abstract:
The fungal species Cryptococcus neoformans and Cryptococcus gattii cause respiratory and neurological disease in animals and humans following inhalation of basidiospores or desiccated yeast cells from the environment. Sexual reproduction in C. neoformans and C. gattii is controlled by a bipolar system in which a single mating type locus (MAT) specifies compatibility. These two species are dimorphic, growing as yeast in the asexual stage, and producing hyphae, basidia, and basidiospores during the sexual stage. In contrast, Filobasidiella depauperata, one of the closest related species, grows exclusively as hyphae and it is found in association with decaying insects. Examination of two available strains of F. depauperata showed that the life cycle of this fungal species shares features associated with the unisexual or same-sex mating cycle in C. neoformans. Therefore, F. depauperata may represent a homothallic and possibly an obligately sexual fungal species. RAPD genotyping of 39 randomly isolated progeny from isolate CBS7855 revealed a new genotype pattern in one of the isolated basidiospore progeny, therefore suggesting that the homothallic cycle in F. depauperata could lead to the emergence of new genotypes. Phylogenetic analyses of genes linked to MAT in C. neoformans indicated that two of these genes in F. depauperata, MYO2 and STE20, appear to form a monophyletic clade with the MATa alleles of C. neoformans and C. gattii, and thus these genes may have been recruited to the MAT locus before F. depauperata diverged. Furthermore, the ancestral MATa locus may have undergone accelerated evolution prior to the divergence of the pathogenic Cryptococcus species since several of the genes linked to the MATa locus appear to have a higher number of changes and substitutions than their MATalpha counterparts. Synteny analyses between C. neoformans and F. depauperata showed that genomic regions on other chromosomes displayed conserved gene order. In contrast, the genes linked to the MAT locus of C. neoformans showed a higher number of chromosomal translocations in the genome of F. depauperata. We therefore propose that chromosomal rearrangements appear to be a major force driving speciation and sexual divergence in these closely related pathogenic and saprobic species.
Abstract:
A strand-specific transcriptome sequencing strategy, directional ligation sequencing or DeLi-seq, was employed to profile the antisense transcriptome of Schizosaccharomyces pombe. Under both normal and heat shock conditions, we found that polyadenylated antisense transcripts are broadly expressed, while distinct expression patterns were observed for protein-coding and non-coding loci. Dominant antisense expression is enriched in protein-coding genes involved in meiosis or stress response pathways. Detailed analyses further suggest that antisense transcripts are regulated independently of their sense transcripts, and that diverse mechanisms might be involved in the biogenesis and degradation of antisense RNAs. Taken together, antisense transcription may have profound impacts on global gene regulation in S. pombe.
Abstract:
The Rhizopus oryzae species complex is a group of zygomycete fungi that are common, cosmopolitan saprotrophs. Some strains are used beneficially for production of Asian fermented foods but they can also act as opportunistic human pathogens. Although R. oryzae reportedly has a heterothallic (+/-) mating system, most strains have not been observed to undergo sexual reproduction and the genetic structure of its mating locus has not been characterized. Here we report on the mating behavior and genetic structure of the mating locus for 54 isolates of the R. oryzae complex. All 54 strains have a mating locus similar in overall organization to Phycomyces blakesleeanus and Mucor circinelloides (Mucoromycotina, Zygomycota). In all of these fungi, the minus (-) allele features the SexM high mobility group (HMG) gene flanked by an RNA helicase gene and a TP transporter gene (TPT). Within the R. oryzae complex, the plus (+) mating allele includes an inserted region that codes for a BTB/POZ domain gene and the SexP HMG gene. Phylogenetic analyses of multiple genes, including the mating loci (HMG, TPT, RNA helicase), ITS1-5.8S-ITS2 rDNA, RPB2, and LDH genes, identified two distinct groups of strains. These correspond to previously described sibling species R. oryzae sensu stricto and R. delemar. Within each species, discordant gene phylogenies among multiple loci suggest an outcrossing population structure. The hypothesis of random-mating is also supported by a 50:50 ratio of plus and minus mating types in both cryptic species. When crossed with tester strains of the opposite mating type, most isolates of R. delemar failed to produce zygospores, while isolates of R. oryzae produced sterile zygospores. In spite of the reluctance of most strains to mate in vitro, the conserved sex locus structure and evidence for outcrossing suggest that a normal sexual cycle occurs in both species.
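A 50:50 ratio of plus and minus mating types, as reported above, is the kind of claim that can be checked with an exact two-sided binomial test. The sketch below is a generic illustration under that assumption; the abstract does not report per-species counts or name the test used, so any counts passed in are hypothetical.

```python
from math import comb


def binomial_two_sided_p(successes, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one under the null proportion p.
    """
    probs = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = probs[successes]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))
```

A large p-value for the observed count of plus strains would be consistent with equal mating-type frequencies, although it would not by itself demonstrate random mating.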
Abstract:
Complex diseases will have multiple functional sites, and it will be invaluable to understand the cross-locus interaction in terms of linkage disequilibrium (LD) between those sites (epistasis) in addition to the haplotype-LD effects. We investigated the statistical properties of a class of matrix-based statistics to assess this epistasis. These statistical methods include two LD contrast tests (Zaykin et al., 2006) and partial least squares regression (Wang et al., 2008). To estimate Type 1 error rates and power, we simulated multiple two-variant disease models using the SIMLA software package. SIMLA allows for the joint action of up to two disease genes in the simulated data with all possible multiplicative interaction effects between them. Our goal was to detect an interaction between multiple disease-causing variants by means of their linkage disequilibrium (LD) patterns with other markers. We measured the effects that marginal disease effect size, haplotype LD, disease prevalence and minor allele frequency have on cross-locus interaction (epistasis). In the setting of strong allele effects and strong interaction, the correlation between the two disease genes was weak (r=0.2). In a complex system with multiple correlations (both marginal and interaction), it was difficult to determine the source of a significant result. Despite these complications, the partial least squares and modified LD contrast methods maintained adequate power to detect the epistatic effects; however, for many of the analyses we often could not separate interaction from a strong marginal effect. While we did not exhaust the entire parameter space of possible models, we do provide guidance on the effects that population parameters have on cross-locus interaction.
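For readers unfamiliar with the LD correlation r mentioned above, the sketch below computes it for two biallelic loci from a haplotype frequency and the two allele frequencies, using the standard definitions D = p_AB - p_A·p_B and r = D / sqrt(p_A(1-p_A)·p_B(1-p_B)). It illustrates the basic quantity only, not the LD contrast tests or the partial least squares method evaluated in the study; the function name and the example frequencies are hypothetical.

```python
import math


def ld_r(p_ab, p_a, p_b):
    """Correlation r between alleles A and B at two biallelic loci.

    p_ab: frequency of the haplotype carrying both A and B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    return d / denom
```

With allele frequencies of 0.5 at both loci, a haplotype frequency of 0.30 gives r of about 0.2, while 0.25 (independence) gives r = 0.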
Abstract:
New applications of genetic data to questions of historical biogeography have revolutionized our understanding of how organisms have come to occupy their present distributions. Phylogenetic methods in combination with divergence time estimation can reveal biogeographical centres of origin, differentiate between hypotheses of vicariance and dispersal, and reveal the directionality of dispersal events. Despite their power, however, phylogenetic methods can sometimes yield patterns that are compatible with multiple, equally well-supported biogeographical hypotheses. In such cases, additional approaches must be integrated to differentiate among conflicting dispersal hypotheses. Here, we use a synthetic approach that draws upon the analytical strengths of coalescent and population genetic methods to augment phylogenetic analyses in order to assess the biogeographical history of Madagascar's Triaenops bats (Chiroptera: Hipposideridae). Phylogenetic analyses of mitochondrial DNA sequence data for Malagasy and east African Triaenops reveal a pattern that equally supports two competing hypotheses. While the phylogeny cannot determine whether Africa or Madagascar was the centre of origin for the species investigated, it serves as the essential backbone for the application of coalescent and population genetic methods. From the application of these methods, we conclude that a hypothesis of two independent but unidirectional dispersal events from Africa to Madagascar is best supported by the data.
Abstract:
Existing theories explain why operons are advantageous in prokaryotes, but their occurrence in metazoans is an enigma. Nematode operon genes, typically consisting of growth genes, are significantly upregulated during recovery from growth-arrested states. This expression pattern is anticorrelated to nonoperon genes, consistent with a competition for transcriptional resources. We find that transcriptional resources are initially limiting during recovery and that recovering animals are highly sensitive to any additional decrease in transcriptional resources. We provide evidence that operons become advantageous because, by clustering growth genes into operons, fewer promoters compete for the limited transcriptional machinery, effectively increasing the concentration of transcriptional resources and accelerating recovery. Mathematical modeling reveals how a moderate increase in transcriptional resources can substantially enhance transcription rate and recovery. This design principle occurs in different nematodes and the chordate C. intestinalis. As transition from arrest to rapid growth is shared by many metazoans, operons could have evolved to facilitate these processes.
Abstract:
DNaseI footprinting is an established assay for identifying transcription factor (TF)-DNA interactions with single base pair resolution. High-throughput DNase-seq assays have recently been used to detect in vivo DNase footprints across the genome. Multiple computational approaches have been developed to identify DNase-seq footprints as predictors of TF binding. However, recent studies have pointed to a substantial cleavage bias of DNase and its negative impact on predictive performance of footprinting. To assess the potential for using DNase-seq to identify individual binding sites, we performed DNase-seq on deproteinized genomic DNA and determined sequence cleavage bias. This allowed us to build bias corrected and TF-specific footprint models. The predictive performance of these models demonstrated that predicted footprints corresponded to high-confidence TF-DNA interactions. DNase-seq footprints were absent under a fraction of ChIP-seq peaks, which we show to be indicative of weaker binding, indirect TF-DNA interactions or possible ChIP artifacts. The modeling approach was also able to detect variation in the consensus motifs that TFs bind to. Finally, cell type specific footprints were detected within DNase hypersensitive sites that are present in multiple cell types, further supporting that footprints can identify changes in TF binding that are not detectable using other strategies.
Abstract:
PREMISE OF THE STUDY: Understanding fern (monilophyte) phylogeny and its evolutionary timescale is critical for broad investigations of the evolution of land plants, and for providing the point of comparison necessary for studying the evolution of the fern sister group, seed plants. Molecular phylogenetic investigations have revolutionized our understanding of fern phylogeny; however, to date, these studies have relied almost exclusively on plastid data. METHODS: Here we take a curated phylogenomics approach to infer the first broad fern phylogeny from multiple nuclear loci, by combining broad taxon sampling (73 ferns and 12 outgroup species) with focused character sampling (25 loci comprising 35,877 bp), along with rigorous alignment, orthology inference and model selection. KEY RESULTS: Our phylogeny corroborates some earlier inferences and provides novel insights; in particular, we find strong support for Equisetales as sister to the rest of ferns, Marattiales as sister to leptosporangiate ferns, and Dennstaedtiaceae as sister to the eupolypods. Our divergence-time analyses reveal that divergences among the extant fern orders all occurred prior to ∼200 MYA. Finally, our species-tree inferences are congruent with analyses of concatenated data, but generally with lower support. Those cases where species-tree support values are higher than expected involve relationships that have been supported by smaller plastid datasets, suggesting that deep coalescence may be reducing support from the concatenated nuclear data. CONCLUSIONS: Our study demonstrates the utility of a curated phylogenomics approach to inferring fern phylogeny, and highlights the need to consider underlying data characteristics, along with data quantity, in phylogenetic studies.
Abstract:
We recently developed an approach for testing the accuracy of network inference algorithms by applying them to biologically realistic simulations with known network topology. Here, we seek to determine the degree to which the network topology and data sampling regime influence the ability of our Bayesian network inference algorithm, NETWORKINFERENCE, to recover gene regulatory networks. NETWORKINFERENCE performed well at recovering feedback loops and multiple targets of a regulator with small amounts of data, but required more data to recover multiple regulators of a gene. When collecting the same number of data samples at different intervals from the system, the best recovery was produced by sampling intervals long enough that sampling covered the propagation of regulation through the network, but not so long that the intervals missed internal dynamics. These results further elucidate the possibilities and limitations of network inference based on biological data.
Abstract:
Genetic oscillators, such as circadian clocks, are constantly perturbed by molecular noise arising from the small number of molecules involved in gene regulation. One of the strongest sources of stochasticity is the binary noise that arises from the binding of a regulatory protein to a promoter in the chromosomal DNA. In this study, we focus on two minimal oscillators based on activator titration and repressor titration to understand the key parameters that are important for oscillations and for overcoming binary noise. We show that the rate of unbinding from the DNA, despite traditionally being considered a fast parameter, needs to be slow to broaden the space of oscillatory solutions. The addition of multiple, independent DNA binding sites further expands the oscillatory parameter space for the repressor-titration oscillator and lengthens the period of both oscillators. This effect is a combination of increased effective delay of the unbinding kinetics due to multiple binding sites and increased promoter ultrasensitivity that is specific for repression. We then use stochastic simulation to show that multiple binding sites increase the coherence of oscillations by mitigating the binary noise. Slow values of DNA unbinding rate are also effective in alleviating molecular noise due to the increased distance from the bifurcation point. Our work demonstrates how the number of DNA binding sites and slow unbinding kinetics, which are often omitted in biophysical models of gene circuits, can have a significant impact on the temporal and stochastic dynamics of genetic oscillators.
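The "binary noise" described above comes from a promoter switching at random between a bound and an unbound state. The sketch below simulates only that two-state switching with exponentially distributed waiting times (a Gillespie-style simulation); it is not the activator-titration or repressor-titration oscillator from the study, and the rate names `k_on` and `k_off` and the function name are hypothetical.

```python
import random


def simulate_promoter(k_on, k_off, t_end, seed=0):
    """Simulate a promoter flipping between unbound and bound states.

    Returns the fraction of time spent bound, which should approach
    k_on / (k_on + k_off) for long simulations.
    """
    rng = random.Random(seed)
    t, bound, time_bound = 0.0, False, 0.0
    while t < t_end:
        # Only one reaction is possible in each state.
        rate = k_off if bound else k_on
        # Waiting time is exponential; truncate at the end of the run.
        dwell = min(rng.expovariate(rate), t_end - t)
        if bound:
            time_bound += dwell
        t += dwell
        bound = not bound
    return time_bound / t_end
```

Lowering both rates while keeping their ratio fixed leaves the mean occupancy unchanged but makes each bound or unbound episode last longer, which is the sense in which slow unbinding changes the temporal character of the noise.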
Abstract:
A leading theory hypothesizes that schizophrenia arises from dysregulation of the dopamine system in certain brain regions. As this dysregulation could arise from abnormal expression of D2 dopamine receptors, the D2 receptor gene (DRD2) on chromosome 11q is a candidate locus for schizophrenia. We tested whether allelic variation at DRD2 and five surrounding loci cosegregated with schizophrenia in 112 small- to moderate-size Irish families containing two or more members affected with schizophrenia or schizoaffective disorder, defined by DSM-III-R. Evidence of linkage was assessed using varying definitions of illness and modes of transmission. Assuming genetic homogeneity, linkage between schizophrenia and large regions of 11q around DRD2 could be strongly excluded. Assuming genetic heterogeneity, variation at the DRD2 locus could be rejected as a major risk factor for schizophrenia in more than 50% of these families for all models tested and in as few as 25% of the families for certain models. The DRD2 linkage in fewer than 25% of these families could not be excluded under any of the models tested. Our results suggest that the major component of genetic susceptibility to schizophrenia is not due to allelic variation at the DRD2 locus or other genes in the surrounding chromosomal region.
Abstract:
This study was an attempt to replicate evidence for a vulnerability locus for schizophrenia and associated disorders in the 8p22-21 region reported by Pulver and colleagues.