954 results for Computational biology and bioinformatics
Abstract:
Lesions of anatomical brain networks result in functional disturbances of brain systems and behavior which depend sensitively, often unpredictably, on the lesion site. The availability of whole-brain maps of structural connections within the human cerebrum and our increased understanding of the physiology and large-scale dynamics of cortical networks allow us to investigate the functional consequences of focal brain lesions in a computational model. We simulate the dynamic effects of lesions placed in different regions of the cerebral cortex by recording changes in the pattern of endogenous ("resting-state") neural activity. We find that lesions produce specific patterns of altered functional connectivity among distant regions of cortex, often affecting both cortical hemispheres. The magnitude of these dynamic effects depends on the lesion location and is partly predicted by structural network properties of the lesion site. In the model, lesions along the cortical midline and in the vicinity of the temporo-parietal junction result in large and widely distributed changes in functional connectivity, while lesions of primary sensory or motor regions remain more localized. The model suggests that dynamic lesion effects can be predicted on the basis of specific network measures of structural brain networks and that these effects may be related to known behavioral and cognitive consequences of brain lesions.
Abstract:
Although age-dependent effects on blood pressure (BP) have been reported, they have not been systematically investigated in large-scale genome-wide association studies (GWASs). We leveraged the infrastructure of three well-established consortia (CHARGE, GBPgen, and ICBP) and a nonstandard approach (age stratification and metaregression) to conduct a genome-wide search of common variants with age-dependent effects on systolic (SBP), diastolic (DBP), mean arterial (MAP), and pulse (PP) pressure. In a two-stage design using 99,241 individuals of European ancestry, we identified 20 genome-wide significant (p ≤ 5 × 10⁻⁸) loci by using joint tests of the SNP main effect and SNP-age interaction. Nine of the significant loci demonstrated nominal evidence of age-dependent effects on BP by tests of the interactions alone. Index SNPs in the EHBP1L1 (DBP and MAP), CASZ1 (SBP and MAP), and GOSR2 (PP) loci exhibited the largest age interactions, with opposite directions of effect in the young versus the old. The changes in the genetic effects over time were small but nonnegligible (up to 1.58 mm Hg over 60 years). The EHBP1L1 locus was discovered through gene-age interactions only in whites but had DBP main effects replicated (p = 8.3 × 10⁻⁴) in 8,682 Asians from Singapore, indicating potential interethnic heterogeneity. A secondary analysis revealed 22 loci with evidence of age-specific effects (e.g., only in 20- to 29-year-olds). Age can be used to select samples with larger genetic effect sizes and more homogeneous phenotypes, which may increase statistical power. Age-dependent effects identified through novel statistical approaches can provide insight into the biology and temporal regulation underlying BP associations.
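The abstract names age stratification and metaregression but does not give the model. As a rough sketch only, one common way to set this up is an inverse-variance-weighted regression of stratum-specific SNP effects on stratum age, followed by a 2-degree-of-freedom Wald test of intercept and slope together. The function name, the fixed-effects assumption, the use of uncentered age, and the input numbers below are all illustrative assumptions, not the consortia's actual pipeline.

```python
import math

def age_metaregression(ages, betas, ses):
    """Weighted least squares of per-stratum SNP effects on stratum age.

    Hypothetical helper: assumes fixed effects and known standard errors.
    Returns (intercept, slope, slope_se, joint_chi2, joint_p).
    """
    w = [1.0 / se ** 2 for se in ses]          # inverse-variance weights
    s0 = sum(w)
    s1 = sum(wi * a for wi, a in zip(w, ages))
    s2 = sum(wi * a * a for wi, a in zip(w, ages))
    t0 = sum(wi * b for wi, b in zip(w, betas))
    t1 = sum(wi * a * b for wi, a, b in zip(w, ages, betas))
    det = s0 * s2 - s1 * s1                     # determinant of X'WX
    intercept = (s2 * t0 - s1 * t1) / det       # SNP effect at age 0
    slope = (s0 * t1 - s1 * t0) / det           # change in effect per year
    slope_se = math.sqrt(s0 / det)              # from (X'WX)^-1
    # Joint Wald statistic b'(X'WX)b, chi-square with 2 df under the null.
    joint_chi2 = (s0 * intercept ** 2
                  + 2 * s1 * intercept * slope
                  + s2 * slope ** 2)
    joint_p = math.exp(-joint_chi2 / 2.0)       # exact chi2 tail for 2 df
    return intercept, slope, slope_se, joint_chi2, joint_p

# Invented example: effect (mm Hg per allele) shrinks and flips sign with age.
EXAMPLE_AGES = [25, 35, 45, 55, 65]
EXAMPLE_BETAS = [0.25, 0.15, 0.05, -0.05, -0.15]
EXAMPLE_SES = [0.05, 0.05, 0.05, 0.05, 0.05]
```

In this sketch the slope alone plays the role of the interaction test, while the joint statistic corresponds to testing main effect and interaction together.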
Abstract:
The function of most proteins is not determined experimentally, but is extrapolated from homologs. According to the "ortholog conjecture", or standard model of phylogenomics, protein function changes rapidly after duplication, leading to paralogs with different functions, while orthologs retain the ancestral function. We report here that a comparison of experimentally supported functional annotations among homologs from 13 genomes mostly supports this model. We show that to analyze GO annotation effectively, several confounding factors need to be controlled: authorship bias, variation of GO term frequency among species, variation of background similarity among species pairs, and propagated annotation bias. After controlling for these biases, we observe that orthologs have generally more similar functional annotations than paralogs. This is especially strong for sub-cellular localization. We observe only a weak decrease in functional similarity with increasing sequence divergence. These findings hold over a large diversity of species; notably orthologs from model organisms such as E. coli, yeast or mouse have conserved function with human proteins.
Abstract:
In recent years, both homing endonucleases (HEases) and zinc-finger nucleases (ZFNs) have been engineered and selected for the targeting of desired human loci for gene therapy. However, enzyme engineering is lengthy and expensive and the off-target effect of the manufactured endonucleases is difficult to predict. Moreover, enzymes selected to cleave a human DNA locus may not cleave the homologous locus in the genome of animal models because of sequence divergence, thus hampering attempts to assess the in vivo efficacy and safety of any engineered enzyme prior to its application in human trials. Here, we show that naturally occurring HEases that cleave desirable human targets can be found. Some of these enzymes are also shown to cleave the homologous sequence in the genome of animal models. In addition, the distribution of off-target effects may be more predictable for native HEases. Based on our experimental observations, we present the HomeBase algorithm, database and web server that allow a high-throughput computational search and assignment of HEases for the targeting of specific loci in the human and other genomes. We validate experimentally the predicted target specificity of candidate fungal, bacterial and archaeal HEases using cell-free, yeast and archaeal assays.
Abstract:
Teleost fishes provide the first unambiguous support for ancient whole-genome duplication in an animal lineage. Studies in yeast or plants have shown that the effects of such duplications can be mediated by a complex pattern of gene retention and changes in evolutionary pressure. To explore such patterns in fishes, we have determined by phylogenetic analysis the evolutionary origin of 675 Tetraodon duplicated genes assigned to chromosomes, using additional data from other species of actinopterygian fishes. The subset of genes, which was retained in double after the genome duplication, is enriched in development, signaling, behavior, and regulation functional categories. The evolutionary rate of duplicate fish genes appears to be determined by 3 forces: 1) fish proteins evolve faster than mammalian orthologs; 2) the genes kept in double after genome duplication represent the subset under strongest purifying selection; and 3) following duplication, there is an asymmetric acceleration of evolutionary rate in one of the paralogs. These results show that similar mechanisms are at work in fishes as in yeast or plants and provide a framework for future investigation of the consequences of duplication in fishes and other animals.
Abstract:
BACKGROUND: Molecular interaction information is a key resource in modern biomedical research. Publicly available data have previously been provided in a broad array of diverse formats, making them very difficult to access. The publication and wide implementation of the Human Proteome Organisation Proteomics Standards Initiative Molecular Interactions (HUPO PSI-MI) format in 2004 was a major step towards the establishment of a single, unified format by which molecular interactions should be presented, but focused purely on protein-protein interactions. RESULTS: The HUPO-PSI has further developed the PSI-MI XML schema to enable the description of interactions between a wider range of molecular types, for example nucleic acids, chemical entities, and molecular complexes. Extensive details about each supported molecular interaction can now be captured, including the biological role of each molecule within that interaction, detailed description of interacting domains, and the kinetic parameters of the interaction. The format is supported by data management and analysis tools and has been adopted by major interaction data providers. Additionally, a simpler, tab-delimited format MITAB2.5 has been developed for the benefit of users who require only minimal information in an easy-to-access configuration. CONCLUSION: The PSI-MI XML2.5 and MITAB2.5 formats have been jointly developed by interaction data producers and providers from both the academic and commercial sectors, and are already widely implemented and well supported by an active development community. PSI-MI XML2.5 enables the description of highly detailed molecular interaction data and facilitates data exchange between databases and users without loss of information. MITAB2.5 is a simpler format appropriate for fast Perl parsing or loading into Microsoft Excel.
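To illustrate why a tab-delimited format is easy to consume, here is a minimal parsing sketch in Python (the abstract mentions Perl; Python is swapped in here). It assumes the 15-column MITAB2.5 layout, "-" for an empty field, and "|" between multiple values in one field. The dictionary keys are labels invented for this sketch, the sample line is made up, and the naive split on "|" would mis-handle a pipe inside a quoted value.

```python
import csv
import io

# Descriptive labels chosen for this sketch; order follows the 15 MITAB2.5 columns.
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_ids_a", "alt_ids_b", "aliases_a", "aliases_b",
    "detection_method", "first_author", "publication_ids",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_ids", "confidence",
]

def parse_mitab25(handle):
    """Yield one dict per interaction line; every value is a list of strings."""
    # QUOTE_NONE keeps the literal double quotes used inside MITAB fields.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and header/comment lines
        if len(row) < len(MITAB25_COLUMNS):
            raise ValueError(f"expected 15 columns, got {len(row)}")
        record = {}
        for name, field in zip(MITAB25_COLUMNS, row):
            record[name] = [] if field == "-" else field.split("|")
        yield record

# Made-up record for illustration only; identifiers are placeholders.
SAMPLE_MITAB = (
    "#ID A\tID B\t(remaining header columns omitted in this sketch)\n"
    "uniprotkb:P12345\tuniprotkb:Q67890\t-\t-\t-\t-\t"
    'psi-mi:"MI:0018"(two hybrid)\tDoe et al. (2004)\tpubmed:00000000\t'
    "taxid:9606\ttaxid:9606\t"
    'psi-mi:"MI:0915"(physical association)\t'
    'psi-mi:"MI:0469"(IntAct)\tintact:EBI-0000000\t-\n'
)
```

Calling `list(parse_mitab25(io.StringIO(SAMPLE_MITAB)))` gives a single record whose `id_a` is `["uniprotkb:P12345"]` and whose empty fields are empty lists.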
Abstract:
The recent availability of the chicken genome sequence poses the question of whether there are human protein-coding genes conserved in chicken that are currently not included in the human gene catalog. Here, we show, using comparative gene finding followed by experimental verification of exon pairs by RT-PCR, that the addition to the multi-exonic subset of this catalog could be as little as 0.2%, suggesting that we may be closing in on the human gene set. Our protocol, however, has two shortcomings: (i) the bioinformatic screening of the predicted genes, applied to filter out false positives, cannot handle intronless genes; and (ii) the experimental verification could fail to identify expression at a specific developmental time. This highlights the importance of developing methods that could provide a reliable estimate of the number of these two types of genes.
Abstract:
Application of wild-type or genetically-modified bacteria to the soil environment entails the risk of dissemination of these organisms to the groundwater. To measure vertical transport of bacteria under natural climatic conditions, Pseudomonas fluorescens strain CHA0 was released together with bromide as a mobile tracer at the surface of large outdoor lysimeters. Two experiments, one starting in autumn 1993 and the other in spring 1994, were performed. Shortly after a heavy rainfall in late spring 1994, the released bacteria were detected for the first time in effluent water from the 2.5-m-deep lysimeters in both experiments, i.e. 210 d and 21 d, respectively, after inoculation. Only a 10⁻⁹ to 10⁻⁸ fraction of the inoculum was recovered as culturable cells in the effluent water, but a larger fraction of the CHA0 cells was in a non-culturable state as detected with immunofluorescence microscopy. As much as 50% of the mobile tracer percolated through the lysimeters, indicating that, compared with bromide, bacterial cells were retained in soil. In the second part of this study, persistence of CHA0 in groundwater microcosms consisting of lysimeter effluent water was studied for 380 d. Survival of the inoculant as culturable cells was better under anaerobic than under aerobic conditions. However, a large fraction of the cells became non-culturable in both cases. When the experiment was performed with filter-sterilized effluent water, the total count of introduced bacteria did not decline with time. In conclusion, the biocontrol strain was transported in low numbers to a potential groundwater level under natural climatic conditions, but could persist for an extended period in groundwater microcosms.
Abstract:
C4 photosynthesis is an adaptive trait conferring an advantage in warm and open habitats. It originated multiple times and is currently reported in 18 plant families. It has been recently shown that phosphoenolpyruvate carboxylase (PEPC), a key enzyme of the C4 pathway, evolved through numerous independent but convergent genetic changes in grasses (Poaceae). To compare the genetics of multiple C4 origins on a broader scale, we reconstructed the evolutionary history of the C4 pathway in sedges (Cyperaceae), the second most species-rich C4 family. A sedge phylogeny based on two plastome genes (rbcL and ndhF) has previously identified six fully C4 clades. Here, a relaxed molecular clock was used to calibrate this tree and showed that the first C4 acquisition occurred in this family between 19.6 and 10.1 Ma. According to analyses of PEPC-encoding genes (ppc), at least five distinct C4 origins are present in sedges. Two C4 Eleocharis species, which were unrelated in the plastid phylogeny, acquired their C4-specific PEPC genes from a single source, probably through reticulate evolution or a horizontal transfer event. Acquisitions of C4 PEPC in sedges have been driven by positive selection on at least 16 codons (3.5% of the studied gene segment). These sites underwent parallel genetic changes across the five sedge C4 origins. Five of these sites underwent identical changes also in grass and eudicot C4 lineages, indicating that genetic convergence is most important within families but that identical genetic changes occurred even among distantly related taxa. These lines of evidence give new insights into the constraints that govern molecular evolution.
Abstract:
Little is known about the ecology of soil inoculants used for pathogen biocontrol, biofertilization and bioremediation under field conditions. We investigated the persistence and the physiological states of soil-inoculated Pseudomonas protegens (previously Pseudomonas fluorescens) CHA0 (10⁸ CFU g⁻¹ surface soil) in different soil microbial habitats in a planted ley (Medicago sativa L.) and an uncovered field plot. At 72 days, colony counts of the inoculant were low in surface soil (uncovered plot) and earthworm guts (ley plot), whereas soil above the plow pan (uncovered plot), and the rhizosphere and worm burrows present down to 1.2 m depth (ley plot) were survival hot spots (10⁵-10⁶ CFU g⁻¹ soil). Interestingly, strain CHA0 was also detected in the subsoil of both plots, at 10²-10⁵ CFU g⁻¹ soil between 1.8 and 2 m depth. However, non-cultured CHA0 cells were also detected by immunofluorescence microscopy. Kogure's direct viable counts of nutrient-responsive cells showed that many more CHA0 cells were in a viable but non-culturable (VBNC) or a non-responsive (dormant) state than in a culturable state, and the proportion of cells in those non-cultured states depended on soil microbial habitat. At most, cells in a VBNC state amounted to 34% (above the plow pan) and those in a dormant state to 89% (in bulk soil between 0.6 and 2 m) of all CHA0 cells. The results indicate that field-released Pseudomonas inoculants may persist at high cell numbers, even in deeper soil layers, and display a combination of different physiological states whose prevalence fluctuates according to soil microbial habitats.
Abstract:
This chapter presents the state of the art concerning the deep-sea Mediterranean environment: geology, hydrology, biology and fisheries. These are the fields of study dealt with in the scientific papers of this volume. The authors are specialists who have focused their research on the Mediterranean deep-sea environment in recent years. This introduction is an overview, not an exhaustive review.
Abstract:
BACKGROUND: We present the results of EGASP, a community experiment to assess the state of the art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a 'reference set' of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment. RESULTS: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50%. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified. CONCLUSION: This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence.
Abstract:
1. Identifying the boundary of a species' niche from observational and environmental data is a common problem in ecology and conservation biology and a variety of techniques have been developed or applied to model niches and predict distributions. Here, we examine the performance of some pattern-recognition methods as ecological niche models (ENMs). Particularly, one-class pattern recognition is a flexible and seldom-used methodology for modelling ecological niches and distributions from presence-only data. The development of one-class methods that perform comparably to two-class methods (for presence/absence data) would remove modelling decisions about sampling pseudo-absences or background data points when absence points are unavailable. 2. We studied nine methods for one-class classification and seven methods for two-class classification (five common to both), all primarily used in pattern recognition and therefore not common in species distribution and ecological niche modelling, across a set of 106 mountain plant species for which presence-absence data were available. We assessed accuracy using standard metrics and compared trade-offs in omission and commission errors between classification groups as well as effects of prevalence and spatial autocorrelation on accuracy. 3. One-class models fit to presence-only data were comparable to two-class models fit to presence-absence data when performance was evaluated with a measure weighting omission and commission errors equally. One-class models were superior for reducing omission errors (i.e. yielding higher sensitivity), and two-class models were superior for reducing commission errors (i.e. yielding higher specificity). For these methods, spatial autocorrelation was only influential when prevalence was low. 4. These results differ from previous efforts to evaluate alternative modelling approaches to build ENMs and are particularly noteworthy because data are from exhaustively sampled populations, minimizing false absence records. Accurate, transferable models of species' ecological niches and distributions are needed to advance ecological research and are crucial for effective environmental planning and conservation; the pattern-recognition approaches studied here show good potential for future modelling studies. This study also provides an introduction to promising methods for ecological modelling inherited from the pattern-recognition discipline.
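The omission/commission trade-off described in this abstract can be made concrete with a small sketch that scores predicted against observed presence/absence. The abstract does not name the equally weighted measure it used; the true skill statistic (sensitivity + specificity - 1) is used here purely as one plausible example, and the function name and data are invented for illustration.

```python
def omission_commission(observed, predicted):
    """Score binary presence (1) / absence (0) predictions.

    Hypothetical helper. Omission error = missed presences,
    commission error = absences predicted as presences.
    """
    tp = fn = fp = tn = 0
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            tp += 1
        elif obs and not pred:
            fn += 1          # omission: species present, model says absent
        elif not obs and pred:
            fp += 1          # commission: species absent, model says present
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one presence and one absence")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "omission_rate": 1.0 - sensitivity,
        "commission_rate": 1.0 - specificity,
        # Assumed example of a measure weighting both error types equally.
        "tss": sensitivity + specificity - 1.0,
    }

# Invented survey: 4 presences and 6 absences at 10 sites.
EXAMPLE_OBSERVED = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
EXAMPLE_PREDICTED = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
```

With these inputs one presence is missed and one absence is over-predicted, so the omission rate is 1/4 and the commission rate is 1/6.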
Abstract:
A recurring task in the analysis of mass genome annotation data from high-throughput technologies is the identification of peaks or clusters in a noisy signal profile. Examples of such applications are the definition of promoters on the basis of transcription start site profiles, the mapping of transcription factor binding sites based on ChIP-chip data and the identification of quantitative trait loci (QTL) from whole genome SNP profiles. Input to such an analysis is a set of genome coordinates associated with counts or intensities. The output consists of a discrete number of peaks with respective volumes, extensions and center positions. We have developed for this purpose a flexible one-dimensional clustering tool, called MADAP, which we make available as a web server and as a standalone program. A set of parameters enables the user to customize the procedure to a specific problem. The web server, which returns results in textual and graphical form, is useful for small to medium-scale applications, as well as for evaluation and parameter tuning in view of large-scale applications, requiring a local installation. The program, written in C++, can be freely downloaded from ftp://ftp.epd.unil.ch/pub/software/unix/madap. The MADAP web server can be accessed at http://www.isrec.isb-sib.ch/madap/.
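The input/output contract described here (coordinates with counts in; peaks with volume, extension and center out) can be sketched with a deliberately simple gap-based grouping. This is not MADAP's algorithm, which the abstract does not describe; the `Peak` class, the `call_peaks` function and the `max_gap` and `min_volume` parameters are hypothetical names used only to show the shape of the problem.

```python
from dataclasses import dataclass

@dataclass
class Peak:
    start: int      # leftmost coordinate of the cluster
    end: int        # rightmost coordinate (start..end is the extension)
    volume: float   # summed counts or intensities
    center: float   # count-weighted mean position

def call_peaks(signal, max_gap=50, min_volume=5.0):
    """Group (position, count) pairs into 1-D clusters.

    Positions closer than max_gap to the previous one join the same
    cluster; clusters with total count below min_volume are dropped.
    """
    clusters, current = [], []
    for pos, count in sorted(signal):
        if current and pos - current[-1][0] > max_gap:
            clusters.append(current)   # gap too large: close the cluster
            current = []
        current.append((pos, count))
    if current:
        clusters.append(current)

    peaks = []
    for members in clusters:
        volume = sum(count for _, count in members)
        if volume <= 0 or volume < min_volume:
            continue                   # treat low-volume clusters as noise
        center = sum(pos * count for pos, count in members) / volume
        peaks.append(Peak(members[0][0], members[-1][0], volume, center))
    return peaks

# Invented profile: two real clusters and one isolated low-count position.
EXAMPLE_SIGNAL = [(100, 2), (110, 5), (120, 3), (400, 1), (900, 4), (905, 6)]
```

On `EXAMPLE_SIGNAL` with the default parameters, the isolated count at position 400 is discarded and two peaks remain, spanning 100-120 and 900-905.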
Abstract:
BACKGROUND: Fourmidable is an infrastructure to curate and share the emerging genetic, molecular, and functional genomic data and protocols for ants. DESCRIPTION: The Fourmidable assembly pipeline groups nucleotide sequences into clusters before independently assembling each cluster. Subsequently, assembled sequences are annotated via Interproscan and BLAST against general and insect-specific databases. Gene-specific information can be retrieved using gene identifiers, searching for similar sequences or browsing through inferred Gene Ontology annotations. The database will readily scale as ultra-high throughput sequence data and sequences from additional species become available. CONCLUSION: Fourmidable currently houses EST data from two ant species and microarray gene expression data for one of these. Fourmidable is publicly available at http://fourmidable.unil.ch.