931 results for Bioinformatics
Abstract:
Main developmental programs are highly conserved among species of the animal kingdom. Improper execution of these programs often leads to the progression of various diseases and disorders. Here we focused on Drosophila wing tissue morphogenesis, a fairly complex developmental program in which one step, the apposition of the dorsal and ventral wing sheets during metamorphosis, is mediated by integrins. Disruption of this apposition leads to wing blistering, which serves as an easily screenable phenotype for components regulating this process. Using RNAi silencing with the blister phenotype as readout, we identify numerous novel proteins potentially involved in wing sheet adhesion. Remarkably, our results reveal not only participants of the integrin-mediated machinery, but also components of other cellular processes, e.g. cell cycle, RNA splicing, and vesicular trafficking. With the use of bioinformatics tools, these data are assembled into a large blisterome network. Analysis of human orthologues of the Drosophila blisterome components shows that many disease-related genes may contribute to cell adhesion, providing hints on possible mechanisms of these human pathologies.
Abstract:
Microtubule plus-end-tracking proteins (+TIPs) specifically localize to the growing plus-ends of microtubules to regulate microtubule dynamics and functions. A large group of +TIPs contain a short linear motif, SXIP, which is essential for them to bind to end-binding proteins (EBs) and target microtubule ends. The SXIP sequence site thus acts as a widespread microtubule tip localization signal (MtLS). Here we have analyzed the sequence-function relationship of a canonical MtLS. Using synthetic peptide arrays on membrane supports, we identified the residue preferences at each amino acid position of the SXIP motif and its surrounding sequence with respect to EB binding. We further developed an assay based on fluorescence polarization to assess the mechanism of the EB-SXIP interaction and to correlate EB binding and microtubule tip tracking of MtLS sequences from different +TIPs. Finally, we investigated the role of phosphorylation in regulating the EB-SXIP interaction. Together, our results define the sequence determinants of a canonical MtLS and provide the experimental data for bioinformatics approaches to carry out genome-wide predictions of novel +TIPs in multiple organisms.
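The genome-wide prediction step mentioned above starts from a motif scan. The following is a minimal sketch of such a scan, assuming the motif is matched literally as S-x-I-P with any residue at the second position; it does not encode the residue preferences or flanking-sequence requirements determined in the study, and the function name and toy sequence are hypothetical.

```python
import re

# Assumption: the motif is matched literally as S-x-I-P (x = any residue).
# The lookahead allows overlapping candidate sites to be reported.
SXIP = re.compile(r"(?=(S.IP))")

def find_sxip(sequence, flank=4):
    """Return (1-based position, motif, flanking context) for each S-x-I-P match."""
    seq = sequence.upper()
    hits = []
    for match in SXIP.finditer(seq):
        start = match.start(1)
        context = seq[max(0, start - flank): start + 4 + flank]
        hits.append((start + 1, match.group(1), context))
    return hits

# Hypothetical toy sequence, not a real protein.
toy = "MKRPSKIPTPGSSAIPRRL"
hits = find_sxip(toy)
for position, motif, context in hits:
    print(position, motif, context)
```

A scan like this only yields candidates; ranking them would require the position-specific preferences measured experimentally.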
Abstract:
The European Mouse Mutagenesis Consortium is the European initiative contributing to the international effort on functional annotation of the mouse genome. Its objectives are to establish and integrate mutagenesis platforms, gene expression resources, phenotyping units, storage and distribution centers and bioinformatics resources. The combined efforts will accelerate our understanding of gene function and of human health and disease.
Abstract:
Genome-scale metabolic network reconstructions are now routinely used in the study of metabolic pathways, their evolution and design. The development of such reconstructions involves the integration of information on reactions and metabolites from the scientific literature as well as public databases and existing genome-scale metabolic models. The reconciliation of discrepancies between data from these sources generally requires significant manual curation, which constitutes a major obstacle in efforts to develop and apply genome-scale metabolic network reconstructions. In this work, we discuss some of the major difficulties encountered in the mapping and reconciliation of metabolic resources and review three recent initiatives that aim to accelerate this process, namely BKM-react, MetRxn and MNXref (presented in this article). Each of these resources provides a pre-compiled reconciliation of many of the most commonly used metabolic resources. By reducing the time required for manual curation of metabolite and reaction discrepancies, these resources aim to accelerate the development and application of high-quality genome-scale metabolic network reconstructions and models.
Abstract:
We have identified new malaria vaccine candidates through the combination of bioinformatics prediction of stable protein domains in the Plasmodium falciparum genome, chemical synthesis of polypeptides, in vitro biological functional assays, and association of an antigen-specific antibody response with protection against clinical malaria. Within the predicted open reading frame of P. falciparum hypothetical protein PFF0165c, several segments with low hydrophobic amino acid content, which are likely to be intrinsically unstructured, were identified. The synthetic peptide corresponding to one such segment (P27A) was well recognized by sera and peripheral blood mononuclear cells of adults living in different regions where malaria is endemic. High antibody titers were induced in different strains of mice and in rabbits immunized with the polypeptide formulated with different adjuvants. These antibodies recognized native epitopes in P. falciparum-infected erythrocytes, formed distinct bands in Western blots, and were inhibitory in an in vitro antibody-dependent cellular inhibition parasite-growth assay. The immunological properties of P27A, together with its low polymorphism and association with clinical protection from malaria in humans, warrant its further development as a malaria vaccine candidate.
Abstract:
Since the advent of high-throughput DNA sequencing technologies, the ever-increasing rate at which genomes are published has generated new challenges, notably at the level of genome annotation. Even though gene predictors and annotation software are increasingly efficient, the ultimate validation still lies in the observation of the predicted gene product(s). Mass spectrometry-based proteomics offers the high-throughput technology needed to provide evidence of protein presence and, from the identified sequences, to confirm or invalidate predicted annotations. We review here different strategies used to perform an MS-based proteogenomics experiment with a bottom-up approach. We start from the strengths and weaknesses of the different database construction strategies, based on different genomic information (whole genome, ORF, cDNA, EST or RNA-Seq data), which are then used for matching mass spectra to peptides and proteins. We also review the important points to be considered for a correct statistical assessment of the peptide identifications. Finally, we provide references for tools used to map and visualize the peptide identifications back to the original genomic information.
Abstract:
Recent technological progress has greatly facilitated de novo genome sequencing. However, de novo assemblies consist of many pieces of contiguous sequence (contigs) arranged in thousands of scaffolds instead of small numbers of chromosomes. Confirming and improving the quality of such assemblies is critical for subsequent analysis. We present a method to evaluate genome scaffolding by aligning independently obtained transcriptome sequences to the genome and visually summarizing the alignments using the Cytoscape software. Applying this method to the genome of the red fire ant Solenopsis invicta allowed us to identify inconsistencies in 7%, confirm contig order in 20% and extend 16% of scaffolds. Scripts that generate tables for visualization in Cytoscape from FASTA sequence and scaffolding information files are publicly available at https://github.com/ksanao/TGNet.
Abstract:
Olfactory systems are evolutionarily ancient, underlying the common requirement for all animals to sense and respond to diverse volatile chemical signals in their environment. Odor detection is mediated by odorant receptors (ORs) that, in most olfactory systems, comprise large families of divergent G protein-coupled receptors. Here, I discuss our and others' recent investigations of ORs in the fruit fly, Drosophila melanogaster, which have revealed insights into the distinct evolutionary origin and molecular function of insect ORs. I also describe a bioinformatics strategy that we developed to identify molecules that function with these insect-specific receptors in odor detection.
Abstract:
The n-octanol/water partition coefficient (log Po/w) is a key physicochemical parameter for drug discovery, design, and development. Here, we present a physics-based approach that shows a strong linear correlation between the computed solvation free energy in implicit solvents and the experimental log Po/w on a cleansed data set of more than 17,500 molecules. After internal validation by five-fold cross-validation and data randomization, the predictive power of the most interesting multiple linear model, based solely on two GB/SA parameters, was tested on two different external sets of molecules. On the Martel druglike test set, the predictive power of the best model (N = 706, r = 0.64, MAE = 1.18, and RMSE = 1.40) is similar to that of six well-established empirical methods. On the 17-drug test set, our model outperformed all compared empirical methodologies (N = 17, r = 0.94, MAE = 0.38, and RMSE = 0.52). The physical basis of our original GB/SA approach together with its predictive capacity, computational efficiency (1 to 2 s per molecule), and three-dimensional molecular graphics capability lay the foundations for a promising predictor, the implicit log P method (iLOGP), to complement the portfolio of drug design tools developed and provided by the SIB Swiss Institute of Bioinformatics.
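The three statistics quoted for each test set (r, MAE, RMSE) can be computed as in the sketch below. It illustrates only the evaluation metrics, not the iLOGP model itself; the function name and the four toy value pairs are hypothetical and are not data from the study.

```python
import math

def evaluate(predicted, experimental):
    """Return (Pearson r, mean absolute error, root-mean-square error)."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    r = cov / math.sqrt(var_p * var_e)
    errors = [p - e for p, e in zip(predicted, experimental)]
    mae = sum(abs(x) for x in errors) / n
    rmse = math.sqrt(sum(x * x for x in errors) / n)
    return r, mae, rmse

# Hypothetical toy log P values, for illustration only.
predicted = [1.0, 2.0, 3.0, 4.0]
experimental = [1.5, 2.5, 2.5, 4.5]
r, mae, rmse = evaluate(predicted, experimental)
print(f"r = {r:.2f}, MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two are usually reported together.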
Abstract:
Downmodulation or loss-of-function mutations of the gene encoding NOTCH1 are associated with dysfunctional squamous cell differentiation and development of squamous cell carcinoma (SCC) in skin and internal organs. While NOTCH1 receptor activation has been well characterized, little is known about how NOTCH1 gene transcription is regulated. Using bioinformatics and functional screening approaches, we identified several regulators of the NOTCH1 gene in keratinocytes, with the transcription factors DLX5 and EGR3 and estrogen receptor β (ERβ) directly controlling its expression in differentiation. DLX5 and EGR3 are required for RNA polymerase II (PolII) recruitment to the NOTCH1 locus, while ERβ controls NOTCH1 transcription through RNA PolII pause release. Expression of several identified NOTCH1 regulators, including ERβ, is frequently compromised in skin, head and neck, and lung SCCs and SCC-derived cell lines. Furthermore, a keratinocyte ERβ-dependent program of gene expression is subverted in SCCs from various body sites, and there are consistent differences in mutation and gene-expression signatures of head and neck and lung SCCs in female versus male patients. Experimentally increased ERβ expression or treatment with ERβ agonists inhibited proliferation of SCC cells and promoted NOTCH1 expression and squamous differentiation both in vitro and in mouse xenotransplants. Our data identify a link between transcriptional control of NOTCH1 expression and the estrogen response in keratinocytes, with implications for differentiation therapy of squamous cancer.
Abstract:
MOTIVATION: Most bioactive molecules perform their action by interacting with proteins or other macromolecules. However, for a significant fraction of them, the primary target remains unknown. In addition, the majority of bioactive molecules have more than one target, many of which are poorly characterized. Computational predictions of bioactive molecule targets based on similarity with known ligands are powerful for narrowing down the number of potential targets and for rationalizing side effects of known molecules. RESULTS: Using a reference set of 224 412 molecules active on 1700 human proteins, we show that accurate target prediction can be achieved by combining different measures of chemical similarity based on both chemical structure and molecular shape. Our results indicate that the combined approach is especially efficient when no ligand with the same scaffold or from the same chemical series has yet been discovered. We also observe that different combinations of similarity measures are optimal for different molecular properties, such as the number of heavy atoms. This further highlights the importance of considering different classes of similarity measures between new molecules and known ligands to accurately predict their targets. CONTACT: olivier.michielin@unil.ch or vincent.zoete@unil.ch SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
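The idea of combining a structure-based and a shape-based similarity to rank candidate targets can be illustrated as follows. This is a sketch under stated assumptions: the abstract does not say how the measures are combined, so the weighted average used here is a stand-in, the shape similarities are taken as precomputed numbers, and all names and toy values are hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_targets(query_bits, known_ligands, weight=0.5):
    """Rank targets by the best combined similarity of the query to their ligands.

    known_ligands: iterable of (target, ligand_bits, shape_similarity_to_query).
    Assumption: the two measures are combined by a simple weighted average.
    """
    best = {}
    for target, ligand_bits, shape_sim in known_ligands:
        score = weight * tanimoto(query_bits, ligand_bits) + (1.0 - weight) * shape_sim
        best[target] = max(score, best.get(target, 0.0))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data: fingerprint on-bits and precomputed shape similarities.
query = {1, 2, 3, 4}
ligands = [
    ("TARGET_A", {1, 2, 3, 5}, 0.4),
    ("TARGET_B", {7, 8}, 0.9),
    ("TARGET_A", {1, 2, 3, 4}, 0.2),
]
ranking = rank_targets(query, ligands)
print(ranking)
```

In the toy data, TARGET_B has no structural overlap with the query but a high shape similarity, which is the kind of case where a combined score can still surface a target that structure alone would miss.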
Abstract:
Congenital hypogonadotropic hypogonadism (CHH) and its anosmia-associated form (Kallmann syndrome [KS]) are genetically heterogeneous. Among the >15 genes implicated in these conditions, mutations in FGF8 and FGFR1 account for ∼12% of cases; notably, KAL1 and HS6ST1 are also involved in FGFR1 signaling and can be mutated in CHH. We therefore hypothesized that mutations in genes encoding a broader range of modulators of the FGFR1 pathway might contribute to the genetics of CHH as causal or modifier mutations. Thus, we aimed to (1) investigate whether CHH individuals harbor mutations in members of the so-called "FGF8 synexpression" group and (2) validate the ability of a bioinformatics algorithm on the basis of protein-protein interactome data (interactome-based affiliation scoring [IBAS]) to identify high-quality candidate genes. On the basis of sequence homology, expression, and structural and functional data, seven genes were selected and sequenced in 386 unrelated CHH individuals and 155 controls. Except for FGF18 and SPRY2, all other genes were found to be mutated in CHH individuals: FGF17 (n = 3 individuals), IL17RD (n = 8), DUSP6 (n = 5), SPRY4 (n = 14), and FLRT3 (n = 3). Independently, IBAS predicted FGF17 and IL17RD as the two top candidates in the entire proteome on the basis of a statistical test of their protein-protein interaction patterns to proteins known to be altered in CHH. Most of the FGF17 and IL17RD mutations altered protein function in vitro. IL17RD mutations were found only in KS individuals and were strongly linked to hearing loss (6/8 individuals). Mutations in genes encoding components of the FGF pathway are associated with complex modes of CHH inheritance and act primarily as contributors to an oligogenic genetic architecture underlying CHH.
Abstract:
The Gene Ontology (GO) (http://www.geneontology.org) is a community bioinformatics resource that represents gene product function through the use of structured, controlled vocabularies. The number of GO annotations of gene products has increased due to curation efforts among GO Consortium (GOC) groups, including focused literature-based annotation and ortholog-based functional inference. The GO ontologies continue to expand and improve as a result of targeted ontology development, including the introduction of computable logical definitions and development of new tools for the streamlined addition of terms to the ontology. The GOC continues to support its user community through the use of e-mail lists, social media and web-based resources.
Abstract:
MOTIVATION: High-throughput sequencing technologies enable the genome-wide analysis of the impact of genetic variation on molecular phenotypes at unprecedented resolution. However, although powerful, these technologies can also introduce unexpected artifacts. RESULTS: We investigated the impact of library amplification bias on the identification of allele-specific (AS) molecular events from high-throughput sequencing data derived from chromatin immunoprecipitation assays (ChIP-seq). Putative AS DNA binding activity for RNA polymerase II was determined using ChIP-seq data derived from lymphoblastoid cell lines of two parent-daughter trios. We found that, at high-sequencing depth, many significant AS binding sites suffered from an amplification bias, as evidenced by a larger number of clonal reads representing one of the two alleles. To alleviate this bias, we devised an amplification bias detection strategy, which filters out sites with low read complexity and sites featuring a significant excess of clonal reads. This method will be useful for AS analyses involving ChIP-seq and other functional sequencing assays.
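The two filtering criteria described above (low read complexity, and an excess of clonal reads on one allele) can be sketched as below. This is a simplified illustration, not the published method: reads are treated as clonal when they share a start position on the same allele, and fixed thresholds stand in for the statistical test of significance. The function names, thresholds and toy read positions are all hypothetical.

```python
def complexity(starts):
    """Fraction of reads with a distinct start position (1.0 = no clonal reads)."""
    return len(set(starts)) / len(starts) if starts else 0.0

def clonal_fraction(starts):
    """Fraction of reads that duplicate another read's start position."""
    return 1.0 - complexity(starts) if starts else 0.0

def keep_site(ref_starts, alt_starts, min_complexity=0.5, max_clonal_gap=0.3):
    """Return True if a heterozygous site passes both filters.

    Assumption: fixed thresholds replace the significance test used in practice.
    """
    total = len(ref_starts) + len(alt_starts)
    if total == 0:
        return False
    # Filter 1: overall read complexity, counting distinct starts per allele.
    pooled = (len(set(ref_starts)) + len(set(alt_starts))) / total
    if pooled < min_complexity:
        return False
    # Filter 2: excess of clonal reads on one allele relative to the other.
    gap = abs(clonal_fraction(ref_starts) - clonal_fraction(alt_starts))
    return gap <= max_clonal_gap

# Hypothetical toy sites: read start positions per allele.
clean = keep_site([100, 105, 110, 120], [101, 108, 115])
biased = keep_site([100, 100, 100, 100, 100, 105], [101, 108])
low_complexity = keep_site([100] * 5, [200] * 5)
print(clean, biased, low_complexity)
```

A real implementation would replace the fixed gap threshold with a test whose significance depends on read depth, since the same clonal fraction is more convincing at high coverage.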
Abstract:
BACKGROUND: Finding genes that are differentially expressed between conditions is an integral part of understanding the molecular basis of phenotypic variation. In the past decades, DNA microarrays have been used extensively to quantify the abundance of mRNA corresponding to different genes, and more recently high-throughput sequencing of cDNA (RNA-seq) has emerged as a powerful competitor. As the cost of sequencing decreases, it is conceivable that the use of RNA-seq for differential expression analysis will increase rapidly. To exploit the possibilities and address the challenges posed by this relatively new type of data, a number of software packages have been developed especially for differential expression analysis of RNA-seq data. RESULTS: We conducted an extensive comparison of eleven methods for differential expression analysis of RNA-seq data. All methods are freely available within the R framework and take as input a matrix of counts, i.e. the number of reads mapping to each genomic feature of interest in each of a number of samples. We evaluate the methods based on both simulated data and real RNA-seq data. CONCLUSIONS: Very small sample sizes, which are still common in RNA-seq experiments, pose problems for all evaluated methods, and any results obtained under such conditions should be interpreted with caution. For larger sample sizes, the methods combining a variance-stabilizing transformation with the 'limma' method for differential expression analysis perform well under many different conditions, as does the nonparametric SAMseq method.