124 results for bioinformatics


Abstract:

The MyHits web server (http://myhits.isb-sib.ch) is a new integrated service dedicated to the annotation of protein sequences and to the analysis of their domains and signatures. Guest users can use the system anonymously, with full access to (i) standard bioinformatics programs (e.g. PSI-BLAST, ClustalW, T-Coffee, Jalview); (ii) a large number of protein sequence databases, including standard (Swiss-Prot, TrEMBL) and locally developed databases (splice variants); (iii) databases of protein motifs (PROSITE, InterPro); and (iv) a precomputed list of matches ('hits') between the sequence and motif databases. All databases are updated weekly and the hit list is kept up to date incrementally. The MyHits server also includes a new collection of tools to generate graphical representations of pairwise and multiple sequence alignments, including their annotated features. Free registration enables users to upload their own sequences and motifs to private databases. These are then made available through the same web interface and the same set of analytical tools. Registered users can manage their own sequences and annotations using only web tools and freeze their data in their private database for publication purposes.

Abstract:

Hidden Markov models (HMMs) are probabilistic models that are well adapted to many tasks in bioinformatics, for example, predicting the occurrence of specific motifs in biological sequences. MAMOT is a command-line program for Unix-like operating systems, including Mac OS X, that we developed to allow scientists to apply HMMs more easily in their research. One can define the architecture and initial parameters of the model in a text file and then use MAMOT for parameter optimization on example data, decoding (such as predicting motif occurrence in sequences) and the production of stochastic sequences generated according to the probabilistic model. Two examples for which models are provided are coiled-coil domains in protein sequences and protein binding sites in DNA. Useful features include pseudocounts, state tying and fixing of selected parameters in learning, and the inclusion of prior probabilities in decoding. AVAILABILITY: MAMOT is implemented in C++, and is distributed under the GNU General Public License (GPL). The software, documentation, and example model files can be found at http://bcf.isb-sib.ch/mamot
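
The decoding task mentioned in this abstract can be illustrated with a generic Viterbi sketch in Python. This is not MAMOT code or its model file format; the two-state model, its probabilities and the function name `viterbi` are assumptions chosen only to show how an HMM labels motif positions in a sequence.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state path for an observation sequence."""
    def log(p):
        # Work in log space to avoid underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: best log-probability of any path ending in state s at position t
    best = [{s: log(start_p[s]) + log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

# Hypothetical model: "B" = background, "M" = GC-rich motif.
states = ["B", "M"]
start_p = {"B": 0.9, "M": 0.1}
trans_p = {"B": {"B": 0.9, "M": 0.1}, "M": {"B": 0.2, "M": 0.8}}
emit_p = {
    "B": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "M": {"A": 0.05, "C": 0.45, "G": 0.45, "T": 0.05},
}
sequence = "ATAT" + "GCGCGCGCGC" + "ATAT"
path = "".join(viterbi(sequence, states, start_p, trans_p, emit_p))
```

Each character of `path` labels the corresponding position of `sequence` as background or motif. Parameter optimization, which the abstract also mentions, is a separate step and is not shown here.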

Abstract:

Background: Gene expression analysis has emerged as a major biological research area, with real-time quantitative reverse transcription PCR (RT-QPCR) being one of the most accurate and widely used techniques for expression profiling of selected genes. In order to obtain results that are comparable across assays, a stable normalization strategy is required. In general, the normalization of PCR measurements between different samples uses one to several control genes (e.g. housekeeping genes), from which a baseline reference level is constructed. Thus, the choice of the control genes is of utmost importance, yet there is no generally accepted standard technique for screening a large number of candidates and identifying the best ones. Results: We propose a novel approach for scoring and ranking candidate genes for their suitability as control genes. Our approach relies on publicly available microarray data and allows the combination of multiple data sets originating from different platforms and/or representing different pathologies. The use of microarray data allows the screening of tens of thousands of genes, producing very comprehensive lists of candidates. We also provide two lists of candidate control genes: one that is breast cancer-specific and one with more general applicability. Two genes from the breast cancer list that had not previously been used as control genes are identified and validated by RT-QPCR. Open source R functions are available at http://www.isrec.isb-sib.ch/~vpopovic/research/ Conclusion: We proposed a new method for identifying candidate control genes for RT-QPCR that was able to rank thousands of genes according to predefined suitability criteria, and we applied it to the case of breast cancer. We also showed empirically that translating the results from the microarray to the PCR platform was achievable.
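
As a rough illustration of ranking candidate control genes across several data sets, the sketch below scores each gene by its coefficient of variation and keeps the worst value over all data sets. This criterion is an assumption made for the example, not the scoring scheme of the paper; the gene names and expression values are hypothetical.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation divided by the mean: lower means more stable expression."""
    return pstdev(values) / mean(values)

def rank_control_candidates(datasets):
    """Rank genes present in every data set by their largest CV across data sets."""
    genes = set.intersection(*(set(d) for d in datasets))
    worst_cv = {
        g: max(coefficient_of_variation(d[g]) for d in datasets) for g in genes
    }
    return sorted(worst_cv, key=worst_cv.get)

# Hypothetical expression values (linear scale) from two platforms.
dataset_a = {
    "GENE1": [100, 102, 98, 101],
    "GENE2": [50, 90, 20, 70],
    "GENE3": [200, 210, 190, 205],
}
dataset_b = {
    "GENE1": [80, 81, 79, 80],
    "GENE2": [40, 45, 38, 42],
    "GENE3": [150, 100, 220, 180],
}
ranking = rank_control_candidates([dataset_a, dataset_b])
```

Taking the worst score over data sets means that a gene is ranked highly only if it is stable on every platform, which is one simple way to combine heterogeneous sources.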

Abstract:

SUMMARY: Large sets of data, such as expression profiles from many samples, require analytic tools to reduce their complexity. The Iterative Signature Algorithm (ISA) is a biclustering algorithm designed to decompose a large set of data into so-called 'modules'. In the context of gene expression data, these modules consist of subsets of genes that exhibit a coherent expression profile only over a subset of microarray experiments. Genes and arrays may be attributed to multiple modules, and the level of required coherence can be varied, resulting in different 'resolutions' of the modular mapping. In this short note, we introduce two Bioconductor software packages written in GNU R: the isa2 package includes an optimized implementation of the ISA, and the eisa package provides a convenient interface to run the ISA, visualize its output and put the biclusters into biological context. Potential users of these packages are all R and Bioconductor users dealing with tabular (e.g. gene expression) data. AVAILABILITY: http://www.unil.ch/cbg/ISA CONTACT: sven.bergmann@unil.ch
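
The alternating idea behind this kind of biclustering can be sketched in a few lines of Python. The code below is a simplified illustration, not the isa2 or eisa implementation: it uses plain means and z-score thresholds, and the function names, thresholds and toy matrix are all assumptions.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize a list of scores; constant lists map to zeros."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s if s > 0 else 0.0 for v in values]

def find_module(matrix, seed_genes, gene_thr=1.0, cond_thr=1.0, max_iter=50):
    """Alternate condition and gene selection until the gene set stops changing."""
    n_genes, n_conds = len(matrix), len(matrix[0])
    genes, conds = set(seed_genes), set()
    for _ in range(max_iter):
        # Score each condition by the mean expression of the current genes.
        cond_scores = zscores(
            [mean([matrix[g][c] for g in genes]) for c in range(n_conds)]
        )
        conds = {c for c, z in enumerate(cond_scores) if z > cond_thr}
        if not conds:
            break
        # Score each gene by its mean expression over the selected conditions.
        gene_scores = zscores(
            [mean([matrix[g][c] for c in conds]) for g in range(n_genes)]
        )
        new_genes = {g for g, z in enumerate(gene_scores) if z > gene_thr}
        if not new_genes or new_genes == genes:
            break
        genes = new_genes
    return sorted(genes), sorted(conds)

# Toy matrix (rows: genes, columns: conditions); genes 0-1 are co-expressed
# in conditions 0-1, everything else is low-level background.
matrix = [
    [5, 6, 0, 1, 0, 0],
    [6, 5, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
]
module_genes, module_conds = find_module(matrix, seed_genes=[0])
```

Raising or lowering the two thresholds changes how coherent a gene or condition must be to stay in the module, which corresponds to the 'resolution' described in the abstract.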

Abstract:

During my PhD, I used vertebrate model species, such as mouse and zebrafish, to study factors affecting the evolution of genes and their expression. More precisely, I have shown that anatomy and development are key factors to take into account, as they influence the rate of gene sequence evolution, the impact of mutations (i.e. is the deletion of a gene lethal?), and the propensity of a gene to duplicate. Where and when a gene is expressed imposes constraints on it or, on the contrary, leaves it some opportunity to evolve. We analyzed these patterns in relation to classical models of morphological evolution in vertebrates, which were previously thought to directly reflect constraints on the genome. We showed that the patterns of evolution at these two levels of organization do not translate smoothly: there is no direct link between the conservation of genotype and that of phenotypes such as morphology. This work was made possible by the development of bioinformatics tools. Notably, I worked on the development of the Bgee database, which aims to compare gene expression between different species in an automated and large-scale way. This involves the formalization of anatomy, development, and concepts related to homology through the use of ontologies. A coherent integration of heterogeneous expression data (microarrays, expressed sequence tags, in situ hybridizations) was also required. This database is regularly updated and freely available. It should help extend the possibilities for comparing gene expression between species in evo-devo (evolution of development) and genomics studies.

Abstract:

Animal toxins are of interest to a wide range of scientists, due to their numerous applications in pharmacology, neurology, hematology, medicine, and drug research. This, and to a lesser extent the development of new high-performance tools in transcriptomics and proteomics, has led to an increase in toxin discovery. In this context, providing publicly available data on animal toxins has become essential. The UniProtKB/Swiss-Prot Tox-Prot program (http://www.uniprot.org/program/Toxins) plays a crucial role by providing such access to venom protein sequences and functions from all venomous species. To date, this program has curated more than 5000 venom proteins to the high-quality standards of UniProtKB/Swiss-Prot (release 2012_02). Proteins targeted by these toxins are also available in the knowledgebase. This paper describes in detail the type of information provided by UniProtKB/Swiss-Prot for toxins, as well as the structured format of the knowledgebase.

Abstract:

MOTIVATION: Regulatory gene networks contain generic modules such as feedback loops that are essential for the regulation of many biological functions. The study of the stochastic mechanisms of gene regulation is instrumental for understanding how cells maintain their expression at levels commensurate with their biological role, as well as for engineering gene expression switches with appropriate behavior. The lack of precise knowledge of the steady-state distribution of gene expression requires the use of Gillespie algorithms and Monte-Carlo approximations. METHODOLOGY: In this study, we provide new exact formulas and efficient numerical algorithms for computing the steady state of a class of self-regulated genes, and we use them to model the stochastic expression of a gene of interest in an engineered network introduced in mammalian cells. The behavior of the genetic network is then analyzed experimentally in living cells. RESULTS: Stochastic models often reveal counter-intuitive experimental behaviors, and we find that this genetic architecture displays a unimodal behavior in mammalian cells, which was unexpected given its known bimodal response in unicellular organisms. We provide a molecular rationale for this behavior, and we implement it in the mathematical picture to explain the experimental results obtained from this network.
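
For readers unfamiliar with the Gillespie approach that the abstract contrasts with exact formulas, here is a minimal Python sketch for a generic self-repressed gene. The reaction scheme (production rate `k_max / (1 + n / half_point)`, first-order decay) and every parameter value are assumptions made for illustration; they are not the model analyzed in the paper.

```python
import random

def simulate_self_repressed_gene(k_max=20.0, half_point=10.0, decay=1.0,
                                 t_end=500.0, seed=0):
    """Gillespie simulation; returns the time-averaged protein copy number."""
    rng = random.Random(seed)
    t, n = 0.0, 0
    area = 0.0  # integral of n over time
    while True:
        production = k_max / (1.0 + n / half_point)  # repressed by its own product
        degradation = decay * n
        total = production + degradation
        wait = rng.expovariate(total)  # waiting time until the next reaction
        if t + wait >= t_end:
            area += n * (t_end - t)
            break
        area += n * wait
        t += wait
        # Pick which reaction fires, in proportion to its rate.
        if rng.random() * total < production:
            n += 1
        else:
            n -= 1
    return area / t_end

mean_copies = simulate_self_repressed_gene()
```

With these assumed parameters the deterministic balance between production and decay sits at 10 copies, so the simulated average should land near that value. Estimating a full steady-state distribution this way requires many such events, which is the cost that closed-form results avoid.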

Abstract:

BACKGROUND: Despite the continuous production of genome sequences for a number of organisms, reliable, comprehensive, and cost-effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method. RESULTS: We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end. CONCLUSIONS: De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of newly sequenced genomes provided by standard homology-based methods.
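
The Venn components mentioned in the results can be computed with plain set operations. The sketch below is only an illustration of that bookkeeping: the gene identifiers are hypothetical, and treating predictions as identical whenever they share an identifier is an assumption (real gene models would first need to be matched by overlap).

```python
from itertools import combinations

def venn_components(sets):
    """Map each combination of predictor names to the genes found by exactly those predictors."""
    names = sorted(sets)
    components = {}
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            inside = set.intersection(*(sets[n] for n in combo))
            outside = set().union(*(sets[n] for n in names if n not in combo))
            components[combo] = inside - outside
    return components

# Hypothetical predictions from the three gene finders named in the abstract.
predictions = {
    "Ensembl": {"g1", "g2", "g3"},
    "SGP2": {"g2", "g3", "g4"},
    "TWINSCAN": {"g3", "g4", "g5"},
}
parts = venn_components(predictions)
```

Each of the seven components can then be sampled separately for experimental testing, so that success rates are reported per region of the diagram.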

Abstract:

SUMMARY: MetaNetX.org is a website for accessing, analysing and manipulating genome-scale metabolic networks (GSMs) as well as biochemical pathways. It consistently integrates data from various public resources and makes the data accessible in a standardized format using a common namespace. Currently, it provides access to hundreds of GSMs and pathways that can be interactively compared (two or more), analysed (e.g. detection of dead-end metabolites and reactions, flux balance analysis or simulation of reaction and gene knockouts), manipulated and exported. Users can also upload their own metabolic models, choose to automatically map them into the common namespace and subsequently make use of the website's functionality. Availability and implementation: MetaNetX.org is available at http://metanetx.org. CONTACT: help@metanetx.org.
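
One of the analyses listed above, the detection of dead-end metabolites, can be sketched as follows. This is a generic illustration and not MetaNetX code or its data format: the reaction tuples, identifiers and the convention that exchange reactions have an empty side are all assumptions.

```python
def dead_end_metabolites(reactions):
    """Return metabolites that are only produced or only consumed in the network."""
    produced, consumed = set(), set()
    for substrates, products, reversible in reactions.values():
        consumed.update(substrates)
        produced.update(products)
        if reversible:
            # A reversible reaction can run in either direction.
            consumed.update(products)
            produced.update(substrates)
    # Symmetric difference: in one set but not the other.
    return sorted(produced ^ consumed)

# Hypothetical toy network: (substrates, products, reversible).
# Exchange reactions have an empty side and represent the system boundary.
toy_network = {
    "EX_glc": ([], ["glc"], False),
    "R1": (["glc"], ["g6p"], False),
    "R2": (["g6p"], ["pyr"], False),
    "R3": (["g6p"], ["x"], False),
    "EX_pyr": (["pyr"], [], False),
}
dead_ends = dead_end_metabolites(toy_network)
```

Here `x` is produced by `R3` but never consumed, so any steady-state flux through `R3` must be zero. Flagging such metabolites is a common first check before running flux balance analysis.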

Abstract:

Phosphate homeostasis was studied in a monocotyledonous model plant through the characterization of the PHO1 gene family in rice (Oryza sativa). Bioinformatics and phylogenetic analysis showed that the rice genome has three PHO1 homologs, which cluster with the Arabidopsis (Arabidopsis thaliana) AtPHO1 and AtPHO1;H1, the only two genes known to be involved in root-to-shoot transfer of phosphate. In contrast to the Arabidopsis PHO1 gene family, all three rice PHO1 genes have a cis-natural antisense transcript located at the 5' end of the genes. Strand-specific quantitative reverse transcription-PCR analyses revealed distinct patterns of expression for sense and antisense transcripts for all three genes, both at the level of tissue expression and in response to nutrient stress. The most abundantly expressed gene was OsPHO1;2 in the roots, for both sense and antisense transcripts. However, while the OsPHO1;2 sense transcript was relatively stable under various nutrient deficiencies, the antisense transcript was highly induced by inorganic phosphate (Pi) deficiency. Characterization of Ospho1;1 and Ospho1;2 insertion mutants revealed that only Ospho1;2 mutants had defects in Pi homeostasis, namely a strong reduction in Pi transfer from root to shoot, which was accompanied by low shoot Pi and high root Pi. Our data identify OsPHO1;2 as playing a key role in the transfer of Pi from roots to shoots in rice, and indicate that this gene could be regulated by its cis-natural antisense transcript. Furthermore, phylogenetic analysis of PHO1 homologs in monocotyledons and dicotyledons revealed the emergence of a distinct clade of PHO1 genes in dicotyledons, which includes members having roles other than long-distance Pi transport.

Abstract:

MOTIVATION: Understanding gene regulation in biological processes and modeling the robustness of the underlying regulatory networks is an important problem that is currently being addressed by computational systems biologists. Lately, there has been renewed interest in Boolean modeling techniques for gene regulatory networks (GRNs). However, due to their deterministic nature, it is often difficult to identify whether these modeling approaches are robust to the addition of the stochastic noise that is widespread in gene regulatory processes. Stochasticity in Boolean models of GRNs has been addressed relatively sparingly in the past, mainly by flipping the expression of genes between different expression levels with a predefined probability. This stochasticity in nodes (SIN) model leads to overrepresentation of noise in GRNs and hence to non-correspondence with biological observations. RESULTS: In this article, we introduce the stochasticity in functions (SIF) model for simulating stochasticity in Boolean models of GRNs. By providing biological motivation behind the use of the SIF model and applying it to the T-helper and T-cell activation networks, we show that the SIF model provides more biologically robust results than the existing SIN model of stochasticity in GRNs. AVAILABILITY: Algorithms are made available under our Boolean modeling toolbox, GenYsis. The software binaries can be downloaded from http://si2.epfl.ch/~garg/genysis.html.
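
The SIN scheme described above, in which each node is flipped with a predefined probability, can be sketched in Python as follows. This is not GenYsis code, and the SIF model is not shown because the abstract does not specify it; the two-gene toggle network, the synchronous update and all names are assumptions for illustration.

```python
import random

# Hypothetical two-gene toggle switch: each gene is on only if the other is off.
RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
}

def sin_step(state, rules, flip_prob, rng):
    """One synchronous Boolean update followed by independent node flips (SIN noise)."""
    updated = {node: rule(state) for node, rule in rules.items()}
    return {
        node: (not value) if rng.random() < flip_prob else value
        for node, value in updated.items()
    }

def fraction_in_state(start, target, flip_prob, steps=10000, seed=1):
    """Fraction of time steps the noisy network spends in the target state."""
    rng = random.Random(seed)
    state, hits = dict(start), 0
    for _ in range(steps):
        state = sin_step(state, RULES, flip_prob, rng)
        hits += state == target
    return hits / steps

stable = {"A": True, "B": False}
without_noise = fraction_in_state(stable, stable, flip_prob=0.0)
with_noise = fraction_in_state(stable, stable, flip_prob=0.05)
```

Without noise the network stays in its fixed point indefinitely, whereas even a small flip probability repeatedly knocks it out of that state. Because every node is perturbed regardless of its regulatory logic, this illustrates the kind of noise injection that the abstract argues is overrepresented in the SIN model.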

Abstract:

Phylogenomic databases provide orthology predictions for species with fully sequenced genomes. Although the goal seems well-defined, the content of these databases differs greatly. Seven ortholog databases (Ensembl Compara, eggNOG, HOGENOM, InParanoid, OMA, OrthoDB, Panther) were compared on the basis of reference trees. For three well-conserved protein families, we observed a generally high specificity of orthology assignments for these databases. We show that differences in the completeness of predicted gene relationships and in the phylogenetic information are, for the great majority, not due to the methods used, but to differences in the underlying database concepts. According to our metrics, none of the databases provides a fully correct and comprehensive protein classification. Our results provide a framework for meaningful and systematic comparisons of phylogenomic databases. In the future, a sustainable set of 'Gold standard' phylogenetic trees could provide a robust method for phylogenomic databases to assess their current quality status, measure changes following new database releases and diagnose improvements subsequent to an upgrade of the analysis procedure.

Abstract:

In this study, we have demonstrated the potential of two-dimensional electrophoresis (2DE)-based technologies as tools for characterization of the Leishmania proteome (the expressed protein complement of the genome). Standardized neutral range (pH 5-7) proteome maps of Leishmania (Viannia) guyanensis and Leishmania (Viannia) panamensis promastigotes were reproducibly generated by 2DE of soluble parasite extracts, which were prepared using lysis buffer containing urea and Nonidet P-40 detergent. The Coomassie blue and silver nitrate staining systems both yielded good resolution and representation of protein spots, enabling the detection of approximately 800 and 1,500 distinct proteins, respectively. Several reference protein spots common to the proteomes of all parasite species/strains studied were isolated and identified by peptide mass spectrometry (LC-ES-MS/MS) and bioinformatics approaches as members of the heat shock protein family, ribosomal protein S12, kinetoplast membrane protein 11 and a hypothetical Leishmania-specific 13 kDa protein of unknown function. Immunoblotting of Leishmania protein maps using a monoclonal antibody resulted in the specific detection of the 81.4 kDa and 77.5 kDa subunits of paraflagellar rod proteins 1 and 2, respectively. Moreover, differences in protein expression profiles between distinct parasite clones were reproducibly detected through comparative proteome analyses of paired maps using image analysis software. These data illustrate the resolving power of 2DE-based proteome analysis. The production and basic characterization of good-quality Leishmania proteome maps provides an essential first step towards comparative protein expression studies aimed at identifying the molecular determinants of parasite drug resistance and virulence, as well as discovering new drug and vaccine targets.

Abstract:

The primary mission of UniProt is to support biological research by maintaining a stable, comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and querying interfaces freely accessible to the scientific community. UniProt is produced by the UniProt Consortium, which consists of groups from the European Bioinformatics Institute (EBI), the Swiss Institute of Bioinformatics (SIB) and the Protein Information Resource (PIR). UniProt comprises four major components, each optimized for different uses: the UniProt Archive, the UniProt Knowledgebase, the UniProt Reference Clusters and the UniProt Metagenomic and Environmental Sequence Database. UniProt is updated and distributed every 3 weeks and can be accessed online for searches or download at http://www.uniprot.org.

Abstract:

Summary: Cancer is a leading cause of morbidity and mortality in Western countries (as an example, colorectal cancer accounts for about 300,000 new cases and 200,000 deaths each year in Europe and in the USA). Although many patients with cancer have complete macroscopic clearance of their disease after resection, radiotherapy and/or chemotherapy, many of these patients develop fatal recurrence. Vaccination with immunogenic peptide tumor antigens has shown encouraging progress in the last decade; immunotherapy might therefore constitute a fourth therapeutic option in the future. We dissect here and critically evaluate the numerous steps of reverse immunology, a predictive procedure to identify antigenic peptides from the sequence of a gene of interest. Bioinformatic algorithms were applied to mine sequence databases for tumor-specific transcripts. A quality assessment of publicly available sequence databanks allowed us to define the strengths and weaknesses of bioinformatics-based prediction of colon cancer-specific alternative splicing: new splice variants could be identified; however, cancer-restricted expression could not be significantly predicted. Other sources of target transcripts, such as cancer-testis genes or reported overexpressed transcripts, were quantitatively investigated by polymerase chain reaction. Based on the relative expression of a defined set of housekeeping genes in colon cancer tissues, we characterized a precise procedure for accurate normalization and determined a threshold for the definition of significant overexpression of genes in cancers versus normal tissues. Further steps of reverse immunology were applied to a splice variant of the Melan-A gene. Since it is known that the C-termini of antigenic peptides are directly produced by the proteasome, longer precursor and overlapping peptides encoded by the target sequence were synthesized chemically and digested in vitro with purified proteasome. The resulting fragments were identified by mass spectrometry to detect cleavage sites. Using this information and based on the available anchor motifs for defined HLA class I molecules, putative antigenic peptides could be predicted. Their relative affinity for HLA molecules was confirmed experimentally with functional competitive binding assays, and they were used to search patients' peripheral blood lymphocytes for the presence of specific cytolytic T lymphocytes (CTL). CTL clones specific for a splice variant of Melan-A could be isolated; although they recognized peptide-pulsed cells, they failed to lyse melanoma cells in functional assays of antigen recognition. In conclusion, we discuss the advantages and bottlenecks of reverse immunology and compare the technical aspects of this approach with the more classical procedure of direct immunology, a technique introduced by Boon and colleagues more than 10 years ago to successfully clone tumor antigens.
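
The anchor-motif step of this procedure can be illustrated with a simple sliding-window scan in Python. The sketch is an assumption-laden toy: the anchor pattern (leucine or methionine at position 2, valine or leucine at the last position of a 9-mer), the protein sequence and the function name are all hypothetical, and the code does not incorporate proteasome cleavage data or binding affinities.

```python
def candidate_peptides(protein, length=9, anchors=None):
    """List (start position, peptide) pairs whose anchor positions match the motif."""
    if anchors is None:
        # Assumed illustrative motif, 0-based positions within the peptide.
        anchors = {1: "LM", length - 1: "VL"}
    hits = []
    for start in range(len(protein) - length + 1):
        peptide = protein[start:start + length]
        if all(peptide[pos] in allowed for pos, allowed in anchors.items()):
            hits.append((start + 1, peptide))  # report 1-based start positions
    return hits

# Hypothetical protein fragment, not a real sequence.
fragment = "ASLMGIGILVKKAMDEFGHLQ"
candidates = candidate_peptides(fragment)
```

In a workflow like the one described, such candidates would only be a starting list: the abstract reports that cleavage sites were determined experimentally and that binding was confirmed with competitive assays.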