916 results for High-throughput screening


Relevance:

80.00%

Publisher:

Abstract:

Wound responses in plants have to be coordinated between organs so that locally reduced growth in a wounded tissue is balanced by appropriate growth elsewhere in the body. We used a JASMONATE ZIM DOMAIN 10 (JAZ10) reporter to screen for mutants affected in the organ-specific activation of jasmonate (JA) signaling in Arabidopsis thaliana seedlings. Wounding one cotyledon activated the reporter in both aerial and root tissues, and this was either disrupted or restricted to certain organs in mutant alleles of core components of the JA pathway including COI1, OPR3, and JAR1. In contrast, three other mutants showed constitutive activation of the reporter in the roots and hypocotyls of unwounded seedlings. All three lines harbored mutations in Novel Interactor of JAZ (NINJA), which encodes part of a repressor complex that negatively regulates JA signaling. These ninja mutants displayed shorter roots mimicking JA-mediated growth inhibition, and this was due to reduced cell elongation. Remarkably, this phenotype and the constitutive JAZ10 expression were still observed in backgrounds lacking the ability to synthesize JA or the key transcriptional activator MYC2. Therefore, JA-like responses can be recapitulated in specific tissues without changing a plant's ability to make or perceive JA, and MYC2 either has no role or is not the only derepressed transcription factor in ninja mutants. Our results show that the role of NINJA in the root is to repress JA signaling and allow normal cell elongation. Furthermore, the regulation of the JA pathway differs between roots and aerial tissues at all levels, from JA biosynthesis to transcriptional activation.

Relevance:

80.00%

Publisher:

Abstract:

Gene expression changes may underlie much of phenotypic evolution. The development of high-throughput RNA sequencing protocols has opened the door to unprecedented large-scale and cross-species transcriptome comparisons by allowing accurate and sensitive assessments of transcript sequences and expression levels. Here, we review the initial wave of the new generation of comparative transcriptomic studies in mammals and vertebrate outgroup species in the context of earlier work. Together with various large-scale genomic and epigenomic data, these studies have unveiled commonalities and differences in the dynamics of gene expression evolution for various types of coding and non-coding genes across mammalian lineages, organs, developmental stages, chromosomes and sexes. They have also provided intriguing new clues to the regulatory basis and phenotypic implications of evolutionary gene expression changes.

Relevance:

80.00%

Publisher:

Abstract:

MOTIVATION: High-throughput sequencing technologies enable the genome-wide analysis of the impact of genetic variation on molecular phenotypes at unprecedented resolution. However, although powerful, these technologies can also introduce unexpected artifacts. RESULTS: We investigated the impact of library amplification bias on the identification of allele-specific (AS) molecular events from high-throughput sequencing data derived from chromatin immunoprecipitation assays (ChIP-seq). Putative AS DNA binding activity for RNA polymerase II was determined using ChIP-seq data derived from lymphoblastoid cell lines of two parent-daughter trios. We found that, at high sequencing depth, many significant AS binding sites suffered from an amplification bias, as evidenced by a larger number of clonal reads representing one of the two alleles. To alleviate this bias, we devised an amplification bias detection strategy, which filters out sites with low read complexity and sites featuring a significant excess of clonal reads. This method will be useful for AS analyses involving ChIP-seq and other functional sequencing assays.
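
The two filters named in this abstract (low read complexity, excess of clonal reads on one allele) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name, both thresholds, the definition of read complexity as the fraction of distinct read start positions, and the use of a fixed cutoff in place of a statistical significance test are all hypothetical choices made for the example.

```python
from collections import Counter


def clonal_fraction(starts):
    """Fraction of reads that duplicate the start position of another read."""
    if not starts:
        return 0.0
    counts = Counter(starts)
    duplicates = sum(n - 1 for n in counts.values())
    return duplicates / len(starts)


def passes_amplification_filter(ref_starts, alt_starts,
                                min_complexity=0.5, max_clonal_excess=0.3):
    """Return True if a heterozygous site shows no sign of amplification bias.

    ref_starts / alt_starts: start positions of the reads carrying each allele.
    Reads sharing a start position are treated as clonal.
    Both thresholds are hypothetical values chosen for illustration.
    """
    all_starts = list(ref_starts) + list(alt_starts)
    if not all_starts:
        return False
    # Filter 1: read complexity, here distinct start positions per read.
    complexity = len(set(all_starts)) / len(all_starts)
    if complexity < min_complexity:
        return False
    # Filter 2: excess of clonal reads on one allele relative to the other.
    excess = abs(clonal_fraction(ref_starts) - clonal_fraction(alt_starts))
    return excess <= max_clonal_excess
```

A site whose reference-allele reads mostly share one start position while the alternative-allele reads do not would be rejected by the second filter; a real analysis would replace the fixed cutoff with a proper test.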

Relevance:

80.00%

Publisher:

Abstract:

BACKGROUND: Finding genes that are differentially expressed between conditions is an integral part of understanding the molecular basis of phenotypic variation. In the past decades, DNA microarrays have been used extensively to quantify the abundance of mRNA corresponding to different genes, and more recently high-throughput sequencing of cDNA (RNA-seq) has emerged as a powerful competitor. As the cost of sequencing decreases, it is conceivable that the use of RNA-seq for differential expression analysis will increase rapidly. To exploit the possibilities and address the challenges posed by this relatively new type of data, a number of software packages have been developed especially for differential expression analysis of RNA-seq data. RESULTS: We conducted an extensive comparison of eleven methods for differential expression analysis of RNA-seq data. All methods are freely available within the R framework and take as input a matrix of counts, i.e. the number of reads mapping to each genomic feature of interest in each of a number of samples. We evaluate the methods based on both simulated data and real RNA-seq data. CONCLUSIONS: Very small sample sizes, which are still common in RNA-seq experiments, impose problems for all evaluated methods and any results obtained under such conditions should be interpreted with caution. For larger sample sizes, the methods combining a variance-stabilizing transformation with the 'limma' method for differential expression analysis perform well under many different conditions, as does the nonparametric SAMseq method.

Relevance:

80.00%

Publisher:

Abstract:

In recent years, both homing endonucleases (HEases) and zinc-finger nucleases (ZFNs) have been engineered and selected for the targeting of desired human loci for gene therapy. However, enzyme engineering is lengthy and expensive, and the off-target effects of the manufactured endonucleases are difficult to predict. Moreover, enzymes selected to cleave a human DNA locus may not cleave the homologous locus in the genome of animal models because of sequence divergence, thus hampering attempts to assess the in vivo efficacy and safety of any engineered enzyme prior to its application in human trials. Here, we show that naturally occurring HEases that cleave desirable human targets can be found. Some of these enzymes are also shown to cleave the homologous sequence in the genome of animal models. In addition, the distribution of off-target effects may be more predictable for native HEases. Based on our experimental observations, we present the HomeBase algorithm, database and web server that allow a high-throughput computational search and assignment of HEases for the targeting of specific loci in the human and other genomes. We experimentally validate the predicted target specificity of candidate fungal, bacterial and archaeal HEases using cell-free, yeast and archaeal assays.

Relevance:

80.00%

Publisher:

Abstract:

PURPOSE OF REVIEW: One of the seven key scientific priorities identified in the road map on HIV cure research is to 'determine the host mechanisms that control HIV replication in the absence of therapy'. This review summarizes the recent work in genomics and in epigenetic control of viral replication that is relevant for this mission. RECENT FINDINGS: New technologies allow the joint analysis of host and viral transcripts. They identify the patterns of antisense transcription of the viral genome and its role in gene regulation. High-throughput studies facilitate the assessment of integration at the genome scale. Integration site, orientation and host genomic context modulate the transcription and should also be assessed at the level of single cells. The various models of latency in primary cells can be followed using dynamic study designs to acquire transcriptome and proteome data of the process of entry, maintenance and reactivation of latency. Dynamic studies can be applied to the study of transcription factors and chromatin modifications in latency and upon reactivation. SUMMARY: The convergence of primary cell models of latency, new high-throughput quantitative technologies applied to the study of time series and the identification of compounds that reactivate viral transcription bring unprecedented precision to the study of viral latency.

Relevance:

80.00%

Publisher:

Abstract:

Eukaryotic transcription is tightly regulated by transcriptional regulatory elements, even though these elements may be located far away from their target genes. It is now widely recognized that these regulatory elements can be brought in close proximity through the formation of chromatin loops, and that these loops are crucial for transcriptional regulation of their target genes. The chromosome conformation capture (3C) technique presents a snapshot of long-range interactions, by fixing physically interacting elements with formaldehyde, digestion of the DNA, and ligation to obtain a library of unique ligation products. Recently, several large-scale modifications to the 3C technique have been presented. Here, we describe chromosome conformation capture sequencing (4C-seq), a high-throughput version of the 3C technique that combines the 3C-on-chip (4C) protocol with next-generation Illumina sequencing. The method is presented for use in mammalian cell lines, but can be adapted to use in mammalian tissues and any other eukaryotic genome.

Relevance:

80.00%

Publisher:

Abstract:

Purpose: To describe a novel in silico method to gather and analyze data from high-throughput heterogeneous experimental procedures, i.e. gene and protein expression arrays. Methods: Each microarray is assigned to a database which handles common data (names, symbols, antibody codes, probe IDs, etc.). Links between pieces of information are automatically generated from knowledge obtained in freely accessible databases (NCBI, Swissprot, etc.). Requests can be made from any point of entry and the displayed result is fully customizable. Results: The initial database has been loaded with two sets of data. The first set originates from an Affymetrix-based retinal profiling performed in an RPE65 knock-out mouse model of Leber's congenital amaurosis. The second set, generated from a Kinexus microarray experiment done on the retinas from the same mouse model, has been added. Queries display wild-type versus knock-out expression at several time points for both genes and proteins. Conclusions: This freely accessible database allows for easy consultation of data and facilitates data mining by integrating experimental data and biological pathways.

Relevance:

80.00%

Publisher:

Abstract:

Progress in genomics, in particular high-throughput next-generation sequencing, is revolutionizing oncology. The impact of these techniques is seen, on the one hand, in the identification of germline mutations that predispose to a given type of cancer, allowing for personalized care of patients or healthy carriers and, on the other hand, in the characterization of all acquired somatic mutations of the tumor cell, opening the door to personalized treatment targeting the driver oncogenes. In both cases, next-generation sequencing techniques allow a global approach whereby the entirety of the genome mutations is analyzed and correlated with the clinical data. The benefits for the quality of care delivered to our patients are extremely impressive.

Relevance:

80.00%

Publisher:

Abstract:

Abstract: Transcriptional regulation is the result of a combination of positive and negative effectors, such as transcription factors, cofactors and chromatin modifiers. During my thesis project I studied the chromatin association, and the transcriptional and cell cycle regulatory functions, of dHCF, the Drosophila homologue of the human protein HCF-1 (host cell factor-1). The human and Drosophila HCF proteins are synthesized as large polypeptides that are cleaved into two subunits (HCFN and HCFC), which remain associated with one another by non-covalent interactions. Studies in mammalian cells over the past 20 years have been devoted to understanding the cellular functions of HCF-1 and have revealed that it is a key regulator of transcription and the cell cycle. In human cells, HCF-1 interacts with the histone methyltransferase Set1/Ash2 and MLL/Ash2 complexes and the histone deacetylase Sin3 complex, which are involved in transcriptional activation and repression, respectively. HCF-1 is also recruited to promoters by the activator transcription factors E2F1 and E2F3, and by the repressor transcription factor E2F4, to regulate G1-to-S phase progression during the cell cycle. HCF-1 protein structure and these interactions between HCF-1 and E2F transcriptional regulator proteins are also conserved in Drosophila. In this doctoral thesis, I use proliferating Drosophila SL2 cells to study both the genomic binding sites of dHCF, using a combination of chromatin immunoprecipitation and ultra-high-throughput sequencing (ChIP-seq) analysis, and dHCF-regulated genes, employing RNAi and microarray expression analysis. I show that dHCF is bound to over 7500 chromosomal sites in proliferating SL2 cells, and is located within ±200 bp of the transcriptional start sites of about 30% of Drosophila genes. There is also a direct relationship between dHCF promoter association and promoter-associated transcriptional activity: dHCF binding levels at promoters correlated directly with transcriptional activity.
In contrast, expression studies showed that dHCF appears to be involved in both transcriptional activation and repression. Analysis of dHCF-binding sites identified nine dHCF-associated motifs, four of which linked dHCF to (i) two insulator proteins, GAGA and BEAF, (ii) the E-box motif, and (iii) a degenerate TATA-box. The dHCF-associated motifs allowed the organization of the dHCF-bound genes into five biological processes: differentiation, cell cycle, gene expression, regulation of endocytosis, and cellular localization. I further show that different mechanisms regulate dHCF association with chromatin. Although the dHCFN and dHCFC subunits remain associated after dHCF cleavage, the two subunits showed different affinities for chromatin and differential binding to a set of tested promoters, suggesting that dHCF could target specific promoters through each of the two subunits. Moreover, in addition to the interaction between dHCF and E2F transcription factors, the dHCF binding pattern is correlated with dE2F2 genomic distribution. I show that dE2F factors are necessary for recruitment of dHCF to the promoter of a set of dHCF-regulated genes. Therefore dHCF, as in mammals, is involved in regulation of G1-to-S phase progression in collaboration with the dE2F transcription factors. In addition, gene expression arrays reveal that dHCF could indirectly regulate cell cycle progression by promoting expression of genes involved in gene expression and protein synthesis, and inhibiting expression of genes involved in cell-cell adhesion. Therefore, dHCF is an evolutionarily conserved protein, which binds to many specific sites of the Drosophila genome via interaction with DNA or chromatin-binding proteins to regulate the expression of genes involved in many different cellular functions.
Résumé (translated from French): Transcriptional regulation results from the positive and negative effects of transcription factors, cofactors and effector proteins that modify chromatin. During my thesis project, I studied the chromatin association of dHCF, the Drosophila homologue of the human protein HCF-1 (host cell factor-1), and its role in regulating transcription and the cell cycle. In both humans and Drosophila, the HCF proteins are synthesized as a long polypeptide that is then cleaved into two subunits at the center of the protein. The two subunits remain associated through non-covalent interactions. Studies carried out over the past 20 years have established that HCF-1 is a key factor in the regulation of transcription and the cell cycle. In human cells, HCF-1 activates and represses transcription by interacting with protein complexes that activate transcription by methylating histones (HMT), such as Set1/Ash2 and MLL/Ash2, and with other complexes that repress transcription and are responsible for histone deacetylation (HDAC), such as the Sin3 protein. HCF-1 is also recruited to promoters by the transcriptional activators E2F1 and E2F3a, and by the transcriptional repressor E2F4, to regulate the transition between the G1 and S phases of the cell cycle. The structure of HCF-1 and the interactions between HCF-1 and transcriptional regulators are conserved in Drosophila. During my thesis I used cultured Drosophila SL2 cells to study the chromatin binding sites of dHCF, using chromatin immunoprecipitation and massive DNA sequencing, as well as the genes regulated by dHCF, using RNAi and microarrays.
My results showed that dHCF binds to about 7565 sites, located within an estimated ±200 base pairs of the transcription start sites of 30% of Drosophila genes. I observed a relationship between dHCF and the level of transcription: the level of dHCF binding at the promoter correlates with transcriptional activity. However, my expression studies showed that dHCF is involved in both the activation and the repression of transcription. Analysis of the DNA sequences bound by dHCF revealed nine motifs; four of these motifs associated dHCF with two insulator proteins, GAGA and BEAF, with the E-box motif, and with a degenerate TATA-box. The nine dHCF-associated motifs made it possible to assign the genes bound by dHCF at the promoter to five biological processes: differentiation, cell cycle, gene expression, regulation of endocytosis and cellular localization. I also showed that several mechanisms regulate the association of dHCF with chromatin. Although the two subunits, dHCFN and dHCFC, remain associated after cleavage, they show different affinities for chromatin and bind differently to a group of promoters; these results suggest that dHCF can bind promoters using each of its subunits. In addition to the association of dHCF with the dE2F transcription factors, the distribution of dHCF across the genome correlates with that of the transcription factor dE2F2. I also showed that the dE2Fs are necessary for the recruitment of dHCF to the promoters of a subgroup of dHCF-regulated genes. My results also showed that in Drosophila, as in humans, dHCF is involved in regulating progression from G1 to S phase of the cell cycle in collaboration with the dE2Fs.
Moreover, the expression arrays suggested that dHCF could regulate the cell cycle indirectly by activating the expression of genes involved in gene expression and protein synthesis, and by inhibiting the expression of genes involved in cell adhesion. In conclusion, dHCF is an evolutionarily conserved protein that binds specifically to many sites in the Drosophila genome, through interaction with other proteins, to regulate the expression of genes involved in several cellular functions.

Relevance:

80.00%

Publisher:

Abstract:

A recurring task in the analysis of mass genome annotation data from high-throughput technologies is the identification of peaks or clusters in a noisy signal profile. Examples of such applications are the definition of promoters on the basis of transcription start site profiles, the mapping of transcription factor binding sites based on ChIP-chip data and the identification of quantitative trait loci (QTL) from whole-genome SNP profiles. Input to such an analysis is a set of genome coordinates associated with counts or intensities. The output consists of a discrete number of peaks with respective volumes, extensions and center positions. We have developed for this purpose a flexible one-dimensional clustering tool, called MADAP, which we make available as a web server and as a standalone program. A set of parameters enables the user to customize the procedure to a specific problem. The web server, which returns results in textual and graphical form, is useful for small to medium-scale applications, as well as for evaluation and parameter tuning in view of large-scale applications that require a local installation. The program, written in C++, can be freely downloaded from ftp://ftp.epd.unil.ch/pub/software/unix/madap. The MADAP web server can be accessed at http://www.isrec.isb-sib.ch/madap/.
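
The input and output described in this abstract (coordinates with counts in, peaks with volume, extension and center out) can be illustrated with a very simple one-dimensional grouping. The sketch below is not MADAP's algorithm, which the abstract does not specify; it uses plain gap-based grouping as a stand-in, and the names `call_peaks`, `summarize` and `max_gap` are hypothetical. Counts are assumed to be positive.

```python
def summarize(points):
    """Describe one peak from its sorted (position, count) pairs."""
    volume = sum(count for _, count in points)
    # Count-weighted mean position; assumes volume > 0.
    center = sum(pos * count for pos, count in points) / volume
    return {
        "volume": volume,
        "start": points[0][0],   # extension: first covered position
        "end": points[-1][0],    # extension: last covered position
        "center": center,
    }


def call_peaks(signal, max_gap=50):
    """Group (position, count) pairs into peaks.

    A position joins the current peak when it lies within max_gap of the
    previous position; otherwise it starts a new peak.
    """
    peaks = []
    current = []
    for pos, count in sorted(signal):
        if current and pos - current[-1][0] > max_gap:
            peaks.append(summarize(current))
            current = []
        current.append((pos, count))
    if current:
        peaks.append(summarize(current))
    return peaks
```

For example, `call_peaks([(100, 2), (110, 6), (120, 2), (500, 1), (510, 3)])` yields two peaks, one spanning 100 to 120 and one spanning 500 to 510. The single `max_gap` parameter stands in for the richer parameter set the abstract mentions.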

Relevance:

80.00%

Publisher:

Abstract:

Identification and relative quantification of hundreds to thousands of proteins within complex biological samples have become realistic with the emergence of stable isotope labeling in combination with high throughput mass spectrometry. However, all current chemical approaches target a single amino acid functionality (most often lysine or cysteine) despite the fact that addressing two or more amino acid side chains would drastically increase quantifiable information as shown by in silico analysis in this study. Although the combination of existing approaches, e.g. ICAT with isotope-coded protein labeling, is analytically feasible, it implies high costs, and the combined application of two different chemistries (kits) may not be straightforward. Therefore, we describe here the development and validation of a new stable isotope-based quantitative proteomics approach, termed aniline benzoic acid labeling (ANIBAL), using a twin chemistry approach targeting two frequent amino acid functionalities, the carboxylic and amino groups. Two simple and inexpensive reagents, aniline and benzoic acid, in their (12)C and (13)C form with convenient mass peak spacing (6 Da) and without chromatographic discrimination or modification in fragmentation behavior, are used to modify carboxylic and amino groups at the protein level, resulting in an identical peptide bond-linked benzoyl modification for both reactions. The ANIBAL chemistry is simple and straightforward and is the first method that uses a (13)C-reagent for a general stable isotope labeling approach of carboxylic groups. In silico as well as in vitro analyses clearly revealed the increase in available quantifiable information using such a twin approach. ANIBAL was validated by means of model peptides and proteins with regard to the quality of the chemistry as well as the ionization behavior of the derivatized peptides. 
A milk fraction was used for dynamic range assessment of protein quantification, and a bacterial lysate was used for the evaluation of relative protein quantification in a complex sample in two different biological states.
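
The 6 Da mass peak spacing quoted in this abstract can be checked with a short calculation. The sketch assumes that each label carries six 13C atoms in its heavy form, which is consistent with the stated spacing but is not spelled out in the abstract; the function name and its parameters are hypothetical.

```python
# Mass difference between one 13C and one 12C atom, in Da.
C13_MINUS_C12 = 13.00335484 - 12.0


def peak_spacing(n_labels, charge, carbons_per_label=6):
    """m/z spacing between the light and heavy forms of a labeled peptide.

    n_labels: number of modified groups on the peptide.
    charge: charge state of the observed ion.
    carbons_per_label: 13C atoms per heavy label (assumed to be 6).
    """
    return n_labels * carbons_per_label * C13_MINUS_C12 / charge
```

Under this assumption a singly labeled, singly charged peptide shows a spacing of about 6.02 Da, and a peptide with two labels observed at charge 2+ shows the same m/z spacing.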

Relevance:

80.00%

Publisher:

Abstract:

With the advancement of high-throughput sequencing and the dramatic increase in available genetic data, statistical modeling has become an essential part of the field of molecular evolution. Statistical modeling has resulted in many interesting discoveries in the field, from the detection of highly conserved or diverse regions in a genome to the phylogenetic inference of the evolutionary history of species. Among different types of genome sequences, protein-coding regions are particularly interesting due to their impact on proteins. The building blocks of proteins, i.e. amino acids, are coded by triplets of nucleotides, known as codons. Accordingly, studying the evolution of codons leads to a fundamental understanding of how proteins function and evolve. The current codon models can be classified into three principal groups: mechanistic codon models, empirical codon models and hybrid ones. The mechanistic models attract particular attention due to the clarity of their underlying biological assumptions and parameters. However, they suffer from simplifying assumptions that are required to overcome the burden of computational complexity. The main assumptions applied to the current mechanistic codon models are (a) double and triple substitutions of nucleotides within codons are negligible, (b) there is no mutation variation among nucleotides of a single codon and (c) the HKY nucleotide model is sufficient to capture the essence of transition-transversion rates at the nucleotide level. In this thesis, I develop a framework of mechanistic codon models, named the KCM-based model family framework, based on holding or relaxing the mentioned assumptions. Accordingly, eight different models are proposed from the eight combinations of holding or relaxing the assumptions, from the simplest one that holds all the assumptions to the most general one that relaxes all of them.
The models derived from the proposed framework allow me to investigate the biological plausibility of the three simplifying assumptions on real data sets, as well as to find the model best aligned with the underlying characteristics of the data sets.
French abstract (translated): With the advancement of high-throughput sequencing and the dramatic increase in available genetic data, statistical modeling has become an essential element in the field of molecular evolution. Statistical modeling has led to many interesting discoveries in the field, from the detection of highly conserved or diverse regions in a genome to the phylogenetic inference of the evolutionary history of species. Among the different types of genome sequences, protein-coding regions are particularly interesting because of their impact on proteins. The building blocks of proteins, namely amino acids, are encoded by triplets of nucleotides called codons. Consequently, studying the evolution of codons leads to a fundamental understanding of how proteins function and evolve. Current codon models can be classified into three main groups: mechanistic codon models, empirical codon models and hybrid models. Mechanistic models attract particular attention because of the clarity of their underlying biological assumptions and parameters. However, they suffer from simplifying assumptions that are needed to overcome the burden of computational complexity. The main assumptions made in current mechanistic codon models are: (a) double and triple nucleotide substitutions within codons are negligible, (b) there is no mutation variation among the nucleotides of a single codon, and (c) the HKY nucleotide model is sufficient to capture the essence of transition-transversion rates at the nucleotide level.
In this thesis, I pursue two main objectives. The first objective is to develop a framework of mechanistic codon models, named the KCM-based model family framework, based on holding or relaxing the assumptions mentioned above. Accordingly, eight different models are proposed from the eight combinations of holding or relaxing the assumptions, from the simplest, which holds all of them, to the most general, which relaxes all of them. The models derived from the proposed framework allow us to investigate the biological plausibility of the three simplifying assumptions on real data, as well as to find the model best aligned with the underlying characteristics of the data sets. Our experiments show that in none of the real data sets is holding all three assumptions realistic. This means that using simple models that hold these assumptions can be misleading and can result in inaccurate parameter estimates. The second objective is to develop a generalized mechanistic codon model that relaxes the three simplifying assumptions while remaining computationally efficient, using a matrix operation called the Kronecker product. Our experiments show that, on randomly chosen data sets, the proposed generalized mechanistic codon model outperforms the other codon models with respect to the AICc metric in about half of the data sets. In addition, I show through several experiments that the proposed general model is biologically plausible.
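
Three ingredients of this abstract lend themselves to a small illustration: the eight hold/relax combinations of the three assumptions, the Kronecker product as a matrix operation, and the AICc criterion used for model comparison. The sketch below is generic and makes no claim about how the thesis constructs its models; the assumption labels and function names are hypothetical, and the only link drawn to codons is that three 4x4 nucleotide-level matrices combine into one 64x64 matrix.

```python
from itertools import product

# Hypothetical short labels for assumptions (a), (b) and (c).
ASSUMPTIONS = (
    "no_multiple_substitutions",
    "no_rate_variation_within_codon",
    "hky_is_sufficient",
)


def model_variants():
    """All 2**3 = 8 combinations of holding (True) or relaxing (False)."""
    return [dict(zip(ASSUMPTIONS, flags))
            for flags in product((True, False), repeat=len(ASSUMPTIONS))]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def aicc(log_likelihood, k, n):
    """Corrected Akaike information criterion.

    k: number of free parameters; n: sample size (requires n > k + 1).
    """
    return 2 * k - 2 * log_likelihood + (2 * k * (k + 1)) / (n - k - 1)
```

The first variant returned by `model_variants()` holds all three assumptions and the last relaxes all of them, mirroring the range from the simplest to the most general model. Applying `kron` twice to three 4x4 matrices gives a 64x64 matrix, one row and column per nucleotide triplet. Lower AICc values indicate a better trade-off between fit and parameter count.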

Relevance:

80.00%

Publisher:

Abstract:

Understanding the structure of interphase chromosomes is essential to elucidate regulatory mechanisms of gene expression. In recent years, high-throughput DNA sequencing has expanded the power of chromosome conformation capture (3C) methods that provide information about the reciprocal spatial proximity of chromosomal loci. It has been known since 2012 that the entire chromatin in interphase chromosomes is organized into regions with a strongly increased frequency of internal contacts. These regions, with an average size of ∼1 Mb, were named topological domains. More recent studies demonstrated the presence of unconstrained supercoiling in interphase chromosomes. Using Brownian dynamics simulations, we show here that by including supercoiling in models of topological domains one can reproduce, and thus provide possible explanations for, several experimentally observed characteristics of interphase chromosomes, such as their complex contact maps.

Relevance:

80.00%

Publisher:

Abstract:

BACKGROUND: Fourmidable is an infrastructure to curate and share the emerging genetic, molecular, and functional genomic data and protocols for ants. DESCRIPTION: The Fourmidable assembly pipeline groups nucleotide sequences into clusters before independently assembling each cluster. Subsequently, assembled sequences are annotated via Interproscan and BLAST against general and insect-specific databases. Gene-specific information can be retrieved using gene identifiers, searching for similar sequences or browsing through inferred Gene Ontology annotations. The database will readily scale as ultra-high throughput sequence data and sequences from additional species become available. CONCLUSION: Fourmidable currently houses EST data from two ant species and microarray gene expression data for one of these. Fourmidable is publicly available at http://fourmidable.unil.ch.