59 resultados para SEQUENCING DATA


Relevância:

30.00% 30.00%

Publicador:

Resumo:

We have used massively parallel signature sequencing (MPSS) to sample the transcriptomes of 32 normal human tissues to an unprecedented depth, thus documenting the patterns of expression of almost 20,000 genes with high sensitivity and specificity. The data confirm the widely held belief that differences in gene expression between cell and tissue types are largely determined by transcripts derived from a limited number of tissue-specific genes, rather than by combinations of more promiscuously expressed genes. Expression of a little more than half of all known human genes seems to account for both the common requirements and the specific functions of the tissues sampled. A classification of tissues based on patterns of gene expression largely reproduces classifications based on anatomical and biochemical properties. The unbiased sampling of the human transcriptome achieved by MPSS supports the idea that most human genes have been mapped, if not functionally characterized. This data set should prove useful for the identification of tissue-specific genes, for the study of global changes induced by pathological conditions, and for the definition of a minimal set of genes necessary for basic cell maintenance. The data are available on the Web at http://mpss.licr.org and http://sgb.lynxgen.com.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The recent advances in sequencing technologies have given all microbiology laboratories access to whole genome sequencing. Providing that tools for the automated analysis of sequence data and databases for associated meta-data are developed, whole genome sequencing will become a routine tool for large clinical microbiology laboratories. Indeed, the continuing reduction in sequencing costs and the shortening of the 'time to result' makes it an attractive strategy in both research and diagnostics. Here, we review how high-throughput sequencing is revolutionizing clinical microbiology and the promise that it still holds. We discuss major applications, which include: (i) identification of target DNA sequences and antigens to rapidly develop diagnostic tools; (ii) precise strain identification for epidemiological typing and pathogen monitoring during outbreaks; and (iii) investigation of strain properties, such as the presence of antibiotic resistance or virulence factors. In addition, recent developments in comparative metagenomics and single-cell sequencing offer the prospect of a better understanding of complex microbial communities at the global and individual levels, providing a new perspective for understanding host-pathogen interactions. Being a high-resolution tool, high-throughput sequencing will increasingly influence diagnostics, epidemiology, risk management, and patient care.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Restriction site-associated DNA sequencing (RADseq) provides researchers with the ability to record genetic polymorphism across thousands of loci for nonmodel organisms, potentially revolutionizing the field of molecular ecology. However, as with other genotyping methods, RADseq is prone to a number of sources of error that may have consequential effects for population genetic inferences, and these have received only limited attention in terms of the estimation and reporting of genotyping error rates. Here we use individual sample replicates, under the expectation of identical genotypes, to quantify genotyping error in the absence of a reference genome. We then use sample replicates to (i) optimize de novo assembly parameters within the program Stacks, by minimizing error and maximizing the retrieval of informative loci; and (ii) quantify error rates for loci, alleles and single-nucleotide polymorphisms. As an empirical example, we use a double-digest RAD data set of a nonmodel plant species, Berberis alpina, collected from high-altitude mountains in Mexico.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

With the availability of new generation sequencing technologies, bacterial genome projects have undergone a major boost. Still, chromosome completion needs a costly and time-consuming gap closure, especially when containing highly repetitive elements. However, incomplete genome data may be sufficiently informative to derive the pursued information. For emerging pathogens, i.e. newly identified pathogens, lack of release of genome data during gap closure stage is clearly medically counterproductive. We thus investigated the feasibility of a dirty genome approach, i.e. the release of unfinished genome sequences to develop serological diagnostic tools. We showed that almost the whole genome sequence of the emerging pathogen Parachlamydia acanthamoebae was retrieved even with relatively short reads from Genome Sequencer 20 and Solexa. The bacterial proteome was analyzed to select immunogenic proteins, which were then expressed and used to elaborate the first steps of an ELISA. This work constitutes the proof of principle for a dirty genome approach, i.e. the use of unfinished genome sequences of pathogenic bacteria, coupled with proteomics to rapidly identify new immunogenic proteins useful to develop in the future specific diagnostic tests such as ELISA, immunohistochemistry and direct antigen detection. Although applied here to an emerging pathogen, this combined dirty genome sequencing/proteomic approach may be used for any pathogen for which better diagnostics are needed. These genome sequences may also be very useful to develop DNA based diagnostic tests. All these diagnostic tools will allow further evaluations of the pathogenic potential of this obligate intracellular bacterium.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Next-generation sequencing offers an unprecedented opportunity to jointly analyze cellular and viral transcriptional activity without prerequisite knowledge of the nature of the transcripts. SupT1 cells were infected with a vesicular stomatitis virus G envelope protein (VSV-G)-pseudotyped HIV vector. At 24 h postinfection, both cellular and viral transcriptomes were analyzed by serial analysis of gene expression followed by high-throughput sequencing (SAGE-Seq). Read mapping resulted in 33 to 44 million tags aligning with the human transcriptome and 0.23 to 0.25 million tags aligning with the genome of the HIV-1 vector. Thus, at peak infection, 1 transcript in 143 is of viral origin (0.7%), including a small component of antisense viral transcription. Of the detected cellular transcripts, 826 (2.3%) were differentially expressed between mock- and HIV-infected samples. The approach also assessed whether HIV-1 infection modulates the expression of repetitive elements or endogenous retroviruses. We observed very active transcription of these elements, with 1 transcript in 237 being of such origin, corresponding on average to 123,123 reads in mock-infected samples (0.40%) and 129,149 reads in HIV-1-infected samples (0.45%) mapping to the genomic Repbase repository. This analysis highlights key details in the generation and interpretation of high-throughput data in the setting of HIV-1 cellular infection.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Since the turn of the century the complete genome sequence of just one mouse strain, C57BL/6J, has been available. Knowing the sequence of this strain has enabled large-scale forward genetic screens to be performed, the creation of an almost complete set of embryonic stem (ES) cell lines with targeted alleles for protein-coding genes, and the generation of a rich catalog of mouse genomic variation. However, many experiments that use other common laboratory mouse strains have been hindered by a lack of whole-genome sequence data for these strains. The last 5 years has witnessed a revolution in DNA sequencing technologies. Recently, these technologies have been used to expand the repertoire of fully sequenced mouse genomes. In this article we review the main findings of these studies and discuss how the sequence of mouse genomes is helping pave the way from sequence to phenotype. Finally, we discuss the prospects for using de novo assembly techniques to obtain high-quality assembled genome sequences of these laboratory mouse strains, and what advances in sequencing technologies may be required to achieve this goal.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

MicroRNAs (miRNAs) constitute an important class of gene regulators. While models have been proposed to explain their appearance and expansion, the validation of these models has been difficult due to the lack of comparative studies. Here, we analyze miRNA evolutionary patterns in two mammals, human and mouse, in relation to the age of miRNA families. In this comparative framework, we confirm some predictions of previously advanced models of miRNA evolution, e.g. that miRNAs arise more frequently de novo than by duplication, or that the number of protein-coding gene targeted by miRNAs decreases with evolutionary time. We also corroborate that miRNAs display an increase in expression level with evolutionary time, however we show that this relation is largely tissue-dependent, and especially low in embryonic or nervous tissues. We identify a bias of tag-sequencing techniques regarding the assessment of breadth of expression, leading us, contrary to predictions, to find more tissue-specific expression of older miRNAs. Together, our results refine the models used so far to depict the evolution of miRNA genes. They underline the role of tissue-specific selective forces on the evolution of miRNAs, as well as the potential co-evolution patterns between miRNAs and the protein-coding genes they target.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) experiments are widely used to determine, within entire genomes, the occupancy sites of any protein of interest, including, for example, transcription factors, RNA polymerases, or histones with or without various modifications. In addition to allowing the determination of occupancy sites within one cell type and under one condition, this method allows, in principle, the establishment and comparison of occupancy maps in various cell types, tissues, and conditions. Such comparisons require, however, that samples be normalized. Widely used normalization methods that include a quantile normalization step perform well when factor occupancy varies at a subset of sites, but may miss uniform genome-wide increases or decreases in site occupancy. We describe a spike adjustment procedure (SAP) that, unlike commonly used normalization methods intervening at the analysis stage, entails an experimental step prior to immunoprecipitation. A constant, low amount from a single batch of chromatin of a foreign genome is added to the experimental chromatin. This "spike" chromatin then serves as an internal control to which the experimental signals can be adjusted. We show that the method improves similarity between replicates and reveals biological differences including global and largely uniform changes.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

BACKGROUND: Retinal dystrophies (RD) are a group of hereditary diseases that lead to debilitating visual impairment and are usually transmitted as a Mendelian trait. Pathogenic mutations can occur in any of the 100 or more disease genes identified so far, making molecular diagnosis a rather laborious process. In this work we explored the use of whole exome sequencing (WES) as a tool for identification of RD mutations, with the aim of assessing its applicability in a diagnostic context. METHODOLOGY/PRINCIPAL FINDINGS: We ascertained 12 Spanish families with seemingly recessive RD. All of the index patients underwent mutational pre-screening by chip-based sequence hybridization and resulted to be negative for known RD mutations. With the exception of one pedigree, to simulate a standard diagnostic scenario we processed by WES only the DNA from the index patient of each family, followed by in silico data analysis. We successfully identified causative mutations in patients from 10 different families, which were later verified by Sanger sequencing and co-segregation analyses. Specifically, we detected pathogenic DNA variants (∼50% novel mutations) in the genes RP1, USH2A, CNGB3, NMNAT1, CHM, and ABCA4, responsible for retinitis pigmentosa, Usher syndrome, achromatopsia, Leber congenital amaurosis, choroideremia, or recessive Stargardt/cone-rod dystrophy cases. CONCLUSIONS/SIGNIFICANCE: Despite the absence of genetic information from other family members that could help excluding nonpathogenic DNA variants, we could detect causative mutations in a variety of genes known to represent a wide spectrum of clinical phenotypes in 83% of the patients analyzed. Considering the constant drop in costs for human exome sequencing and the relative simplicity of the analyses made, this technique could represent a valuable tool for molecular diagnostics or genetic research, even in cases for which no genotypes from family members are available.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

With the widespread availability of high-throughput sequencing technologies, sequencing projects have become pervasive in the molecular life sciences. The huge bulk of data generated daily must be analyzed further by biologists with skills in bioinformatics and by "embedded bioinformaticians," i.e., bioinformaticians integrated in wet lab research groups. Thus, students interested in molecular life sciences must be trained in the main steps of genomics: sequencing, assembly, annotation and analysis. To reach that goal, a practical course has been set up for master students at the University of Lausanne: the "Sequence a genome" class. At the beginning of the academic year, a few bacterial species whose genome is unknown are provided to the students, who sequence and assemble the genome(s) and perform manual annotation. Here, we report the progress of the first class from September 2010 to June 2011 and the results obtained by seven master students who specifically assembled and annotated the genome of Estrella lausannensis, an obligate intracellular bacterium related to Chlamydia. The draft genome of Estrella is composed of 29 scaffolds encompassing 2,819,825 bp that encode for 2233 putative proteins. Estrella also possesses a 9136 bp plasmid that encodes for 14 genes, among which we found an integrase and a toxin/antitoxin module. Like all other members of the Chlamydiales order, Estrella possesses a highly conserved type III secretion system, considered as a key virulence factor. The annotation of the Estrella genome also allowed the characterization of the metabolic abilities of this strictly intracellular bacterium. Altogether, the students provided the scientific community with the Estrella genome sequence and a preliminary understanding of the biology of this recently-discovered bacterial genus, while learning to use cutting-edge technologies for sequencing and to perform bioinformatics analyses.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We performed exome sequencing to detect somatic mutations in protein-coding regions in seven melanoma cell lines and donor-matched germline cells. All melanoma samples had high numbers of somatic mutations, which showed the hallmark of UV-induced DNA repair. Such a hallmark was absent in tumor sample-specific mutations in two metastases derived from the same individual. Two melanomas with non-canonical BRAF mutations harbored gain-of-function MAP2K1 and MAP2K2 (MEK1 and MEK2, respectively) mutations, resulting in constitutive ERK phosphorylation and higher resistance to MEK inhibitors. Screening a larger cohort of individuals with melanoma revealed the presence of recurring somatic MAP2K1 and MAP2K2 mutations, which occurred at an overall frequency of 8%. Furthermore, missense and nonsense somatic mutations were frequently found in three candidate melanoma genes, FAT4, LRP1B and DSC1.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We describe an original case of disseminated infection with Histoplasma capsulatum (Hc) var. duboisii in an African patient with AIDS who migrated to Switzerland. The diagnosis of histoplasmosis was suggested using direct examination of tissues and confirmed in 24 h with a panfungal polymerase chain reaction assay. The variety duboisii of Hc was established using DNA sequencing of the polymorphic genomic region OLE. Molecular tools allow diagnosis of histoplasmosis in 24 h, which is drastically shorter than culture procedures.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

BACKGROUND: Finding genes that are differentially expressed between conditions is an integral part of understanding the molecular basis of phenotypic variation. In the past decades, DNA microarrays have been used extensively to quantify the abundance of mRNA corresponding to different genes, and more recently high-throughput sequencing of cDNA (RNA-seq) has emerged as a powerful competitor. As the cost of sequencing decreases, it is conceivable that the use of RNA-seq for differential expression analysis will increase rapidly. To exploit the possibilities and address the challenges posed by this relatively new type of data, a number of software packages have been developed especially for differential expression analysis of RNA-seq data. RESULTS: We conducted an extensive comparison of eleven methods for differential expression analysis of RNA-seq data. All methods are freely available within the R framework and take as input a matrix of counts, i.e. the number of reads mapping to each genomic feature of interest in each of a number of samples. We evaluate the methods based on both simulated data and real RNA-seq data. CONCLUSIONS: Very small sample sizes, which are still common in RNA-seq experiments, impose problems for all evaluated methods and any results obtained under such conditions should be interpreted with caution. For larger sample sizes, the methods combining a variance-stabilizing transformation with the 'limma' method for differential expression analysis perform well under many different conditions, as does the nonparametric SAMseq method.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

INTRODUCTION: Diverse microarray and sequencing technologies have been widely used to characterise the molecular changes in malignant epithelial cells in breast cancers. Such gene expression studies to identify markers and targets in tumour cells are, however, compromised by the cellular heterogeneity of solid breast tumours and by the lack of appropriate counterparts representing normal breast epithelial cells. METHODS: Malignant neoplastic epithelial cells from primary breast cancers and luminal and myoepithelial cells isolated from normal human breast tissue were isolated by immunomagnetic separation methods. Pools of RNA from highly enriched preparations of these cell types were subjected to expression profiling using massively parallel signature sequencing (MPSS) and four different genome wide microarray platforms. Functional related transcripts of the differential tumour epithelial transcriptome were used for gene set enrichment analysis to identify enrichment of luminal and myoepithelial type genes. Clinical pathological validation of a small number of genes was performed on tissue microarrays. RESULTS: MPSS identified 6,553 differentially expressed genes between the pool of normal luminal cells and that of primary tumours substantially enriched for epithelial cells, of which 98% were represented and 60% were confirmed by microarray profiling. Significant expression level changes between these two samples detected only by microarray technology were shown by 4,149 transcripts, resulting in a combined differential tumour epithelial transcriptome of 8,051 genes. Microarray gene signatures identified a comprehensive list of 907 and 955 transcripts whose expression differed between luminal epithelial cells and myoepithelial cells, respectively. Functional annotation and gene set enrichment analysis highlighted a group of genes related to skeletal development that were associated with the myoepithelial/basal cells and upregulated in the tumour sample. One of the most highly overexpressed genes in this category, that encoding periostin, was analysed immunohistochemically on breast cancer tissue microarrays and its expression in neoplastic cells correlated with poor outcome in a cohort of poor prognosis estrogen receptor-positive tumours. CONCLUSION: Using highly enriched cell populations in combination with multiplatform gene expression profiling studies, a comprehensive analysis of molecular changes between the normal and malignant breast tissue was established. This study provides a basis for the identification of novel and potentially important targets for diagnosis, prognosis and therapy in breast cancer.