879 results for next-generation sequencing
Abstract:
The gene SNRNP200 is composed of 45 exons and encodes a protein essential for pre-mRNA splicing, the 200 kDa helicase hBrr2. Two mutations in SNRNP200 have recently been associated with autosomal dominant retinitis pigmentosa (adRP), a retinal degenerative disease, in two families from China. In this work we analyzed the entire 35-Kb SNRNP200 genomic region in a cohort of 96 unrelated North American patients with adRP. To complete this large-scale sequencing project, we performed ultra high-throughput sequencing of pooled, untagged PCR products. We then validated the detected DNA changes by Sanger sequencing of individual samples from this cohort and from an additional one of 95 patients. One of the two previously known mutations (p.S1087L) was identified in 3 patients, while 4 new missense changes (p.R681C, p.R681H, p.V683L, p.Y689C) affecting highly conserved codons were identified in 6 unrelated individuals, indicating that the prevalence of SNRNP200-associated adRP is relatively high. We also took advantage of this research to evaluate the pool-and-sequence method, especially with respect to the generation of false positive and negative results. We conclude that, although this strategy can be adopted for rapid discovery of new disease-associated variants, it still requires extensive validation to be used in routine DNA screenings. © 2011 Wiley-Liss, Inc.
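A back-of-the-envelope calculation helps show why false negatives are a concern when untagged products from many patients are pooled. The sketch below assumes equimolar pooling of diploid samples; that assumption, the function names, and the example depth are illustrative and not taken from the study.

```python
def expected_variant_fraction(n_patients: int, n_het_carriers: int = 1) -> float:
    """Expected fraction of reads carrying a variant in an equimolar pool.

    Assumption: every patient is diploid and contributes equally, so the
    pool holds 2 * n_patients alleles and each heterozygous carrier
    contributes exactly one variant allele.
    """
    if n_patients <= 0:
        raise ValueError("n_patients must be positive")
    return n_het_carriers / (2 * n_patients)


def expected_variant_reads(depth: int, n_patients: int, n_het_carriers: int = 1) -> float:
    """Expected number of variant-supporting reads at a given pooled depth."""
    return depth * expected_variant_fraction(n_patients, n_het_carriers)


# A single heterozygous carrier among 96 pooled patients: 1 allele in 192.
print(f"{expected_variant_fraction(96):.4%}")
# Hypothetical pooled depth of 19,200 reads at the variant position.
print(expected_variant_reads(19_200, 96))
```

Under these assumptions a singleton variant is expected in only about 0.52% of reads, which is in the same range as typical per-base error rates of short-read platforms. This is one reason individual Sanger validation remains necessary.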
Abstract:
Since the turn of the century the complete genome sequence of just one mouse strain, C57BL/6J, has been available. Knowing the sequence of this strain has enabled large-scale forward genetic screens to be performed, the creation of an almost complete set of embryonic stem (ES) cell lines with targeted alleles for protein-coding genes, and the generation of a rich catalog of mouse genomic variation. However, many experiments that use other common laboratory mouse strains have been hindered by a lack of whole-genome sequence data for these strains. The last 5 years have witnessed a revolution in DNA sequencing technologies. Recently, these technologies have been used to expand the repertoire of fully sequenced mouse genomes. In this article we review the main findings of these studies and discuss how the sequence of mouse genomes is helping pave the way from sequence to phenotype. Finally, we discuss the prospects for using de novo assembly techniques to obtain high-quality assembled genome sequences of these laboratory mouse strains, and what advances in sequencing technologies may be required to achieve this goal.
Abstract:
Gastric (GC) and breast (BrC) cancer are two of the most common and deadly tumours. Different lines of evidence suggest a possible causative role of viral infections in both GC and BrC. Whole-genome sequencing (WGS) technologies allow searching for viral agents in tissues of patients with cancer. These technologies have already contributed to establishing virus-cancer associations as well as to the discovery of new tumour viruses. The objective of this study was to document possible associations of viral infection with GC and BrC in Mexican patients. To get an idea of cost-effective conditions for experimental sequencing, we first carried out an in silico simulation of WGS. The next-generation platform Illumina GAIIx was then used to sequence GC and BrC tumour samples. While we did not find viral sequences in tissues from BrC patients, multiple reads matching Epstein-Barr virus (EBV) sequences were found in GC tissues. An end-point polymerase chain reaction confirmed an enrichment of EBV sequences in one of the GC samples sequenced, validating the next-generation sequencing-bioinformatics pipeline.
Abstract:
High-fidelity 'proofreading' polymerases are often used in library construction for next-generation sequencing projects, in an effort to minimize errors in the resulting sequence data. The increased template fidelity of these polymerases can come at the cost of reduced template specificity, and library preparation methods based on the AFLP technique may be particularly susceptible. Here, we compare AFLP profiles generated with standard Taq and two versions of a high-fidelity polymerase. We find that Taq produces fewer and brighter peaks than high-fidelity polymerase, suggesting that Taq performs better at selectively amplifying templates that exactly match the primer sequences. Because the higher accuracy of proofreading polymerases remains important for sequencing applications, we suggest that it may be more effective to use alternative library preparation methods.
Abstract:
In the past 5 years, "next-generation" sequencing (NGS) technologies have transformed genomics by delivering fast, inexpensive and accurate genome information, changing the way we think about scientific approaches in basic, applied and clinical research. The inexpensive production of large volumes of sequence data is the main advantage over the automated Sanger method, making this new technology useful for many applications. In this chapter, a brief technical review of NGS technologies is given, along with the keys to NGS success and a broad range of applications for NGS technologies.
Abstract:
Mitochondrial DNA (mtDNA), a maternally inherited 16.6-kb molecule crucial for energy production, is implicated in numerous human traits and disorders. It has been hypothesized that mutations in the mtDNA may contribute to the complex genetic basis of schizophrenia, given the evidence of maternal inheritance and the presence of schizophrenia symptoms in patients affected by a mitochondrial disorder related to an mtDNA mutation. The present project aims to study the association between mtDNA variants and an increased risk of schizophrenia in a cohort of patients and controls from the same population. The entire mtDNA of 55 schizophrenia patients with an apparent maternal transmission of the disease and 38 controls was sequenced by next-generation sequencing (Ion Torrent PGM, Life Technologies) and compared to the reference sequence. The current method for establishing mtDNA haplotypes is Sanger sequencing, which is laborious, time-consuming, and expensive. With the emergence of next-generation sequencing technologies, this sequencing process can be carried out much more quickly and cost-efficiently. We identified 14 variants that have not been previously reported. Two of them were missense variants (MTATP6 p.V113M and MTND5 p.F334L), three were in rRNA-encoding genes and one was in a tRNA-encoding gene. No significant differences were found in the number of variants between the two groups. We found that the sequence alignment algorithm employed to align NGS reads played a significant role in the analysis of the data and the resulting mtDNA haplotypes. Further development of the bioinformatics analysis and annotation step would be desirable to facilitate the application of NGS in mtDNA analysis.
Abstract:
Copy number variations (CNVs) affect a wide range of phenotypic traits; however, CNVs in or near segmental duplication regions are often intractable. Using a read depth approach based on next-generation sequencing, we examined genome-wide copy number differences among five taurine (three Angus, one Holstein, and one Hereford) and one indicine (Nelore) cattle. Within mapped chromosomal sequence, we identified 1265 CNV regions comprising ~55.6 Mbp of sequence, 476 of which (~38%) have not previously been reported. We validated this sequence-based CNV call set with array comparative genomic hybridization (aCGH), quantitative PCR (qPCR), and fluorescent in situ hybridization (FISH), achieving a validation rate of 82% and a false positive rate of 8%. We further estimated absolute copy numbers for genomic segments and annotated genes in each individual. Surveys of the top 25 most variable genes revealed that the Nelore individual had the lowest copy numbers in 13 cases (~52%, χ² test; P-value < 0.05). In contrast, genes related to pathogen- and parasite-resistance, such as CATHL4 and ULBP17, were highly duplicated in the Nelore individual relative to the taurine cattle, while genes involved in lipid transport and metabolism, including APOL3 and FABP2, were highly duplicated in the beef breeds. These CNV regions also harbor genes like BPIFA2A (BSP30A) and WC1, suggesting that some CNVs may be associated with breed-specific differences in adaptation, health, and production traits. By providing the first individualized cattle CNV and segmental duplication maps and genome-wide gene copy number estimates, we enable future CNV studies into highly duplicated regions in the cattle genome.
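The core idea behind a read depth approach can be sketched in a few lines: the depth in a window is scaled by the depth of windows assumed to be diploid. The code below is a minimal illustration under that assumption only. The function names and thresholds are hypothetical, and published pipelines typically also correct for GC content and mappability, which this sketch omits.

```python
from statistics import mean


def estimate_copy_number(window_depths, control_depths, ploidy=2):
    """Scale per-window read depth to an absolute copy number estimate.

    Assumption: control_depths come from windows known to be present at
    the normal diploid copy number, so their mean depth corresponds to
    `ploidy` copies.
    """
    baseline = mean(control_depths)
    if baseline <= 0:
        raise ValueError("control depth must be positive")
    return [ploidy * depth / baseline for depth in window_depths]


def classify(copy_numbers, loss_below=1.5, gain_above=2.5):
    """Label each window as loss, gain or normal (illustrative thresholds)."""
    labels = []
    for cn in copy_numbers:
        if cn < loss_below:
            labels.append("loss")
        elif cn > gain_above:
            labels.append("gain")
        else:
            labels.append("normal")
    return labels


# Toy example: three windows against control windows with 30x mean depth.
cn = estimate_copy_number([30, 60, 15], [30, 30, 30])
print(cn, classify(cn))
```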
Abstract:
Background: Next-generation sequencing (NGS) allows for sampling numerous viral variants from infected patients. This provides a novel opportunity to represent and study the mutational landscape of Hepatitis C Virus (HCV) within a single host. Results: Intra-host variants of the HCV E1/E2 region were extensively sampled from 58 chronically infected patients. After NGS error correction, the average number of reads and variants obtained from each sample were 3202 and 464, respectively. The distance between each pair of variants was calculated and networks were created for each patient, where each node is a variant and two nodes are connected by a link if the nucleotide distance between them is 1. The work focused on large components having > 5% of all reads, which on average account for 93.7% of all reads found in a patient. The distance between any two variants calculated over the component correlated strongly with nucleotide distances (r = 0.9499; p = 0.0001), a better correlation than the one obtained with Neighbour-Joining trees (r = 0.7624; p = 0.0001). In each patient, components were well separated, with the average distance between components (6.53%) being 10 times greater than within each component (0.68%). The ratio of nonsynonymous to synonymous changes was calculated and some patients (6.9%) showed a mixture of networks under strong negative and positive selection. All components were robust to in silico stochastic sampling; even after randomly removing 85% of all reads, the largest connected component in the new subsample still involved 82.4% of remaining nodes. In vitro sampling showed that 93.02% of components present in the original sample were also found in experimental replicas, with 81.6% of reads found in both. When syringe-sharing transmission events were simulated, 91.2% of all simulated transmission events seeded all components present in the source. Conclusions: Most intra-host variants are organized into distinct single-mutation components that are well separated from each other, reflect genetic distances between viral variants, are robust to sampling and reproducible, and are likely seeded during transmission events. Facilitated by NGS, large components offer a novel evolutionary framework for genetic analysis of intra-host viral populations and understanding transmission, immune escape and drug resistance.
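The network construction described in the Results can be sketched as follows: variants are nodes, two nodes are linked when they differ at exactly one nucleotide, and connected components holding more than 5% of reads are kept. This is a minimal illustration, not the authors' implementation. It assumes aligned variants of equal length, the function names and toy data are hypothetical, and the all-pairs comparison is quadratic in the number of variants.

```python
from collections import deque


def hamming(a: str, b: str) -> int:
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def one_step_components(variants: dict) -> list:
    """Connected components of the network linking variants at distance 1.

    `variants` maps each distinct sequence to its read count.
    """
    seqs = list(variants)
    adjacency = {s: [] for s in seqs}
    for i, a in enumerate(seqs):
        for b in seqs[i + 1:]:
            if hamming(a, b) == 1:
                adjacency[a].append(b)
                adjacency[b].append(a)

    seen, components = set(), []
    for start in seqs:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            current = queue.popleft()
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def large_components(variants: dict, min_read_fraction: float = 0.05) -> list:
    """Keep components holding more than `min_read_fraction` of all reads."""
    total = sum(variants.values())
    return [
        c for c in one_step_components(variants)
        if sum(variants[s] for s in c) / total > min_read_fraction
    ]


# Toy data: sequence -> read count (hypothetical, for illustration only).
toy = {"AAAA": 50, "AAAT": 30, "AATT": 10, "GGGG": 8, "GGGC": 1, "CCCC": 1}
print(large_components(toy))
```

In the toy data the singleton `CCCC` forms its own component with 1% of reads and is therefore filtered out, while the two larger components are retained.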
Abstract:
HLA-E is a non-classical Human Leucocyte Antigen class I gene with immunomodulatory properties. Whereas HLA-E expression usually occurs at low levels, it is widely distributed amongst human tissues, has the ability to bind self and non-self antigens and to interact with NK cells and T lymphocytes, being important for immunosurveillance and also for fighting against infections. HLA-E is usually the most conserved locus among all class I genes. However, most of the previous studies evaluating HLA-E variability sequenced only a few exons or genotyped known polymorphisms. Here we report a strategy to evaluate HLA-E variability by next-generation sequencing (NGS) that might be applied to other HLA loci, and we present the HLA-E haplotype diversity considering the segment encoding the entire HLA-E mRNA (including the 5'UTR, introns and the 3'UTR) in two African population samples, Susu from Guinea-Conakry and Lobi from Burkina Faso. Our results indicate that (a) the HLA-E gene is indeed conserved, encoding mainly two different protein molecules; (b) Africans do present several unknown HLA-E alleles carrying synonymous mutations; (c) the HLA-E 3'UTR is quite polymorphic and (d) haplotypes in the HLA-E 3'UTR are in close association with HLA-E coding alleles. NGS has proved to be an important tool for data generation in future studies evaluating variability in non-classical MHC genes.
Abstract:
Next-generation sequencing techniques are a powerful tool for many applications, especially since their costs began to fall and the quality of their data began to improve. One application of sequencing is metagenomics, that is, the analysis of microorganisms within a given environment, such as the gut. In this field, sequencing has made it possible to sample bacterial species that could not be accessed with traditional culture techniques. The study of gut bacterial populations is very important because these populations are altered as an effect, but also as a cause, of numerous diseases, such as metabolic ones (obesity, type 2 diabetes, etc.). In this work we started from next-generation sequencing data of the gut microbiota of 5 animals (16S rRNA sequencing) [Jeraldo et al.]. We applied optimized algorithms (UCLUST) to cluster the generated sequences into OTUs (Operational Taxonomic Units), which correspond to clusters of bacterial species at a given taxonomic level. We then applied the master-equation ecological theory developed by [Volkov et al.] to describe the relative species abundance (RSA) distribution of our samples. RSA is by now a validated tool for studying the biodiversity of ecological systems, and it shows a transition from a log-series to a lognormal shape when moving from small, isolated local communities to larger metacommunities made up of several local communities that can interact in some way. We showed that the OTUs of gut bacterial populations constitute an ecological system that follows these same rules when obtained using different similarity thresholds in the clustering procedure. We therefore expect that this result can be exploited to understand the dynamics of bacterial populations and hence how they vary in the presence of particular diseases.
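An RSA distribution is commonly visualised as a Preston-style histogram, in which OTUs are grouped into abundance classes that double in width. The sketch below shows one simplified way to build such a histogram from per-OTU read counts. The binning rule (class k holds abundances from 2^k up to but not including 2^(k+1)), the function name, and the toy counts are assumptions for illustration and are not taken from the work described above.

```python
from collections import Counter


def preston_octaves(otu_counts):
    """Count OTUs per doubling abundance class.

    Class k contains OTUs whose abundance n satisfies 2**k <= n < 2**(k + 1).
    For a positive integer n, that class index is n.bit_length() - 1.
    OTUs with zero reads are ignored.
    """
    octaves = Counter()
    for n in otu_counts:
        if n > 0:
            octaves[n.bit_length() - 1] += 1
    if not octaves:
        return []
    # Fill empty classes with zeros so the list index equals the class.
    return [octaves.get(k, 0) for k in range(max(octaves) + 1)]


# Toy per-OTU read counts (hypothetical).
print(preston_octaves([1, 1, 1, 2, 3, 4, 7, 8, 100]))  # [3, 2, 2, 1, 0, 0, 1]
```

A histogram that decreases steadily from the first class is the pattern expected of a log-series, whereas a lognormal shape shows an interior mode. Distinguishing the two rigorously requires fitting both models, which this sketch does not attempt.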
Abstract:
In chronic myeloid leukemia (CML) and Philadelphia-positive acute lymphoblastic leukemia (Ph+ ALL) patients resistant to tyrosine kinase inhibitors (TKIs), BCR-ABL kinase domain (KD) mutation status is an essential component of the therapeutic decision algorithm. The recent development of the Ultra-Deep Sequencing (UDS) approach has opened the way to a more accurate characterization of the mutant clones surviving TKIs, combining assay sensitivity and throughput. We set up and validated a UDS-based assay for BCR-ABL KD mutation screening in order to i) resolve qualitatively and quantitatively the complexity and the clonal structure of mutated populations surviving TKIs, ii) study the dynamics of expansion of mutated clones in relation to TKI therapy, and iii) assess whether UDS may allow more sensitive detection of emerging clones harboring critical 2GTKIs-resistant mutations, predicting an impending relapse earlier than Sanger sequencing (SS). UDS was performed on a Roche GS Junior instrument, according to an amplicon sequencing design and protocol set up and validated in the framework of the IRON-II (Interlaboratory Robustness of Next-Generation Sequencing) international consortium. Samples from CML and Ph+ ALL patients who had developed resistance to one or multiple TKIs and had been collected at regular time-points during treatment were selected for this study. Our results indicate the technical feasibility, accuracy and robustness of our UDS-based BCR-ABL KD mutation screening approach. UDS was found to provide a more accurate picture of BCR-ABL KD mutation status, both in terms of presence/absence of mutations and in terms of clonal complexity, and showed that BCR-ABL KD mutations detected by SS are only the "tip of the iceberg". In addition, UDS may reliably pick up 2GTKIs-resistant mutations earlier than SS in a significantly greater proportion of patients. The enhanced sensitivity, as well as the possibility of identifying low-level mutations, points to the UDS-based approach as an ideal alternative to conventional sequencing for BCR-ABL KD mutation screening in TKI-resistant Ph+ leukemia patients.
Abstract:
Pediatric acute myeloid leukemia (AML) is a molecularly heterogeneous disease that arises from genetic alterations in pathways that regulate self-renewal and myeloid differentiation. While the majority of patients carry recurrent chromosomal translocations, almost 20% of childhood AML cases do not show any recognizable cytogenetic alteration and are defined as cytogenetically normal (CN)-AML. CN-AML patients have always shown great variability in response to therapy and overall outcome, underlining the presence of unknown genetic changes that are not detectable by conventional analyses but are relevant for the pathogenesis and outcome of AML. The development of novel genome-wide techniques such as next-generation sequencing has tremendously improved our ability to interrogate the cancer genome. Against this background, the aim of this research study was to investigate, through whole-transcriptome sequencing (RNA-seq), the mutational landscape of pediatric CN-AML patients negative for all the currently known somatic mutations reported in AML. RNA-seq performed on diagnostic leukemic blasts from 19 pediatric CN-AML cases revealed a considerable incidence of cryptic chromosomal rearrangements, with the identification of 21 putative fusion genes. Several of the fusion genes identified in this study are recurrent and might have prognostic and/or therapeutic relevance. A paradigmatic example is the CBFA2T3-GLIS2 fusion, which has been demonstrated to be a common alteration in pediatric CN-AML, predicting poor outcome. Important findings have also been obtained in the identification of novel therapeutic targets. On the one hand, the identification of the NUP98-JARID1A fusion suggests the use of disulfiram; on the other, we describe alterations activating tyrosine kinases, providing functional data supporting the use of tyrosine kinase inhibitors to specifically inhibit leukemia cells. This study provides new insights into the genetic alterations underlying pediatric AML, defines novel prognostic markers and putative therapeutic targets, and prospectively ensures correct risk stratification and risk-adapted therapy for the "all-neg" AML subgroup as well.