846 results for Whole genome sequencing
Abstract:
Progress in agricultural and environmental technologies is hampered by a slower rate of gene discovery in plants than in animals. The vast pool of genes in plants, however, will be an important resource for insertion of genes, via biotechnological procedures, into an array of plants, generating unique germ plasms not achievable by conventional breeding. It has recently become clear that genomes of grasses have evolved in a manner analogous to Lego blocks. Large chromosome segments have been reshuffled and stuffer pieces added between genes. Although some genomes have become very large, the genome with the fewest stuffer pieces, the rice genome, is the Rosetta Stone of all the bigger grass genomes. This means that sequencing the rice genome as the anchor genome of the grasses will provide instantaneous access to the same genes in the same relative physical position in other grasses (e.g., corn and wheat), without the need to sequence each of these genomes independently. (i) The sequencing of the entire genome of rice as the anchor genome for the grasses will accelerate plant gene discovery in many important crops (e.g., corn, wheat, and rice) by several orders of magnitude and reduce research and development costs for government and industry at a faster pace. (ii) Costs for sequencing entire genomes have come down significantly. The rice genome is only 12% of the size of the human or the corn genome, and technology improvements by the human genome project are completely transferable, translating into another 50% reduction of the costs. (iii) The physical mapping of the rice genome by a group of Japanese researchers provides a jump start for sequencing the genome and forming an international consortium. Otherwise, other countries would do it alone and own proprietary positions.
Abstract:
The genetic basis for virulence in influenza virus is largely unknown. To explore the mutational basis for increased virulence in the lung, the H3N2 prototype clinical isolate, A/HK/1/68, was adapted to the mouse. Genomic sequencing provided the first demonstration, to our knowledge, that a group of 11 mutations can convert an avirulent virus to a virulent variant that can kill at a minimal dose. Thirteen of the 14 amino acid substitutions (93%) detected among clonal isolates were likely instrumental in adaptation because of their positive selection, location in functional regions, and/or independent occurrence in other virulent influenza viruses. Mutations in virulent variants repeatedly involved nuclear localization signals and sites of protein and RNA interaction, implicating them as novel modulators of virulence. Mouse-adapted variants with the same hemagglutinin mutations possessed different pH optima of fusion, indicating that fusion activity of hemagglutinin can be modulated by other viral genes. Experimental adaptation resulted in the selection of three mutations that were in common with the virulent human H5N1 isolate A/HK/156/97 and that may be instrumental in its extreme virulence. Analysis of viral adaptation by serial passage appears to provide the identification of biologically relevant mutations.
Abstract:
Microarrays containing 1046 human cDNAs of unknown sequence were printed on glass with high-speed robotics. These 1.0-cm2 DNA "chips" were used to quantitatively monitor differential expression of the cognate human genes using a highly sensitive two-color hybridization assay. Array elements that displayed differential expression patterns under given experimental conditions were characterized by sequencing. The identification of known and novel heat shock and phorbol ester-regulated genes in human T cells demonstrates the sensitivity of the assay. Parallel gene analysis with microarrays provides a rapid and efficient method for large-scale human gene discovery.
Abstract:
The genome of the pufferfish (Fugu rubripes) (400 Mb) is approximately 7.5 times smaller than the human genome, but it has a similar gene repertoire to that of man. If regions of the two genomes exhibited conservation of gene order (i.e., were syntenic), it should be possible to reduce dramatically the effort required for identification of candidate genes in human disease loci by sequencing syntenic regions of the compact Fugu genome. We have demonstrated that three genes (dihydrolipoamide succinyltransferase, S31iii125, and S20i15), which are linked to FOS in the familial Alzheimer disease locus (AD3) on human chromosome 14, have homologues in the Fugu genome adjacent to Fugu cFOS. The relative gene order of cFOS, S31iii125, and S20i15 was the same in both genomes, but in Fugu these three genes lay within a 12.4-kb region, compared to >600 kb in the human AD3 locus. These results demonstrate the conservation of synteny between the genomes of Fugu and man and highlight the utility of this approach for sequence-based identification of genes in human disease loci.
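The size comparison in this abstract can be checked with a few lines of arithmetic. The following is a minimal sketch and not part of the original study: the function and constant names are illustrative, and the human genome figure is simply what the abstract's own ratio implies, not an independent measurement. Because the human interval is reported only as ">600 kb", the compaction factor is a lower bound.

```python
# Figures quoted in the abstract; all names below are illustrative assumptions.
FUGU_GENOME_MB = 400.0       # stated Fugu genome size
SIZE_RATIO = 7.5             # human genome is ~7.5 times larger
FUGU_REGION_KB = 12.4        # span of cFOS, S31iii125 and S20i15 in Fugu
HUMAN_REGION_MIN_KB = 600.0  # human AD3 locus is reported as ">600 kb"


def implied_human_genome_mb(fugu_mb=FUGU_GENOME_MB, ratio=SIZE_RATIO):
    """Human genome size implied by the abstract's ratio, in Mb."""
    return fugu_mb * ratio


def min_compaction(human_kb=HUMAN_REGION_MIN_KB, fugu_kb=FUGU_REGION_KB):
    """Lower bound on how much shorter the Fugu interval is."""
    return human_kb / fugu_kb


if __name__ == "__main__":
    print(f"Implied human genome size: {implied_human_genome_mb():.0f} Mb")
    print(f"Local compaction in Fugu: at least {min_compaction():.1f}-fold")
```

The local compaction is several times larger than the genome-wide ratio of 7.5, which is what makes the syntenic region attractive as a sequencing target.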
Abstract:
The mouse is the best model system for the study of mammalian genetics and physiology. Because of the feasibility and importance of studying genetic crosses, the mouse genetic map has received tremendous attention in recent years. It currently contains over 14,000 genetically mapped markers, including 700 mutant loci, 3500 genes, and 6500 simple sequence length polymorphisms (SSLPs). The mutant loci and genes allow insights and correlations concerning physiology and development. The SSLPs provide highly polymorphic anchor points that allow inheritance to be traced in any cross and provide a scaffold for assembling physical maps. Adequate physical mapping resources--notably large-insert yeast artificial chromosome (YAC) libraries--are available to support positional cloning projects based on the genetic map, but a comprehensive physical map is still a few years away. Large-scale sequencing efforts have not yet begun in mouse, but comparative sequence analysis between mouse and human is likely to provide tremendous information about gene structure and regulation.
Abstract:
The human squamous cell carcinoma cell line SCC83-01-82 (SCC) contains mutations in both the H-ras and p53 genes, but it exhibits a nontumorigenic phenotype in nude mice. This cell line can be converted into a cell line with a tumorigenic phenotype, SCC83-01-82CA (CA), by treatment with the mutagen methyl methanesulfonate (MMS). This indicates that additional genetic events leading to expression of a cooperating tumor susceptibility gene(s) may be required for tumorigenicity. To identify the cooperating gene(s), an expression cDNA library was made from tumorigenic CA cells. The library DNA was transfected into nontumorigenic SCC cells and the transfected SCC cells were then injected into nude mice for the selection of a tumorigenic phenotype. Tumors developed in 3 of the 18 mice after injection. Several new cell lines were established from these transfected cell-induced tumors and designated as CATR cells. Tumor histology and karyotype analysis of these cells indicated that they were of human epithelial cell origin. All the CATR cells have the library vector sequence integrated in their genome. Cell line CATR1 expressed a single message from the integrated library representing a 1.3-kb cDNA insert that was absent from untransfected SCC cells or MMS-converted CA cells. This 1.3-kb cDNA insert was cloned by PCR amplification of reverse-transcribed CATR1 total RNA and was designated CATR1.3. The nucleotide sequence of CATR1.3 encodes a peptide of 79 amino acids, has a long 3' untranslated region, and represents an unknown gene product that was associated with the tumorigenic conversion due to the transfected expression library.
Abstract:
We report a general mass spectrometric approach for the rapid identification and characterization of proteins isolated by preparative two-dimensional polyacrylamide gel electrophoresis. This method possesses the inherent power to detect and structurally characterize covalent modifications. Absolute sensitivities of matrix-assisted laser desorption ionization and high-energy collision-induced dissociation tandem mass spectrometry are exploited to determine the mass and sequence of subpicomole sample quantities of tryptic peptides. These data permit mass matching and sequence homology searching of computerized peptide mass and protein sequence data bases for known proteins and design of oligonucleotide probes for cloning unknown proteins. We have identified 11 proteins in lysates of human A375 melanoma cells, including: alpha-enolase, cytokeratin, stathmin, protein disulfide isomerase, tropomyosin, Cu/Zn superoxide dismutase, nucleoside diphosphate kinase A, galaptin, and triosephosphate isomerase. We have characterized several posttranslational modifications and chemical modifications that may result from electrophoresis or subsequent sample processing steps. Detection of comigrating and covalently modified proteins illustrates the necessity of peptide sequencing and the advantages of tandem mass spectrometry to reliably and unambiguously establish the identity of each protein. This technology paves the way for studies of cell-type dependent gene expression and studies of large suites of cellular proteins with unprecedented speed and rigor to provide information complementary to the ongoing Human Genome Project.
Abstract:
Objective: In Southern European countries up to one-third of the patients with hereditary hemochromatosis (HH) do not present the common HFE risk genotype. In order to investigate the molecular basis of these cases we have designed a gene panel for rapid and simultaneous analysis of 6 HH-related genes (HFE, TFR2, HJV, HAMP, SLC40A1 and FTL) by next-generation sequencing (NGS). Materials and Methods: Eighty-eight Portuguese patients with iron overload, negative for the common HFE mutations, were analysed. A TruSeq Custom Amplicon kit (TSCA, by Illumina) was designed to generate 97 amplicons covering exons, intron/exon junctions and UTRs of the mentioned genes, with a cumulative target sequence of 12,115 bp. Amplicons were sequenced on the MiSeq instrument (Illumina) using 250 bp paired-end reads. Sequences were aligned against human genome reference hg19 using the alignment and variant caller algorithms in the MiSeq Reporter software. Novel variants were validated by Sanger sequencing and their pathogenic significance was assessed by in silico studies. Results: We found a total of 55 different genetic variants. These include novel pathogenic missense and splicing variants (in HFE and TFR2), a very rare variant in the IRE of FTL, and a variant that creates a novel translation initiation codon in the HAMP gene, among others. Conclusion: The combination of TSCA methodology and NGS technology appears to be an appropriate tool for simultaneous and fast analysis of HH-related genes in a large number of samples. However, establishing the clinical relevance of NGS-detected variants for HH development remains a laborious task, requiring further functional studies.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
The mapping and sequencing of the human genome has generated a large resource for answering questions about human disease. This achievement is akin in scientific importance to developing the periodic table of elements. Plastic surgery has always been at the frontier of medical research. This resource will help us to improve our understanding of the many poorly understood physiological and pathological conditions we deal with daily, such as wound healing, keloid scar formation, Dupuytren's disease, rheumatoid arthritis, vascular malformation and carcinogenesis. We are well placed to obtain both diseased and normal tissues, to use this resource and to apply it clinically. This review covers the human genome, the basis of gene expression profiling and how it will affect our clinical and research practices in the future. For those embarking on the use of this new technology as a research tool, we provide a brief insight into its limitations and pitfalls. (C) 2006 The British Association of Plastic Surgeons. Published by Elsevier Ltd. All rights reserved.
Abstract:
Of the ~1.7 million SINE elements in the human genome, only a tiny number are estimated to be active in transcription by RNA polymerase (Pol) III. Tracing the individual loci from which SINE transcripts originate is complicated by their highly repetitive nature. By exploiting RNA-Seq datasets and unique SINE DNA sequences, we devised a bioinformatic pipeline allowing us to identify Pol III-dependent transcripts of individual SINE elements. When applied to ENCODE transcriptomes of seven human cell lines, this search strategy identified ~1300 Alu loci and ~1100 MIR loci corresponding to detectable transcripts, with ~120 Alu and ~60 MIR loci, respectively, expressed in at least three cell lines. In vitro transcription of selected SINEs did not reflect their in vivo expression properties, and required the native 5’-flanking region in addition to the internal promoter. We also identified a cluster of expressed AluYa5-derived transcription units, juxtaposed to snaR genes on chromosome 19, formed by a promoter-containing left monomer fused to an Alu-unrelated downstream moiety. Autonomous Pol III transcription was also revealed for SINEs nested within Pol II-transcribed genes, raising the possibility of an underlying mechanism for Pol II gene regulation by SINE transcriptional units. Moreover, applying our bioinformatic pipeline both to RNA-seq data from cells subjected to an in vitro pro-oncogenic stimulus and to in vivo matched tumor and non-tumor samples allowed us to detect increased Alu RNA expression as well as the source loci of such deregulation. The ability to investigate SINE transcriptomes at single-locus resolution will facilitate both the identification of novel biologically relevant SINE RNAs and the assessment of SINE expression alteration under pathological conditions.
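The core difficulty named in this abstract, assigning reads to individual copies of a highly repetitive element, can be illustrated with a toy example. The sketch below is not the authors' pipeline, whose details are not given here. It assumes, purely for illustration, that alignments arrive as (read ID, locus ID) pairs, that only reads aligning to exactly one locus are informative, and that a locus counts as expressed above a hypothetical `min_reads` threshold. All function names and locus labels are made up.

```python
from collections import Counter, defaultdict


def count_unique_reads_per_locus(alignments):
    """Count reads that align to exactly one locus.

    `alignments` is an iterable of (read_id, locus_id) pairs; a read that
    appears with more than one locus is treated as ambiguous and dropped.
    """
    loci_by_read = defaultdict(set)
    for read_id, locus_id in alignments:
        loci_by_read[read_id].add(locus_id)

    counts = Counter()
    for loci in loci_by_read.values():
        if len(loci) == 1:
            counts[next(iter(loci))] += 1
    return counts


def expressed_loci(counts, min_reads=2):
    """Loci supported by at least `min_reads` unique reads (assumed cutoff)."""
    return sorted(locus for locus, n in counts.items() if n >= min_reads)


if __name__ == "__main__":
    # Hypothetical alignments: read r3 hits two Alu copies and is discarded.
    toy = [
        ("r1", "AluA"), ("r2", "AluA"),
        ("r3", "AluA"), ("r3", "AluB"),
        ("r4", "MIR1"),
    ]
    counts = count_unique_reads_per_locus(toy)
    print(dict(counts))
    print(expressed_loci(counts))
```

Discarding multi-mapping reads is the simplest possible choice; it trades sensitivity for confidence that each counted read really came from the locus it is assigned to.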
Abstract:
The Bifidobacterium longum subsp. longum 35624™ strain (formerly named Bifidobacterium longum subsp. infantis) is a well-described probiotic with clinical efficacy in Irritable Bowel Syndrome clinical trials and induces immunoregulatory effects in mice and in humans. This paper presents (a) the genome sequence of the organism, allowing the assignment to its correct subspeciation longum; (b) a comparative genome assessment with other B. longum strains; and (c) the molecular structure of the 35624 exopolysaccharide (EPS624). Comparative genome analysis of the 35624 strain with other B. longum strains determined that the sub-speciation of the strain is longum and revealed the presence of a 35624-specific gene cluster, predicted to encode the biosynthetic machinery for EPS624. Following isolation and acid treatment of the EPS, its chemical structure was determined using gas and liquid chromatography for sugar constituent and linkage analysis, electrospray and matrix-assisted laser desorption ionization mass spectrometry for sequencing, and NMR. The EPS consists of a branched hexasaccharide repeating unit containing two galactose and two glucose moieties, galacturonic acid and the unusual sugar 6-deoxy-L-talose. These data demonstrate that the B. longum 35624 strain has specific genetic features, one of which leads to the generation of a characteristic exopolysaccharide.
Abstract:
The non-standard decoding of the CUG codon in Candida cylindracea raises a number of questions about the evolution of this organism and of other species of the Candida clade in which the codon is ambiguous. To address these questions, we studied the transcriptome of C. cylindracea, comparing it with those of Saccharomyces cerevisiae (standard decoder) and Candida albicans (ambiguous decoder). The transcriptome was characterized using RNA-seq, an approach that has several advantages over microarrays and is increasingly widely used. TopHat and Cufflinks were used to build the protocol for gene quantification. About 95% of the reads were mapped to the genome. A total of 3693 genes were analyzed, of which 1338 had a non-standard start codon (TTG/CTG), and 99.4% of the genes were expressed. Most genes have intermediate levels of expression, some have little or no expression, and a minority are highly expressed. The distribution profile of CUG codons differs among the three species, but it is significantly associated with gene expression levels: genes with fewer CUGs are the most highly expressed. However, CUG content is not related to conservation level: more and less conserved genes have, on average, an equal number of CUGs. The most conserved genes are the most highly expressed. The lipase genes corroborate the results obtained for most genes of C. cylindracea, since they are very rich in CUGs and poorly conserved. The reduced number of CUG codons observed in highly expressed genes may be due to an insufficient number of tRNA genes to cope with more CUGs without compromising translational efficiency. The enrichment analysis confirmed that the most conserved genes are associated with basic functions such as translation, pathogenesis and metabolism. Within this set, genes with more or fewer CUGs seem to have different functions. The key questions about the evolutionary phenomenon remain unclear. However, the results are consistent with previous observations and support a variety of conclusions that should be taken into consideration in future analyses, since this was the first time that such a study was conducted.
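The quantity at the centre of this abstract, the number of CUG codons per gene, is straightforward to compute once a coding sequence is in hand. The following is a minimal sketch rather than the thesis's own method: it assumes the input is a coding sequence already in the correct reading frame, accepts DNA (CTG) or RNA (CUG) spelling, and ignores any trailing partial codon. The function name and example sequences are illustrative.

```python
def count_codon(cds, codon="CUG"):
    """Count in-frame occurrences of `codon` in a coding sequence.

    The sequence is read in steps of three from its first base, so
    out-of-frame matches are not counted. T and U are treated alike.
    """
    seq = cds.upper().replace("T", "U")
    target = codon.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # drop any incomplete final codon
    return sum(1 for i in range(0, usable, 3) if seq[i:i + 3] == target)


if __name__ == "__main__":
    # Made-up sequences, used only to show the in-frame behaviour.
    print(count_codon("ATGCTGCTGTAA"))  # codons: ATG CTG CTG TAA
    print(count_codon("ACTGAA"))        # codons: ACT GAA (CTG is out of frame)
```

Per-gene counts obtained this way could then be set against expression estimates to look for the kind of association the abstract reports.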
Abstract:
Relationships between organisms within an ecosystem are one of the main focuses in the study of ecology and evolution. For instance, host-parasite interactions have long been of close interest to ecology, evolutionary biology and conservation science, owing to the great variety of strategies and interaction outcomes. The monogenean ectoparasites make up a significant portion of the flatworms. Gyrodactylus salaris is a monogenean freshwater ectoparasite of Atlantic salmon (Salmo salar) whose damage can leave fish prone to further bacterial and fungal infections. G. salaris is the only such parasite whose genome has been studied so far. The RNA-seq data analyzed in this thesis had already been annotated using LAST. The data were obtained by Illumina sequencing, and the resulting reads were assembled into 15777 transcripts. LAST annotated 46% of the transcripts; the remainder were left unknown. This thesis work started from the whole data set, and the annotation process was continued using PANNZER, CDD and InterProScan. This annotation resulted in 56% successfully annotated sequences, with parasite-specific proteins identified. This thesis presents the first monogenean transcriptomic information, which provides an important resource for further research on this species. Additionally, comparison of annotation methods revealed that description- and domain-based methods perform better than simple similarity search methods. It is therefore reasonable to suggest the use of these tools and databases for functional annotation. These results also emphasize the need to use multiple methods and databases, and highlight the need for more genomic information on G. salaris.