996 results for Human Transcriptome
Abstract:
Open reading frame expressed sequence tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein-coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that corresponds to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription-PCR, offers a direct route to transcript finishing that may be a useful alternative to full-length cDNA cloning.
Abstract:
We report the results of a transcript finishing initiative, undertaken for the purpose of identifying and characterizing novel human transcripts, in which RT-PCR was used to bridge gaps between paired EST clusters mapped against the genomic sequence. Each pair of EST clusters selected for experimental validation was designated a transcript finishing unit (TFU). A total of 489 TFUs were selected for validation, and an overall efficiency of 43.1% was achieved. We generated a total of 59,975 bp of transcribed sequences organized into 432 exons, contributing to the definition of the structure of 211 human transcripts. The structure of several transcripts reported here was confirmed during the course of this project, through the generation of their corresponding full-length cDNA sequences. Nevertheless, for 21% of the validated TFUs, a full-length cDNA sequence is not yet available in public databases, and the structure of 69.2% of these TFUs was not correctly predicted by computer programs. The TF strategy provides a significant contribution to the definition of the complete catalog of human genes and transcripts, because it appears to be particularly useful for identification of low-abundance transcripts expressed in a restricted set of tissues as well as for the delineation of gene boundaries and alternatively spliced isoforms.
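The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below assumes that each validated TFU corresponds to one of the 211 transcripts, which the abstract implies but does not state; the function name and dictionary keys are illustrative only.

```python
def validation_summary(selected, validated, total_bp, exons):
    """Summarize a transcript-finishing run from the counts given in the abstract."""
    return {
        # Share of selected TFUs that were experimentally validated.
        "efficiency_pct": round(100 * validated / selected, 1),
        # Average length of the exons defined by the new sequence.
        "mean_exon_bp": round(total_bp / exons, 1),
    }

# Assumption: 211 validated TFUs (one per transcript whose structure was defined).
summary = validation_summary(selected=489, validated=211, total_bp=59975, exons=432)
print(summary)
```

Under that assumption the computed efficiency agrees with the reported 43.1%, and the mean exon length comes out at roughly 139 bp.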
Abstract:
Whereas genome sequencing defines the genetic potential of an organism, transcript sequencing defines the utilization of this potential and links the genome with most areas of biology. To exploit the information within the human genome in the fight against cancer, we have deposited some two million expressed sequence tags (ESTs) from human tumors and their corresponding normal tissues in the public databases. The data currently define approximately 23,500 genes, of which only approximately 1,250 are still represented only by ESTs. Examination of the EST coverage of known cancer-related (CR) genes reveals that <1% do not have corresponding ESTs, indicating that the representation of genes associated with commonly studied tumors is high. The careful recording of the origin of all ESTs we have produced has enabled detailed definition of where the genes they represent are expressed in the human body. More than 100,000 ESTs are available for seven tissues, indicating a surprising variability of gene usage that has led to the discovery of a significant number of genes with restricted expression, and that may thus be therapeutically useful. The ESTs also reveal novel nonsynonymous germline variants (although the one-pass nature of the data necessitates careful validation) and many alternatively spliced transcripts. Although widely exploited by the scientific community, vindicating our totally open source policy, the EST data generated still provide extensive information that remains to be systematically explored, and that may further facilitate progress toward both the understanding and treatment of human cancers.
Abstract:
We have used massively parallel signature sequencing (MPSS) to sample the transcriptomes of 32 normal human tissues to an unprecedented depth, thus documenting the patterns of expression of almost 20,000 genes with high sensitivity and specificity. The data confirm the widely held belief that differences in gene expression between cell and tissue types are largely determined by transcripts derived from a limited number of tissue-specific genes, rather than by combinations of more promiscuously expressed genes. Expression of a little more than half of all known human genes seems to account for both the common requirements and the specific functions of the tissues sampled. A classification of tissues based on patterns of gene expression largely reproduces classifications based on anatomical and biochemical properties. The unbiased sampling of the human transcriptome achieved by MPSS supports the idea that most human genes have been mapped, if not functionally characterized. This data set should prove useful for the identification of tissue-specific genes, for the study of global changes induced by pathological conditions, and for the definition of a minimal set of genes necessary for basic cell maintenance. The data are available on the Web at http://mpss.licr.org and http://sgb.lynxgen.com.
Abstract:
Within the ENCODE Consortium, GENCODE aimed to accurately annotate all protein-coding genes, pseudogenes, and noncoding transcribed loci in the human genome through manual curation and computational methods. Annotated transcript structures were assessed, and less well-supported loci were systematically validated by experiment. Predicted exon-exon junctions were evaluated by RT-PCR amplification followed by highly multiplexed sequencing readout, a method we called RT-PCR-seq. Seventy-nine percent of all assessed junctions were confirmed by this evaluation procedure, demonstrating the high quality of the GENCODE gene set. RT-PCR-seq was also efficient for screening gene models predicted using the Human Body Map (HBM) RNA-seq data. We validated 73% of these predictions, thus confirming 1168 novel genes, mostly noncoding, which will further complement the GENCODE annotation. Our novel experimental validation pipeline is extremely sensitive, far more so than unbiased transcriptome profiling through RNA sequencing, which is becoming the norm. For example, exon-exon junctions unique to GENCODE annotated transcripts are five times more likely to be corroborated with our targeted approach than with extensive large-scale human transcriptome profiling. Data sets such as the HBM and ENCODE RNA-seq data fail to sample low-expressed transcripts. Our RT-PCR-seq targeted approach also has the advantage of identifying novel exons of known genes, as we discovered unannotated exons in ~11% of assessed introns. We thus estimate that at least 18% of known loci have yet-unannotated exons. Our work demonstrates that the cataloging of all of the genic elements encoded in the human genome will necessitate a coordinated effort between unbiased and targeted approaches, like RNA-seq and RT-PCR-seq.
Abstract:
Understanding alternative splicing is crucial to elucidate the mechanisms behind several biological phenomena, including diseases. The huge amount of expressed sequences available nowadays represents an opportunity and a challenge to catalog and display alternative splicing events (ASEs). Although several groups have faced this challenge with relative success, we still lack a computational tool that uses a simple and straightforward method to retrieve, name and present ASEs. Here we present SPLOOCE, a portal for the analysis of human splicing variants. SPLOOCE uses a method based on regular expressions for retrieval of ASEs. We propose a simple syntax that is able to capture the complexity of ASEs.
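The idea of retrieving ASEs with regular expressions can be illustrated with a toy example. The encoding below is hypothetical and is not SPLOOCE's actual syntax, which the abstract does not specify: each transcript is written as one character per annotated exon, `E` if the exon is present and `-` if it is absent, and the pattern looks for absent exons flanked by present ones (exon skipping).

```python
import re

# Hypothetical encoding (not SPLOOCE's syntax): one character per annotated
# exon, 'E' when the transcript contains it and '-' when it does not.
# Exon skipping = a run of absent exons with an included exon on each side.
EXON_SKIPPING = re.compile(r"(?<=E)-+(?=E)")

def find_skipped_exons(usage):
    """Return 1-based (first, last) index ranges of exons skipped in `usage`."""
    return [(m.start() + 1, m.end()) for m in EXON_SKIPPING.finditer(usage)]

# Made-up transcripts of a five- or six-exon gene, for illustration only.
transcripts = {
    "tx1": "EEEEE",   # all exons included
    "tx2": "EE-EE",   # exon 3 skipped
    "tx3": "E--E-E",  # exons 2-3 skipped, then exon 5 skipped
    "tx4": "--EEE",   # leading gap: not flanked by exons, so not reported
}
events = {name: find_skipped_exons(usage) for name, usage in transcripts.items()}
print(events)
```

The lookbehind and lookahead keep the flanking exons out of the match, so one included exon can close one skipping event and open the next, as in `tx3`.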
Abstract:
Background: Cancer shows a great diversity in its clinical behavior, which cannot be easily predicted using the currently available clinical or pathological markers. The identification of pathways associated with lymph node metastasis (N+) and recurrent head and neck squamous cell carcinoma (HNSCC) may increase our understanding of the complex biology of this disease. Methods: Tumor samples were obtained from untreated HNSCC patients undergoing surgery. Patients were classified according to pathologic lymph node status (positive or negative) or tumor recurrence (recurrent or non-recurrent tumor) after treatment (surgery with neck dissection followed by radiotherapy). Using microarray gene expression data, we screened tumor samples according to modules composed of genes in the same pathway or functional category. Results: The most frequent alterations were the repression of modules in lymph node-negative (N0) and in non-recurrent tumors rather than induction of modules in N+ or in recurrent tumors. N0 tumors showed repression of modules that contain cell survival genes, and in non-recurrent tumors cell-cell signaling and extracellular region modules were repressed. Conclusions: The repression of modules that contain cell survival genes in N0 tumors reinforces the important role that apoptosis plays in the regulation of metastasis. In addition, because the tumor samples used here were not microdissected, tumor gene expression data are represented together with the stroma, which may reveal signaling between the microenvironment and tumor cells. For instance, in non-recurrent tumors, the extracellular region module was repressed, indicating that the stroma and tumor cells may have fewer interactions, which may impair metastasis development.
Finally, the genes highlighted in our analysis can be implicated in more than one pathway or characteristic, suggesting that therapeutic approaches to prevent tumor progression should target more than one gene or pathway, especially apoptosis and interactions between tumor cells and the stroma.
Abstract:
Sequencing technologies and new bioinformatics tools have led to the complete sequencing of various genomes. However, information regarding the human transcriptome and its annotation is yet to be completed. The Human Cancer Genome Project, using ORESTES (open reading frame EST sequences) methodology, contributed to this objective by generating data from about 1.2 million expressed sequence tags. Approximately 30% of these sequences did not align to ESTs in the public databases and were considered no-match ORESTES. On the basis that a set of these ESTs could represent new transcripts, we constructed a cDNA microarray. This platform was hybridized against 12 different normal or tumor tissues. We identified 3421 transcribed regions not associated with annotated transcripts, representing 83.3% of the platform. The total number of differentially expressed sequences was 1007. Also, 28% of the analyzed sequences could represent noncoding RNAs. Our data reinforce the view that the human genome is pervasively transcribed, and point to molecular marker candidates for different cancers. To reinforce our data, we confirmed, by real-time PCR, the differential expression of three out of eight potential tumor markers in prostate tissues. Lists of the 1007 differentially expressed sequences and the 291 potentially noncoding tumor markers are provided.
Abstract:
Cancer/testis antigens (CTAs) are immunogenic proteins with a restricted expression pattern in normal tissues and aberrant expression in different types of tumors, and are considered promising candidates for immunotherapy. We used the alignment between EST sequences and the human genome sequence to identify novel CT genes. By examining the EST tissue composition of known CT clusters, we defined parameters for the selection of 1184 EST clusters corresponding to putative CT genes. The expression pattern of 70 CT gene candidates was evaluated by RT-PCR in 21 normal tissues, 17 tumor cell lines and 160 primary tumors. We were able to identify 4 CT genes expressed in different types of tumors. The presence of antibodies against the protein encoded by 1 of these 4 CT genes (FAM46D) was exclusively detected in plasma samples from cancer patients. Due to its restricted expression pattern and immunogenicity, FAM46D represents a novel target for cancer immunotherapy. (c) 2009 Elsevier Inc. All rights reserved.
Abstract:
Next-generation sequencing offers an unprecedented opportunity to jointly analyze cellular and viral transcriptional activity without prerequisite knowledge of the nature of the transcripts. SupT1 cells were infected with a vesicular stomatitis virus G envelope protein (VSV-G)-pseudotyped HIV vector. At 24 h postinfection, both cellular and viral transcriptomes were analyzed by serial analysis of gene expression followed by high-throughput sequencing (SAGE-Seq). Read mapping resulted in 33 to 44 million tags aligning with the human transcriptome and 0.23 to 0.25 million tags aligning with the genome of the HIV-1 vector. Thus, at peak infection, 1 transcript in 143 is of viral origin (0.7%), including a small component of antisense viral transcription. Of the detected cellular transcripts, 826 (2.3%) were differentially expressed between mock- and HIV-infected samples. The approach also assessed whether HIV-1 infection modulates the expression of repetitive elements or endogenous retroviruses. We observed very active transcription of these elements, with 1 transcript in 237 being of such origin, corresponding on average to 123,123 reads in mock-infected samples (0.40%) and 129,149 reads in HIV-1-infected samples (0.45%) mapping to the genomic Repbase repository. This analysis highlights key details in the generation and interpretation of high-throughput data in the setting of HIV-1 cellular infection.
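The "1 transcript in N" figures in this abstract map directly onto the percentages quoted next to them. A minimal sketch of that conversion follows; the helper name is illustrative and not taken from the paper.

```python
def one_in_n_to_percent(n):
    """Convert a '1 transcript in n' frequency into a percentage."""
    return 100.0 / n

# 1 in 143 transcripts of viral origin.
viral_pct = round(one_in_n_to_percent(143), 2)
# 1 in 237 transcripts from repetitive elements or endogenous retroviruses.
repeat_pct = round(one_in_n_to_percent(237), 2)
print(viral_pct, repeat_pct)
```

The first value matches the reported 0.7%, and the second falls between the 0.40% (mock-infected) and 0.45% (HIV-1-infected) figures, consistent with it being an average across the two conditions.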
Abstract:
The reciprocal interaction between cancer cells and the tissue-specific stroma is critical for primary and metastatic tumor growth progression. Prostate cancer cells preferentially colonize bone (osteotropism), where they alter the physiological balance between osteoblast-mediated bone formation and osteoclast-mediated bone resorption, and elicit a predominantly osteoblastic response (osteoinduction). The molecular cues provided by osteoblasts for the survival and growth of bone metastatic prostate cancer cells are largely unknown. We exploited the sufficient divergence between human and mouse RNA sequences together with redefinition of highly species-specific gene arrays by computer-aided and experimental exclusion of cross-hybridizing oligonucleotide probes. This strategy allowed the dissection of the stroma (mouse) from the cancer cell (human) transcriptome in bone metastasis xenograft models of human osteoinductive prostate cancer cells (VCaP and C4-2B). As a result, we generated the osteoblastic bone metastasis-associated stroma transcriptome (OB-BMST). Subtraction of genes shared by inflammation, wound healing and desmoplastic responses, and by the tissue type-independent stroma responses to a variety of non-osteotropic and osteotropic primary cancers generated a curated gene signature ("Core" OB-BMST) putatively representing the bone marrow/bone-specific stroma response to prostate cancer-induced, osteoblastic bone metastasis. The expression pattern of three representative Core OB-BMST genes (PTN, EPHA3 and FSCN1) seems to confirm the bone specificity of this response. A robust induction of genes involved in osteogenesis and angiogenesis dominates both the OB-BMST and Core OB-BMST. This translates into an amplification of hematopoietic and, remarkably, prostate epithelial stem cell niche components that may function as a self-reinforcing bone metastatic niche providing growth support specific for osteoinductive prostate cancer cells.
The induction of this combinatorial stem cell niche is a novel mechanism that may also explain cancer cell osteotropism and local interference with hematopoiesis (myelophthisis). Accordingly, these stem cell niche components may represent innovative therapeutic targets and/or serum biomarkers in osteoblastic bone metastasis.
Abstract:
With the availability of a large amount of genomic data it is expected that the influence of single nucleotide variations (SNVs) in many biological phenomena will be elucidated. Here, we approached the problem of how SNVs affect alternative splicing. First, we observed that SNVs and exonic splicing regulators (ESRs) independently show a biased distribution in alternative exons. More importantly, SNVs map more frequently in ESRs located in alternative exons than in ESRs located in constitutive exons. By looking at SNVs associated with alternative exon/intron borders (by their common presence in the same cDNA molecule), we observed that a specific type of ESR, the exonic splicing silencers (ESSs), are more frequently modified by SNVs. Our results establish a clear association between genetic diversity and alternative splicing involving ESSs.
Abstract:
Adenosine deaminases acting on RNA (ADARs) catalyze the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA) and thereby potentially alter the information content and structure of cellular RNAs. Notably, although the overwhelming majority of such editing events occur in transcripts derived from Alu repeat elements, the biological function of non-coding RNA editing remains uncertain. Here, we show that mutations in ADAR1 (also known as ADAR) cause the autoimmune disorder Aicardi-Goutieres syndrome (AGS). As in Adar1-null mice, the human disease state is associated with upregulation of interferon-stimulated genes, indicating a possible role for ADAR1 as a suppressor of type I interferon signaling. Considering recent insights derived from the study of other AGS-related proteins, we speculate that ADAR1 may limit the cytoplasmic accumulation of the dsRNA generated from genomic repetitive elements.