917 results for RNA-seq data


Relevance: 100.00%

Abstract:

BACKGROUND: There is considerable interest in the development of methods to efficiently identify all coding variants present in large sample sets of humans. Three approaches are possible: whole-genome sequencing, whole-exome sequencing using exon capture methods, and RNA-Seq. While whole-genome sequencing is the most complete, it remains sufficiently expensive that cost-effective alternatives are important. RESULTS: Here we provide a systematic exploration of how well RNA-Seq can identify human coding variants by comparing variants identified through high-coverage whole-genome sequencing to those identified by high-coverage RNA-Seq in the same individual. This comparison allowed us to directly evaluate the sensitivity and specificity of RNA-Seq in identifying coding variants, and to evaluate how key parameters such as the degree of coverage and the expression levels of genes interact to influence performance. We find that although only 40% of exonic variants identified by whole-genome sequencing were captured using RNA-Seq, this number rose to 81% when concentrating on genes known to be well expressed in the source tissue. We also find that a high false positive rate can be problematic when working with RNA-Seq data, especially at higher levels of coverage. CONCLUSIONS: We conclude that as long as a tissue relevant to the trait under study is available and suitable quality control screens are implemented, RNA-Seq is a fast and inexpensive alternative approach for finding coding variants in genes with sufficiently high expression levels.
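The comparison described in this abstract, RNA-Seq calls measured against whole-genome calls from the same individual, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the variant tuples, the gene names and the function names are hypothetical toy values, not the authors' data or pipeline. Precision (the fraction of RNA-Seq calls confirmed by whole-genome sequencing) stands in here for the false-positive concern raised in the abstract.

```python
def compare_call_sets(truth, calls):
    """Return (sensitivity, precision) of `calls` measured against `truth`.

    Both arguments are sets of hashable variant keys, here
    (chromosome, position, ref, alt) tuples.
    """
    true_positives = truth & calls
    sensitivity = len(true_positives) / len(truth) if truth else 0.0
    precision = len(true_positives) / len(calls) if calls else 0.0
    return sensitivity, precision


def restrict_to_genes(variants, gene_of, genes):
    """Keep only variants whose gene (looked up in `gene_of`) is in `genes`."""
    return {v for v in variants if gene_of.get(v) in genes}


# Hypothetical toy data: five exonic variants found by whole-genome sequencing.
wgs_variants = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "A"),
    ("chr2", 75, "T", "C"),
    ("chr3", 10, "A", "C"),
}
# Hypothetical RNA-Seq calls: two confirmed variants plus one false positive.
rnaseq_variants = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr3", 99, "G", "T"),
}
# Hypothetical variant-to-gene lookup and set of well-expressed genes.
gene_of = {
    ("chr1", 100, "A", "G"): "GENE_A",
    ("chr1", 200, "C", "T"): "GENE_A",
    ("chr2", 50, "G", "A"): "GENE_B",
    ("chr2", 75, "T", "C"): "GENE_B",
    ("chr3", 10, "A", "C"): "GENE_C",
    ("chr3", 99, "G", "T"): "GENE_C",
}
well_expressed = {"GENE_A"}

overall = compare_call_sets(wgs_variants, rnaseq_variants)
expressed_only = compare_call_sets(
    restrict_to_genes(wgs_variants, gene_of, well_expressed),
    restrict_to_genes(rnaseq_variants, gene_of, well_expressed),
)
print("all genes (sensitivity, precision):", overall)
print("well-expressed genes only:", expressed_only)
```

Restricting both call sets to well-expressed genes before comparing them mirrors the kind of stratification the abstract reports; a real analysis would also normalize variant representation before matching.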

Relevance: 100.00%

Abstract:

Understanding the extent of genomic transcription and its functional relevance is a central goal in genomics research. However, detailed genome-wide investigations of transcriptome complexity in major mammalian organs have been scarce. Here, using extensive RNA-seq data, we show that transcription of the genome is substantially more widespread in the testis than in other organs across representative mammals. Furthermore, we reveal that meiotic spermatocytes and especially postmeiotic round spermatids have remarkably diverse transcriptomes, which explains the high transcriptome complexity of the testis as a whole. The widespread transcriptional activity in spermatocytes and spermatids encompasses protein-coding and long noncoding RNA genes but also poorly conserved intergenic sequences, suggesting that it may not be of immediate functional relevance. Rather, our analyses of genome-wide epigenetic data suggest that this prevalent transcription, which most likely promoted the birth of new genes during evolution, is facilitated by an overall permissive chromatin in these germ cells that results from extensive chromatin remodeling.

Relevance: 100.00%

Abstract:

Dinoflagellates are unicellular eukaryotes found in both freshwater and marine environments. They are best known for causing toxic algal blooms called 'red tides', for their symbiosis with corals, and for their major contribution to carbon fixation in the oceans. At the molecular level, they are also known for their unique nuclear characteristics: their chromosomes generally contain an immense quantity of DNA and are packaged and condensed in a liquid crystalline form rather than in nucleosomes. Nuclear-encoded genes are often present in multiple copies arranged in tandem, and no transcriptional regulatory element, including the TATA box, has yet been observed. The unique organization of dinoflagellate chromatin suggests that different strategies are needed to control gene expression in these organisms. In this study, I addressed this problem using the photosynthetic dinoflagellate Lingulodinium polyedrum as a model. L. polyedrum is of particular interest because it has several circadian (daily) rhythms. To date, all studies of gene expression during circadian changes have shown regulation at the translational level. In my research, I used transcriptomic, proteomic and phosphoproteomic approaches, together with biochemical studies, to provide insight into the mechanics of gene regulation in dinoflagellates, with an emphasis on the importance of phosphorylation in the circadian system of L. polyedrum. The absence of histone proteins and nucleosomes is a distinctive feature of dinoflagellates. Using RNA-Seq technology, I found complete sequences encoding histones and histone-modifying enzymes. L. polyedrum therefore expresses conserved histone-coding sequences, but the level of protein expression is below the detection limit of Western immunodetection. The RNA-Seq data were also used to generate a transcriptome, that is, a list of the genes expressed by L. polyedrum. A sequence homology search was first carried out to classify the transcripts into categories (Gene Ontology; GO). This analysis revealed a low abundance of transcription factors and, among them, a surprising predominance of sequences with a Cold Shock domain. In L. polyedrum, several genes are repeated in tandem. The sequences obtained by RNA-Seq were aligned with the genomic copies of tandemly organized genes to look for polycistronic transcripts, a hypothesis put forward to explain the lack of promoter elements in the intergenic regions of these genes. This analysis also showed very high conservation of the coding sequences of tandemly organized genes. The transcriptome was also used to help identify proteins after sequencing by mass spectrometry, and a phosphoprotein-enriched fraction was found to be particularly well suited to high-throughput analysis approaches. Comparison of the phosphoproteomes from two different times of day revealed that a large proportion of the proteins whose phosphorylation state varies over time belong to the RNA-binding and translation categories. The transcriptome was also used to define the spectrum of kinases present in L. polyedrum, which was then used to classify the phosphorylated peptides that are potential targets of these kinases. Several peptides identified as being phosphorylated by Casein Kinase 2 (CK2), a kinase known to be involved in the circadian clock of eukaryotes, come from various RNA-binding proteins. To assess the possibility that some of the many Cold Shock domain proteins identified in the transcriptome could modulate gene expression in L. polyedrum, as observed in several other prokaryotic and eukaryotic systems, the response of the cells to cold temperatures was examined. Cold temperatures rapidly induced encystment, a condition in which the cells become metabolically inactive in order to withstand unfavorable environmental conditions. Changes in the phosphoprotein profile appear to be the major factor causing cyst formation. Phosphosites predicted to be phosphorylated by CK2 are the class most strongly reduced in cysts, an interesting finding because the bioluminescence rhythm confirms that the clock is stopped in the cyst.

Relevance: 100.00%

Abstract:

EIF4E, the eukaryotic translation initiation factor, is a potent oncogene that is upregulated in several types of cancer, including the M4 and M5 subtypes of acute myeloid leukemia (AML). EIF4E is regulated at several levels; however, little is known about the transcriptional regulation of this gene. My results show that EIF4E is a direct transcriptional target of nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB). In primary hematopoietic cells and cell lines, EIF4E levels are induced by NF-κB inducers, and pharmacological or genetic inactivation of NF-κB represses the activation of EIF4E. Following NF-κB activation in humans, the endogenous EIF4E promoter recruits p65 (RelA) and c-Rel to evolutionarily conserved κB sites in vitro and in vivo, together with p300 and the phosphorylated form of Pol II. Furthermore, p65 is selectively associated with the EIF4E promoter in the M4/M5 AML subtypes but not in other AML subtypes or in normal primary hematopoietic cells. This indicates that this process is an essential factor determining the differential expression of EIF4E in AML. Analyses of RNA sequencing expression data from The Cancer Genome Atlas (TCGA) suggest that EIF4E and RELA mRNA levels are increased in AML cases with an intermediate or poor prognosis but not in cytogenetically favorable groups. In addition, high EIF4E and RELA mRNA levels are significantly associated with a relatively low survival rate in patients. Moreover, the unique κB sites found in the EIF4E promoter recruit the NF-κB transcriptional regulator p65 in 47 newly predicted targets. Finally, 6 new transcription factors potentially involved in the regulation of the EIF4E gene were predicted by analyses of ChIP-Seq data from the Encyclopedia of DNA Elements (ENCODE). Collectively, these results provide new insights into the transcriptional control of EIF4E and offer a new molecular basis for its deregulation in at least a subgroup of AML specimens. Studying and understanding this level of regulation in the context of patient specimens is important for the development of new therapeutic strategies that target EIF4E gene expression by means of NF-κB inhibitors in combination with ribavirin.

Relevance: 100.00%

Abstract:

qroot-yield-1.06 is a major QTL affecting root system architecture (RSA) and other agronomic traits in maize. The effect of this QTL has been evaluated through the development of near-isogenic lines (NILs) differing at the QTL position. The objective of this study was to fine-map qroot-yield-1.06 by marker-assisted searching for chromosome recombinants in the QTL interval, with concurrent root phenotyping under both controlled and field conditions, through successive generations. Complementary approaches such as QTL meta-analysis and RNA-seq were deployed to help prioritize candidate genes within the QTL target region. Using a selected group of genotypes, field-based root analysis by 'shovelomics' made it possible to accurately collect RSA information from adult maize plants. Shovelomics combined with software-assisted root image analysis proved to be an informative and relatively highly automated phenotyping protocol. QTL interval mapping was conducted using a segregating population at the seedling stage grown in a controlled environment. The results made it possible to narrow down the QTL interval and to identify new polymorphic markers for marker-assisted selection (MAS) in field experiments. A collection of homozygous recombinant NILs was developed by screening segregating populations with markers flanking qroot-yield-1.06. A first set of lines from this collection was phenotyped using the adapted shovelomics protocol. QTL analysis based on these data highlighted an interval of 1.3 Mb as completely linked with the target QTL, but a larger, safer interval of 4.1 Mb was selected for further investigation. QTL meta-analysis, which synthesizes information on root QTLs, identified two meta-QTLs (mQTLs) in the qroot-yield-1.06 interval. Transcriptomic analysis based on RNA-seq data from the two contrasting QTL-NILs confirmed alternative haplotypes at chromosome bin 1.06. qroot-yield-1.06 has now been delimited to a 4.1-Mb interval, and thanks to the availability of additional untested homozygous recombinant NILs, the potentially achievable mapping resolution at qroot-yield-1.06 is c. 50 kb.

Relevance: 100.00%

Abstract:

Mesenchymal stem cells (MSC) are a type of adult stem cell. Their great plasticity gives them immense potential for clinical use in stem cell therapies. Cells of this type occur mainly in the bone marrow of the long bones and can differentiate into bone, cartilage and fat cells. MSC make an important contribution to regenerative processes, for example the healing of fractures. Broad studies already demonstrate therapeutically promising applications for more complex diseases as well (e.g. osteoporosis). These applications often use lineages differentiated in a targeted manner from MSC in cell culture, which requires controlled steering of the differentiation processes in vitro. The differentiation of a stem cell is based on a complex change in its gene expression. Gene expression patterns for the maintenance and proliferation of stem cells must be replaced by patterns that serve lineage-specific differentiation. The transcriptomic reorientation that accompanies differentiation is fundamental to understanding these processes and has so far been insufficiently studied. The aim of the present work is a transcriptome-wide, comparative gene expression analysis of mesenchymal stem cells and their in vitro differentiated lineages using plasmid DNA microarrays and next-generation sequencing techniques (RNA-Seq, Illumina platform). Domestic cattle (Bos taurus) served as the model organism in this work because they are genetically highly similar to humans and bone marrow is readily available as a source of MSC. Primary cultures of mesenchymal stem cells were successfully isolated from bovine bone marrow. In vitro cell culture experiments were carried out to differentiate the cells into osteoblasts, chondrocytes and adipocytes. For gene expression analysis, RNA was isolated from young MSC, from a long-term MSC culture ("old MSC") and from the differentiated cell lines, and was amplified where necessary for subsequent experiments. The success of the differentiations was confirmed by the expression of specific marker genes and by histological staining. Differentiation into osteoblasts and adipocytes was successful, whereas differentiation into chondrocytes could not be achieved despite various modifications to the protocol. A comparative hybridization to determine differential gene expression (MSC vs. differentiated cells) using in-house plasmid DNA microarrays yielded new candidate genes: destrin and enpp1, among others, for osteogenesis, and sema3c for undifferentiated MSC. Elucidating their biological function in future experiments should deliver promising results. Analysis of transcriptome-wide gene expression by NGS provided an even more extensive view of the differentiation process. The expression profiles of young MSC and adipocytes were highly similar, as were the profiles of old MSC (a long-term culture) and osteoblasts. The old MSC showed clear signs of spontaneous differentiation in the osteogenic direction. Analysis of the 100 most highly expressed genes of each cell line showed that young MSC and adipocytes particularly express extracellular matrix genes (e.g. col1a1,6; fn1 and many more). Both osteoblasts and old MSC, in contrast, show increased expression of genes related to oxidative phosphorylation and of ribosomal proteins. An examination of differential gene expression (young MSC vs. differentiated cells), followed by pathway analysis and gene ontology enrichment statistics, supports these results, above all for osteoblasts, where genes regulating bone development and mineralization now also come to the fore. For adipocytes, genes of the "Jak-STAT signaling pathway", of focal adhesion and of the "Cytokine-cytokine receptor interaction pathway" provided very interesting insights into the biology of this cell type that certainly warrant further investigation. In undifferentiated MSC, differential gene expression analysis identified the non-canonical part of the WNT signaling pathway as potentially highly influential in maintaining stem cell status. The results discussed here show by example that high-throughput gene expression analysis in particular can provide valuable insights into the complex biology of stem cell differentiation. As a basis for subsequent work, interesting genes were identified and hypotheses were formulated about their influence on stem cell properties and differentiation processes. To give better insight into the course of differentiation, NGS analyses could in future be carried out at different time points during differentiation. In addition, further efforts to establish chondrogenic differentiation would be desirable for a complete analysis of gene expression across the trilineage differentiation potential of MSC.

Relevance: 100.00%

Abstract:

Leopard complex spotting is a group of white spotting patterns in horses caused by an incompletely dominant gene (LP); homozygotes (LP/LP) are also affected with congenital stationary night blindness (CSNB). Previous studies implicated Transient Receptor Potential Cation Channel, Subfamily M, Member 1 (TRPM1) as the best candidate gene for both CSNB and LP. RNA-Seq data pinpointed a 1378 bp insertion in intron 1 of TRPM1 as the potential cause. This insertion, a long terminal repeat (LTR) of an endogenous retrovirus, was completely associated with LP in a test of 511 horses (χ² = 1022.00, p < 0.0005) and with CSNB in a test of 43 horses (χ² = 43, p < 0.0005). The LTR was shown to disrupt TRPM1 transcription by premature polyadenylation. Furthermore, while deleterious transposable element insertions should be quickly selected against, the identification of this insertion in three ancient DNA samples suggests it has been maintained in the horse gene pool for at least 17,000 years. This study represents the first description of an LTR insertion being associated with both a pigmentation phenotype and an eye disorder.
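The two chi-square values in this abstract match what complete association produces: for a perfectly diagonal k×k contingency table with N observations, the Pearson statistic equals N·(k − 1). One reading consistent with the reported numbers, offered here as an assumption rather than as the authors' stated test design, is a 3×3 genotype table for LP (511 × 2 = 1022) and a 2×2 table for CSNB (43 × 1 = 43). The sketch below checks this with hypothetical per-class counts; only the totals 511 and 43 come from the abstract.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows).

    No continuity correction is applied; every row and column total is
    assumed to be positive so that all expected counts are non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical split of 511 horses: rows are LP genotypes, columns are
# insertion genotypes, and all horses fall on the diagonal.
lp_table = [
    [100, 0, 0],
    [0, 200, 0],
    [0, 0, 211],
]
# Hypothetical split of 43 horses: CSNB status vs. homozygous insertion.
csnb_table = [
    [10, 0],
    [0, 33],
]

print(round(pearson_chi_square(lp_table), 6))
print(round(pearson_chi_square(csnb_table), 6))
```

The result does not depend on how the hypothetical counts are divided among the diagonal cells, only on N and on the table dimensions.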

Relevance: 100.00%

Abstract:

Recent years have seen increasing interest in and appreciation of the possible importance of single-cell heterogeneity in various biological processes. One example of phenotypic heterogeneity in bacterial populations is antibiotic-tolerant persister cells. This antibiotic tolerance phenotype is of considerable clinical relevance, since dormant bacteria can re-establish infections rapidly after antibiotic treatment has been terminated. Until now, the mechanisms that establish persistence in bacteria have remained largely enigmatic. Persisters are considered to be cells in a dormant state with down-regulated gene expression. Only recently have small regulatory RNAs (sRNAs) been appreciated as important regulators of gene expression in response to environmental stimuli, and several theoretical studies have suggested a possible involvement of sRNAs in the mechanisms of regulated heterogeneity in bacteria. We have experimentally addressed this potential link between sRNAs and persistence/dormancy in E. coli as an example of heterogeneity. Besides classical sRNAs, we are also focusing on sRNAs that directly associate with and possibly regulate the ribosome, the central enzyme of gene expression. The persister- and dormant-cell-specific sRNA profile is studied by comparative analysis of changes in the sRNA profile of the whole bacterial population after antibiotic killing. From RNA-Seq data, ~25,000 potentially stable RNA fragments were identified, and initial analysis predicted ~300 of them to be dormant/persister cell specific. After further evaluation, the most prominent dormant/persister-cell-specific sRNAs will be functionally characterized, and their potential role in persistence/dormancy will be evaluated by applying genetic, molecular and biochemical tools. The results of this project are expected to provide a better understanding of the molecular mechanism of bacterial persistence/dormancy and of the role of ribosome-bound sRNA molecules in fine-tuning gene expression.

Relevance: 100.00%

Abstract:

Relationships between organisms within an ecosystem are one of the main focuses of the study of ecology and evolution. For instance, host-parasite interactions have long been of close interest to ecology, evolutionary biology and conservation science because of the great variety of strategies and interaction outcomes. Monogenean ectoparasites make up a significant portion of flatworms. Gyrodactylus salaris is a monogenean freshwater ectoparasite of Atlantic salmon (Salmo salar) whose damage can make fish prone to further bacterial and fungal infections. G. salaris is the only monogenean parasite whose genome has been studied so far. The RNA-seq data analyzed in this thesis were obtained by Illumina sequencing, and the resulting reads were assembled into 15,777 transcripts. The data had already been annotated with LAST, which annotated 46% of the transcripts and left the remainder unknown. This thesis work started from the whole data set and continued the annotation process with PANNZER, CDD and InterProScan. This annotation resulted in 56% of sequences being successfully annotated, including the identification of parasite-specific proteins. This thesis presents the first monogenean transcriptomic information, which provides an important resource for further research on this species. Additionally, a comparison of annotation methods revealed that description- and domain-based methods perform better than simple similarity search methods. The use of these tools and databases is therefore suggested for functional annotation. These results also emphasize the need to use multiple methods and databases, and highlight the need for more genomic information on G. salaris.
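The point about combining several annotation methods can be made concrete with a small sketch that computes the fraction of transcripts annotated by at least one tool. The transcript IDs, tool labels and function name below are hypothetical placeholders; they are not output from LAST, PANNZER, CDD or InterProScan, and no claim is made about how the thesis merged its results.

```python
def annotation_coverage(all_transcripts, annotated_by_tool):
    """Fraction of transcripts annotated by at least one tool.

    `all_transcripts` is a set of transcript IDs; `annotated_by_tool`
    maps a tool label to the set of IDs that the tool annotated.
    """
    if not all_transcripts:
        raise ValueError("no transcripts supplied")
    annotated = set()
    for ids in annotated_by_tool.values():
        # Ignore IDs that are not part of the assembly being evaluated.
        annotated |= ids & all_transcripts
    return len(annotated) / len(all_transcripts)


# Hypothetical assembly of ten transcripts and two hypothetical tool results.
transcripts = {f"t{i}" for i in range(1, 11)}
results = {
    "similarity_search": {"t1", "t2", "t3", "t4"},
    "domain_search": {"t3", "t4", "t5", "t6"},
}

single = annotation_coverage(
    transcripts, {"similarity_search": results["similarity_search"]}
)
combined = annotation_coverage(transcripts, results)
print(f"one method: {single:.0%}, combined: {combined:.0%}")
```

Taking the union means a transcript counts once even when several tools annotate it, so the combined figure can only stay the same or rise as tools are added.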

Relevance: 100.00%

Abstract:

Understanding the dynamics of the eukaryotic transcriptome is essential for studying the complexity of transcriptional regulation and its impact on phenotype. However, comprehensive studies of transcriptomes at single-base resolution are rare, even for model organisms, and lacking for rice. Here, we present the first transcriptome atlas for eight organs of cultivated rice. Using high-throughput paired-end RNA-seq, we unambiguously detected transcripts expressed at extremely low levels, as well as a substantial number of novel transcripts, exons, and untranslated regions. An analysis of alternative splicing in the rice transcriptome revealed that alternative cis-splicing occurred in ~33% of all rice genes, far more than previously reported. In addition, we identified 234 putative chimeric transcripts that seem to be produced by trans-splicing, indicating that transcript fusion events are more common than expected. In-depth analysis revealed a multitude of fusion transcripts that might be by-products of alternative splicing. Validation and structural analysis of the chimeric transcripts provided evidence that some of them are likely to be functional in the cell. Taken together, our data provide extensive evidence that transcriptional regulation in rice is vastly more complex than previously believed.

Relevance: 100.00%

Abstract:

BACKGROUND: Tumorigenesis is characterised by changes in transcriptional control. Extensive transcript expression data have been acquired over the last decade and used to classify prostate cancers. Prostate cancer is, however, a heterogeneous multifocal cancer and this poses challenges in identifying robust transcript biomarkers.

METHODS: In this study, we have undertaken a meta-analysis of publicly available transcriptomic data spanning datasets and technologies from the last decade and encompassing laser capture microdissected and macrodissected sample sets.

RESULTS: We identified a 33-gene signature that can discriminate between benign tissue controls and localised prostate cancers irrespective of detection platform or dissection status. These genes were significantly overexpressed in localised prostate cancer versus benign tissue in at least three datasets within the Oncomine Compendium of Expression Array Data. In addition, they were also overexpressed in a recent exon-array dataset as well as in a prostate cancer RNA-seq dataset generated as part of The Cancer Genome Atlas (TCGA) initiative. Biologically, glycosylation was the single enriched process associated with this 33-gene signature, encompassing four glycosylating enzymes. We went on to evaluate the performance of this signature against three individual markers of prostate cancer, v-ets avian erythroblastosis virus E26 oncogene homolog (ERG) expression, prostate specific antigen (PSA) expression and androgen receptor (AR) expression, in an additional independent dataset. Our signature had greater discriminatory power than these markers both for localised cancer and for metastatic disease relative to benign tissue and, in the case of metastasis, also relative to localised prostate cancer.

CONCLUSION: In conclusion, robust transcript biomarkers are present within datasets assembled over many years and cohorts and our study provides both examples and a strategy for refining and comparing datasets to obtain additional markers as more data are generated.

Relevance: 90.00%

Abstract:

Background: Nicotiana benthamiana has been widely used for transient gene expression assays and as a model plant in the study of plant-microbe interactions, lipid engineering and RNA silencing pathways. Assembling the sequence of its transcriptome provides information that, in conjunction with the genome sequence, will facilitate gaining insight into the plant's capacity for high-level transient transgene expression, generation of mobile gene silencing signals, and hyper-susceptibility to viral infection. Methodology/Results: RNA-seq libraries from 9 different tissues were deep sequenced and assembled, de novo, into a representation of the transcriptome. The assembly, of 16 GB of sequence, yielded 237,340 contigs, clustering into 119,014 transcripts (unigenes). Between 80 and 85% of reads from all tissues could be mapped back to the full transcriptome. Approximately 63% of the unigenes exhibited a match to the Solgenomics tomato predicted proteins database. Approximately 94% of the Solgenomics N. benthamiana unigene set (16,024 sequences) matched our unigene set (119,014 sequences). Using homology searches we identified 31 homologues that are involved in RNAi-associated pathways in Arabidopsis thaliana, and show that they possess the domains characteristic of these proteins. Of these genes, the RNA-dependent RNA polymerase gene, Rdr1, is transcribed but has a 72 nt insertion in exon 1 that would cause premature termination of translation. Dicer-like 3 (DCL3) appears to lack both the DEAD helicase motif and the second dsRNA binding motif, and DCL2 and AGO4b have unexpectedly high levels of transcription. Conclusions: The assembled and annotated representation of the transcriptome and list of RNAi-associated sequences are accessible at www.benthgenome.com alongside a draft genome assembly. These genomic resources will be very useful for further study of the developmental, metabolic and defense pathways of N. benthamiana and in understanding the mechanisms behind the features which have made it such a well-used model plant. © 2013 Nakasugi et al.

Relevance: 90.00%

Abstract:

Background: Strand-specific RNAseq data are now more common in RNAseq projects, and visualizing RNAseq data has become an important part of the analysis of sequencing data. The most widely used visualization tool is the UCSC genome browser, which introduced the custom track concept that enables researchers to simultaneously visualize gene expression at a particular locus from multiple experiments. The objective of our software tool is to provide a friendly interface for the visualization of RNAseq datasets. Results: This paper introduces a visualization tool (RNASeqBrowser) that incorporates and extends the functionality of the UCSC genome browser. For example, RNASeqBrowser simultaneously displays read coverage, SNPs, InDels and raw read tracks alongside other BED and wiggle tracks, all built dynamically from the BAM file. Paired reads are also connected in the browser to enable easier identification of novel exon/intron borders and chimaeric transcripts. Strand-specific RNAseq data are also supported: RNASeqBrowser displays reads above (positive strand transcripts) or below (negative strand transcripts) a central line. Finally, RNASeqBrowser was designed for ease of use by users with few bioinformatic skills, and incorporates the features of many genome browsers into one platform. Conclusions: The features of RNASeqBrowser are as follows. (1) RNASeqBrowser integrates the UCSC genome browser and NGS visualization tools such as IGV. It extends the functionality of the UCSC genome browser by adding several new types of tracks to show NGS data such as individual raw reads, SNPs and InDels. (2) RNASeqBrowser can dynamically generate RNA secondary structure, which is useful for identifying non-coding RNA such as miRNA. (3) Overlaying NGS wiggle data is helpful in displaying differential expression and is simple to implement in RNASeqBrowser. (4) NGS data accumulate a large number of raw reads, so RNASeqBrowser collapses exact duplicate reads to reduce visualization space. Ordinary PCs can show many windows of individual NGS raw reads without much delay. (5) Multiple popup windows of individual raw reads provide users with more viewing space. This avoids the approach of existing tools (such as IGV), which squeeze all raw reads into one window, and is helpful for visualizing multiple datasets simultaneously. RNASeqBrowser and its manual are freely available at http://www.australianprostatecentre.org/research/software/rnaseqbrowser or http://sourceforge.net/projects/rnaseqbrowser/
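Feature (4), collapsing exact duplicate reads so that fewer items have to be drawn, can be sketched in a few lines. This is only an illustration of the general idea under stated assumptions: the tuple layout (chromosome, start, strand, sequence), the example reads and the function name are hypothetical, and nothing here is taken from RNASeqBrowser's actual implementation or its BAM handling.

```python
from collections import Counter


def collapse_exact_duplicates(reads):
    """Collapse identical reads into (read, count) pairs.

    Each read is a hashable tuple (chromosome, start, strand, sequence).
    Two reads are treated as exact duplicates only if all four fields match.
    The result is sorted by chromosome name (as a string) and start position.
    """
    counts = Counter(reads)
    return sorted(counts.items(), key=lambda item: (item[0][0], item[0][1]))


# Hypothetical reads: the first three are identical, the fourth differs
# only in strand, and the fifth starts at another position.
reads = [
    ("chr1", 1000, "+", "ACGTACGT"),
    ("chr1", 1000, "+", "ACGTACGT"),
    ("chr1", 1000, "+", "ACGTACGT"),
    ("chr1", 1000, "-", "ACGTACGT"),
    ("chr1", 1010, "+", "TTGACCAA"),
]

collapsed = collapse_exact_duplicates(reads)
for read, count in collapsed:
    print(read, "x", count)
```

Keeping the count next to each collapsed read lets a viewer draw one glyph per distinct read while still showing how many raw reads it represents.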