915 results for RNA-seq


Relevance:

100.00%

Abstract:

Human neurodegenerative diseases, such as Parkinson’s disease (PD) and the neuromuscular disorders called dystroglycanopathies (DGPs), cause retinal impairments. We have used RNA-Seq technology to catalog all known genes linked to PD and DGPs expressed in the human retina and quantitate their mRNA levels in terms of FPKM. We have also characterized their expression profiles in the retina by determining their exonic, intronic and exon-intron junction expression levels, as well as the alternative splicing pattern of particular genes. We believe these data could pave the way toward understanding the molecular bases of sight deficiencies associated with neurodegenerative disorders.
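The abstract reports mRNA levels as FPKM (fragments per kilobase of transcript per million mapped fragments). As a minimal sketch of how that unit is usually computed, the function below applies the standard definition; the function name and the example numbers are hypothetical and are not taken from the study.

```python
def fpkm(fragments: int, gene_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments)
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp gene
# in a library of 20 million mapped fragments.
example_fpkm = fpkm(500, 2_000, 20_000_000)
```

Normalizing by both gene length and library size is what makes values comparable across genes of different sizes and across runs of different depth.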

Relevance:

100.00%

Abstract:

MOTIVATION: Data from RNA-seq experiments provide us with many new possibilities to gain insights into biological and disease mechanisms of cellular functioning. However, the reproducibility and robustness of RNA-seq data analysis results are often unclear. This is in part attributed to the two counteracting goals of (a) a cost-efficient and (b) an optimal experimental design, leading to a compromise, e.g., in the sequencing depth of experiments.

RESULTS: We introduce an R package called samExploreR that allows the subsampling (m out of n bootstrapping) of short reads based on SAM files, facilitating the investigation of sequencing depth-related questions for the experimental design. Overall, this provides a systematic way for exploring the reproducibility and robustness of general RNA-seq studies. We exemplify the usage of samExploreR by studying the influence of the sequencing depth and the annotation on the identification of differentially expressed genes.

AVAILABILITY: samExploreR is available as an R package from Bioconductor (after acceptance of the paper, download link: http://www.bio-complexity.com/samExploreR_1.0.0.tar.gz).
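The subsampling idea described in this abstract can be illustrated with a short sketch. It is written in Python and is not samExploreR's API: the function name, the `fraction` and `replace` parameters, and the toy read identifiers are all assumptions made for illustration. It draws m reads out of n, with replacement by default, to mimic a shallower sequencing run.

```python
import random
from typing import List, Sequence


def subsample_reads(reads: Sequence[str], fraction: float,
                    seed: int = 0, replace: bool = True) -> List[str]:
    """Draw m = round(fraction * n) reads out of n (hypothetical helper)."""
    if not reads:
        raise ValueError("no reads to sample from")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # seeded so a run can be reproduced
    m = max(1, round(fraction * len(reads)))
    if replace:
        # m-out-of-n bootstrap: sample with replacement
        return [rng.choice(reads) for _ in range(m)]
    # plain subsampling: sample without replacement
    return rng.sample(list(reads), m)


# Toy read identifiers standing in for SAM records.
toy_reads = [f"read{i}" for i in range(8)]
quarter_depth = subsample_reads(toy_reads, 0.25)
```

Repeating the draw with different seeds and re-running the downstream analysis on each subsample is one way to see how stable a result is at a given depth.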


Relevance:

100.00%

Abstract:

Cnidarians are often considered simple animals, but the more than 13,000 estimated species (e.g., corals, hydroids and jellyfish) of the early diverging phylum exhibit a broad diversity of forms, functions and behaviors, some of which are demonstrably complex. In particular, cubozoans (box jellyfish) are cnidarians that have evolved a number of distinguishing features. Some cubozoan species possess complex mating behaviors or particularly potent stings, and all possess well-developed light sensation involving image-forming eyes. Like all cnidarians, cubozoans have specialized subcellular structures called nematocysts that are used in prey capture and defense. The objective of this study is to contribute to the development of the box jellyfish Alatina alata as a model cnidarian. This cubozoan species offers numerous advantages for investigating morphological and molecular traits underlying complex processes and coordinated behavior in free-living medusozoans (i.e., jellyfish), and more broadly throughout Metazoa. First, I provide an overview of Cnidaria with an emphasis on the current understanding of genes and proteins implicated in complex biological processes in a few select cnidarians. Second, to further develop resources for A. alata, I provide a formal redescription of this cubozoan and establish a neotype specimen voucher, which serve to stabilize the taxonomy of the species. Third, I generate the first functionally annotated transcriptome of adult and larval A. alata tissue and apply preliminary differential expression analyses to identify candidate genes implicated broadly in biological processes related to prey capture and defense, vision and the phototransduction pathway and sexual reproduction and gametogenesis. Fourth, to better understand venom diversity and mechanisms controlling venom synthesis in A. alata, I use bioinformatics to investigate gene candidates with dual roles in venom and digestion, and review the biology of prey capture and digestion in cubozoans. The morphological and molecular resources presented herein contribute to understanding the evolution of cubozoan characteristics and serve to facilitate further research on this emerging cubozoan model.

Relevance:

70.00%

Abstract:

Telomerase RNAs (TERs) are highly divergent between species, varying in size and sequence composition. Here, we identify a candidate for the telomerase RNA component of the Leishmania genus, which includes species that cause leishmaniasis, a neglected tropical disease. Combining a thorough computational screening with RNA-seq evidence, we mapped a non-coding RNA gene localized in a syntenic locus on chromosome 25 of five Leishmania species that shares partial synteny with both the Trypanosoma brucei TER locus and a putative TER candidate-containing locus of Crithidia fasciculata. Using target-driven molecular biology approaches, we detected a ∼2,100 nt transcript (LeishTER) that contains a 5' spliced leader (SL) cap, a putative 3' polyA tail and a predicted C/D box snoRNA domain. LeishTER is expressed at similar levels in the logarithmic and stationary growth phases of promastigote forms. A 5'SL capped LeishTER co-immunoprecipitated and co-localized with the telomerase protein component (TERT) in a cell cycle-dependent manner. Prediction of its secondary structure strongly suggests the existence of a bona fide single-stranded template sequence and a conserved C[U/C]GUCA motif-containing helix II, representing the template boundary element. This study paves the way for further investigations on the biogenesis of parasite TERT ribonucleoproteins (RNPs) and their role in parasite telomere biology.

Relevance:

70.00%

Abstract:

In vivo transcriptional analyses of microbial pathogens are often hampered by low proportions of pathogen biomass in host organs, hindering the coverage of the full pathogen transcriptome. We aimed to address the transcriptome profiles of Candida albicans, the most prevalent fungal pathogen in systemically infected immunocompromised patients, during systemic infection in different hosts. We developed a strategy for high-resolution quantitative analysis of the C. albicans transcriptome directly from early and late stages of systemic infection in two different host models, mouse and the insect Galleria mellonella. Our results show that transcriptome sequencing (RNA-seq) libraries were enriched for fungal transcripts up to 1,600-fold using biotinylated bait probes to capture C. albicans sequences. This enrichment biased the read counts of only ~3% of the genes, which can be identified and removed based on a priori criteria. This allowed an unprecedented resolution of the C. albicans transcriptome in vivo, with detection of over 86% of its genes. The transcriptional response of the fungus was surprisingly similar during infection of the two hosts and at the two time points, although some host- and time point-specific genes could be identified. Genes that were highly induced during infection were involved, for instance, in stress response, adhesion, iron acquisition, and biofilm formation. Of the in vivo-regulated genes, 10% are still of unknown function, and their future study will be of great interest. The fungal RNA enrichment procedure used here will help to better characterize the C. albicans response in infected hosts and may be applied to other microbial pathogens. IMPORTANCE: Understanding the mechanisms utilized by pathogens to infect and cause disease in their hosts is crucial for rational drug development. Transcriptomic studies may help investigations of these mechanisms by determining which genes are expressed specifically during infection.
This task has been difficult so far, since the proportion of microbial biomass in infected tissues is often extremely low, thus limiting the depth of sequencing and comprehensive transcriptome analysis. Here, we adapted a technology to capture and enrich C. albicans RNA, which was next used for deep RNA sequencing directly from infected tissues from two different host organisms. The high-resolution transcriptome revealed a large number of genes that were so far unknown to participate in infection, which will likely constitute a focus of study in the future. More importantly, this method may be adapted to perform transcript profiling of any other microbes during host infection or colonization.

Relevance:

70.00%

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

70.00%

Abstract:

My dissertation focuses on two aspects of RNA sequencing technology. The first is the methodology for modeling the overdispersion inherent in RNA-seq data for differential expression analysis. This aspect is addressed in three sections. The second aspect is the application of RNA-seq data to identify the CpG island methylator phenotype (CIMP) by integrating datasets of mRNA expression level and DNA methylation status. Section 1: The cost of DNA sequencing has fallen dramatically in the past decade. Consequently, genomic research increasingly depends on sequencing technology. However, it remains unclear how sequencing capacity influences the accuracy of mRNA expression measurement. We observe that accuracy improves with increasing sequencing depth. To model the overdispersion, we use the beta-binomial distribution with a new parameter indicating the dependency between overdispersion and sequencing depth. Our modified beta-binomial model performs better than the binomial or the pure beta-binomial model, with a lower false discovery rate. Section 2: Although a number of methods have been proposed to accurately analyze differential RNA expression at the gene level, modeling at the base pair level is required. Here, we find that the overdispersion rate decreases as the sequencing depth increases at the base pair level. We also propose four models and compare them with each other. As expected, our beta-binomial model with a dynamic overdispersion rate is shown to be superior. Section 3: We investigate biases in RNA-seq by exploring the measurement of the external control, spike-in RNA. This study is based on two datasets with spike-in controls obtained from a recent study. We observe a previously undiscovered bias in the measurement of the spike-in transcripts that arises from the influence of the sample transcripts in RNA-seq. We also find that this influence is related to the local sequence of the random hexamer that is used in priming.
We suggest a model of the inequality between samples to correct this type of bias. Section 4: The expression of a gene can be turned off when its promoter is highly methylated. Several studies have reported that a clear threshold effect exists in gene silencing that is mediated by DNA methylation. It is reasonable to assume that the thresholds are specific to each gene. It is also intriguing to investigate genes that are largely controlled by DNA methylation. These genes are called “L-shaped” genes. We develop a method to determine the DNA methylation threshold and identify a new CIMP of BRCA. In conclusion, we provide a detailed understanding of the relationship between the overdispersion rate and sequencing depth. We also reveal a new bias in RNA-seq and provide a detailed understanding of the relationship between this new bias and the local sequence. Finally, we develop a powerful method to dichotomize methylation status and consequently identify a new CIMP of breast cancer with a distinct classification of molecular characteristics and clinical features.
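To make the notion of overdispersion in this abstract concrete, the sketch below compares the variance of a binomial count with that of a standard beta-binomial count. It uses only the textbook beta-binomial parameterization, where rho = 1 / (alpha + beta + 1). The dissertation's additional depth-dependent parameter is not specified in the abstract and is not modeled here; the function names and example numbers are illustrative assumptions.

```python
def binomial_variance(n: int, p: float) -> float:
    """Variance of a Binomial(n, p) count."""
    return n * p * (1.0 - p)


def beta_binomial_variance(n: int, p: float, rho: float) -> float:
    """Variance of a beta-binomial count with mean n * p.

    rho in [0, 1) is the overdispersion (intra-class correlation),
    rho = 1 / (alpha + beta + 1) for Beta(alpha, beta) mixing.
    The variance is inflated by the factor 1 + (n - 1) * rho.
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must be in [0, 1)")
    return binomial_variance(n, p) * (1.0 + (n - 1) * rho)


# Hypothetical example: 100 reads covering a site, half expected from one condition.
plain = binomial_variance(100, 0.5)
inflated = beta_binomial_variance(100, 0.5, 0.01)
```

With rho = 0 the two variances coincide, which is why the binomial model appears as the baseline in the comparison described above.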

Relevance:

70.00%

Abstract:

The importance of Helicobacter pylori as a human pathogen is underlined by the plethora of diseases it is responsible for. The capacity of H. pylori to adapt to the restricted host-associated environment and to evade the host immune response largely depends on a streamlined signalling network. The peculiarly small genome of H. pylori, combined with its paucity of transcriptional regulators, highlights the relevance of post-transcriptional regulatory mechanisms such as small non-coding RNAs (sRNAs). However, among the 8 RNases represented in the H. pylori genome, a regulator guiding sRNA metabolism is still not well studied. We investigated for the first time the physiological role of the RNase Y enzyme in the H. pylori G27 strain. In the first line of research we provide a comprehensive characterization of the RNase Y activity by analysing its genomic organization and the factors that orchestrate its expression. Then, based on bioinformatic prediction models, we depict the most relevant determinants of RNase Y function, demonstrating a correlation of both structure and domain organization with orthologues represented in Gram-positive bacteria. To unveil the post-transcriptional regulatory effect exerted by RNase Y, we compared the transcriptome of an RNase Y knock-out mutant to the parental wild type strain by an RNA-seq approach. In the second line of research we characterized the activity of this single strand specific endoribonuclease on the cag-PAI non coding RNA 1 (CncR1) sRNA. We found that deletion or inactivation of RNase Y led to the accumulation of a 3’-extended CncR1 (CncR1-L) transcript over time. Moreover, besides its increased half-life, CncR1-L resembled a CncR1 inactive phenotype. Finally, we focused on the characterization of the in vivo interactome of CncR1. We set up a preliminary MS2-affinity purification coupled with RNA-sequencing (MAPS) approach and we evaluated the enrichment of specific targets, demonstrating the suitability of the technique in the H. pylori G27 strain.

Relevance:

60.00%

Abstract:

Hevea brasiliensis (Willd. Ex Adr. Juss.) Muell.-Arg., which is native to the Amazon rainforest, is the primary source of natural rubber. The singular properties of natural rubber make it superior to and competitive with synthetic rubber for use in several applications. Here, we performed RNA sequencing (RNA-seq) of H. brasiliensis bark on the Illumina GAIIx platform, which generated 179,326,804 raw reads. A total of 50,384 contigs that were over 400 bp in size were obtained and subjected to further analyses. A similarity search against the non-redundant (nr) protein database returned 32,018 (63%) positive BLASTx hits. The transcriptome was annotated using the clusters of orthologous groups (COG), gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Pfam databases. A search for putative molecular markers was performed to identify simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). In total, 17,927 SSRs and 404,114 SNPs were detected. Finally, we selected sequences that were identified as belonging to the mevalonate (MVA) and 2-C-methyl-D-erythritol 4-phosphate (MEP) pathways, which are involved in rubber biosynthesis, to validate the SNP markers. A total of 78 SNPs were validated in 36 genotypes of H. brasiliensis. This new dataset represents a powerful information source for rubber tree bark genes and will be an important tool for the development of microsatellites and SNP markers for use in future genetic analyses such as genetic linkage mapping, quantitative trait loci identification, investigations of linkage disequilibrium and marker-assisted selection.

Relevance:

60.00%

Abstract:

Witches' broom disease (WBD), caused by the hemibiotrophic fungus Moniliophthora perniciosa, is one of the most devastating diseases of Theobroma cacao, the chocolate tree. In contrast to other hemibiotrophic interactions, the WBD biotrophic stage lasts for months and is responsible for the most distinctive symptoms of the disease, which comprise drastic morphological changes in the infected shoots. Here, we used the dual RNA-seq approach to simultaneously assess the transcriptomes of cacao and M. perniciosa during their peculiar biotrophic interaction. Infection with M. perniciosa triggers massive metabolic reprogramming in the diseased tissues. Although apparently vigorous, the infected shoots are energetically expensive structures characterized by the induction of ineffective defense responses and by a clear carbon deprivation signature. Remarkably, the infection culminates in the establishment of a senescence process in the host, which signals the end of the WBD biotrophic stage. We analyzed the pathogen's transcriptome in unprecedented detail and thereby characterized the fungal nutritional and infection strategies during WBD and identified putative virulence effectors. Interestingly, M. perniciosa biotrophic mycelia develop as long-term parasites that orchestrate changes in plant metabolism to increase the availability of soluble nutrients before plant death. Collectively, our results provide unique insight into an intriguing tropical disease and advance our understanding of the development of (hemi)biotrophic plant-pathogen interactions.

Relevance:

60.00%

Abstract:

We report the first quantitative and qualitative analysis of the poly(A)+ transcriptome of two human mammary cell lines differentially expressing ERBB2 (human epidermal growth factor receptor 2), an oncogene over-expressed in approximately 25% of human breast tumors. Full-length cDNA populations from the two cell lines were digested enzymatically, individually tagged according to a customized method for library construction, and simultaneously sequenced on the Titanium 454-Roche platform. Comprehensive bioinformatics analysis followed by experimental validation confirmed novel genes, splicing variants, single nucleotide polymorphisms, and gene fusions indicated by RNA-seq data from both samples. Moreover, comparative analysis showed enrichment in alternative events, especially in the exon usage category, in ERBB2 over-expressing cells, indicating regulation of alternative splicing mediated by the oncogene. Alterations in expression levels of genes such as LOX, ATP5L, GALNT3, and MME revealed by large-scale sequencing were confirmed between cell lines as well as in tumor specimens with different ERBB2 backgrounds. This approach was shown to be suitable for structural, quantitative, and qualitative assessment of complex transcriptomes and revealed new events mediated by ERBB2 overexpression, in addition to potential molecular targets for breast cancer that are driven by this oncogene.

Relevance:

60.00%

Abstract:

Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drug design, as well as for planning new high-throughput experiments. Methods have been developed for gene network modeling and identification from expression profiles. However, an important open problem is how to validate such approaches and their results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdős-Rényi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabási-Albert (BA), and geographical networks (GG).
The experimental results indicate that the inference method was sensitive to variation in the average degree k, with its network recovery rate decreasing as k increased. The signal size was important for the inference method to achieve better accuracy in the network identification rate, presenting very good results with small expression profiles. However, the adopted inference method was not sensitive enough to recognize distinct structures of interaction among genes, presenting a similar behavior when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, and it can be extended to other inference methods.
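Steps (1) and (3) of the framework described above can be sketched in a few lines. This is only an illustration under stated assumptions: the network is treated as undirected, only the Erdős-Rényi model is generated, and the function names and the precision/recall scoring are hypothetical choices, since the abstract does not say which similarity measure the authors used.

```python
import random
from itertools import combinations
from typing import Dict, Set, Tuple

Edge = Tuple[int, int]  # undirected edge stored as (smaller id, larger id)


def erdos_renyi(n_genes: int, p: float, seed: int = 0) -> Set[Edge]:
    """Uniformly random network: each gene pair is linked with probability p."""
    rng = random.Random(seed)
    return {(i, j) for i, j in combinations(range(n_genes), 2)
            if rng.random() < p}


def edge_recovery(original: Set[Edge], inferred: Set[Edge]) -> Dict[str, float]:
    """Compare an inferred network with the network that generated the data."""
    true_positives = len(original & inferred)
    precision = true_positives / len(inferred) if inferred else 0.0
    recall = true_positives / len(original) if original else 0.0
    return {"precision": precision, "recall": recall}


# Toy comparison: two of three true edges recovered, one spurious edge.
truth = {(0, 1), (1, 2), (2, 3)}
guess = {(0, 1), (1, 2), (0, 3)}
scores = edge_recovery(truth, guess)
```

Because the original network is known by construction, any inference method can be scored this way without relying on incomplete biological ground truth.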

Relevance:

60.00%

Abstract:

Alternative splicing of gene transcripts greatly expands the functional capacity of the genome, and certain splice isoforms may indicate specific disease states such as cancer. Splice junction microarrays interrogate thousands of splice junctions, but data analysis is difficult and error prone because of the increased complexity compared to differential gene expression analysis. We present Rank Change Detection (RCD) as a method to identify differential splicing events based upon a straightforward probabilistic model comparing the over- or underrepresentation of two or more competing isoforms. RCD has advantages over commonly used methods because it is robust to false positive errors due to nonlinear trends in microarray measurements. Further, RCD does not depend on prior knowledge of splice isoforms, yet it takes advantage of the inherent structure of mutually exclusive junctions, and it is conceptually generalizable to other types of splicing arrays or RNA-Seq. RCD specifically identifies the biologically important cases when a splice junction becomes more or less prevalent compared to other mutually exclusive junctions. The example data are from different cell lines of glioblastoma tumors assayed with Agilent microarrays.

Relevance:

60.00%

Abstract:

Papaya (Carica papaya L.) is one of the most widely cultivated fruit trees in the tropical and subtropical regions of the world. Brazil is among the countries that produce and export the most papaya worldwide. Espírito Santo and Bahia account for more than 70% of the Brazilian area producing this fruit. However, diseases caused by infectious microorganisms considerably affect its production. Among the main diseases is papaya sticky disease (meleira), caused by Papaya meleira virus (PMeV), for which no resistant cultivar yet exists. Interestingly, symptoms are triggered only after fruiting. The molecular mechanisms involved in symptom development and in the plant's defense response to PMeV have not yet been elucidated. To understand the key points of this interaction, which would allow the development of genetic improvement methodologies, a transcriptomic study was undertaken. RNA-seq technology was used to sequence the transcriptome of plants 3, 6 and 8 months after planting, inoculated and not inoculated with PMeV. The genes differentially expressed at the 3 time points and in the two conditions were predicted and analyzed. These analyses revealed a general expression pattern of the genes involved in this interaction. Twenty-one genes were found with an altered expression profile in inoculated plants exclusively at six months of age. Of these, 8 genes involved in defense response and cell death, stress response, and response to biotic and abiotic stimuli were repressed, while the remaining 13 genes, involved mainly in primary metabolic processes, biogenesis, differentiation and cell cycle, cell communication and growth, as well as processes involved in reproduction and flowering development, were overexpressed.
These results suggest that, at six months of age, the plant is forced to alter its gene expression program, directing the response toward the developmental processes required at this physiological stage, which take precedence over the stress response, a fact that ultimately leads to the development of symptoms.

Relevance:

60.00%

Abstract:

Doctoral thesis, Marine Sciences, specialty in Marine Biology, 19 December 2015, Universidade dos Açores.