88 results for Tiling Arrays

at Université de Lausanne, Switzerland


Relevance: 100.00%

Abstract:

ABSTRACT: Identification of small polymorphisms from next generation sequencing short read data is relatively easy, but detection of larger deletions is less straightforward. Here, we analyzed four divergent Arabidopsis accessions and found that intersection of absent short read coverage with weak tiling array hybridization signal reliably flags deletions. Interestingly, individual deletions were frequently observed in two or more of the accessions examined, suggesting that variation in gene content partly reflects a common history of deletion events.
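The intersection step described in this abstract can be sketched as follows. This is only an illustration: it assumes both evidence tracks have already been reduced to sorted, non-overlapping intervals on one chromosome. The names `intersect` and `candidate_deletions`, the `min_len` threshold and all coordinates are hypothetical and do not come from the study.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Overlaps between two sorted, non-overlapping interval lists."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def candidate_deletions(no_coverage, weak_signal, min_len=100):
    """Regions supported by both tracks and at least min_len bp long (assumed cutoff)."""
    return [(s, e) for s, e in intersect(no_coverage, weak_signal) if e - s >= min_len]


# Made-up coordinates, for illustration only.
no_coverage = [(1000, 1800), (5000, 5050), (9000, 9600)]   # zero short-read coverage
weak_signal = [(900, 1700), (5000, 5200), (9100, 9900)]    # weak array hybridization
print(candidate_deletions(no_coverage, weak_signal))
```

Requiring both signals is what filters out regions that lack reads only because of mapping problems, or that hybridize weakly only because of probe behavior.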

Relevance: 60.00%

Abstract:

Functional RNA structures play an important role both in noncoding RNA transcripts and as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack characteristic signals in primary sequence, comparative approaches evaluating evolutionary conservation of structures are most promising. We have used three recently introduced programs based on either phylogenetic-stochastic context-free grammar (EvoFold) or energy-directed folding (RNAz and AlifoldZ), yielding several thousand candidate structures (corresponding to approximately 2.7% of the ENCODE regions). EvoFold has its highest sensitivity in highly conserved and relatively AU-rich regions, while RNAz favors slightly GC-rich regions, resulting in a relatively small overlap between methods. Comparison with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3'-UTRs. While we estimate a significant false discovery rate of approximately 50%-70%, many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz and EvoFold, and an additional 239 RNAz or EvoFold predictions are supported by the (more stringent) AlifoldZ algorithm. Five hundred seventy RNAz structure predictions fall into regions that show signs of selection pressure also on the sequence level (i.e., conserved elements). More than 700 predictions overlap with noncoding transcripts detected by oligonucleotide tiling arrays. One hundred seventy-five selected candidates were tested by RT-PCR in six tissues, and expression could be verified in 43 cases (24.6%).

Relevance: 60.00%

Abstract:

The study of transcription using genomic tiling arrays has led to the identification of numerous additional exons. One example is the MECP2 gene on the X chromosome; using 5'RACE and RT-PCR in human tissues and cell lines, we have found more than 70 novel exons (RACEfrags) connecting to at least one annotated exon. We sequenced all MECP2-connected exons and flanking sequences in 3 groups: 46 patients with Rett syndrome and without mutations in the currently annotated exons of the MECP2 and CDKL5 genes; 32 patients with Rett syndrome and identified mutations in the MECP2 gene; and 100 control individuals from the same geoethnic group. Approximately 13 kb were sequenced per sample (2.4 Mb of DNA resequencing). A total of 75 individuals had novel rare variants (mostly private variants), but no statistically significant difference was found among the 3 groups. These results suggest that variants in the newly discovered exons may not contribute to Rett syndrome. Interestingly, however, there are about twice as many variants in the novel exons as in the flanking sequences (44 vs. 21 for approximately 1.3 Mb sequenced for each class of sequences, p=0.0025). Thus the evolutionary forces that shape these novel exons may differ from those acting on neighboring sequences.
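The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: the abstract does not name the statistical test used, so the one-sided exact binomial test here is an assumed stand-in and is not expected to reproduce p=0.0025 exactly. The helper `binom_upper_tail` is a hypothetical name.

```python
from math import comb

# Cohort sizes as stated in the abstract.
cohort = {"Rett, no known mutation": 46, "Rett, MECP2 mutation": 32, "controls": 100}
samples = sum(cohort.values())
total_mb = samples * 13 / 1000   # about 13 kb per sample, converted to Mb


def binom_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Variant counts in novel exons vs. flanking sequence. Both classes cover about
# 1.3 Mb, so under the null hypothesis each variant is equally likely to fall
# in either class (an assumption of this sketch).
novel, flanking = 44, 21
p_one_sided = binom_upper_tail(novel, novel + flanking)

print(samples, round(total_mb, 2), round(novel / flanking, 2), p_one_sided)
```

The product of 178 samples and 13 kb comes to roughly 2.3 Mb, consistent with the quoted 2.4 Mb given that the per-sample figure is approximate, and 44/21 is close to the "about twice" ratio stated in the text.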

Relevance: 60.00%

Abstract:

Recently, the introduction of second generation sequencing and further advancements in confocal microscopy have enabled system-level studies for the functional characterization of genes. The degree of complexity intrinsic to these approaches calls for the development of bioinformatics methodologies and computational models for extracting meaningful biological knowledge from the enormous amount of experimental data that is continuously generated. This PhD thesis presents several novel bioinformatics methods and computational models to address specific biological questions in plant biology by using the plant Arabidopsis thaliana as a model system. First, a spatio-temporal qualitative analysis of quantitative transcript and protein profiles is applied to show the role of the BREVIS RADIX (BRX) protein in the auxin-cytokinin crosstalk for root meristem growth. The core of this PhD work is the functional characterization of the interplay between the BRX protein and the plant hormone auxin in the root meristem by using a computational model based on experimental evidence. Hypotheses generated by the model led to the discovery of a differential endocytosis pattern in the root meristem that splits the auxin transcriptional response via the plasma membrane to nucleus partitioning of BRX. This positional information system creates an auxin transcriptional pattern that deviates from the canonical auxin response and is necessary to sustain the expression of a subset of BRX-dependent auxin-responsive genes to drive root meristem growth. In the second part of this PhD thesis, we characterized the genome-wide impact of large-scale deletions on four divergent Arabidopsis natural strains, through the integration of Ultra-High Throughput Sequencing data with data from genomic hybridizations on tiling arrays. Analysis of the identified deletions revealed a considerable portion of protein-coding genes affected and supported a history of genomic rearrangements shaped by evolution.
In the last part of the thesis, we showed that the VIP3 gene in Arabidopsis has an evolutionarily conserved role in the 3' to 5' mRNA degradation machinery, by applying a novel approach for the analysis of mRNA-Seq data from random-primed mRNA. Altogether, this PhD research contains major advancements in the study of natural genomic variation in plants and in the application of computational morphodynamics models for the functional characterization of biological pathways essential for the plant. - Recently, the introduction of second-generation sequencing and advances in confocal microscopy have enabled studies at the scale of whole cellular systems for the functional characterization of genes. The degree of complexity intrinsic to these approaches has required the development of bioinformatics methodologies and mathematical models to extract meaningful biological information from the mass of experimental data generated. This doctoral thesis presents both original bioinformatics methods and mathematical models to answer specific questions in plant biology, using the plant Arabidopsis thaliana as a model. First, a qualitative spatio-temporal analysis of quantitative transcript and protein profiles is used to show the role of the BREVIS RADIX (BRX) protein in the crosstalk between auxin and cytokinins, two classes of phytohormones, in root meristem growth. The core of this thesis work is the functional characterization of the interaction between the BRX protein and the phytohormone auxin in the root meristem, using computational models based on experimental evidence. The hypotheses generated by the model led to the discovery of a differential endocytosis pattern in the root meristem that splits the transcriptional response to auxin through the partitioning of BRX from the plasma membrane to the cell nucleus.
This positional information creates a transcriptional response to auxin that deviates from the canonical auxin response and is necessary to sustain the expression of a subset of BRX-dependent, auxin-responsive genes that drive meristem growth. In the second part of this doctoral thesis, we characterized the genome-wide impact of large-scale deletions in four divergent natural strains of Arabidopsis, by integrating ultra-high-throughput sequencing with genomic hybridization on DNA chips. Analysis of the identified deletions revealed that a considerable proportion of protein-coding genes was affected, supporting the idea of a history of genomic rearrangement shaped during evolution. In the last part of this thesis, we showed that the VIP3 gene in Arabidopsis has an evolutionarily conserved role in the 3' to 5' mRNA degradation machinery, by applying a new approach to the analysis of mRNA sequencing data derived from random-primed transcripts. Taken together, this doctoral research contains major advances in the study of natural genomic variation in plants and in the application of computational morphodynamic models to the functional characterization of biological networks essential to the plant. - Plant development is written in the genetic code of plants. To understand how plants are able to adapt to environmental changes, it is essential to study how their genes govern their formation. The more we try to understand how a plant works, the more we realize the complexity of the biological mechanisms, to the point that the use of mathematical tools and models becomes indispensable.
In this work, using the model plant Arabidopsis thaliana, we solved specific biological problems through the development and application of concrete computational methods. First, we investigated how the BREVIS RADIX (BRX) gene regulates root development by controlling the response to two hormones: auxin and cytokinin. We applied a statistical analysis to quantitative measurements of transcripts and gene products to demonstrate that BRX plays an antagonistic role in the crosstalk between these two hormones. When this molecular crosstalk is disrupted, the length of the primary root is dramatically reduced. To understand how BRX responds to auxin, we developed a computational model based on experimental results. Successive simulations led to the discovery of a positional signal that controls the root's response to auxin by regulating the intracellular movement of BRX. In the second part of this thesis, we analyzed the whole genome of four natural strains of Arabidopsis and found that a large share of their genes was missing relative to the reference strain. This result indicates that the history of genomic modifications driven by evolution determines a differential availability of functional genes in these plants. In the last part of this work, we analyzed transcriptome data from plants in which the VIP3 gene was non-functional. This allowed us to discover the dual role of VIP3 in regulating transcription initiation and in transcript degradation. This dual role had previously been demonstrated only in humans. This doctoral work supports the development and application of computational methodologies as invaluable tools for resolving the complexity of biological problems in plant research.
The integration of plant biology and computer science has become increasingly important for advancing our knowledge of how plants function and develop.

Relevance: 60.00%

Abstract:

In Arabidopsis thaliana, gene expression level polymorphisms (ELPs) between natural accessions that exhibit simple, single locus inheritance are promising quantitative trait locus (QTL) candidates to explain phenotypic variability. It is assumed that such ELPs overwhelmingly represent regulatory element polymorphisms. However, comprehensive genome-wide analyses linking expression level, regulatory sequence and gene structure variation are missing, preventing definite verification of this assumption. Here, we analyzed ELPs observed between the Eil-0 and Lc-0 accessions. Compared with non-variable controls, 5' regulatory sequence variation in the corresponding genes is indeed increased. However, approximately 42% of all the ELP genes also carry major transcription unit deletions in one parent as revealed by genome tiling arrays, representing a >4-fold enrichment over controls. Within the subset of ELPs with simple inheritance, this proportion is even higher and deletions are generally more severe. Similar results were obtained from analyses of the Bay-0 and Sha accessions, using alternative technical approaches. Collectively, our results suggest that drastic structural changes are a major cause for ELPs with simple inheritance, corroborating experimentally observed indel preponderance in cloned Arabidopsis QTL.

Relevance: 60.00%

Abstract:

This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minority splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations.

Relevance: 60.00%

Abstract:

Abstract: The human body is composed of a huge number of cells acting together in a concerted manner. The current understanding is that proteins perform most of the necessary activities in keeping a cell alive. The DNA, on the other hand, stores the information on how to produce the different proteins in the genome. Regulating gene transcription is the first important step that can thus affect the life of a cell, modify its functions and its responses to the environment. Regulation is a complex operation that involves specialized proteins, the transcription factors. Transcription factors (TFs) can bind to DNA and activate the processes leading to the expression of genes into new proteins. Errors in this process may lead to diseases. In particular, some transcription factors have been associated with a lethal pathological state, commonly known as cancer, associated with uncontrolled cellular proliferation, invasiveness of healthy tissues and abnormal responses to stimuli. Understanding cancer-related regulatory programs is a difficult task, often involving several TFs interacting together and influencing each other's activity. This thesis presents new computational methodologies to study gene regulation. In addition, we present applications of our methods to the understanding of cancer-related regulatory programs. The understanding of transcriptional regulation is a major challenge. We address this difficult question by combining computational approaches with large collections of heterogeneous experimental data. In detail, we design signal processing tools to recover transcription factor binding sites on the DNA from genome-wide surveys such as chromatin immunoprecipitation assays on tiling arrays (ChIP-chip). We then use the localization information about TF binding to explain expression levels of regulated genes. In this way we identify a regulatory synergy between two TFs, the oncogene C-MYC and SP1.
C-MYC and SP1 bind preferentially at promoters, and when SP1 binds next to C-MYC on the DNA, the nearby gene is strongly expressed. The association between the two TFs at promoters is reflected by the conservation of the binding sites across mammals and by the permissive underlying chromatin states; it represents an important control mechanism involved in cellular proliferation, and thereby in cancer. Secondly, we identify the characteristics of TF estrogen receptor alpha (hERa) target genes and we study the influence of hERa in regulating transcription. hERa, upon hormone estrogen signaling, binds to DNA to regulate transcription of its targets in concert with its co-factors. To overcome the scarce experimental data about the binding sites of other TFs that may interact with hERa, we conduct in silico analysis of the sequences underlying the ChIP sites using the collection of position weight matrices (PWMs) of hERa partners, TFs FOXA1 and SP1. We combine ChIP-chip and ChIP-paired-end-diTags (ChIP-pet) data about hERa binding on DNA with the sequence information to explain gene expression levels in a large collection of cancer tissue samples and also in studies of the response of cells to estrogen. We confirm that hERa binding sites are distributed anywhere on the genome. However, we distinguish between binding sites near promoters and binding sites along the transcripts. The first group shows weak binding of hERa and high occurrence of SP1 motifs, in particular near estrogen-responsive genes. The second group shows strong binding of hERa and significant correlation between the number of binding sites along a gene and the strength of gene induction in the presence of estrogen. Some binding sites of the second group also show presence of FOXA1, but the role of this TF still needs to be investigated. Different mechanisms have been proposed to explain hERa-mediated induction of gene expression.
Our work supports the model of hERa activating gene expression from distal binding sites by interacting with promoter-bound TFs, like SP1. hERa has been associated with survival rates of breast cancer patients, though explanatory models are still incomplete: this result is important to better understand how hERa can control gene expression. Thirdly, we address the difficult question of regulatory network inference. We tackle this problem by analyzing time series of biological measurements such as quantification of mRNA levels or protein concentrations. Our approach uses the well-established penalized linear regression models, where we impose sparseness on the connectivity of the regulatory network. We extend this method by enforcing the coherence of the regulatory dependencies: a TF must coherently behave as an activator, or a repressor, on all its targets. This requirement is implemented as constraints on the signs of the regressed coefficients in the penalized linear regression model. Our approach is better at reconstructing meaningful biological networks than previous methods based on penalized regression. The method is tested on the DREAM2 challenge of reconstructing a five-gene/TF regulatory network, obtaining the best performance in the "undirected signed excitatory" category. Thus, these bioinformatics methods, which are reliable, interpretable and fast enough to cover large biological datasets, have enabled us to better understand gene regulation in humans.
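The sign constraint on penalized regression coefficients mentioned in this abstract can be illustrated with a small coordinate-descent lasso. This is a simplified sketch and not the thesis's method: it fits a single target gene, assumes the activator/repressor sign of each TF is already known (the thesis enforces coherence across all targets, which presumably involves choosing those signs), and uses a made-up orthogonal design. The function name `sign_constrained_lasso` and all data are hypothetical.

```python
from itertools import product


def sign_constrained_lasso(X, y, signs, lam, n_iter=100):
    """Coordinate descent for 0.5*||y - Xb||^2 + lam*||b||_1, subject to
    b[j] >= 0 where signs[j] == +1 and b[j] <= 0 where signs[j] == -1."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of regressor j with the residual that ignores regressor j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j))
                for i in range(n)
            )
            z = sum(X[i][j] ** 2 for i in range(n))
            if signs[j] > 0:      # activator: soft-threshold, then clip at zero
                b[j] = max(0.0, rho - lam) / z
            else:                 # repressor: mirrored update
                b[j] = min(0.0, rho + lam) / z
    return b


# Toy data: three TF activity profiles (+1/-1) over eight conditions.
X = [list(row) for row in product([-1.0, 1.0], repeat=3)]
y = [2.0 * r[0] - 1.0 * r[1] for r in X]   # TF0 activates, TF1 represses, TF2 is idle

coherent = sign_constrained_lasso(X, y, signs=[+1, -1, +1], lam=0.8)
forced = sign_constrained_lasso(X, y, signs=[+1, +1, +1], lam=0.8)
print(coherent, forced)
```

In the second call, TF1 is wrongly declared an activator, so its coefficient is pinned at zero instead of going negative. That is the effect of the constraint: an edge whose sign contradicts the TF's declared role is dropped from the network.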

Relevance: 60.00%

Abstract:

Rapid amplification of cDNA ends (RACE) is a widely used approach for transcript identification. Random clone selection from the RACE mixture, however, is an ineffective sampling strategy if the dynamic range of transcript abundances is large. To improve sampling efficiency of human transcripts, we hybridized the products of the RACE reaction onto tiling arrays and used the detected exons to delineate a series of reverse-transcriptase (RT)-PCRs, through which the original RACE transcript population was segregated into simpler transcript populations. We independently cloned the products and sequenced randomly selected clones. This approach, RACEarray, is superior to direct cloning and sequencing of RACE products because it specifically targets new transcripts and often results in overall normalization of transcript abundance. We show theoretically and experimentally that this strategy leads indeed to efficient sampling of new transcripts, and we investigated multiplexing the strategy by pooling RACE reactions from multiple interrogated loci before hybridization.
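The claim that random clone selection samples poorly when transcript abundances span a large dynamic range can be made concrete with a standard expectation: if transcript i makes up a fraction p_i of the clone pool, the expected number of distinct transcripts seen among n randomly picked clones is the sum over i of 1 - (1 - p_i)^n. The sketch below is illustrative only. The abundance values are invented, and it compares a skewed pool with a fully normalized one, not the actual RACEarray procedure.

```python
def expected_distinct(abundances, n_clones):
    """Expected number of distinct transcripts among n_clones random picks
    (sampling with replacement from a pool with the given relative abundances)."""
    total = sum(abundances)
    return sum(1 - (1 - a / total) ** n_clones for a in abundances)


skewed = [1000, 100, 10, 1]    # hypothetical pool with a 1000-fold dynamic range
normalized = [1, 1, 1, 1]      # same four transcripts at equal abundance

for n in (5, 20, 100):
    print(n, round(expected_distinct(skewed, n), 2), round(expected_distinct(normalized, n), 2))
```

With 20 clones, the skewed pool is expected to yield only about two of the four transcripts, while the normalized pool yields nearly all four. This is the sampling gain that splitting the RACE mixture into simpler populations aims for.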

Relevance: 20.00%

Abstract:

The RP protein (RPP) array approach immobilizes minute amounts of cell lysates or tissue protein extracts as distinct microspots on an NC-coated slide. Subsequent detection with specific antibodies allows multiplexed quantification of proteins and their modifications at a scale that is beyond what traditional techniques can achieve. Cellular functions are the result of the coordinated action of signaling proteins assembled in macromolecular complexes. These signaling complexes are highly dynamic structures that change their composition with time and space to adapt to the cell environment. Their comprehensive analysis has until now required relatively large amounts of cells (>5 x 10^7) because of their low abundance and their breakdown during the isolation procedure. In this study, we combined small-scale affinity capture of the T-cell receptor (TCR) and RPP arrays to follow TCR signaling complex assembly in human ex vivo isolated CD4 T-cells. Using this strategy, we report specific recruitment of signaling components to the TCR complex upon T-cell activation in as few as 0.5 million cells. Second- to fourth-order TCR interacting proteins were accurately quantified, making this strategy especially well suited to the analysis of membrane-associated signaling complexes in limited amounts of cells or tissues, e.g., ex vivo isolated cells or clinical specimens.

Relevance: 20.00%

Abstract:

BACKGROUND: DNA sequence integrity, mRNA concentrations and protein-DNA interactions have been subject to genome-wide analyses based on microarrays with ever increasing efficiency and reliability over the past fifteen years. However, very recently novel technologies for Ultra High-Throughput DNA Sequencing (UHTS) have been harnessed to study these phenomena with unprecedented precision. As a consequence, the extensive bioinformatics environment available for array data management, analysis, interpretation and publication must be extended to include these novel sequencing data types. DESCRIPTION: MIMAS was originally conceived as a simple, convenient and local Microarray Information Management and Annotation System focused on GeneChips for expression profiling studies. MIMAS 3.0 enables users to manage data from high-density oligonucleotide SNP Chips, expression arrays (both 3'UTR and tiling) and promoter arrays, BeadArrays as well as UHTS data using MIAME-compliant standardized vocabulary. Importantly, researchers can export data in MAGE-TAB format and upload them to the EBI's ArrayExpress certified data repository using a one-step procedure. CONCLUSION: We have vastly extended the capability of the system such that it processes the data output of six types of GeneChips (Affymetrix), two different BeadArrays for mRNA and miRNA (Illumina) and the Genome Analyzer (a popular Ultra-High Throughput DNA Sequencer, Illumina), without compromising on its flexibility and user-friendliness. MIMAS, appropriately renamed into Multiomics Information Management and Annotation System, is currently used by scientists working in approximately 50 academic laboratories and genomics platforms in Switzerland and France. MIMAS 3.0 is freely available via http://multiomics.sourceforge.net/.

Relevance: 20.00%

Abstract:

The recently released Affymetrix Human Gene 1.0 ST array has two major differences compared with standard 3' based arrays: (i) it interrogates the entire mRNA transcript, and (ii) it uses DNA targets. To assess the impact of these differences on array performance, we performed a series of comparative hybridizations between the Human Gene 1.0 ST and the Affymetrix HG-U133 Plus 2.0 and the Illumina HumanRef-8 BeadChip arrays. Additionally, both RNA and DNA targets were hybridized on HG-U133 Plus 2.0 arrays. The results show that the overall reproducibility of the Gene 1.0 ST array is best. When looking only at the high-intensity probes, the reproducibility of the Gene 1.0 ST array and the Illumina BeadChip array is equally good. Concordance of array results was assessed using different inter-platform mappings. Agreement is best between the two labeling protocols on the HG-U133 Plus 2.0 array. The Gene 1.0 ST array is most concordant with the HG-U133 array hybridized with cDNA targets. This may reflect the impact of the target type. Overall, the high degree of correspondence provides strong evidence for the reliability of the Gene 1.0 ST array.

Relevance: 20.00%

Abstract:

Malignant melanoma, the deadliest form of skin cancer, is characterized by a predominant mutation in the BRAF gene. Drugs that target tumours carrying this mutation have recently entered the clinic. Accordingly, patients are routinely screened for mutations in this gene to determine whether they can benefit from this type of treatment. The current gold standard for mutation screening uses real-time polymerase chain reaction and sequencing methods. Here we show that an assay based on microcantilever arrays can detect the mutation nanomechanically without amplification in total RNA samples isolated from melanoma cells. The assay is based on a BRAF-specific oligonucleotide probe. We detected mutant BRAF at a concentration of 500 pM in a 50-fold excess of the wild-type sequence. The method was able to distinguish melanoma cells carrying the mutation from wild-type cells using as little as 20 ng/µl of RNA material, without prior PCR amplification and use of labels.

Relevance: 20.00%

Abstract:

Over the last three decades, cytogenetic analysis of malignancies has become an integral part of disease evaluation and prediction of prognosis or responsiveness to therapy. In most diagnostic laboratories, conventional karyotyping, in conjunction with targeted fluorescence in situ hybridization analysis, is routinely performed to detect recurrent aberrations with prognostic implications. However, the genetic complexity of cancer cells requires a sensitive genome-wide analysis, enabling the detection of small genomic changes in a mixed cell population, as well as of regions of homozygosity. The advent of comprehensive high-resolution genomic tools, such as molecular karyotyping using comparative genomic hybridization or single-nucleotide polymorphism microarrays, has overcome many of the limitations of traditional cytogenetic techniques and has been used to study complex genomic lesions in, for example, leukemia. The clinical impact of the genomic copy-number and copy-neutral alterations identified by microarray technologies is growing rapidly and genome-wide array analysis is evolving into a diagnostic tool, to better identify high-risk patients and predict patients' outcomes from their genomic profiles. Here, we review the added clinical value of an array-based genome-wide screen in leukemia, and discuss the technical challenges and an interpretation workflow in applying arrays in the acquired cytogenetic diagnostic setting.

Relevance: 20.00%

Abstract:

Background and Aims: The frequency at which males can be maintained with hermaphrodites in androdioecious populations is predicted to depend on the selfing rate, because self-fertilization by hermaphrodites reduces prospective siring opportunities for males. In particular, high selfing rates by hermaphrodites are expected to exclude males from a population. Here, the first estimates are provided of the mating system from two wild hexaploid populations of the androdioecious European wind-pollinated plant M. annua with contrasting male frequencies. Methods: Four diploid microsatellite loci were used to genotype 19-20 progeny arrays from two populations of M. annua, one with males and one without. Mating-system parameters were estimated using the program MLTR. Key Results: Both populations had similar, intermediate outcrossing rates (t_m = 0.64 and 0.52 for the population with and without males, respectively). The population without males showed a lower level of correlated paternity and biparental inbreeding and higher allelic richness and gene diversity than the population with males. Conclusions: The results demonstrate the utility of new diploid microsatellite loci for mating system analysis in a hexaploid plant. It would appear that androdioecious M. annua has a mixed-mating system in the wild, an uncommon finding for wind-pollinated species. This study sets a foundation for future research to assess the relative importance of the sexual system, plant-density variation and stochastic processes for the regulation of male frequencies in M. annua over space and time.