926 results for Genomic data integration
Abstract:
PURPOSE: A number of microarray studies have reported distinct molecular profiles of breast cancers (BC), such as basal-like, ErbB2-like, and two to three luminal-like subtypes. These were associated with different clinical outcomes. However, although the basal and the ErbB2 subtypes are repeatedly recognized, identification of estrogen receptor (ER)-positive subtypes has been inconsistent. Therefore, refinement of their molecular definition is needed. MATERIALS AND METHODS: We have previously reported a gene expression grade index (GGI), which defines histologic grade based on gene expression profiles. Using this algorithm, we assigned ER-positive BC to either high- or low-genomic-grade subgroups and compared these with previously reported ER-positive molecular classifications. As further validation, we classified 666 ER-positive samples into subtypes and assessed their clinical outcome. RESULTS: Two ER-positive molecular subgroups (high and low genomic grade) could be defined using the GGI. Despite tracking a single biologic pathway, these were highly comparable to the previously described luminal A and B classification and significantly correlated with the risk groups produced using the 21-gene recurrence score. The two subtypes were associated with statistically distinct clinical outcomes in both systemically untreated and tamoxifen-treated populations. CONCLUSION: The use of genomic grade can identify two clinically distinct ER-positive molecular subtypes in a simple and highly reproducible manner across multiple data sets. This study emphasizes the important role of proliferation-related genes in predicting prognosis in ER-positive BC.
Abstract:
This paper examines factors explaining subcontracting decisions in the construction industry. Rather than the more common cross-sectional analyses, we use panel data to evaluate the influence of all relevant variables. We design and use a new index of the closeness to small-numbers situations to estimate the extent of hold-up problems. Results show that as specificity grows, firms tend to subcontract less. The opposite happens when output heterogeneity and the use of intangible assets and capabilities increase. Neither temporary shortage of capacity nor geographical dispersion of activities seems to affect the extent of subcontracting. Finally, proxies for uncertainty do not show any clear effect.
Abstract:
We study the link between corruption and economic integration. We show that if an economic union establishes a common regulation for public procurement, the country more prone to corruption benefits more from integration. However, if the propensities to corruption are too distinct, the less corrupt country will not be willing to join the union. This difference in corruption propensities can be offset by a difference in efficiency. We also show that corruption is lower if integration occurs. A panel data analysis for the European Union confirms that more corrupt countries are more favorable towards integration but less acceptable as potential new members.
Abstract:
Although it is commonly accepted that most macroeconomic variables are nonstationary, it is often difficult to identify the source of the nonstationarity. In particular, it is well known that integrated and short-memory models containing trending components that may display sudden changes in their parameters share some statistical properties that make their identification a hard task. The goal of this paper is to extend the classical testing framework for I(1) versus I(0) + breaks by considering a more general class of models under the null hypothesis: nonstationary fractionally integrated (FI) processes. A similar identification problem holds in this broader setting, which is shown to be a relevant issue from both a statistical and an economic perspective. The proposed test is developed in the time domain and is very simple to compute. The asymptotic properties of the new technique are derived, and it is shown by simulation that it is very well-behaved in finite samples. To illustrate the usefulness of the proposed technique, an application using inflation data is also provided.
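As a reading aid for the fractionally integrated (FI) processes mentioned in this abstract, the sketch below shows the standard fractional differencing filter (1 - L)^d, which reduces to the identity for d = 0 and to ordinary first differencing for d = 1. It is not the test proposed in the paper; the function names `frac_diff_weights` and `frac_diff` and the toy series are assumptions made for illustration.

```python
def frac_diff_weights(d, n):
    """First n coefficients of the binomial expansion of (1 - L)^d.

    Uses the recursion w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k.
    """
    weights = [1.0]
    for k in range(1, n):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(x, d):
    """Apply (1 - L)^d to the series x, truncating the filter at the sample start."""
    w = frac_diff_weights(d, len(x))
    return [sum(w[j] * x[t - j] for j in range(t + 1)) for t in range(len(x))]


# Toy series (hypothetical values, not data from the paper).
series = [1.0, 3.0, 6.0, 10.0]
first_difference = frac_diff(series, 1.0)   # ordinary differencing, first value kept
half_difference = frac_diff(series, 0.5)    # intermediate memory, 0 < d < 1
```

Values of d strictly between 0 and 1 give filters whose weights decay slowly, which is what makes FI processes hard to tell apart from short-memory processes with breaks in finite samples.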
Abstract:
In conducting genome-wide association studies (GWAS), analytical approaches leveraging biological information may further understanding of the pathophysiology of clinical traits. To discover novel associations with estimated glomerular filtration rate (eGFR), a measure of kidney function, we developed a strategy for integrating prior biological knowledge into the existing GWAS data for eGFR from the CKDGen Consortium. Our strategy focuses on single nucleotide polymorphisms (SNPs) in genes that are connected by functional evidence, determined by literature mining and gene ontology (GO) hierarchies, to genes near previously validated eGFR associations. It then requires association thresholds consistent with multiple testing, and finally evaluates novel candidates by independent replication. Among the samples of European ancestry, we identified a genome-wide significant SNP in FBXL20 (P = 5.6 × 10⁻⁹) in meta-analysis of all available data, and additional SNPs at the INHBC, LRP2, PLEKHA1, SLC3A2 and SLC7A6 genes meeting multiple-testing corrected significance for replication and overall P-values of 4.5 × 10⁻⁴ to 2.2 × 10⁻⁷. Neither the novel PLEKHA1 nor FBXL20 associations, both further supported by association with eGFR among African Americans and with transcript abundance, would have been implicated by eGFR candidate gene approaches. LRP2, encoding the megalin receptor, was identified through connection with the previously known eGFR gene DAB2 and extends understanding of the megalin system in kidney function. These findings highlight integration of existing genome-wide association data with independent biological knowledge to uncover novel candidate eGFR associations, including candidates lacking known connections to kidney-specific pathways. The strategy may also be applicable to other clinical phenotypes, although more testing will be needed to assess its potential for discovery in general.
Abstract:
BACKGROUND: Finding genes that are differentially expressed between conditions is an integral part of understanding the molecular basis of phenotypic variation. In the past decades, DNA microarrays have been used extensively to quantify the abundance of mRNA corresponding to different genes, and more recently high-throughput sequencing of cDNA (RNA-seq) has emerged as a powerful competitor. As the cost of sequencing decreases, it is conceivable that the use of RNA-seq for differential expression analysis will increase rapidly. To exploit the possibilities and address the challenges posed by this relatively new type of data, a number of software packages have been developed especially for differential expression analysis of RNA-seq data. RESULTS: We conducted an extensive comparison of eleven methods for differential expression analysis of RNA-seq data. All methods are freely available within the R framework and take as input a matrix of counts, i.e. the number of reads mapping to each genomic feature of interest in each of a number of samples. We evaluate the methods based on both simulated data and real RNA-seq data. CONCLUSIONS: Very small sample sizes, which are still common in RNA-seq experiments, pose problems for all evaluated methods, and any results obtained under such conditions should be interpreted with caution. For larger sample sizes, the methods combining a variance-stabilizing transformation with the 'limma' method for differential expression analysis perform well under many different conditions, as does the nonparametric SAMseq method.
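To make the input format and the "transform, then test" idea concrete, here is a minimal sketch that takes a matrix of counts (genes by samples), applies a log2 counts-per-million transform, and computes a per-gene Welch t-statistic between two groups. This is a simplified stand-in: the log transform only approximates a variance-stabilizing transformation, the plain t-statistic is not limma's moderated statistic, and none of the eleven evaluated R packages is used. The function names, the 0.5 offset and the toy counts are assumptions.

```python
import math
from statistics import mean, variance


def log_cpm(counts, offset=0.5):
    """log2 counts-per-million for a genes-by-samples matrix of read counts.

    The small offset (an assumed value) avoids taking the log of zero.
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [
        [math.log2((row[j] + offset) / (lib_sizes[j] + 1.0) * 1e6) for j in range(n_samples)]
        for row in counts
    ]


def welch_t(a, b):
    """Welch t-statistic for two groups with possibly unequal variances."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Toy data: 3 genes, samples 0-2 in condition A and samples 3-5 in condition B.
counts = [
    [100, 120, 90, 300, 330, 280],
    [50, 55, 48, 52, 49, 51],
    [500, 480, 530, 510, 495, 520],
]
transformed = log_cpm(counts)
t_stats = [welch_t(row[:3], row[3:]) for row in transformed]
```

With only three samples per group, as here, the variance estimates in `welch_t` are very noisy, which is the small-sample problem the abstract warns about.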
Abstract:
Background and aim of the study: Genomic gains and losses play a crucial role in the development and progression of DLBCL and are closely related to gene expression profiles (GEP), including the germinal center B-cell like (GCB) and activated B-cell like (ABC) cell of origin (COO) molecular signatures. To identify new oncogenes or tumor suppressor genes (TSG) involved in DLBCL pathogenesis and to determine their prognostic values, an integrated analysis of high-resolution gene expression and copy number profiling was performed. Patients and methods: Two hundred and eight adult patients with de novo CD20+ DLBCL enrolled in the prospective multicentric randomized LNH-03 GELA trials (LNH03-1B, -2B, -3B, 39B, -5B, -6B, -7B) with available frozen tumour samples, centralized reviewing and adequate DNA/RNA quality were selected. 116 patients were treated by Rituximab(R)-CHOP/R-miniCHOP and 92 patients were treated by the high dose (R)-ACVBP regimen dedicated to patients younger than 60 years (y) in frontline. Tumour samples were simultaneously analysed by high resolution comparative genomic hybridization (CGH, Agilent, 144K) and gene expression arrays (Affymetrix, U133+2). Minimal common regions (MCR), as defined by segments that affect the same chromosomal region in different cases, were delineated. Gene expression and MCR data sets were merged using Gene expression and dosage integrator algorithm (GEDI, Lenz et al. PNAS 2008) to identify new potential driver genes. Results: A total of 1363 recurrent (defined by a penetrance > 5%) MCRs within the DLBCL data set, ranging in size from 386 bp, affecting a single gene, to more than 24 Mb were identified by CGH. Of these MCRs, 756 (55%) showed a significant association with gene expression: 396 (59%) gains, 354 (52%) single-copy deletions, and 6 (67%) homozygous deletions. 
By this integrated approach, in addition to previously reported genes (CDKN2A/2B, PTEN, DLEU2, TNFAIP3, B2M, CD58, TNFRSF14, FOXP1, REL...), several genes targeted by gene copy abnormalities with a dosage effect and potential physiopathological impact were identified, including genes with TSG activity involved in cell cycle (HACE1, CDKN2C), immune response (CD68, CD177, CD70, TNFSF9, IRAK2), DNA integrity (XRCC2, BRCA1, NCOR1, NF1, FHIT) or oncogenic functions (CD79b, PTPRT, MALT1, AUTS2, MCL1, PTTG1...) with distinct distribution according to COO signature. The CDKN2A/2B tumor suppressor locus (9p21) was deleted homozygously in 27% of cases and hemizygously in 9% of cases. Biallelic loss was observed in 49% of ABC DLBCL and in 10% of GCB DLBCL. This deletion was strongly correlated with age and associated with a limited number of additional genetic abnormalities including trisomy 3, 18 and short gains/losses of Chr. 1, 2, 19 regions (FDR < 0.01), allowing the identification of genes that may have synergistic effects with CDKN2A/2B inactivation. With a median follow-up of 42.9 months, only CDKN2A/2B biallelic deletion correlated strongly (FDR p-value < 0.01) with a poor outcome in the entire cohort (4y PFS = 44% [32-61] vs. 74% [66-82] for patients in germline configuration; 4y OS = 53% [39-72] vs. 83% [76-90]). In a Cox proportional hazards model of PFS, CDKN2A/2B deletion remains predictive (HR = 1.9 [1.1-3.2], p = 0.02) when combined with IPI (HR = 2.4 [1.4-4.1], p = 0.001) and GCB status (HR = 1.3 [0.8-2.3], p = 0.31). This difference remains predictive in the subgroup of patients treated with R-CHOP (4y PFS = 43% [29-63] vs. 66% [55-78], p = 0.02), in patients treated with R-ACVBP (4y PFS = 49% [28-84] vs. 83% [74-92], p = 0.003), and in GCB (4y PFS = 50% [27-93] vs. 81% [73-90], p = 0.02), or ABC/unclassified (5y PFS = 42% [28-61] vs. 67% [55-82], p = 0.009) molecular subtypes (Figure 1).
Conclusion: We report for the first time an integrated genetic analysis of a large cohort of DLBCL patients included in a prospective multicentric clinical trial program, allowing the identification of new potential driver genes with pathogenic impact. However, CDKN2A/2B deletion constitutes the strongest and only prognostic factor of chemoresistance to R-CHOP, regardless of the COO signature, which is not overcome by a more intensified immunochemotherapy. Patients displaying this frequent genomic abnormality warrant new and dedicated therapeutic approaches.
Abstract:
PURPOSE OF REVIEW: One of the seven key scientific priorities identified in the road map on HIV cure research is to 'determine the host mechanisms that control HIV replication in the absence of therapy'. This review summarizes the recent work in genomics and in epigenetic control of viral replication that is relevant for this mission. RECENT FINDINGS: New technologies allow the joint analysis of host and viral transcripts. They identify the patterns of antisense transcription of the viral genome and its role in gene regulation. High-throughput studies facilitate the assessment of integration at the genome scale. Integration site, orientation and host genomic context modulate the transcription and should also be assessed at the level of single cells. The various models of latency in primary cells can be followed using dynamic study designs to acquire transcriptome and proteome data of the process of entry, maintenance and reactivation of latency. Dynamic studies can be applied to the study of transcription factors and chromatin modifications in latency and upon reactivation. SUMMARY: The convergence of primary cell models of latency, new high-throughput quantitative technologies applied to the study of time series and the identification of compounds that reactivate viral transcription bring unprecedented precision to the study of viral latency.
Abstract:
Ground-penetrating radar (GPR) and microgravimetric surveys have been conducted in the southern Jura mountains of western Switzerland in order to map subsurface karstic features. The study site, La Grande Rolaz cave, is an extensive system in which many portions have been mapped. By using small station spacing and careful processing for the geophysical data, and by modeling these data with topographic information from within the cave, accurate interpretations have been achieved. The constraints on the interpreted geologic models are better when combining the geophysical methods than when using only one of the methods, despite the general limitations of two-dimensional (2D) profiling. For example, microgravimetry can complement GPR methods for accurately delineating a shallow cave section approximately 10 × 10 m in size. Conversely, GPR methods can be complementary in determining cavity depths and in verifying the presence of off-line features and numerous areas of small cavities and fractures, which may be difficult to resolve in microgravimetric data.
Abstract:
The integration of specific institutions for teacher education into the higher education system represents a milestone in Swiss educational policy and has broad implications. This thesis explores organizational and institutional change resulting from this policy reform, and attempts to assess structural change in terms of differentiation and convergence within the system of higher education. Key issues that are dealt with are, on the one hand, the adoption of a research function by the newly conceptualized institutions of teacher education, and on the other, the positioning of the new institutions within the higher education system. Drawing on actor-centred approaches to differentiation, this dissertation discusses system-level specificities of tertiarized teacher education and asks how this affects institutional configurations and actor constellations. On the basis of qualitative and quantitative empirical data, a comparative analysis has been carried out including case studies of four universities of teacher education as well as multivariate regression analysis of micro-level data on students' educational choices. The study finds that the process of system integration and the adaptation to the research function by the various institutions have unfolded differently depending on the institutional setting and the specific actor constellations. The new institutions have clearly made a strong push to position themselves as a new institutional type and to find their identity beyond the traditional binary divide which assigns the universities of teacher education to the college sector. Potential conflicts have been identified in the divergent cognitive and normative orientations and perceptions of researchers, teacher educators, policy-makers, teachers, and students as to the mission and role of the new type of higher education institution.
Abstract:
The research considers the problem of spatial data classification using machine learning algorithms: probabilistic neural networks (PNN) and support vector machines (SVM). As a benchmark model, the simple k-nearest neighbor algorithm is considered. PNN is a neural network reformulation of well-known nonparametric principles of probability density modeling using a kernel density estimator and Bayesian optimal or maximum a posteriori decision rules. PNN is well suited to problems where not only predictions but also quantification of accuracy and integration of prior information are necessary. An important property of PNNs is that they can be easily used in decision support systems dealing with problems of automatic classification. The support vector machine is an implementation of the principles of statistical learning theory for classification tasks. SVMs have recently been applied successfully to different environmental topics: classification of soil types and hydro-geological units, optimization of monitoring networks, and susceptibility mapping of natural hazards. In the present paper both simulated and real data case studies (low and high dimensional) are considered. The main attention is paid to the detection and learning of spatial patterns by the algorithms applied.
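The PNN description above (kernel density estimation per class, combined with priors under a maximum a posteriori rule) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: it uses an isotropic Gaussian kernel with a single bandwidth `sigma`, and the function names, the default bandwidth and the toy coordinates are hypothetical.

```python
import math
from collections import defaultdict


def pnn_posteriors(train_x, train_y, query, sigma=1.0, priors=None):
    """Class posteriors at `query` from Gaussian kernel density estimates.

    The Gaussian normalising constant is omitted because it is identical for
    every class (same sigma, same dimension) and cancels in the posterior.
    """
    by_class = defaultdict(list)
    for x, y in zip(train_x, train_y):
        by_class[y].append(x)

    scores = {}
    for label, points in by_class.items():
        # Average kernel response over the training points of this class.
        density = sum(
            math.exp(-sum((q - p) ** 2 for q, p in zip(query, pt)) / (2.0 * sigma ** 2))
            for pt in points
        ) / len(points)
        # Prior information enters here; default to class frequencies.
        prior = priors[label] if priors else len(points) / len(train_x)
        scores[label] = prior * density

    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


def pnn_classify(train_x, train_y, query, sigma=1.0, priors=None):
    """Maximum a posteriori class label."""
    post = pnn_posteriors(train_x, train_y, query, sigma, priors)
    return max(post, key=post.get)


# Toy spatial data: (x, y) coordinates with two hypothetical soil classes.
coords = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = ["A", "A", "B", "B"]
posterior = pnn_posteriors(coords, labels, (0.2, 0.5))
predicted = pnn_classify(coords, labels, (0.2, 0.5))
```

Returning the full posterior rather than only the label is what allows the quantification of accuracy mentioned in the abstract. For queries very far from all training points the kernel sums can underflow to zero, so a production version would need to guard the final division.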
Abstract:
The coverage and volume of geo-referenced datasets are extensive and incessantly growing. The systematic capture of geo-referenced information generates large volumes of spatio-temporal data to be analyzed. Clustering and visualization play a key role in the exploratory data analysis and the extraction of knowledge embedded in these data. However, new challenges in visualization and clustering are posed when dealing with the special characteristics of these data. These include complex structures, a large quantity of samples, variables involved in a temporal context, high dimensionality and large variability in cluster shapes. The central aim of my thesis is to propose new algorithms and methodologies for clustering and visualization, in order to assist the knowledge extraction from spatio-temporal geo-referenced data, thus improving decision-making processes. I present two original algorithms, one for clustering: the Fuzzy Growing Hierarchical Self-Organizing Networks (FGHSON), and the second for exploratory visual data analysis: the Tree-structured Self-organizing Maps Component Planes. In addition, I present methodologies that, combined with FGHSON and the Tree-structured SOM Component Planes, allow the integration of space and time seamlessly and simultaneously in order to extract knowledge embedded in a temporal context. The originality of the FGHSON lies in its capability to reflect the underlying structure of a dataset in a hierarchical fuzzy way. A hierarchical fuzzy representation of clusters is crucial when data include complex structures with large variability of cluster shapes, variances, densities and number of clusters. The most important characteristics of the FGHSON include: (1) It does not require an a priori setup of the number of clusters. (2) The algorithm executes several self-organizing processes in parallel. Hence, when dealing with large datasets the processes can be distributed, reducing the computational cost.
(3) Only three parameters are necessary to set up the algorithm. In the case of the Tree-structured SOM Component Planes, the novelty of this algorithm lies in its ability to create a structure that allows the visual exploratory data analysis of large high-dimensional datasets. This algorithm creates a hierarchical structure of Self-Organizing Map Component Planes, arranging similar variables' projections in the same branches of the tree. Hence, similarities in variables' behavior can be easily detected (e.g. local correlations, maximal and minimal values and outliers). Both FGHSON and the Tree-structured SOM Component Planes were applied to several agroecological problems, proving to be very efficient in the exploratory analysis and clustering of spatio-temporal datasets. In this thesis I also tested three soft competitive learning algorithms. Two of them are well-known unsupervised soft competitive algorithms, namely the Self-Organizing Maps (SOMs) and the Growing Hierarchical Self-Organizing Maps (GHSOMs); the third was our original contribution, the FGHSON. Although the algorithms presented here have been used in several areas, to my knowledge there is no work applying and comparing the performance of those techniques when dealing with spatio-temporal geospatial data, as presented in this thesis. I propose original methodologies to explore spatio-temporal geo-referenced datasets through time. Our approach uses time windows to capture temporal similarities and variations by using the FGHSON clustering algorithm. The developed methodologies are used in two case studies.
In the first, the objective was to find similar agroecozones through time, and in the second it was to find similar environmental patterns shifted in time. Several results presented in this thesis have led to new contributions to agroecological knowledge, for instance in sugar cane and blackberry production. Finally, in the framework of this thesis we developed several software tools: (1) a Matlab toolbox that implements the FGHSON algorithm, and (2) a program called BIS (Bio-inspired Identification of Similar agroecozones), an interactive graphical user interface tool which integrates the FGHSON algorithm with Google Earth in order to show zones with similar agroecological characteristics.
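For readers unfamiliar with the Self-Organizing Maps that the thesis builds on, the sketch below trains a plain SOM on a small grid. It is not the FGHSON or the Tree-structured SOM Component Planes, and it is not the Matlab toolbox mentioned above; it only illustrates the basic competitive learning step (find the best matching unit, then pull it and its grid neighbours toward the sample). The function names, the linear decay schedules and the toy data are assumptions.

```python
import math
import random


def best_matching_unit(weights, x):
    """Grid position of the unit whose weight vector is closest to x."""
    return min(weights, key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], x)))


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rows-by-cols SOM; returns a dict mapping (row, col) to a weight vector."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # Initialise each unit from a randomly chosen sample (copied, not aliased).
    weights = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # learning rate decays to zero
            sigma = sigma0 * (1.0 - frac) + 1e-3    # neighbourhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Toy data: two well-separated groups of two-variable observations.
group_a = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [0.0, 0.0]]
group_b = [[10.0, 10.1], [10.2, 10.0], [10.1, 10.2], [10.0, 10.0]]
som = train_som(group_a + group_b, rows=1, cols=2)
units_a = {best_matching_unit(som, x) for x in group_a}
units_b = {best_matching_unit(som, x) for x in group_b}
```

A plain SOM like this needs the grid size fixed in advance, which is the limitation that growing and hierarchical variants such as GHSOM and the FGHSON are described as addressing.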
Abstract:
Clozapine (CLO), an atypical antipsychotic, depends mainly on cytochrome P450 1A2 (CYP1A2) for its metabolic clearance. Four patients treated with CLO, who were smokers, were nonresponders and had low plasma levels while receiving usual doses. Their plasma levels to dose ratios of CLO (median; range, 0.34; 0.22 to 0.40 ng x day/mL x mg) were significantly lower than ratios calculated from another study with 29 patients (0.75; 0.22 to 2.83 ng x day/mL x mg; P < 0.01). These patients were confirmed as being CYP1A2 ultrarapid metabolizers by the caffeine phenotyping test (median systemic caffeine plasma clearance; range, 3.85; 3.33 to 4.17 mL/min/kg) when compared with previous studies (0.3 to 3.33 mL/min/kg). The sequencing of the entire CYP1A2 gene from genomic DNA of these patients suggests that the -164C > A mutation (CYP1A2*1F) in intron 1, which confers a high inducibility of CYP1A2 in smokers, is the most likely explanation for their ultrarapid CYP1A2 activity. A marked (2 patients) or a moderate (2 patients) improvement of the clinical state of the patients occurred after the increase of CLO blood levels above the therapeutic threshold by the increase of CLO doses to very high values (ie, up to 1400 mg/d) or by the introduction of fluvoxamine, a potent CYP1A2 inhibitor, at low dosage (50 to 100 mg/d). Due to the high frequency of smokers among patients with schizophrenia and to the high frequency of the -164C > A polymorphism, CYP1A2 genotyping could have important clinical implications for the treatment of patients with CLO.
Abstract:
THESIS SUMMARY The moss Physcomitrella patens is used as a model genetic system to study plant development, taking advantage of the fact that the haploid gametophyte dominates in its life cycle. However, further development of this model system was hampered by the lack of a protocol allowing the genetic transformation of this plant. We have developed a transformation protocol based on PEG-mediated direct gene transfer to protoplasts.
Our data demonstrate that this procedure leads to the establishment of an efficient transient gene expression assay. A slightly modified protocol has been developed allowing the in vivo introduction of antibodies into moss protoplasts. Both DNA and IgGs can be loaded simultaneously, and specific antibodies can immunodeplete the product of an expression cassette (GUS) as well as proteins involved in cellular processes (tubulins). This assay, named transient in vivo immunodepletion assay, should be applicable to other plant protoplasts, and offers new approaches to study cellular processes. Transformations have been performed with bacterial plasmids carrying an antibiotic resistance expression cassette. Our data demonstrate that integrative transformation occurs, but at low frequencies. This is the first demonstration of a successful genetic transformation of mosses. Resistant unstable colonies are recovered at high frequencies following transformation, and two different classes of unstable clones have been identified. Phenotypic, genetic and molecular characterisation of these clones strongly suggests that bacterial plasmids are concatenated to form high-molecular-weight arrays which are efficiently replicated and maintained as extrachromosomal elements in the resistant cells. Replicative transformation in P. patens should allow the design of experiments aimed at the identification of genomic sequences involved in moss DNA replication. Transgenic strains have been retransformed with bacterial plasmids carrying sequences homologous to the integrated transloci, but conferring resistance to another antibiotic. Our results demonstrate an order of magnitude increase of integrative transformation frequencies in transgenic strains as compared to wild-type, associated with cosegregation of the resistance genes in most of these double resistant transgenic strains.
These observations provide strong genetic evidence that gene targeting occurs about ten times more often than random integration in the genome of P. patens. Such a ratio of targeted to random integration is about 10 000 times higher than in previous reports of gene targeting in plants, and provides the essential requirement for the development of efficient reverse genetics in the haplodiplobiontic P. patens.
Resumo:
Abstract: In addition to genetic changes affecting the function of gene products, changes in gene expression have been suggested to underlie many or even most of the phenotypic differences among mammals. However, detailed gene expression comparisons were, until recently, restricted to closely related species, owing to technological limitations. Thus, we took advantage of the latest technologies (RNA-Seq) to generate extensive qualitative and quantitative transcriptome data for a unique collection of somatic and germline tissues from representatives of all major mammalian lineages (placental mammals, marsupials and monotremes) and birds, the evolutionary outgroup. In the first major project of my thesis, we performed global comparative analyses of gene expression levels based on these data. Our analyses provided fundamental insights into the dynamics of transcriptome change during mammalian evolution (e.g., the rate of expression change across species, tissues and chromosomes) and allowed the exploration of the functional relevance and phenotypic implications of transcription changes at a genome-wide scale (e.g., we identified numerous potentially selectively driven expression switches). In a second project of my thesis, which was also based on the unique transcriptome data generated in the context of the first project, we focused on the evolution of alternative splicing in mammals. Alternative splicing contributes to transcriptome complexity by generating several transcript isoforms from a single gene, which can thus perform various functions. To complete the global comparative analysis of gene expression changes, we explored patterns of alternative splicing evolution. 
This work uncovered several general and unexpected patterns of alternative splicing evolution (e.g., we found that alternative splicing evolves extremely rapidly) as well as a large number of conserved alternative isoforms that may be crucial for the functioning of mammalian organs. Finally, the third project of my PhD consisted of analyzing in detail the unique functional and evolutionary properties of the testis by exploring the extent of its transcriptome complexity. This organ was previously shown to evolve rapidly at both the phenotypic and molecular levels, apparently because of the specific pressures that act on it and are associated with its reproductive function. Moreover, my analyses of the amniote tissue transcriptome data described above revealed strikingly widespread transcriptional activity of both functional and nonfunctional genomic elements in the testis compared to the other organs. To elucidate the cellular source and mechanisms underlying this promiscuous transcription in the testis, we generated deep-coverage RNA-Seq data for all major testis cell types as well as epigenetic data (DNA and histone methylation), using the mouse as a model system. The integration of these datasets revealed that meiotic and especially post-meiotic germ cells are the major contributors to the widespread functional and nonfunctional transcriptome complexity of the testis, and that this "promiscuous" spermatogenic transcription results, at least partially, from an overall transcriptionally permissive chromatin state. We hypothesize that this particular open state of the chromatin results from the extensive chromatin remodeling that occurs during spermatogenesis, which ultimately leads to the replacement of histones by protamines in the mature spermatozoa. 
Our results have important functional and evolutionary implications (e.g., regarding new gene birth and testicular gene expression evolution). Overall, these three large-scale projects of my thesis provide extensive datasets that constitute valuable resources for further functional and evolutionary analyses of mammalian genomes.