969 results for coding sequence
Abstract:
BACKGROUND: Genes involved in arbuscular mycorrhizal (AM) symbiosis have been identified primarily by mutant screens, followed by identification of the mutated genes (forward genetics). In addition, a number of AM-related genes have been identified by their AM-related expression patterns, and their function has subsequently been elucidated by knock-down or knock-out approaches (reverse genetics). However, genes that are members of functionally redundant gene families, or genes that have a vital function and therefore result in lethal mutant phenotypes, are difficult to identify. If such genes are constitutively expressed and therefore escape differential expression analyses, they remain elusive. The goal of this study was to systematically search for AM-related genes with a bioinformatics strategy that is insensitive to these problems. The central element of our approach is based on the fact that many AM-related genes are conserved only among AM-competent species. RESULTS: Our approach involves genome-wide comparisons at the proteome level of AM-competent host species with non-mycorrhizal species. Using a clustering method we first established orthologous/paralogous relationships and subsequently identified protein clusters that contain members only of the AM-competent species. Proteins of these clusters were then analyzed in an extended set of 16 plant species and ranked based on their relatedness among AM-competent monocot and dicot species, relative to non-mycorrhizal species. In addition, we combined the information on the protein-coding sequence with gene expression data and with promoter analysis. As a result we present a list of as yet uncharacterized proteins that show a strongly AM-related pattern of sequence conservation, indicating that the respective genes may have been under selection for a function in AM.
Among the top candidates are three genes that encode a small family of similar receptor-like kinases that are related to the S-locus receptor kinases involved in sporophytic self-incompatibility. CONCLUSIONS: We present a new systematic strategy of gene discovery based on conservation of the protein-coding sequence that complements classical forward and reverse genetics. This strategy can be applied to diverse other biological phenomena if species with established genome sequences fall into distinct groups that differ in a defined functional trait of interest.
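The cluster-filtering step described in this abstract (keeping only protein clusters whose members all come from AM-competent species) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the cluster layout, the species labels, the function name `am_specific_clusters` and the `min_am_species` threshold are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

# A cluster is a list of members, each given as (species, protein_id).
Cluster = List[Tuple[str, str]]


def am_specific_clusters(
    clusters: Dict[str, Cluster],
    am_species: Set[str],
    min_am_species: int = 2,
) -> List[str]:
    """Return ids of clusters whose members all come from AM-competent species.

    min_am_species is an illustrative threshold (not taken from the abstract)
    that drops clusters present in too few AM-competent species.
    """
    selected = []
    for cluster_id, members in clusters.items():
        species = {sp for sp, _protein in members}
        # Keep the cluster only if no member species lies outside the AM set.
        if species <= am_species and len(species) >= min_am_species:
            selected.append(cluster_id)
    return sorted(selected)


# Hypothetical toy input: species and protein names are placeholders.
example_clusters = {
    "c1": [("host_A", "p1"), ("host_B", "p2")],
    "c2": [("host_A", "p3"), ("nonhost_X", "p4")],
    "c3": [("host_C", "p5")],
}
example_am_species = {"host_A", "host_B", "host_C"}
print(am_specific_clusters(example_clusters, example_am_species))  # ['c1']
```

In this toy input only `c1` survives: `c2` contains a non-mycorrhizal member, and `c3` occurs in a single AM-competent species. The ranking across 16 species that the abstract mentions is not modelled here.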
Abstract:
We describe the unusual structure of a vaccinia virus late mRNA. In these molecules, the protein-coding sequences of a major late structural polypeptide are preceded by long leader RNAs, which in some cases are thousands of nucleotides long. These sequences map to different regions of the viral genome and in one instance are separated from the late gene by more than 100 kb of DNA. Moreover, the leader sequences map either upstream or downstream of the late gene, are transcribed from either DNA strand, and are fused to the late gene coding sequence via a poly(A) stretch. This demonstrates that vaccinia virus produces late mRNAs by tagging the protein-coding sequences onto the 3' end of other RNAs.
Abstract:
SUMMARY REPORT: Pip5k3: Pip5k3 is a kinase that causes fleck corneal dystrophy when mutated. It is a well-conserved gene that has only been characterized in human and mouse. Characterization of pip5k3 in zebrafish was necessary before using it as a model. The protein is 70% similar to the human homologue. The full coding sequence encompasses 6303 bp and presented four isoforms, which were differentially expressed during development. All the analyzed organs of the adult zebrafish expressed pip5k3. The adult eye expressed pip5k3 in the cornea, lens, ganglion cell layer (GCL), inner nuclear layer (INL) and outer limiting membrane (OLM). During development, pip5k3 was first uniformly expressed before becoming restricted to the head region and to the somites. The expression of pip5k3 in the cornea of the larval eye could make it possible to study fleck corneal dystrophy in this animal. Nkx5-3: NKX5-3 is a transcription factor that causes a new oculo-auricular syndrome in humans when mutated. This recessive disorder is characterized by defects in the ear lobule and multiple defects in the eye, including microphthalmia and cataract. During development, the zebrafish expressed nkx5-3 in the lens, in the anterior retina and in the otic vesicles. Knockdown experiments partially phenocopied the human disease. Microphthalmia and cataract were reproduced, but zebrafish also showed defects in the cartilage of the jaw, associated with microcephaly and fin abnormalities. Retinal cell differentiation was delayed, possibly linked to the delayed expression of ath5 and crx also observed in morphants. Shh, a regulator of ath5, was normally expressed in morphants. Overexpression of nkx5-3 led to anophthalmia, suggesting a role in early organogenesis of the eye. All the phenotypes observed in morphants and in embryos overexpressing nkx5-3 suggest a potential involvement of the FGF and hedgehog signaling pathways.
Abstract:
Phylogenetic reconstructions have supported several independent appearances of C₄ photosynthesis within grasses (Poaceae). These recurrent appearances appear to contradict the large number of biochemical and morphological changes required to change from C₃ to C₄, a paradox that leads to questions about the genetic changes underlying C₄ evolution. In this study, we analysed sequences encoding phosphoenolpyruvate carboxylases (PEPCs) in grasses in order to gain insights into the origin of the ppc-C₄ gene, which encodes a key enzyme in the C₄ pathway. We screened databanks for PEPC genes or cDNAs in grasses. A coding sequence of 1130 base pairs was used to build phylogenetic trees that supported the existence of four distinct PEPC gene lineages. Ppc-C₄ present in all C₄ grasses was also found in two C₃ species. The ppc-C₄ clade was congruent with the species tree, suggesting orthologous evolution. This result would imply that ppc-C₄ appeared without any duplication event. Nevertheless, caution is needed since the sampling of our study is still far from comprehensive. Further investigation with an increased sampling is recommended to elucidate the evolutionary changes underlying ppc-C₄ gene evolution in grasses.
Abstract:
Hepatitis C virus (HCV) replicates its genome in a membrane-associated replication complex, composed of viral proteins, replicating RNA and altered cellular membranes. We describe here HCV replicons that allow the direct visualization of functional HCV replication complexes. Viable replicons selected from a library of Tn7-mediated random insertions in the coding sequence of nonstructural protein 5A (NS5A) allowed the identification of two sites near the NS5A C terminus that tolerated insertion of heterologous sequences. Replicons encoding green fluorescent protein (GFP) at these locations were only moderately impaired for HCV RNA replication. Expression of the NS5A-GFP fusion protein could be demonstrated by immunoblot, indicating that the GFP was retained during RNA replication and did not interfere with HCV polyprotein processing. More importantly, expression levels were robust enough to allow direct visualization of the fusion protein by fluorescence microscopy. NS5A-GFP appeared as brightly fluorescing dot-like structures in the cytoplasm. By confocal laser scanning microscopy, NS5A-GFP colocalized with other HCV nonstructural proteins and nascent viral RNA, indicating that the dot-like structures, identified as membranous webs by electron microscopy, represent functional HCV replication complexes. These findings reveal an unexpected flexibility of the C-terminal domain of NS5A and provide tools for studying the formation and turnover of HCV replication complexes in living cells.
Abstract:
The isolation of the four Xenopus laevis vitellogenin genes has been completed by the purification from a DNA library of the B2 gene together with its flanking sequences. The overlapping DNA fragments analyzed cover 34 kilobases. The B2 gene which has a length of 17.5 kilobases was characterized by heteroduplex and R-loop mapping in the electron microscope and by in vitro transcription in a HeLa whole-cell extract. Its structural organization is compared with that of the closely related B1 gene. The mRNA-coding sequence of about 6 kilobases is interrupted 34 times in the B1 gene and 33 times in the B2 gene. Sequence homology between the two genes was not only found in exons. In addition, 54% of the intron sequences as well as 63% and 48.5% respectively of the 5' and 3' flanking sequences, show enough homology to form stable duplexes. These findings are compared with earlier results obtained with the two other closely related members of the vitellogenin gene family, the A1 and the A2 genes.
Abstract:
With 1200 members, odorant receptor (OR) genes constitute the largest gene family in the mouse genome. A mature olfactory sensory neuron (OSN) is thought to express just one OR gene, and from one allele. The cell bodies of OSNs that express a given OR gene display a mosaic pattern within a particular region of the main olfactory epithelium. The mechanisms and cis-acting DNA elements that regulate the expression of one OR gene per OSN (OR gene choice) remain poorly understood. Here, we describe a reporter assay to identify minimal promoters for OR genes in transgenic mice, which are produced by the conventional method of pronuclear injection of DNA. The promoter transgenes are devoid of an OR coding sequence, and instead drive expression of the axonal marker tau-β-galactosidase. For four mouse OR genes (M71, M72, MOR23, and P3) and one human OR gene (hM72), a mosaic, OSN-specific pattern of reporter expression can be obtained in transgenic mice with contiguous DNA segments of only ~300 bp that are centered around the transcription start site (TSS). The ~150 bp region upstream of the TSS contains three conserved sequence motifs, including homeodomain (HD) binding sites. Such HD binding sites are also present in the H and P elements, DNA sequences that are known to strongly influence OR gene expression. When a 19mer encompassing an HD binding site from the P element is multimerized nine times and added upstream of a MOR23 minigene that contains the MOR23 coding region, we observe a dramatic increase in the number of transgene-expressing founders and lines and in the number of labeled OSNs. By contrast, a nine times multimerized 19mer with a mutant HD binding site does not have these effects. We hypothesize that HD binding sites in the H and P elements and in OR promoters modulate the probability of OR gene choice.
Abstract:
Background: Non-long terminal repeat (non-LTR) retrotransposons have contributed to shaping the structure and function of genomes. In silico and experimental approaches have been used to identify the non-LTR elements of the urochordate Ciona intestinalis. Knowledge of the types and abundance of non-LTR elements in urochordates is a key step in understanding their contribution to the structure and function of vertebrate genomes. Results: Consensus elements phylogenetically related to the I, LINE1, LINE2, LOA and R2 elements of the 14 eukaryotic non-LTR clades are described from C. intestinalis. The ascidian elements showed conservation of both the reverse transcriptase coding sequence and the overall structural organization seen in each clade. The apurinic/apyrimidinic endonuclease and nucleic-acid-binding domains encoded upstream of the reverse transcriptase, and the RNase H and the restriction enzyme-like endonuclease motifs encoded downstream of the reverse transcriptase were identified in the corresponding Ciona families. Conclusions: The genome of C. intestinalis harbors representatives of at least five clades of non-LTR retrotransposons. The copy number per haploid genome of each element is low, less than 100, far below the values reported for vertebrate counterparts but within the range for protostomes. Genomic and sequence analysis shows that the ascidian non-LTR elements are unmethylated and flanked by genomic segments with a gene density lower than average for the genome. The analysis provides valuable data for understanding the evolution of early chordate genomes and enlarges the view on the distribution of the non-LTR retrotransposons in eukaryotes.
Abstract:
During my PhD, my aim was to provide new tools to increase our capacity to analyse gene expression patterns, and to study on a large-scale basis the evolution of gene expression in animals. Gene expression patterns (when and where a gene is expressed) are a key feature in understanding gene function, notably in development. It appears clear now that the evolution of developmental processes and of phenotypes is shaped both by evolution at the coding sequence level, and at the gene expression level. Studying gene expression evolution in animals, with complex expression patterns over tissues and developmental time, is still challenging. No tools are available to routinely compare expression patterns between different species, with precision, and on a large-scale basis. Studies on gene expression evolution are therefore performed only on small gene datasets, or using imprecise descriptions of expression patterns. The aim of my PhD was thus to develop and use novel bioinformatics resources to study the evolution of gene expression. To this end, I developed the database Bgee (Base for Gene Expression Evolution). The approach of Bgee is to transform heterogeneous expression data (ESTs, microarrays, and in-situ hybridizations) into present/absent calls, and to annotate them to standard representations of anatomy and development of different species (anatomical ontologies). An extensive mapping between anatomies of species is then developed based on hypotheses of homology. These precise annotations to anatomies, and this extensive mapping between species, are the major assets of Bgee, and have required the involvement of many co-workers over the years. My main personal contribution is the development and the management of both the Bgee database and the web-application. Bgee is now on its ninth release, and includes an important gene expression dataset for 5 species (human, mouse, drosophila, zebrafish, Xenopus), with the most data from mouse, human and zebrafish.
Using these three species, I have conducted an analysis of gene expression evolution after duplication in vertebrates. Gene duplication is thought to be a major source of novelty in evolution, and to contribute to speciation. It has been suggested that the evolution of gene expression patterns might participate in the retention of duplicate genes. I performed a large-scale comparison of expression patterns of hundreds of duplicated genes to their singleton ortholog in an outgroup, including both small and large-scale duplicates, in three vertebrate species (human, mouse and zebrafish), and using highly accurate descriptions of expression patterns. My results showed unexpectedly high rates of de novo acquisition of expression domains after duplication (neofunctionalization), at least as high as or higher than rates of partitioning of expression domains (subfunctionalization). I found differences in the evolution of expression of small- and large-scale duplicates, with small-scale duplicates more prone to neofunctionalization. Duplicates with neofunctionalization seemed to evolve under more relaxed selective pressure on the coding sequence. Finally, even with abundant and precise expression data, the majority fate I recovered was neither neo- nor subfunctionalization of expression domains, suggesting a major role for other mechanisms in duplicate gene retention.
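The comparison of duplicates with their singleton ortholog can be illustrated with sets of present calls per anatomical structure. The sketch below is one simplified reading of the definitions given in the abstract, not the method used in the thesis: the function name `classify_fate`, the set representation, and the exact decision rules are assumptions, and the anatomical terms are placeholders.

```python
from typing import Set


def classify_fate(singleton: Set[str], dup_a: Set[str], dup_b: Set[str]) -> str:
    """Classify a duplicate pair from sets of structures with a 'present' call.

    Assumed rules (a simplification for illustration):
    - neofunctionalization: a duplicate is expressed in a structure where the
      singleton ortholog in the outgroup is not expressed;
    - subfunctionalization: each duplicate has lost part of the singleton's
      expression, and together they still cover all of it;
    - other: anything else (for example, both copies keep the full pattern).
    """
    combined = dup_a | dup_b
    if combined - singleton:
        return "neofunctionalization"
    lost_a = singleton - dup_a
    lost_b = singleton - dup_b
    if lost_a and lost_b and combined == singleton:
        return "subfunctionalization"
    return "other"


# Hypothetical expression domains; real annotations would use ontology terms.
ancestral_like = {"brain", "liver"}
print(classify_fate(ancestral_like, {"brain"}, {"liver"}))
print(classify_fate(ancestral_like, {"brain", "liver", "eye"}, {"brain"}))
print(classify_fate(ancestral_like, {"brain", "liver"}, {"brain", "liver"}))
```

The three calls print `subfunctionalization`, `neofunctionalization` and `other` in that order. Treating the outgroup singleton as a stand-in for the ancestral pattern is itself an assumption that a real analysis would need to justify.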
Abstract:
Although the importance of the NOD-like receptor family, pyrin domain containing 3 (NLRP3) inflammasome in health and disease is well appreciated, a precise characterization of NLRP3 expression has yet to be established. To this end, we generated a knock-in mouse in which the Nlrp3 coding sequence was replaced by the enhanced GFP (egfp) gene. In this way, the expression of eGFP is driven by the endogenous regulatory elements of the Nlrp3 gene. In this study, we show that eGFP expression indeed mirrors that of NLRP3. Interestingly, splenic neutrophils, macrophages, and, in particular, monocytes and conventional dendritic cells showed robust eGFP fluorescence, whereas lymphoid subsets, eosinophils, and plasmacytoid dendritic cells showed negligible eGFP levels. NLRP3 expression was highly inducible in macrophages, both by MyD88- and Trif-dependent pathways. In vivo, when mice were challenged with diverse inflammatory stimuli, differences in both the number of eGFP-expressing cells and fluorescence intensity were observed in the draining lymph node. Thus, NLRP3 levels at the site of adaptive response initiation are controlled by recruitment of NLRP3-expressing cells and by NLRP3 induction.
Abstract:
Background: Functional hypothalamic amenorrhea is a reversible form of gonadotropin-releasing hormone (GnRH) deficiency commonly triggered by stressors such as excessive exercise, nutritional deficits, or psychological distress. Women vary in their susceptibility to inhibition of the reproductive axis by such stressors, but it is unknown whether this variability reflects a genetic predisposition to hypothalamic amenorrhea. We hypothesized that mutations in genes involved in idiopathic hypogonadotropic hypogonadism, a congenital form of GnRH deficiency, are associated with hypothalamic amenorrhea. Methods: We analyzed the coding sequence of genes associated with idiopathic hypogonadotropic hypogonadism in 55 women with hypothalamic amenorrhea and performed in vitro studies of the identified mutations. Results: Six heterozygous mutations were identified in 7 of the 55 patients with hypothalamic amenorrhea: two variants in the fibroblast growth factor receptor 1 gene FGFR1 (G260E and R756H), two in the prokineticin receptor 2 gene PROKR2 (R85H and L173R), one in the GnRH receptor gene GNRHR (R262Q), and one in the Kallmann syndrome 1 sequence gene KAL1 (V371I). No mutations were found in a cohort of 422 controls with normal menstrual cycles. In vitro studies showed that FGFR1 G260E, FGFR1 R756H, and PROKR2 R85H are loss-of-function mutations, as has been previously shown for PROKR2 L173R and GNRHR R262Q. Conclusions: Rare variants in genes associated with idiopathic hypogonadotropic hypogonadism are found in women with hypothalamic amenorrhea, suggesting that these mutations may contribute to the variable susceptibility of women to the functional changes in GnRH secretion that characterize hypothalamic amenorrhea. Our observations provide evidence for the role of rare variants in common multifactorial disease. (Funded by the Eunice Kennedy Shriver National Institute of Child Health and Human Development and others; ClinicalTrials.gov number, NCT00494169.)
Differences in the evolutionary history of disease genes affected by dominant or recessive mutations
Abstract:
Background: Global analyses of human disease genes by computational methods have yielded important advances in the understanding of human diseases. Generally these studies have treated the group of disease genes uniformly, thus ignoring the type of disease-causing mutations (dominant or recessive). In this report we present a comprehensive study of the evolutionary history of autosomal disease genes separated by mode of inheritance. Results: We examine differences in protein and coding sequence conservation between dominant and recessive human disease genes. Our analysis shows that disease genes affected by dominant mutations are more conserved than those affected by recessive mutations. This could be a consequence of the fact that recessive mutations remain hidden from selection while heterozygous. Furthermore, we employ functional annotation analysis and investigations into disease severity to support this hypothesis. Conclusion: This study elucidates important differences between dominantly- and recessively-acting disease genes in terms of protein and DNA sequence conservation, paralogy and essentiality. We propose that the division of disease genes by mode of inheritance will enhance both understanding of the disease process and prediction of candidate disease genes in the future.
Abstract:
The murine immediate-early (IE) protein pp89 is a nonstructural virus-encoded phosphoprotein residing in the nucleus of infected cells, where it acts as transcriptional activator. Frequency analysis has shown that in BALB/c mice the majority of virus-specific CTL recognize IE antigens. The present study was performed to assess whether pp89 causes membrane antigen expression detected by IE-specific CTL. Site-directed mutagenesis has been used to delete the introns from gene ieI, encoding pp89, for subsequent integration of the continuous coding sequence into the vaccinia virus genome. After infection with the vaccinia recombinant, the authentic pp89 was expressed in cells that became susceptible to lysis by an IE-specific CTL clone. Priming of mice with the vaccinia recombinant sensitized polyclonal CTL that recognized MCMV-infected cells and transfected cells expressing pp89. Thus, a herpesviral IE polypeptide with essential function in viral transcriptional regulation can also serve as a dominant antigen for the specific CTL response of the host.
Abstract:
Positive selection is widely estimated from protein-coding sequence alignments by the nonsynonymous-to-synonymous ratio omega. Increasingly elaborate codon models are used in a likelihood framework for this estimation. Although there is widespread concern about the robustness of the estimation of the omega ratio, more effort is needed to estimate this robustness, especially in the context of complex models. Here, we focused on the branch-site codon model. We investigated its robustness on a large set of simulated data. First, we investigated the impact of sequence divergence. We found evidence of underestimation of the synonymous substitution rate (dS) for values as small as 0.5, with a slight increase in false positives for the branch-site test. When dS increases further, underestimation of dS is worse, but false positives decrease. Interestingly, the detection of true positives follows a similar distribution, with a maximum for intermediary values of dS. Thus, high dS is more of a concern for a loss of power (false negatives) than for false positives of the test. Second, we investigated the impact of GC content. We showed that there is no significant difference in false positives between high GC (up to ~80%) and low GC (~30%) genes. Moreover, neither shifts of GC content on a specific branch nor major shifts in GC along the gene sequence generate many false positives. Our results confirm that the branch-site test is very conservative.
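Two quantities named in this abstract, the omega ratio and the GC content of a gene, are simple to compute once the underlying values are available. The sketch below only illustrates the definitions: the helper names `omega_ratio` and `gc_content` are made up for this example, and the dN and dS inputs are assumed to come from a codon-model fit that is not shown. It does not implement the branch-site test.

```python
def omega_ratio(dn: float, ds: float) -> float:
    """Return omega = dN / dS for already estimated substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute omega")
    return dn / ds


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C among the unambiguous bases A, C, G, T."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "GC" for b in bases) / len(bases)


# Hypothetical inputs, chosen only to exercise the functions.
print(omega_ratio(0.1, 0.5))          # omega below 1
print(gc_content("ATGGCCGCGTAA"))     # GC fraction of a toy coding sequence
```

An omega above 1 is conventionally read as a sign of positive selection, but in the branch-site setting that conclusion rests on a likelihood comparison between models rather than on a single ratio like this one.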