888 results for Genome Sequence
Abstract:
Alu repeats are interspersed repetitive DNA elements specific to primates that are present in 500,000 to 1 million copies. We show here that an Alu sequence encodes functional binding sites for retinoic acid receptors, which are members of the nuclear receptor family of transcription factors. The consensus sequences for the evolutionarily recent Alu subclasses contain three hexamer half sites, related to the consensus AGGTCA, arranged as direct repeats with a spacing of 2 bp, which is consistent with the binding specificities of retinoic acid receptors. An analysis was made of the DNA binding and transactivation potential of these sites from an Alu sequence that has been previously implicated in the regulation of the keratin K18 gene. These Alu double half sites are shown to bind bacterially synthesized retinoic acid receptors as assayed by electrophoretic mobility shift assays. These sites are further shown to function as a retinoic acid response element in transiently transfected CV-1 cells, increasing transcription of a reporter gene approximately 35-fold. This transactivation requires cotransfection with vectors expressing retinoic acid receptors, as well as the presence of all-trans-retinoic acid, which is consistent with the known function of retinoic acid receptors as ligand-inducible transcription factors. The random insertion of potentially thousands of Alu repeats containing retinoic acid response elements throughout the primate genome is likely to have altered the expression of numerous genes, thereby contributing to evolutionary potential.
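The half-site arrangement described above (hexamers related to AGGTCA, repeated in the same orientation with a 2-bp spacer) can be illustrated with a small scanner. This is a minimal sketch, not the method used in the study: the function names, the forward-strand-only search, and the `max_mismatches` tolerance standing in for "related to the consensus" are all assumptions made for illustration.

```python
CONSENSUS = "AGGTCA"  # half-site consensus quoted in the abstract
SPACER = 2            # spacing between the direct repeats, in bp


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_dr2_sites(seq: str, max_mismatches: int = 1):
    """Return (start, half1, half2) for every pair of half-site-like
    hexamers on the forward strand separated by exactly SPACER bases.

    max_mismatches is a hypothetical tolerance: each hexamer may differ
    from the consensus at up to that many positions.
    """
    seq = seq.upper()
    k = len(CONSENSUS)
    window = 2 * k + SPACER
    hits = []
    for i in range(len(seq) - window + 1):
        half1 = seq[i:i + k]
        half2 = seq[i + k + SPACER:i + window]
        if (mismatches(half1, CONSENSUS) <= max_mismatches
                and mismatches(half2, CONSENSUS) <= max_mismatches):
            hits.append((i, half1, half2))
    return hits
```

A fuller scan would also check the reverse complement and, for three half sites, overlapping pairs; both are left out here to keep the sketch short.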
Abstract:
Simple sequence repeats (SSRs), consisting of tandemly repeated multiple copies of mono-, di-, tri-, or tetranucleotide motifs, are ubiquitous in eukaryotic genomes and are frequently used as genetic markers, taking advantage of their length polymorphism. We have examined the polymorphism of such sequences in the chloroplast genomes of plants, by using a PCR-based assay. GenBank searches identified the presence of several (dA)n.(dT)n mononucleotide stretches in chloroplast genomes. A chloroplast (cp) SSR was identified in three pine species (Pinus contorta, Pinus sylvestris, and Pinus thunbergii) 312 bp upstream of the psbA gene. DNA amplification of this repeated region from 11 pine species identified nine length variants. The polymorphic amplified fragments were isolated and the DNA sequence was determined, confirming that the length polymorphism was caused by variation in the length of the repeated region. In the pines, the chloroplast genome is transmitted through pollen and this PCR assay may be used to monitor gene flow in this genus. Analysis of 305 individuals from seven populations of Pinus leucodermis Ant. revealed the presence of four variants with intrapopulational diversities ranging from 0.000 to 0.629 and an average of 0.320. Restriction fragment length polymorphism analysis of cpDNA on the same populations previously failed to detect any variation. Population subdivision based on cpSSR was higher (Gst = 0.22, where Gst is coefficient of gene differentiation) than that revealed in a previous isozyme study (Gst = 0.05). We anticipate that SSR loci within the chloroplast genome should provide a highly informative assay for the analysis of the genetic structure of plant populations.
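The coefficient of gene differentiation quoted above is conventionally defined as Gst = (Ht - Hs) / Ht, where Hs is the mean within-population diversity and Ht is the diversity of the pooled sample. The sketch below shows that arithmetic on variant counts. It is a simplified illustration under stated assumptions: populations are weighted equally, no sample-size correction is applied, and the abstract does not say which estimator the authors used. The function names are hypothetical.

```python
def diversity(freqs):
    """Gene diversity 1 - sum(p_i^2) for a list of variant frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def gst(populations):
    """Coefficient of gene differentiation, (H_T - H_S) / H_T.

    populations: list of per-population variant counts, with variants
    listed in the same order in every population.
    Assumption: equal population weights and uncorrected diversities.
    """
    freqs = [[count / sum(pop) for count in pop] for pop in populations]
    # H_S: average diversity within populations.
    h_s = sum(diversity(f) for f in freqs) / len(freqs)
    # H_T: diversity computed from the mean variant frequencies.
    mean_freqs = [sum(col) / len(freqs) for col in zip(*freqs)]
    h_t = diversity(mean_freqs)
    if h_t == 0:
        raise ValueError("no variation in the total sample; G_ST is undefined")
    return (h_t - h_s) / h_t
```

With two populations fixed for different variants the function returns 1 (complete differentiation), and with identical variant frequencies it returns 0.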
Abstract:
We have identified an antigen recognized on a human melanoma by autologous cytolytic T lymphocytes. It is encoded by a gene that is expressed in many normal tissues. Remarkably, the sequence coding for the antigenic peptide is located across an exon-intron junction. A point mutation is present in the intron that generates an amino acid change that is essential for the recognition of the peptide by the anti-tumor cytotoxic T lymphocytes. This observation suggests that the T-cell-mediated surveillance of the integrity of the genome may extend to some intronic regions.
Abstract:
Open reading frames in the Plasmodium falciparum genome encode domains homologous to the adhesive domains of the P. falciparum EBA-175 erythrocyte-binding protein (eba-175 gene product) and those of the Plasmodium vivax and Plasmodium knowlesi Duffy antigen-binding proteins. These domains are referred to as Duffy binding-like (DBL), after the receptor that determines P. vivax invasion of Duffy blood group-positive human erythrocytes. Using oligonucleotide primers derived from short regions of conserved sequence, we have developed a reverse transcription-PCR method that amplifies sequences encoding the DBL domains of expressed genes. Products of these reverse transcription-PCR amplifications include sequences of single-copy genes (including eba-175) and variably transcribed genes that cross-hybridize to multiple regions of the genome. Restriction patterns of the multicopy genes show a high degree of polymorphism among different parasite lines, whereas single-copy genes are generally conserved. Characterization of the single-copy genes has identified a gene (ebl-1) that is related to eba-175 and is likely to be involved in erythrocyte invasion.
Abstract:
Chromosome I from the yeast Saccharomyces cerevisiae contains a DNA molecule of approximately 231 kbp and is the smallest naturally occurring functional eukaryotic nuclear chromosome so far characterized. The nucleotide sequence of this chromosome has been determined as part of an international collaboration to sequence the entire yeast genome. The chromosome contains 89 open reading frames and 4 tRNA genes. The central 165 kbp of the chromosome resembles other large sequenced regions of the yeast genome in both its high density and distribution of genes. In contrast, the remaining sequences flanking this DNA that comprise the two ends of the chromosome and make up more than 25% of the DNA molecule have a much lower gene density, are largely not transcribed, contain no genes essential for vegetative growth, and contain several apparent pseudogenes and a 15-kbp redundant sequence. These terminally repetitive regions consist of a telomeric repeat called W', flanked by DNA closely related to the yeast FLO1 gene. The low gene density, presence of pseudogenes, and lack of expression are consistent with the idea that these terminal regions represent the yeast equivalent of heterochromatin. The occurrence of such a high proportion of DNA with so little information suggests that its presence gives this chromosome the critical length required for proper function.
Abstract:
Common bean is a major dietary component in several countries, but its productivity is negatively affected by abiotic stresses. Dissecting candidate genes involved in abiotic stress tolerance is a paramount step toward the improvement of common bean performance under such constraints. This thesis therefore presents a systematic analysis of the DEHYDRATION RESPONSIVE ELEMENT-BINDING (DREB) gene subfamily, which encompasses genes that regulate several processes during stress responses but for which information in common bean is limited. First, a series of in silico analyses with sequences retrieved from the P. vulgaris genome on Phytozome supported the categorization of 54 putative PvDREB genes distributed within six phylogenetic subgroups (A-1 to A-6), along the 11 chromosomes. Second, we cloned four novel PvDREB genes and determined which stresses induce them: PvDREB1F and PvDREB5A are dehydration-, salinity- and cold-inducible, and PvDREB2A and PvDREB6B are dehydration- and cold-inducible. Afterwards, nucleotide polymorphisms were searched for along those genes by Sanger sequencing, revealing a high number of single nucleotide polymorphisms within PvDREB6B in a comparison of Mesoamerican and Andean genotypes. The nomenclature of PvDREB6B is discussed in detail. Furthermore, we used the BARCBean6K_3 SNP platform to identify and genotype the closest SNP to each of the 54 PvDREB genes. We selected PvDREB6B for a broader study encompassing a collection of wild common bean accessions of Mesoamerican origin. The population structure of the wild beans was assessed using sequence polymorphisms of PvDREB6B. The genetic clusters were partially associated with variation in latitude, altitude, precipitation and temperature throughout the areas in which such beans are distributed. With an emphasis on drought stress, an adapted tube-screening method in greenhouse conditions enabled the phenotyping of several drought-related traits in the wild collection. Interestingly, our data revealed a correlation between root depth, plant height and biomass and the environmental data of the location of the accessions. Correlation was also observed between the population structure determined through PvDREB6B and the environmental data. An association study combining data from the SNP array and DREB polymorphisms enabled the detection of SNPs associated with drought-related traits through a compressed mixed linear model (CMLM) analysis. This thesis highlights important features of DREB genes in common bean, revealing candidates for further strategies aimed at improvement of abiotic stress tolerance, with emphasis on drought tolerance.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
The C2 domain is one of the most frequent and widely distributed calcium-binding motifs. Its structure comprises an eight-stranded beta-sandwich with two structural types as if the result of a circular permutation. Combining sequence, structural and modelling information, we have explored, at different levels of granularity, the functional characteristics of several families of C2 domains. At the coarsest level, the similarity correlates with key structural determinants of the C2 domain fold and, at the finest level, with the domain architecture of the proteins containing them, highlighting the functional diversity between the various subfamilies. The functional diversity appears as different conserved surface patches throughout this common fold. In some cases, these patches are related to substrate-binding sites whereas in others they correspond to interfaces of presumably permanent interaction between other domains within the same polypeptide chain. For those related to substrate-binding sites, the predictions overlap with biochemical data in addition to providing some novel observations. For those acting as protein-protein interfaces, our modelling analysis suggests that slight variations between families are a result of not only complementary adaptations in the interfaces involved but also different domain architecture. In the light of the sequence and structural genomic projects, the work presented here shows that modelling approaches along with careful sub-typing of protein families will be a powerful combination for a broader coverage in proteomics. (C) 2003 Elsevier Ltd. All rights reserved.
Abstract:
The maT clade of transposons is a group of transposable elements intermediate in sequence and predicted protein structure to mariner and Tc transposons, with a distribution thus far limited to a few invertebrate species. In the nematode Caenorhabditis elegans, there are eight copies of CemaT1 that are predicted to encode a functional transposase, with five copies being >99% identical. We present evidence, based on searches of publicly available databases and on PCR-based mobility assays, that the CemaT1 transposase is expressed in C. elegans and that the CemaT transposons are capable of excising in both somatic and germline tissues. We also show that the frequency of CemaT1 excisions within the genome of the N2 strain of C. elegans is comparable to that of the Tc1 transposon. However, unlike Tc transposons in mutator strains of C. elegans, maT transposons do not exhibit increased frequencies of mobility, suggesting that maT is not regulated by the same factors that control Tc activity in these strains. Finally, we show that CemaT1 transposons are capable of precise transpositions as well as orientation inversions at some loci, and thereby become members of an increasing number of identified active transposons within the C. elegans genome. (C) 2004 Elsevier B.V. All rights reserved.
Abstract:
The product of the gene (ATM) mutated in the human genetic disorder ataxia-telangiectasia (A-T) is a high molecular weight protein (~350 kDa) containing a C-terminal protein kinase domain and a number of other putative domains not yet functionally defined. The majority of ATM gene mutations in A-T patients are truncating, resulting in prematurely terminated products that are highly unstable. Missense mutations within the kinase domain and elsewhere in the molecule alter the stability of the protein and lead to loss of protein kinase activity. Only rarely are patients observed with two missense mutations and this gives rise to a milder disease phenotype. Evidence for a dominant interfering effect on normal ATM kinase activity has been reported in cell lines transfected with missense mutant ATM and in cell lines from some A-T heterozygotes. The dominant negative effect of mutant ATM is manifested by an enhancement of cellular radiosensitivity and may be responsible for the cancer predisposition observed in carriers of ATM missense mutations. In this review, we explore the domain structure of the ATM molecule, sites of interaction with other proteins and the consequences of specific amino acid changes on function. (C) 2003 Elsevier B.V. All rights reserved.
Abstract:
The SOX family of transcription factors is found throughout the animal kingdom and is important in a variety of developmental contexts. Genome analysis has identified 20 Sox genes in human and mouse, which can be subdivided into 8 groups, based on sequence comparison and intron-exon structure. Most of the SOX groups identified in mammals are represented by a single SOX sequence in invertebrate model organisms, suggesting a duplication and divergence mechanism has operated during vertebrate evolution. We have now analysed the Sox gene complement in the pufferfish, Fugu rubripes, in order to shed further light on the diversity and origins of the Sox gene family. Major differences were found between the Sox family in Fugu and those in humans and mice. In particular, Fugu does not have orthologues of Sry, Sox15 and Sox30, which appear to be specific to mammals, while Sox19, found in Fugu and zebrafish but absent in mammals, seems to be specific to fishes. Six mammalian Sox genes are represented by two copies each in Fugu, indicating a large-scale gene duplication in the fish lineage. These findings point to recent Sox gene loss, duplication and divergence occurring during the evolution of tetrapod and teleost lineages, and provide further evidence for large-scale segmental or a whole-genome duplication occurring early in the radiation of teleosts. (C) 2004 Elsevier B.V. All rights reserved.
Abstract:
The maT clade of transposons is a group of transposable elements intermediate in sequence and predicted protein structure to mariner and Tc transposons, with a distribution thus far limited to a few invertebrate species. We present evidence, based on searches of publicly available databases, that the nematode Caenorhabditis briggsae has several maT-like transposons, which we have designated as CbmaT elements, dispersed throughout its genome. We also describe two additional transposon sequences that probably share their evolutionary history with the CbmaT transposons. One resembles a fold back variant of a CbmaT element, with long (380-bp) inverted terminal repeats (ITRs) that show a high degree (71%) of identity to CbmaT1. The other, which shares only the 26-bp ITR sequences with one of the CbmaT variants, is present in eight nearly identical copies, but does not have a transposase gene and may therefore be cross mobilised by a CbmaT transposase. Using PCR-based mobility assays, we show that CbmaT1 transposons are capable of excising from the C. briggsae genome. CbmaT1 excised approximately 500 times less frequently than Tcb1 in the reference strain AF16, but both CbmaT1 and Tcb1 excised at extremely high frequencies in the HK105 strain. The HK105 strain also exhibited a high frequency of spontaneous induction of unc-22 mutants, suggesting that it may be a mutator strain of C. briggsae.
Abstract:
Recent large-scale analyses of mainly full-length cDNA libraries generated from a variety of mouse tissues indicated that almost half of all representative cloned sequences did not contain an apparent protein-coding sequence, and were putatively derived from non-protein-coding RNA (ncRNA) genes. However, many of these clones were singletons and the majority were unspliced, raising the possibility that they may be derived from genomic DNA or unprocessed pre-mRNA contamination during library construction, or alternatively represent nonspecific transcriptional noise. Here we show, using reverse transcriptase-dependent PCR, microarray, and Northern blot analyses, that many of these clones were derived from genuine transcripts of unknown function whose expression appears to be regulated. The ncRNA transcripts have larger exons and fewer introns than protein-coding transcripts. Analysis of the genomic landscape around these sequences indicates that some cDNA clones were produced not from terminal poly(A) tracts but internal priming sites within longer transcripts, only a minority of which is encompassed by known genes. A significant proportion of these transcripts exhibit tissue-specific expression patterns, as well as dynamic changes in their expression in macrophages following lipopolysaccharide stimulation. Taken together, the data provide strong support for the conclusion that ncRNAs are an important, regulated component of the mammalian transcriptome.
Abstract:
High-quality data about protein structures and their gene sequences are essential to the understanding of the relationship between protein folding and protein coding sequences. Firstly we constructed the EcoPDB database, which is a high-quality database of Escherichia coli genes and their corresponding PDB structures. Based on EcoPDB, we presented a novel approach based on information theory to investigate the correlation between cysteine synonymous codon usages and local amino acids flanking cysteines, the correlation between cysteine synonymous codon usages and synonymous codon usages of local amino acids flanking cysteines, as well as the correlation between cysteine synonymous codon usages and the disulfide bonding states of cysteines in the E. coli genome. The results indicate that the nearest neighboring residues and their synonymous codons of the C-terminus have the greatest influence on the usages of the synonymous codons of cysteines and the usage of the synonymous codons has a specific correlation with the disulfide bond formation of cysteines in proteins. The correlations may result from the regulation mechanism of protein structures at gene sequence level and reflect the biological function restriction that cysteines pair to form disulfide bonds. The results may also be helpful in identifying residues that are important for synonymous codon selection of cysteines to introduce disulfide bridges in protein engineering and molecular biology. The approach presented in this paper can also be utilized as a complementary computational method and be applicable to analyse the synonymous codon usages in other model organisms. (c) 2005 Elsevier Ltd. All rights reserved.
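The abstract does not name the information-theoretic quantity it uses. As one plausible illustration, the mutual information between the cysteine codon (TGT or TGC) and a flanking residue measures how much knowing one reduces uncertainty about the other. The sketch below is an assumption-laden example, not the authors' procedure: the function name and the input format (an iterable of `(codon, neighbour)` pairs) are hypothetical, and probabilities are plain observed frequencies with no small-sample correction.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Mutual information I(X;Y) in bits between the two members of
    each (x, y) pair, e.g. (cysteine codon, flanking amino acid).

    Probabilities are estimated as observed frequencies (an assumption;
    no correction for small samples is applied).
    """
    pairs = list(pairs)
    n = len(pairs)
    if n == 0:
        raise ValueError("at least one observation is required")
    joint = Counter(pairs)
    count_x = Counter(x for x, _ in pairs)
    count_y = Counter(y for _, y in pairs)
    # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y).
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in joint.items()
    )
```

A value of 0 bits means codon choice and neighbour are independent in the sample; for a two-codon amino acid such as cysteine the maximum is 1 bit.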
Abstract:
Mammalian promoters can be separated into two classes: conserved TATA box-enriched promoters, which initiate at a well-defined site, and more plastic, broad and evolvable CpG-rich promoters. We have sequenced tags corresponding to several hundred thousand transcription start sites (TSSs) in the mouse and human genomes, allowing precise analysis of the sequence architecture and evolution of distinct promoter classes. Different tissues and families of genes differentially use distinct types of promoters. Our tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini. Among the TSSs, we identified new start sites associated with the majority of exons and with 3' UTRs. These data permit genome-scale identification of tissue-specific promoters and analysis of the cis-acting elements associated with them.