25 results for 270202 Genome Structure

in National Center for Biotechnology Information - NCBI


Relevance:

80.00%

Publisher:

Abstract:

The Protein Information Resource, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the most comprehensive and expertly annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database. To provide timely and high quality annotation and promote database interoperability, the PIR-International employs rule-based and classification-driven procedures based on controlled vocabulary and standard nomenclature and includes status tags to distinguish experimentally determined from predicted protein features. The database contains about 200 000 non-redundant protein sequences, which are classified into families and superfamilies and their domains and motifs identified. Entries are extensively cross-referenced to other sequence, classification, genome, structure and activity databases. The PIR web site features search engines that use sequence similarity and database annotation to facilitate the analysis and functional identification of proteins. The PIR-International databases and search tools are accessible on the PIR web site at http://pir.georgetown.edu/ and at the MIPS web site at http://www.mips.biochem.mpg.de. The PIR-International Protein Sequence Database and other files are also available by FTP.

Relevance:

80.00%

Publisher:

Abstract:

For the most part, studies of grass genome structure have been limited to the generation of whole-genome genetic maps or the fine structure and sequence analysis of single genes or gene clusters. We have investigated large contiguous segments of the genomes of maize, sorghum, and rice, primarily focusing on intergenic spaces. Our data indicate that much (>50%) of the maize genome is composed of interspersed repetitive DNAs, primarily nested retrotransposons that insert between genes. These retroelements are less abundant in smaller genome plants, including rice and sorghum. Although 5- to 200-kb blocks of methylated, presumably heterochromatic, retrotransposons flank most maize genes, rice and sorghum genes are often adjacent. Similar genes are commonly found in the same relative chromosomal locations and orientations in each of these three species, although there are numerous exceptions to this collinearity (i.e., rearrangements) that can be detected at the levels of both the recombinational map and cloned DNA. Evolutionarily conserved sequences are largely confined to genes and their regulatory elements. Our results indicate that a knowledge of grass genome structure will be a useful tool for gene discovery and isolation, but the general rules and biological significance of grass genome organization remain to be determined. Moreover, the nature and frequency of exceptions to the general patterns of grass genome structure and collinearity are still largely unknown and will require extensive further investigation.

Relevance:

40.00%

Publisher:

Abstract:

The function of a protein generally is determined by its three-dimensional (3D) structure. Thus, it would be useful to know the 3D structure of the thousands of protein sequences that are emerging from the many genome projects. To this end, fold assignment, comparative protein structure modeling, and model evaluation were automated completely. As an illustration, the method was applied to the proteins in the Saccharomyces cerevisiae (baker’s yeast) genome. It resulted in all-atom 3D models for substantial segments of 1,071 (17%) of the yeast proteins, only 40 of which have had their 3D structure determined experimentally. Of the 1,071 modeled yeast proteins, 236 were related clearly to a protein of known structure for the first time; 41 of these previously have not been characterized at all.

Relevance:

40.00%

Publisher:

Abstract:

The Schizosaccharomyces pombe sod2 gene, located near the telomere on the long arm of chromosome I, encodes a Na+ (or Li+)/H+ antiporter. Amplification of sod2 has previously been shown to confer resistance to LiCl. We analyzed 20 independent LiCl-resistant strains and found that the only observed mechanism of resistance is amplification of sod2. The amplicons are linear, extrachromosomal elements either 225 or 180 kb long, containing both sod2 and telomere sequences. To determine whether proximity to a telomere is necessary for sod2 amplification, a strain was constructed in which the gene was moved to the middle of the same chromosomal arm. Selection of LiCl-resistant strains in this genetic background also yielded amplifications of sod2, but in this case the amplified DNA was exclusively chromosomal. Thus, proximity to a telomere is not a prerequisite for gene amplification in S. pombe but does affect the mechanism. Relative to wild-type cells, mutants with defects in the DNA damage aspect of the rad checkpoint control pathway had an increased frequency of sod2 amplification, whereas mutants defective in the S-phase completion checkpoint did not. Two models for generating the amplified DNA are presented.

Relevance:

30.00%

Publisher:

Abstract:

Integration of transgenic DNA into the plant genome was investigated in 13 transgenic oat (Avena sativa L.) lines produced using microprojectile bombardment with one or two cotransformed plasmids. In all transformation events, the transgenic DNA integrated into the plant genome consisted of intact transgene copies that were accompanied by multiple, rearranged, and/or truncated transgene fragments. All fragments of transgenic DNA cosegregated, indicating that they were integrated at single gene loci. Analysis of the structure of the transgenic loci indicated that the transgenic DNA was interspersed by the host genomic DNA. The number of insertions of transgenic DNA within the transgene loci varied from 2 to 12 among the 13 lines. Restriction endonucleases that do not cleave the introduced plasmids produced restriction fragments ranging from 3.6 to about 60 kb in length hybridizing to a probe comprising the introduced plasmids. Although the size of the interspersing host DNA within the transgene locus is unknown, the sizes of the transgene-hybridizing restriction fragments indicated that the entire transgene locus must be at least 35–280 kb. The observation that all transgenic lines analyzed exhibited genomic interspersion of multiple clustered transgenes suggests a predominating integration mechanism. We propose that transgene integration at multiple clustered DNA replication forks could account for the observed interspersion of transgenic DNA with host genomic DNA within transgenic loci.

Relevance:

30.00%

Publisher:

Abstract:

The RNA phage Qβ requires an RNA-binding protein, called Qβ host factor or Hfq protein, for the replication of its genome. Our previous results suggested that this protein mediates the access of replicase to the 3′-end of the Qβ plus strand RNA. Here we report the results of an evolutionary experiment in which phage Qβ was adapted to an Escherichia coli Q13 host strain with an inactivated host factor (hfq) gene. This strain initially produced phage at a titer ≈10,000-fold lower than the wild-type strain and with minute plaque morphology, but after 12 growth cycles, phage titer and plaque size had evolved to levels near those of the wild-type host. RNAs isolated from adapted Qβ mutants were efficient templates for replicase without host factor in vitro. Electron microscopy showed that mutant RNAs, in contrast to wild-type RNA, efficiently interacted with replicase at the 3′-end in the absence of host factor. The same set of four mutations in the 3′-terminal third of the genome was found in several independently evolved phage clones. One mutation disrupts the base pairing of the 3′-terminal CCC-OH sequence, suggesting that the host factor stimulates activity of the wild-type RNA template by melting out its 3′-end.

Relevance:

30.00%

Publisher:

Abstract:

A crucial step in exploiting the information inherent in genome sequences is to assign to each protein sequence its three-dimensional fold and biological function. Here we describe fold assignment for the proteins encoded by the small genome of Mycoplasma genitalium. The assignment was carried out by our computer server (http://www.doe-mbi.ucla.edu/people/frsvr/frsvr.html), which assigns folds to amino acid sequences by comparing sequence-derived predictions with known structures. Of the total of 468 protein ORFs, 103 (22%) can be assigned a known protein fold with high confidence, as cross-validated with tests on known structures. Of these sequences, 75 (16%) show enough sequence similarity to proteins of known structure that they can also be detected by traditional sequence–sequence comparison methods. That is, the remaining 28 sequences (6%) are assignable by the sequence–structure method of the server but not by current sequence–sequence methods. Of the remaining 78% of sequences in the genome, 18% belong to membrane proteins and the remaining 60% cannot be assigned either because these sequences correspond to no presently known fold or because of insensitivity of the method. At the current rate of determination of new folds by x-ray and NMR methods, extrapolation suggests that folds will be assigned to most soluble proteins in the next decade.
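The counts and percentages quoted in this abstract can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the three counts come from the abstract, while the variable names and the `pct` helper are assumptions introduced here, not part of the authors' server.

```python
# Counts quoted in the abstract; all names below are illustrative.
TOTAL_ORFS = 468        # protein ORFs in the M. genitalium genome
ASSIGNED = 103          # folds assigned with high confidence
SEQ_DETECTABLE = 75     # also detectable by sequence-sequence comparison


def pct(count, total=TOTAL_ORFS):
    """Share of all ORFs, rounded to the nearest whole percent."""
    return round(100 * count / total)


# Assignable only by the sequence-structure method.
structure_only = ASSIGNED - SEQ_DETECTABLE
# ORFs left without a high-confidence fold assignment.
unassigned = TOTAL_ORFS - ASSIGNED

summary = {
    "assigned_pct": pct(ASSIGNED),
    "seq_detectable_pct": pct(SEQ_DETECTABLE),
    "structure_only_pct": pct(structure_only),
    "unassigned_pct": pct(unassigned),
}
print(summary)
```

The rounded shares agree with the 22%, 16%, 6% and 78% figures in the abstract, and the 78% remainder is the sum of the 18% membrane proteins and the 60% that could not be assigned.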

Relevance:

30.00%

Publisher:

Abstract:

Chromosomal forms of Anopheles gambiae, given the informal designations Bamako, Mopti, and Savannah, have been recognized by the presence or absence of four paracentric inversions on chromosome 2. Studies of karyotype frequencies at sites where the forms occur in sympatry have led to the suggestion that these forms represent species. We conducted a study of the genetic structure of populations of An. gambiae from two villages in Mali, west Africa. Populations at each site were composed of the Bamako and Mopti forms and the sibling species, Anopheles arabiensis. Karyotypes were determined for each individual mosquito and genotypes at 21 microsatellite loci determined. A number of the microsatellites have been physically mapped to polytene chromosomes, making it possible to select loci based on their position relative to the inversions used to define forms. We found that the chromosomal forms differ at all loci on chromosome 2, but there were few differences for loci on other chromosomes. Geographic variation was small. Gene flow appears to vary among different regions within the genome, being lowest on chromosome 2, probably due to hitchhiking with the inversions. We conclude that the majority of observed genetic divergence between chromosomal forms can be explained by forces that need not involve reproductive isolation, although reproductive isolation is not ruled out. We found low levels of gene flow between the sibling species Anopheles gambiae and Anopheles arabiensis, similar to estimates based on observed frequencies of hybrid karyotypes in natural populations.

Relevance:

30.00%

Publisher:

Abstract:

Submillimolar levels of calcium, similar to the physiological total (bound + free) intranuclear concentration (0.01–1 mM), induced a conformational change within d(TG/AC)n, one of the frequent dinucleotide repeats of the mammalian genome. This change is calcium-specific, because no other tested cation induced it, and it was detected as a concentration-dependent transition from B- to a non-B-DNA conformation expanding from the 3′ end toward the 5′ end of the repeat. Genomic footprinting of various rat brain regions revealed the existence of a similar non-B-DNA conformation within a d(TG/AC)28 repeat of the endogenous enkephalin gene only in the enkephalin-expressing caudate nucleus and not in the nonexpressing thalamus. Binding assays demonstrated that DNA can bind calcium and can compete with calmodulin for calcium.

Relevance:

30.00%

Publisher:

Abstract:

Hepatitis C virus (HCV) helicase, non-structural protein 3 (NS3), is proposed to aid in HCV genome replication and is considered a target for inhibition of HCV. In order to investigate the substrate requirements for nucleic acid unwinding by NS3, substrates were prepared by annealing a 30mer oligonucleotide to a 15mer. The resulting 15 bp duplex contained a single-stranded DNA overhang of 15 nt referred to as the bound strand. Other substrates were prepared in which the 15mer DNA was replaced by a strand of peptide nucleic acid (PNA). The PNA–DNA substrate was unwound by NS3, but the observed rate of strand separation was at least 25-fold slower than for the equivalent DNA–DNA substrate. Binding of NS3 to the PNA–DNA substrate was similar to the DNA–DNA substrate, due to the fact that NS3 initially binds to the single-stranded overhang, which was identical in each substrate. A PNA–RNA substrate was not unwound by NS3 under similar conditions. In contrast, morpholino–DNA and phosphorothioate–DNA substrates were utilized as efficiently by NS3 as DNA–DNA substrates. These results indicate that the PNA–DNA and PNA–RNA heteroduplexes adopt structures that are unfavorable for unwinding by NS3, suggesting that the unwinding activity of NS3 is sensitive to the structure of the duplex.

Relevance:

30.00%

Publisher:

Abstract:

DNMT2 is a human protein that displays strong sequence similarities to DNA (cytosine-5)-methyltransferases (m5C MTases) of both prokaryotes and eukaryotes. DNMT2 contains all 10 sequence motifs that are conserved among m5C MTases, including the consensus S-adenosyl-l-methionine-binding motifs and the active site ProCys dipeptide. DNMT2 has close homologs in plants, insects and Schizosaccharomyces pombe, but no related sequence can be found in the genomes of Saccharomyces cerevisiae or Caenorhabditis elegans. The crystal structure of a deletion mutant of DNMT2 complexed with S-adenosyl-l-homocysteine (AdoHcy) has been determined at 1.8 Å resolution. The structure of the large domain that contains the sequence motifs involved in catalysis is remarkably similar to that of M.HhaI, a confirmed bacterial m5C MTase, and the smaller target recognition domains of DNMT2 and M.HhaI are also closely related in overall structure. The small domain of DNMT2 contains three short helices that are not present in M.HhaI. DNMT2 binds AdoHcy in the same conformation as confirmed m5C MTases and, while DNMT2 shares all sequence and structural features with m5C MTases, it has failed to demonstrate detectable transmethylase activity. We show here that homologs of DNMT2, which are present in some organisms that are not known to methylate their genomes, contain a specific target-recognizing sequence motif including an invariant CysPheThr tripeptide. DNMT2 binds DNA to form a denaturant-resistant complex in vitro. While the biological function of DNMT2 is not yet known, the strong binding to DNA suggests that DNMT2 may mark specific sequences in the genome by binding to DNA through the specific target-recognizing motif.

Relevance:

30.00%

Publisher:

Abstract:

GOBASE (http://megasun.bch.umontreal.ca/gobase/) is a network-accessible biological database, which is unique in bringing together diverse biological data on organelles with taxonomically broad coverage, and in furnishing data that have been exhaustively verified and completed by experts. So far, we have focused on mitochondrial data: GOBASE contains all published nucleotide and protein sequences encoded by mitochondrial genomes, selected RNA secondary structures of mitochondria-encoded molecules, genetic maps of completely sequenced genomes, taxonomic information for all species whose sequences are present in the database and organismal descriptions of key protistan eukaryotes. All of these data have been integrated and organized in a formal database structure to allow sophisticated biological queries using terms that are inherent in biological concepts. Most importantly, data have been validated, completed, corrected and standardized, a prerequisite of meaningful analysis. In addition, where critical data are lacking, such as genetic maps and RNA secondary structures, they are generated by the GOBASE team and collaborators, and added to the database. The database is implemented in a relational database management system, but features an object-oriented view of the biological data through a Web/Genera-generated World Wide Web interface. Finally, we have developed software for database curation (i.e. data updates, validation and correction), which will be described in some detail in this paper.

Relevance:

30.00%

Publisher:

Abstract:

We report the genetic organisation of six prophages present in the genome of Lactococcus lactis IL1403. The three larger prophages (36–42 kb) belong to the already described P335 group of temperate phages, whereas the three smaller ones (13–15 kb) are most probably satellites relying on helper phage(s) for multiplication. These data give a new insight into the genetic structure of lactococcal phage populations. P335 temperate phages have variable genomes, sharing homology over only 10–33% of their length. In contrast, virulent phages have highly similar genomes sharing homology over >90% of their length. Further analysis of genetic structure in all known groups of phages active on other bacterial hosts such as Escherichia coli, Bacillus subtilis, Mycobacterium and Streptococcus thermophilus confirmed the existence of two types of genetic structure related to the phage way of life. This might reflect different intensities of horizontal DNA exchange: low among purely virulent phages and high among temperate phages and their lytic homologues. We suggest that the constraints on genetic exchange among purely virulent phages reflect their optimal genetic organisation, adapted to a more specialised and extreme form of parasitism than temperate/lytic phages.

Relevance:

30.00%

Publisher:

Abstract:

As the number of protein folds is quite limited, a mode of analysis that will be increasingly common in the future, especially with the advent of structural genomics, is to survey and re-survey the finite parts list of folds from an expanding number of perspectives. We have developed a new resource, called PartsList, that lets one dynamically perform these comparative fold surveys. It is available on the web at http://bioinfo.mbb.yale.edu/partslist and http://www.partslist.org. The system is based on the existing fold classifications and functions as a form of companion annotation for them, providing ‘global views’ of many already completed fold surveys. The central idea in the system is that of comparison through ranking; PartsList will rank the approximately 420 folds based on more than 180 attributes. These include: (i) occurrence in a number of completely sequenced genomes (e.g. it will show the most common folds in the worm versus yeast); (ii) occurrence in the structure databank (e.g. most common folds in the PDB); (iii) both absolute and relative gene expression information (e.g. most changing folds in expression over the cell cycle); (iv) protein–protein interactions, based on experimental data in yeast and comprehensive PDB surveys (e.g. most interacting fold); (v) sensitivity to inserted transposons; (vi) the number of functions associated with the fold (e.g. most multi-functional folds); (vii) amino acid composition (e.g. most Cys-rich folds); (viii) protein motions (e.g. most mobile folds); and (ix) the level of similarity based on a comprehensive set of structural alignments (e.g. most structurally variable folds). The integration of whole-genome expression and protein–protein interaction data with structural information is a particularly novel feature of our system. 
We provide three ways of visualizing the rankings: a profiler emphasizing the progression of high and low ranks across many pre-selected attributes, a dynamic comparer for custom comparisons and a numerical rankings correlator. These allow one to directly compare very different attributes of a fold (e.g. expression level, genome occurrence and maximum motion) in the uniform numerical format of ranks. This uniform framework, in turn, highlights the way that the frequency of many of the attributes falls off with approximate power-law behavior (i.e. according to V^(-b), for attribute value V and constant exponent b), with a few folds having large values and most having small values.
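The power-law fall-off mentioned in this abstract can be illustrated with a short sketch. If the frequency of an attribute value V is proportional to V^(-b), then log(frequency) is linear in log(V) with slope -b, so b can be estimated by a least-squares line fit in log-log space. The function name and the synthetic data below are hypothetical and chosen only for illustration; this is not PartsList code, and the abstract does not say how the authors estimated the exponent.

```python
import math


def fit_power_law_exponent(values, frequencies):
    """Estimate b in frequency ~ V**(-b) by least squares in log-log space."""
    xs = [math.log(v) for v in values]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope  # the slope of the log-log line is -b


# Synthetic, noise-free data generated with b = 1.5 (hypothetical values).
values = [1, 2, 4, 8, 16]
frequencies = [1000 * v ** -1.5 for v in values]

b = fit_power_law_exponent(values, frequencies)
print(f"estimated exponent b = {b:.3f}")
```

On real attribute data the points would scatter around the line, and a plain log-log regression is only a rough estimate of the exponent.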

Relevance:

30.00%

Publisher:

Abstract:

The poly(A)-binding protein (PABP) recognizes the 3′ mRNA poly(A) tail and plays an essential role in eukaryotic translation initiation and mRNA stabilization/degradation. PABP is a modular protein, with four N-terminal RNA-binding domains and an extensive C terminus. The C-terminal region of PABP is essential for normal growth in yeast and has been implicated in mediating PABP homo-oligomerization and protein–protein interactions. A small, proteolytically stable, highly conserved domain has been identified within this C-terminal segment. Remarkably, this domain is also present in the hyperplastic discs protein (HYD) family of ubiquitin ligases. To better understand the function of this conserved region, an x-ray structure of the PABP-like segment of the human HYD protein has been determined at 1.04-Å resolution. The conserved domain adopts a novel fold resembling a right-handed supercoil of four α-helices. Sequence profile searches and comparative protein structure modeling identified a small ORF from the Arabidopsis thaliana genome that encodes a structurally similar but distantly related PABP/HYD domain. Phylogenetic analysis of the experimentally determined (HYD) and homology modeled (PABP) protein surfaces revealed a conserved feature that may be responsible for binding to a PABP interacting protein, Paip1, and other shared interaction partners.