980 results for genome structure
Abstract:
The planctomycetes are a phylum of bacteria with unique cell compartmentalisation, yeast-like budding cell division, and peptidoglycan-less proteinaceous cell walls. We wished to further our understanding of these unique organisms at the molecular level by searching for conserved amino acid sequence motifs and domains in the proteins encoded by Rhodopirellula baltica. Using BLAST and single-linkage clustering, we have discovered several new protein domains and sequence motifs in this planctomycete. R. baltica has multiple members of the newly discovered GEFGR protein family and the ASPIC C-terminal domain family, whilst most other organisms for which whole genome sequence is available have no more than one. Many of the domains and motifs appear to be restricted to the planctomycetes. These protein domains and motifs may have been lost or replaced in other phyla, or they may have undergone multiple duplication events in the planctomycete lineage. One of the novel motifs probably represents a novel N-terminal export signal peptide. Given their unique cell biology, the planctomycete cell compartmentalisation plan in particular may need special membrane transport mechanisms. The discovery of these new domains and motifs, many of which are associated with secretion and cell-surface functions, will help to stimulate experimental work and thus enhance further understanding of this fascinating group of organisms. (C) 2004 Federation of European Microbiological Societies. Published by Elsevier B.V. All rights reserved.
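The abstract above names single-linkage clustering of BLAST hits as its method without giving details. As a minimal sketch of that general technique (not the authors' pipeline), the Python below groups sequence IDs so that any two sequences connected by a chain of significant pairwise hits end up in the same cluster. The input tuple layout, the function name `single_linkage_clusters`, and the E-value cutoff of 1e-5 are all assumptions made for illustration.

```python
from collections import defaultdict


def single_linkage_clusters(hits, max_evalue=1e-5):
    """Group sequence IDs into single-linkage clusters.

    hits: iterable of (query_id, subject_id, evalue) tuples, for example
    parsed from tabular BLAST output (assumed layout).
    max_evalue: assumed significance cutoff; not taken from the abstract.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        # Register both IDs so unlinked sequences become singleton clusters.
        root_q, root_s = find(query), find(subject)
        if evalue <= max_evalue and root_q != root_s:
            parent[root_q] = root_s  # one significant hit is enough to link

    clusters = defaultdict(set)
    for seq_id in list(parent):
        clusters[find(seq_id)].add(seq_id)
    # Largest clusters first, then alphabetical, for stable output.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))
```

Because one significant hit is enough to merge two groups, single linkage can chain unrelated multi-domain proteins together through a shared domain, so clusters produced this way usually need inspection before being called a family.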
Abstract:
The EF-hand superfamily of calcium binding proteins includes the S100, calcium binding protein, and troponin subfamilies. This study represents a genome, structure, and expression analysis of the S100 protein family, in mouse, human, and rat. We confirm the high level of conservation between mammalian sequences but show that four members, including S100A12, are present only in the human genome. We describe three new members of the S100 family in the three species and their locations within the S100 genomic clusters and propose a revised nomenclature and phylogenetic relationship between members of the EF-hand superfamily. Two of the three new genes were induced in bone-marrow-derived macrophages activated with bacterial lipopolysaccharide, suggesting a role in inflammation. Normal human and murine tissue distribution profiles indicate that some members of the family are expressed in a specific manner, whereas others are more ubiquitous. Structure-function analysis of the chemotactic properties of murine S100A8 and human S100A12, particularly within the active hinge domain, suggests that the human protein is the functional homolog of the murine protein. Strong similarities between the promoter regions of human S100A12 and murine S100A8 support this possibility. This study provides insights into the possible processes of evolution of the EF-hand protein superfamily. Evolution of the S100 proteins appears to have occurred in a modular fashion, also seen in other protein families such as the C2H2-type zinc-finger family. (C) 2004 Elsevier Inc. All rights reserved.
Abstract:
The number of mammalian transcripts identified by full-length cDNA projects and genome sequencing projects is increasing remarkably. Clustering them into a strictly nonredundant and comprehensive set provides a platform for functional analysis of the transcriptome and proteome, but the quality of the clustering and predictive usefulness have previously required manual curation to identify truncated transcripts and inappropriate clustering of closely related sequences. A Representative Transcript and Protein Sets (RTPS) pipeline was previously designed to identify the nonredundant and comprehensive set of mouse transcripts based on clustering of a large mouse full-length cDNA set (FANTOM2). Here we propose an alternative method that is more robust, requires less manual curation, and is applicable to other organisms in addition to mouse. RTPSs of human, mouse, and rat have been produced by this method and used for validation. Their comprehensiveness and quality are discussed by comparison with other clustering approaches. The RTPSs are available at ftp://fantom2.gsc.riken.go.jp/RTPS/. (C) 2004 Elsevier Inc. All rights reserved.
Abstract:
The promoter regions of plant pararetroviruses direct transcription of the full-length viral genome into a pregenomic RNA that is an intermediate in the replication of the virus. It serves as template for reverse transcription and as polycistronic mRNA for translation to viral proteins. We have identified functional promoter elements in the intergenic region of the Cavendish isolate of Banana streak virus (BSV-Cav), a member of the genus Badnavirus. Potential binding sites for plant transcription factors were found both upstream and downstream of the transcription start site by homology search in the PLACE database of plant cis-acting elements. The functionality of these putative cis-acting elements was tested by constructing loss-of-function and regain-of-function mutant promoters whose activity was quantified in embryogenic sugarcane suspension cells. Four regions that are important for activity of the BSV-Cav promoter were identified: the region containing an as-1-like element; the region from around -141 down to -77, containing several putative transcription factor binding sites; the region including the CAAT-box; and the leader region. The results could help explain the high BSV-Cav promoter activity that was observed previously in transgenic sugarcane plants and give more insight into the plant cell-mediated replication of the viral genome in banana streak disease. (C) 2004 Elsevier B.V. All rights reserved.
Abstract:
Molecular diversity among 421 clones of cultivated sugarcane and wild relatives was analysed using AFLP markers. Of these clones, 270 were Saccharum officinarum and 151 were either cultivars produced by the Australian breeding program or important parents used in the breeding program. The S. officinarum clones were obtained from a collection that contained clones from all the major regions where S. officinarum is grown. Five AFLP primer combinations generated 657 markers, of which 614 were polymorphic. All clones contained a large number of markers, a result of the polyploid nature and heterozygosity of the genome. S. officinarum clones from New Guinea displayed greater diversity than S. officinarum clones from other regions. This is in agreement with the hypothesis that New Guinea is the centre of origin of this species. The S. officinarum clones from Hawaii and Fiji formed a separate group and may correspond to clones that have been introgressed with other members of the 'Saccharum complex'. Greater diversity was found in the cultivars than in the S. officinarum clones due to the introgression of S. spontaneum chromatin. These cultivars clustered as expected based on pedigree. The major contribution of clones QN66-2008 and Nco310 to Australian sugarcane cultivars divided the cultivars into 2 main groups. Although only a few S. officinarum clones are known to have been used in the breeding of current cultivars, about 90% of markers present in the S. officinarum clone collection (2n = 80) were also present in the cultivar collection. This suggests that most of the observed genetic diversity in S. officinarum has been captured in Australian sugarcane germplasm.
Abstract:
We describe the creation process of the Minimum Information Specification for In Situ Hybridization and Immunohistochemistry Experiments (MISFISHIE). Using the existing minimum information specification for microarray data as a model, we created a new specification for gene expression localization experiments, initially to facilitate data sharing within a consortium. After successful use within the consortium, the specification was circulated to members of the wider biomedical research community for comment and refinement. After a period of acquiring many new suggested requirements, it was necessary to enter a final phase of excluding those requirements that were deemed inappropriate as a minimum requirement for all experiments. The full specification will soon be published as a version 1.0 proposal to the community, upon which a fuller discussion must take place so that the final specification may be achieved with the involvement of the whole community. This paper is part of the special issue of OMICS on data standards.
Abstract:
BRCA1 is a tumor suppressor that functions in controlling cell growth and maintaining genomic stability. BRCA1 has also been implicated in telomere maintenance through its ability to regulate the transcription of hTERT, the catalytic subunit of telomerase, resulting in telomere shortening, and to colocalize with the telomere-binding protein TRF1. The high incidence of nonreciprocal translocations in tumors arising from BRCA1 mutation carriers and Brca1-null mice also raises the possibility that BRCA1 plays a role in telomere protection. To date, however, the consequences for telomere status of disrupting BRCA1 have not been reported. To examine the role of BRCA1 in telomere regulation, we have expressed a dominant-negative mutant of BRCA1 (trBRCA1), known to disrupt multiple functions of BRCA1, in telomerase-positive mammary epithelial cells (SVCT) and telomerase-negative ALT cells (GM847). In SVCT cells, expression of trBRCA1 resulted in an increased incidence of anaphase bridges and in an increase in telomere length, but no change in telomerase activity. In GM847 cells, trBRCA1 also increased anaphase bridge formation but did not induce any change in telomere length. BRCA1 colocalized with TRF2 in telomerase-positive cells and with a small subset of ALT-associated PML bodies (APBs) in ALT cells. Together, these results raise the possibility that BRCA1 could play a role in telomere protection and suggest a potential mechanism for one of the phenotypes of BRCA1 deficient cells. (c) 2005 Wiley-Liss, Inc.
Abstract:
The arrangement of genes in the mitochondrial (mt) genomes of most insects is the same as, or near-identical to, that inferred to be ancestral for insects. We sequenced the entire mt genome of the small pigeon louse, Campanulotes bidentatus compar, and part of the mt genomes of nine other species of lice. These species were from six families and the three main suborders of the order Phthiraptera. There was no variation in gene arrangement among species within a family but there was much variation in gene arrangement among the three suborders of lice. There has been an extraordinary number of gene rearrangements in the mitochondrial genomes of lice!
Abstract:
Background: Current methods to find significantly under- and over-represented gene ontology (GO) terms in a set of genes consider the genes as equally probable balls in a bag, as may be appropriate for transcripts in micro-array data. However, due to the varying length of genes and intergenic regions, that approach is inappropriate for deciding if any GO terms are correlated with a set of genomic positions. Results: We present an algorithm, GONOME, that can determine which GO terms are significantly associated with a set of genomic positions given a genome annotated with (at least) the starts and ends of genes. We show that certain GO terms may appear to be significantly associated with a set of randomly chosen positions in the human genome if gene lengths are not considered, and that these same terms have been reported as significantly over-represented in a number of recent papers. This apparent over-representation disappears when gene lengths are considered, as GONOME does. For example, we show that, when gene length is taken into account, the term development is not significantly enriched in genes associated with human CpG islands, in contradiction to a previous report. We further demonstrate the efficacy of GONOME by showing that occurrences of the proteasome-associated control element (PACE) upstream activating sequence in the S. cerevisiae genome associate significantly with appropriate GO terms. An extension of this approach yields a whole-genome motif discovery algorithm that allows identification of many other promoter sequences linked to different types of genes, including a large group of previously unknown motifs significantly associated with the terms 'translation' and 'translational elongation'. Conclusion: GONOME is an algorithm that correctly extracts over-represented GO terms from a set of genomic positions. By explicitly considering gene size, GONOME avoids a systematic bias toward GO terms linked to large genes. Inappropriate use of existing algorithms that do not take gene size into account has led to erroneous or suspect conclusions. Reciprocally, GONOME may be used to identify new features in genomes that are significantly associated with particular categories of genes.
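The gene-length bias described in this abstract can be illustrated with a small calculation. The sketch below is not the GONOME algorithm, whose details the abstract does not give. It only contrasts two null models for positions that fall inside genes: one where every gene is equally likely to be hit, and one where a gene is hit in proportion to its length. The data layout, the function names, and the choice of a one-sided binomial test are assumptions.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def term_enrichment(genes, term, hit_gene_ids):
    """Compare a gene-count null with a gene-length null for one GO term.

    genes: dict gene_id -> (start, end, set_of_go_terms)  (assumed layout)
    hit_gene_ids: one gene ID per genomic position that fell inside a gene
    """
    n = len(hit_gene_ids)
    k = sum(1 for g in hit_gene_ids if term in genes[g][2])

    # Naive null: every gene is an equally probable "ball in a bag".
    with_term = sum(1 for _, _, terms in genes.values() if term in terms)
    p_count = with_term / len(genes)

    # Length-aware null: a random position hits a gene in proportion
    # to the number of bases that gene covers.
    total_len = sum(end - start for start, end, _ in genes.values())
    term_len = sum(end - start
                   for start, end, terms in genes.values() if term in terms)
    p_length = term_len / total_len

    return {
        "naive_p": binomial_tail(k, n, p_count),
        "length_aware_p": binomial_tail(k, n, p_length),
    }
```

With one long gene annotated with a term and several short genes without it, random positions land mostly in the long gene. The naive p-value then looks significant while the length-aware one does not, which is the kind of artefact the abstract warns about.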
Abstract:
Using the two largest collections of Mus musculus and Homo sapiens transcription start sites (TSSs) determined based on CAGE tags, ditags, full-length cDNAs, and other transcript data, we describe the compositional landscape surrounding TSSs with the aim of gaining better insight into the properties of mammalian promoters. We classified TSSs into four types based on compositional properties of regions immediately surrounding them. These properties highlighted distinctive features in the extended core promoters that helped us delineate boundaries of the transcription initiation domain space for both species. The TSS types were analyzed for associations with initiating dinucleotides, CpG islands, TATA boxes, and an extensive collection of statistically significant cis-elements in mouse and human. We found that different TSS types show preferences for different sets of initiating dinucleotides and cis-elements. Through Gene Ontology and eVOC categories and tissue expression libraries we linked TSS characteristics to expression. Moreover, we show a link of TSS characteristics to very specific genomic organization in an example of immune-response-related genes (GO:0006955). Our results shed light on the global properties of the two transcriptomes not revealed before and therefore provide the framework for better understanding of the transcriptional mechanisms in the two species, as well as a framework for development of new and more efficient promoter- and gene-finding tools.
Abstract:
The gene content of a mitochondrial (mt) genome, i.e., 37 genes and a large noncoding region (LNR), is usually conserved in Metazoa. The arrangement of these genes and the LNR is generally conserved at low taxonomic levels but varies substantially at high levels. We report here a variation in mt gene content and gene arrangement among chigger mites of the genus Leptotrombidium. We found previously that the mt genome of Leptotrombidium pallidum has an extra gene for large-subunit rRNA (rrnL), a pseudo-gene for small-subunit rRNA (PrrnS), and three extra LNRs, additional to the 37 genes and an LNR typical of Metazoa. Further, the arrangement of mt genes of L. pallidum differs drastically from that of the hypothetical ancestor of the arthropods. To find to what extent the novel gene content and gene arrangement occurred in Leptotrombidium, we sequenced the entire or partial mt genomes of three other species, L. akamushi, L. deliense, and L. fletcheri. These three species share the arrangement of all genes with L. pallidum, except trnQ (for tRNA-glutamine). Unlike L. pallidum, however, these three species do not have extra rrnL or PrrnS and have only one extra LNR. By comparison between Leptotrombidium species and the ancestor of the arthropods, we propose that (1) the type of mt genome present in L. pallidum evolved from the type present in the other three Leptotrombidium species, and (2) three molecular mechanisms were involved in the evolution of mt gene content and gene arrangement in Leptotrombidium species.
Abstract:
Historians of genetics agree that multiple conceptions of the gene have coexisted at each stage in the history of genetics and that the resulting partial ambiguity has often contributed to the success of genetics, both because workers in different areas have needed to communicate and to draw on one another’s results despite wrestling with very different scientific challenges, and because empirical findings have often challenged the presuppositions of existing conceptions of the gene. Today, a number of different conceptions of the gene coexist in the biosciences. An ‘instrumental’ gene similar to that of classical genetics retains a critical role in the construction and interpretation of experiments in which the relationship between genotype and phenotype is explored via hybridization between organisms or directly between nucleic acid molecules. It also plays an important theoretical role in the foundations of disciplines such as quantitative genetics and population genetics. A ‘nominal’ gene, defined by the practice of genetic nomenclature, is a critical practical tool and allows communication between bioscientists in a wide range of fields to be grounded in well-defined sequences of nucleotides. This concept, however, does not embody major theoretical insights into genome structure or function. Instead, a ‘post-genomic’ conception of the gene embodies the continuing project of understanding how genome structure supports genome function, but with a deflationary picture of the gene as a structural unit. This final concept of the gene poses a significant challenge to earlier assumptions about the relationship between genome structure and function, and between genotype and phenotype.
Abstract:
The base composition pattern (BCP) in the putative promoter regions (PPRs), up to 5 kb in length, of 682 human genes on Chromosome 22 (Chr22) was examined. Two-dimensional (2D) and three-dimensional (3D) functions were designed to delineate the DNA base composition, with four major patterns identified. It was found that 17.6% of genes include a TATA box, 28.0% a GC box, 18.9% a CAAT box and 38.4% CpG islands, and approximately 10% of genes have one of four putative initiator (Inr) motifs. The occurrence of the promoter elements is tightly associated with the base composition features in the promoter regions, and these associations mediate tissue-wide expression of the genes in humans. The occurrence of two or more promoter elements in the promoter regions is required for the medium- and wide-range expression profiles of the human genes on Chr22. Thus, the reported data shed light on the characteristics of the PPRs of the human genes on Chr22, which may improve our understanding of the regulatory roles of the PPRs and their promoter elements in gene expression.
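The abstract does not define its 2D and 3D base composition functions, so they cannot be reproduced here. As a rough illustration of what profiling base composition along a promoter region can look like, the sketch below computes two widely used quantities per window: GC fraction and the CpG observed/expected ratio, (CpG count × window length) / (C count × G count). The function name, window size, and step are assumptions, and this is a stand-in technique rather than the authors' method.

```python
def base_composition(seq, window=100, step=100):
    """Return (start, gc_fraction, cpg_obs_exp) for each window of seq.

    window and step are illustrative defaults, not values from the abstract.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        c_count = chunk.count("C")
        g_count = chunk.count("G")
        gc_fraction = (c_count + g_count) / window
        # Expected CpG count if C and G were placed independently.
        expected = c_count * g_count / window
        cpg_obs_exp = chunk.count("CG") / expected if expected else 0.0
        profile.append((start, gc_fraction, cpg_obs_exp))
    return profile
```

Windows with high GC fraction and a high observed/expected ratio are the usual candidates for CpG islands, one of the promoter features counted in the abstract.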
Abstract:
The flood of new genomic sequence information, together with technological innovations in protein structure determination, has led to worldwide structural genomics (SG) initiatives. The goals of SG initiatives are to accelerate the process of protein structure determination, to fill in protein fold space and to provide information about the function of uncharacterized proteins. In the long term, these outcomes are likely to impact on medical biotechnology and drug discovery, leading to a better understanding of disease as well as the development of new therapeutics. Here we describe the high throughput pipeline established at the University of Queensland in Australia. In this focused pipeline, the targets for structure determination are proteins that are expressed in mouse macrophage cells and that are inferred to have a role in innate immunity. The aim is to characterize the molecular structure and the biochemical and cellular function of these targets by using a parallel processing pipeline. The pipeline is designed to work with tens to hundreds of target gene products and comprises target selection, cloning, expression, purification, crystallization and structure determination. The structures from this pipeline will provide insights into the function of previously uncharacterized macrophage proteins and could lead to the validation of new drug targets for chronic obstructive pulmonary disease and arthritis. (c) 2006 Elsevier B.V. All rights reserved.