924 results for UCSC genome browser
Abstract:
We completed the genome sequence of Lettuce necrotic yellows virus (LNYV) by determining the nucleotide sequences of the 4a (putative phosphoprotein), 4b, M (matrix protein), G (glycoprotein) and L (polymerase) genes. The genome consists of 12,807 nucleotides and encodes six genes in the order 3' leader-N-4a(P)-4b-M-G-L-5' trailer. Sequences were derived from clones of a cDNA library from LNYV genomic RNA and from fragments amplified using reverse transcription-polymerase chain reaction. The 4a protein has a low isoelectric point characteristic of rhabdovirus phosphoproteins. The 4b protein has significant sequence similarities with the movement proteins of capillo- and trichoviruses and may be involved in cell-to-cell movement. The putative G protein sequence contains a predicted 25-amino-acid signal peptide and endopeptidase cleavage site, three predicted glycosylation sites and a putative transmembrane domain. The deduced L protein sequence shows similarities with the L proteins of other plant rhabdoviruses and contains polymerase module motifs characteristic of RNA-dependent RNA polymerases of negative-strand RNA viruses. Phylogenetic analysis of this motif among rhabdoviruses placed LNYV in a group with other sequenced cytorhabdoviruses, most closely related to Strawberry crinkle virus. (c) 2005 Elsevier B.V. All rights reserved.
Abstract:
Recent large-scale analyses of mainly full-length cDNA libraries generated from a variety of mouse tissues indicated that almost half of all representative cloned sequences did not contain an apparent protein-coding sequence, and were putatively derived from non-protein-coding RNA (ncRNA) genes. However, many of these clones were singletons and the majority were unspliced, raising the possibility that they may be derived from genomic DNA or unprocessed pre-mRNA contamination during library construction, or alternatively represent nonspecific transcriptional noise. Here we show, using reverse transcriptase-dependent PCR, microarray, and Northern blot analyses, that many of these clones were derived from genuine transcripts of unknown function whose expression appears to be regulated. The ncRNA transcripts have larger exons and fewer introns than protein-coding transcripts. Analysis of the genomic landscape around these sequences indicates that some cDNA clones were produced not from terminal poly(A) tracts but from internal priming sites within longer transcripts, only a minority of which is encompassed by known genes. A significant proportion of these transcripts exhibit tissue-specific expression patterns, as well as dynamic changes in their expression in macrophages following lipopolysaccharide stimulation. Taken together, the data provide strong support for the conclusion that ncRNAs are an important, regulated component of the mammalian transcriptome.
Abstract:
High-quality data about protein structures and their gene sequences are essential to the understanding of the relationship between protein folding and protein coding sequences. Firstly we constructed the EcoPDB database, which is a high-quality database of Escherichia coli genes and their corresponding PDB structures. Based on EcoPDB, we presented a novel approach based on information theory to investigate the correlation between cysteine synonymous codon usages and local amino acids flanking cysteines, the correlation between cysteine synonymous codon usages and synonymous codon usages of local amino acids flanking cysteines, as well as the correlation between cysteine synonymous codon usages and the disulfide bonding states of cysteines in the E. coli genome. The results indicate that the nearest neighboring residues and their synonymous codons of the C-terminus have the greatest influence on the usages of the synonymous codons of cysteines and the usage of the synonymous codons has a specific correlation with the disulfide bond formation of cysteines in proteins. The correlations may result from the regulation mechanism of protein structures at gene sequence level and reflect the biological function restriction that cysteines pair to form disulfide bonds. The results may also be helpful in identifying residues that are important for synonymous codon selection of cysteines to introduce disulfide bridges in protein engineering and molecular biology. The approach presented in this paper can also be utilized as a complementary computational method and be applicable to analyse the synonymous codon usages in other model organisms. (c) 2005 Elsevier Ltd. All rights reserved.
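The abstract above does not say which information-theoretic measure was used. As a hedged illustration only, the sketch below assumes mutual information between the cysteine codon (TGT or TGC) and a neighbouring residue. The function name `mutual_information` and the toy observations are hypothetical and are not taken from EcoPDB.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Mutual information (in bits) between the two members of each pair.

    `pairs` is a list of (cysteine_codon, neighbouring_residue) observations.
    Assumption: this is one plausible information-theoretic measure; the
    abstract does not specify the exact statistic.
    """
    n = len(pairs)
    joint = Counter(pairs)
    codon_counts = Counter(codon for codon, _ in pairs)
    residue_counts = Counter(residue for _, residue in pairs)
    total = 0.0
    for (codon, residue), count in joint.items():
        p_joint = count / n
        p_independent = (codon_counts[codon] / n) * (residue_counts[residue] / n)
        total += p_joint * log2(p_joint / p_independent)
    return total


# Toy data (hypothetical): codon choice fully determined by the neighbour.
dependent = [("TGT", "A"), ("TGC", "G")] * 2
# Toy data (hypothetical): codon choice unrelated to the neighbour.
independent = [("TGT", "A"), ("TGT", "G"), ("TGC", "A"), ("TGC", "G")]

print(mutual_information(dependent))
print(mutual_information(independent))
```

With these toy inputs the first call measures a fully dependent pairing and the second an independent one, so the first value is larger. Real data would call for a correction for small sample sizes, which this sketch omits.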
Abstract:
Leptospirosis is one of the most common zoonotic diseases in the world, resulting in high morbidity and mortality in humans and affecting global livestock production. Most infections are caused by either Leptospira borgpetersenii or Leptospira interrogans, bacteria that vary in their distribution in nature and rely on different modes of transmission. We report the complete genomic sequences of two strains of L. borgpetersenii serovar Hardjo that have distinct phenotypes and virulence. These two strains have nearly identical genetic content, with subtle frameshift and point mutations being a common form of genetic variation. Starkly limited regions of synteny are shared between the large chromosomes of L. borgpetersenii and L. interrogans, probably the result of frequent recombination events between insertion sequences. The L. borgpetersenii genome is ≈700 kb smaller and has a lower coding density than L. interrogans, indicating it is decaying through a process of insertion sequence-mediated genome reduction. Loss of gene function is not random but is centered on impairment of environmental sensing and metabolite transport and utilization. These features distinguish L. borgpetersenii from L. interrogans, a species with minimal genetic decay and that survives extended passage in aquatic environments encountering a mammalian host. We conclude that L. borgpetersenii is evolving toward dependence on a strict host-to-host transmission cycle.
Abstract:
Mammalian promoters can be separated into two classes: conserved TATA box-enriched promoters, which initiate at a well-defined site, and more plastic, broad and evolvable CpG-rich promoters. We have sequenced tags corresponding to several hundred thousand transcription start sites (TSSs) in the mouse and human genomes, allowing precise analysis of the sequence architecture and evolution of distinct promoter classes. Different tissues and families of genes differentially use distinct types of promoters. Our tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini. Among the TSSs, we identified new start sites associated with the majority of exons and with 3' UTRs. These data permit genome-scale identification of tissue-specific promoters and analysis of the cis-acting elements associated with them.
Abstract:
Topological measures of large-scale complex networks are applied to a specific artificial regulatory network model created through a whole genome duplication and divergence mechanism. This class of networks shares topological features with natural transcriptional regulatory networks. Specifically, these networks display scale-free and small-world topology and possess subgraph distributions similar to those of natural networks. Thus, the topologies inherent in natural networks may be in part due to their method of creation rather than being exclusively shaped by subsequent evolution under selection. The evolvability of the dynamics of these networks is also examined by evolving networks in simulation to obtain three simple types of output dynamics. The networks obtained from this process show a wide variety of topologies and numbers of genes, indicating that it is relatively easy to evolve these classes of dynamics in this model. (c) 2006 Elsevier Ireland Ltd. All rights reserved.
Abstract:
Endogenous retroviruses are a common ancestral feature of mammalian genomes with most having been inactivated over time through mutation and deletion(1). A group of more intact endogenous retroviruses are considered to have entered the genomes of some species more recently, through infection by exogenous viruses(2), but this event has never been directly proved. We have previously reported koala retrovirus (KoRV) to be a functional virus that is associated with neoplasia(3). Here we show that KoRV also shows features of a recently inserted endogenous retrovirus that is vertically transmitted. The finding that some isolated koala populations have not yet incorporated KoRV into their genomes, combined with its high level of activity and variability in individual koalas, suggests that KoRV is a virus in transition between an exogenous and endogenous element. This ongoing dynamic interaction with a wild species provides an exciting opportunity to study the process and consequences of retroviral endogenization in action, and is an attractive model for studying the evolutionary event in which a retrovirus invades a mammalian genome.
Abstract:
Full-length genome sequences of five virulent and five avirulent strains of Newcastle disease virus (NDV) isolated between 1998 and 2002 in Victoria and New South Wales, Australia were determined. Comparisons between these strains revealed that the coding sequences of the haemagglutinin-neuraminidase (HN), matrix (M) and phosphoprotein (P) genes appeared to be more variable than those of the fusion (F), nucleocapsid (N) and RNA-dependent RNA replicase (L) genes. Sequence analysis of a number of other isolates made during the recent virulent NDV outbreaks also identified a number of variants with altered F gene cleavage sites, which resulted in altered biological properties of those viruses. Quasispecies analysis of a number of field isolates indicated the presence of virulent virus in one particular isolate. Gene sequence analysis of the progenitor virus isolated in 1998 showed very little sequence variation when compared to that of a progenitor-like virus isolated in 2001, demonstrating that, in the field, viral genome sequence variation appears to be biologically restricted to that of a consensus sequence. (c) 2005 Elsevier B.V. All rights reserved.
Abstract:
The southern cattle tick, Boophilus microplus (Canestrini), causes annual economic losses in the hundreds of millions of dollars to cattle producers throughout the world, and ranks as the most economically important tick from a global perspective. Control failures attributable to the development of pesticide resistance have become commonplace, and novel control technologies are needed. The availability of the genome sequence will facilitate the development of these new technologies, and we are proposing sequencing to a 4-6X draft coverage. Many existing biological resources are available to facilitate a genome sequencing project, including several inbred laboratory tick strains, a database of approximately 45,000 expressed sequence tags compiled into a B. microplus Gene Index, a bacterial artificial chromosome (BAC) library, an established B. microplus cell line, and genomic DNA suitable for library synthesis. Collaborative projects are underway to map BACs and cDNAs to specific chromosomes and to sequence selected BAC clones. When completed, the genome sequences from the cow, B. microplus, and the B. microplus-borne pathogens Babesia bovis and Anaplasma marginale will enhance studies of host-vector-pathogen systems. Genes involved in the regeneration of amputated tick limbs and transitions through developmental stages are largely unknown. Studies of these and other interesting biological questions will be advanced by tick genome sequence data. Comparative genomics offers the prospect of new insight into many, perhaps all, aspects of the biology of ticks and the pathogens they transmit to farm animals and people. The B. microplus genome sequence will fill a major gap in comparative genomics: a sequence from the Metastriata lineage of ticks. The purpose of the article is to synergize interest in and provide rationales for sequencing the genome of B. microplus and for publicizing currently available genomic resources for this tick.
Abstract:
The mapping and sequencing of the human genome has generated a large resource for answering questions about human disease. This achievement is akin in scientific importance to developing the periodic table of elements. Plastic surgery has always been at the frontier of medical research. This resource will help us to improve our understanding of the many unknown physiological and pathological conditions we deal with daily, such as wound healing, keloid scar formation, Dupuytren's disease, rheumatoid arthritis, vascular malformation and carcinogenesis. We are well placed to obtain both diseased and normal tissues, to use this resource and to apply it clinically. This review covers the human genome, the basis of gene expression profiling and how it will affect our clinical and research practices in the future. For those embarking on the use of this new technology as a research tool, we provide a brief insight into its limitations and pitfalls. (C) 2006 The British Association of Plastic Surgeons. Published by Elsevier Ltd. All rights reserved.
Abstract:
Background: Plasma triglyceride concentration is known to be a significant risk factor for cardiovascular disease (CVD). Previous studies have found that the level of triglycerides is strongly influenced by genetic factors. Methods: To identify quantitative trait loci influencing triglycerides, we conducted a genome-wide linkage scan on data from 485 Australian adult dizygotic twin pairs. Prior to linkage analysis, triglyceride values were adjusted for the effects of covariates including age, sex, time since last meal, time of blood collection (CT) and time to plasma separation. Results: The heritability estimate for ln(triglyceride) adjusted for all above fixed effects was 0.49. The highest multipoint LOD score observed was 2.94 (genome-wide p=0.049) on chromosome 7 (at 65cM). This 7p region contains several candidate genes. Two other regions with suggestive multipoint LOD scores were also identified on chromosome 4 (LOD score=2.26 at 62cM) and chromosome X (LOD score=2.01 at 81cM). Conclusions: The linkage peaks found represent newly identified regions for more detailed study, in particular the significant linkage observed on chromosome 7p13. (c) 2006 Elsevier B.V. All rights reserved.
Abstract:
Boolean models of genetic regulatory networks (GRNs) have been shown to exhibit many of the characteristic dynamics of real GRNs, with gene expression patterns settling to point attractors or limit cycles, or displaying chaotic behaviour, depending upon the connectivity of the network and the relative proportions of excitatory and inhibitory interactions. This range of behaviours is only apparent, however, when the nodes of the GRN are updated synchronously, a biologically implausible state of affairs. In this paper we demonstrate that evolution can produce GRNs with interesting dynamics under an asynchronous update scheme. We use an Artificial Genome to generate networks which exhibit limit cycle dynamics when updated synchronously, but collapse to a point attractor when updated asynchronously. Using a hill climbing algorithm the networks are then evolved using a fitness function which rewards patterns of gene expression which revisit as many previously seen states as possible. The final networks exhibit “fuzzy limit cycle” dynamics when updated asynchronously.
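The contrast between synchronous and asynchronous updating described above can be illustrated with a minimal sketch. It assumes a hypothetical two-gene network in which each gene copies the other; this is a toy example and not one of the Artificial Genome networks from the paper. All function and variable names are illustrative.

```python
import random

# Hypothetical two-gene network: gene 0 copies gene 1 and gene 1 copies gene 0.
INPUTS = [[1], [0]]          # input genes for each gene
TABLES = [[0, 1], [0, 1]]    # truth table per gene, indexed by its input bits


def next_value(state, gene):
    """Look up the next value of one gene from the current state."""
    index = 0
    for source in INPUTS[gene]:
        index = (index << 1) | state[source]
    return TABLES[gene][index]


def sync_step(state):
    """Synchronous update: every gene reads the same old state."""
    return tuple(next_value(state, g) for g in range(len(state)))


def async_step(state, rng):
    """Asynchronous update: one randomly chosen gene is updated."""
    gene = rng.randrange(len(state))
    new_state = list(state)
    new_state[gene] = next_value(state, gene)
    return tuple(new_state)


def sync_attractor_length(state):
    """Length of the cycle reached under synchronous updating (1 = point attractor)."""
    first_seen = {}
    t = 0
    while state not in first_seen:
        first_seen[state] = t
        state = sync_step(state)
        t += 1
    return t - first_seen[state]


rng = random.Random(0)
start = (0, 1)
print("synchronous cycle length:", sync_attractor_length(start))
after_async = async_step(start, rng)
print("state after one asynchronous update:", after_async)
print("is a fixed point:", sync_step(after_async) == after_async)
```

In this toy network the two genes swap values forever when updated together, but a single asynchronous update makes them equal, after which nothing changes. That is the same kind of collapse from a limit cycle to a point attractor that the abstract describes, although the networks studied there are larger and generated differently.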