931 results for Dna Sequence
Abstract:
Networks exhibiting accelerating growth have total link numbers growing faster than linearly with network size and either reach a limit or exhibit graduated transitions from nonstationary-to-stationary statistics and from random to scale-free to regular statistics as the network size grows. However, if for any reason the network cannot tolerate such gross structural changes then accelerating networks are constrained to have sizes below some critical value. This is of interest as the regulatory gene networks of single-celled prokaryotes are characterized by an accelerating quadratic growth and are size constrained to be less than about 10,000 genes encoded in DNA sequence of less than about 10 megabases. This paper presents a probabilistic accelerating network model for prokaryotic gene regulation which closely matches observed statistics by employing two classes of network nodes (regulatory and non-regulatory) and directed links whose inbound heads are exponentially distributed over all nodes and whose outbound tails are preferentially attached to regulatory nodes and described by a scale-free distribution. This model explains the observed quadratic growth in regulator number with gene number and predicts an upper prokaryote size limit closely approximating the observed value. (c) 2005 Elsevier GmbH. All rights reserved.
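To make the size argument concrete, here is a minimal sketch of what quadratic growth in regulator number implies. It is not the paper's probabilistic model: the relation R = c * N^2, the constant `c`, and the cap on the regulator fraction are all hypothetical values chosen only for illustration.

```python
import math


def regulator_count(n_genes: float, c: float) -> float:
    """Regulators under an assumed quadratic law R = c * N**2 (c is hypothetical)."""
    return c * n_genes ** 2


def regulator_fraction(n_genes: float, c: float) -> float:
    """Fraction of genes that are regulators; grows linearly with N under the law above."""
    return regulator_count(n_genes, c) / n_genes


def size_at_fraction(c: float, max_fraction: float) -> float:
    """Genome size (in genes) at which the regulator fraction reaches max_fraction."""
    return max_fraction / c


if __name__ == "__main__":
    c = 1e-5  # hypothetical coefficient, not taken from the paper
    for n in (1_000, 5_000, 10_000):
        print(n, round(regulator_fraction(n, c), 3))
    print(math.isclose(size_at_fraction(c, 0.1), 10_000))
```

The point of the sketch is only that a quadratic regulator count makes the regulatory overhead per gene rise linearly, so any fixed tolerance for that overhead translates into a maximum network size.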
Abstract:
Genetic screening of women from multiple-case breast cancer families and other research-based endeavors have identified an extensive collection of germline variations of BRCA1 and BRCA2 that can be classified as deleterious and have clinical relevance. For some variants, such as those in the conserved intronic splice site regions which are highly likely to alter splicing, it is not possible to classify them based on the identified DNA sequence variation alone. We studied 11 multiple-case breast cancer families carrying seven distinct splice site region genetic alterations in BRCA1 or BRCA2 (BRCA1, c.IVS6-2delA, c.IVS9-2A>C, c.IVS4-1G>T, c.IVS20+1G>A and BRCA2, c.IVS17-1G>C, c.IVS20+1G>A, c.IVS7-1G>A). We applied SpliceSiteFinder to predict possible changes in efficiency of splice donor and acceptor sites, characterized the transcripts, and estimated the average age-specific cumulative risk (penetrance) using a modified segregation analysis. SpliceSiteFinder predicted, and the transcripts we identified confirmed, that all variants caused exon skipping, and all but two led to frameshifts. The risks of breast cancer to age 70 years, averaged over all variants, over BRCA1 variants alone, and over BRCA2 variants alone, were 73% (95% confidence interval 47-93), 64% (95%CI 28-96) and 79% (95%CI 48-98) respectively (all P
Abstract:
The heterogeneous nuclear ribonucleoprotein (hnRNP) A2 is a multi-tasking protein that acts in the cytoplasm and nucleus. We have explored the possibility that this protein is associated with telomeres and participates in their maintenance. Rat brain hnRNP A2 was shown to have two nucleic acid binding sites. In the presence of heparin one site binds single-stranded oligodeoxyribonucleotides irrespective of sequence but not the corresponding oligoribonucleotides. Both the hnRNP A2-binding cis-acting element for the cytoplasmic RNA trafficking element, A2RE, and the ssDNA telomere repeat match a consensus sequence for binding to a second sequence-specific site identified by mutational analysis. hnRNP A2 protected the telomeric repeat sequence, but not the complementary sequence, against DNase digestion: the glycine-rich domain was found to be necessary, but not sufficient, for protection. The N-terminal RRM (RNA recognition motif) and tandem RRMs of hnRNP A2 also bind the single-stranded, template-containing segment of telomerase RNA. hnRNP A2 colocalizes with telomeric chromatin in the subset of PML bodies that are a hallmark of ALT cells, reinforcing the evidence for hnRNPs having a role in telomere maintenance. Our results support a model in which hnRNP A2 acts as a molecular adapter between single-stranded telomeric repeats, or telomerase RNA, and another segment of ssDNA.
Abstract:
Epigenetics is the study of heritable changes in gene expression that occur without changes in DNA sequence. It has a role in determining when and where a gene is expressed during development. Perhaps the best-known epigenetic mechanism is DNA methylation, whereby cytosines in CpG dinucleotides are methylated at position 5. Histone modification is another form of epigenetic control, which is quite complex and diverse. Histones and DNA make up the nucleosome, the structural unit of chromatin, which packages DNA. Apart from the crucial role epigenetics plays in embryonic development, transcription, chromatin structure, X chromosome inactivation and genomic imprinting, its role in a growing number of human diseases is increasingly recognized. These diseases include cancer, and lung cancer in particular has been increasingly studied for the potential biological role of epigenetic changes with the promise of better and novel diagnostic and therapeutic tools.
Abstract:
Kudoa monodactyli n. sp. is described from the somatic musculature of Monodactylus argenteus from several localities in southern Queensland, Australia. This is the first record of a myxozoan parasite from the family Monodactylidae. The spores typically have five polar capsules, making this species similar to the four other five-valved Kudoa species (K. neurophila, K. muscularis, K. shulmani, K. cutanea) that have been described to date. However, morphometric measurements particularly of spore length and width make the species from M. argenteus distinct from the other species. Comparison of the small subunit ribosomal DNA sequence of this species with its congeners for which sequence data are available, provides further evidence of novelty. Kudoa monodactyli n. sp. displays 38 (of 1,554) nucleotide differences compared with rDNA sequence of Kudoa neurophila, which on phylogenetic analysis places these species in clades exclusive of each other. Phylogenetic analyses also provide evidence that the number of valves per spore in this genus is an imperfect indicator of relatedness.
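The reported 38 differences over 1,554 positions can be expressed as an uncorrected proportion of differing sites (a p-distance). The sketch below assumes the two sequences are already aligned to equal length and that gap columns are skipped; how the original study counted differences is not stated, and the function name `p_distance` is illustrative.

```python
def p_distance(aligned_a: str, aligned_b: str) -> float:
    """Proportion of differing sites between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are ignored; this is an
    assumption of this sketch, not necessarily the convention used in the study.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


if __name__ == "__main__":
    # Figures quoted in the abstract: 38 differences out of 1,554 nucleotides.
    print(f"{38 / 1554:.2%}")  # prints 2.45%
    # Toy alignment: one mismatch over four gap-free columns.
    print(p_distance("ACGT-A", "ACCTTA"))
```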
Abstract:
The arrangement of genes in the mitochondrial (mt) genomes of most insects is the same as, or near-identical to, that inferred to be ancestral for insects. We sequenced the entire mt genome of the small pigeon louse, Campanulotes bidentatus compar, and part of the mt genomes of nine other species of lice. These species were from six families and the three main suborders of the order Phthiraptera. There was no variation in gene arrangement among species within a family but there was much variation in gene arrangement among the three suborders of lice. There has been an extraordinary number of gene rearrangements in the mitochondrial genomes of lice!
Abstract:
DNA approaches are now being used routinely for accurate identification of Echinococcus and Taenia species, subspecies and strains, and in molecular epidemiological surveys of echinococcosis/taeniasis in different geographical settings and host assemblages. The publication of the complete sequences of the mitochondrial (mt) genomes of E. granulosus, E. multilocularis, T. solium and Asian Taenia, and the availability of mtDNA sequences for a number of other taeniid genotypes, has provided additional genetic information that can be used for more in-depth phylogenetic and taxonomic studies of these parasites. This very rich sequence information has provided a solid molecular basis, along with a range of different biological, epidemiological, biochemical and other molecular-genetic criteria, for revising the taxonomy of the genus Echinococcus and for estimating the evolutionary time of divergence of the various taxa. Furthermore, the accumulating genetic data have allowed the development of PCR-based tests for unambiguous identification of Echinococcus eggs in the faeces of definitive hosts and in the environment. Molecular phylogenies derived from mtDNA sequence comparisons of geographically distributed samples of T. solium provide molecular evidence for two genotypes, one being restricted to Asia, with the other occurring in Africa and America. Whether the two genetic forms of T. solium differ in important phenotypic characteristics remains to be determined. As well, minor DNA sequence differences have been reported between isolates of T. saginata and Asian Taenia. There has been considerable discussion over a number of years regarding the taxonomic position of Asian Taenia and whether it should be regarded as a genotype, strain, subspecies or sister species of T. saginata. The available molecular genetic data do not support independent species status for Asian Taenia and T. saginata.
What is in agreement is that both taxa are closely related to each other but distantly related to T. solium. This is important in public health terms as it predicts that cysticercosis in humans attributable to Asian Taenia does not occur, because cysticercosis is unknown in T. saginata. (C) 2005 Elsevier Ireland Ltd. All rights reserved.
Abstract:
Eukaryotic genomes display segmental patterns of variation in various properties, including GC content and degree of evolutionary conservation. DNA segmentation algorithms are aimed at identifying statistically significant boundaries between such segments. Such algorithms may provide a means of discovering new classes of functional elements in eukaryotic genomes. This paper presents a model and an algorithm for Bayesian DNA segmentation and considers the feasibility of using it to segment whole eukaryotic genomes. The algorithm is tested on a range of simulated and real DNA sequences, and the following conclusions are drawn. Firstly, the algorithm correctly identifies non-segmented sequence, and can thus be used to reject the null hypothesis of uniformity in the property of interest. Secondly, estimates of the number and locations of change-points produced by the algorithm are robust to variations in algorithm parameters and initial starting conditions and correspond to real features in the data. Thirdly, the algorithm is successfully used to segment human chromosome 1 according to GC content, thus demonstrating the feasibility of Bayesian segmentation of eukaryotic genomes. The software described in this paper is available from the author's website (www.uq.edu.au/~uqjkeith/) or upon request to the author.
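For readers unfamiliar with the property being segmented, the following sketch computes GC content in fixed, non-overlapping windows. It is a deliberately simpler stand-in and not the Bayesian change-point algorithm described in the abstract; the function names and the window size are illustrative assumptions.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int) -> list[tuple[int, float]]:
    """GC content of consecutive non-overlapping windows.

    Returns (start_index, gc_fraction) pairs; a trailing partial window is dropped.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, window)
    ]


if __name__ == "__main__":
    # Toy sequence: an AT-only block followed by a GC-only block.
    toy = "ATATATAT" + "GCGCGCGC"
    for start, gc in windowed_gc(toy, window=8):
        print(start, gc)
```

A sharp jump between adjacent windows, as in the toy sequence, is the kind of boundary a segmentation method tries to locate. A Bayesian approach additionally estimates how many boundaries there are and how uncertain their positions are, which fixed windows cannot do.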
Abstract:
Despite our detailed characterization of the human genome at the level of the primary DNA sequence, we are still far from understanding the molecular events underlying phenotypic variation. Epigenetic modifications to the DNA sequence and associated chromatin are known to regulate gene expression and, as such, are a significant contributor to phenotype. Studies of inbred mice and monozygotic twins show that variation in the epigenotype can be seen even between genetically identical individuals and that this, in some cases at least, is associated with phenotypic differences. Moreover, recent evidence suggests that the epigenome can be influenced by the environment and these changes can last a lifetime. However, we also know that epigenetic states in real-time are in continual flux and, as a result, the epigenome exhibits instability both within and across generations. We still do not understand the rules governing the establishment and maintenance of the epigenotype at any particular locus. The underlying DNA sequence itself and the sequence at unlinked loci (modifier loci) are certainly involved. Recent support for the existence of transgenerational epigenetic inheritance in mammals suggests that the epigenetic state of the locus in the previous generation may also play a role. Over the next decade, many of these processes will be better understood, heralding a greater capacity for us to correlate measurable molecular marks with phenotype and providing the opportunity for improved diagnosis and presymptomatic healthcare.
Abstract:
Feline immunodeficiency virus (FIV), a lentivirus, is an important pathogen of domestic cats around the world and has many similarities to human immunodeficiency virus (HIV). A characteristic of these lentiviruses is their extensive genetic diversity, which has been an obstacle in the development of successful vaccines. Of the FIV genes, the envelope gene is the most variable and sequence differences in a portion of this gene have been used to define 5 FIV subtypes (A, B, C, D and E). In this study, the proviral DNA sequence of the V3-V5 region of the envelope gene was determined in blood samples from 31 FIV positive cats from 4 different regions of South Africa. Phylogenetic analysis demonstrated the presence of both subtypes A and C, with subtype A predominating. These findings contribute to the understanding of the genetic diversity of FIV.
Abstract:
The DNA sequence of the chromosomal gene cluster encoding the SEF14 fimbriae of Salmonella enterica serovar Enteritidis was determined. Five contiguous open reading frames, sefABCDE, were identified. The sefE gene shared significant homology with araC-like positive regulators. Serovar-associated virulence plasmid (SAP) genes orf7,8,9 and pefI were identified immediately adjacent to the sef operon. The pefI gene encoded a putative regulator of plasmid-encoded fimbrial antigen (PEF) expression. The entire sef-pef region, flanked by two IS-like elements, was inserted adjacent to leuX, which encoded a transfer RNA molecule. The organisation of this region was suggestive of a classic pathogenicity islet. Southern hybridisation confirmed two copies of the SAP-derived orf7,8,9 and pefI region in S. Enteritidis, one in the chromosome and one on the SAP. Of other group D Salmonella, only S. Blegdam and S. Moscow harboured both chromosomal and plasmid copies of the pefI-orf9 region, although polymorphism was evident.
Abstract:
The imidazotetrazinones are clinically active antitumour agents, temozolomide currently proving successful in the treatment of melanomas and gliomas. The exact nature of the biological processes underlying response is as yet unclear. This thesis attempts to identify the cellular targets important to the cytotoxicity of imidazotetrazinones, to elucidate the pathways by which this damage leads to cell death, and to identify mechanisms by which tumour cells may circumvent this action. The levels of the DNA repair enzymes O6-alkylguanine-DNA-alkyltransferase (O6-AGAT) and 3-methyladenine-DNA-glycosylase (3MAG) have been examined in a range of murine and human cell lines with differential sensitivity to temozolomide. All the cell lines were proficient in 3MAG despite there being a 40-fold difference in sensitivity to temozolomide. This suggests that while 3-methyladenine is a major product of temozolomide alkylation of DNA it is unlikely to be a cytotoxic lesion. In contrast, there was a 20-fold variation in O6-AGAT levels and the concentration of this repair enzyme correlated with variations in cytotoxicity. Furthermore, depletion of this enzyme in a resistant, O6-AGAT proficient cell line (Raji) by pre-treatment with the free base O6-methylguanine resulted in 54% sensitisation to the effects of temozolomide. These observations have been extended to 3 glioma cell lines; results that support the view that the cytotoxicity of temozolomide is related to alkylation at the O6-position of guanine and that resistance to this drug is determined by efficient repair of this lesion. It is clear, however, that other factors may influence tumour response since temozolomide showed little differential activity towards 3 established solid murine tumours in vivo, despite different tumour O6-AGAT levels. Unlike mitozolomide, temozolomide is incapable of cross-linking DNA and a mechanism by which O6-methylguanine may exert lethality is unclear.
The cytotoxicity of the methyl group may be due to its disruption of DNA-protein interactions, or alternatively cell death may not be a direct result of the alkyl group itself, but manifested by DNA single-strand breaks. Enhanced alkaline elution rates were found for the DNA of Raji cells treated with temozolomide following alkyltransferase depletion, suggesting a relationship between O6-methylguanine and the induction of single-strand breaks. Such breaks can activate poly(ADP-ribose) synthetase (ADPRT), an enzyme capable of rapid and lethal depletion of cellular NAD levels. However, at concentrations of temozolomide relevant in vivo little change in adenine nucleotides was detected in cell lines, although this enzyme would appear important in modulating DNA repair since inhibition of ADPRT potentiated temozolomide cytotoxicity in Raji cells but not O6-AGAT deficient GM892A cells. Cell lines have been reported that are O6-AGAT deficient yet resistant to methylating agents. Thus, resistance to temozolomide may arise not only by removal of the methyl group from the O6-position of guanine, but also from another mechanism involving caffeine-sensitive post-replication repair or mismatch repair activity. A modification of the standard Maxam-Gilbert sequencing technique was used to determine the sequence specificity of guanine-N7 alkylation. Temozolomide preferentially alkylated runs of guanines with the intensity of reaction increasing with the number of adjacent guanines in the DNA sequence. Comparable results were obtained with a polymerase-stop assay, although neither technique elucidates the sequence specificity of O6-guanine alkylation. The importance of such specificity to cytotoxicity is uncertain, although guanine-rich sequences are common to the promoter regions of oncogenes.
Expression of a plasmid reporter gene under the control of the Ha-ras proto-oncogene promoter was inhibited by alkylation with temozolomide when transfected into cancer cell lines. However, this inhibition did not appear to be related to O6-guanine alkylation and therefore would seem unimportant to the chemotherapeutic activity of temozolomide.
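The "runs of guanines" referred to in this abstract are stretches of adjacent G bases in a sequence. As a small illustration only, the sketch below locates such runs and reports their lengths. It says nothing about alkylation intensity, and the function name and minimum run length are assumptions made for this example.

```python
import re


def guanine_runs(seq: str, min_len: int = 2) -> list[tuple[int, int]]:
    """Return (start_index, run_length) for each run of at least min_len adjacent Gs."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    pattern = re.compile("G{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Toy sequence with a run of three Gs, a run of two Gs, and an isolated G.
    print(guanine_runs("ATGGGCAGGTG"))
```

Sorting the result by run length would give a rough ranking of candidate sites under the abstract's observation that reaction intensity increases with the number of adjacent guanines.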
Abstract:
Constant advances in technology have caused a data explosion in recent years. Accordingly, modern statistical and machine learning methods must be adapted to deal with complex and heterogeneous data types. This is particularly true for analyzing biological data. For example, DNA sequence data can be viewed as categorical variables, with each nucleotide taking one of four categories. Gene expression data, depending on the quantification technology, can be continuous numbers or counts. With the advancement of high-throughput technology, such data have become unprecedentedly abundant. Efficient statistical approaches are therefore crucial in this big data era.
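One common way to treat each nucleotide as a categorical variable with four categories is one-hot encoding. The sketch below is a generic illustration and is not taken from the dissertation; the column order `ACGT` and the all-zero row for ambiguous symbols such as `N` are assumptions of this example.

```python
NUCLEOTIDES = "ACGT"  # assumed column order for the encoding


def one_hot(seq: str) -> list[list[int]]:
    """Encode a DNA string as one row per base with four 0/1 indicator columns.

    Symbols outside A, C, G, T (for example 'N') become an all-zero row.
    """
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    rows = []
    for base in seq.upper():
        row = [0] * len(NUCLEOTIDES)
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for base, row in zip("GATN", one_hot("GATN")):
        print(base, row)
```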
Previous statistical methods for big data often aim to find low-dimensional structures in the observed data. For example, a factor analysis model assumes a latent multivariate Gaussian vector. With this assumption, a factor model produces a low-rank estimate of the covariance of the observed variables. Another example is the latent Dirichlet allocation model for documents, in which the mixture proportions of topics are assumed to follow a Dirichlet distribution. This dissertation proposes several novel extensions to these statistical methods, developed to address challenges in big data. The novel methods are applied in multiple real-world applications, including construction of condition-specific gene co-expression networks, estimating shared topics among newsgroups, analysis of promoter sequences, analysis of political-economics risk data and estimating population structure from genotype data.