952 results for Prokaryotic Genomes
Abstract:
Banana leaf streak disease, caused by several species of Banana streak virus (BSV), is widespread in East Africa. We surveyed for this disease in Uganda and Kenya, and used rolling-circle amplification (RCA) to detect the presence of BSV in banana. Six distinct badnavirus sequences for which only partial sequences were previously available, three from Uganda and three from Kenya, were amplified. The complete genomes were sequenced and characterised. The size and organisation of all six sequences were characteristic of other badnaviruses, including conserved functional domains present in the putative polyprotein encoded by open reading frame (ORF) 3. Based on nucleotide sequence analysis within the reverse transcriptase/ribonuclease H-coding region of ORF 3, we propose that these sequences be recognised as six new species and be designated as Banana streak UA virus, Banana streak UI virus, Banana streak UL virus, Banana streak UM virus, Banana streak CA virus and Banana streak IM virus. Using PCR and species-specific primers to test for the presence of integrated sequences, we demonstrated that only sequences with high similarity to BSIMV were present in several banana cultivars which had tested negative for episomal BSV sequences.
Abstract:
The proportion of functional sequence in the human genome is currently a subject of debate. The most widely accepted figure is that approximately 5% is under purifying selection. In Drosophila, estimates are an order of magnitude higher, though this corresponds to a similar quantity of sequence. These estimates depend on the difference between the distribution of genomewide evolutionary rates and that observed in a subset of sequences presumed to be neutrally evolving. Motivated by the widening gap between these estimates and experimental evidence of genome function, especially in mammals, we developed a sensitive technique for evaluating such distributions and found that they are much more complex than previously apparent. We found strong evidence for at least nine well-resolved evolutionary rate classes in an alignment of four Drosophila species and at least seven classes in an alignment of four mammals, including human. We also identified at least three rate classes in human ancestral repeats. By positing that the largest of these ancestral repeat classes is neutrally evolving, we estimate that the proportion of nonneutrally evolving sequence is 30% of human ancestral repeats and 45% of the aligned portion of the genome. However, we also question whether any of the classes represent neutrally evolving sequences and argue that a plausible alternative is that they reflect variable structure-function constraints operating throughout the genomes of complex organisms.
Abstract:
Background Chlamydia pneumoniae is a widespread pathogen causing upper and lower respiratory tract infections in addition to a range of other diseases in humans and animals. Previous whole genome analyses have focused on four essentially clonal (> 99% identity) C. pneumoniae human genomes (AR39, CWL029, J138 and TW183), providing relatively little insight into strain diversity and evolution of this species. Results We performed individual gene-by-gene comparisons of the recently sequenced C. pneumoniae koala genome and four C. pneumoniae human genomes to identify species-specific genes, and more importantly, to gain an insight into the genetic diversity and evolution of the species. We selected genes dispersed throughout the chromosome, representing genes that were specific to C. pneumoniae, genes with a demonstrated role in chlamydial biology and/or pathogenicity (n = 49), genes encoding nucleotide salvage or amino acid biosynthesis proteins (n = 6), and extrachromosomal elements (9 plasmid and 2 bacteriophage genes). Conclusions We have identified strain-specific differences and targets for detection of C. pneumoniae isolates from both human and animal origin. Such characterisation is necessary for an improved understanding of disease transmission and intervention.
Abstract:
Background The vast sequence divergence among different virus groups has presented a great challenge to alignment-based analysis of virus phylogeny. Because of the problems caused by uncertainty in alignment, existing tools for phylogenetic analysis based on multiple alignment could not be directly applied to whole-genome comparison and phylogenomic studies of viruses. There has been a growing interest in alignment-free methods for phylogenetic analysis using complete genome data. Among the alignment-free methods, a dynamical language (DL) method proposed by our group has successfully been applied to the phylogenetic analysis of bacteria and chloroplast genomes. Results In this paper, the DL method is used to analyze the whole-proteome phylogeny of 124 large dsDNA viruses and 30 parvoviruses, two data sets with a large difference in genome size. The trees from our analyses are in good agreement with the latest classification of large dsDNA viruses and parvoviruses by the International Committee on Taxonomy of Viruses (ICTV). Conclusions The present method provides a new way of recovering the phylogeny of large dsDNA viruses and parvoviruses, and also some insights into the affiliation of a number of unclassified viruses. In comparison, some alignment-free methods such as the CV Tree method can be used for recovering the phylogeny of large dsDNA viruses, but they are not suitable for resolving the phylogeny of parvoviruses, which have a much smaller genome size.
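The general idea of an alignment-free comparison can be sketched as follows. This is not the DL method from the abstract above, whose details are not given here; it is a generic k-mer composition distance, shown only to illustrate how whole proteomes can be compared without an alignment. The function names, the choice of k = 2 and the cosine distance are all assumptions made for this illustration.

```python
from collections import Counter
import math


def kmer_profile(proteome, k=2):
    """Relative frequencies of all k-mers found in a list of protein strings."""
    counts = Counter()
    for protein in proteome:
        for i in range(len(protein) - k + 1):
            counts[protein[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def cosine_distance(p, q):
    """1 minus the cosine similarity of two sparse k-mer frequency profiles."""
    dot = sum(value * q.get(kmer, 0.0) for kmer, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 1.0  # treat an empty profile as maximally distant
    return 1.0 - dot / (norm_p * norm_q)


# Toy proteomes (hypothetical sequences, not real viral proteins).
virus_a = ["MKVLAAGK", "MKTLLAG"]
virus_b = ["MKVLAAGR", "MKTLIAG"]
print(cosine_distance(kmer_profile(virus_a), kmer_profile(virus_b)))
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building method; which distance and which k suit small genomes such as parvoviruses is exactly the kind of question the abstract addresses.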
Abstract:
Background Both sorghum (Sorghum bicolor) and sugarcane (Saccharum officinarum) are members of the Andropogoneae tribe in the Poaceae and are each other's closest relatives amongst cultivated plants. Both are relatively recent domesticates and comparatively little of the genetic potential of these taxa and their wild relatives has been captured by breeding programmes to date. This review assesses the genetic gains made by plant breeders since domestication and the progress in the characterization of genetic resources and their utilization in crop improvement for these two related species. Genetic Resources The genome of sorghum has recently been sequenced providing a great boost to our knowledge of the evolution of grass genomes and the wealth of diversity within S. bicolor taxa. Molecular analysis of the Sorghum genus has identified close relatives of S. bicolor with novel traits, endosperm structure and composition that may be used to expand the cultivated gene pool. Mutant populations (including TILLING populations) provide a useful addition to genetic resources for this species. Sugarcane is a complex polyploid with a large and variable number of copies of each gene. The wild relatives of sugarcane represent a reservoir of genetic diversity for use in sugarcane improvement. Techniques for quantitative molecular analysis of gene or allele copy number in this genetically complex crop have been developed. SNP discovery and mapping in sugarcane has been advanced by the development of high-throughput techniques for ecoTILLING in sugarcane. Genetic linkage maps of the sugarcane genome are being improved for use in breeding selection. The improvement of both sorghum and sugarcane will be accelerated by the incorporation of more diverse germplasm into the domesticated gene pools using molecular tools and the improved knowledge of these genomes.
Abstract:
Complex networks have been studied extensively due to their relevance to many real-world systems such as the world-wide web, the internet, and biological and social systems. During the past two decades, studies of such networks in different fields have produced many significant results concerning their structures, topological properties, and dynamics. Three well-known properties of complex networks are scale-free degree distribution, small-world effect and self-similarity. The search for additional meaningful properties and the relationships among these properties is an active area of current research. This thesis investigates a newer aspect of complex networks, namely their multifractality, which is an extension of the concept of self-similarity. The first part of the thesis aims to confirm that the study of properties of complex networks can be expanded to a wider field including more complex weighted networks. Those real networks that have been shown to possess the self-similarity property in the existing literature are all unweighted networks. We use the protein-protein interaction (PPI) networks as a key example to show that their weighted networks inherit the self-similarity from the original unweighted networks. Firstly, we confirm that the random sequential box-covering algorithm is an effective tool to compute the fractal dimension of complex networks. This is demonstrated on the Homo sapiens and E. coli PPI networks as well as their skeletons. Our results verify that the fractal dimension of the skeleton is smaller than that of the original network because the shortest distance between nodes is larger in the skeleton; hence, for a fixed box size, more boxes are needed to cover the skeleton. Then we adopt the iterative scoring method to generate weighted PPI networks of five species, namely Homo sapiens, E. coli, yeast, C. elegans and Arabidopsis thaliana.
By using the random sequential box-covering algorithm, we calculate the fractal dimensions for both the original unweighted PPI networks and the generated weighted networks. The results show that self-similarity is still present in the generated weighted PPI networks. This result will be useful for our treatment of the networks in the third part of the thesis. The second part of the thesis aims to explore the multifractal behavior of different complex networks. Fractals such as the Cantor set, the Koch curve and the Sierpinski gasket are homogeneous since these fractals consist of a geometrical figure which repeats on an ever-reduced scale. Fractal analysis is a useful method for their study. However, real-world fractals are not homogeneous; there is rarely an identical motif repeated on all scales. Their singularity may vary on different subsets, implying that these objects are multifractal. Multifractal analysis is a useful way to systematically characterize the spatial heterogeneity of both theoretical and experimental fractal patterns. However, the tools for multifractal analysis of objects in Euclidean space are not suitable for complex networks. In this thesis, we propose a new box-covering algorithm for multifractal analysis of complex networks. This algorithm is demonstrated in the computation of the generalized fractal dimensions of some theoretical networks, namely scale-free networks, small-world networks and random networks, and of one kind of real network, namely PPI networks of different species. Our main finding is the existence of multifractality in scale-free networks and PPI networks, while the multifractal behaviour is not confirmed for small-world networks and random networks. As another application, we generate gene interaction networks for patients and healthy people using the correlation coefficients between microarrays of different genes. Our results confirm the existence of multifractality in gene interaction networks.
This multifractal analysis then provides a potentially useful tool for gene clustering and identification. The third part of the thesis aims to investigate the topological properties of networks constructed from time series. Characterizing complicated dynamics from time series is a fundamental problem of continuing interest in a wide variety of fields. Recent works indicate that complex network theory can be a powerful tool to analyse time series. Many existing methods for transforming time series into complex networks share a common feature: they define the connectivity of a complex network by the mutual proximity of different parts (e.g., individual states, state vectors, or cycles) of a single trajectory. In this thesis, we propose a new method to construct networks of time series: we define nodes by vectors of a certain length in the time series, and weight of edges between any two nodes by the Euclidean distance between the corresponding two vectors. We apply this method to build networks for fractional Brownian motions, whose long-range dependence is characterised by their Hurst exponent. We verify the validity of this method by showing that time series with stronger correlation, hence larger Hurst exponent, tend to have smaller fractal dimension, hence smoother sample paths. We then construct networks via the technique of horizontal visibility graph (HVG), which has been widely used recently. We confirm a known linear relationship between the Hurst exponent of fractional Brownian motion and the fractal dimension of the corresponding HVG network. In the first application, we apply our newly developed box-covering algorithm to calculate the generalized fractal dimensions of the HVG networks of fractional Brownian motions as well as those for binomial cascades and five bacterial genomes. The results confirm the monoscaling of fractional Brownian motion and the multifractality of the rest. 
As an additional application, we discuss the resilience of networks constructed from time series via two different approaches: visibility graph and horizontal visibility graph. Our finding is that the degree distribution of VG networks of fractional Brownian motions is scale-free (i.e., having a power law) meaning that one needs to destroy a large percentage of nodes before the network collapses into isolated parts; while for HVG networks of fractional Brownian motions, the degree distribution has exponential tails, implying that HVG networks would not survive the same kind of attack.
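The horizontal visibility graph (HVG) construction mentioned in the abstract above can be sketched as follows, using the standard definition: two time points i < j are linked when every value strictly between them is lower than both x[i] and x[j]. This is a minimal illustration and not the thesis code; the function names and the toy series are assumptions made for this example.

```python
from collections import Counter


def horizontal_visibility_graph(series):
    """Return the HVG edge set {(i, j), i < j} of a numeric sequence."""
    n = len(series)
    edges = set()
    for i in range(n - 1):
        highest_between = float("-inf")  # tallest value strictly between i and j
        for j in range(i + 1, n):
            if highest_between < min(series[i], series[j]):
                edges.add((i, j))
            highest_between = max(highest_between, series[j])
            if highest_between >= series[i]:
                break  # nothing further right can be seen from i
    return edges


def degree_distribution(edges):
    """Map each degree value to the number of nodes having that degree."""
    degrees = Counter()
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return Counter(degrees.values())


toy_series = [3, 1, 2, 4]  # hypothetical data
toy_edges = horizontal_visibility_graph(toy_series)
print(sorted(toy_edges))
print(degree_distribution(toy_edges))
```

The degree distribution returned by the second function is the quantity whose tail behaviour (power law versus exponential) the abstract relates to network resilience. Consecutive points are always linked, so the graph is connected by construction.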
Abstract:
A cDNA corresponding to a transcript induced in culture by N starvation was identified in Colletotrichum gloeosporioides by a differential hybridisation strategy. The cDNA comprised 905 bp and predicted a 215 aa protein; the gene encoding the cDNA was termed CgDN24. No function for CgDN24 could be predicted by database homology searches using the cDNA sequence and no homologues were found in the sequenced fungal genomes. Transcripts of CgDN24 were detected in infected leaves of Stylosanthes guianensis at stages of infection that corresponded with symptom development. The CgDN24 gene was disrupted by homologous recombination and this led to reduced radial growth rates and the production of hyphae with a hyperbranching phenotype. Normal sporulation was observed, and following conidial inoculation of S. guianensis, normal disease development was obtained. These results demonstrate that CgDN24 is necessary for normal hyphal development in axenic culture but dispensable for phytopathogenicity. © 2005 Elsevier GmbH. All rights reserved.
Abstract:
Australasian marsupials include three major radiations, the insectivorous/carnivorous Dasyuromorphia, the omnivorous bandicoots (Peramelemorphia), and the largely herbivorous diprotodontians. Morphologists have generally considered the bandicoots and diprotodontians to be closely related, most prominently because they are both syndactylous (with the 2nd and 3rd pedal digits being fused). Molecular studies have been unable to confirm or reject this Syndactyla hypothesis. Here we present new mitochondrial (mt) genomes from a spiny bandicoot (Echymipera rufescens) and two dasyurids, a fat-tailed dunnart (Sminthopsis crassicaudata) and a northern quoll (Dasyurus hallucatus). By comparing trees derived from pairwise base-frequency differences between taxa with standard (absolute, uncorrected) distance trees, we infer that composition bias among mt protein-coding and RNA sequences is sufficient to mislead tree reconstruction. This can explain incongruence between trees obtained from mt and nuclear data sets. However, after excluding major sources of compositional heterogeneity, both the “reduced-bias” mt and nuclear data sets clearly favor a bandicoot plus dasyuromorphian association, as well as a grouping of kangaroos and possums (Phalangeriformes) among diprotodontians. Notably, alternatives to these groupings could only be confidently rejected by combining the mt and nuclear data. Elsewhere on the tree, Dromiciops appears to be sister to the monophyletic Australasian marsupials, whereas the placement of the marsupial mole (Notoryctes) remains problematic. More generally, we contend that it is desirable to combine mt genome and nuclear sequences for inferring vertebrate phylogeny, but as separately modeled process partitions. This strategy depends on detecting and excluding (or accounting for) major sources of nonhistorical signal, such as from compositional nonstationarity.
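The notion of a pairwise base-frequency difference between taxa, used in the abstract above to detect composition bias, can be sketched as follows. The abstract does not state the exact measure, so the sum of absolute differences in A, C, G and T frequencies used here is an assumption, as are the function names and the toy sequences.

```python
from itertools import combinations


def base_frequencies(sequence):
    """Relative frequencies of A, C, G and T, ignoring any other symbols."""
    sequence = sequence.upper()
    counts = [sequence.count(base) for base in "ACGT"]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return [count / total for count in counts]


def composition_distance(seq_a, seq_b):
    """Sum of absolute base-frequency differences (assumed measure)."""
    freq_a = base_frequencies(seq_a)
    freq_b = base_frequencies(seq_b)
    return sum(abs(a - b) for a, b in zip(freq_a, freq_b))


def pairwise_composition_distances(taxa):
    """Distance for every unordered pair in a {taxon name: sequence} mapping."""
    return {
        (a, b): composition_distance(taxa[a], taxa[b])
        for a, b in combinations(sorted(taxa), 2)
    }


# Hypothetical toy sequences, not real mitochondrial data.
toy_taxa = {"taxon1": "AACCGGTT", "taxon2": "AAAACCGT", "taxon3": "AACCGGTT"}
print(pairwise_composition_distances(toy_taxa))
```

A tree built from such a matrix reflects composition alone, so agreement between it and a standard distance tree hints that composition bias, rather than shared history, may be driving a grouping.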
Abstract:
We report three developments toward resolving the challenge of the apparent basal polytomy of neoavian birds. First, we describe improved conditional down-weighting techniques to reduce noise relative to signal for deeper divergences and find increased agreement between data sets. Second, we present formulae for calculating the probabilities of finding predefined groupings in the optimal tree. Finally, we report a significant increase in data: nine new mitochondrial (mt) genomes (the dollarbird, New Zealand kingfisher, great potoo, Australian owlet-nightjar, white-tailed trogon, barn owl, a roadrunner [a ground cuckoo], New Zealand long-tailed cuckoo, and the peach-faced lovebird). Together, they provide data for each of the six main groups of Neoaves proposed by Cracraft J (2001). We use his six main groups of modern birds as priors for evaluation of results. These include passerines, cuckoos, parrots, and three other groups termed “WoodKing” (woodpeckers/rollers/kingfishers), “SCA” (owls/potoos/owlet-nightjars/hummingbirds/swifts), and “Conglomerati.” In general, the support is highly significant, with just two exceptions: the owls move from the “SCA” group to the raptors, particularly accipitrids (buzzards/eagles) and the osprey, and the shorebirds may be an independent group from the rest of the “Conglomerati”. Molecular dating using mt genomes supports a major diversification of at least 12 neoavian lineages in the Late Cretaceous. Our results form a basis for further testing with both nuclear-coding sequences and rare genomic changes.
Abstract:
Cockatoos form the distinctive family Cacatuidae, a major lineage of the parrot order (Psittaciformes) distributed throughout the Australasian region. However, the evolutionary history of cockatoos is not well understood. We investigated the phylogeny of cockatoos based on three mitochondrial and three nuclear DNA genes obtained from 16 of 21 species of Cacatuidae. In addition, five novel mitochondrial genomes were used to estimate time of divergence, and our estimates indicate that Cacatuidae diverged from Psittacidae approximately 40.7 million years ago (95% CI 51.6–30.3 Ma) during the Eocene. Our data show that Cacatuidae began to diversify approximately 27.9 Ma (95% CI 38.1–18.3 Ma) during the Oligocene. The early to middle Miocene (20–10 Ma) was a significant period in the evolution of modern Australian environments and vegetation, in which a transformation from mainly mesic to xeric habitats (e.g., fire-adapted sclerophyll vegetation and grasslands) occurred. We hypothesize that this environmental transformation was a driving force behind the diversification of cockatoos. A detailed multi-locus molecular phylogeny enabled us to resolve the phylogenetic placements of the Palm Cockatoo (Probosciger aterrimus), Galah (Eolophus roseicapillus), Gang-gang Cockatoo (Callocephalon fimbriatum) and Cockatiel (Nymphicus hollandicus), which have historically been difficult to place within Cacatuidae. When the molecular evidence is analysed in concert with morphology, it is clear that many of the cockatoo species’ diagnostic phenotypic traits such as plumage colour, body size, wing shape and bill morphology have evolved in parallel or convergently across lineages.
Abstract:
Background The genus Rattus is highly speciose and has a complex taxonomy that is not fully resolved. As shown previously, there are two major groups within the genus, an Asian and an Australo-Papuan group. This study focuses on the Australo-Papuan group and particularly on the Australian rats. There are uncertainties regarding the number of species within the group and the relationships among them. We analysed 16 mitochondrial genomes, including seven novel genomes from six species, to help elucidate the evolutionary history of the Australian rats. We also demonstrate, from a larger dataset, the usefulness of short regions of the mitochondrial genome in identifying these rats at the species level. Results Analyses of 16 mitochondrial genomes representing species sampled from Australo-Papuan and Asian clades of Rattus indicate divergence of these two groups ~2.7 million years ago (Mya). Subsequent diversification of at least 4 lineages within the Australo-Papuan clade was rapid and occurred over the period from ~0.9-1.7 Mya, a finding that explains the difficulty in resolving some relationships within this clade. Phylogenetic analyses of our 126-taxon, but shorter sequence (1952 nucleotides long), Rattus database generally give well-supported species clades. Conclusions Our whole mitochondrial genome analyses are concordant with a taxonomic division that places the native Australian rats into the Rattus fuscipes species group. We suggest the following order of divergence of the Australian species. R. fuscipes is the oldest lineage among the Australian rats and is not part of a New Guinean radiation. R. lutreolus is also within this Australian clade and shallower than R. tunneyi, while the R. sordidus group is the shallowest lineage in the clade. The divergences within the R. sordidus and R. leucopus lineages occurring about half a million years ago support the hypotheses of more recent interchanges of rats between Australia and New Guinea.
While problematic for inference of deeper divergences, we report that the analysis of shorter mitochondrial sequences is very useful for species identification in rats.
Abstract:
Background The gene composition, gene order and structure of the mitochondrial genome are remarkably stable across bilaterian animals. Lice (Insecta: Phthiraptera) are a major exception to this genomic stability in that the canonical single chromosome with 37 genes found in almost all other bilaterians has been lost in multiple lineages in favour of multiple, minicircular chromosomes with fewer than 37 genes on each chromosome. Results Minicircular mt genomes are found in six of the ten louse species examined to date and three types of minicircles were identified: heteroplasmic minicircles which coexist with full sized mt genomes (type 1); multigene chromosomes with short, simple control regions, where we infer that the genome consists of several such chromosomes (type 2); and multiple, single to three gene chromosomes with large, complex control regions (type 3). Mapping minicircle types onto a phylogenetic tree of lice fails to show a pattern of their occurrence consistent with an evolutionary series of minicircle types. Analysis of the nuclear-encoded, mitochondrially targeted genes inferred from the body louse, Pediculus, suggests that the loss of mitochondrial single-stranded binding protein (mtSSB) may be responsible for the presence of minicircles, at least in species with the most derived type 3 minicircles (Pediculus, Damalinia). Conclusions Minicircular mt genomes are common in lice and appear to have arisen multiple times within the group. Life history adaptive explanations which attribute minicircular mt genomes in lice to the adoption of blood-feeding in the Anoplura are not supported by this expanded data set, as minicircles are found in multiple non-blood feeding louse groups but are not found in the blood-feeding genus Heterodoxus.
In contrast, a mechanistic explanation based on the loss of mtSSB suggests that minicircles may be selectively favoured due to the incapacity of the mt replisome to synthesize long replicative products without mtSSB, and thus that the loss of this gene led to the formation of minicircles in lice.
Abstract:
Despite their ecological significance as decomposers and their evolutionary significance as the most speciose eusocial insect group outside the Hymenoptera, termite (Blattodea: Termitoidae or Isoptera) evolutionary relationships have yet to be well resolved. Previous morphological and molecular analyses strongly conflict at the family level and are marked by poor support for backbone nodes. A mitochondrial (mt) genome phylogeny of termites was produced to test relationships between the recognised termite families, improve nodal support and test the phylogenetic utility of rare genomic changes found in the termite mt genome. Complete mt genomes were sequenced for 7 of the 9 extant termite families with additional representatives of each of the two most speciose families, Rhinotermitidae (3 of 7 subfamilies) and Termitidae (3 of 8 subfamilies). The mt genome of the well-supported sister group of termites, the subsocial cockroach Cryptocercus, was also sequenced. A highly supported tree of termite relationships was produced by all analytical methods and data treatment approaches; however, the relationship of the termites + Cryptocercus clade to other cockroach lineages was highly affected by the strong nucleotide compositional bias found in termites relative to other dictyopterans. The phylogeny supports previously proposed suprafamilial termite lineages, the Euisoptera and Neoisoptera, a later derived Kalotermitidae as sister group of the Neoisoptera and a monophyletic clade of dampwood (Stolotermitidae, Archotermopsidae) and harvester termites (Hodotermitidae). In contrast to previous termite phylogenetic studies, nodal supports were very high for family-level relationships within termites. Two rare genomic changes in the mt genome control region were found to be molecular synapomorphies for major clades.
An elongated stem-loop structure defined the clade Polyphagidae + (Cryptocercus + termites), and a further series of compensatory base changes in this stem loop is synapomorphic for the Neoisoptera. The complicated repeat structures first identified in Reticulitermes, composed of short (A-type) and long (B-type) repeats, define the clade Heterotermitinae + Termitidae, while the secondary loss of A-type repeats is synapomorphic for the non-macrotermitine Termitidae.
Abstract:
Drosophila serrata is a member of the montium group, which contains more than 98 species and until recently was considered a subgroup within the melanogaster group. This Drosophila species is an emerging model system for evolutionary quantitative genetics and has been used in studies of species borders, clinal variation and sexual selection. Despite the importance of D. serrata as a model for evolutionary research, our poor understanding of its genome remains a significant limitation. Here, we provide a first-generation gene-based linkage map and a physical map for this species. Consistent with previous studies of other drosophilids, we observed strong conservation of genes within chromosome arms homologous with D. melanogaster but major differences in within-arm synteny. These resources will be a useful complement to ongoing genome sequencing efforts and QTL mapping studies in this species.