901 results for Whole Genome Sequences


Relevance:

100.00%

Abstract:

Representational difference analysis (RDA) has great potential for preferential amplification of unique but uncharacterised DNA sequences present in one source such as a whole genome, but absent from a related genome or other complex population of sequences. While a few examples of its successful exploitation have been published, the method has not been well dissected and robust, detailed published protocols are lacking. Here we examine the method in detail, suggest improvements and provide a protocol that has yielded key unique sequences from a pathogenic bacterial genome. © 2003 Elsevier Science B.V. All rights reserved.

Relevance:

90.00%

Abstract:

Background: The vast sequence divergence among different virus groups has presented a great challenge to alignment-based analysis of virus phylogeny. Due to the problems caused by the uncertainty in alignment, existing tools for phylogenetic analysis based on multiple alignment could not be directly applied to the whole-genome comparison and phylogenomic studies of viruses. There has been a growing interest in alignment-free methods for phylogenetic analysis using complete genome data. Among the alignment-free methods, a dynamical language (DL) method proposed by our group has successfully been applied to the phylogenetic analysis of bacteria and chloroplast genomes. Results: In this paper, the DL method is used to analyze the whole-proteome phylogeny of 124 large dsDNA viruses and 30 parvoviruses, two data sets with a large difference in genome size. The trees from our analyses are in good agreement with the latest classification of large dsDNA viruses and parvoviruses by the International Committee on Taxonomy of Viruses (ICTV). Conclusions: The present method provides a new way of recovering the phylogeny of large dsDNA viruses and parvoviruses, and also some insights into the affiliation of a number of unclassified viruses. In comparison, some alignment-free methods such as the CV Tree method can be used for recovering the phylogeny of large dsDNA viruses, but they are not suitable for resolving the phylogeny of parvoviruses, which have a much smaller genome size.
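As a rough illustration of what "alignment-free" comparison means in general, the sketch below scores two sequences by the distance between their k-mer frequency profiles. It is neither the DL method nor CV Tree; the function names, the choice of k = 2, the DNA alphabet and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "ACGT"  # assumed toy DNA alphabet; the abstract itself works on proteomes


def kmer_profile(seq, k=2):
    """Return normalized frequencies of every possible k-mer, in a fixed order.

    Assumes len(seq) >= k and that seq contains only letters from ALPHABET.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return [counts["".join(p)] / total for p in product(ALPHABET, repeat=k)]


def profile_distance(a, b, k=2):
    """Euclidean distance between the k-mer frequency profiles of a and b."""
    pa, pb = kmer_profile(a, k), kmer_profile(b, k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))


# Hypothetical usage: identical sequences are at distance 0,
# sequences with disjoint k-mer content are far apart.
print(profile_distance("ACGTACGT", "ACGTACGT"))
print(profile_distance("AAAAAA", "CCCCCC"))
```

A matrix of such pairwise distances could then be passed to a standard tree-building method; no alignment is needed at any step, which is the property the abstract relies on for highly divergent viral genomes.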

Relevance:

90.00%

Abstract:

Ratites are large, flightless birds and include the ostrich, rheas, kiwi, emu, and cassowaries, along with extinct members, such as moa and elephant birds. Previous phylogenetic analyses of complete mitochondrial genome sequences have reinforced the traditional belief that ratites are monophyletic and tinamous are their sister group. However, in these studies ratite monophyly was enforced in the analyses that modeled rate heterogeneity among variable sites. Relaxing this topological constraint results in strong support for the tinamous (which fly) nesting within ratites. Furthermore, upon reducing base compositional bias and partitioning models of sequence evolution among protein codon positions and RNA structures, the tinamou–moa clade grouped with kiwi, emu, and cassowaries to the exclusion of the successively more divergent rheas and ostrich. These relationships are consistent with recent results from a large nuclear data set, whereas our strongly supported finding of a tinamou–moa grouping further resolves palaeognath phylogeny. We infer flight to have been lost among ratites multiple times in temporally close association with the Cretaceous–Tertiary extinction event. This circumvents requirements for transient microcontinents and island chains to explain discordance between ratite phylogeny and patterns of continental breakup. Ostriches may have dispersed to Africa from Eurasia, putting in question the status of ratites as an iconic Gondwanan relict taxon. [Base composition; flightless; Gondwana; mitochondrial genome; Palaeognathae; phylogeny; ratites.]

Relevance:

90.00%

Abstract:

Members of the Calliphoridae (blowflies) are significant for medical and veterinary management, due to the ability of some species to consume living flesh as larvae, and for forensic investigations, due to the ability of others to develop in corpses. Because larval blowflies are difficult to identify accurately to species, there is a need for DNA-based diagnostics for this family; however, the widely used DNA-barcoding marker, cox1, has been shown to fail for several groups within this family. Additionally, many phylogenetic relationships within the Calliphoridae are still unresolved, particularly deeper-level relationships. Sequencing whole mt genomes has been demonstrated to be an effective method both for identifying the most informative diagnostic markers and for resolving phylogenetic relationships. Twenty-seven complete, or nearly complete, mt genomes were sequenced, representing 13 species, seven genera and four calliphorid subfamilies, and a member of the related family Tachinidae. PCR and sequencing primers developed for sequencing one calliphorid species could be reused to sequence related species within the same superfamily with success rates ranging from 61% to 100%, demonstrating the speed and efficiency with which an mt genome dataset can be assembled. Comparison of molecular divergences for each of the 13 protein-coding genes and 2 ribosomal RNA genes, at a range of taxonomic scales, identified novel targets for development as diagnostic markers, which were 117–200% more variable than the markers previously used in calliphorids. Phylogenetic analysis of whole mt genome sequences resulted in much stronger support for family and subfamily-level relationships. The Calliphoridae are polyphyletic, with the Polleninae more closely related to the Tachinidae, and the Sarcophagidae are the sister group of the remaining calliphorids.
Within the Calliphoridae, there was strong support for the monophyly of the Chrysomyinae and Luciliinae and for the sister-grouping of Luciliinae with Calliphorinae. Relationships within Chrysomya were not well resolved. Whole mt genome data supported the previously demonstrated paraphyly of Lucilia cuprina with respect to L. sericata and allowed us to conclude that it is due to hybrid introgression prior to the last common ancestor of modern sericata populations, rather than to recent hybridisation, nuclear pseudogenes or incomplete lineage sorting.

Relevance:

90.00%

Abstract:

Background: The majority of introns in gene transcripts are found within the coding sequences (CDSs). A small but significant fraction of introns are also found to reside within the untranslated regions (5′UTRs and 3′UTRs) of expressed sequences. Alignment of the whole genome and expressed sequence tags (ESTs) of the model plant Arabidopsis thaliana has identified introns residing in both coding and non-coding regions of the genome. Results: A bioinformatic analysis revealed four observations: (1) the density of introns in 5′UTRs is similar to that in CDSs but much higher than that in 3′UTRs; (2) the 5′UTR introns are preferentially located close to the initiating ATG codon; (3) introns in the 5′UTRs are, on average, longer than introns in the CDSs and 3′UTRs; and (4) 5′UTR introns have a different nucleotide composition from that of CDS and 3′UTR introns. Furthermore, we show that the 5′UTR intron of the A. thaliana EF1α-A3 gene affects gene expression and that the size of the 5′UTR intron influences the level of gene expression. Conclusion: Introns within the 5′UTR show specific features that distinguish them from introns that reside within the coding sequence and the 3′UTR. In the EF1α-A3 gene, the presence of a long intron in the 5′UTR is sufficient to enhance gene expression in plants in a size-dependent manner.

Relevance:

90.00%

Abstract:

Lipooligosaccharide (LOS) is a complex surface structure that is linked to many pathogenic properties of Acinetobacter baumannii. In A. baumannii, the genes responsible for the synthesis of the outer core (OC) component of the LOS are located between ilvE and aspS. The content of the OC locus is usually variable within a species, and examination of 6 complete and 227 draft A. baumannii genome sequences available in GenBank non-redundant and Whole Genome Shotgun databases revealed nine distinct new types, OCL4-OCL12, in addition to the three known ones. The twelve gene clusters fell into two distinct groups, designated Group A and Group B, based on similarities in the genes present. OCL6 (Group B) was unique in that it included genes for the synthesis of L-Rhamnosep. Genetic exchange of the different configurations between strains has occurred as some OC forms were found in several different sequence types (STs). OCL1 (Group A) was the most widely distributed being present in 18 STs, and OCL6 was found in 16 STs. Variation within clones was also observed, with more than one OC locus type found in the two globally disseminated clones, GC1 and GC2, that include the majority of multiply antibiotic resistant isolates. OCL1 was the most abundant gene cluster in both GC1 and GC2 genomes but GC1 isolates also carried OCL2, OCL3 or OCL5, and OCL3 was also present in GC2. As replacement of the OC locus in the major global clones indicates the presence of sub-lineages, a PCR typing scheme was developed to rapidly distinguish Group A and Group B types, and to distinguish the specific forms found in GC1 and GC2 isolates.

Relevance:

90.00%

Abstract:

Introduction: A number of genetic-association studies have identified genes contributing to ankylosing spondylitis (AS) susceptibility but such approaches provide little information as to the gene activity changes occurring during the disease process. Transcriptional profiling generates a 'snapshot' of the sampled cells' activity and thus can provide insights into the molecular processes driving the disease process. We undertook a whole-genome microarray approach to identify candidate genes associated with AS and validated these gene-expression changes in a larger sample cohort. Methods: A total of 18 active AS patients, classified according to the New York criteria, and 18 gender- and age-matched controls were profiled using Illumina HT-12 whole-genome expression BeadChips which carry cDNAs for 48,000 genes and transcripts. Class comparison analysis identified a number of differentially expressed candidate genes. These candidate genes were then validated in a larger cohort using qPCR-based TaqMan low density arrays (TLDAs). Results: A total of 239 probes corresponding to 221 genes were identified as being significantly different between patients and controls with a P-value <0.0005 (80% confidence level of false discovery rate). Forty-seven genes were then selected for validation studies, using the TLDAs. Thirteen of these genes were validated in the second patient cohort with 12 downregulated 1.3- to 2-fold and only 1 upregulated (1.6-fold). Among a number of identified genes with well-documented inflammatory roles we also validated genes that might be of great interest to the understanding of AS progression such as SPOCK2 (osteonectin) and EP300, which modulate cartilage and bone metabolism. Conclusions: We have validated a gene expression signature for AS from whole blood and identified strong candidate genes that may play roles in both the inflammatory and joint destruction aspects of the disease.

Relevance:

90.00%

Abstract:

To gain insight into the mechanisms by which the Myb transcription factor controls normal hematopoiesis and, in particular, how it contributes to leukemogenesis, we mapped the genome-wide occupancy of Myb by chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) in ERMYB myeloid progenitor cells. By integrating the genome occupancy data with whole genome expression profiling data, we identified a Myb-regulated transcriptional program. Gene signatures for leukemia stem cells, normal hematopoietic stem/progenitor cells and myeloid development were overrepresented in 2368 Myb-regulated genes. Of these, Myb bound directly near or within 793 genes. Myb directly activates some genes known to be critical in maintaining hematopoietic stem cells, such as Gfi1 and Cited2. Importantly, we also show that, despite usually being considered a transactivator, Myb also functions to repress approximately half of its direct targets, including several key regulators of myeloid differentiation, such as Sfpi1 (also known as Pu.1), Runx1, Junb and Cebpb. Furthermore, our results demonstrate that interaction with p300, an established coactivator for Myb, is unexpectedly required for Myb-mediated transcriptional repression. We propose that the repression of the above-mentioned key pro-differentiation factors may contribute essentially to Myb's ability to suppress differentiation and promote self-renewal, thus maintaining progenitor cells in an undifferentiated state and promoting leukemic transformation. © 2011 The Author(s).

Relevance:

90.00%

Abstract:

Phenotypic convergence is thought to be driven by parallel substitutions coupled with natural selection at the sequence level. Multiple independent evolutionary transitions of mammals to an aquatic environment offer an opportunity to test this thesis. Here, whole genome alignment of coding sequences identified widespread parallel amino acid substitutions in marine mammals; however, the majority of these changes were not unique to these animals. Conversely, we report that candidate aquatic adaptation genes, identified by signatures of likelihood convergence and/or elevated ratio of nonsynonymous to synonymous nucleotide substitution rate, are characterized by very few parallel substitutions and exhibit distinct sequence changes in each group. Moreover, no significant positive correlation was found between likelihood convergence and positive selection in all three marine lineages. These results suggest that convergence in protein coding genes associated with aquatic lifestyle is mainly characterized by independent substitutions and relaxed negative selection.

Relevance:

90.00%

Abstract:

The aim of the pedigree-based genome mapping project is to investigate and develop systems for implementing marker assisted selection to improve the efficiency of selection and increase the rate of genetic gain in breeding programs. Pedigree-based whole genome marker application provides a vehicle for incorporating marker technologies into applied breeding programs by bridging the gap between marker-trait association and marker implementation. We report on the development of protocols for implementation of pedigree-based whole genome marker analysis in breeding programs within the Australian northern winter cereals region. Examples of applications from the Queensland DPI&F wheat and barley breeding programs are provided, commenting on the use of microsatellites and other types of molecular markers for routine genomic analysis, the integration of genotypic, phenotypic and pedigree information for targeted wheat and barley lines, the genomic impacts of strong selection pressure in case study pedigrees, and directions for future pedigree-based marker development and analysis.

Relevance:

90.00%

Abstract:

Japanese isolates of Candidatus Liberibacter asiaticus have been shown to be clearly differentiated by simple sequence repeat (SSR) profiles at four loci. In this study, 25 SSR loci, including these four loci, were selected from the whole-genome sequence and were used to differentiate non-Japanese samples of Ca. Liberibacter asiaticus (13 Indian, 3 East Timorese, 1 Papuan and 8 Floridian samples). Out of the 25 SSR loci, 13 were polymorphic. Dendrogram analysis using SSR loci showed that the clusters were mostly consistent with the geographical origins of the isolates. When single nucleotide polymorphisms (SNPs) were searched around these 25 loci, only the upstream region of locus 091 exhibited polymorphism. Phylogenetic tree analysis of the SNPs in the upstream region of locus 091 showed that Floridian samples were clustered into one group as shown by dendrogram analysis using SSR loci. The differences in nucleotide sequences were not associated with differences in the citrus hosts (lime, mandarin, lemon and sour orange) from which the isolates were originally derived.
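The kind of locus this abstract relies on can be located in a sequence with a simple tandem-repeat scan. The following is a minimal sketch only: the function name, the thresholds (repeat units of 1 to 6 nt, at least four copies) and the toy sequences are assumptions, not the criteria used in the study.

```python
import re


def find_ssrs(seq, min_unit=1, max_unit=6, min_repeats=4):
    """Yield (start, unit, repeat_count) for perfect tandem repeats in seq.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    so 'ACACACAC' is reported as unit 'AC' rather than 'ACAC'.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    for match in pattern.finditer(seq):
        unit = match.group(1)
        yield match.start(), unit, len(match.group(0)) // len(unit)


# Hypothetical usage on a toy sequence: one (TGA)x4 repeat starting at index 1.
print(list(find_ssrs("TTGATGATGATGATT")))
```

Isolates would then be compared by the repeat count observed at each locus; a locus is polymorphic when that count differs between isolates.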

Relevance:

90.00%

Abstract:

The first complete genome sequence of capsicum chlorosis virus (CaCV) from Australia was determined using a combination of Illumina HiSeq RNA and Sanger sequencing technologies. Australian CaCV had a tripartite genome structure like other CaCV isolates. The large (L) RNA was 8913 nucleotides (nt) in length and contained a single open reading frame (ORF) of 8634 nt encoding a predicted RNA-dependent RNA polymerase (RdRp) in the viral-complementary (vc) sense. The medium (M) and small (S) RNA segments were 4846 and 3944 nt in length, respectively, each containing two non-overlapping ORFs in ambisense orientation, separated by intergenic regions (IGR). The M segment contained ORFs encoding the predicted non-structural movement protein (NSm; 927 nt) and precursor of glycoproteins (GP; 3366 nt) in the viral sense (v) and vc strand, respectively, separated by a 449-nt IGR. The S segment coded for the predicted nucleocapsid (N) protein (828 nt) and non-structural suppressor of silencing protein (NSs; 1320 nt) in the vc and v strand, respectively. The S RNA contained an IGR of 1663 nt, being the largest IGR of all CaCV isolates sequenced so far. Comparison of the Australian CaCV genome with complete CaCV genome sequences from other geographic regions showed highest sequence identity with a Taiwanese isolate. Genome sequence comparisons and phylogeny of all available CaCV isolates provided evidence for at least two highly diverged groups of CaCV isolates that may warrant re-classification of AIT-Thailand and CP-China isolates as unique tospoviruses, separate from CaCV.
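The segment figures above can be cross-checked with a few lines of arithmetic. This sketch only uses the lengths quoted in the abstract; treating the nucleotides not covered by the listed ORFs and IGRs as terminal untranslated sequence is an assumption, as are the variable and function names.

```python
# Segment and feature lengths (nt) exactly as quoted in the abstract.
segments = {
    "L": {"length": 8913, "features": {"RdRp ORF": 8634}},
    "M": {"length": 4846, "features": {"NSm ORF": 927, "IGR": 449, "GP ORF": 3366}},
    "S": {"length": 3944, "features": {"N ORF": 828, "IGR": 1663, "NSs ORF": 1320}},
}


def uncovered_nt(segment):
    """Nucleotides not accounted for by the listed, non-overlapping features.

    Assumption: this remainder corresponds to the terminal untranslated regions.
    """
    return segment["length"] - sum(segment["features"].values())


for name, segment in segments.items():
    print(name, uncovered_nt(segment))

# Total size of the tripartite genome.
genome_size = sum(segment["length"] for segment in segments.values())
print(genome_size)
```

Under these assumptions the listed features never exceed the segment length, so the reported ORF and IGR sizes are internally consistent with the segment sizes.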

Relevance:

90.00%

Abstract:

Extraintestinal pathogenic Escherichia coli (ExPEC) represent a diverse group of strains of E. coli, which infect extraintestinal sites, such as the urinary tract, the bloodstream, the meninges, the peritoneal cavity, and the lungs. Urinary tract infections (UTIs) caused by uropathogenic E. coli (UPEC), the major subgroup of ExPEC, are among the most prevalent microbial diseases worldwide and a substantial burden for public health care systems. UTIs are responsible for serious morbidity and mortality in the elderly, in young children, and in immune-compromised and hospitalized patients. ExPEC strains are different, both from genetic and clinical perspectives, from commensal E. coli strains belonging to the normal intestinal flora and from intestinal pathogenic E. coli strains causing diarrhea. ExPEC strains are characterized by a broad range of alternate virulence factors, such as adhesins, toxins, and iron accumulation systems. Unlike diarrheagenic E. coli, whose distinctive virulence determinants evoke characteristic diarrheagenic symptoms and signs, ExPEC strains are exceedingly heterogeneous and are known to possess no specific virulence factor or set of factors that is obligatory for the infection of a certain extraintestinal site (e.g. the urinary tract). The ExPEC genomes are highly diverse mosaic structures in permanent flux. These strains have obtained a significant amount of DNA (predictably up to 25% of the genomes) through acquisition of foreign DNA from diverse related or non-related donor species by lateral transfer of mobile genetic elements, including pathogenicity islands (PAIs), plasmids, phages, transposons, and insertion elements. The ability of ExPEC strains to cause disease is mainly derived from this horizontally acquired gene pool; the extragenous DNA facilitates rapid adaptation of the pathogen to changing conditions and hence the extent of the spectrum of sites that can be infected.
However, neither the amount of unique DNA in different ExPEC strains (or UPEC strains) nor the mechanisms lying behind the observed genomic mobility are known. Due to this extreme heterogeneity of the UPEC and ExPEC populations in general, the routine surveillance of ExPEC is exceedingly difficult. In this project, we presented a novel virulence gene algorithm (VGA) for the estimation of the extraintestinal virulence potential (VP, pathogenicity risk) of clinically relevant ExPECs and fecal E. coli isolates. The VGA was based on a DNA microarray specific for the ExPEC phenotype (ExPEC pathoarray). This array contained 77 DNA probes homologous with known (e.g. adhesion factors, iron accumulation systems, and toxins) and putative (e.g. genes predictably involved in adhesion, iron uptake, or in metabolic functions) ExPEC virulence determinants. In total, 25 of the DNA probes homologous with known virulence factors and 36 of the DNA probes representing putative extraintestinal virulence determinants were found at significantly higher frequency in virulent ExPEC isolates than in commensal E. coli strains. We showed that the ExPEC pathoarray and the VGA could be readily used for the differentiation of highly virulent ExPECs both from less virulent ExPEC clones and from commensal E. coli strains. Implementing the VGA in a group of unknown ExPECs (n=53) and fecal E. coli isolates (n=37), 83% of strains were correctly identified as extraintestinal virulent or commensal E. coli. Conversely, 15% of clinical ExPECs and 19% of fecal E. coli strains failed to raster into their respective pathogenic and non-pathogenic groups. Clinical data and virulence gene profiles of these strains warranted the estimated VPs; UPEC strains with atypically low risk-ratios were largely isolated from patients with a particular medical history, including diabetes mellitus or catheterization, or from elderly patients. In addition, fecal E. coli strains with VPs characteristic for ExPEC were shown to represent the diagnostically important fraction of resident strains of the gut flora with a high potential of causing extraintestinal infections. Interestingly, a large fraction of DNA probes associated with the ExPEC phenotype corresponded to novel DNA sequences without any known function in UTIs and thus represented new genetic markers for extraintestinal virulence. These DNA probes included unknown DNA sequences originating from the genomic subtractions of four clinical ExPEC isolates as well as from five novel cosmid sequences identified in the UPEC strains HE300 and JS299. The characterized cosmid sequences (pJS332, pJS448, pJS666, pJS700, and pJS706) revealed complex modular DNA structures with known and unknown DNA fragments arranged in a puzzle-like manner and integrated into the common E. coli genomic backbone. Furthermore, cosmid pJS332 of the UPEC strain HE300, which carried a chromosomal virulence gene cluster (iroBCDEN) encoding the salmochelin siderophore system, was shown to be part of a transmissible plasmid of Salmonella enterica. Taken together, the results of this project pointed towards the assumptions that (i) homologous recombination, even within coding genes, contributes to the observed mosaicism of ExPEC genomes and (ii) besides en bloc transfer of large DNA regions (e.g. chromosomal PAIs), rearrangements of small DNA modules also provide a means of genomic plasticity. The data presented in this project supplemented previous whole genome sequencing projects of E. coli and indicated that each E. coli genome displays a unique assemblage of individual mosaic structures, which enable these strains to successfully colonize and infect different anatomical sites.
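The classification figures quoted above (83% correct overall, 15% of ExPECs and 19% of fecal isolates misassigned) can be checked against each other with simple arithmetic. This is a sketch under one assumption: the percentages are converted to whole isolates by rounding, which the abstract does not state. The variable names are illustrative.

```python
n_expec, n_fecal = 53, 37  # group sizes quoted in the abstract

# Assumption: the quoted percentages correspond to whole numbers of isolates.
missed_expec = round(0.15 * n_expec)  # ExPECs not assigned to the pathogenic group
missed_fecal = round(0.19 * n_fecal)  # fecal isolates not assigned to the commensal group

total = n_expec + n_fecal
correct = total - missed_expec - missed_fecal
accuracy = correct / total

print(missed_expec, missed_fecal, correct, total)
print(f"{accuracy:.1%}")
```

Under that rounding assumption the per-group error rates and the overall figure agree with each other.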

Relevance:

90.00%

Abstract:

We present WebGeSTer DB, the largest database of intrinsic transcription terminators (http://pallab.serc.iisc.ernet.in/gester). The database comprises a million terminators identified in 1060 bacterial genome sequences and 798 plasmids. Users can obtain both graphic and tabular results on putative terminators based on default or user-defined parameters. The results are arranged in different tiers to facilitate retrieval according to specific requirements. An interactive map has been incorporated to visualize the distribution of terminators across the whole genome. Analysis of the results, both at the whole-genome level and with respect to terminators downstream of specific genes, offers insight into the prevalence of canonical and non-canonical terminators across different phyla. The data in the database reinforce the paradigm that intrinsic termination is a conserved and efficient regulatory mechanism in bacteria. Our database is freely accessible.

Relevance:

90.00%

Abstract:

Staphylococcus aureus is a major human pathogen, first recognized as a leading cause of hospital-acquired infections. Community-associated S. aureus (CA-SA) pose a greater threat due to an increase in the severity of infection and disease among children and healthy adults. CA-SA strains in India are genetically diverse, among which is the sequence type (ST) 772, which has now spread to Australia, Europe and Japan. Towards understanding the genetic characteristics of ST772, we obtained draft genome sequences of five relevant clinical isolates and studied the properties of their PVL-carrying prophages, whose presence is a defining hallmark of CA-SA. We show that this is a novel prophage, which carries the structural genes of the hlb-carrying prophage and includes the sea enterotoxin. This architecture probably emerged early within the ST772 lineage, at least in India. The sea gene, unique to ST772 PVL, despite having promoter sequence characteristics typical of low expression, appears to be highly expressed during the early phase of growth in laboratory conditions. We speculate that this might be a consequence of its novel sequence context. The crippled nature of the hlb-converting prophage in ST772 suggests that widespread mobility of the sea enterotoxin might be a selective force behind its 'transfer' to the PVL prophage. Wild-type ST772 strains induced strong proliferative responses as well as high cytotoxic activity against neutrophils, likely mediated by superantigen SEA and the PVL toxin, respectively. Both proliferation and cytotoxicity were markedly reduced in a cured ST772 strain, indicating the impact of the phage on virulence. The presence of SEA alongside the genes for the immune system-modulating PVL toxin may contribute to the success and virulence of ST772.