10 results for Comparative genomics
in CentAUR: Central Archive University of Reading - UK
Abstract:
Background: Microarray-based comparative genomic hybridisation (CGH) experiments have been used to study numerous biological problems, including genome plasticity in pathogenic bacteria. Typically such experiments produce large data sets that are difficult for biologists to handle. Although some programmes are available for interpreting bacterial transcriptomics data, and CGH microarray data for assessing genetic stability in oncogenes, none is designed specifically to reveal the mosaic nature of bacterial genomes. Consequently, a bottleneck persists in the accurate processing and mathematical analysis of these data. To address this shortfall we have produced a simple and robust CGH microarray data analysis process that may be automated in the future to understand bacterial genomic diversity. Results: The process involves five steps: cleaning, normalisation, estimating gene presence and absence or divergence, validation, and analysis of data from test strains against three reference strains simultaneously. Each stage of the process is described. We compared a number of methods available for characterising bacterial genomic diversity and for calculating the cut-off between gene presence and absence or divergence, and showed that a simple dynamic approach using a kernel density estimator performed better than both established methods and a more sophisticated mixture modelling technique. We have also shown that current methods commonly used for CGH microarray analysis in tumour and cancer cell lines are not appropriate for analysing our data. Conclusion: After carrying out the analysis and validation for three sequenced Escherichia coli strains, CGH microarray data from 19 E. coli O157 pathogenic test strains were used to demonstrate the benefits of applying this simple and robust process to CGH microarray studies using bacterial genomes.
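The abstract above does not say how the kernel density estimator is turned into a cut-off, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes that per-gene log ratios are bimodal (present versus absent or divergent) and takes the density minimum between the two tallest peaks as the cut-off. The function names, the Silverman bandwidth rule, and the simulated log ratios are all assumptions made for illustration.

```python
import math
import random
import statistics


def gaussian_kde(values, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate at x."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)

    return density


def presence_cutoff(log_ratios, grid_size=200):
    """Hypothetical helper: cut-off = density minimum between the two tallest KDE peaks."""
    n = len(log_ratios)
    # Silverman's rule-of-thumb bandwidth (an assumed choice, not taken from the source).
    bandwidth = 1.06 * statistics.stdev(log_ratios) * n ** (-0.2)
    density = gaussian_kde(log_ratios, bandwidth)

    # Evaluate the density on an evenly spaced grid spanning the data.
    lo, hi = min(log_ratios), max(log_ratios)
    step = (hi - lo) / (grid_size - 1)
    xs = [lo + i * step for i in range(grid_size)]
    ys = [density(x) for x in xs]

    # Interior grid points that are local maxima of the estimated density.
    peaks = [i for i in range(1, grid_size - 1) if ys[i - 1] < ys[i] >= ys[i + 1]]
    if len(peaks) < 2:
        raise ValueError("density is not bimodal; no cut-off found")

    # Keep the two tallest peaks and search for the valley between them.
    left, right = sorted(sorted(peaks, key=lambda i: ys[i], reverse=True)[:2])
    valley = min(range(left, right + 1), key=lambda i: ys[i])
    return xs[valley]


# Simulated log ratios (not data from the study): 400 "present" genes near 0
# and 100 "absent or divergent" genes near -3.
random.seed(0)
simulated = [random.gauss(0.0, 0.3) for _ in range(400)]
simulated += [random.gauss(-3.0, 0.3) for _ in range(100)]
cutoff = presence_cutoff(simulated)
```

Because the cut-off is recomputed from each array's own distribution, it moves with the data instead of being fixed in advance, which is one way to interpret "dynamic" in the abstract.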
Abstract:
Background. The anaerobic spirochaete Brachyspira pilosicoli causes enteric disease in avian, porcine and human hosts, amongst others. To date, the only available genome sequence of B. pilosicoli is that of strain 95/1000, a porcine isolate. In the first intra-species genome comparison within the Brachyspira genus, we report the whole genome sequence of B. pilosicoli B2904, an avian isolate, the incomplete genome sequence of B. pilosicoli WesB, a human isolate, and the comparisons with B. pilosicoli 95/1000. We also draw on incomplete genome sequences from three other Brachyspira species. Finally we report the first application of the high-throughput Biolog phenotype screening tool on the B. pilosicoli strains for detailed comparisons between genotype and phenotype. Results. Feature and sequence genome comparisons revealed a high degree of similarity between the three B. pilosicoli strains, although the genomes of B2904 and WesB were larger than that of 95/1000 (~2.765, 2.890 and 2.596 Mb, respectively). Genome rearrangements were observed which correlated largely with the positions of mobile genetic elements. Through comparison of the B2904 and WesB genomes with the 95/1000 genome, features that we propose are non-essential due to their absence from 95/1000 include a peptidase, glycine reductase complex components and transposases. Novel bacteriophages were detected in the newly sequenced genomes, which appeared to have involvement in intra- and inter-species horizontal gene transfer. Phenotypic differences predicted from genome analysis, such as the lack of genes for glucuronate catabolism in 95/1000, were confirmed by phenotyping. Conclusions. The availability of multiple B. pilosicoli genome sequences has allowed us to demonstrate the substantial genomic variation that exists between these strains, and provides an insight into genetic events that are shaping the species. In addition, phenotype screening allowed determination of how genotypic differences translated to phenotype. Further application of such comparisons will improve understanding of the metabolic capabilities of Brachyspira species.
Abstract:
Root nodule symbiosis (RNS) is one of the most efficient biological systems for nitrogen fixation and it occurs in 90% of genera in the Papilionoideae, the largest subfamily of legumes. Most papilionoid species show evidence of a polyploidy event that occurred approximately 58 million years ago. Although polyploidy is considered to be an important evolutionary force in plants, the role of this papilionoid polyploidy event, especially its association with RNS, is not understood. In this study, we explored this role using an integrated comparative genomic approach and conducted gene expression comparisons and gene ontology enrichment analyses. The results show the following: (1) approximately a quarter of the papilionoid-polyploidy-derived duplicate genes are retained; (2) there is a striking divergence in the level of expression of gene duplicate pairs derived from the polyploidy event; and (3) the retained duplicates are frequently involved in the processes crucial for RNS establishment, such as symbiotic signalling, nodule organogenesis, rhizobial infection and nutrient exchange and transport. Thus, we conclude that the papilionoid polyploidy event might have further refined RNS and induced a more robust and enhanced symbiotic system. This conclusion partly explains the widespread occurrence of the Papilionoideae.
Abstract:
Background: Despite the frequent isolation of Salmonella enterica subsp. enterica serovars Derby and Mbandaka from livestock in the UK and USA, little is known about the biological processes maintaining their prevalence. Statistics for Salmonella isolations from livestock production in the UK show that S. Derby is most commonly associated with pigs and turkeys and S. Mbandaka with cattle and chickens. Here we compare the first sequenced genomes of S. Derby and S. Mbandaka as a basis for further analysis of the potential host adaptations that contribute to their distinct host species distributions. Results: Comparative functional genomics using the RAST annotation system showed that S. Derby is distinguished from S. Mbandaka predominantly by mechanisms that relate to metabolite utilisation, in vivo and ex vivo persistence and pathogenesis. Alignment of the genome nucleotide sequences of S. Derby D1 and D2 and S. Mbandaka M1 and M2 with Salmonella pathogenicity islands (SPI) identified unique complements of genes associated with host adaptation. We also describe a new genomic island with a putative role in pathogenesis, SPI-23. SPI-23 is present in several S. enterica serovars, including S. Agona, S. Dublin and S. Gallinarum, but it is absent in its entirety from S. Mbandaka. Conclusions: We discovered a new 37 kb genomic island, SPI-23, in the chromosome sequence of S. Derby, encoding 42 ORFs, ten of which are putative TTSS effector proteins. We infer from full-genome synonymous SNP analysis that these two serovars diverged between 182 kya and 625 kya, coinciding with the divergence of domestic pigs. The differences between the genomes of these serovars suggest they have been exposed to different stresses, including phage, transposons and prolonged externalisation. The two serovars possess distinct complements of metabolic genes, many of which cluster into pathways for catabolism of carbon sources.
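The abstract above reports a divergence window inferred from synonymous SNPs but gives no formula. A common molecular-clock approximation, shown here only as a hedged sketch and not as the authors' calculation, divides the synonymous divergence by twice the substitution rate, because both lineages accumulate changes after the split. The function name, SNP count, site count and rates below are hypothetical placeholders chosen for easy arithmetic; none are values from the study.

```python
def divergence_time_years(synonymous_snps, synonymous_sites, rate_per_site_per_year):
    """Molecular-clock estimate T = dS / (2 * rate), where dS = SNPs / sites."""
    if synonymous_sites <= 0 or rate_per_site_per_year <= 0:
        raise ValueError("sites and rate must be positive")
    d_s = synonymous_snps / synonymous_sites  # synonymous differences per site
    return d_s / (2.0 * rate_per_site_per_year)


# Hypothetical inputs: a slow and a fast clock rate bracket the estimate,
# which is one way a lower and an upper bound could arise.
snps = 10_000
sites = 1_000_000
slow_rate, fast_rate = 1e-8, 4e-8

oldest = divergence_time_years(snps, sites, slow_rate)
youngest = divergence_time_years(snps, sites, fast_rate)
```

The width of the resulting interval depends entirely on the spread between the assumed rates, so any reported window is only as reliable as the clock calibration behind it.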
Abstract:
Although Ca transport in plants is highly complex, the overexpression of vacuolar Ca2+ transporters in crops is a promising new technology to improve dietary Ca supplies through biofortification. Here, we sought to identify novel targets for increasing plant Ca accumulation using genetical and comparative genomics. Expression quantitative trait loci (eQTLs) mapping to 1895 cis- and 8015 trans-loci were identified in shoots of an inbred mapping population of Brassica rapa (IMB211 × R500); 23 cis- and 948 trans-eQTLs responded specifically to altered Ca supply. eQTLs were screened for functional significance using a large database of shoot Ca concentration phenotypes of Arabidopsis thaliana. From 31 Arabidopsis gene identifiers tagged to robust shoot Ca concentration phenotypes, 21 mapped to 27 B. rapa eQTLs, including orthologs of the Ca2+ transporters At-CAX1 and At-ACA8. Two of three independent missense mutants of BraA.cax1a, isolated previously by targeting induced local lesions in genomes, have allele-specific shoot Ca concentration phenotypes compared with their segregating wild types. BraA.CAX1a is a promising target for altering the Ca composition of Brassica, consistent with prior knowledge from Arabidopsis. We conclude that multiple-environment eQTL analysis of complex crop genomes combined with comparative genomics is a powerful technique for novel gene identification/prioritization.
Abstract:
The recently described cupin superfamily of proteins includes the germin and germin-like proteins, of which the cereal oxalate oxidase is the best characterized. This superfamily also includes seed storage proteins, in addition to several microbial enzymes and proteins with unknown function. All these proteins are characterized by the conservation of two central motifs, usually containing two or three histidine residues presumed to be involved with metal binding in the catalytic active site. The present study on the coding regions of Synechocystis PCC6803 identifies a previously unknown group of 12 related cupins, each containing the characteristic two-motif signature. This group comprises 11 single-domain proteins, ranging in length from 104 to 289 residues, and includes two phosphomannose isomerases and two epimerases involved in cell wall synthesis, a member of the pirin group of nuclear proteins, a possible transcriptional regulator, and a close relative of a cytochrome c551 from Rhodococcus. Additionally, there is a duplicated, two-domain protein that has close similarity to an oxalate decarboxylase from the fungus Collybia velutipes and that is a putative progenitor of the storage proteins of land plants.
Abstract:
The cupin superfamily of proteins, named on the basis of a conserved β-barrel fold (‘cupa’ is the Latin term for a small barrel), was originally discovered using a conserved motif found within germin and germin-like proteins from higher plants. Previous analysis of cupins had identified some 18 different functional classes that range from single-domain bacterial enzymes such as isomerases and epimerases involved in the modification of cell wall carbohydrates, through to two-domain bicupins such as the desiccation-tolerant seed storage globulins, and multidomain transcription factors including one linked to the nodulation response in legumes. Recent advances in comparative genomics and the resolution of many more 3-D structures have now revealed that the largest subset of the cupin superfamily is the 2-oxoglutarate-Fe2+-dependent dioxygenases. The substrates for this subclass of enzyme are many and varied and in total amount to probably 50–100 different biochemical reactions, including several involved in plant growth and development. Although the majority of enzymatic cupins contain iron as an active site metal, other members contain either copper, zinc, cobalt, nickel or manganese ions as a cofactor, with each cofactor allowing a different type of chemistry to occur within the conserved tertiary structure. This review discusses the range of structures and functions found in this most diverse of superfamilies.
Abstract:
BACKGROUND: The Salmonella enterica serovar Derby is frequently isolated from pigs and turkeys whereas serovar Mbandaka is frequently isolated from cattle, chickens and animal feed in the UK. Through comparative genomics, phenomics and mutant construction we previously suggested possible mechanistic reasons why these serovars demonstrate apparently distinct host ranges. Here, we investigate the genetic and phenotypic diversity of these two serovars in the UK. We produce a phylogenetic reconstruction and perform several biochemical assays on isolates of S. Derby and S. Mbandaka acquired from sites across the UK between the years 2000 and 2010. RESULTS: We show that UK isolates of S. Mbandaka comprise one clonal lineage which is adapted to proficient utilisation of metabolites found in soya beans under ambient conditions. We also show that this clonal lineage forms a biofilm at 25 °C, suggesting that this serovar may be well adapted to survival ex vivo, growing in animal feed. Conversely, we show that S. Derby is made up of two distinct lineages, L1 and L2. These lineages differ genotypically and phenotypically, being divided by the presence and absence of SPI-23 and the ability to more proficiently invade the porcine jejunum-derived cell line IPEC-J2. CONCLUSION: The results of this study lend support to the hypothesis that the differences in host ranges of S. Derby and S. Mbandaka are adaptations to pathogenesis, environmental persistence, as well as utilisation of metabolites abundant in their respective host environments.
Abstract:
In this study, we demonstrate the suitability of the vertebrate Danio rerio (zebrafish) for functional screening of novel platelet genes in vivo by reverse genetics. Comparative transcript analysis of platelets and their precursor cell, the megakaryocyte, together with nucleated blood cell elements, endothelial cells, and erythroblasts, identified novel platelet membrane proteins with hitherto unknown roles in thrombus formation. We determined the phenotype induced by antisense morpholino oligonucleotide (MO)–based knockdown of 5 of these genes in a laser-induced arterial thrombosis model. To validate the model, the genes for platelet glycoprotein (GP) IIb and the coagulation protein factor VIII were targeted. MO-injected fish showed normal thrombus initiation but severely impaired thrombus growth, consistent with the mouse knockout phenotypes, and concomitant knockdown of both resulted in spontaneous bleeding. Knockdown of 4 of the 5 novel platelet proteins altered arterial thrombosis, as demonstrated by modified kinetics of thrombus initiation and/or development. We identified a putative role for BAMBI and LRRC32 in promotion and DCBLD2 and ESAM in inhibition of thrombus formation. We conclude that phenotypic analysis of MO-injected zebrafish is a fast and powerful method for initial screening of novel platelet proteins for function in thrombosis.