815 results for Comparative Genomics, Non-coding RNAs, Conservation, Segmentation, Change-points, Sliding Window Analysis, Markov Chain Monte Carlo, Bayesian modeling
Abstract:
The silver gemfish Rexea solandri is an important economic resource but vulnerable to overfishing in Australian waters. The complete mitochondrial genome sequence is described from 1.6 million reads obtained via next generation sequencing. The total length of the mitogenome is 16,350 bp comprising 2 rRNA, 13 protein-coding genes, 22 tRNA and 2 non-coding regions. The mitogenome sequence was validated against sequences of PCR fragments and BLAST queries of GenBank. Gene order was equivalent to that found in marine fishes.
Abstract:
This thesis presents methods for locating and analyzing cis-regulatory DNA elements involved in the regulation of gene expression in multicellular organisms. The regulation of gene expression is carried out by the combined effort of several transcription factor proteins collectively binding the DNA on the cis-regulatory elements. Only sparse knowledge of the 'genetic code' of these elements exists today. An automatic tool for discovery of putative cis-regulatory elements could help their experimental analysis, which would result in a more detailed view of the cis-regulatory element structure and function. We have developed a computational model for the evolutionary conservation of cis-regulatory elements. The elements are modeled as evolutionarily conserved clusters of sequence-specific transcription factor binding sites. We give an efficient dynamic programming algorithm that locates the putative cis-regulatory elements and scores them according to the conservation model. A notable proportion of the high-scoring DNA sequences show transcriptional enhancer activity in transgenic mouse embryos. The conservation model includes four parameters whose optimal values are estimated with simulated annealing. With good parameter values the model discriminates well between the DNA sequences with evolutionarily conserved cis-regulatory elements and the DNA sequences that have evolved neutrally. In further inquiry, the set of highest scoring putative cis-regulatory elements was found to be sensitive to small variations in the parameter values. The statistical significance of the putative cis-regulatory elements is estimated with the Two Component Extreme Value Distribution. The p-values grade the conservation of the cis-regulatory elements above the neutral expectation. The parameter values for the distribution are estimated by simulating the neutral DNA evolution.
The conservation of the transcription factor binding sites can be used in the upstream analysis of regulatory interactions. This approach may provide mechanistic insight into the transcription level data from, e.g., microarray experiments. Here we give a method to predict shared transcriptional regulators for a set of co-expressed genes. The EEL (Enhancer Element Locator) software implements the method for locating putative cis-regulatory elements. The software facilitates both interactive use and distributed batch processing. We have used it to analyze the non-coding regions around all human genes with respect to the orthologous regions in various other species including mouse. The data from these genome-wide analyses are stored in a relational database which is used in the publicly available web services for upstream analysis and visualization of the putative cis-regulatory elements in the human genome.
Abstract:
A cDNA library for 6S–9S poly(A)-containing RNA from rat liver was constructed in Image . Initial screening of the clones was carried out using single-stranded 32P-labeled cDNA prepared against poly(A)-containing RNA isolated from immunoadsorbed polyribosomes enriched for the nuclear-coded subunit messenger RNAs of cytochrome c oxidase. One of the clones, pCO89, was found to hybridize with the messenger RNA for subunit VIC. The insert in pCO89 was sequenced and shows extensive homology with the C-terminal 33 amino acids of subunit VIC from beef heart cytochrome c oxidase. In addition, the insert contained 146 bp corresponding to a portion of the 3′-non-coding region. Northern blot analysis of rat liver RNA with the nick-translated insert of pCO89 revealed that the messenger RNA for subunit VI contains around 510 bases.
Abstract:
The evolutionary history of biological entities is recorded within their nucleic acid sequences and can (sometimes) be deciphered by thorough genomic analysis. In this study we sought to gain insights into the diversity and evolution of bacterial and archaeal viruses. We focused primarily on those virus groups/families for which comprehensive genomic analysis was not previously possible because of insufficient genomic data. During the course of this work, twenty-five putative proviruses integrated into various prokaryotic genomes were identified, enabling us to undertake a comparative genomics approach. This analysis allowed us to test previously formulated evolutionary hypotheses and also provided valuable information on the molecular mechanisms behind the genome evolution of the studied virus groups.
Abstract:
The extent to which low-frequency (minor allele frequency (MAF) between 1-5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is mainly unknown. Bone mineral density (BMD) is highly heritable, a major predictor of osteoporotic fractures, and has been previously associated with common genetic variants, as well as rare, population-specific, coding variants. Here we identify novel non-coding genetic variants with large effects on BMD (n_total = 53,236) and fracture (n_total = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size fourfold larger than the mean of previously reported common variants for lumbar spine BMD (rs11692564(T), MAF = 1.6%, replication effect size = +0.20 s.d., P_meta = 2 × 10^-14), which was also associated with a decreased risk of fracture (odds ratio = 0.85; P = 2 × 10^-11; n_cases = 98,742 and n_controls = 409,511). Using an En1(cre/flox) mouse model, we observed that conditional loss of En1 results in low bone mass, probably as a consequence of high bone turnover. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817(T), MAF = 1.2%, replication effect size = +0.41 s.d., P_meta = 1 × 10^-11). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants.
These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population.
Abstract:
The virus inducible non-coding RNA (VINC) was detected initially in the brain of mice infected with Japanese encephalitis virus (JEV) and rabies virus. VINC is also known as NEAT1 or Men epsilon RNA. It is localized in the nuclear paraspeckles of several murine as well as human cell lines and is essential for paraspeckle formation. We demonstrate that VINC interacts with the paraspeckle protein P54nrb through three different protein interaction regions (PIRs), one of which (PIR-1) is localized near the 5′ end while the other two (PIR-2, PIR-3) are localized near the 3′ region of VINC. Our studies suggest that VINC may interact with P54nrb through a novel mechanism which is different from that reported for protein coding RNAs. (C) 2010 Federation of European Biochemical Societies. Published by Elsevier B. V. All rights reserved.
Abstract:
Background:Bacterial non-coding small RNAs (sRNAs) have attracted considerable attention due to their ubiquitous nature and contribution to numerous cellular processes including survival, adaptation and pathogenesis. Existing computational approaches for identifying bacterial sRNAs demonstrate varying levels of success and there remains considerable room for improvement. Methodology/Principal Findings: Here we have proposed a transcriptional signal-based computational method to identify intergenic sRNA transcriptional units (TUs) in completely sequenced bacterial genomes. Our sRNAscanner tool uses position weight matrices derived from experimentally defined E. coli K-12 MG1655 sRNA promoter and rho-independent terminator signals to identify intergenic sRNA TUs through sliding window based genome scans. Analysis of genomes representative of twelve species suggested that sRNAscanner demonstrated equivalent sensitivity to sRNAPredict2, the best performing bioinformatics tool available presently. However, each algorithm yielded substantial numbers of known and uncharacterized hits that were unique to one or the other tool only. sRNAscanner identified 118 novel putative intergenic sRNA genes in Salmonella enterica Typhimurium LT2, none of which were flagged by sRNAPredict2. Candidate sRNA locations were compared with available deep sequencing libraries derived from Hfq-co-immunoprecipitated RNA purified from a second Typhimurium strain (Sittka et al. (2008) PLoS Genetics 4: e1000163). Sixteen potential novel sRNAs computationally predicted and detected in deep sequencing libraries were selected for experimental validation by Northern analysis using total RNA isolated from bacteria grown under eleven different growth conditions. RNA bands of expected sizes were detected in Northern blots for six of the examined candidates. Furthermore, the 5'-ends of these six Northern-supported sRNA candidates were successfully mapped using 5'-RACE analysis. 
Conclusions/Significance: We have developed, computationally examined and experimentally validated the sRNAscanner algorithm. Data derived from this study have successfully identified six novel S. Typhimurium sRNA genes. In addition, the computational specificity analysis we have undertaken suggests that approximately 40% of sRNAscanner hits with a high cumulative sum of scores represent genuine, undiscovered sRNA genes. Collectively, these data strongly support the utility of sRNAscanner and offer a glimpse of its potential to reveal large numbers of sRNA genes that have to date defied identification. sRNAscanner is available from: http://bicmku.in:8081/sRNAscanner or http://cluster.physics.iisc.ernet.in/sRNAscanner/.
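The sliding-window scan with position weight matrices described in this abstract can be illustrated with a minimal sketch. This is not the sRNAscanner implementation: the function names `build_pwm` and `scan`, the pseudocount, the uniform background, the toy training sites and the score threshold are all assumptions chosen for illustration.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites.

    Assumption: uniform background and a simple additive pseudocount.
    """
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in "ACGT"})
    return pwm

def scan(sequence, pwm, threshold):
    """Slide a window of len(pwm) along the sequence and keep windows scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Hypothetical toy data: four promoter-like hexamers and a short sequence to scan.
toy_sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
toy_pwm = build_pwm(toy_sites)
toy_hits = scan("GGCGTATAATGCCG", toy_pwm, threshold=5.0)
```

With these toy inputs the scan reports a single window, starting at 0-based position 4, where the sequence contains TATAAT. A real tool would presumably combine promoter and terminator matrices and constrain the distance between their hits; that step is not sketched here.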
Abstract:
Background. Several types of networks, such as transcriptional, metabolic or protein-protein interaction networks of various organisms have been constructed, that have provided a variety of insights into metabolism and regulation. Here, we seek to exploit the reaction-based networks of three organisms for comparative genomics. We use concepts from spectral graph theory to systematically determine how differences in basic metabolism of organisms are reflected at the systems level and in the overall topological structures of their metabolic networks. Methodology/Principal Findings. Metabolome-based reaction networks of Mycobacterium tuberculosis, Mycobacterium leprae and Escherichia coli have been constructed based on the KEGG LIGAND database, followed by graph spectral analysis of the network to identify hubs as well as the sub-clustering of reactions. The shortest and alternate paths in the reaction networks have also been examined. Sub-cluster profiling demonstrates that reactions of the mycolic acid pathway in mycobacteria form a tightly connected sub-cluster. Identification of hubs reveals reactions involving glutamate to be central to mycobacterial metabolism, and pyruvate to be at the centre of the E. coli metabolome. The analysis of shortest paths between reactions has revealed several paths that are shorter than well established pathways. Conclusions. We conclude that severe downsizing of the leprae genome has not significantly altered the global structure of its reaction network but has reduced the total number of alternate paths between its reactions while keeping the shortest paths between them intact. The hubs in the mycobacterial networks that are absent in the human metabolome can be explored as potential drug targets. This work demonstrates the usefulness of constructing metabolome based networks of organisms and the feasibility of their analyses through graph spectral methods. 
The insights obtained from such studies provide a broad overview of the similarities and differences between organisms, taking comparative genomics studies to a higher dimension.
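As a rough illustration of the two analyses named in this abstract (spectral hub identification and shortest paths between reactions), here is a minimal sketch on a toy network. The abstract does not say which spectral quantity the authors used; taking the principal eigenvector of the adjacency matrix is an assumption, as are the reaction labels `R1`..`R5`, the edge list, and the convention that two reactions are linked when they share a metabolite.

```python
from collections import deque

def principal_eigenvector(adj, iterations=200):
    """Power iteration on A + I; the identity shift avoids oscillation on bipartite graphs."""
    nodes = sorted(adj)
    x = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        nxt = {n: x[n] + sum(x[m] for m in adj[n]) for n in nodes}
        norm = sum(v * v for v in nxt.values()) ** 0.5
        x = {n: v / norm for n, v in nxt.items()}
    return x

def shortest_path(adj, source, target):
    """Breadth-first search; returns the node list of one shortest path, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in adj[node]:
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None

# Hypothetical reaction network: an edge means two reactions share a metabolite.
toy_edges = [("R1", "R2"), ("R1", "R3"), ("R1", "R4"), ("R4", "R5"), ("R2", "R3")]
toy_adj = {}
for a, b in toy_edges:
    toy_adj.setdefault(a, set()).add(b)
    toy_adj.setdefault(b, set()).add(a)

centrality = principal_eigenvector(toy_adj)
hub = max(centrality, key=centrality.get)      # node with the largest eigenvector component
path = shortest_path(toy_adj, "R5", "R2")      # one shortest route between two reactions
```

In this toy network the hub is `R1` and the shortest route from `R5` to `R2` passes through `R4` and `R1`. Sub-clustering of reactions, which the abstract also mentions, would need further spectral steps that are not shown.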
Abstract:
The use of copolymer and polymer blends widened the possibility of creating materials with multilayered architectures. Hierarchical polymer systems with a wide array of micro and nanostructures are generated by thermally induced phase separation (TIPS) in partially miscible polymer blends. Various parameters like the interaction between the polymers, concentration, solvent/non-solvent ratio, and quenching temperature have to be optimized to obtain these micro/nanophase structures. Alternatively, the addition of nanoparticles is another strategy to design materials with desired hetero-phase structures. The dynamics of the polymer nanocomposite depends on the statistical ordering of polymers around the nanoparticle, which is dependent on the shape of the nanoparticle. The entropic loss due to deformation of polymer chains, like the repulsive interactions due to coiling and the attractive interactions in the case of swelling, has been highlighted in this perspective article. The dissipative particle dynamics has been discussed and is correlated with the molecular dynamics simulation in the case of polymer blends. The Cahn-Hilliard-Cook model on variedly shaped immobile fillers has shown differences in the propagation of the composition wave. The nanoparticle shape has a contributing effect on the polymer particle interaction, which can change the miscibility window in the case of these phase separating polymer blends. Quantitative information on the effect of spherical particles on the demixing temperature is well established and further modified to explain the percolation of rod shaped particles in the polymer blends. These models correlate well with the experimental observations with respect to the dynamics induced by the nanoparticle in the demixing behavior of the polymer blend.
The miscibility of the LCST polymer blend depends on the enthalpic factors like the specific interaction between the components, and the solubility product and the entropic losses occurring due to the formation of any favorable interactions. Hence, it is essential to assess the entropic and enthalpic interactions induced by the nanoparticles independently. The addition of nanoparticles creates heterogeneity in the polymer phase in which they are localized. This can be observed as an alteration in the relaxation behavior of the polymer. This changes the demixing behavior and the interaction parameter between the polymers. The compositional changes induced by the incorporation of nanoparticles are also cited as a reason for the altered demixing temperature. The particle shape anisotropy causes a direction dependent depletion, which changes the phase behavior of the blend. The polymer-grafted nanoparticles with varying grafting density show tremendous variation in the miscibility of the blend. The stretching of the polymer chains grafted on the nanoparticles causes an entropy penalty in the polymer blend. A comparative study of differently shaped particles is not available to date for understanding these aspects. Hence, we have juxtaposed the various computational studies on nanoparticle dynamics, the shape effect of NPs on homopolymers and also the cases of various polymer blends without nanoparticles to sketch a complete picture of the effect of various particles on the miscibility of LCST blends.
Abstract:
This handbook on chalk river nature conservation and management, produced in March 1999 by the Water Research Centre and commissioned by English Nature and the Environment Agency, primarily provides an objective basis for formulating conservation strategies for relevant Sites of Special Scientific Interest (SSSIs) and Special Areas of Conservation (SACs). It was also seen as being applicable to chalk rivers more generally and has increasingly been regarded as important to the work of the Biodiversity Action Plan Steering Group on chalk rivers, which is led by the Environment Agency. This report contains information on characteristic wildlife communities, their habitat requirements and the ecological impact of activities that are relevant to the chalk river environment. It provides guidance on setting management objectives, options for mitigating impacts, and measures for maintaining and enhancing the river channel and the associated riparian and floodplain areas. The term 'chalk river' is used to describe watercourses dominated by groundwater discharge from chalk geology, including those that flow over a range of non-chalk surface geologies at various points along their length. England contains numerous examples of this river type, located in and downstream of areas of outcropping chalk in the south, East Anglia and up into Lincolnshire and Yorkshire. Indeed, England has the major part of the chalk river resource of Europe. A number of chalk rivers have been designated as SSSIs, and English Nature and the Environment Agency are drawing up joint conservation strategies.
Abstract:
Background: Due to the advances of high throughput technology and data-collection approaches, we are now in an unprecedented position to understand the evolution of organisms. Great efforts have characterized many individual genes responsible for the interspecies divergence, yet little is known about the genome-wide divergence at a higher level. Modules, serving as the building blocks and operational units of biological systems, provide more information than individual genes. Hence, the comparative analysis between species at the module level would shed more light on the mechanisms underlying the evolution of organisms than the traditional comparative genomics approaches. Results: We systematically identified the tissue-related modules using the iterative signature algorithm (ISA), and we detected 52 and 65 modules in the human and mouse genomes, respectively. The gene expression patterns indicate that all of these predicted modules have a high possibility of serving as real biological modules. In addition, we defined a novel quantity, "total constraint intensity,'' a proxy of multiple constraints (of co-regulated genes and tissues where the co-regulation occurs) on the evolution of genes in module context. We demonstrate that the evolutionary rate of a gene is negatively correlated with its total constraint intensity. Furthermore, there are modules coding the same essential biological processes, while their gene contents have diverged extensively between human and mouse. Conclusions: Our results suggest that unlike the composition of module, which exhibits a great difference between human and mouse, the functional organization of the corresponding modules may evolve in a more conservative manner. Most importantly, our findings imply that similar biological processes can be carried out by different sets of genes from human and mouse, therefore, the functional data of individual genes from mouse may not apply to human in certain occasions.
Abstract:
The human genome project has been recently complemented by whole-genome assessment sequence of 32 mammals and 24 nonmammalian vertebrate species suitable for comparative genomic analyses. Here we anticipate a precipitous drop in costs and increase in sequ
Abstract:
RNA hairpins containing UNCG, GNRA or CUUG loops (N = A, U, C or G; R = G or A) are unusually thermodynamically stable and conserved structures. These hairpin loops have distinctive structural features and play important roles in vivo. They are prevalent in rRNA, catalytic RNA and non-coding mRNA. However, the 5' C(UUCG)G 3' hairpin is not found in the folding structure of 88 human mRNA coding regions. These mRNAs also differ from rRNA in that there is no preference for certain sequences among tetraloops in the 88 mRNA folding structures.
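The loop families named in this abstract are written in IUPAC notation, which can be expanded into regular expressions to locate candidate tetraloop sequences. The sketch below only finds sequence matches; it does not predict folding, so a hit is not evidence that a hairpin forms. The function names and the toy sequence are assumptions for illustration.

```python
import re

# IUPAC codes used in the abstract: N = any base, R = purine (G or A).
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "[ACGU]", "R": "[AG]"}

def motif_to_regex(motif):
    """Compile an IUPAC motif into a lookahead regex so overlapping matches are reported."""
    return re.compile("(?=(" + "".join(IUPAC[code] for code in motif) + "))")

def find_tetraloops(rna, motifs=("UNCG", "GNRA", "CUUG")):
    """Return sorted (position, motif family, matched sequence) tuples, 0-based."""
    hits = []
    for motif in motifs:
        for match in motif_to_regex(motif).finditer(rna):
            hits.append((match.start(), motif, match.group(1)))
    return sorted(hits)

# Hypothetical toy RNA: a short stem flanking a UUCG loop.
toy_hits = find_tetraloops("GGACUUCGGUCC")
```

For the toy sequence, the only match is the UUCG loop of the UNCG family at 0-based position 4. Confirming that a match sits in a hairpin would require a secondary-structure prediction step, which is outside this sketch.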
Abstract:
The full-length cDNA sequence (3219 base pairs) of the trehalose-6-phosphate synthase gene of Porphyra yezoensis (PyTPS) was isolated by RACE-PCR and deposited in GenBank (NCBI) with the accession number AY729671. PyTPS encodes a protein of 908 amino acids before a stop codon, and has a calculated molecular mass of 101,591 Daltons. The PyTPS protein consists of a TPS domain in the N-terminus and a putative TPP domain at the C-terminus. Homology alignment for PyTPS and the TPS proteins from bacteria, yeast and higher plants indicated that the most closely related sequences to PyTPS were those from higher plants (OsTPS and AtTPS5), whereas the most distant sequence to PyTPS was from bacteria (EcOtsAB). Based on the identified sequence of the PyTPS gene, PCR primers were designed and used to amplify the TPS genes from nine other seaweed species. Sequences of the nine obtained TPS genes were deposited in GenBank (NCBI). All 10 TPS genes encoded peptides of 908 amino acids and the sequences were highly conserved both in nucleotide composition (>94%) and in amino acid composition (>96%). Unlike the TPS genes from some other plants, there was no intron in any of the 10 isolated seaweed TPS genes.
Abstract:
Robert Hasterok, Agnieszka Marasek, Iain S. Donnison, Ian Armstead, Ann Thomas, Ian P. King, Elzbieta Wolny, Dominika Idziak, John Draper and Glyn Jenkins (2006). Alignment of the genomes of Brachypodium distachyon and temperate cereals and grasses using bacterial artificial chromosome landing with fluorescence in situ hybridization. Genetics, 173 (1), 349-362. Sponsorship: Royal Society / BBSRC; BBSRC RAE2008