991 results for Partial genomic libraries


Relevance:

30.00%

Abstract:

Computational biology increasingly demands the sharing of sophisticated data and annotations between research groups. Web 2.0 style sharing and publication requires that biological systems be described in well-defined, yet flexible and extensible formats which enhance exchange and re-use. In contrast to many of the standards for exchange in the genomic sciences, descriptions of biological sequences show a great diversity in format and function, impeding the definition and exchange of sequence patterns. In this presentation, we introduce BioPatML, an XML-based pattern description language that supports a wide range of patterns and allows the construction of complex, hierarchically structured patterns and pattern libraries. BioPatML unifies the diversity of current pattern description languages and fills a gap in the set of XML-based description languages for biological systems. We discuss the structure and elements of the language, and demonstrate its advantages on a series of applications, showing lightweight integration between the BioPatML parser and search engine, and the SilverGene genome browser. We conclude by describing our site to enable large scale pattern sharing, and our efforts to seed this repository.

Relevance:

30.00%

Abstract:

Chlamydia pneumoniae is a common human and animal pathogen associated with a wide range of upper and lower respiratory tract infections. In recent years there has been increasing evidence to suggest a link between C. pneumoniae and chronic diseases in humans, including atherosclerosis, stroke and Alzheimer’s disease. C. pneumoniae human strains show little genetic variation, indicating that the human-derived strain originated from a common ancestor in the recent past. Despite extensive information on the genetics and morphology of the human strain, strains from many other hosts (including marsupials, amphibians, reptiles and equines) remain virtually unexplored. The koala (Phascolarctos cinereus) is a native Australian marsupial under threat due to habitat loss, predation and disease. Koalas are very susceptible to chlamydial infections, most commonly affecting the conjunctiva, urogenital tract and/or respiratory tract. To address this gap in the literature, the present study (i) provides a detailed description of the morphologic and genomic architecture of the C. pneumoniae koala (and human) strain, and shows that the koala strain is microscopically, developmentally and genetically distinct from the C. pneumoniae human strain, and (ii) examines the genetic relationship of geographically diverse C. pneumoniae isolates from human, marsupial, amphibian, reptilian and equine hosts, and identifies two distinct lineages that have arisen from animal-to-human cross-species transmissions. Chapter One of this thesis explores the scientific problem and aims of this study, while Chapter Two provides a detailed literature review of the background in this field of work. Chapter Three, the first results chapter, describes the morphology and developmental stages of C. pneumoniae koala isolate LPCoLN, as revealed by fluorescence and transmission electron microscopy.
The profile of this isolate, when cultured in HEp-2 human epithelial cells, was quite different to the human AR39 isolate. Koala LPCoLN inclusions were larger; the elementary bodies did not have the characteristic pear-shaped appearance, and the developmental cycle was completed within a shorter period of time (as confirmed by quantitative real-time PCR). These in vitro findings might reflect biological differences between koala LPCoLN and human AR39 in vivo. Chapter Four describes the complete genome sequence of the koala respiratory pathogen, C. pneumoniae LPCoLN. This is the first animal isolate of C. pneumoniae to be fully-sequenced. The genome sequence provides new insights into genomic ‘plasticity’ (organisation), evolution and biology of koala LPCoLN, relative to four complete C. pneumoniae human genomes (AR39, CWL029, J138 and TW183). Koala LPCoLN contains a plasmid that is not shared with any of the human isolates, there is evidence of gene loss in nucleotide salvage pathways, and there are 10 hot spot genomic regions of variation that were previously not identified in the C. pneumoniae human genomes. Sequence (partial-length) from a second, independent, wild koala isolate (EBB) at several gene loci confirmed that the koala LPCoLN isolate was representative of a koala C. pneumoniae strain. The combined sequence data provides evidence that the C. pneumoniae animal (koala LPCoLN) genome is ancestral to the C. pneumoniae human genomes and that human infections may have originated from zoonotic infections. Chapter Five examines key genome components of the five C. pneumoniae genomes in more detail. This analysis reveals genomic features that are shared by and/or contribute to the broad ecological adaptability and evolution of C. pneumoniae. 
This analysis resulted in the identification of 65 gene sequences for further analysis of intraspecific variation, and revealed some interesting differences, including fragmentation, truncation and gene decay (loss of redundant ancestral traits). This study provides valuable insights into metabolic diversity, adaptation and evolution of C. pneumoniae. Chapter Six utilises a subset of 23 target genes identified from the previous genomic comparisons and makes a significant contribution to our understanding of genetic variability among C. pneumoniae human (11) and animal (6 amphibian, 5 reptilian, 1 equine and 7 marsupial hosts) isolates. It has been shown that the animal isolates are genetically diverse, unlike the human isolates that are virtually clonal. More convincing evidence that C. pneumoniae originated in animals and recently (in the last few hundred thousand years) crossed host species to infect humans is provided in this study. It is proposed that two animal-to-human cross species events have occurred in the context of the results, one evident by the nearly clonal human genotype circulating in the world today, and the other by a more animal-like genotype apparent in Indigenous Australians. Taken together, these data indicate that the C. pneumoniae koala LPCoLN isolate has morphologic and genomic characteristics that are distinct from the human isolates. These differences may affect the survival and activity of the C. pneumoniae koala pathogen in its natural host, in vivo. This study, by utilising the genetic diversity of C. pneumoniae, identified new genetic markers for distinguishing human and animal isolates. However, not all C. pneumoniae isolates were genetically diverse; in fact, several isolates were highly conserved, if not identical in sequence (i.e. Australian marsupials) emphasising that at some stage in the evolution of this pathogen, there has been an adaptation/s to a particular host, providing some stability in the genome. 
The outcomes of this study by experimental and bioinformatic approaches have significantly enhanced our knowledge of the biology of this pathogen and will advance opportunities for the investigation of novel vaccine targets, antimicrobial therapy, or blocking of pathogenic pathways.

Relevance:

30.00%

Abstract:

In this paper, we propose a novel relay ordering and scheduling strategy for the sequential slotted amplify-and-forward (SAF) protocol and evaluate its performance in terms of diversity-multiplexing trade-off (DMT). The relays between the source and destination are grouped into two relay clusters based on their respective locations. The proposed strategy achieves partial relay isolation and decreases the decoding complexity at the destination. We show that the DMT upper bound of sequential-SAF with the proposed strategy outperforms other amplify-and-forward protocols and is more practical compared to the relay isolation assumption made in the original paper [1]. Simulation results show that the sequential-SAF protocol with the proposed strategy has better outage performance than the existing AF and non-cooperative protocols in the high-SNR regime.

Relevance:

30.00%

Abstract:

Subterranean clover stunt disease is an economically important aphid-borne virus disease affecting certain pasture and grain legumes in Australia. The virus associated with the disease, subterranean clover stunt virus (SCSV), was previously found to be representative of a new type of single-stranded DNA virus. Analysis of the virion DNA and restriction mapping of double-stranded cDNA synthesized from virion DNA suggested that SCSV has a segmented genome composed of 3 or 4 different species of circular ssDNA each of about 850-880 nucleotides. To further investigate the complexity of the SCSV genome, we have isolated the replicative form DNA from infected pea and from it prepared putative full-length clones representing the SCSV genome segments. Analysis of these clones by restriction mapping indicated that clones representing at least 4 distinct genomic segments were obtained. This method is thus suitable for generating an extensive genomic library of novel ssDNA viruses containing multiple genome segments such as SCSV and banana bunchy top virus. The N-terminal amino acid sequence and amino acid composition of the coat protein of SCSV were determined. Comparison of the amino acid sequence with partial DNA sequence data, and the distinctly different restriction maps obtained for the full-length clones suggested that only one of these clones contained the coat protein gene. The results confirmed that SCSV has a functionally divided genome composed of several distinct ssDNA circles each of about 1 kb.

Relevance:

30.00%

Abstract:

This article documents the public availability of (i) transcriptome sequence data, assembled and annotated contigs and unigenes, and BLAST hits from the Queensland fruit fly, Bactrocera tryoni; (ii) 75 single-nucleotide variants (SNVs) from 454 sequencing of reduced representation libraries for Phalangiidae harvestmen, Megabunus armatus, Megabunus vignai, Megabunus lesserti, and Rilaena triangularis; and (iii) expressed sequence tags from 454 sequencing of the lepidopterans Lymantria dispar and Lymantria monacha.

Relevance:

30.00%

Abstract:

Striped catfish (Pangasianodon hypophthalmus) is a commercially important freshwater fish used in inland aquaculture in the Mekong Delta, Vietnam. The culture industry, however, faces a significant challenge from saltwater intrusion into many low-lying coastal provinces across the Mekong Delta as a result of predicted climate change impacts. Developing genomic resources for this species can facilitate the production of improved culture lines that can withstand raised salinity conditions, and so we have applied high-throughput Ion Torrent sequencing of transcriptome libraries from six target osmoregulatory organs from striped catfish as a genomic resource for use in future selection strategies. We obtained 12,177,770 reads after trimming and processing, with an average length of 97 bp. De novo assemblies were generated using CLC Genomics Workbench, Trinity and Velvet/Oases, with the best overall contig performance resulting from the CLC assembly. De novo assembly using CLC yielded 66,451 contigs with an average length of 478 bp and N50 length of 506 bp. A total of 37,969 contigs (57%) possessed significant similarity with proteins in the non-redundant database. Comparative analyses revealed that a significant number of contigs matched sequences reported in other teleost fishes, ranging in similarity from 45.2% with Atlantic cod to 52% with zebrafish. In addition, 28,879 simple sequence repeats (SSRs) and 55,721 single nucleotide polymorphisms (SNPs) were detected in the striped catfish transcriptome. The sequence collection generated in the current study represents the most comprehensive genomic resource for P. hypophthalmus available to date. Our results illustrate the utility of next-generation sequencing as an efficient tool for constructing a large genomic database for marker development in non-model species.
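
The abstract above reports an N50 length for the CLC assembly. As a reminder of what that statistic measures, here is a minimal Python sketch of the usual definition (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembly size). The function name `n50` and the toy contig lengths are illustrative assumptions, not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the cumulative length first reaches half of the total.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths): total = 32, half = 16,
# and the two longest contigs (8 + 8) already reach 16.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because N50 is weighted by length, it can exceed the mean contig length, as it does in the figures quoted above (506 bp versus 478 bp).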

Relevance:

30.00%

Abstract:

Genomic regions influencing resistance to powdery mildew [Blumeria graminis (DC.) E.O. Speer f. sp. hordei Em. Marchal] were detected in a doubled haploid (DH) barley (Hordeum vulgare L.) population derived from a cross between the breeding line ND24260 and cultivar Flagship when evaluated across four field environments in Australia and Uruguay. Significant quantitative trait loci (QTL) for resistance to B. graminis were detected on six of the seven chromosomes (1H, 2H, 3H, 4H, 5H, and 7H). A QTL with large effect donated by ND24260 mapped to the short arm of chromosome 1H (1HS) conferring near immunity to B. graminis in Australia but was ineffective in Uruguay. Three QTL donated by Flagship contributed partial resistance to B. graminis and were detected in at least two environments. These QTL were mapped to chromosomes 3H, 4H, and 5H (5HS) accounting for up to 18.6, 3.4, and 8.8% phenotypic variation, respectively. The 5HS QTL contributed partial resistance to B. graminis in all field environments in both Australia and Uruguay and aligned with the genomic region of Rph20, a gene conferring adult plant resistance (APR) to leaf rust (Puccinia hordei Otth), which is found in some cultivars having 'Vada' or 'Emir' in their parentage. Selection for favorable marker haplotypes within the 3H, 4H, and 5H QTL regions can be performed even in the presence of single (major) gene resistance. Pyramiding such QTL may provide an effective and potentially durable form of resistance to B. graminis.

Relevance:

30.00%

Abstract:

The genetic diversity of Puumala hantavirus (PUUV) was studied in a local population of its natural host, the bank vole (Myodes glareolus). The trapping area (2.5x2.5 km) at Konnevesi, Central Finland, included 14 trapping sites, at least 500 m apart; altogether, 147 voles were captured during May and October 2005. Partial sequences of the S, M and L viral genome segments were recovered from 40 animals. Seven, 12 and 17 variants were detected for the S, M and L sequences, respectively; these represent new wild-type PUUV strains that belong to the Finnish genetic lineage. The genetic diversity of PUUV strains from Konnevesi was 0.2-4.9% for the S segment, 0.2-4.8% for the M segment and 0.2-9.7% for the L segment. Most nucleotide substitutions were synonymous and most deduced amino acid substitutions were conservative, probably due to strong stabilizing selection operating at the protein level. Based on both sequence markers and phylogenetic clustering, the S, M and L sequences could be assigned to two groups, 'A' and 'B'. Notably, not all bank voles carried S, M and L sequences belonging to the same group, i.e. SAMALA or SBMBLB. A substantial proportion (8/40, 20%) of the newly characterized PUUV strains possessed reassortant genomes such as SBMALA, SAMBLB or SBMALB. These results suggest that at least some of the PUUV reassortants are viable and can survive in the presence of their parental strains.

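The reassortant labels in the Puumala hantavirus abstract above follow a simple rule: each of the S, M and L segments is assigned to group A or B, and a genome is reassortant when the three segments do not all share one group. A minimal Python sketch of that rule follows. The function names and the per-animal group assignments are hypothetical, made up for illustration; only the 8/40 proportion comes from the abstract.

```python
def classify_genotype(s_group, m_group, l_group):
    """Return (label, is_reassortant) for one virus genome.

    The label concatenates segment and group letters, e.g. 'SBMALA';
    a genome is reassortant when its segments fall in more than one group.
    """
    groups = (s_group, m_group, l_group)
    label = "".join(seg + grp for seg, grp in zip("SML", groups))
    return label, len(set(groups)) > 1


def reassortant_fraction(genotypes):
    """Fraction of (S, M, L) group triples that are reassortant."""
    flags = [classify_genotype(*g)[1] for g in genotypes]
    return sum(flags) / len(flags)


# Hypothetical assignments chosen only to reproduce the reported 8/40 ratio.
sample = [("A", "A", "A")] * 32 + [("B", "A", "A")] * 8
print(classify_genotype("B", "A", "A"))
print(reassortant_fraction(sample))
```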
Relevance:

30.00%

Abstract:

Largemouth bronze gudgeon (Coreius guichenoti) is a medium-sized fish endemic to the upper Yangtze River of China, and its survival is threatened by the construction of the Three Gorges Dam. This study reports 20 new polymorphic microsatellites from a repeat-enriched genomic library, with a mean allele number of 5.2 and observed and expected heterozygosities ranging from 0.035 to 1 and from 0.13 to 0.917, respectively. In a cross-species amplification test, nine of the 37 tested loci were also found to be polymorphic in a congeneric species, brass gudgeon (C. heterodon). In addition, four other loci from common carp (Cyprinus carpio) were also polymorphic in C. guichenoti. Of these 24 polymorphic microsatellites, only three loci significantly deviated from Hardy-Weinberg equilibrium in the sampled population (P < 0.0025), and all pairwise tests for linkage disequilibrium among loci were nonsignificant after applying sequential Bonferroni correction (P > 0.0026). These novel microsatellites provide sufficient levels of polymorphism for studies on population genetics and conservation in C. guichenoti and its related species.
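
The abstract above mentions a sequential Bonferroni correction. Assuming this refers to the Holm step-down procedure (the usual meaning of the term in population genetics), the idea can be sketched in Python as below. The function name `holm_bonferroni` and the example p-values are illustrative assumptions and are not taken from the study.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm step-down (sequential Bonferroni) test decisions.

    P-values are visited from smallest to largest. The k-th smallest
    (k = 0, 1, ...) is compared with alpha / (m - k); testing stops at the
    first p-value that fails, and all remaining hypotheses are retained.
    Returns a list of booleans in the original order (True = rejected).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            break
    return rejected


# Hypothetical p-values for four tests.
print(holm_bonferroni([0.001, 0.04, 0.03, 0.005]))
```

For a family of 20 tests at alpha = 0.05, the first (strictest) threshold in this procedure is 0.05 / 20 = 0.0025.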

Relevance:

30.00%

Abstract:

We present the Unified Form Language (UFL), which is a domain-specific language for representing weak formulations of partial differential equations with a view to numerical approximation. Features of UFL include support for variational forms and functionals, automatic differentiation of forms and expressions, arbitrary function space hierarchies for multifield problems, general differential operators and flexible tensor algebra. With these features, UFL has been used to effortlessly express finite element methods for complex systems of partial differential equations in near-mathematical notation, resulting in compact, intuitive and readable programs. We present in this work the language and its construction. An implementation of UFL is freely available as an open-source software library. The library generates abstract syntax tree representations of variational problems, which are used by other software libraries to generate concrete low-level implementations. Some application examples are presented and libraries that support UFL are highlighted. © 2014 ACM.

Relevance:

30.00%

Abstract:

The structural changes of genomic DNA upon interaction with small molecules have been studied in real time using dual-polarization interferometry (DPI). Native or thermally denatured DNA was immobilized on the silicon oxynitride surface via a preadsorbed poly(ethylenimine) (PEI) layer. The mass loading was similar for both types of DNA; however, native DNA formed a looser and thicker layer due to its rigidity, unlike the more flexible denatured DNA, which mixed with PEI to form a denser and thinner layer. Ethidium bromide (EtBr), a classical intercalator, induced a large decrease in thickness and increase in density for native (double-stranded) DNA, but a slight increase in both the thickness and density of denatured (partially single-stranded) DNA.

Relevance:

30.00%

Abstract:

Two large-insert genomic bacterial artificial chromosome (BAC) libraries of Zhikong scallop Chlamys farreri were constructed to promote our genetic and genomic research. High-quality megabase-sized DNA was isolated from the adductor muscle of the scallop and partially digested by BamH I and Mbo I, respectively. The BamH I library consisted of 53 760 clones while the Mbo I library consisted of 7 680 clones. Approximately 96% of the clones in the BamH I library contained nuclear DNA inserts with an average size of 100 kb, providing a coverage of 5.3 haploid genome equivalents. Similarly, the Mbo I library had an average insert size of 145 kb and no insert-empty clones, thus providing a genome coverage of 1.1 haploid genome equivalents.
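
The coverage figures in the abstract above follow from a simple ratio: the number of insert-containing clones times the mean insert size, divided by the haploid genome size. The Python sketch below illustrates that arithmetic. The abstract does not state the genome size, so the 1.0 Gb value used here is a placeholder assumption, and the function name `genome_coverage` is likewise hypothetical; with a different genome size the results shift proportionally.

```python
def genome_coverage(n_clones, mean_insert_bp, genome_size_bp, insert_fraction=1.0):
    """Library coverage in haploid genome equivalents.

    n_clones        total clones in the library
    mean_insert_bp  average insert size in base pairs
    genome_size_bp  haploid genome size in base pairs
    insert_fraction share of clones that actually carry a nuclear DNA insert
    """
    return n_clones * insert_fraction * mean_insert_bp / genome_size_bp


# ASSUMPTION: a 1.0 Gb haploid genome, used only as a round placeholder.
ASSUMED_GENOME_BP = 1.0e9

bamh1 = genome_coverage(53_760, 100_000, ASSUMED_GENOME_BP, insert_fraction=0.96)
mbo1 = genome_coverage(7_680, 145_000, ASSUMED_GENOME_BP)
print(f"BamH I library: {bamh1:.2f}x, Mbo I library: {mbo1:.2f}x")
```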

Relevance:

30.00%

Abstract:

BACKGROUND: The evolutionary relationships of modern birds are among the most challenging to understand in systematic biology and have been debated for centuries. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders, and used the genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomics analyses (Jarvis et al. in press; Zhang et al. in press). Here we release assemblies and datasets associated with the comparative genome analyses, which include 38 newly sequenced avian genomes plus previously released or simultaneously released genomes of Chicken, Zebra finch, Turkey, Pigeon, Peregrine falcon, Duck, Budgerigar, Adelie penguin, Emperor penguin and the Medium Ground Finch. We hope that this resource will serve future efforts in phylogenomics and comparative genomics. FINDINGS: The 38 bird genomes were sequenced using the Illumina HiSeq 2000 platform and assembled using a whole genome shotgun strategy. The 48 genomes were categorized into two groups according to the N50 scaffold size of the assemblies: a high depth group comprising 23 species sequenced at high coverage (>50X) with multiple insert size libraries resulting in N50 scaffold sizes greater than 1 Mb (except the White-throated Tinamou and Bald Eagle); and a low depth group comprising 25 species sequenced at low coverage (~30X) with two insert size libraries resulting in an average N50 scaffold size of about 50 kb. Repetitive elements comprised 4%-22% of the bird genomes. The assembled scaffolds allowed the homology-based annotation of 13,000 to 17,000 protein-coding genes in each avian genome relative to chicken, zebra finch and human, as well as comparative and sequence conservation analyses.
CONCLUSIONS: Here we release full genome assemblies of 38 newly sequenced avian species, link genome assembly downloads for 7 of the remaining 10 species, and provide a guideline of genomic data that has been generated and used in our Avian Phylogenomics Project. To the best of our knowledge, the Avian Phylogenomics Project is the biggest vertebrate comparative genomics project to date. The genomic data presented here is expected to accelerate further analyses in many fields, including phylogenetics, comparative genomics, evolution, neurobiology, developmental biology, and other related areas.

Relevance:

30.00%

Abstract:

Transcriptional regulation has been studied intensively in recent decades. One important aspect of this regulation is the interaction between regulatory proteins, such as transcription factors (TF) and nucleosomes, and the genome. Different high-throughput techniques have been invented to map these interactions genome-wide, including ChIP-based methods (ChIP-chip, ChIP-seq, etc.), nuclease digestion methods (DNase-seq, MNase-seq, etc.), and others. However, a single experimental technique often only provides partial and noisy information about the whole picture of protein-DNA interactions. Therefore, the overarching goal of this dissertation is to provide computational developments for jointly modeling different experimental datasets to achieve a holistic inference on the protein-DNA interaction landscape.

We first present a computational framework that can incorporate the protein binding information in MNase-seq data into a thermodynamic model of protein-DNA interaction. We use a correlation-based objective function to model the MNase-seq data and a Markov chain Monte Carlo method to maximize the function. Our results show that the inferred protein-DNA interaction landscape is concordant with the MNase-seq data and provides a mechanistic explanation for the experimentally collected MNase-seq fragments. Our framework is flexible and can easily incorporate other data sources. To demonstrate this flexibility, we use prior distributions to integrate experimentally measured protein concentrations.

We also study the ability of DNase-seq data to position nucleosomes. Traditionally, DNase-seq has only been widely used to identify DNase hypersensitive sites, which tend to be open chromatin regulatory regions devoid of nucleosomes. We reveal for the first time that DNase-seq datasets also contain substantial information about nucleosome translational positioning, and that existing DNase-seq data can be used to infer nucleosome positions with high accuracy. We develop a Bayes-factor-based nucleosome scoring method to position nucleosomes using DNase-seq data. Our approach utilizes several effective strategies to extract nucleosome positioning signals from the noisy DNase-seq data, including jointly modeling data points across the nucleosome body and explicitly modeling the quadratic and oscillatory DNase I digestion pattern on nucleosomes. We show that our DNase-seq-based nucleosome map is highly consistent with previous high-resolution maps. We also show that the oscillatory DNase I digestion pattern is useful in revealing the nucleosome rotational context around TF binding sites.

Finally, we present a state-space model (SSM) for jointly modeling different kinds of genomic data to provide an accurate view of the protein-DNA interaction landscape. We also provide an efficient expectation-maximization algorithm to learn model parameters from data. We first show in simulation studies that the SSM can effectively recover underlying true protein binding configurations. We then apply the SSM to model real genomic data (both DNase-seq and MNase-seq data). Through incrementally increasing the types of genomic data in the SSM, we show that different data types can contribute complementary information for the inference of protein binding landscape and that the most accurate inference comes from modeling all available datasets.

This dissertation provides a foundation for future research by taking a step toward the genome-wide inference of protein-DNA interaction landscape through data integration.

Relevance:

30.00%

Abstract:

OBJECTIVE: Report of a 16q24.1 deletion in a premature newborn, demonstrating the usefulness of array-based comparative genomic hybridization in persistent pulmonary hypertension of the newborn and multiple congenital malformations. DESIGN: Descriptive case report. SETTING: Genetic department and neonatal intensive care unit of a tertiary care children's hospital. INTERVENTIONS: None. PATIENT: We report the case of a preterm male infant, born at 26 wks of gestation. A cardiac malformation and bilateral hydronephrosis were diagnosed at 19 wks of gestation. Karyotype analysis was normal, and a 22q11.2 microdeletion was excluded by fluorescence in situ hybridization analysis. A cesarean section was performed due to fetal distress. The patient developed persistent pulmonary hypertension unresponsive to mechanical ventilation and nitric oxide treatment and expired at 16 hrs of life. MEASUREMENTS AND MAIN RESULTS: An autopsy revealed partial atrioventricular canal malformation and showed bilateral dilation of the renal pelvocaliceal system with bilateral ureteral stenosis and annular pancreas. Array-based comparative genomic hybridization analysis (Agilent oligoNT 44K, Agilent Technologies, Santa Clara, CA) showed an interstitial microdeletion encompassing the forkhead box gene cluster in 16q24.1. Review of the pulmonary microscopic examination showed the characteristic features of alveolar capillary dysplasia with misalignment of pulmonary veins. Some features were less prominent due to the gestational age. CONCLUSIONS: Our review of the literature shows that alveolar capillary dysplasia with misalignment of pulmonary veins is rare but probably underreported. Prematurity is not a usual presentation, and histologic features are difficult to interpret. In our case, array-based comparative genomic hybridization revealed a 16q24.1 deletion, leading to the final diagnosis of alveolar capillary dysplasia with misalignment of pulmonary veins.
It emphasizes the usefulness of array-based comparative genomic hybridization analysis as a diagnostic tool with implications for both prognosis and management decisions in newborns with refractory persistent pulmonary hypertension and multiple congenital malformations.