846 results for Whole genome sequencing


Relevance:

90.00%

Publisher:

Abstract:

Motivation: A topical issue of great interest, from both a theoretical and an applied perspective, is the analysis of biological sequences to uncover the information they encode. The development of new genome sequencing technologies in recent years has opened new fundamental problems, since huge amounts of biological data still await interpretation. Indeed, sequencing is only the first step of the genome annotation process, which consists in assigning biological information to each sequence. Given the large amount of available data, in silico methods have therefore become useful and necessary for extracting relevant information from sequences. The availability of data from genome projects has given rise to new strategies for tackling the basic problems of computational biology, such as determining the three-dimensional structures of proteins, their biological functions and their reciprocal interactions. Results: The aim of this work was to implement predictive methods that extract information on the properties of genomes and proteins from nucleotide and amino acid sequences, taking advantage of the information provided by comparing genome sequences from different species. The first part of the work describes a comprehensive large-scale genome comparison of 599 organisms. A total of 2.6 million sequences from 551 prokaryotic and 48 eukaryotic genomes were aligned and clustered on the basis of their sequence identity. This procedure led to the identification of classes of proteins that are peculiar to the different groups of organisms. Moreover, the adopted similarity threshold produced clusters that are structurally homogeneous and that can be used for the structural annotation of uncharacterized sequences. 
The second part of the work focuses on the characterization of thermostable proteins and on the development of tools that predict the thermostability of a protein from its sequence. The codon composition of a non-redundant database comprising 116 prokaryotic genomes was analyzed by Principal Component Analysis, showing that a cross-genomic approach can extract common determinants of thermostability at the genome level, with an overall accuracy of 95% in discriminating thermophilic coding sequences. This result outperforms those obtained in previous studies. Moreover, we investigated the effect of multiple mutations on protein thermostability. This issue is of great importance in protein engineering, since thermostable proteins are generally more suitable than their mesostable counterparts for technological applications. A Support Vector Machine-based method was trained to predict whether a set of mutations can enhance the thermostability of a given protein sequence. The developed predictor achieves 88% accuracy.
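
The codon-composition analysis described above needs each coding sequence turned into a numeric vector before any Principal Component Analysis can be run. The following is a minimal sketch of that first step only, assuming relative frequencies over the 64 codons as the feature vector; the abstract does not specify the exact encoding, and the function name `codon_frequencies` is hypothetical.

```python
from collections import Counter
from itertools import product

# Fixed ordering of the 64 codons so every sequence maps to the same vector layout.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(cds):
    """Return the relative frequency of each of the 64 codons in a coding sequence.

    Assumption (not from the abstract): the sequence is read in frame from its
    first base, trailing bases that do not fill a codon are dropped, and codons
    containing characters other than A, C, G, T are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS)
    if total == 0:
        return [0.0] * len(CODONS)
    return [counts[c] / total for c in CODONS]


vector = codon_frequencies("ATGGCTGCTTAA")
print(vector[CODONS.index("GCT")])
```

One such vector per genome or per coding sequence would form the rows of the matrix passed to a PCA routine.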

Relevance:

90.00%

Publisher:

Abstract:

DNA topology is an important modifier of DNA functions. Torsional stress is generated when right-handed DNA is either over- or underwound, producing structural deformations that drive or are driven by processes such as replication, transcription, recombination and repair. DNA topoisomerases are molecular machines that regulate the topological state of DNA in the cell. These enzymes accomplish this task either by passing one strand of the DNA through a break in the opposing strand or by passing a region of the duplex from the same or a different molecule through a double-stranded cut generated in the DNA. Because of their ability to cut one or two strands of DNA, they are also targets for some of the most successful anticancer drugs used in standard combination therapies of human cancers. One effective anticancer drug is camptothecin (CPT), which specifically targets DNA topoisomerase 1 (TOP 1). The research project of the present thesis focused on the role of human TOP 1 during transcription and on the transcriptional consequences of TOP 1 inhibition by CPT in human cell lines. Previous findings demonstrate that TOP 1 inhibition by CPT perturbs RNA polymerase II (RNAP II) density at promoters and along transcribed genes, suggesting an involvement of TOP 1 at the RNAP II promoter-proximal pausing site. Within the transcription cycle, promoter pausing is a fundamental step whose importance has been well established as a means of coupling elongation to RNA maturation. By measuring nascent RNA transcripts bound to chromatin, we demonstrated that TOP 1 inhibition by CPT can enhance RNAP II escape from the promoter-proximal pausing site of the human Hypoxia Inducible Factor 1 (HIF-1) and c-MYC genes in a dose-dependent manner. This effect depends on Cdk7/Cdk9 activities, since it can be reversed by the kinase inhibitor DRB. 
Since CPT affects RNAP II by promoting the hyperphosphorylation of its Rpb1 subunit, the findings suggest that TOP 1 inhibition by CPT may increase the activity of Cdks, which in turn phosphorylate the Rpb1 subunit of RNAP II, enhancing its escape from pausing. Interestingly, the transcriptional consequences of CPT-induced topological stress are wider than expected. CPT increased co-transcriptional splicing of exons 1 and 2 and markedly affected alternative splicing at exon 11. Surprisingly, despite its well-established transcription inhibitory activity, CPT can trigger the production of a novel long RNA (5’aHIF-1) antisense to the human HIF-1 mRNA and of a known antisense RNA at the 3’ end of the gene, while decreasing mRNA levels. These effects require TOP 1 and are independent of CPT-induced DNA damage. Thus, when the supercoiling imbalance promoted by CPT occurs at a promoter, it may trigger deregulation of RNAP II pausing, increased chromatin accessibility and activation/derepression of antisense transcripts in a Cdk-dependent manner. A changed balance of antisense transcripts and mRNAs may regulate the activity of HIF-1 and contribute to the control of tumor progression. After focusing our TOP 1 investigations at the single-gene level, we extended the study to the whole genome by developing the “Topo-Seq” approach, which generates a genome-wide map of sites of TOP 1 activity in human cells. The preliminary data revealed that TOP 1 preferentially localizes at intragenic regions, in particular at the 5’ and 3’ ends of genes. Surprisingly, upon TOP 1 downregulation, which impairs protein expression by 80%, TOP 1 molecules are mostly localized around the 3’ ends of genes, suggesting that its activity is essential at these regions and can be compensated at 5’ ends. 
The developed procedure is a pioneering tool for the detection of TOP 1 cleavage sites across the genome and can open the way to further investigations of the enzyme's roles in different nuclear processes.

Relevance:

90.00%

Publisher:

Abstract:

Background: We present a compendium of N-ethyl-N-nitrosourea (ENU)-induced mouse mutations, identified in our laboratory over a period of 10 years either on the basis of phenotype or whole genome and/or whole exome sequencing, and archived in the Mutagenetix database. Our purpose is threefold: 1) to formally describe many point mutations, including those that were not previously disclosed in peer-reviewed publications; 2) to assess the characteristics of these mutations; and 3) to estimate the likelihood that a missense mutation induced by ENU will create a detectable phenotype. Findings: In the context of an ENU mutagenesis program for C57BL/6J mice, a total of 185 phenotypes were tracked to mutations in 129 genes. In addition, 402 incidental mutations were identified and predicted to affect 390 genes. As previously reported, ENU shows strand asymmetry in its induction of mutations, particularly favoring T to A rather than A to T in the sense strand of coding regions and splice junctions. Some amino acid substitutions are far more likely to be damaging than others, and some are far more likely to be observed. Indeed, from among a total of 494 non-synonymous coding mutations, ENU was observed to create only 114 of the 182 possible amino acid substitutions that single base changes can achieve. Based on differences in overt null allele frequencies observed in phenotypic vs. non-phenotypic mutation sets, we infer that ENU-induced missense mutations create a detectable phenotype only about 1 in 4.7 times. While the remaining mutations may not be functionally neutral, they are, on average, beneath the limits of detection of the phenotypic assays we applied. Conclusions: Collectively, these mutations add to our understanding of the chemical specificity of ENU, the types of amino acid substitutions it creates, and its efficiency in causing phenovariance. 
Our data support the validity of computational algorithms for the prediction of damage caused by amino acid substitutions, and may lead to refined predictions as to whether specific amino acid changes are responsible for observed phenotypes. These data form the basis for closer in silico estimations of the number of genes mutated to a state of phenovariance by ENU within a population of G3 mice.

Relevance:

90.00%

Publisher:

Abstract:

Attention-deficit/hyperactivity disorder (ADHD) is a common, highly heritable neurodevelopmental disorder. Genetic loci have not yet been identified by genome-wide association studies. Rare copy number variations (CNVs), such as chromosomal deletions or duplications, have been implicated in ADHD and other neurodevelopmental disorders. To identify rare (frequency ≤1%) CNVs that increase the risk of ADHD, we performed a whole-genome CNV analysis based on 489 young ADHD patients and 1285 adult population-based controls and identified one significantly associated CNV region. In tests for a global burden of large (>500 kb) rare CNVs, we observed a nonsignificant (P=0.271) 1.126-fold enriched rate of subjects carrying at least one such CNV in the group of ADHD cases. Locus-specific tests of association were used to assess whether there were more rare CNVs in cases than in controls. Detected CNVs that were significantly enriched in the ADHD group were validated by quantitative (q)PCR. Findings were replicated in an independent sample of 386 young patients with ADHD and 781 young population-based healthy controls. We identified rare CNVs within the parkinson protein 2 gene (PARK2) with a significantly higher prevalence in ADHD patients than in controls (P=2.8 × 10^-4 after empirical correction for genome-wide testing). In total, the PARK2 locus (chr 6: 162 659 756-162 767 019) harboured three deletions and nine duplications in the ADHD patients and two deletions and two duplications in the controls. By qPCR analysis, we validated 11 of the 12 CNVs in ADHD patients (P=1.2 × 10^-3 after empirical correction for genome-wide testing). In the replication sample, CNVs at the PARK2 locus were found in four additional ADHD patients and one additional control (P=4.3 × 10^-2). Our results suggest that copy number variants at the PARK2 locus contribute to the genetic susceptibility of ADHD. 
Mutations and CNVs in PARK2 are known to be associated with Parkinson disease. Molecular Psychiatry advance online publication, 20 November 2012; doi:10.1038/mp.2012.161.
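
As a rough illustration of a locus-specific case-control comparison, the sketch below runs a one-sided Fisher exact test on the PARK2 counts quoted above (12 CNVs among 489 patients, 4 among 1285 controls). It assumes each CNV belongs to a different subject, which the abstract does not state. It is also not the empirically corrected genome-wide procedure the authors report, so its P value is not expected to match theirs. The function name `fisher_one_sided` is hypothetical.

```python
from math import comb


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher exact test for enrichment of carriers among cases.

    Returns the hypergeometric probability of seeing at least `case_carriers`
    carriers among the cases, given the fixed totals of the 2x2 table.
    """
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    denominator = comb(total, carriers)
    upper = min(carriers, n_cases)
    numerator = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / denominator


# Counts taken from the abstract; one CNV per subject is an assumption.
p_value = fisher_one_sided(12, 489, 4, 1285)
print(f"one-sided P = {p_value:.2e}")
```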

Relevance:

90.00%

Publisher:

Abstract:

With the advent of cheaper and faster DNA sequencing technologies, assembly methods have greatly changed. Instead of outputting reads that are thousands of base pairs long, new sequencers parallelize the task by producing reads between 35 and 400 base pairs long. Reconstructing an organism’s genome from these millions of reads is a computationally expensive task. Our algorithm addresses this problem by organizing and indexing the reads using n-grams, which are short, fixed-length DNA sequences of length n. These n-grams are used to efficiently locate putative read joins, thereby eliminating the need to perform an exhaustive search over all possible read pairs. Our goal was to develop a novel n-gram method for the assembly of genomes from next-generation sequencers. Specifically, a probabilistic, iterative approach was used to determine the most likely reads to join, through the development of a new metric that models the probability of any two arbitrary reads being joined together. Tests were run using simulated short-read data based on randomly created genomes ranging in length from 10,000 to 100,000 nucleotides with 16 to 20x coverage. We were able to successfully re-assemble entire genomes up to 100,000 nucleotides in length.
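
The indexing idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: it uses exact suffix/prefix matching in place of their probabilistic join metric, and the function names `build_ngram_index` and `candidate_joins` are hypothetical. It shows how an n-gram index avoids comparing every read against every other read.

```python
from collections import defaultdict


def build_ngram_index(reads, n):
    """Map every length-n substring to the (read_id, offset) pairs where it occurs."""
    index = defaultdict(set)
    for read_id, read in enumerate(reads):
        for offset in range(len(read) - n + 1):
            index[read[offset:offset + n]].add((read_id, offset))
    return index


def candidate_joins(reads, n):
    """Return sorted (left_id, right_id, overlap) triples.

    A triple is reported when a suffix of the left read, at least n bases long,
    exactly equals a prefix of the right read. Only reads sharing the right
    read's first n-gram are inspected, so no all-pairs search is needed.
    """
    index = build_ngram_index(reads, n)
    joins = set()
    for right_id, right in enumerate(reads):
        for left_id, offset in index.get(right[:n], ()):
            if left_id == right_id:
                continue
            left = reads[left_id]
            overlap = len(left) - offset
            # Confirm that the whole suffix of the left read matches, not just the n-gram.
            if overlap <= len(right) and left[offset:] == right[:overlap]:
                joins.add((left_id, right_id, overlap))
    return sorted(joins)


reads = ["ACGTACGGA", "CGGATTCA", "TTCAGGCT"]
print(candidate_joins(reads, 4))  # → [(0, 1, 4), (1, 2, 4)]
```

A real assembler would also have to tolerate sequencing errors and repeats, which is where a probabilistic scoring of candidate joins, as the abstract describes, would replace the exact-match check.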

Relevance:

90.00%

Publisher:

Abstract:

Advances in novel molecular biological diagnostic methods are changing the way metabolic disorders such as growth hormone deficiency are diagnosed and studied. Faster sequencing and genotyping methods require strong bioinformatics tools to make sense of the vast amount of data generated by modern laboratories. Advances in genome sequencing and in the computational power to analyze whole genome sequences will guide the diagnostics of the future. This chapter gives an overview of some basic bioinformatics resources needed to study metabolic disorders and provides examples of bioinformatics analyses of the human growth hormone gene, protein and structure.

Relevance:

90.00%

Publisher:

Abstract:

A genome-wide scan was performed to detect quantitative trait loci (QTLs) for osteochondrosis (OC) and osteochondrosis dissecans (OCD) in horses. The marker set comprised 260 microsatellites. We collected data from 211 Hanoverian warmblood horses consisting of 14 paternal half-sib families. Traits used were OC (fetlock and/or hock joints affected), OCD (fetlock and/or hock joints affected), fetlock OC, fetlock OCD, hock OC, and hock OCD. The first genome scan included 172 microsatellite markers. In a second step 88 additional markers were chosen to refine putative QTLs found in the first scan. Genome-wide significant QTLs were located on equine chromosomes 2, 4, 5, and 16. QTLs for fetlock OC and hock OC partly overlapped on the same chromosomes, indicating that these traits may be genetically related. QTLs reached the chromosome-wide significance level on eight different equine chromosomes: 2, 3, 4, 5, 15, 16, 19, and 21. This whole-genome scan was a first step toward the identification of candidate genome regions harboring genes responsible for equine OC. Further investigations are necessary to refine the map positions of the QTLs already identified for OC.

Relevance:

90.00%

Publisher:

Abstract:

This investigation was based on 23 isolates from several European countries collected over the past 30 years, and included characterization of all isolates. Published data on amplified fragment length polymorphism typing of isolates representing all biovars as well as protein profiles were used to select strains that were then further characterized by polyamine profiling and sequencing of 16S rRNA, infB, rpoB and recN genes. Comparison of 16S rRNA gene sequences revealed a monophyletic group within the avian 16S rRNA group of the Pasteurellaceae, which currently includes the genera Avibacterium, Gallibacterium and Volucribacter. Five monophyletic subgroups related to Gallibacterium anatis were recognized by 16S rRNA, rpoB, infB and recN gene sequence comparisons. Whole-genome similarity between strains of the five subgroups and the type strain of G. anatis calculated from recN sequences allowed us to classify them within the genus Gallibacterium. In addition, phenotypic data including biochemical traits, protein profiling and polyamine patterns clearly indicated that these taxa are related. Major phenotypic diversity was observed for 16S rRNA gene sequence groups. Furthermore, comparison of whole-genome similarities, phenotypic data and published data on amplified fragment length polymorphism and protein profiling revealed that each of the five groups presents unique properties that allow the proposal of three novel species of Gallibacterium, for which we propose the names Gallibacterium melopsittaci sp. nov. (type strain F450(T) =CCUG 36331(T) =CCM 7538(T)), Gallibacterium trehalosifermentans sp. nov. (type strain 52/S3/90(T) =CCUG 55631(T) =CCM 7539(T)) and Gallibacterium salpingitidis sp. nov. (type strain F150(T) =CCUG 15564(T) =CCUG 36325(T) =NCTC 11414(T)), a novel genomospecies 3 of Gallibacterium and an unnamed taxon (group V). An emended description of the genus Gallibacterium is also presented.

Relevance:

90.00%

Publisher:

Abstract:

In eukaryotes, small RNAs (sRNAs) have key roles in development, gene expression regulation, and genome integrity maintenance. In ciliates, such as Paramecium, sRNAs form the heart of an epigenetic system that has evolved from core eukaryotic gene silencing components to selectively target DNA for deletion. In Paramecium, somatic genome development from the germline genome accurately eliminates the bulk of typically gene-interrupting, noncoding DNA. We have discovered an sRNA class (internal eliminated sequence [IES] sRNAs [iesRNAs]), arising later during Paramecium development, which originates from and precisely delineates germline DNA (IESs) and complements the initial sRNAs ("scan" RNAs [scnRNAs]) in targeting DNA for elimination. We show that whole-genome duplications have facilitated successive differentiations of Paramecium Dicer-like proteins, leading to cooperation between Dcl2 and Dcl3 to produce scnRNAs and to the production of iesRNAs by Dcl5. These innovations highlight the ability of sRNA systems to acquire capabilities, including those in genome development and integrity.

Relevance:

90.00%

Publisher:

Abstract:

Ectodermal dysplasias (EDs) are a large and heterogeneous group of hereditary disorders characterized by abnormalities in structures of ectodermal origin. Incontinentia pigmenti (IP) is an ED characterized by skin lesions evolving over time, as well as dental, nail, and ocular abnormalities. Due to X-linked dominant inheritance IP symptoms can only be seen in female individuals while affected males die during development in utero. We observed a family of horses, in which several mares developed signs of a skin disorder reminiscent of human IP. Cutaneous manifestations in affected horses included the development of pruritic, exudative lesions soon after birth. These developed into wart-like lesions and areas of alopecia with occasional wooly hair re-growth. Affected horses also had streaks of darker and lighter coat coloration from birth. The observation that only females were affected together with a high number of spontaneous abortions suggested an X-linked dominant mechanism of transmission. Using next generation sequencing we sequenced the whole genome of one affected mare. We analyzed the sequence data for non-synonymous variants in candidate genes and found a heterozygous nonsense variant in the X-chromosomal IKBKG gene (c.184C>T; p.Arg62*). Mutations in IKBKG were previously reported to cause IP in humans and the homologous p.Arg62* variant has already been observed in a human IP patient. The comparative data thus strongly suggest that this is also the causative variant for the observed IP in horses. To our knowledge this is the first large animal model for IP.
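
The variant notation in this abstract (c.184C>T; p.Arg62*) can be checked with simple codon arithmetic. The sketch below, using only the standard genetic code, maps coding position 184 to its codon and lists which arginine codons become a stop codon after a C>T change at that position. The abstract does not give the actual reference codon, so the result is an inference from the genetic code, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def codon_position(cds_pos):
    """Convert a 1-based coding position to (codon number, 0-based index within the codon)."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3


def nonsense_source_codons(cds_pos, ref, alt, ref_codons):
    """List reference codons that turn into a stop codon under the given substitution."""
    codon_number, idx = codon_position(cds_pos)
    hits = []
    for codon in sorted(ref_codons):
        if codon[idx] != ref:
            continue
        mutated = codon[:idx] + alt + codon[idx + 1:]
        if mutated in STOP_CODONS:
            hits.append((codon, mutated))
    return codon_number, hits


print(nonsense_source_codons(184, "C", "T", ARG_CODONS))  # → (62, [('CGA', 'TGA')])
```

Position 184 is the first base of codon 62, and CGA is the only arginine codon that a first-position C>T change converts into a stop codon (TGA), which is consistent with the p.Arg62* annotation.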

Relevance:

90.00%

Publisher:

Abstract:

Highland cattle with congenital crop ears have notches of variable size on the tips of both ears. In some cases, cartilage deformation can be seen and occasionally the external ears are shortened. We collected 40 cases and 80 controls across Switzerland. Pedigree data analysis confirmed a monogenic autosomal dominant mode of inheritance with variable expressivity. All affected animals could be traced back to a single common ancestor. A genome-wide association study was performed and the causative mutation was mapped to a 4 Mb interval on bovine chromosome 6. The H6 family homeobox 1 (HMX1) gene was selected as a positional and functional candidate gene. By whole genome re-sequencing of an affected Highland animal, we detected 6 non-synonymous coding sequence variants and two variants in an ultra-conserved element at the HMX1 locus with respect to the reference genome. Of these 8 variants, only a non-coding 76 bp genomic duplication (g.106720058_106720133dup) located in the conserved region was perfectly associated with crop ears. The identified copy number variation probably results in HMX1 misregulation and possible gain-of-function. Our findings confirm the role of HMX1 during the development of the external ear. As it is sometimes difficult to phenotypically diagnose Highland cattle with slight ear notches, genetic testing can now be used to improve selection against this undesired trait.

Relevance:

90.00%

Publisher:

Abstract:

ALS is a neurodegenerative disease that specifically affects upper and lower motor neurons, leading to progressive paralysis and death. There is currently no effective treatment. Thus, identification of the signaling pathways and cellular mediators of ALS remains a major challenge in the search for novel therapeutic approaches. Recent studies have shown that non-coding RNAs have a significant impact on normal CNS development and on the onset and progression of neurological disorders. Based on this evidence, we specifically test the hypothesis that misregulation of miRNA expression is a common feature in familial ALS. Hence, we are exploiting human neuroblastoma cell lines either expressing the SOD1(G93A) mutation or depleted of Fused in Sarcoma (FUS) as tools to investigate the role of miRNAs in familial ALS. To this end, we performed a genome-wide miRNA expression analysis on these cells, using whole-genome small RNA deep sequencing followed by validation with quantitative real-time PCR (qPCR). This strategy allowed us to find a group of dysregulated miRNAs that are predicted to play a role in motor neuron physiology and pathology. We verified our data on cDNA derived from SOD1-ALS mouse models at an early stage of the disease and on cDNA derived from lymphocytes from a small group of ALS patients. In the future, we plan to define the mechanisms responsible for the miRNA dysregulation by silencing or stimulating the signal transduction pathways putatively involved in miRNA expression and regulation.

Relevance:

90.00%

Publisher:

Abstract:

DNA-based parentage determination accelerates genetic improvement in sheep by increasing pedigree accuracy. Single nucleotide polymorphism (SNP) markers can be used for determining parentage and to provide unique molecular identifiers for tracing sheep products to their source. However, the utility of a particular "parentage SNP" varies by breed depending on its minor allele frequency (MAF) and its sequence context. Our aims were to identify parentage SNPs with exceptional qualities for use in globally diverse breeds and to develop a subset for use in North American sheep. Starting with genotypes from 2,915 sheep and 74 breed groups provided by the International Sheep Genomics Consortium (ISGC), we analyzed 47,693 autosomal SNPs by multiple criteria and selected 163 with desirable properties for parentage testing. On average, each of the 163 SNPs was highly informative (MAF≥0.3) in 48±5 breed groups. Nearby polymorphisms that could otherwise confound genetic testing were identified by whole genome and Sanger sequencing of 166 sheep from 54 breed groups. A genetic test with 109 of the 163 parentage SNPs was developed for matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry. The scoring rates and accuracies for these 109 SNPs were greater than 99% in a panel of North American sheep. In a blinded set of 96 families (sire, dam, and non-identical twin lambs), each parent of every lamb was identified without using the other parent's genotype. In 74 ISGC breed groups, the median estimates for probability of a coincidental match between two animals (PI), and the fraction of potential adults excluded from parentage (PE) were 1.1×10^-39 and 0.999987, respectively, for the 109 SNPs combined. The availability of a well-characterized set of 163 parentage SNPs facilitates the development of high-throughput genetic technologies for implementing accurate and economical parentage testing and traceability in many of the world's sheep breeds.
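
To make the MAF and PI quantities in this abstract concrete, the sketch below computes a minor allele frequency from genotype calls and a combined probability of identity across SNPs. It assumes biallelic SNPs in Hardy-Weinberg equilibrium and independent loci, using the textbook per-locus formula PI = p^4 + (2pq)^2 + q^4. The abstract does not say which estimator the authors used, so this is an illustration rather than a reproduction of their figures, and all function names are hypothetical.

```python
from math import prod


def minor_allele_frequency(genotypes):
    """Estimate MAF from genotype strings such as 'AA', 'AG', 'GG' for one biallelic SNP."""
    counts = {}
    for genotype in genotypes:
        for allele in genotype:
            counts[allele] = counts.get(allele, 0) + 1
    if len(counts) < 2:
        return 0.0  # monomorphic in this sample
    return min(counts.values()) / sum(counts.values())


def probability_of_identity(maf):
    """Chance that two unrelated animals share a genotype at one biallelic SNP (assumes HWE)."""
    p, q = 1.0 - maf, maf
    return p ** 4 + (2 * p * q) ** 2 + q ** 4


def combined_pi(mafs):
    """Multiply per-locus values, which assumes the SNPs are independent."""
    return prod(probability_of_identity(m) for m in mafs)


print(minor_allele_frequency(["AA", "AG", "AA", "GG", "AA"]))  # → 0.3
print(probability_of_identity(0.5))  # → 0.375
print(f"{combined_pi([0.3] * 109):.1e}")
```

Because the per-locus value is smallest at MAF = 0.5, SNPs with high MAF across many breeds drive the combined PI down fastest, which is why the selection criterion MAF≥0.3 matters.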

Relevance:

90.00%

Publisher:

Abstract:

During the summer of 2013 seven Italian Tyrolean Grey calves were born with abnormally short limbs. Detailed clinical and pathological examination revealed similarities to chondrodysplastic dwarfism. Pedigree analysis showed a common founder, assuming autosomal monogenic recessive transmission of the defective allele. A positional cloning approach combining genome wide association and homozygosity mapping identified a single 1.6 Mb genomic region on BTA 6 that was associated with the disease. Whole genome re-sequencing of an affected calf revealed a single candidate causal mutation in the Ellis van Creveld syndrome 2 (EVC2) gene. This gene is known to be associated with chondrodysplastic dwarfism in Japanese Brown cattle, and dwarfism, abnormal nails and teeth, and dysostosis in humans with Ellis-van Creveld syndrome. Sanger sequencing confirmed the presence of a 2 bp deletion in exon 19 (c.2993_2994ACdel) that led to a premature stop codon in the coding sequence of bovine EVC2, and was concordant with the recessive pattern of inheritance in affected and carrier animals. This loss of function mutation confirms the important role of EVC2 in bone development. Genetic testing can now be used to eliminate this form of chondrodysplastic dwarfism from Tyrolean Grey cattle.

Relevance:

90.00%

Publisher:

Abstract:

We report on the molecular characterization of a microdeletion of approximately 2.5 Mb at 2p11.2 in a female baby with left congenital aural atresia, microtia, and ipsilateral internal carotid artery agenesis. The deletion was characterized by fluorescence in situ hybridization, array comparative genomic hybridization, and whole genome re-sequencing. Among the genes present in the deleted region, we focused our attention on the FOXI3 gene. Foxi3 is a member of the Foxi class of Forkhead transcription factors. In mouse, chicken and zebrafish Foxi3 homologues are expressed in the ectoderm and endoderm giving rise to elements of the jaw as well as external, middle and inner ear. Homozygous Foxi3-/- mice have recently been generated and show a complete absence of the inner, middle, and external ears as well as severe defects in the jaw and palate. Recently, a 7-bp duplication within exon 1 of FOXI3 that produces a frameshift and a premature stop codon was found in hairless dogs. Mild malformations of the outer auditory canal (closed ear canal) and ear lobe have also been noted in a fraction of FOXI3 heterozygote Peruvian hairless dogs. Based on the phenotypes of Foxi3 mutant animals, we propose that FOXI3 may be responsible for the phenotypic features of our patient. Further characterization of the genomic region and the analysis of similar patients may help to demonstrate this point. © 2015 Wiley Periodicals, Inc.