60 results for sequencing error
Abstract:
Oligonucleotides comprising unnatural building blocks, which interfere with the translation machinery, have gained increased attention for the treatment of gene-related diseases (e.g. antisense, RNAi). Due to structural modifications, synthetic oligonucleotides exhibit increased biostability and bioavailability upon administration. Consequently, classical enzyme-based sequencing methods are not applicable to their sequence elucidation and verification. Tandem mass spectrometry is the method of choice for performing such tasks, since gas-phase dissociation is not restricted to natural nucleic acids. However, tandem mass spectrometric analysis can generate product ion spectra of tremendous complexity, as the number of possible fragments grows rapidly with increasing sequence length. The fact that structural modifications affect the dissociation pathways greatly increases the variety of analytically valuable fragment ions. The gas-phase dissociation of oligonucleotides is characterized by the cleavage of one of the four bonds along the phosphodiester chain, by the accompanying loss of nucleobases, and by the generation of internal fragments due to secondary backbone cleavage. For example, an 18-mer oligonucleotide yields a total of 272,920 theoretical fragment ions. In contrast to the processing of peptide product ion spectra, which nowadays is highly automated, there is a lack of tools assisting the interpretation of oligonucleotide data. The existing web-based and stand-alone software applications are primarily designed for the sequence analysis of natural nucleic acids and do not account for chemical modifications and adducts. Consequently, we developed a software tool to support the interpretation of mass spectrometric data of natural and modified nucleic acids and their adducts with chemotherapeutic agents.
Abstract:
A 1887-bp region at the 5' flank of the human p75 tumor necrosis factor receptor (p75 TNF-R)-encoding gene was found to be active in driving expression of the luc (luciferase-encoding) reporter gene, suggesting that it contains the promoter for the receptor. Rather unexpectedly, a 1827-bp region at the 3' end of the first intron of the p75 TNF-R gene also displayed promoter activity. This activity may be artefactual, reflecting only the presence of an enhancer in this region; yet it also raises the possibility that p75 TNF-R is controlled by more than one promoter and that it encodes various forms of the receptor, or even other proteins. We present here the nucleotide sequences of the 5' flanking and intron regions. Possible implications for the transcriptional regulation of the p75 TNF-R gene are discussed.
Abstract:
Approximate models (proxies) can be employed to reduce the computational costs of estimating uncertainty. The price to pay is that the approximations introduced by the proxy model can lead to a biased estimation. To avoid this problem and ensure a reliable uncertainty quantification, we propose to combine functional data analysis and machine learning to build error models that allow us to obtain an accurate prediction of the exact response without solving the exact model for all realizations. We build the relationship between proxy and exact model on a learning set of geostatistical realizations for which both exact and approximate solvers are run. Functional principal components analysis (FPCA) is used to investigate the variability in the two sets of curves and reduce the dimensionality of the problem while maximizing the retained information. Once obtained, the error model can be used to predict the exact response of any realization on the basis of the sole proxy response. This methodology is purpose-oriented as the error model is constructed directly for the quantity of interest, rather than for the state of the system. Also, the dimensionality reduction performed by FPCA allows a diagnostic of the quality of the error model to assess the informativeness of the learning set and the fidelity of the proxy to the exact model. The possibility of obtaining a prediction of the exact response for any newly generated realization suggests that the methodology can be effectively used beyond the context of uncertainty quantification, in particular for Bayesian inference and optimization.
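The workflow described in this abstract (learn a proxy-to-exact mapping on a training set, then predict the exact response from the proxy alone) can be illustrated with a minimal sketch. It is not the authors' implementation: plain PCA on discretized curves stands in for functional PCA, ordinary least squares stands in for the machine-learning regression, and all function names and the number of components are assumptions made for illustration.

```python
import numpy as np

def fit_pca(curves, n_components):
    """Return the mean curve and the leading principal directions (rows)."""
    mean = curves.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(curves - mean, full_matrices=False)
    return mean, vt[:n_components]

def fit_error_model(proxy, exact, n_components=2):
    """Learn a linear map from proxy PCA scores to exact PCA scores.

    proxy, exact: arrays of shape (n_realizations, n_time_samples) obtained
    by running both solvers on the same learning set (hypothetical inputs).
    """
    p_mean, p_basis = fit_pca(proxy, n_components)
    e_mean, e_basis = fit_pca(exact, n_components)
    p_scores = (proxy - p_mean) @ p_basis.T
    e_scores = (exact - e_mean) @ e_basis.T
    # Least-squares regression with an intercept column.
    design = np.hstack([np.ones((len(p_scores), 1)), p_scores])
    coef, *_ = np.linalg.lstsq(design, e_scores, rcond=None)
    return {"p_mean": p_mean, "p_basis": p_basis,
            "e_mean": e_mean, "e_basis": e_basis, "coef": coef}

def predict_exact(model, proxy_new):
    """Predict exact-response curves from proxy curves only."""
    scores = (proxy_new - model["p_mean"]) @ model["p_basis"].T
    design = np.hstack([np.ones((len(scores), 1)), scores])
    return model["e_mean"] + design @ model["coef"] @ model["e_basis"]
```

In practice the retained components and the regression model would be chosen by inspecting how well the learning set covers the score space, which is the diagnostic role the abstract attributes to the dimensionality reduction.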
Abstract:
The RNome of a cell is highly diverse and comprises, besides messenger RNAs (mRNAs), transfer RNAs (tRNAs), and ribosomal RNAs (rRNAs), other small and long transcripts without apparent coding potential. This class of molecules, commonly referred to as non-protein-coding RNAs (ncRNAs), is involved in regulating numerous biological processes and is thought to contribute to cellular complexity. Therefore, much effort is put into their identification and functional characterization. Here we provide a cost-effective and reliable method for cDNA library construction of small RNAs in the size range of 20-500 residues. The effectiveness of the described method is demonstrated by the analysis of ribosome-associated small RNAs in the eukaryotic model organism Trypanosoma brucei.
Abstract:
Background: Tef (Eragrostis tef), an indigenous cereal critical to food security in the Horn of Africa, is rich in minerals and protein, resistant to many biotic and abiotic stresses and safe for diabetics as well as sufferers of immune reactions to wheat gluten. We present the genome of tef, the first species in the grass subfamily Chloridoideae and the first allotetraploid assembled de novo. We sequenced the tef genome for marker-assisted breeding, to shed light on the molecular mechanisms conferring tef's desirable nutritional and agronomic properties, and to make its genome publicly available as a community resource. Results: The draft genome contains 672 Mbp representing 87% of the genome size estimated from flow cytometry. We also sequenced two transcriptomes, one from a normalized RNA library and another from unnormalized RNASeq data. The normalized RNA library revealed around 38000 transcripts that were then annotated by the SwissProt group. The CoGe comparative genomics platform was used to compare the tef genome to other genomes, notably sorghum. Scaffolds comprising approximately half of the genome size were ordered by syntenic alignment to sorghum producing tef pseudo-chromosomes, which were sorted into A and B genomes as well as compared to the genetic map of tef. The draft genome was used to identify novel SSR markers, investigate target genes for abiotic stress resistance studies, and understand the evolution of the prolamin family of proteins that are responsible for the immune response to gluten. Conclusions: It is highly plausible that breeding targets previously identified in other cereal crops will also be valuable breeding targets in tef. The draft genome and transcriptome will be of great use for identifying these targets for genetic improvement of this orphan crop that is vital for feeding 50 million people in the Horn of Africa.
Abstract:
Over 250 Mendelian traits and disorders caused by rare alleles have been mapped in the canine genome. Although each disease is rare in the dog as a species, they are collectively common and have a major impact on canine health. With SNP-based genotyping arrays, genome-wide association studies (GWAS) have proven to be a powerful method to map the genomic region of interest when 10-20 cases and 10-20 controls are available. However, to identify the genetic variant in associated regions, fine-mapping and targeted re-sequencing are required. Here we present a new approach using whole-genome sequencing (WGS) of a family trio without prior GWAS. As a proof of concept, we chose an autosomal recessive disease known as hereditary footpad hyperkeratosis (HFH) in Kromfohrländer dogs. To our knowledge, this is the first time this family-trio WGS approach has successfully been used to identify a genetic variant that perfectly segregates with a canine disorder. The sequencing of three Kromfohrländer dogs from a family trio (an affected offspring and both its healthy parents) resulted in an average genome coverage of 9.2X per individual. After applying stringent filtering criteria for candidate causative coding variants, 527 single nucleotide variants (SNVs) and 15 indels were found to be homozygous in the affected offspring and heterozygous in the parents. Functional annotation of the coding sequence differences and prediction of their functional effect with the software packages ANNOVAR and SIFT yielded seven candidate variants located in six different genes. Of these, only FAM83G:c.155G>C (p.R52P) was found to be concordant in eight additional cases and 16 healthy Kromfohrländer dogs.
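The segregation criterion stated in this abstract (homozygous in the affected offspring, heterozygous in both healthy parents) can be sketched as a simple genotype filter. This is an illustrative sketch only, not the authors' pipeline: the `Variant` record, its field names, and the 0/1 allele encoding (0 = reference, 1 = alternate) are assumptions, and a real analysis would also apply quality, coverage, and annotation filters.

```python
from typing import NamedTuple

class Variant(NamedTuple):
    """Hypothetical trio genotype record; alleles are 0 (ref) or 1 (alt)."""
    chrom: str
    pos: int
    child: tuple   # genotype of the affected offspring, e.g. (1, 1)
    sire: tuple    # genotype of the healthy father
    dam: tuple     # genotype of the healthy mother

def is_recessive_candidate(v: Variant) -> bool:
    """True if the variant fits an autosomal recessive pattern in the trio."""
    child_hom_alt = v.child == (1, 1)
    # sorted() makes the check independent of allele order, (0, 1) or (1, 0).
    sire_het = sorted(v.sire) == [0, 1]
    dam_het = sorted(v.dam) == [0, 1]
    return child_hom_alt and sire_het and dam_het

def recessive_candidates(variants):
    """Keep only variants compatible with recessive inheritance."""
    return [v for v in variants if is_recessive_candidate(v)]
```

Applied to a list of such records, `recessive_candidates` returns the subset that would then go on to functional annotation and to validation in additional cases and controls.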
Abstract:
OBJECTIVE Intraarticular gadolinium-enhanced magnetic resonance arthrography (MRA) is commonly applied to characterize morphological disorders of the hip. However, the reproducibility of retrieving anatomic landmarks on MRA scans and their correlation with intraarticular pathologies is unknown. A precise mapping system for the exact localization of hip pathomorphologies with radial MRA sequences is lacking. Therefore, the purpose of the study was the establishment and validation of a reproducible mapping system for radial sequences of hip MRA. MATERIALS AND METHODS Sixty-nine consecutive intraarticular gadolinium-enhanced hip MRAs were evaluated. Radial sequencing consisted of 14 cuts orientated along the axis of the femoral neck. Three orthopedic surgeons read the radial sequences independently. Each MRI was read twice with a minimum interval of 7 days from the first reading. The intra- and inter-observer reliability of the mapping procedure was determined. RESULTS A clockwise system for hip MRA was established. The teardrop figure served to determine the 6 o'clock position of the acetabulum; the center of the greater trochanter served to determine the 12 o'clock position of the femoral head-neck junction. The intra- and inter-observer ICCs to retrieve the correct 6/12 o'clock positions were 0.906-0.996 and 0.978-0.988, respectively. CONCLUSIONS The established mapping system for radial sequences of hip joint MRA is reproducible and easy to perform.
Abstract:
Familial acute myeloid leukemia is rare and linked to germline mutations in RUNX1, GATA2 or CCAAT/enhancer binding protein-α (CEBPA). We re-evaluated a large family with acute myeloid leukemia originally seen at NIH in 1969. We utilized whole-exome sequencing to study this family, and conducted in silico bioinformatics analysis, protein structural modeling and laboratory experiments to assess the impact of the identified CEBPA Q311P mutation. Unlike most previously identified germline mutations in CEBPA, which were N-terminal frameshift mutations, we identified a novel Q311P variant that was located in the C-terminal bZip domain of C/EBPα. Protein structural modeling suggested that the Q311P mutation alters the ability of the CEBPA dimer to bind DNA. Electrophoretic mobility shift assays showed that the Q311P mutant had attenuated binding to DNA, as predicted by the protein modeling. In line with these findings, the Q311P mutant showed reduced transactivation, consistent with a loss-of-function mutation. From 45 years of follow-up, we observed incomplete penetrance (46%) of CEBPA Q311P. This study of a large multi-generational pedigree reveals that a germline mutation in the C-terminal bZip domain can alter the ability of C/EBPα to bind DNA and reduce transactivation, leading to acute myeloid leukemia.
A functional approach to movement analysis and error identification in sports and physical education