920 results for Mate-pair sequencing
Abstract:
Background Tools to explore large compound databases in search for analogs of query molecules provide a strategically important support in drug discovery to help identify available analogs of any given reference or hit compound by ligand based virtual screening (LBVS). We recently showed that large databases can be formatted for very fast searching with various 2D-fingerprints using the city-block distance as similarity measure, in particular a 2D-atom pair fingerprint (APfp) and the related category extended atom pair fingerprint (Xfp) which efficiently encode molecular shape and pharmacophores, but do not perceive stereochemistry. Here we investigated related 3D-atom pair fingerprints to enable rapid stereoselective searches in the ZINC database (23.2 million 3D structures). Results Molecular fingerprints counting atom pairs at increasing through-space distance intervals were designed using either all atoms (16-bit 3DAPfp) or different atom categories (80-bit 3DXfp). These 3D-fingerprints retrieved molecular shape and pharmacophore analogs (defined by OpenEye ROCS scoring functions) of 110,000 compounds from the Cambridge Structural Database with equal or better accuracy than the 2D-fingerprints APfp and Xfp, and showed comparable performance in recovering actives from decoys in the DUD database. LBVS by 3DXfp or 3DAPfp similarity was stereoselective and gave very different analogs when starting from different diastereomers of the same chiral drug. Results were also different from LBVS with the parent 2D-fingerprints Xfp or APfp. 3D- and 2D-fingerprints also gave very different results in LBVS of folded molecules where through-space distances between atom pairs are much shorter than topological distances. Conclusions 3DAPfp and 3DXfp are suitable for stereoselective searches for shape and pharmacophore analogs of query molecules in large databases. 
Web browsers for searching ZINC by 3DAPfp and 3DXfp similarity are accessible at www.gdb.unibe.ch and should provide useful assistance to drug discovery projects.
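As an illustration of the idea in this abstract, the following minimal Python sketch counts atom pairs by through-space distance interval and compares two fingerprints with the city-block distance. The function names, the hard bin edges, and the toy coordinates are assumptions made for this example; the abstract does not give the binning, atom categories, or normalization actually used for 3DAPfp and 3DXfp.

```python
from itertools import combinations
from math import dist


def atom_pair_fingerprint(coords, bin_edges):
    """Count atom pairs whose through-space distance falls in each interval.

    coords: list of (x, y, z) tuples, one per atom.
    bin_edges: ascending interval boundaries (hypothetical values, not the
    ones used for 3DAPfp); interval i covers [bin_edges[i], bin_edges[i + 1]).
    """
    counts = [0] * (len(bin_edges) - 1)
    for a, b in combinations(coords, 2):
        d = dist(a, b)
        for i in range(len(counts)):
            if bin_edges[i] <= d < bin_edges[i + 1]:
                counts[i] += 1
                break  # pairs beyond the last edge are simply ignored
    return counts


def city_block(fp_a, fp_b):
    """City-block (Manhattan) distance between two equal-length fingerprints."""
    return sum(abs(x - y) for x, y in zip(fp_a, fp_b))


# Toy three-atom "molecules": one extended, one folded.
extended = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
folded = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
edges = [0.0, 1.5, 2.5, 4.0]

fp_extended = atom_pair_fingerprint(extended, edges)
fp_folded = atom_pair_fingerprint(folded, edges)
print(fp_extended, fp_folded, city_block(fp_extended, fp_folded))
```

In this toy case the folded geometry puts all three pairs in the shortest interval, so the two fingerprints differ even though both inputs have three atoms. That is the kind of through-space effect the abstract contrasts with topological (2D) distances.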
Abstract:
Over 250 Mendelian traits and disorders caused by rare alleles have been mapped in the canine genome. Although each disease is rare in the dog as a species, they are collectively common and have a major impact on canine health. With SNP-based genotyping arrays, genome-wide association studies (GWAS) have proven to be a powerful method to map the genomic region of interest when 10-20 cases and 10-20 controls are available. However, to identify the genetic variant in associated regions, fine-mapping and targeted re-sequencing are required. Here we present a new approach using whole-genome sequencing (WGS) of a family trio without prior GWAS. As a proof of concept, we chose an autosomal recessive disease known as hereditary footpad hyperkeratosis (HFH) in Kromfohrländer dogs. To our knowledge, this is the first time this family-trio WGS approach has successfully been used to identify a genetic variant that perfectly segregates with a canine disorder. The sequencing of three Kromfohrländer dogs from a family trio (an affected offspring and both its healthy parents) resulted in an average genome coverage of 9.2X per individual. After applying stringent filtering criteria for candidate causative coding variants, 527 single nucleotide variants (SNVs) and 15 indels were found to be homozygous in the affected offspring and heterozygous in the parents. Functional annotation of coding sequence differences and prediction of their functional effect with the software packages ANNOVAR and SIFT yielded seven candidate variants located in six different genes. Of these, only FAM83G:c.155G>C (p.R52P) was found to be concordant in eight additional cases and 16 healthy Kromfohrländer dogs.
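The segregation filter described in this abstract (homozygous in the affected offspring, heterozygous in both healthy parents) can be sketched in Python as below. This is a simplified illustration only: the record layout, the VCF-style genotype strings, the variant identifiers, and the function names are assumptions, and the sketch assumes biallelic sites. It does not reproduce the authors' actual filtering criteria or their ANNOVAR/SIFT steps.

```python
def alt_count(genotype):
    """Number of non-reference alleles in a VCF-style genotype such as '0/1'.

    Returns None when any allele is missing ('.'). Biallelic sites are assumed.
    """
    alleles = genotype.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return sum(1 for allele in alleles if allele != "0")


def recessive_candidates(variants):
    """Keep variants that fit an autosomal recessive pattern in a trio:
    homozygous alternate in the affected offspring, heterozygous in both parents.
    """
    return [
        v for v in variants
        if alt_count(v["offspring"]) == 2
        and alt_count(v["sire"]) == 1
        and alt_count(v["dam"]) == 1
    ]


# Hypothetical toy records, not data from the study.
toy_variants = [
    {"id": "var1", "offspring": "1/1", "sire": "0/1", "dam": "1/0"},
    {"id": "var2", "offspring": "1/1", "sire": "1/1", "dam": "0/1"},
    {"id": "var3", "offspring": "0/1", "sire": "0/1", "dam": "0/1"},
    {"id": "var4", "offspring": "1/1", "sire": "./.", "dam": "0/1"},
]

print([v["id"] for v in recessive_candidates(toy_variants)])
```

Only `var1` passes here. Variants with a missing parental genotype are dropped rather than guessed, which is a conservative choice for a low-coverage trio.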
Abstract:
The production of electron–positron pairs in time-dependent electric fields (Schwinger mechanism) depends non-linearly on the applied field profile. Accordingly, the resulting momentum spectrum is extremely sensitive to small variations of the field parameters. Owing to this non-linear dependence it is so far unpredictable how to choose a field configuration such that a predetermined momentum distribution is generated. We show that quantum kinetic theory along with optimal control theory can be used to approximately solve this inverse problem for Schwinger pair production. We exemplify this by studying the superposition of a small number of harmonic components resulting in predetermined signatures in the asymptotic momentum spectrum. In the long run, our results could facilitate the observation of this yet unobserved pair production mechanism in quantum electrodynamics by providing suggestions for tailored field configurations.
Abstract:
OBJECTIVE Intraarticular gadolinium-enhanced magnetic resonance arthrography (MRA) is commonly applied to characterize morphological disorders of the hip. However, the reproducibility of retrieving anatomic landmarks on MRA scans and their correlation with intraarticular pathologies is unknown. A precise mapping system for the exact localization of hip pathomorphologies with radial MRA sequences is lacking. Therefore, the purpose of the study was the establishment and validation of a reproducible mapping system for radial sequences of hip MRA. MATERIALS AND METHODS Sixty-nine consecutive intraarticular gadolinium-enhanced hip MRAs were evaluated. Radial sequencing consisted of 14 cuts orientated along the axis of the femoral neck. Three orthopedic surgeons read the radial sequences independently. Each MRI was read twice with a minimum interval of 7 days from the first reading. The intra- and inter-observer reliability of the mapping procedure was determined. RESULTS A clockwise system for hip MRA was established. The teardrop figure served to determine the 6 o'clock position of the acetabulum; the center of the greater trochanter served to determine the 12 o'clock position of the femoral head-neck junction. The intra- and inter-observer ICCs to retrieve the correct 6/12 o'clock positions were 0.906-0.996 and 0.978-0.988, respectively. CONCLUSIONS The established mapping system for radial sequences of hip joint MRA is reproducible and easy to perform.
Abstract:
Familial acute myeloid leukemia is rare and linked to germline mutations in RUNX1, GATA2 or CCAAT/enhancer binding protein-α (CEBPA). We re-evaluated a large family with acute myeloid leukemia originally seen at NIH in 1969. We utilized whole-exome sequencing to study this family, and conducted in silico bioinformatics analysis, protein structural modeling and laboratory experiments to assess the impact of the identified CEBPA Q311P mutation. Unlike most previously identified germline mutations in CEBPA, which were N-terminal frameshift mutations, we identified a novel Q311P variant that was located in the C-terminal bZip domain of C/EBPα. Protein structural modeling suggested that the Q311P mutation alters the ability of the CEBPA dimer to bind DNA. Electrophoretic mobility shift assays showed that the Q311P mutant had attenuated binding to DNA, as predicted by the protein modeling. In line with these findings, the Q311P mutant showed reduced transactivation, consistent with a loss-of-function mutation. From 45 years of follow-up, we observed incomplete penetrance (46%) of CEBPA Q311P. This study of a large multi-generational pedigree reveals that a germline mutation in the C-terminal bZip domain can alter the ability of C/EBPα to bind DNA and reduce transactivation, leading to acute myeloid leukemia.
Abstract:
DNA duplexes containing unnatural base-pair surrogates are attractive biomolecular nanomaterials with potentially beneficial photophysical or electronic properties. Herein we report the first X-ray structure of a duplex containing a phen-pair in the center of the double helix in a zipper-like stacking arrangement.
Abstract:
Every x-ray attenuation curve inherently contains all the information necessary to extract the complete energy spectrum of a beam. To date, attempts to obtain accurate spectral information from attenuation data have been inadequate. This investigation presents a mathematical pair model, grounded in physical reality by the Laplace Transformation, to describe the attenuation of a photon beam and the corresponding bremsstrahlung spectral distribution. In addition, the Laplace model has been mathematically extended to include characteristic radiation in a physically meaningful way. A method to determine the fraction of characteristic radiation in any diagnostic x-ray beam was introduced for use with the extended model. This work has examined the reconstructive capability of the Laplace pair model for photon beams ranging from 50 kVp to 25 MV, using both theoretical and experimental methods. In the diagnostic region, excellent agreement between a wide variety of experimental spectra and those reconstructed with the Laplace model was obtained when the atomic composition of the attenuators was accurately known. The model successfully reproduced a 2 MV spectrum but demonstrated difficulty in accurately reconstructing orthovoltage and 6 MV spectra. The 25 MV spectrum was successfully reconstructed, although poor agreement with the spectrum obtained by Levy was found. The analysis of errors, performed with diagnostic energy data, demonstrated the relative insensitivity of the model to typical experimental errors and confirmed that the model can be successfully used to theoretically derive accurate spectral information from experimental attenuation data.
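The Laplace transform pair behind this kind of model can be written out as follows. The notation is an assumption for illustration: T is the relative transmission, x the attenuator thickness, μ the linear attenuation coefficient, and F the spectral weight expressed as a function of μ. The abstract does not give the specific functional form the author fitted.

```latex
T(x) = \int_{0}^{\infty} F(\mu)\, e^{-\mu x}\, d\mu = \mathcal{L}\{F\}(x),
\qquad
F(\mu) = \mathcal{L}^{-1}\{T\}(\mu)
```

Converting F(μ) into an energy spectrum additionally requires the relation μ(E) for the attenuator material, which is in keeping with the abstract's remark that agreement was best when the atomic composition of the attenuators was accurately known.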
Abstract:
Next-generation DNA sequencing can effectively detect the entire spectrum of genomic variation and is emerging as a major tool for systematic exploration of the universe of variants and interactions in the entire genome. However, the data produced by next-generation sequencing technologies suffer from three basic problems: sequence errors, assembly errors, and missing data. Current statistical methods for genetic analysis are well suited for detecting the association of common variants, but are less suitable for rare variants. This poses a great challenge for sequence-based genetic studies of complex diseases. This research dissertation used the genome continuum model as a general principle, and stochastic calculus and functional data analysis as tools, for developing novel and powerful statistical methods for the next generation of association studies of both qualitative and quantitative traits in the context of sequencing data, ultimately shifting the paradigm of association analysis from the current locus-by-locus analysis to collectively analyzing genome regions. In this project, functional principal component (FPC) methods coupled with high-dimensional data reduction techniques were used to develop novel and powerful methods for testing the associations of the entire spectrum of genetic variation within a segment of genome or a gene, regardless of whether the variants are common or rare. Classical quantitative genetics methods suffer from high type I error rates and low power for rare variants. To overcome these limitations for resequencing data, this project used functional linear models with scalar response to develop statistics for identifying quantitative trait loci (QTLs) for both common and rare variants. To illustrate their applications, the functional linear models were applied to five quantitative traits in the Framingham Heart Study.
This project proposed a novel concept of gene-gene co-association in which a gene or a genomic region is taken as a unit of association analysis and used stochastic calculus to develop a unified framework for testing the association of multiple genes or genomic regions for both common and rare alleles. The proposed methods were applied to gene-gene co-association analysis of psoriasis in two independent GWAS datasets, which led to the discovery of networks significantly associated with psoriasis.
Abstract:
A clone of the primary Eco R1 family of human DNA sequences has been used as an indicator sequence for detecting alterations induced by a toxic agent. Specific clones of this family have been examined and compared to the consensus sequence to determine the normal variability of this family. Though variations were observed, data indicated that such clones can be used to study induced DNA modifications. This DNA was exposed to the toxic agent dimethyl sulfate under various conditions and a distinct pattern of aberrations was shown to occur. It is suggested that this approach be used to characterize patterns of damage induced by various agents in the ultimate development of a system capable of monitoring human genotoxic exposure.
Abstract:
Next-generation sequencing (NGS) technology has become a prominent tool in biological and biomedical research. However, NGS data analysis, such as de novo assembly, mapping and variant detection, is far from mature, and the high sequencing error rate is one of the major problems. To minimize the impact of sequencing errors, we developed a highly robust and efficient method, MTM, to correct the errors in NGS reads. We demonstrated the effectiveness of MTM on both single-cell data with highly non-uniform coverage and normal data with uniformly high coverage, reflecting that MTM’s performance does not rely on the coverage of the sequencing reads. MTM was also compared with Hammer and Quake, the best methods for correcting non-uniform and uniform data respectively. For non-uniform data, MTM outperformed both Hammer and Quake. For uniform data, MTM showed better performance than Quake and comparable results to Hammer. With the better error correction provided by MTM, the quality of downstream analysis, such as mapping and SNP detection, was improved. SNP calling is a major application of NGS technologies. However, the existence of sequencing errors complicates this process, especially for the low coverage (
Abstract:
Paracrine motogenic factors, including motility cytokines and extracellular matrix molecules secreted by normal cells, can stimulate metastatic cell invasion. For extracellular matrix molecules, both the intact molecules and the degradative products may exhibit these activities, which in some cases are not shared by the intact molecules. We found that human peritumoral and lung fibroblasts secrete motility-stimulating activity for several recently established human sarcoma cell strains. The motility of lung metastasis-derived human SYN-1 sarcoma cells was preferentially stimulated by human lung and peritumoral fibroblast motility-stimulating factors (FMSFs). FMSFs were nondialyzable, susceptible to trypsin, and sensitive to dithiothreitol. Cycloheximide inhibited accumulation of FMSF activity in conditioned medium; however, addition of cycloheximide to the migration assay did not significantly affect motility-stimulating activity. Purified hepatocyte growth factor/scatter factor (HGF/SF), rabbit anti-hHGF, and RT-PCR analysis of peritumoral and lung fibroblast HGF/SF mRNA expression indicated that FMSF activity was unrelated to HGF/SF. Partial purification of FMSF by gel exclusion chromatography revealed several peaks of activity, suggesting multiple FMSF molecules or complexes. We purified the fibroblast motility-stimulating factor from human lung fibroblast-conditioned medium to apparent homogeneity by sequential heparin affinity chromatography and DEAE anion exchange chromatography. Lysylendopeptidase C digestion of FMSF and sequencing of peptides purified by reverse phase HPLC after digestion identified it as an N-terminal fragment of human fibronectin. Purified FMSF stimulated predominantly chemotaxis, but also chemokinesis, of SYN-1 sarcoma cells and was chemotactic for a variety of human sarcoma cells, including fibrosarcoma, leiomyosarcoma, liposarcoma, synovial sarcoma and neurofibrosarcoma cells.
The motility-stimulating activity present in HLF-CM was completely eliminated by either neutralization or immunodepletion with a rabbit anti-human-fibronectin antibody, thus further confirming that the fibronectin fragment was the FMSF responsible for the motility stimulation of human soft tissue sarcoma cells. Since human soft tissue sarcomas have a distinctive hematogenous metastatic pattern (predominantly lung), FMSF may play a role in this process.
Abstract:
Background: Zooplankton play an important role in our oceans, in biogeochemical cycling and providing a food source for commercially important fish larvae. However, difficulties in correctly identifying zooplankton hinder our understanding of their roles in marine ecosystem functioning, and can prevent detection of long-term changes in their community structure. The advent of massively parallel Next Generation Sequencing technology allows DNA sequence data to be recovered directly from whole community samples. Here we assess the ability of such sequencing to quantify the richness and diversity of a mixed zooplankton assemblage from a productive monitoring site in the Western English Channel. Methodology/Principal Findings: Plankton WP2 replicate net hauls (200 µm) were taken at the Western Channel Observatory long-term monitoring station L4 in September 2010 and January 2011. These samples were analysed by microscopy and metagenetic analysis of the 18S nuclear small subunit ribosomal RNA gene using the 454 pyrosequencing platform. Following quality control, a total of 419,042 sequences were obtained for all samples. The sequences clustered into 205 operational taxonomic units (OTUs) using a 97% similarity cut-off. Allocation of taxonomy by comparison with the National Centre for Biotechnology Information database identified 138 OTUs to species level, 11 to genus level and 1 to order; <2.5% of sequences were classified as unknowns. By comparison, a skilled microscopic analyst was able to routinely enumerate only 75 taxonomic groups. Conclusions: The percentages of OTUs assigned to major eukaryotic taxonomic groups broadly align between the metagenetic and morphological analyses and are dominated by Copepoda. However, the metagenetics reveals a previously hidden taxonomic richness, especially for Copepoda and meroplankton such as Bivalvia, Gastropoda and Polychaeta. It also reveals rare species and parasites.
We conclude that Next Generation Sequencing of 18S amplicons is a powerful tool for estimating diversity and species richness of zooplankton communities.
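To make "richness" and "diversity" concrete, the short Python sketch below computes OTU richness and the Shannon index from a table of per-OTU sequence counts. The Shannon index is an assumption chosen for illustration, since the abstract does not state which diversity measure the authors used, and the counts are invented toy values.

```python
from math import log


def richness(otu_counts):
    """Number of OTUs observed at least once in a sample."""
    return sum(1 for count in otu_counts if count > 0)


def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(otu_counts)
    if total == 0:
        return 0.0
    return -sum(
        (count / total) * log(count / total)
        for count in otu_counts
        if count > 0
    )


# Hypothetical per-OTU sequence counts for one sample.
sample = [120, 30, 0, 5, 45]
print(richness(sample), round(shannon_index(sample), 3))
```

Richness only asks whether an OTU is present, while the Shannon index also weighs how evenly sequences are spread across OTUs, so a sample dominated by one group (for example Copepoda) scores lower than an even one with the same richness.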
Abstract:
Microorganisms play an important role in the transformation of material within the earth's crust. The storage of CO2 could affect the composition of inorganic and organic components in the reservoir, consequently influencing microbial activities. To study the microbially induced processes together with the geochemical, petrophysical and mineralogical changes occurring during CO2 storage, long-term laboratory experiments under simulated reservoir P-T conditions were carried out. Clean inner core sections, obtained from the reservoir region at the CO2 storage site in Ketzin (Germany) from a depth of about 650 m, were incubated in high pressure vessels together with sterile synthetic formation brine under in situ P-T conditions of 5.5 MPa and 40°C. A 16S rDNA based fingerprinting method was used to identify the dominant species in DNA extracts of pristine sandstone samples. Members of the alpha- and beta-subdivisions of Proteobacteria and the Actinobacteria were identified. So far, sequences belonging to facultatively anaerobic, chemoheterotrophic bacteria (Burkholderia fungorum, Agrobacterium tumefaciens), which gain their energy from the oxidation of organic molecules, and to a genus also capable of chemolithoautotrophic growth (Hydrogenophaga) have been identified. During CO2 incubation minor changes in the microbial community composition were observed. The majority of microbes were able to adapt to the changed conditions. During CO2 exposure increased concentrations of Ca²⁺, K⁺, Mg²⁺ and SO₄²⁻ were observed. The concentration increases are partly due to equilibration (i) between rock pore water and synthetic brine and (ii) between rock and brine, and are thus independent of CO2 exposure. However, the observed concentrations of Ca²⁺, K⁺ and Mg²⁺ are even higher than in the original reservoir fluid and therefore indicate mineral dissolution due to CO2 exposure.