976 results for complex sequences
Abstract:
This review focuses on the impact of chemometrics on resolving data sets collected from investigations of the interactions of small molecules with biopolymers. These samples have been analyzed with various instrumental techniques, such as fluorescence, ultraviolet–visible spectroscopy, and voltammetry. The impact of two powerful and demonstrably useful multivariate methods for resolution of complex data, multivariate curve resolution–alternating least squares (MCR–ALS) and parallel factor analysis (PARAFAC), is highlighted through analysis of applications involving the interactions of small molecules with the biopolymers serum albumin and deoxyribonucleic acid. The outcomes illustrated that significant information extracted by the chemometric methods was unattainable by simple, univariate data analysis. In addition, although the techniques used to collect data were confined to ultraviolet–visible spectroscopy, fluorescence spectroscopy, circular dichroism, and voltammetry, data profiles produced by other techniques may also be processed. Topics considered include binding sites and modes, cooperative and competitive small molecule binding, kinetics and thermodynamics of ligand binding, and the folding and unfolding of biopolymers. The applications of the MCR–ALS and PARAFAC methods reviewed were primarily published between 2008 and 2013.
Abstract:
The concept of energy gap(s) is useful for understanding the consequence of a small daily, weekly, or monthly positive energy balance and the inconspicuous shift in weight gain ultimately leading to overweight and obesity. Energy gap is a dynamic concept: an initial positive energy gap incurred via an increase in energy intake (or a decrease in physical activity) is not constant, may fade out with time if the initial conditions are maintained, and depends on the 'efficiency' with which the readjustment of the energy imbalance gap occurs with time. The metabolic response to an energy imbalance gap and the magnitude of the energy gap(s) can be estimated by at least two methods: i) assessment by longitudinal overfeeding studies, imposing (by design) an initial positive energy imbalance gap; ii) retrospective assessment based on epidemiological surveys, whereby the accumulated endogenous energy storage per unit of time is calculated from the change in body weight and body composition. To illustrate the difficulty of accurately assessing an energy gap, we have used as an example a recent epidemiological study which tracked changes in total energy intake (estimated by gross food availability) and body weight over 3 decades in the US, combined with total energy expenditure prediction from body weight using doubly labelled water data. At the population level, the study attempted to assess the cause of the energy gap, purported to be entirely due to increased food intake. Based on an estimate of change in energy intake judged to be more reliable (i.e. in the same study population) and together with calculations of simple energetic indices, our analysis suggests that conclusions about the fundamental causes of obesity development in a population (excess intake vs. low physical activity or both) are clouded by a high level of uncertainty.
Abstract:
The timing of widespread continental emergence is generally considered to have had a dramatic effect on the hydrological cycle, atmospheric conditions, and climate. New secondary ion mass spectrometry (SIMS) oxygen and laser-ablation–multicollector–inductively coupled plasma–mass spectrometry (LA-MC-ICP-MS) Lu-Hf isotopic results from dated zircon grains in the granitic Neoarchean Rum Jungle Complex provide a minimum time constraint on the emergence of continental crust above sea level for the North Australian craton. A 2535 ± 7 Ma monzogranite is characterized by magmatic zircon with slightly elevated δ18O (6.0‰–7.5‰ relative to Vienna standard mean ocean water [VSMOW]), consistent with some contribution to the magma from reworked supracrustal material. A supracrustal contribution to magma genesis is supported by the presence of metasedimentary rock enclaves, a large population of inherited zircon grains, and subchondritic zircon Hf (εHf = −6.6 to −4.1). A separate, distinct crustal source to the same magma is indicated by inherited zircon grains that are dominated by low δ18O values (2.5‰–4.8‰, n = 9 of 15) across a range of ages (3536–2598 Ma; εHf = −18.2 to +0.4). The low δ18O grains may be the product of one of two processes: (1) grain-scale diffusion of oxygen in zircon by exchange with a low δ18O magma or (2) several episodes of magmatic reworking of a Mesoarchean or older low δ18O source. Both scenarios require shallow crustal magmatism in emergent crust, to allow interaction with rocks altered by hydrothermal meteoric water in order to generate the low δ18O zircon. In the first scenario, assimilation of these altered rocks during Neoarchean magmatism generated low δ18O magma with which residual detrital zircons were able to exchange oxygen, while preserving their U-Pb systematics. 
In the second scenario, wholesale melting of the altered rocks occurred in several distinct events through the Mesoarchean, generating low δ18O magma from which zircon crystallized. Ultimately, in either scenario, the low δ18O zircons were entrained as inherited grains in a Neoarchean granite. The data suggest operation of a modern hydrological cycle by the Neoarchean and add to evidence for the increased emergence of continents by this time.
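For readers less familiar with the isotopic notation used in this abstract, the standard definitions of the two quantities are sketched below. The abstract itself does not spell them out; the symbols are the conventional ones (VSMOW is the reference standard named in the abstract, and CHUR, the chondritic uniform reservoir, is the usual reference for εHf, so "subchondritic" corresponds to εHf < 0).

```latex
% Delta notation: deviation of the sample ratio from VSMOW, in per mil
\delta^{18}\mathrm{O} =
  \left(
    \frac{\left(^{18}\mathrm{O}/^{16}\mathrm{O}\right)_{\mathrm{sample}}}
         {\left(^{18}\mathrm{O}/^{16}\mathrm{O}\right)_{\mathrm{VSMOW}}} - 1
  \right) \times 10^{3}

% Epsilon notation: deviation from CHUR at time t, in parts per ten thousand
\varepsilon_{\mathrm{Hf}}(t) =
  \left(
    \frac{\left(^{176}\mathrm{Hf}/^{177}\mathrm{Hf}\right)_{\mathrm{sample}}(t)}
         {\left(^{176}\mathrm{Hf}/^{177}\mathrm{Hf}\right)_{\mathrm{CHUR}}(t)} - 1
  \right) \times 10^{4}
```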
Abstract:
AIM: This study investigated the ability of an osteoconductive biphasic scaffold to simultaneously regenerate alveolar bone, periodontal ligament and cementum. MATERIALS AND METHODS: A biphasic scaffold was built by attaching a fused deposition modelled bone compartment to a melt electrospun periodontal compartment. The bone compartment was coated with a calcium phosphate (CaP) layer for increasing osteoconductivity, seeded with osteoblasts and cultured in vitro for 6 weeks. The resulting constructs were then complemented with the placement of PDL cell sheets on the periodontal compartment, attached to a dentin block and subcutaneously implanted into athymic rats for 8 weeks. Scanning electron microscopy, X-ray diffraction, alkaline phosphatase and DNA content quantification, confocal laser microscopy, micro computerized tomography and histological analysis were employed to evaluate the scaffold's performance. RESULTS: The in vitro study showed that alkaline phosphatase activity was significantly increased in the CaP-coated samples and they also displayed enhanced mineralization. In the in vivo study, significantly more bone formation was observed in the coated scaffolds. Histological analysis revealed that the large pore size of the periodontal compartment permitted vascularization of the cell sheets, and periodontal attachment was achieved at the dentin interface. CONCLUSIONS: This work demonstrates that the combination of cell sheet technology together with an osteoconductive biphasic scaffold could be utilized to address the limitations of current periodontal regeneration techniques.
Abstract:
Molecular phylogenetic studies of homologous sequences of nucleotides often assume that the underlying evolutionary process was globally stationary, reversible, and homogeneous (SRH), and that a model of evolution with one or more site-specific and time-reversible rate matrices (e.g., the GTR rate matrix) is enough to accurately model the evolution of data over the whole tree. However, an increasing body of data suggests that evolution under these conditions is an exception, rather than the norm. To address this issue, several non-SRH models of molecular evolution have been proposed, but they either ignore heterogeneity in the substitution process across sites (HAS) or assume it can be modeled accurately using the Γ distribution. As an alternative to these models of evolution, we introduce a family of mixture models that approximate HAS without the assumption of an underlying predefined statistical distribution. This family of mixture models is combined with non-SRH models of evolution that account for heterogeneity in the substitution process across lineages (HAL). We also present two algorithms for searching model space and identifying an optimal model of evolution that is less likely to over- or underparameterize the data. The performance of the two new algorithms was evaluated using alignments of nucleotides with 10 000 sites simulated under complex non-SRH conditions on a 25-tipped tree. The algorithms were found to be very successful, identifying the correct HAL model with a 75% success rate (the average success rate for assigning rate matrices to the tree's 48 edges was 99.25%) and, for the correct HAL model, identifying the correct HAS model with a 98% success rate. Finally, parameter estimates obtained under the correct HAL-HAS model were found to be accurate and precise.
The merits of our new algorithms were illustrated with an analysis of 42 337 second codon sites extracted from a concatenation of 106 alignments of orthologous genes encoded by the nuclear genomes of Saccharomyces cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, S. castellii, S. kluyveri, S. bayanus, and Candida albicans. Our results show that second codon sites in the ancestral genome of these species contained 49.1% invariable sites, 39.6% variable sites belonging to one rate category (V1), and 11.3% variable sites belonging to a second rate category (V2). The ancestral nucleotide content was found to differ markedly across these three sets of sites, and the evolutionary processes operating at the variable sites were found to be non-SRH and best modeled by a combination of eight edge-specific rate matrices (four for V1 and four for V2). The number of substitutions per site at the variable sites also differed markedly, with sites belonging to V1 evolving slower than those belonging to V2 along the lineages separating the seven species of Saccharomyces. Finally, sites belonging to V1 appeared to have ceased evolving along the lineages separating S. cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, and S. bayanus, implying that they might have become so selectively constrained that they could be considered invariable sites in these species.
Abstract:
A combined data matrix consisting of high performance liquid chromatography–diode array detector (HPLC–DAD) and inductively coupled plasma-mass spectrometry (ICP-MS) measurements of samples from the plant roots of Cortex moutan (CM) produced much better classification and prediction results than those obtained from either of the individual data sets. The HPLC peaks (organic components) of the CM samples and the ICP-MS measurements (trace metal elements) were investigated with principal component analysis (PCA) and linear discriminant analysis (LDA); the qualitative results suggested that discrimination of the CM samples from three different provinces was possible, with the combined matrix producing the best results. Three further methods, K-nearest neighbor (KNN), back-propagation artificial neural network (BP-ANN) and least squares support vector machines (LS-SVM), were applied for the classification and prediction of the samples. Again, the combined data matrix analyzed by the KNN method produced the best results (100% correct; prediction set data). Additionally, multiple linear regression (MLR) was utilized to explore any relationship between the organic constituents and the metal elements of the CM samples; the extracted linear regression equations showed that the essential metals as well as some metallic pollutants were related to the organic compounds on the basis of their concentrations.
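The idea of a combined data matrix analyzed by KNN can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say how the two blocks were scaled or fused, so column-wise autoscaling followed by simple concatenation is an assumption, and the function names and the tiny data set are invented purely for the example.

```python
import math
from collections import Counter


def autoscale(rows):
    """Centre each column to mean 0 and scale it to unit standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has sd 0; fall back to 1.0 to avoid dividing by zero.
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def fuse(block_a, block_b):
    """Assumed fusion step: autoscale each block, then join them sample by sample."""
    return [a + b for a, b in zip(autoscale(block_a), autoscale(block_b))]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest to the query."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def leave_one_out(samples, labels, k=3):
    """Predict each sample from all the others and return the predictions."""
    predictions = []
    for i, query in enumerate(samples):
        rest_x = samples[:i] + samples[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        predictions.append(knn_predict(rest_x, rest_y, query, k))
    return predictions


# Invented toy numbers standing in for HPLC peak areas and ICP-MS concentrations.
hplc = [[1.0, 2.0], [1.1, 2.1], [0.9, 1.9], [5.0, 6.0], [5.1, 6.2], [4.9, 5.8]]
icpms = [[10.0], [11.0], [9.0], [30.0], [31.0], [29.0]]
origin = ["A", "A", "A", "B", "B", "B"]

combined = fuse(hplc, icpms)
print(leave_one_out(combined, origin))
```

Scaling each block before concatenation keeps a block with large numeric values from dominating the Euclidean distances, which is one common reason for fusing data this way.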
Abstract:
Based on protein molecular dynamics, we investigate the fractal properties of energy, pressure and volume time series using multifractal detrended fluctuation analysis (MF-DFA), and the topological and fractal properties of their converted horizontal visibility graphs (HVGs). The energy parameters of protein dynamics we considered are bonded potential, angle potential, dihedral potential, improper potential, kinetic energy, Van der Waals potential, electrostatic potential, total energy and potential energy. The shape of the h(q) curves from MF-DFA indicates that these time series are multifractal. The numerical values of the exponent h(2) of MF-DFA show that the series of total energy and potential energy are non-stationary and anti-persistent; the other time series are stationary and persistent, apart from the pressure series (with H ≈ 0.5, indicating the absence of long-range correlation). The degree distributions of their converted HVGs show that these networks are exponential. The results of fractal analysis show that fractality exists in these converted HVGs. For each energy, pressure or volume parameter, it is found that the values of h(2) of MF-DFA on the time series, the exponent λ of the exponential degree distribution and the fractal dimension d_B of their converted HVGs do not change much for different proteins (indicating some universality). We also found that, after averaging over all proteins, there is a linear relationship between ⟨h(2)⟩ (from MF-DFA on time series) and ⟨d_B⟩ of the converted HVGs for the different energy, pressure and volume parameters.
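The conversion of a time series into a horizontal visibility graph can be sketched in a few lines. The code below uses the usual HVG criterion (two points are linked when every value strictly between them is lower than both); the function names and the four-point toy series are illustrative assumptions, and the sketch makes no claim about how the authors implemented the conversion.

```python
from collections import Counter


def horizontal_visibility_edges(series):
    """Return the HVG edge set: i and j are linked if all values between them
    are strictly lower than min(series[i], series[j])."""
    edges = set()
    n = len(series)
    for i in range(n - 1):
        highest_between = float("-inf")  # max of the values strictly between i and j
        for j in range(i + 1, n):
            if highest_between < min(series[i], series[j]):
                edges.add((i, j))
            highest_between = max(highest_between, series[j])
            if highest_between >= series[i]:
                break  # point i is now blocked from everything further right
    return edges


def node_degrees(n_nodes, edges):
    """Degree of every node, in series order."""
    degree = [0] * n_nodes
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return degree


toy_series = [3, 1, 2, 4]
toy_edges = horizontal_visibility_edges(toy_series)
toy_degrees = node_degrees(len(toy_series), toy_edges)
print(sorted(toy_edges))
print(Counter(toy_degrees))  # empirical degree distribution
```

On a long series, the counts returned by `Counter` are what one would plot on a semi-logarithmic scale to check for the exponential degree distribution described in the abstract.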
Abstract:
Debates on gene patents have necessitated the analysis of patents that disclose and reference human sequences. In this study, we built an automated classifier that assigns sequences to one of nine predefined categories according to their functional roles in patent claims by applying natural language processing and supervised learning techniques. To improve its correctness, we experimented with various feature mappings, resulting in the maximal accuracy of 79%.
Abstract:
A novel combined near- and mid-infrared (NIR and MIR) spectroscopic method has been researched and developed for the analysis of complex substances such as the Traditional Chinese Medicine (TCM) Illicium verum Hook. F. (IVHF) and its noxious adulterant, Illicium lanceolatum A.C. Smith (ILACS). Three types of spectral matrix were submitted for classification with the linear discriminant analysis (LDA) method. The data were pretreated with either the successive projections algorithm (SPA) or the discrete wavelet transform (DWT) method. The SPA method performed somewhat better, principally because it required fewer spectral features for its pretreatment model. Thus, the NIR and MIR matrices, as well as the combined NIR/MIR one, were pretreated by the SPA method and then analysed by LDA. This approach enabled the prediction and classification of the IVHF, ILACS and mixed samples. The MIR spectral data produced somewhat better classification rates than the NIR data. However, the best results were obtained from the combined NIR/MIR data matrix, with 95–100% correct classifications for calibration, validation and prediction. Principal component analysis (PCA) of the three types of spectral data supported the results obtained with the LDA classification method.
Abstract:
Species identification based on short sequences of DNA markers, that is, DNA barcoding, has emerged as an integral part of modern taxonomy. However, software for the analysis of large and multilocus barcoding data sets is scarce. The Basic Local Alignment Search Tool (BLAST) is currently the fastest tool capable of handling large databases (e.g. >5000 sequences), but its accuracy is a concern, and it has been criticized for its local optimization. However, more accurate current software requires sequence alignment or complex calculations, which are time-consuming when dealing with large data sets during data preprocessing or the search stage. Therefore, it is imperative to develop a practical program for both accurate and scalable species identification for DNA barcoding. In this context, we present VIP Barcoding: user-friendly software with a graphical user interface for rapid DNA barcoding. It adopts a hybrid, two-stage algorithm. First, an alignment-free composition vector (CV) method is utilized to reduce the search space by screening a reference database. The alignment-based K2P distance nearest-neighbour method is then employed to analyse the smaller data set generated in the first stage. In comparison with other software, we demonstrate that VIP Barcoding has (i) higher accuracy than Blastn and several alignment-free methods and (ii) higher scalability than alignment-based distance methods and character-based methods. These results suggest that this platform is able to deal with both large-scale and multilocus barcoding data with accuracy and can contribute to DNA barcoding for modern taxonomy. VIP Barcoding is free and available at http://msl.sls.cuhk.edu.hk/vipbarcoding/.
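The second stage described above, nearest-neighbour assignment by K2P distance, can be sketched as follows. This is not VIP Barcoding's code: it assumes the sequences are already aligned to equal length, it omits the composition-vector screening stage entirely, and the function names and toy sequences are invented for illustration. The distance itself is the standard Kimura two-parameter formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        return math.inf  # too divergent for the formula to be defined
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def nearest_reference(query, references):
    """Return (name, distance) of the reference closest to the query."""
    return min(((name, k2p_distance(query, seq)) for name, seq in references.items()),
               key=lambda item: item[1])


# Invented toy references; real barcodes are hundreds of bases long.
toy_references = {
    "species_1": "ACGTACGTAC",
    "species_2": "ACGTACGTAT",
    "species_3": "TTGTACGAAC",
}
print(nearest_reference("ACGTACGTAT", toy_references))
```

Because every query is compared only against the references that survive screening, the cost of this alignment-based step depends on the size of the screened subset rather than the whole database, which is the rationale for the two-stage design the abstract describes.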
Abstract:
Bayesian networks (BNs) are tools for representing expert knowledge or evidence. They are especially useful for synthesising evidence or belief concerning a complex intervention, assessing the sensitivity of outcomes to different situations or contextual frameworks and framing decision problems that involve alternative types of intervention. Bayesian networks are useful extensions to logic maps when initiating a review or to facilitate synthesis and bridge the gap between evidence acquisition and decision-making. Formal elicitation techniques allow development of BNs on the basis of expert opinion. Such applications are useful alternatives to ‘empty’ reviews, which identify knowledge gaps but fail to support decision-making. Where review evidence exists, it can inform the development of a BN. We illustrate the construction of a BN using a motivating example that demonstrates how BNs can ensure coherence, transparently structure the problem addressed by a complex intervention and assess sensitivity to context, all of which are critical components of robust reviews of complex interventions. We suggest that BNs should be utilised to routinely synthesise reviews of complex interventions or empty reviews where decisions must be made despite poor evidence.
Abstract:
Ankylosing spondylitis (AS) is a common, highly heritable, inflammatory arthritis for which HLA-B*27 is the major genetic risk factor, although its role in the aetiology of AS remains elusive. To better understand the genetic basis of the MHC susceptibility loci, we genotyped 7,264 MHC SNPs in 22,647 AS cases and controls of European descent. We imputed SNPs, classical HLA alleles and amino-acid residues within HLA proteins, and tested these for association with AS status. Here we show that in addition to effects due to HLA-B*27 alleles, several other HLA-B alleles also affect susceptibility. After controlling for the associated haplotypes in HLA-B, we observe independent associations with variants in the HLA-A, HLA-DPB1 and HLA-DRB1 loci. We also demonstrate that the ERAP1 SNP rs30187 association is not restricted to carriers of HLA-B*27 but is also found in HLA-B*40:01 carriers independently of HLA-B*27 genotype.
Abstract:
Shared aetiopathogenic factors among immune-mediated diseases have long been suggested by their co-familiality and co-occurrence, and molecular support has been provided by analysis of human leukocyte antigen (HLA) haplotypes and genome-wide association studies. The interrelationships can now be better appreciated following the genotyping of large immune disease sample sets on a shared SNP array: the 'Immunochip'. Here, we systematically analyse loci shared among major immune-mediated diseases. This reveals that several diseases share multiple susceptibility loci, but there are many nuances. The most associated variant at a given locus frequently differs and, even when shared, the same allele often has opposite associations. Interestingly, risk alleles conferring the largest effect sizes are usually disease-specific. These factors help to explain why early evidence of extensive 'sharing' is not always reflected in epidemiological overlap. © 2013 Macmillan Publishers Limited. All rights reserved.
Abstract:
MicroRNAs (miRNAs) are small non-coding RNAs of 20 nt in length that are capable of modulating gene expression post-transcriptionally. Although miRNAs have been implicated in cancer, including breast cancer, the regulation of miRNA transcription and the role of defects in this process in cancer are not well understood. In this study we have mapped the promoters of 93 breast cancer-associated miRNAs, and then looked for associations between DNA methylation of 15 of these promoters and miRNA expression in breast cancer cells. The miRNA promoters with the clearest association between DNA methylation and expression included a previously described and a novel promoter of the Hsa-mir-200b cluster. The novel promoter of the Hsa-mir-200b cluster, denoted P2, is located 2 kb upstream of the 5′ stemloop and maps within a CpG island. P2 has comparable promoter activity to the previously reported promoter (P1), and is able to drive the expression of miR-200b in its endogenous genomic context. DNA methylation of both P1 and P2 was inversely associated with miR-200b expression in eight out of nine breast cancer cell lines, and in vitro methylation of both promoters repressed their activity in reporter assays. In clinical samples, P1 and P2 were differentially methylated, with methylation inversely associated with miR-200b expression. P1 was hypermethylated in metastatic lymph nodes compared with matched primary breast tumours, whereas P2 hypermethylation was associated with loss of either oestrogen receptor or progesterone receptor. Hypomethylation of P2 was associated with gain of HER2 and androgen receptor expression. These data suggest an association between miR-200b regulation and breast cancer subtype and a potential use of DNA methylation of miRNA promoters as a component of a suite of breast cancer biomarkers.
Abstract:
Genome-wide association studies (GWASs) have been successful at identifying single-nucleotide polymorphisms (SNPs) highly associated with common traits; however, a great deal of the heritable variation associated with common traits remains unaccounted for within the genome. Genome-wide complex trait analysis (GCTA) is a statistical method that applies a linear mixed model to estimate the phenotypic variance of complex traits explained by genome-wide SNPs, including those not associated with the trait in a GWAS. We applied GCTA to 8 cohorts containing 7096 case and 19 455 control individuals of European ancestry in order to examine the missing heritability present in Parkinson's disease (PD). We meta-analyzed our initial results to produce robust heritability estimates for PD types across cohorts. Our results identify 27% (95% CI 17-38, P = 8.08E-08) phenotypic variance associated with all types of PD, 15% (95% CI -0.2 to 33, P = 0.09) phenotypic variance associated with early-onset PD and 31% (95% CI 17-44, P = 1.34E-05) phenotypic variance associated with late-onset PD. This is a substantial increase from the genetic variance identified by top GWAS hits alone (between 3 and 5%) and indicates there are substantially more risk loci to be identified. Our results suggest that although GWASs are a useful tool in identifying the most common variants associated with complex disease, a great many common variants of small effect remain to be discovered. © Published by Oxford University Press 2012.