57 results for Data clustering. Fuzzy C-Means. Cluster centers initialization. Validation indices
Abstract:
Background: The trithorax group (trxG) and Polycomb group (PcG) proteins are responsible for the maintenance of stable transcriptional patterns of many developmental regulators. They bind to specific regions of DNA and direct the post-translational modifications of histones, playing a role in the dynamics of chromatin structure. Results: We have performed genome-wide expression studies of trx and ash2 mutants in Drosophila melanogaster. Using computational analysis of our microarray data, we have identified 25 clusters of genes potentially regulated by TRX. Most of these clusters consist of genes that encode structural proteins involved in cuticle formation. This organization appears to be a distinctive feature of the regulatory networks of TRX and other chromatin regulators, since we have observed the same arrangement in clusters after experiments performed with ASH2, as well as in experiments performed by others with NURF, dMyc, and ASH1. We have also found many of these clusters to be significantly conserved in D. simulans, D. yakuba, D. pseudoobscura and partially in Anopheles gambiae. Conclusion: The analysis of genes governed by chromatin regulators has led to the identification of clusters of functionally related genes conserved in other insect species, suggesting this chromosomal organization is biologically important. Moreover, our results indicate that TRX and other chromatin regulators may act globally on chromatin domains that contain transcriptionally co-regulated genes.
Abstract:
Introduction: Breastfeeding effects on cognition are attributed to long-chain polyunsaturated fatty acids (LC-PUFAs), but controversy persists. Genetic variation in fatty acid desaturase (FADS) and elongase (ELOVL) enzymes has been overlooked when studying the effects of LC-PUFA supply on cognition. We aimed: 1) to determine whether maternal genetic variants in the FADS cluster and ELOVL genes contribute to differences in LC-PUFA levels in colostrum; 2) to analyze whether these maternal variants are related to child cognition; and 3) to assess whether children's variants modify breastfeeding effects on cognition. Methods: Data come from two population-based birth cohorts (n = 400 mother-child pairs from INMA-Sabadell; and n = 340 children from INMA-Menorca). LC-PUFAs were measured in 270 colostrum samples from INMA-Sabadell. Tag SNPs were genotyped both in mothers and children (13 in the FADS cluster, 6 in ELOVL2, and 7 in ELOVL5). Child cognition was assessed at 14 mo and 4 y using the Bayley Scales of Infant Development and the McCarthy Scales of Children's Abilities, respectively. Results: Children of mothers carrying genetic variants associated with lower FADS1 activity (regulating AA and EPA synthesis), higher FADS2 activity (regulating DHA synthesis), and with higher EPA/AA and DHA/AA ratios in colostrum showed a significant advantage in cognition at 14 mo (3.5 to 5.3 points). Not being breastfed conferred an 8- to 9-point disadvantage in cognition among children GG homozygote for rs174468 (low FADS1 activity) but not among those with the A allele. Moreover, not being breastfed resulted in a disadvantage in cognition (5 to 8 points) among children CC homozygote for rs2397142 (low ELOVL5 activity), but not among those carrying the G allele. Conclusion: Genetically determined maternal supplies of LC-PUFAs during pregnancy and lactation appear to be crucial for child cognition. Breastfeeding effects on cognition are modified by child genetic variation in fatty acid desaturase and elongase enzymes.
Abstract:
The interaction of atomic F and Cl with Si4H9 and Ge4H9 cluster models has been studied by using ab initio pseudopotentials and basis sets of increasing complexity. The results show that the effect of d orbitals is important in order to reproduce the experimental findings. However, the use of polarization functions in the atoms which are directly involved in the chemisorption bond leads to results which are very close to those obtained using extended basis sets. The local nature of the chemisorption bond is also interpreted by means of a Mulliken population analysis. For F-Si4H9 and Cl-Si4H9 the present results are in good agreement with previous ab initio all-electron calculations, and for the chemisorption of Cl on Si(111) and Ge(111) surfaces, good agreement is found with respect to the available experimental results as well as with previous slab calculations based on the local-density-functional formalism.
Abstract:
The O 1s x-ray photoelectron spectroscopy spectrum for Al(111)/O at 300 K shows two components whose behavior as a function of time and variation of detection angle is consistent with either (a) a surface species represented by the higher binding-energy (BE) component and a subsurface species represented by the lower BE component, or (b) small close-packed oxygen islands with the interior atoms represented by the lower BE component and the perimeter atoms by the higher BE component. We have modeled both situations using ab initio Hartree-Fock wave functions for clusters of Al and O atoms. For an O atom in a threefold site, it was found that a below-surface position gave a higher O 1s BE than an above-surface position, incompatible with interpretation (a). This change in the O 1s BE could arise because the bond for O to Al may have a more covalent character when the O is below the surface than when it is above the surface. We present evidence consistent with this view. An O adatom island with all the O atoms in threefold sites gives calculated O 1s BEs which are significantly higher for the perimeter O atoms. Further, the results for an isolated O island without the Al substrate present also give higher BEs for the perimeter atoms. Both these results are consistent with interpretation (b). Published scanning-tunneling-microscopy data support the suggestion that the chemisorbed state consists of small, close-packed islands, whereas the presence of two vibrational modes in high-resolution electron-energy-loss spectroscopy data has been interpreted as representing surface and subsurface oxygen atoms. In light of the present results, we suggest that a vibrational interpretation in terms of interior and perimeter adatoms should be considered.
Abstract:
We present an analysis of the M-O chemical bonding in the binary oxides MgO, CaO, SrO, BaO, and Al2O3 based on ab initio wave functions. The model used to represent the local environment of a metal cation in the bulk oxide is an MO6 cluster which also includes the effect of the lattice Madelung potential. The analysis of the wave functions for these clusters leads to the conclusion that all the alkaline-earth oxides must be regarded as highly ionic oxides; however, the ionic character of the oxides decreases as one goes from MgO, almost perfectly ionic, to BaO. In Al2O3 the ionic character is further reduced; however, even in this case, the departure from the ideal, fully ionic, model of Al3+ is not exceptionally large. These conclusions are based on three measures: a decomposition of the Mq+-Oq- interaction energy, the number of electrons associated with the oxygen ions as obtained from a projection operator technique, and the analysis of the cation core-level binding energies. The increasing covalent character along the series MgO, CaO, SrO, and BaO is discussed in view of the existing theoretical models and experimental data.
Abstract:
Aggregates of oxygen vacancies (F centers) represent a particular form of point defects in ionic crystals. In this study we have considered the combination of two oxygen vacancies, the M center, in the bulk and on the surface of MgO by means of cluster model calculations. Both neutral and charged forms of the defect M and M+ have been taken into account. The ground state of the M center is characterized by the presence of two doubly occupied impurity levels in the gap of the material; in M+ centers the highest level is singly occupied. For the ground-state properties we used a gradient corrected density functional theory approach. The dipole-allowed singlet-to-singlet and doublet-to-doublet electronic transitions have been determined by means of explicitly correlated multireference second-order perturbation theory calculations. These have been compared with optical transitions determined with the time-dependent density functional theory formalism. The results show that bulk M and M+ centers give rise to intense absorptions at about 4.4 and 4.0 eV, respectively. Another less intense transition at 1.3 eV has also been found for the M+ center. On the surface the transitions occur at 1.6 eV (M+) and 2 eV (M). The results are compared with recently reported electron energy loss spectroscopy spectra on MgO thin films.
Abstract:
Majolica pottery is one of the most characteristic types of tableware produced during the Medieval and Renaissance periods. Majolica technology was introduced to the Iberian Peninsula by Islamic artisans during Medieval times, and its production and popularity rapidly spread throughout Spain and eventually to other locations in Europe and the Americas. The prestige and importance of Spanish majolica was very high. Consequently, this ware was imported profusely to the Americas during the Spanish Colonial period. Nowadays, majolica pottery serves as an important horizon marker at Spanish colonial sites. A preliminary study of Spanish-produced majolica was conducted on a set of 246 samples from the 12 primary majolica production centers on the Iberian Peninsula. The samples were analyzed by neutron activation analysis (NAA), and the resulting data were interpreted using an array of multivariate statistical procedures. Our results show a clear discrimination between different production centers. In some cases, our data allow one to distinguish among shards coming from the same production location, suggesting that different workshops or groups of workshops were responsible for production of this pre-industrial pottery.
Abstract:
Biometric system performance can be improved by means of data fusion. Several kinds of information can be fused in order to obtain a more accurate classification (identification or verification) of an input sample. In this paper we present a method for computing the weights in a weighted sum fusion for score combinations, by means of a likelihood model. The maximum likelihood estimation is set as a linear programming problem. The scores are derived from GMM classifiers, each working on a different feature extractor. Our experimental results assessed the robustness of the system to changes over time (different sessions) and to a change of microphone. The improvements obtained were significant (error bars of two standard deviations) relative to a uniform weighted sum, a uniform weighted product, or the best single classifier. The computational cost of the proposed method scales with the number of scores to be fused in the same way as the simplex method for linear programming.
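As a rough illustration of the weighted sum fusion mentioned in this abstract, the sketch below combines one score per classifier using fixed weights and compares the result with the uniform baseline. The function name `fuse_scores`, the score values, and the weights are hypothetical; the likelihood-based linear program that the paper uses to estimate the weights is not reproduced here.

```python
from typing import Sequence


def fuse_scores(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum fusion: combine one score per classifier into a single score."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalise so the weights sum to 1 (an assumption of this sketch).
    return sum(w / total * s for w, s in zip(weights, scores))


# Hypothetical scores from three classifiers for one input sample.
example_scores = [0.5, 1.0, 0.0]
uniform = fuse_scores(example_scores, [1.0, 1.0, 1.0])   # uniform baseline
weighted = fuse_scores(example_scores, [1.0, 3.0, 0.0])  # illustrative weights
```

In a setup like the one described, the weights would come from the estimation step rather than being chosen by hand.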
Abstract:
The study of transcriptional regulation often needs the integration of diverse yet independent data. In the present work, sequence conservation, prediction of transcription factor binding sites (TFBS) and gene expression analysis have been applied to the detection of putative transcription factor (TF) modules in the regulatory region of the FGFR3 oncogene. Several TFs with conserved binding sites in the FGFR3 regulatory region have shown high positive or negative correlation with FGFR3 expression both in urothelial carcinoma and in benign nevi. By means of conserved TF cluster analysis, two different TF modules have been identified in the promoter and first intron of the FGFR3 gene. These modules contain activating AP2, E2F, E47 and SP1 binding sites plus motifs for EGR with possible repressor function.
Abstract:
We present molecular dynamics (MD) simulation results for dense fluids of ultrasoft, fully penetrable particles. These are a binary mixture and a polydisperse system of particles interacting via the generalized exponential model, which is known to yield cluster crystal phases for the corresponding monodisperse systems. Because of the dispersity in the particle size, the systems investigated in this work do not crystallize and form disordered cluster phases. The clustering transition appears as a smooth crossover to a regime in which particles are mostly located in clusters, isolated particles being infrequent. The analysis of the internal cluster structure reveals microsegregation of the big and small particles, with a strong homo-coordination in the binary mixture. Upon further lowering the temperature below the clustering transition, the motion of the clusters' centers of mass slows down dramatically, giving way to a cluster glass transition. In the cluster glass, the diffusivities remain finite and display an activated temperature dependence, indicating that relaxation in the cluster glass occurs via particle hopping in a nearly arrested matrix of clusters. Finally, we discuss the influence of the microscopic dynamics on the transport properties by comparing the MD results with Monte Carlo simulations.
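For orientation, the generalized exponential model of index n (GEM-n) is usually written as the bounded pair potential below. The symbols ε (energy scale), σ (particle diameter) and n (exponent) are standard notation assumed here; the abstract does not state the parameter values used, and cluster crystal phases are generally associated with n > 2.

```latex
v(r) = \varepsilon \exp\!\left[-\left(\frac{r}{\sigma}\right)^{n}\right], \qquad n > 2
```

Because v(0) = ε is finite, particles can fully overlap, which is what allows several of them to share a single cluster site.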
Abstract:
Vagueness and high dimensionality are common features of current data. This paper presents an approach to identifying conceptual structures among fuzzy three-dimensional data sets in order to obtain a conceptual hierarchy. We propose a fuzzy extension of Galois connections that allows us to prove an isomorphism theorem between fuzzy set closures, which is the basis for generating lattice-ordered sets.
Abstract:
Dissolved organic matter (DOM) is a complex mixture of organic compounds, ubiquitous in marine and freshwater systems. Fluorescence spectroscopy, by means of Excitation-Emission Matrices (EEM), has become an indispensable tool to study DOM sources, transport and fate in aquatic ecosystems. However, the statistical treatment of large and heterogeneous EEM data sets still represents an important challenge for biogeochemists. Recently, Self-Organising Maps (SOM) have been proposed as a tool to explore patterns in large EEM data sets. SOM is a pattern recognition method which clusters the input EEMs and reduces their dimensionality without relying on any assumption about the data structure. In this paper, we show how SOM, coupled with a correlation analysis of the component planes, can be used both to explore patterns among samples and to identify individual fluorescence components. We analysed a large and heterogeneous EEM data set, including samples from a river catchment collected under a range of hydrological conditions, along a 60-km downstream gradient, and under the influence of different degrees of anthropogenic impact. According to our results, chemical industry effluents appeared to have unique and distinctive spectral characteristics. On the other hand, river samples collected under flash flood conditions showed homogeneous EEM shapes. The correlation analysis of the component planes suggested the presence of four fluorescence components, consistent with DOM components previously described in the literature. A remarkable strength of this methodology was that outlier samples appeared naturally integrated in the analysis. We conclude that SOM coupled with a correlation analysis procedure is a promising tool for studying large and heterogeneous EEM data sets.
Abstract:
Background: Current advances in genomics, proteomics and other areas of molecular biology make the identification and reconstruction of novel pathways an emerging area of great interest. One such class of pathways is involved in the biogenesis of Iron-Sulfur Clusters (ISC). Results: Our goal is the development of a new approach based on the use and combination of mathematical, theoretical and computational methods to identify the topology of a target network. In this approach, mathematical models play a central role in the evaluation of the alternative network structures that arise from literature data-mining, phylogenetic profiling, structural methods, and human curation. As a test case, we reconstruct the topology of the reaction and regulatory network for the mitochondrial ISC biogenesis pathway in S. cerevisiae. Predictions regarding how proteins act in ISC biogenesis are validated by comparison with published experimental results. For example, the predicted role of Arh1 and Yah1 and some of the interactions we predict for Grx5 both match experimental evidence. A putative role for frataxin in directly regulating mitochondrial iron import is discarded from our analysis, which also agrees with published experimental results. Additionally, we propose a number of experiments for testing other predictions and further improving the identification of the network structure. Conclusion: We propose and apply an iterative in silico procedure for predictive reconstruction of the network topology of metabolic pathways. The procedure combines structural bioinformatics tools and mathematical modeling techniques that allow the reconstruction of biochemical networks. Using the Iron-Sulfur cluster biogenesis in S. cerevisiae as a test case, we indicate how this procedure can be used to analyze and validate the network model against experimental results. Critical evaluation of the results obtained through this procedure allows new wet lab experiments to be devised to confirm its predictions or provide alternative explanations for further improving the models.
A priori parameterisation of the CERES soil-crop models and tests against several European data sets
Abstract:
Mechanistic soil-crop models have become indispensable tools to investigate the effect of management practices on the productivity or environmental impacts of arable crops. Ideally these models may claim to be universally applicable because they simulate the major processes governing the fate of inputs such as fertiliser nitrogen or pesticides. However, because they deal with complex systems and uncertain phenomena, site-specific calibration is usually a prerequisite to ensure their predictions are realistic. This statement implies that some experimental knowledge on the system to be simulated should be available prior to any modelling attempt, and raises a tremendous limitation to practical applications of models. Because the demand for more general simulation results is high, modellers have nevertheless taken the bold step of extrapolating a model tested within a limited sample of real conditions to a much larger domain. While methodological questions are often disregarded in this extrapolation process, they are specifically addressed in this paper, in particular the issue of the a priori parameterisation of models. We thus implemented and tested a standard procedure to parameterise the soil components of a modified version of the CERES models. The procedure converts routinely available soil properties into functional characteristics by means of pedo-transfer functions. The resulting predictions of soil water and nitrogen dynamics, as well as crop biomass, nitrogen content and leaf area index were compared to observations from trials conducted in five locations across Europe (southern Italy, northern Spain, northern France and northern Germany). In three cases, the model’s performance was judged acceptable when compared to experimental errors on the measurements, based on a test of the model’s root mean squared error (RMSE). Significant deviations between observations and model outputs were however noted at all sites, and could be ascribed to various model routines. In decreasing order of importance, these were: water balance, the turnover of soil organic matter, and crop N uptake. A better match to field observations could therefore be achieved by visually adjusting related parameters, such as field-capacity water content or the size of soil microbial biomass. As a result, model predictions fell within the measurement errors at all sites for most variables, and the model’s RMSE was within the range of published values for similar tests. We conclude that the proposed a priori method yields acceptable simulations with only a 50% probability, a figure which may be greatly increased through a posteriori calibration. Modellers should thus exercise caution when extrapolating their models to a large sample of pedo-climatic conditions for which they have only limited information.
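For reference, the root mean squared error used in tests of this kind is conventionally defined as below. The notation is assumed here and does not come from the abstract: S_i is the simulated value, O_i the corresponding observation, and N the number of observation dates.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(S_i - O_i\right)^{2}}
```

Under this definition, judging a model acceptable amounts to checking that the RMSE is no larger than the experimental error on the measurements; the exact test used in the paper is not given in the abstract.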
Abstract:
PLFC is a first-order possibilistic logic dealing with fuzzy constants and fuzzily restricted quantifiers. The refutation proof method in PLFC is mainly based on a generalized resolution rule which allows an implicit graded unification among fuzzy constants. However, unification for precise object constants is classical. In order to use PLFC for similarity-based reasoning, in this paper we extend a Horn-rule sublogic of PLFC with similarity-based unification of object constants. The Horn-rule sublogic of PLFC we consider deals only with disjunctive fuzzy constants and is equipped with a simple and efficient version of the PLFC proof method. At the semantic level, it is extended by equipping each sort with a fuzzy similarity relation, and at the syntactic level, by fuzzily “enlarging” each non-fuzzy object constant in the antecedent of a Horn-rule by means of a fuzzy similarity relation.