990 results for Chemical space diagram
Abstract:
Analyzing and modeling relationships between the structure of chemical compounds, their physico-chemical properties, and biological or toxic effects in chemical datasets is a challenging task for scientific researchers in the field of cheminformatics. Therefore, (Q)SAR model validation is essential to ensure future model predictivity on unseen compounds. Proper validation is also one of the requirements of regulatory authorities in order to approve the use of a model in real-world scenarios as an alternative testing method. However, the question of how to validate a (Q)SAR model is still under discussion. In this work, we empirically compare k-fold cross-validation with external test set validation. The introduced workflow makes it possible to apply the built and validated models to large amounts of unseen data and to compare the performance of the different validation approaches. Our experimental results indicate that cross-validation produces (Q)SAR models with higher predictivity than external test set validation and reduces the variance of the results. Statistical validation is important for evaluating the performance of (Q)SAR models, but it does not help the user to better understand the properties of the model or the underlying correlations. We present the 3D molecular viewer CheS-Mapper (Chemical Space Mapper), which arranges compounds in 3D space such that their spatial proximity reflects their similarity. The user can indirectly determine similarity by selecting which features to employ in the process. The tool can use and calculate different kinds of features, such as structural fragments and quantitative chemical descriptors. Comprehensive functionalities, including clustering, alignment of compounds according to their 3D structure, and feature highlighting, help the chemist to better understand patterns and regularities and to relate the observations to established scientific knowledge.
Even though visualization tools for analyzing (Q)SAR information in small molecule datasets exist, integrated visualization methods that allow for the investigation of model validation results are still lacking. We propose visual validation as an approach for the graphical inspection of (Q)SAR model validation results. New functionalities in CheS-Mapper 2.0 facilitate the analysis of (Q)SAR information and allow the visual validation of (Q)SAR models. The tool enables the comparison of model predictions to the actual activity in feature space. Our approach reveals whether the endpoint is modeled too specifically or too generically and highlights common properties of misclassified compounds. Moreover, the researcher can use CheS-Mapper to inspect how the (Q)SAR model predicts activity cliffs. The CheS-Mapper software is freely available at http://ches-mapper.org.
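As a rough illustration of the two validation schemes compared in the abstract above, the sketch below builds the index splits for k-fold cross-validation and for a single external test set. The function names, the 20 % test fraction, and the fixed seed are assumptions made for this example; they are not taken from the thesis or from CheS-Mapper.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle range(n_samples) and deal it into k disjoint, near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validation_splits(n_samples, k, seed=0):
    """Yield (train, test) index lists; each compound is tested exactly once."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test_fold in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield sorted(train), sorted(test_fold)


def external_split(n_samples, test_fraction=0.2, seed=0):
    """Hold out one fixed test set; the model is built once on the remainder."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


if __name__ == "__main__":
    for train, test in cross_validation_splits(10, k=5):
        print("CV fold: train", train, "test", test)
    print("External split:", external_split(10))
```

In cross-validation every compound contributes to the performance estimate, whereas the external split bases the estimate on one held-out subset only. This is one intuitive reason the variance of the two estimates can differ.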
Abstract:
This thesis introduces a flexible visual data exploration framework that combines advanced projection algorithms from the machine learning domain with visual representation techniques developed in the information visualisation domain to help a user explore and understand large multi-dimensional datasets effectively. The advantage of such a framework over other techniques currently available to domain experts is that the user is directly involved in the data mining process and advanced machine learning algorithms are employed for better projection. A hierarchical visualisation model guided by a domain expert allows them to obtain an informed segmentation of the input space. Two other components of this thesis exploit properties of these principled probabilistic projection algorithms to develop a guided mixture of local experts algorithm, which provides robust prediction, and a model that estimates feature saliency simultaneously with the training of a projection algorithm. Local models are useful since a single global model cannot capture the full variability of a heterogeneous data space such as the chemical space. Probabilistic hierarchical visualisation techniques provide an effective soft segmentation of an input space by a visualisation hierarchy whose leaf nodes represent different regions of the input space. We use this soft segmentation to develop a guided mixture of local experts (GME) algorithm, which is appropriate for the heterogeneous datasets found in chemoinformatics problems. Moreover, in this approach the domain experts are more involved in the model development process, which is suitable for an intuition- and domain-knowledge-driven task such as drug discovery. We also derive a generative topographic mapping (GTM) based data visualisation approach that estimates feature saliency simultaneously with the training of a visualisation model.
Abstract:
Human activities represent a significant burden on the global water cycle, with large and increasing demands placed on limited water resources by manufacturing, energy production and domestic water use. In addition to changing the quantity of available water resources, human activities lead to changes in water quality by introducing a large and often poorly-characterized array of chemical pollutants, which may negatively impact biodiversity in aquatic ecosystems, leading to impairment of valuable ecosystem functions and services. Domestic and industrial wastewaters represent a significant source of pollution to the aquatic environment due to inadequate or incomplete removal of chemicals introduced into waters by human activities. Currently, incomplete chemical characterization of treated wastewaters limits comprehensive risk assessment of this ubiquitous impact to water. In particular, a significant fraction of the organic chemical composition of treated industrial and domestic wastewaters remains uncharacterized at the molecular level. Efforts aimed at reducing the impacts of water pollution on aquatic ecosystems critically require knowledge of the composition of wastewaters to develop interventions capable of protecting our precious natural water resources.
The goal of this dissertation was to develop a robust, extensible and high-throughput framework for the comprehensive characterization of organic micropollutants in wastewaters by high-resolution accurate-mass mass spectrometry. High-resolution mass spectrometry provides the most powerful analytical technique available for assessing the occurrence and fate of organic pollutants in the water cycle. However, significant limitations in data processing, analysis and interpretation have limited this technique in achieving comprehensive characterization of organic pollutants occurring in natural and built environments. My work aimed to address these challenges by development of automated workflows for the structural characterization of organic pollutants in wastewater and wastewater impacted environments by high-resolution mass spectrometry, and to apply these methods in combination with novel data handling routines to conduct detailed fate studies of wastewater-derived organic micropollutants in the aquatic environment.
In Chapter 2, chemoinformatic tools were implemented along with novel non-targeted mass spectrometric analytical methods to characterize, map, and explore an environmentally-relevant “chemical space” in municipal wastewater. This was accomplished by characterizing the molecular composition of known wastewater-derived organic pollutants and substances that are prioritized as potential wastewater contaminants, using these databases to evaluate the pollutant-likeness of structures postulated for unknown organic compounds that I detected in wastewater extracts using high-resolution mass spectrometry approaches. Results showed that application of multiple computational mass spectrometric tools to structural elucidation of unknown organic pollutants arising in wastewaters improved the efficiency and veracity of screening approaches based on high-resolution mass spectrometry. Furthermore, structural similarity searching was essential for prioritizing substances sharing structural features with known organic pollutants or industrial and consumer chemicals that could enter the environment through use or disposal.
I then applied this comprehensive methodological and computational non-targeted analysis workflow to micropollutant fate analysis in domestic wastewaters (Chapter 3), surface waters impacted by water reuse activities (Chapter 4) and effluents of wastewater treatment facilities receiving wastewater from oil and gas extraction activities (Chapter 5). In Chapter 3, I showed that application of chemometric tools aided in the prioritization of non-targeted compounds arising at various stages of conventional wastewater treatment by partitioning high-dimensional data into rational chemical categories based on knowledge of organic chemical fate processes, resulting in the classification of organic micropollutants based on their occurrence and/or removal during treatment. Similarly, in Chapter 4, high-resolution sampling and broad-spectrum targeted and non-targeted chemical analysis were applied to assess the occurrence and fate of organic micropollutants in a water reuse application, wherein reclaimed wastewater was applied for irrigation of turf grass. Results showed that the organic micropollutant composition of surface waters receiving runoff from wastewater-irrigated areas appeared to be minimally impacted by wastewater-derived organic micropollutants. Finally, Chapter 5 presents results on the comprehensive organic chemical composition of oil and gas wastewaters treated for surface water discharge. Concurrent analysis of effluent samples by complementary, broad-spectrum analytical techniques revealed low levels of hydrophobic organic contaminants but elevated concentrations of polymeric surfactants, which may affect the fate and analysis of contaminants of concern in oil and gas wastewaters.
Taken together, my work represents significant progress in the characterization of polar organic chemical pollutants associated with wastewater-impacted environments by high-resolution mass spectrometry. Application of these comprehensive methods to examine micropollutant fate processes in wastewater treatment systems, water reuse environments, and water applications in oil/gas exploration yielded new insights into the factors that influence transport, transformation, and persistence of organic micropollutants in these systems across an unprecedented breadth of chemical space.
Abstract:
N-Heterocycles are ubiquitous in biologically active natural products and pharmaceuticals. Yet, new syntheses and modifications of N-heterocycles are continually of interest for the purposes of expanding chemical space and finding quicker synthetic routes, better pharmaceuticals, and even new handles for molecular labeling. There are several iterations of molecular labeling; the decision of where to place the label is as important as the choice of visualization technique.
Piperidine and indole are two of the most widely distributed N-heterocycles and thus were targeted for synthesis, functionalization, and labeling. The major functionalization of these scaffolds should include a nitrogen atom, while the inclusion of other groups will expand the utility of the method. Towards this goal, ease of synthesis and elimination of step-wise transformations are of the utmost concern. Here, the concept of electrophilic amination can be utilized as a way of introducing complex secondary and tertiary amines with minimal operations.
Molecular tags should be on or adjacent to an N-heterocycle as they are normally the motifs implicated at the binding site of enzymes and receptors. The labeling techniques should be useful to a chemical biologist, but should also in theory be useful to the medical community. The two types of labeling that are of interest to a chemist and a physician would be positron emission tomography (PET) and magnetic resonance imaging (MRI).
Coincidentally, the 3-positions of both piperidine and indole are historically difficult to access and modify. However, using electrophilic amination techniques, 3-functionalized piperidines can be synthesized in good yields from unsaturated amines. In the same manner, 3-labeled piperidines can be obtained; the piperidines can either be labeled with an azide for biochemical research or an 18F for PET imaging research. The novel electrophiles, N-benzenesulfonyloxyamides, can be reacted with indole in one of two ways: 3-amidation or 1-amidomethylation, depending on the exact reaction conditions. Lastly, a novel, hyperpolarizable 15N2-labeled diazirine has been developed as an exogenous and versatile tag for use in magnetic resonance imaging.
Abstract:
Recent research trends in computer-aided drug design have shown an increasing interest in the implementation of advanced approaches able to deal with large amounts of data. This demand arose from the awareness of the complexity of biological systems and from the availability of data provided by high-throughput technologies. As a consequence, drug research has embraced this paradigm shift by exploiting approaches such as those based on networks. Indeed, the process of drug discovery can benefit from the implementation of network-based methods at different steps, from target identification to drug repurposing. From this broad range of opportunities, this thesis focuses on three main topics: (i) chemical space networks (CSNs), which are designed to represent and characterize bioactive compound data sets; (ii) drug-target interaction (DTI) prediction through a network-based algorithm that predicts missing links; (iii) COVID-19 drug research, which was explored by implementing COVIDrugNet, a network-based tool for COVID-19 related drugs. The main finding of this thesis is that network-based approaches can be considered useful methodologies to tackle different issues in drug research. In detail, CSNs are valuable coordinate-free, graphically accessible representations of structure-activity relationships of bioactive compound data sets, especially for medium-to-large libraries of molecules. DTI prediction through the random walk with restart algorithm on heterogeneous networks can be a helpful method for target identification. COVIDrugNet is an example of the usefulness of network-based approaches for studying drugs related to a specific condition, i.e., COVID-19, and the same ‘systems-based’ approaches can be used for other diseases.
To conclude, network-based tools are proving to be suitable in many applications in drug research and provide the opportunity to model and analyze diverse drug-related data sets, even large ones, also integrating different multi-domain information.
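The random walk with restart mentioned in the abstract above can be sketched as follows. This is a minimal, generic version on a toy drug-target graph, not the thesis implementation: the node names, the restart probability of 0.3, and the assumption that every node has at least one neighbour are all choices made for this illustration.

```python
def random_walk_with_restart(adjacency, seed_node, restart=0.3,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adjacency maps each node to a list of neighbours; every node is
    assumed to have at least one neighbour (no dangling nodes).
    """
    nodes = list(adjacency)
    p0 = {n: 1.0 if n == seed_node else 0.0 for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart step: a fixed share of the walker jumps back to the seed.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk step: the remaining mass is spread evenly over neighbours.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


if __name__ == "__main__":
    # Hypothetical network: drugs d1, d2 and targets t1, t2.
    # d1-t1 and d2-t2 are known interactions; d1-d2 and t1-t2 are similarity links.
    toy = {
        "d1": ["t1", "d2"],
        "d2": ["d1", "t2"],
        "t1": ["d1", "t2"],
        "t2": ["t1", "d2"],
    }
    scores = random_walk_with_restart(toy, seed_node="d1")
    for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(node, round(score, 3))
```

Starting the walk from a drug and ranking the target nodes by their stationary score gives candidate targets; a target with no known link to the seed drug (here `t2`) still receives a non-zero score through the similarity links.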
Abstract:
"See the abstract at the beginning of the document in the attached file".
Abstract:
The state-space approach is used to evaluate the relation between soil physical and chemical properties in an area cultivated with sugarcane. The experiment was carried out on a Rhodic Kandiudalf in Piracicaba, State of São Paulo, Brazil. Sugarcane was planted on an area of 0.21 ha i.e., in 15 rows 100 m long, spaced 1.4 m. Soil water content, soil organic matter, clay content and aggregate stability were sampled along a transect of 84 points, meter by meter. The state-space approach is used to evaluate how the soil water content is affected by itself and by soil organic matter, clay content, and aggregate stability of neighboring locations, in different combinations, aiming to contribute to a better understanding of the relation among these variables in the soil. Results show that soil water contents were successfully estimated by this approach. Best performances were found when the estimate of soil water content at locations i was related to soil water content, clay content and aggregate stability at locations i-1. Results also indicate that this state-space model using all series describes the soil water content better than any equivalent multiple regression equation.
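For orientation, the best-performing relation described in the abstract above, in which water content at location i depends on water content, clay content and aggregate stability at location i-1, can be written as a first-order state equation. The symbols below are assumptions introduced for this sketch (the abstract gives neither notation nor coefficient values): θ for soil water content, c for clay content, a for aggregate stability, φ for the transition coefficients, and ω for the model error.

```latex
\begin{equation}
  \theta_i = \phi_{1}\,\theta_{i-1} + \phi_{2}\,c_{i-1} + \phi_{3}\,a_{i-1} + \omega_i
\end{equation}
```

State-space analyses of this kind typically pair such a state equation with an observation equation that links the measured series to the underlying state plus measurement noise.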
Abstract:
After decades of development, laser ablation has become an important technique for a large number of applications, such as thin-film deposition, nanoparticle synthesis, micromachining, and chemical analysis. Experimental as well as theoretical studies have been carried out to understand the fundamental physical mechanisms involved during ablation and to determine the effect of the wavelength, the pulse duration, the nature of the ambient gas, and the target material. This thesis describes and examines the relative importance of the physical mechanisms that influence the characteristics of laser-induced aluminum plasmas. The general framework of this research is an in-depth study of the interaction between the dynamics of the plasma plume and the gaseous atmosphere in which it develops. This was achieved by temporally and spatially resolved imaging of the plasma plume in terms of spectral intensity, electron density, and excitation temperature in different atmospheres of inert gases such as Ar and He and reactive gases such as N2, at pressures ranging from 10⁻⁷ Torr (vacuum) up to 760 Torr (atmospheric pressure). Our results show that the plasma emission intensity generally depends on the nature of the gas and that it is strongly affected by its pressure. Furthermore, for a given time delay relative to the laser pulse, the electron density as well as the temperature increase with gas pressure, which can be attributed to the inertial confinement of the plasma. In addition, the electron density is observed to be maximal near the target surface where the laser is focused and to decrease with distance (axially and radially) from this position. Despite the significant axial variation of the temperature along the plasma, its radial variation is found to be negligible.
The electron density and the temperature were found to be highest when the gas is argon and lowest for helium, while the values are intermediate in the case of nitrogen. This is mainly due to the physical and chemical properties of the gas, such as the mass of the species, their excitation and ionization energies, the thermal conductivity, and the chemical reactivity. The expansion of the plasma plume was studied by spatio-temporally resolved imaging. The results show that the nature of the gas does not affect the plume dynamics for pressures below 20 Torr and for time delays shorter than 200 ns. However, for pressures above 20 Torr, the effect of the nature of the gas becomes important, and the shortest plume is obtained when the mass of the gas species is high and when its thermal conductivity is relatively low. These results are confirmed by time-of-flight measurements of the Al+ ion emitting at 281.6 nm. On the other hand, the propagation velocity of the aluminum ions is found to be well defined just after ablation and near the target surface. However, at long time delays, the ions, while crossing the plume, thermalize through collisions with the plasma and gas species.
Abstract:
The behavior of soil chemical attributes is directly influenced by surface flow and water movement within the soil. This work aimed to study the spatial dependence of chemical attributes in an area under sugarcane cultivation in Pereira Barreto, SP. An area of 530.67 hectares was mapped using Global Positioning System equipment, and a Digital Elevation Model was obtained. A set of 134 soil samples was collected, one every seven hectares, at depths of 0-0.25 m and 0.80-1.00 m. The pH, organic matter (OM), Ca, Mg, K, BS, CEC and base saturation (BS) were analyzed. All the chemical attributes presented similar behavior in the surface and subsurface layers of the soil, which provided better visualization and definition of the homogeneous tillage zones.
Abstract:
A bitopic ligand, 4-(3,5-dimethylpyrazol-4-yl)-1,2,4-triazole (Hpz-tr) (1), containing two different heterocyclic moieties was employed for the design of copper(II)–molybdate solids under hydrothermal conditions. In the multicomponent CuII/Hpz-tr/MoVI system, a diverse set of coordination hybrids, [Cu(Hpz-tr)2SO4]·3H2O (2), [Cu(Hpz-tr)Mo3O10] (3), [Cu4(OH)4(Hpz-tr)4Mo8O26]·6H2O (4), [Cu(Hpz-tr)2Mo4O13] (5), and [Mo2O6(Hpz-tr)]·H2O (6), was prepared and characterized. A systematic investigation of these systems in the form of a ternary crystallization diagram approach was utilized to show the influence of the molar ratios of starting reagents, the metal (CuII and MoVI) sources, the temperature, etc., on the reaction products outcome. Complexes 2–4 dominate throughout a wide crystallization range of the composition triangle, while the other two compounds 5 and 6 crystallize as minor phases in a narrow concentration range. In the crystal structures of 2–6, the organic ligand behaves as a short [N–N]-triazole linker between metal centers Cu···Cu in 2–4, Cu···Mo in 5, and Mo···Mo in 6, while the pyrazolyl function remains uncoordinated. This is the reason for the exceptional formation of low-dimensional coordination motifs: 1D for 2, 4, and 6 and 2D for 3 and 5. In all cases, the pyrazolyl group is involved in H bonding (H-donor/H-acceptor) and is responsible for π–π stacking, thus connecting the chain and layer structures in more complicated H-bonding architectures. These compounds possess moderate thermal stability up to 250–300 °C. The magnetic measurements were performed for 2–4, revealing in all three cases antiferromagnetic exchange interactions between neighboring CuII centers and long-range order with a net moment below Tc of 13 K for compound 4.
Abstract:
We present a re-analysis of the Geneva-Copenhagen survey, which benefits from the infrared flux method to improve the accuracy of the derived stellar effective temperatures and uses the latter to build a consistent and improved metallicity scale. Metallicities are calibrated on high-resolution spectroscopy and checked against four open clusters and a moving group, showing excellent consistency. The new temperature and metallicity scales provide a better match to theoretical isochrones, which are used for a Bayesian analysis of stellar ages. With respect to previous analyses, our stars are on average 100 K hotter and 0.1 dex more metal rich, which shifts the peak of the metallicity distribution function to around the solar value. From Strömgren photometry we are able to derive for the first time a proxy for [alpha/Fe] abundances, which enables us to perform a tentative dissection of the chemical thin and thick disc. We find evidence for the latter being composed of an old, mildly but systematically alpha-enhanced population that extends to super-solar metallicities, in agreement with spectroscopic studies. Our revision offers the largest existing kinematically unbiased sample of the solar neighbourhood that contains full information on kinematics, metallicities, and ages and thus provides better constraints on the physical processes relevant in the build-up of the Milky Way disc, enabling a better understanding of the Sun in a Galactic context.
Abstract:
Context. The formation and evolution of the Galactic bulge and its relationship with the other Galactic populations is still poorly understood. Aims. To establish the chemical differences and similarities between the bulge and other stellar populations, we performed an elemental abundance analysis of alpha- (O, Mg, Si, Ca, and Ti) and Z-odd (Na and Al) elements of red giant stars in the bulge as well as of local thin disk, thick disk and halo giants. Methods. We use high-resolution optical spectra of 25 bulge giants in Baade's window and 55 comparison giants (4 halo, 29 thin disk and 22 thick disk giants) in the solar neighborhood. All stars have similar stellar parameters but cover a broad range in metallicity (-1.5 < [Fe/H] < +0.5). A standard 1D local thermodynamic equilibrium analysis using both Kurucz and MARCS models yielded the abundances of O, Na, Mg, Al, Si, Ca, Ti and Fe. Our homogeneous and differential analysis of the Galactic stellar populations ensured that systematic errors were minimized. Results. We confirm the well-established differences for [alpha/Fe] at a given metallicity between the local thin and thick disks. For all the elements investigated, we find no chemical distinction between the bulge and the local thick disk, in agreement with our previous study of C, N and O but in contrast to other groups relying on literature values for nearby disk dwarf stars. For -1.5 < [Fe/H] < -0.3 exactly the same trend is followed by both the bulge and thick disk stars, with a star-to-star scatter of only 0.03 dex. Furthermore, both populations share the location of the knee in the [alpha/Fe] vs. [Fe/H] diagram. It still remains to be confirmed that the local thick disk extends to super-solar metallicities as is the case for the bulge. These are the most stringent constraints to date on the chemical similarity of these stellar populations. Conclusions. 
Our findings suggest that the bulge and local thick disk stars experienced similar formation timescales, star formation rates and initial mass functions, thus confirming the main outcomes of our previous homogeneous analysis of [O/Fe] from infrared spectra for nearly the same sample. The identical alpha-enhancements of thick disk and bulge stars may reflect a rapid chemical evolution taking place before the bulge and thick disk structures we see today were formed, or they may reflect Galactic orbital migration of inner disk/bulge stars resulting in stars in the solar neighborhood with thick-disk kinematics.