18 results for Inverse computational method
in Helda - Digital Repository of University of Helsinki
Abstract:
Protein conformations and dynamics can be studied by nuclear magnetic resonance spectroscopy using dilute liquid crystalline samples. This work clarifies the interpretation of residual dipolar coupling data yielded by the experiments. It was discovered that unfolded proteins without any additional structure beyond that of a mere polypeptide chain exhibit residual dipolar couplings. Also, it was found that molecular dynamics induce fluctuations in the molecular alignment and thereby affect residual dipolar couplings. The finding clarified the origins of low order parameter values observed earlier. The work required the development of new analytical and computational methods for the prediction of intrinsic residual dipolar coupling profiles for unfolded proteins. The presented characteristic chain model is able to reproduce the general trend of experimental residual dipolar couplings for denatured proteins. The details of experimental residual dipolar coupling profiles are beyond the analytical model, but improvements are proposed to achieve greater accuracy. A computational method for rapid prediction of unfolded protein residual dipolar couplings was also developed. Protein dynamics were shown to modulate the effective molecular alignment in a dilute liquid crystalline medium. The effects were investigated from experimental and molecular-dynamics-generated conformational ensembles of folded proteins. It was noted that dynamics-induced alignment is significant especially for the interpretation of molecular dynamics in small, globular proteins. A method of correction was presented. Residual dipolar couplings offer an attractive possibility for the direct observation of protein conformational preferences and dynamics. The presented models and methods of analysis provide significant advances in the interpretation of residual dipolar coupling data from proteins.
Abstract:
Nucleation is the first step of the process by which gas molecules in the atmosphere condense to form liquid or solid particles. Despite the importance of atmospheric new-particle formation for both climate and health-related issues, little information exists on its precise molecular-level mechanisms. In this thesis, potential nucleation mechanisms involving sulfuric acid together with either water and ammonia or reactive biogenic molecules are studied using quantum chemical methods. Quantum chemistry calculations are based on the numerical solution of Schrödinger's equation for a system of atoms and electrons subject to various sets of approximations, the precise details of which give rise to a large number of model chemistries. A comparison of several different model chemistries indicates that the computational method must be chosen with care if accurate results for sulfuric acid - water - ammonia clusters are desired. Specifically, binding energies are incorrectly predicted by some popular density functionals, and vibrational anharmonicity must be accounted for if quantitatively reliable formation free energies are desired. The calculations reported in this thesis show that a combination of different high-level energy corrections and advanced thermochemical analysis can quantitatively replicate experimental results concerning the hydration of sulfuric acid. The role of ammonia in sulfuric acid - water nucleation was revealed by a series of calculations on molecular clusters of increasing size with respect to all three co-ordinates; sulfuric acid, water and ammonia. As indicated by experimental measurements, ammonia significantly assists the growth of clusters in the sulfuric acid - co-ordinate. The calculations presented in this thesis predict that in atmospheric conditions, this effect becomes important as the number of acid molecules increases from two to three. 
On the other hand, small molecular clusters are unlikely to contain more than one ammonia molecule per sulfuric acid. This implies that the average NH3:H2SO4 mole ratio of small molecular clusters in atmospheric conditions is likely to be between 1:3 and 1:1. Calculations on charged clusters confirm the experimental result that the HSO4- ion is much more strongly hydrated than neutral sulfuric acid. Preliminary calculations on HSO4- NH3 clusters indicate that ammonia is likely to play at most a minor role in ion-induced nucleation in the sulfuric acid - water system. Calculations of thermodynamic and kinetic parameters for the reaction of stabilized Criegee Intermediates with sulfuric acid demonstrate that quantum chemistry is a powerful tool for investigating chemically complicated nucleation mechanisms. The calculations indicate that if the biogenic Criegee Intermediates have sufficiently long lifetimes in atmospheric conditions, the studied reaction may be an important source of nucleation precursors.
Abstract:
Lakes serve as sites for terrestrially fixed carbon to be remineralized and transferred back to the atmosphere. Their role in regional carbon cycling is especially important in the Boreal Zone, where lakes can cover up to 20% of the land area. Boreal lakes are often characterized by the presence of a brown water colour, which implies high levels of dissolved organic carbon from the surrounding terrestrial ecosystem, but the load of inorganic carbon from the catchment is largely unknown. Organic carbon is transformed to methane (CH4) and carbon dioxide (CO2) in biological processes that result in lake water gas concentrations that increase above atmospheric equilibrium, thus making boreal lakes sources of these important greenhouse gases. However, flux estimates are often based on sporadic sampling and modelling, and actual flux measurements are scarce. Thus, the detailed temporal flux dynamics of greenhouse gases are still largely unknown. One aim here was to reveal the natural dynamics of CH4 and CO2 concentrations and fluxes in a small boreal lake. The other aim was to test the applicability of a measuring technique for CO2 flux, i.e. the eddy covariance (EC) technique, and a computational method for estimation of primary production and community respiration, both commonly used in terrestrial research, in this lake. Continuous surface water CO2 concentration measurements, also needed in free-water applications to estimate primary production and community respiration, were used over two open water periods in a study of CO2 concentration dynamics. Traditional methods were also used to measure gas concentration and fluxes. The study lake, Valkea-Kotinen, is a small, humic, headwater lake within an old-growth forest catchment with no local anthropogenic disturbance, and thus possible changes in gas dynamics reflect the natural variability in lake ecosystems. CH4 accumulated under the ice and in the hypolimnion during summer stratification.
The surface water CH4 concentration was always above atmospheric equilibrium and thus the lake was a continuous source of CH4 to the atmosphere. However, the annual CH4 fluxes were small, i.e. 0.11 mol m-2 yr-1, and the timing of fluxes differed from that of other published estimates. The highest fluxes are usually measured in spring after ice melt, but in Lake Valkea-Kotinen CH4 was effectively oxidised in spring and the highest effluxes occurred in autumn after the summer stratification period. CO2 also accumulated under the ice, and the hypolimnetic CO2 concentration increased steadily during the stratification period. The surface water CO2 concentration was highest in spring and in autumn, whereas during the stable stratification it was sometimes under atmospheric equilibrium. It showed diel, daily and seasonal variation; the diel cycle was clearly driven by light and thus reflected the metabolism of the lacustrine ecosystem. However, the diel cycle was sometimes blurred by injection of hypolimnetic water rich in CO2, and the surface water CO2 concentration was thus controlled by stratification dynamics. The highest CO2 fluxes were measured in spring, in autumn and during those hypolimnetic injections, which caused bursts of CO2 comparable with the spring and autumn fluxes. The annual fluxes averaged 77 (±11 SD) g C m-2 yr-1. In estimating the importance of the lake in recycling terrestrial carbon, the flux was normalized to the catchment area and this normalized flux was compared with net ecosystem production estimates of -50 to 200 g C m-2 yr-1 from unmanaged forests in corresponding temperature and precipitation regimes in the literature. Within this range, the lake flux corresponded to anything between a 20% increase in the source strength of the surrounding forest and a 5% decrease in its sink strength. The free water approach gave primary production and community respiration estimates 5 and 16 times, respectively, those of traditional bottle incubations during a 5-day testing period in autumn.
The results are in line with findings in the literature. Both methods adopted from the terrestrial community also proved useful in lake studies. A large percentage of the EC data was rejected because the prerequisites of the method were not fulfilled. However, the amount of data accepted remained large compared with what would be feasible with traditional methods. Use of the EC method revealed underestimation by the widely used gas exchange model and suggests that actual turbulence at the water surface should be measured simultaneously with a comparison of the different gas flux methods, in order to revise the parameterization of the gas transfer velocity used in the models.
Abstract:
The molecular level structure of mixtures of water and alcohols is very complicated and has been under intense research in the recent past. Both experimental and computational methods have been used in the studies. One method for studying the intra- and intermolecular bindings in the mixtures is the use of the so called difference Compton profiles, which are a way to obtain information about changes in the electron wave functions. In the process of Compton scattering a photon scatters inelastically from an electron. The Compton profile that is obtained from the electron wave functions is directly proportional to the probability of photon scattering at a given energy to a given solid angle. In this work we develop a method to compute Compton profiles numerically for mixtures of liquids. In order to obtain the electronic wave functions necessary to calculate the Compton profiles we need some statistical information about atomic coordinates. Acquiring this using ab-initio molecular dynamics is beyond our computational capabilities and therefore we use classical molecular dynamics to model the movement of atoms in the mixture. We discuss the validity of the chosen method in view of the results obtained from the simulations. There are some difficulties in using classical molecular dynamics for the quantum mechanical calculations, but these can possibly be overcome by parameter tuning. According to the calculations clear differences can be seen in the Compton profiles of different mixtures. This prediction needs to be tested in experiments in order to find out whether the approximations made are valid.
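As a reading aid for the abstract above, the standard impulse-approximation definition of a Compton profile can be written out as follows. The symbols (momentum density \rho, scattering-vector axis z, reference profile J_ref) are notation assumed here for illustration; the abstract does not state which reference the difference profile is taken against.

```latex
% Compton profile: electron momentum density \rho(\mathbf{p}) projected
% onto the direction z of the scattering vector.
J(p_z) = \iint \rho(p_x, p_y, p_z)\, \mathrm{d}p_x\, \mathrm{d}p_y

% For an isotropic sample such as a liquid, \rho depends only on p = |\mathbf{p}|:
J(q) = 2\pi \int_{|q|}^{\infty} p\, \rho(p)\, \mathrm{d}p

% Difference profile between a mixture and a chosen reference (assumed notation):
\Delta J(q) = J_{\mathrm{mixture}}(q) - J_{\mathrm{ref}}(q)
```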
Abstract:
There is intense activity in the area of theoretical chemistry of gold. It is now possible to predict new molecular species, and more recently, solids by combining relativistic methodology with isoelectronic thinking. In this thesis we predict a series of solid sheet-type crystals for Group-11 cyanides, MCN (M=Cu, Ag, Au), and Group-2 and 12 carbides MC2 (M=Be-Ba, Zn-Hg). The idea of sheets is then extended to nanostrips which can be bent to nanorings. The bending energies and deformation frequencies can be systematized by treating these molecules as elastic bodies. In these species Au atoms act as an 'intermolecular glue'. Further suggested molecular species are the new uncongested aurocarbons, and the neutral Au_nHg_m clusters. Many of the suggested species are expected to be stabilized by aurophilic interactions. We also estimate the MP2 basis-set limit of the aurophilicity for the model compounds [ClAuPH_3]_2 and [P(AuPH_3)_4]^+. Besides investigating the size of the basis set applied, our research confirms that the 19-VE TZVP+2f level, used a decade ago, already produced 74% of the present aurophilic attraction energy for the [ClAuPH_3]_2 dimer. Likewise we verify the preferred C4v structure for the [P(AuPH_3)_4]^+ cation at the MP2 level. We also perform the first calculation on model aurophilic systems using the SCS-MP2 method and compare the results to high-accuracy CCSD(T) ones. The recently obtained high-resolution microwave spectra on MCN molecules (M=Cu, Ag, Au) provide an excellent testing ground for quantum chemistry. MP2 or CCSD(T) calculations, correlating all 19 valence electrons of Au and including BSSE and SO corrections, are able to give bond lengths to 0.6 pm, or better. Our calculated vibrational frequencies are expected to be better than the currently available experimental estimates. Qualitative evidence for multiple Au-C bonding in triatomic AuCN is also found.
Abstract:
The chemical and physical properties of bimetallic clusters have attracted considerable attention due to the potential technological applications of mixed-metal systems. It is of fundamental interest to study clusters because they are the link between atomic surface and bulk properties. More information on the metal-metal bond in small clusters can hence be revealed. The studies in my thesis mainly focus on two different kinds of bimetallic clusters: clusters consisting of extraordinarily shaped all-metal four-membered rings and a series of sodium auride clusters. As described in most general organic chemistry books nowadays, a group of compounds are classified as aromatic compounds because of their remarkable stabilities, particular geometrical and energetic properties and so on. The notion of aromaticity is essentially qualitative. More recently, a connection has been made between aromaticity and energetic and magnetic properties. Also, the discussions of the aromatic nature of molecular rings are no longer limited to organic compounds obeying Hückel's rule. In our research, we mainly applied the GIMIC method to several bimetallic clusters at the CCSD level, and compared the results with those obtained by using chemical shift based methods. The magnetically induced ring currents can be generated easily by employing the GIMIC method, and the nature of aromaticity for each system can therefore be clarified. We performed intensive quantum chemical calculations to explore the characters of the anionic sodium auride clusters and the corresponding neutral clusters, since molecules involving gold atoms are fascinating to investigate owing to the distinctive physical and chemical properties of gold. Like small gold clusters, the sodium auride clusters seem to form planar structures. With the addition of a negative charge, the gold atom in anionic clusters prefers to carry the charge and orients itself away from other gold atoms.
As a result, the energetically lowest isomer for an anionic cluster differs from the one for the corresponding neutral cluster. Most importantly, we presented a comprehensive strategy of ab initio applications for computationally reproducing the experimental photoelectron spectra.
Abstract:
The problem of recovering information from measurement data has already been studied for a long time. In the beginning, the methods were mostly empirical, but already towards the end of the sixties Backus and Gilbert started the development of mathematical methods for the interpretation of geophysical data. The problem of recovering information about a physical phenomenon from measurement data is an inverse problem. Throughout this work, the statistical inversion method is used to obtain a solution. Assuming that the measurement vector is a realization of fractional Brownian motion, the goal is to retrieve the amplitude and the Hurst parameter. We prove that under some conditions, the solution of the discretized problem coincides with the solution of the corresponding continuous problem as the number of observations tends to infinity. The measurement data is usually noisy, and we assume the data to be the sum of two vectors: the trend and the noise. Both vectors are supposed to be realizations of fractional Brownian motions, and the goal is to retrieve their parameters using the statistical inversion method. We prove a partial uniqueness of the solution. Moreover, with the support of numerical simulations, we show that in certain cases the solution is reliable and the reconstruction of the trend vector is quite accurate.
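The parameter-retrieval task described above can be illustrated with a small numerical sketch. It is not the method of the thesis: it assumes a flat prior, so that the posterior maximum coincides with the maximum-likelihood estimate, works with the increments of the path (fractional Gaussian noise), and searches the Hurst parameter over a grid. The function names and the unit time step are choices made for this illustration.

```python
import numpy as np


def fgn_autocov(n, hurst, dt=1.0):
    """Autocovariance at lags 0..n-1 of unit-amplitude fBm increments."""
    k = np.arange(n, dtype=float)
    two_h = 2.0 * hurst
    return 0.5 * dt ** two_h * (
        np.abs(k + 1) ** two_h - 2 * np.abs(k) ** two_h + np.abs(k - 1) ** two_h
    )


def toeplitz_cov(acov):
    """Build the symmetric Toeplitz covariance matrix from an autocovariance."""
    n = len(acov)
    lags = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return acov[lags]


def simulate_fbm(n, hurst, amplitude, rng):
    """Draw one fBm path with n steps via a Cholesky factor of the increments."""
    cov = amplitude ** 2 * toeplitz_cov(fgn_autocov(n, hurst))
    increments = np.linalg.cholesky(cov) @ rng.standard_normal(n)
    return np.concatenate([[0.0], np.cumsum(increments)])


def estimate_fbm_parameters(path, hurst_grid):
    """Grid search over H; the amplitude is profiled out in closed form."""
    x = np.diff(path)
    n = len(x)
    best = None
    for h in hurst_grid:
        chol = np.linalg.cholesky(toeplitz_cov(fgn_autocov(n, h)))
        z = np.linalg.solve(chol, x)          # whitened increments
        sigma2 = z @ z / n                    # ML amplitude^2 for this H
        logdet = 2.0 * np.sum(np.log(np.diag(chol)))
        loglik = -0.5 * (n * np.log(sigma2) + logdet)
        if best is None or loglik > best[0]:
            best = (loglik, float(h), float(np.sqrt(sigma2)))
    return best[1], best[2]
```

A typical call would simulate a path with `simulate_fbm(256, 0.7, 2.0, np.random.default_rng(0))` and pass it to `estimate_fbm_parameters` with a grid such as `np.linspace(0.05, 0.95, 19)`. The grid deliberately stops short of H = 1, where the increment covariance becomes singular.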
Abstract:
We consider an obstacle scattering problem for linear Beltrami fields. A vector field is a linear Beltrami field if the curl of the field is a constant times itself. We study the obstacles that are of Neumann type, that is, the normal component of the total field vanishes on the boundary of the obstacle. We prove the unique solvability for the corresponding exterior boundary value problem, in other words, the direct obstacle scattering model. For the inverse obstacle scattering problem, we deduce the formulas that are needed to apply the singular sources method. The numerical examples are computed for the direct scattering problem and for the inverse scattering problem.
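The two defining conditions in the abstract above can be written compactly as follows. The symbols \lambda (the constant), D (the obstacle) and \nu (the unit normal on its boundary) are notation assumed here, and the worked example is a simple free-space field chosen for illustration, not a solution of the scattering problem.

```latex
% Linear Beltrami field: the curl is a constant multiple of the field itself.
\nabla \times u = \lambda u

% Neumann-type obstacle D: the normal component of the total field
% vanishes on the boundary.
\nu \cdot u = 0 \quad \text{on } \partial D

% Worked example: u depends only on z and has no z-component, so
u(x, y, z) = (\sin \lambda z,\ \cos \lambda z,\ 0), \qquad
\nabla \times u = (-\partial_z u_y,\ \partial_z u_x,\ 0)
               = (\lambda \sin \lambda z,\ \lambda \cos \lambda z,\ 0)
               = \lambda u .
```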
Abstract:
Metabolism is the cellular subsystem responsible for generation of energy from nutrients and production of building blocks for larger macromolecules. Computational and statistical modeling of metabolism is vital to many disciplines including bioengineering, the study of diseases, drug target identification, and understanding the evolution of metabolism. In this thesis, we propose efficient computational methods for metabolic modeling. The techniques presented are targeted particularly at the analysis of large metabolic models encompassing the whole metabolism of one or several organisms. We concentrate on three major themes of metabolic modeling: metabolic pathway analysis, metabolic reconstruction and the study of evolution of metabolism. In the first part of this thesis, we study metabolic pathway analysis. We propose a novel modeling framework called gapless modeling to study biochemically viable metabolic networks and pathways. In addition, we investigate the utilization of atom-level information on metabolism to improve the quality of pathway analyses. We describe efficient algorithms for discovering both gapless and atom-level metabolic pathways, and conduct experiments with large-scale metabolic networks. The presented gapless approach offers a compromise in terms of complexity and feasibility between the previous graph-theoretic and stoichiometric approaches to metabolic modeling. Gapless pathway analysis shows that microbial metabolic networks are not as robust to random damage as suggested by previous studies. Furthermore the amino acid biosynthesis pathways of the fungal species Trichoderma reesei discovered from atom-level data are shown to closely correspond to those of Saccharomyces cerevisiae. In the second part, we propose computational methods for metabolic reconstruction in the gapless modeling framework. We study the task of reconstructing a metabolic network that does not suffer from connectivity problems. 
Such problems often limit the usability of reconstructed models, and typically require a significant amount of manual postprocessing. We formulate gapless metabolic reconstruction as an optimization problem and propose an efficient divide-and-conquer strategy to solve it for real-world instances. We also describe computational techniques for solving problems stemming from ambiguities in metabolite naming. These techniques have been implemented in the web-based software ReMatch, intended for reconstruction of models for 13C metabolic flux analysis. In the third part, we extend our scope from single to multiple metabolic networks and propose an algorithm for inferring gapless metabolic networks of ancestral species from phylogenetic data. Experimenting with 16 fungal species, we show that the method is able to generate results that are easily interpretable and that provide hypotheses about the evolution of metabolism.
Abstract:
Large-scale chromosome rearrangements such as copy number variants (CNVs) and inversions encompass a considerable proportion of the genetic variation between human individuals. In a number of cases, they have been closely linked with various inheritable diseases. Single-nucleotide polymorphisms (SNPs) are another large part of the genetic variance between individuals. They are also typically abundant, and measuring them is straightforward and cheap. This thesis presents computational means of using SNPs to detect the presence of inversions and deletions, a particular variety of CNVs. Technically, the inversion-detection algorithm detects the suppressed recombination rate between inverted and non-inverted haplotype populations, whereas the deletion-detection algorithm uses the EM algorithm to estimate the haplotype frequencies of a window with and without a deletion haplotype. As a contribution to population biology, a coalescent simulator for simulating inversion polymorphisms has been developed. Coalescent simulation is a backward-in-time method of modelling population ancestry. Technically, the simulator also models multiple crossovers by using the Counting model as the chiasma interference model. Finally, this thesis includes an experimental section. The aforementioned methods were tested on synthetic data to evaluate their power and specificity. They were also applied to the HapMap Phase II and Phase III data sets, yielding a number of candidates for previously unknown inversions and deletions, and also correctly detecting known rearrangements of this kind.
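The use of the EM algorithm for haplotype frequencies can be sketched in its simplest textbook form. The example below is an assumption-laden simplification, not the deletion-detection algorithm of the thesis: it handles only two biallelic SNPs, has no deletion haplotype, and assumes random mating. Genotypes are given as pairs of alternate-allele counts (0, 1 or 2), and all names are hypothetical.

```python
HAPLOTYPES = ("00", "01", "10", "11")


def _alleles(count):
    """Split an allele count (0, 1 or 2) into the two alleles carried."""
    return {0: (0, 0), 1: (0, 1), 2: (1, 1)}[count]


def em_haplotype_freqs(genotypes, n_iter=100):
    """Estimate two-SNP haplotype frequencies from unphased genotypes."""
    freqs = {h: 0.25 for h in HAPLOTYPES}
    for _ in range(n_iter):
        counts = {h: 0.0 for h in HAPLOTYPES}
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: a double heterozygote is either 00/11 or 01/10;
                # weight the two phasings by the current frequencies.
                cis = freqs["00"] * freqs["11"]
                trans = freqs["01"] * freqs["10"]
                total = cis + trans
                w = cis / total if total > 0 else 0.5
                counts["00"] += w
                counts["11"] += w
                counts["01"] += 1.0 - w
                counts["10"] += 1.0 - w
            else:
                # At most one heterozygous site: the phase is unambiguous.
                a, b = _alleles(g1), _alleles(g2)
                counts[f"{a[0]}{b[0]}"] += 1.0
                counts[f"{a[1]}{b[1]}"] += 1.0
        # M-step: renormalise the expected counts (two haplotypes per person).
        n_haps = 2.0 * len(genotypes)
        freqs = {h: c / n_haps for h, c in counts.items()}
    return freqs
```

For instance, `em_haplotype_freqs([(0, 0), (2, 2), (1, 1)])` resolves the single double heterozygote towards the 00/11 phasing, because those haplotypes are also supported by the two homozygous individuals.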
Abstract:
This thesis, which consists of an introduction and four peer-reviewed original publications, studies the problems of haplotype inference (haplotyping) and local alignment significance. The problems studied here belong to the broad area of bioinformatics and computational biology. The presented solutions are computationally fast and accurate, which makes them practical in high-throughput sequence data analysis. Haplotype inference is a computational problem where the goal is to estimate haplotypes from a sample of genotypes as accurately as possible. This problem is important as the direct measurement of haplotypes is difficult, whereas the genotypes are easier to quantify. Haplotypes are the key players when studying, for example, the genetic causes of diseases. In this thesis, three methods are presented for the haplotype inference problem, referred to as HaploParser, HIT, and BACH. HaploParser is based on a combinatorial mosaic model and hierarchical parsing that together mimic recombinations and point mutations in a biologically plausible way. In this mosaic model, the current population is assumed to have evolved from a small founder population. Thus, the haplotypes of the current population are recombinations of the (implicit) founder haplotypes with some point mutations. HIT (Haplotype Inference Technique) uses a hidden Markov model for haplotypes, and efficient algorithms are presented to learn this model from genotype data. The model structure of HIT is analogous to the mosaic model of HaploParser with founder haplotypes. Therefore, it can be seen as a probabilistic model of recombinations and point mutations. BACH (Bayesian Context-based Haplotyping) utilizes a context tree weighting algorithm to efficiently sum over all variable-length Markov chains to evaluate the posterior probability of a haplotype configuration. Algorithms are presented that find haplotype configurations with high posterior probability.
BACH is the most accurate method presented in this thesis and has comparable performance to the best available software for haplotype inference. Local alignment significance is a computational problem where one is interested in whether the local similarities in two sequences are due to the fact that the sequences are related or just by chance. Similarity of sequences is measured by their best local alignment score and from that, a p-value is computed. This p-value is the probability of picking two sequences from the null model whose best local alignment score is at least as good. Local alignment significance is used routinely, for example, in homology searches. In this thesis, a general framework is sketched that allows one to compute a tight upper bound for the p-value of a local pairwise alignment score. Unlike the previous methods, the presented framework is not affected by so-called edge effects and can handle gaps (deletions and insertions) without troublesome sampling and curve fitting.
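The statistic whose significance is assessed, the best local alignment score, can be computed with the classic Smith-Waterman recurrence. The sketch below shows only that score, not the p-value bound of the thesis; the linear gap penalty and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def best_local_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (Smith-Waterman)."""
    prev = [0] * (len(b) + 1)   # scores for the previous row of the table
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                0,                   # start a new local alignment here
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                cur[j - 1] + gap,    # gap in a
            )
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best
```

Keeping only two rows of the dynamic programming table is enough when the score alone is needed; recovering the alignment itself would require the full table or a traceback structure.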
Abstract:
This thesis presents methods for locating and analyzing cis-regulatory DNA elements involved with the regulation of gene expression in multicellular organisms. The regulation of gene expression is carried out by the combined effort of several transcription factor proteins collectively binding the DNA on the cis-regulatory elements. Only sparse knowledge of the 'genetic code' of these elements exists today. An automatic tool for discovery of putative cis-regulatory elements could help their experimental analysis, which would result in a more detailed view of the cis-regulatory element structure and function. We have developed a computational model for the evolutionary conservation of cis-regulatory elements. The elements are modeled as evolutionarily conserved clusters of sequence-specific transcription factor binding sites. We give an efficient dynamic programming algorithm that locates the putative cis-regulatory elements and scores them according to the conservation model. A notable proportion of the high-scoring DNA sequences show transcriptional enhancer activity in transgenic mouse embryos. The conservation model includes four parameters whose optimal values are estimated with simulated annealing. With good parameter values the model discriminates well between the DNA sequences with evolutionarily conserved cis-regulatory elements and the DNA sequences that have evolved neutrally. In further inquiry, the set of highest scoring putative cis-regulatory elements were found to be sensitive to small variations in the parameter values. The statistical significance of the putative cis-regulatory elements is estimated with the Two Component Extreme Value Distribution. The p-values grade the conservation of the cis-regulatory elements above the neutral expectation. The parameter values for the distribution are estimated by simulating the neutral DNA evolution. 
The conservation of the transcription factor binding sites can be used in the upstream analysis of regulatory interactions. This approach may provide mechanistic insight into the transcription level data from, e.g., microarray experiments. Here we give a method to predict shared transcriptional regulators for a set of co-expressed genes. The EEL (Enhancer Element Locator) software implements the method for locating putative cis-regulatory elements. The software facilitates both interactive use and distributed batch processing. We have used it to analyze the non-coding regions around all human genes with respect to the orthologous regions in various other species including mouse. The data from these genome-wide analyses are stored in a relational database which is used in the publicly available web services for upstream analysis and visualization of the putative cis-regulatory elements in the human genome.
Abstract:
This thesis presents a highly sensitive genome-wide search method for recessive mutations. The method is suitable for distantly related samples that are divided into phenotype positives and negatives. High-throughput genotype arrays are used to identify and compare homozygous regions between the cohorts. The method is demonstrated by comparing colorectal cancer patients against unaffected references. The objective is to find homozygous regions and alleles that are more common in cancer patients. We have designed and implemented software tools to automate the data analysis from genotypes to lists of candidate genes and to their properties. The programs have been designed around a pipeline architecture that allows their integration with other programs such as biological databases and copy number analysis tools. The integration of the tools is crucial as the genome-wide analysis of the cohort differences produces many candidate regions not related to the studied phenotype. CohortComparator is a genotype comparison tool that detects homozygous regions and compares their loci and allele constitutions between two sets of samples. The data is visualised in chromosome-specific graphs illustrating the homozygous regions and alleles of each sample. The genomic regions that may harbour recessive mutations are emphasised with different colours and a scoring scheme is given for these regions. The detection of homozygous regions, cohort comparisons and result annotations are all subject to assumptions, many of which have been parameterized in our programs. The effect of these parameters and the suitable scope of the methods have been evaluated. Samples with different resolutions can be balanced with the genotype estimates of their haplotypes and they can be used within the same study.
Abstract:
Nucleation is the first step in the formation of a new phase inside a mother phase. Two main forms of nucleation can be distinguished. In homogeneous nucleation, the new phase is formed in a uniform substance. In heterogeneous nucleation, on the other hand, the new phase emerges on a pre-existing surface (nucleation site). Nucleation is the source of about 30% of all atmospheric aerosol, which in turn has noticeable health effects and a significant impact on climate. Nucleation can be observed in the atmosphere, studied experimentally in the laboratory and is the subject of ongoing theoretical research. This thesis attempts to be a link between experiment and theory. By comparing simulation results to experimental data, the aim is to (i) better understand the experiments and (ii) determine where the theory needs improvement. Computational fluid dynamics (CFD) tools were used to simulate homogeneous one-component nucleation of n-alcohols in argon and helium as carrier gases, homogeneous nucleation in the water-sulfuric acid system, and heterogeneous nucleation of water vapor on silver particles. In the nucleation of n-alcohols, vapor depletion, carrier gas effect and carrier gas pressure effect were evaluated, with a special focus on the pressure effect whose dependence on vapor and carrier gas properties could be specified. The investigation of nucleation in the water-sulfuric acid system included a thorough analysis of the experimental setup, determining flow conditions, vapor losses, and nucleation zone. Experimental nucleation rates were compared to various theoretical approaches. We found that none of the considered theoretical descriptions of nucleation captured the role of water in the process at all relative humidities. Heterogeneous nucleation was studied in the activation of silver particles in a TSI 3785 particle counter which uses water as its working fluid.
The role of the contact angle was investigated and the influence of incoming particle concentrations and homogeneous nucleation on counting efficiency determined.