12 results for Combinatorial Grassmannian
in Helda - Digital Repository of the University of Helsinki
Abstract:
This PhD Thesis is about certain infinite-dimensional Grassmannian manifolds that arise naturally in geometry, representation theory and mathematical physics. From the physics point of view, one encounters these infinite-dimensional manifolds when trying to understand the second quantization of fermions. The many-particle Hilbert space of the second quantized fermions is called the fermionic Fock space. A typical element of the fermionic Fock space can be thought of as a linear combination of configurations of m particles and n anti-particles. Geometrically, the fermionic Fock space can be constructed as the space of holomorphic sections of a certain (dual) determinant line bundle lying over the so-called restricted Grassmannian manifold, which is a typical example of an infinite-dimensional Grassmannian manifold one encounters in QFT. The construction should be compared with its well-known finite-dimensional analogue, where one realizes an exterior power of a finite-dimensional vector space as the space of holomorphic sections of a determinant line bundle lying over a finite-dimensional Grassmannian manifold. The connection with infinite-dimensional representation theory stems from the fact that the restricted Grassmannian manifold is an infinite-dimensional homogeneous (Kähler) manifold, i.e. it is of the form G/H, where G is a certain infinite-dimensional Lie group and H is a subgroup of G. A central extension of G acts on the total space of the dual determinant line bundle and also on the space of its holomorphic sections; thus G admits a (projective) representation on the fermionic Fock space. This construction also induces the so-called basic representation of loop groups (of compact groups), which in turn are vitally important in string theory / conformal field theory. The Thesis consists of three chapters: the first chapter is an introduction to the background material, and the other two chapters are individually written research articles. The first article deals in a new way with a well-known question in Yang-Mills theory: when can one lift the action of the gauge transformation group on the space of connection one-forms to the total space of the Fock bundle in a way compatible with the second quantized Dirac operator? In general there is an obstruction to this (called the Mickelsson-Faddeev anomaly), and various geometric interpretations of this anomaly, using such things as group extensions and bundle gerbes, have been given earlier. In this work we give a new geometric interpretation of the Mickelsson-Faddeev anomaly in terms of differentiable gerbes (certain sheaves of categories) and central extensions of Lie groupoids. The second research article deals with the question of how to define a Dirac-like operator on the restricted Grassmannian manifold, which is an infinite-dimensional space and hence falls outside the scope of standard Dirac operator theory. The construction relies heavily on infinite-dimensional representation theory, and one of the most technically demanding challenges is to introduce proper normal orderings for certain infinite sums of operators in such a way that all divergences disappear and the infinite sum makes sense as a well-defined operator acting on a suitable Hilbert space of spinors. This research article was motivated by a more extensive ongoing project to construct twisted K-theory classes in Yang-Mills theory via a Dirac-like operator on the restricted Grassmannian manifold.
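For orientation, the finite-dimensional analogue mentioned in the abstract can be made explicit; the following is only a sketch in notation chosen for this summary, not the thesis's own conventions. If Gr_k(V) denotes the Grassmannian of k-planes in an n-dimensional complex vector space V and Det is its determinant line bundle, with fibre the top exterior power of a plane W, then a special case of the Borel-Weil theorem gives

\[ H^0\big(\mathrm{Gr}_k(V), \mathrm{Det}^*\big) \;\cong\; \Lambda^k V^*, \]

so the holomorphic sections of the dual determinant bundle recover an exterior power of the (dual) vector space; in the infinite-dimensional setting of the thesis, the restricted Grassmannian and the fermionic Fock space play the roles of Gr_k(V) and the exterior power.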
Abstract:
This thesis describes current and past n-in-one methods and presents three early experimental studies, using mass spectrometry and the triple quadrupole instrument, on the application of n-in-one in drug discovery. The n-in-one strategy pools and mixes samples in drug discovery prior to measurement or analysis. This allows the most promising compounds to be rapidly identified and then analysed. Nowadays the properties of drugs are characterised earlier and in parallel with pharmacological efficacy. The studies presented here use in vitro methods such as Caco-2 cells and immobilized artificial membrane chromatography for drug absorption and lipophilicity measurements. The high sensitivity and selectivity of liquid chromatography-mass spectrometry are especially important for new analytical methods using n-in-one. In the first study, the fragmentation patterns of ten nitrophenoxy benzoate compounds, a homologous series, were characterised and the presence of the compounds was determined in a combinatorial library. The influence of one or two nitro substituents and of the alkyl chain length, from methyl to pentyl, on collision-induced fragmentation was studied, and interesting structure-fragmentation relationships were detected. Compounds with two nitro groups fragmented more than those with one nitro group, whereas less fragmentation was noted in molecules with a longer alkyl chain. The most abundant product ions were nitrophenoxy ions, which were also tested in precursor ion screening of the combinatorial library. In the second study, the immobilized artificial membrane chromatographic method was transferred from ultraviolet detection to mass spectrometric analysis and a new method was developed. Mass spectra were scanned and the chromatographic retention of the compounds was analysed using extracted ion chromatograms. When detectors and buffers were changed and n-in-one was included in the method, the results showed good correlation. Finally, the results demonstrated that mass spectrometric detection with gradient elution can provide a rapid and convenient n-in-one method for ranking the lipophilic properties of several structurally diverse compounds simultaneously. In the final study, a new method was developed for Caco-2 samples. Compounds were separated by liquid chromatography and quantified by selected reaction monitoring using mass spectrometry. This method was used for Caco-2 samples, in which the absorption of ten chemically and physiologically different compounds was screened using both single and n-in-one approaches. These three studies used mass spectrometry for compound identification, method transfer and quantitation in the area of mixture analysis. Different mass spectrometric scanning modes of the triple quadrupole instrument were used in each method. Early drug discovery with n-in-one is an area where mass spectrometric analysis, its possibilities and its proper use, is especially important.
Abstract:
Poor pharmacokinetics is one of the reasons for the withdrawal of drug candidates from clinical trials. There is an urgent need to investigate in vitro ADME (absorption, distribution, metabolism and excretion) properties and to recognise unsuitable drug candidates as early as possible in the drug development process. The current throughput of in vitro ADME profiling is insufficient because effective new synthesis techniques, such as in silico drug design and combinatorial synthesis, have vastly increased the number of drug candidates. Assay technologies for larger sets of compounds than are currently feasible are critically needed. The first part of this work focused on the evaluation of the cocktail strategy in studies of drug permeability and metabolic stability. N-in-one liquid chromatography-tandem mass spectrometry (LC/MS/MS) methods were developed and validated for the multiple-component analysis of samples in cocktail experiments. Together, cocktail dosing and LC/MS/MS were found to form an effective tool for increasing throughput. First, cocktail dosing, i.e. the use of a mixture of many test compounds, was applied in permeability experiments with the Caco-2 cell culture, which is a widely used in vitro model of small intestinal absorption. A cocktail of 7-10 reference compounds was successfully evaluated for standardization and routine testing of the performance of Caco-2 cell cultures. Secondly, the cocktail strategy was used in metabolic stability studies of drugs with UGT isoenzymes, which are among the most important phase II drug-metabolizing enzymes. The study confirmed that the determination of intrinsic clearance (Clint) with a cocktail of seven substrates is possible. The LC/MS/MS methods that were developed were fast and reliable for the quantitative analysis of a heterogeneous set of drugs from the Caco-2 permeability experiments and of the set of glucuronides from the in vitro stability experiments. The performance of a new ionization technique, atmospheric pressure photoionization (APPI), was evaluated through comparison with electrospray ionization (ESI); both techniques were used for the analysis of Caco-2 samples. Like ESI, APPI proved to be a reliable technique for the analysis of Caco-2 samples, and it was even more flexible than ESI because of its wider linear dynamic range. The second part of the experimental study focused on metabolite profiling. Different mass spectrometric instruments and commercially available software tools were investigated for profiling metabolites in urine and hepatocyte samples. All the instruments tested (triple quadrupole, quadrupole time-of-flight, ion trap) exhibited both good and bad features in searching for and identifying expected and unexpected metabolites. Although current profiling software is helpful, it is still insufficient. Thus a time-consuming, largely manual approach is still required for metabolite profiling from complex biological matrices.
Abstract:
Polyethene, polyacrylates and polymethacrylates are versatile materials that find a wide variety of applications in several areas. Therefore, the polymerization of ethene, acrylates and methacrylates has attracted a great deal of attention in recent years. A number of metal catalysts have been introduced in order to control the polymerization and to produce tailored polymer structures. Herein, an overview of the possible polymerization pathways for ethene, acrylates and methacrylates is presented. In this thesis, iron(II) and cobalt(II) complexes bearing tri- and tetradentate nitrogen ligands were synthesized and studied in the polymerization of tert-butyl acrylate (tBA) and methyl methacrylate (MMA). The complexes are activated with methylaluminoxane (MAO) before they form active combinations for the polymerization reactions. The effects of the reaction conditions, i.e. monomer concentration, reaction time, temperature and MAO-to-metal ratio, on activity and polymer properties were investigated. The described polymerization system enables mild reaction conditions and the possibility to tailor the molar mass of the produced polymers, and it provides good control over the polymerization. Moreover, the polymerization of MMA in the presence of an iron(II) complex with tetradentate nitrogen ligands under the conditions of atom transfer radical polymerization (ATRP) was studied. Several manganese(II) complexes were studied in ethene polymerization with combinatorial methods and new active catalysts were found. These complexes were also studied in acrylate and methacrylate polymerizations, both after MAO activation and after conversion into the corresponding alkyl (methyl or benzyl) derivatives. Combinatorial methods were introduced to discover aluminum alkyl complexes for the polymerization of acrylates and methacrylates. Various combinations of aluminum alkyls and ligands, including phosphines, salicylaldimines and nitrogen donor ligands, were prepared in situ and utilized to initiate the polymerization of tBA. The phosphine ligands were found to give the most active combinations, and the polymerization of MMA was studied with these. In addition, a plausible polymerization mechanism for MMA, based on ESI-MS and 1H and 13C NMR, is proposed.
Abstract:
This thesis, which consists of an introduction and four peer-reviewed original publications, studies the problems of haplotype inference (haplotyping) and local alignment significance. The problems studied here belong to the broad area of bioinformatics and computational biology. The presented solutions are computationally fast and accurate, which makes them practical in high-throughput sequence data analysis. Haplotype inference is a computational problem where the goal is to estimate haplotypes from a sample of genotypes as accurately as possible. This problem is important because the direct measurement of haplotypes is difficult, whereas genotypes are easier to quantify. Haplotypes are the key players when studying, for example, the genetic causes of diseases. In this thesis, three methods are presented for the haplotype inference problem, referred to as HaploParser, HIT, and BACH. HaploParser is based on a combinatorial mosaic model and hierarchical parsing that together mimic recombinations and point mutations in a biologically plausible way. In this mosaic model, the current population is assumed to have evolved from a small founder population. Thus, the haplotypes of the current population are recombinations of the (implicit) founder haplotypes with some point mutations. HIT (Haplotype Inference Technique) uses a hidden Markov model for haplotypes, and efficient algorithms are presented to learn this model from genotype data. The model structure of HIT is analogous to the mosaic model of HaploParser with founder haplotypes. Therefore, it can be seen as a probabilistic model of recombinations and point mutations. BACH (Bayesian Context-based Haplotyping) utilizes a context tree weighting algorithm to efficiently sum over all variable-length Markov chains to evaluate the posterior probability of a haplotype configuration. Algorithms are presented that find haplotype configurations with high posterior probability. BACH is the most accurate method presented in this thesis and has performance comparable to the best available software for haplotype inference. Local alignment significance is a computational problem where one is interested in whether the local similarities between two sequences are due to the sequences being related or merely due to chance. The similarity of sequences is measured by their best local alignment score, and from that a p-value is computed. This p-value is the probability of picking two sequences from the null model that have an equally good or better best local alignment score. Local alignment significance is used routinely, for example, in homology searches. In this thesis, a general framework is sketched that allows one to compute a tight upper bound for the p-value of a local pairwise alignment score. Unlike previous methods, the presented framework is not affected by so-called edge effects and can handle gaps (deletions and insertions) without troublesome sampling and curve fitting.
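As background for the p-value described above (this is the classical formulation, not the thesis's own bound): if S(X, Y) denotes the best local alignment score of two sequences X and Y drawn from the null model, the p-value of an observed score s is

\[ p(s) \;=\; \Pr_{X,Y \sim \mathrm{null}}\big( S(X,Y) \ge s \big), \]

and for ungapped alignments the classical Karlin-Altschul statistics approximate this for sequences of lengths m and n by

\[ p(s) \;\approx\; 1 - \exp\!\big( -K\, m\, n\, e^{-\lambda s} \big), \]

with parameters K and lambda depending on the scoring scheme. The framework sketched in the thesis aims at bounds that, unlike this approximation, also handle gapped alignments and avoid edge effects.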
Abstract:
Growth is a fundamental aspect of the life cycle of all organisms. Body size varies greatly in most animal groups, such as mammals. Moreover, the growth of a multicellular organism is not a uniform enlargement of size; rather, different body parts and organs grow to their characteristic sizes at different times. Currently, very little is known about the molecular mechanisms governing this organ-specific growth. The genome sequencing projects have provided complete genomic DNA sequences of several species over the past decade. The amount of genomic sequence information, including sequence variants within species, is constantly increasing. Based on the universal genetic code, we can make sense of this sequence information insofar as it codes for proteins. However, less is known about the molecular mechanisms that control the expression of genes, and about the variations in gene expression that underlie many pathological states in humans. This is caused in part by a lack of information about the second genetic code, which consists of the binding specificities of transcription factors and the combinatorial code by which transcription factor binding sites are assembled to form tissue-specific and/or ligand-regulated enhancer elements. This thesis presents a high-throughput assay for the identification of transcription factor binding specificities, which was then used to measure the DNA binding profiles of transcription factors involved in growth control. We developed the ‘enhancer element locator’, a computational tool that can be used to predict functional enhancer elements. A genome-wide prediction of human and mouse enhancer elements generated a large database of enhancer elements. This database can be used to identify target genes of signaling pathways and to predict activated transcription factors based on changes in gene expression. Predictions validated in transgenic mouse embryos revealed the presence of multiple tissue-specific enhancers in the mouse c-Myc and N-Myc genes, which has implications for organ-specific growth control and the tumor-type specificity of oncogenes. Furthermore, we were able to locate a single-nucleotide variation that carries a susceptibility to colorectal cancer to an enhancer element, and we propose a mechanism by which this SNP might be involved in the generation of colorectal cancer.
Abstract:
When augmented with the longest common prefix (LCP) array and some other structures, the suffix array can solve many string processing problems in optimal time and space. A compressed representation of the LCP array is also one of the main building blocks in many compressed suffix tree proposals. In this paper, we describe a new compressed LCP representation: the sampled LCP array. We show that, when used with a compressed suffix array (CSA), the sampled LCP array often offers better time/space trade-offs than the existing alternatives. We also show how to construct the compressed representations of the LCP array directly from a CSA.
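For readers unfamiliar with the LCP array itself, the sketch below computes the plain (uncompressed) LCP array from a suffix array using the classical Kasai et al. linear-time algorithm; this is background only, not the CSA-based construction described in the paper, and the naive suffix-array builder is included just to make the example self-contained.

    def suffix_array(s):
        # Naive O(n^2 log n) construction; fine for a small illustration.
        return sorted(range(len(s)), key=lambda i: s[i:])

    def lcp_array(s, sa):
        # Kasai et al.: lcp[i] is the length of the longest common prefix
        # of the suffixes starting at sa[i-1] and sa[i]; O(n) given sa.
        n = len(s)
        rank = [0] * n
        for i, suf in enumerate(sa):
            rank[suf] = i
        lcp = [0] * n
        h = 0
        for i in range(n):
            if rank[i] > 0:
                j = sa[rank[i] - 1]
                while i + h < n and j + h < n and s[i + h] == s[j + h]:
                    h += 1
                lcp[rank[i]] = h
                if h > 0:
                    h -= 1
            else:
                h = 0
        return lcp

    s = "banana"
    sa = suffix_array(s)        # [5, 3, 1, 0, 4, 2]
    print(lcp_array(s, sa))     # [0, 1, 3, 0, 0, 2]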
Abstract:
According to certain arguments, computation is observer-relative either in the sense that many physical systems implement many computations (Hilary Putnam), or in the sense that almost all physical systems implement all computations (John Searle). If sound, these arguments have a potentially devastating consequence for the computational theory of mind: if arbitrary physical systems can be seen to implement arbitrary computations, the notion of computation seems to lose all explanatory power as far as brains and minds are concerned. David Chalmers and B. Jack Copeland have attempted to counter these relativist arguments by placing certain constraints on the definition of implementation. In this thesis, I examine their proposals and find both wanting in some respects. In the course of this examination, I give a formal definition of the class of combinatorial-state automata, upon which Chalmers's account of implementation is based. I show that this definition implies two theorems (one an observation due to Curtis Brown) concerning the computational power of combinatorial-state automata, theorems which speak against founding the theory of implementation upon this formalism. Toward the end of the thesis, I sketch a definition of the implementation of Turing machines in dynamical systems, and offer this as an alternative to Chalmers's and Copeland's accounts of implementation. I demonstrate that the definition does not imply Searle's claim of the universal implementation of computations. However, the definition may support claims that are weaker than Searle's, yet still troubling to the computationalist. There remains a kernel of relativity in implementation in any case, since the interpretation of physical systems seems itself to be an observer-relative matter, to some degree at least. This observation helps clarify the role the notion of computation can play in cognitive science. Specifically, I will argue that the notion should be conceived as an instrumental rather than as a fundamental or foundational one.
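As rough orientation only (the thesis gives its own formal definition, which may differ in detail from this sketch of Chalmers's informal description): a combinatorial-state automaton resembles a finite-state automaton except that its states, inputs and outputs are vectors of substates,

\[ s = [s^1, \dots, s^k] \in S_1 \times \cdots \times S_k, \qquad i = [i^1, \dots, i^m], \qquad o = [o^1, \dots, o^n], \]

and the transition rule specifies each component of the next state and of the output as a function of the components of the current state and input vectors. The relativist worry is whether a given physical system, carved into substates in some arbitrary way, can thereby be said to implement such an automaton.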
Abstract:
Reorganizing a dataset so that its hidden structure can be observed is useful in any data analysis task. For example, detecting a regularity in a dataset helps us to interpret the data, compress the data, and explain the processes behind the data. We study datasets that come in the form of binary matrices (tables with 0s and 1s). Our goal is to develop automatic methods that bring out certain patterns by permuting the rows and columns. We concentrate on the following patterns in binary matrices: consecutive-ones (C1P), simultaneous consecutive-ones (SC1P), nestedness, k-nestedness, and bandedness. These patterns reflect specific types of interplay and variation between the rows and columns, such as continuity and hierarchies. Furthermore, their combinatorial properties are interlinked, which helps us to develop the theory of binary matrices and efficient algorithms. Indeed, we can detect all these patterns in a binary matrix efficiently, that is, in time polynomial in the size of the matrix. Since real-world datasets often contain noise and errors, we rarely witness perfect patterns. Therefore we also need to assess how far an input matrix is from a pattern: we count the number of flips (from 0s to 1s or vice versa) needed to bring out the perfect pattern in the matrix. Unfortunately, for most patterns it is an NP-complete problem to find the minimum distance to a matrix that has the perfect pattern, which means that the existence of a polynomial-time algorithm is unlikely. To find patterns in datasets with noise, we need methods that are noise-tolerant and work in practical time on large datasets. The theory of binary matrices gives rise to robust heuristics that perform well on synthetic data and discover easily interpretable structures in real-world datasets: dialectal variation in spoken Finnish, a division of European locations by the hierarchies found in mammal occurrences, and co-occurring groups in network data. In addition to determining the distance from a dataset to a pattern, we need to determine whether the pattern is significant or a mere product of random chance. To this end, we use significance testing: we deem a dataset significant if it appears exceptional when compared to datasets generated from a certain null hypothesis. After detecting a significant pattern in a dataset, it is up to domain experts to interpret the results in terms of the application.
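As a small, self-contained illustration of one of the patterns mentioned above (a sketch written for this summary, not one of the thesis's algorithms): a binary matrix is fully nested when the sets of 1-columns of its rows form a chain under inclusion, which can be checked directly:

    def is_nested(matrix):
        # Collect the set of columns holding a 1 in each row.
        supports = [frozenset(j for j, v in enumerate(row) if v) for row in matrix]
        # In a nested matrix, sorting the supports by decreasing size must
        # yield a chain: every support contains the next one.
        supports.sort(key=len, reverse=True)
        return all(b <= a for a, b in zip(supports, supports[1:]))

    print(is_nested([[1, 1, 1],
                     [1, 1, 0],
                     [1, 0, 0]]))   # True: the rows' 1-sets form a chain
    print(is_nested([[1, 1, 0],
                     [0, 1, 1]]))   # False: the two 1-sets are incomparable

Counting the minimum number of 0/1 flips needed to turn a noisy matrix into such a perfect pattern is the NP-complete distance problem discussed in the abstract.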
Abstract:
The most prominent objective of the thesis is the development of what we call generalized descriptive set theory. There, we study the space of all functions from a fixed uncountable cardinal to itself, or to a two-element set. These correspond to generalized notions of the Baire space (functions from the natural numbers to themselves with the product topology) and the Cantor space (functions from the natural numbers to the set {0,1}), respectively. We generalize the notion of Borel sets in three different ways and study the corresponding Borel structures with the aim of generalizing classical theorems of descriptive set theory or providing counterexamples. In particular, we are interested in equivalence relations on these spaces and their Borel reducibility to each other. The last chapter shows, using game-theoretic techniques, that the order of Borel equivalence relations under Borel reducibility has very high complexity. The techniques used on the set-theoretic side of the thesis include forcing, general topological notions such as meager sets, and combinatorial games of infinite length. By coding uncountable models as functions, we are able to apply the understanding gained in generalized descriptive set theory to the model theory of uncountable models. The links between the theorems of model theory (including Shelah's classification theory) and the theorems of pure set theory are provided by game-theoretic techniques, ranging from Ehrenfeucht-Fraïssé games in model theory to cub games in set theory. The bottom line of the research is that the descriptive (set-theoretic) complexity of the isomorphism relation of a first-order definable model class goes hand in hand with the stability-theoretic complexity of the corresponding first-order theory. The first chapter of the thesis has a slightly different focus and is purely concerned with a certain modification of the well-known Ehrenfeucht-Fraïssé games. There we (my supervisor Tapani Hyttinen and I) answer some natural questions about that game, mainly concerning determinacy and its relation to the standard EF game.
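To fix notation in a hedged way (this is the standard setup in the area; the thesis may impose additional assumptions, such as \(\kappa^{<\kappa} = \kappa\)): the generalized Baire space over an uncountable cardinal \(\kappa\) is the set \(\kappa^\kappa\) of all functions \(f : \kappa \to \kappa\), topologized by the basic open sets

\[ N_\eta \;=\; \{\, f \in \kappa^\kappa : \eta \subseteq f \,\}, \qquad \eta \in \kappa^{<\kappa}, \]

i.e. neighbourhoods determined by initial segments of length less than \(\kappa\); the generalized Cantor space \(2^\kappa\) is defined analogously.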
Abstract:
A local algorithm with local horizon r is a distributed algorithm that runs in r synchronous communication rounds; here r is a constant that does not depend on the size of the network. As a consequence, the output of a node in a local algorithm depends only on the input within r hops of the node. We give tight bounds on the local horizon for a class of local algorithms for combinatorial problems on unit-disk graphs (UDGs). Most of our bounds are due to a refined analysis of existing approaches, while others are obtained by suggesting new algorithms. The algorithms we consider are based on network decompositions guided by a rectangular tiling of the plane. The algorithms are applied to matching, independent set, graph colouring, vertex cover, and dominating set. We also study local algorithms on quasi-UDGs, which are a popular generalisation of UDGs aimed at more realistic modelling of communication between the network nodes. Analysing the local algorithms on quasi-UDGs allows us to assume that the nodes know their coordinates only approximately, up to an additive error. Despite this localisation error, the quality of the solutions to problems on quasi-UDGs remains the same as in the case of UDGs with perfect location awareness. We analyse the increase in the local horizon that comes with moving from UDGs to quasi-UDGs.
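To illustrate the tiling idea in a minimal, hypothetical form (the tile size and all names below are assumptions made for this sketch, not taken from the thesis): each node can compute, from its own coordinates, the rectangular tile it belongs to, and nodes sharing a tile can then coordinate locally.

    import math

    def tile_of(x, y, side):
        # Map a node's coordinates to the index of the rectangular tile
        # containing it; 'side' is a hypothetical tile side length.
        return (math.floor(x / side), math.floor(y / side))

    # Nodes within unit distance may still fall into neighbouring tiles,
    # which is why tiling-based algorithms handle adjacent tiles in
    # separate phases or treat tile boundaries explicitly.
    print(tile_of(1.9, 2.3, side=2.0))   # (0, 1)
    print(tile_of(2.1, 2.3, side=2.0))   # (1, 1)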