16 resultados para Linear algebra
em Helda - Digital Repository of University of Helsinki
Resumo:
Matrix decompositions, where a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra. In data mining one needs to have results that are interpretable -- and what is considered interpretable in data mining can be very different to what is considered interpretable in linear algebra. --- The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices where the factor matrices are assumed to be binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability -- factor matrices are of the same type as the original matrix -- and allows the use of Boolean matrix multiplication, which is often more intuitive than normal matrix multiplication with binary matrices. Also several other decomposition methods are described, and the computational complexity of computing them is studied together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted with both synthetic and real-world data.
Resumo:
The metabolism of an organism consists of a network of biochemical reactions that transform small molecules, or metabolites, into others in order to produce energy and building blocks for essential macromolecules. The goal of metabolic flux analysis is to uncover the rates, or the fluxes, of those biochemical reactions. In a steady state, the sum of the fluxes that produce an internal metabolite is equal to the sum of the fluxes that consume the same molecule. Thus the steady state imposes linear balance constraints to the fluxes. In general, the balance constraints imposed by the steady state are not sufficient to uncover all the fluxes of a metabolic network. The fluxes through cycles and alternative pathways between the same source and target metabolites remain unknown. More information about the fluxes can be obtained from isotopic labelling experiments, where a cell population is fed with labelled nutrients, such as glucose that contains 13C atoms. Labels are then transferred by biochemical reactions to other metabolites. The relative abundances of different labelling patterns in internal metabolites depend on the fluxes of pathways producing them. Thus, the relative abundances of different labelling patterns contain information about the fluxes that cannot be uncovered from the balance constraints derived from the steady state. The field of research that estimates the fluxes utilizing the measured constraints to the relative abundances of different labelling patterns induced by 13C labelled nutrients is called 13C metabolic flux analysis. There exist two approaches of 13C metabolic flux analysis. In the optimization approach, a non-linear optimization task, where candidate fluxes are iteratively generated until they fit to the measured abundances of different labelling patterns, is constructed. In the direct approach, linear balance constraints given by the steady state are augmented with linear constraints derived from the abundances of different labelling patterns of metabolites. Thus, mathematically involved non-linear optimization methods that can get stuck to the local optima can be avoided. On the other hand, the direct approach may require more measurement data than the optimization approach to obtain the same flux information. Furthermore, the optimization framework can easily be applied regardless of the labelling measurement technology and with all network topologies. In this thesis we present a formal computational framework for direct 13C metabolic flux analysis. The aim of our study is to construct as many linear constraints to the fluxes from the 13C labelling measurements using only computational methods that avoid non-linear techniques and are independent from the type of measurement data, the labelling of external nutrients and the topology of the metabolic network. The presented framework is the first representative of the direct approach for 13C metabolic flux analysis that is free from restricting assumptions made about these parameters.In our framework, measurement data is first propagated from the measured metabolites to other metabolites. The propagation is facilitated by the flow analysis of metabolite fragments in the network. Then new linear constraints to the fluxes are derived from the propagated data by applying the techniques of linear algebra.Based on the results of the fragment flow analysis, we also present an experiment planning method that selects sets of metabolites whose relative abundances of different labelling patterns are most useful for 13C metabolic flux analysis. Furthermore, we give computational tools to process raw 13C labelling data produced by tandem mass spectrometry to a form suitable for 13C metabolic flux analysis.
Resumo:
We solve the Dynamic Ehrenfeucht-Fra\"iss\'e Game on linear orders for both players, yielding a normal form for quantifier-rank equivalence classes of linear orders in first-order logic, infinitary logic, and generalized-infinitary logics with linearly ordered clocks. We show that Scott Sentences can be manipulated quickly, classified into local information, and consistency can be decided effectively in the length of the Scott Sentence. We describe a finite set of linked automata moving continuously on a linear order. Running them on ordinals, we compute the ordinal truth predicate and compute truth in the constructible universe of set-theory. Among the corollaries are a study of semi-models as efficient database of both model-theoretic and formulaic information, and a new proof of the atomicity of the Boolean algebra of sentences consistent with the theory of linear order -- i.e., that the finitely axiomatized theories of linear order are dense.
Resumo:
Environmentally benign and economical methods for the preparation of industrially important hydroxy acids and diacids were developed. The carboxylic acids, used in polyesters, alkyd resins, and polyamides, were obtained by the oxidation of the corresponding alcohols with hydrogen peroxide or air catalyzed by sodium tungstate or supported noble metals. These oxidations were carried out using water as a solvent. The alcohols are also a useful alternative to the conventional reactants, hydroxyaldehydes and cycloalkanes. The oxidation of 2,2-disubstituted propane-1,3-diols with hydrogen peroxide catalyzed by sodium tungstate afforded 2,2-disubstituted 3-hydroxypropanoic acids and 1,1-disubstituted ethane-1,2-diols as products. A computational study of the Baeyer-Villiger rearrangement of the intermediate 2,2-disubstituted 3-hydroxypropanals gave in-depth data of the mechanism of the reaction. Linear primary diols having chain length of at least six carbons were easily oxidized with hydrogen peroxide to linear dicarboxylic acids catalyzed by sodium tungstate. The Pt/C catalyzed air oxidation of 2,2-disubstituted propane-1,3-diols and linear primary diols afforded the highest yield of the corresponding hydroxy acids, while the Pt, Bi/C catalyzed oxidation of the diols afforded the highest yield of the corresponding diacids. The mechanism of the promoted oxidation was best described by the ensemble effect, and by the formation of a complex of the hydroxy and the carboxy groups of the hydroxy acids with bismuth atoms. The Pt, Bi/C catalyzed air oxidation of 2-substituted 2-hydroxymethylpropane-1,3-diols gave 2-substituted malonic acids by the decarboxylation of the corresponding triacids. Activated carbon was the best support and bismuth the most efficient promoter in the air oxidation of 2,2-dialkylpropane-1,3-diols to diacids. In oxidations carried out in organic solvents barium sulfate could be a valuable alternative to activated carbon as a non-flammable support. In the Pt/C catalyzed air oxidation of 2,2-disubstituted propane-1,3-diols to 2,2-disubstituted 3-hydroxypropanoic acids the small size of the 2-substituents enhanced the rate of the oxidation. When the potential of platinum of the catalyst was not controlled, the highest yield of the diacids in the Pt, Bi/C catalyzed air oxidation of 2,2-dialkylpropane-1,3-diols was obtained in the regime of mass transfer. The most favorable pH of the reaction mixture of the promoted oxidation was 10. The reaction temperature of 40°C prevented the decarboxylation of the diacids.
Resumo:
This paper examines how volatility in financial markets can preferable be modeled. The examination investigates how good the models for the volatility, both linear and nonlinear, are in absorbing skewness and kurtosis. The examination is done on the Nordic stock markets, including Finland, Sweden, Norway and Denmark. Different linear and nonlinear models are applied, and the results indicates that a linear model can almost always be used for modeling the series under investigation, even though nonlinear models performs slightly better in some cases. These results indicate that the markets under study are exposed to asymmetric patterns only to a certain degree. Negative shocks generally have a more prominent effect on the markets, but these effects are not really strong. However, in terms of absorbing skewness and kurtosis, nonlinear models outperform linear ones.
Resumo:
In a max-min LP, the objective is to maximise ω subject to Ax ≤ 1, Cx ≥ ω1, and x ≥ 0 for nonnegative matrices A and C. We present a local algorithm (constant-time distributed algorithm) for approximating max-min LPs. The approximation ratio of our algorithm is the best possible for any local algorithm; there is a matching unconditional lower bound.
Resumo:
In a max-min LP, the objective is to maximise ω subject to Ax ≤ 1, Cx ≥ ω1, and x ≥ 0. In a min-max LP, the objective is to minimise ρ subject to Ax ≤ ρ1, Cx ≥ 1, and x ≥ 0. The matrices A and C are nonnegative and sparse: each row ai of A has at most ΔI positive elements, and each row ck of C has at most ΔK positive elements. We study the approximability of max-min LPs and min-max LPs in a distributed setting; in particular, we focus on local algorithms (constant-time distributed algorithms). We show that for any ΔI ≥ 2, ΔK ≥ 2, and ε > 0 there exists a local algorithm that achieves the approximation ratio ΔI (1 − 1/ΔK) + ε. We also show that this result is the best possible: no local algorithm can achieve the approximation ratio ΔI (1 − 1/ΔK) for any ΔI ≥ 2 and ΔK ≥ 2.
Resumo:
A new classification and linear sequence of the gymnosperms based on previous molecular and morphological phylogenetic and other studies is presented. Currently accepted genera are listed for each family and arranged according to their (probable) phylogenetic position. A full synonymy is provided, and types are listed for accepted genera. An index to genera assists in easy access to synonymy and family placement of genera.
Resumo:
Throughout the history of the classification of extant ferns (monilophytes) and lycophytes, familial and generic concepts have been in great flux. For the organisation of lycophytes and ferns in herbaria, books, checklists, indices and spore banks and on the internet, this poses a problem, and a standardized linear sequence of these plants is therefore in great need. We provide here a linear classification to the extant lycophytes and ferns based on current phylogenetic knowledge; this provides a standardized guide for organisation of fern collections into a more natural sequence. Two new families, Diplaziopsidaceae and Rhachidosoraceae, are here introduced.