22 results for linear complexity
in Helda - Digital Repository of University of Helsinki
Abstract:
Matrix decompositions, where a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra. In data mining one needs results that are interpretable -- and what is considered interpretable in data mining can be very different from what is considered interpretable in linear algebra. The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices where the factor matrices are assumed to be binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability -- the factor matrices are of the same type as the original matrix -- and allows the use of Boolean matrix multiplication, which is often more intuitive than normal matrix multiplication of binary matrices. Several other decomposition methods are also described, and the computational complexity of computing them is studied together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted with both synthetic and real-world data.
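As a hedged illustration of the Boolean product mentioned above (the factor matrices B and C below are invented toy data, not from the thesis), a Boolean decomposition A = B ∘ C keeps every entry binary where ordinary matrix multiplication would not:

```python
def boolean_product(B, C):
    """Boolean matrix product: result[i][j] = 1 iff some k has B[i][k] = C[k][j] = 1."""
    n, m, inner = len(B), len(C[0]), len(C)
    return [[int(any(B[i][t] and C[t][j] for t in range(inner))) for j in range(m)]
            for i in range(n)]

# Toy rank-2 Boolean decomposition of a 3x3 binary matrix:
B = [[1, 0],
     [1, 1],
     [0, 1]]
C = [[1, 1, 0],
     [0, 1, 1]]
A = boolean_product(B, C)
# Ordinary arithmetic multiplication would give a 2 at position (1, 1);
# the Boolean product caps it at 1, so A stays a binary matrix.
```

The overlap at entry (1, 1) is exactly where Boolean and ordinary multiplication disagree, which is why the Boolean semiring is the natural choice for binary factor matrices.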
Abstract:
This dissertation is a theoretical study of finite-state based grammars used in natural language processing. The study is concerned with certain varieties of finite-state intersection grammars (FSIG) whose parsers define regular relations between surface strings and annotated surface strings. The study focuses on three aspects of FSIGs. (i) Computational complexity of grammars under limiting parameters. The computational complexity of practical natural language processing is approached through performance-motivated parameters on structural complexity. Each parameter splits some grammars in the Chomsky hierarchy into an infinite set of subset approximations. When the approximations are regular, they seem to fall into the logarithmic-time hierarchy and the dot-depth hierarchy of star-free regular languages. This theoretical result is important and possibly relevant to grammar induction. (ii) Linguistically applicable structural representations. Concerning linguistically applicable representations of syntactic entities, the study presents new bracketing schemes that cope with dependency links, left- and right-branching, crossing dependencies, and spurious ambiguity. New grammar representations that resemble the Chomsky-Schützenberger representation of context-free languages are also presented, including, in particular, representations for mildly context-sensitive non-projective dependency grammars whose performance-motivated approximations are parseable in linear time. (iii) Compilation and simplification of linguistic constraints. Efficient compilation methods for certain regular operations, such as generalized restriction, are presented. These include an elegant algorithm that has already been adopted in a proprietary finite-state tool. In addition to the compilation methods, an approach to on-the-fly simplification of finite-state representations of parse forests is sketched.
These findings are tightly coupled with each other under the theme of locality. I argue that the findings help us to develop better, linguistically oriented formalisms for finite-state parsing and to develop more efficient parsers for natural language processing. Keywords: syntactic parsing, finite-state automata, dependency grammar, first-order logic, linguistic performance, star-free regular approximations, mildly context-sensitive grammars
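A finite-state intersection grammar licenses exactly the strings accepted by every constraint automaton at once. The following Python sketch illustrates the idea with two invented toy constraints (not taken from the dissertation), each encoded as a small DFA:

```python
def run_dfa(delta, start, accepting, s):
    """Return True iff the DFA accepts string s; missing transitions reject."""
    state = start
    for ch in s:
        if (state, ch) not in delta:
            return False
        state = delta[(state, ch)]
    return state in accepting

# Toy constraint 1: the string contains an even number of a's.
even_a = ({(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 0, (1, 'b'): 1}, 0, {0})
# Toy constraint 2: the string ends in b.
ends_b = ({(0, 'a'): 0, (0, 'b'): 1, (1, 'a'): 0, (1, 'b'): 1}, 0, {1})

def in_intersection(s, constraints):
    """The 'grammar' accepts s iff every constraint automaton accepts it."""
    return all(run_dfa(delta, start, accepting, s)
               for (delta, start, accepting) in constraints)

grammar = [even_a, ends_b]
```

Here "aab" satisfies both constraints, while "ab" violates the even-a constraint; a real FSIG applies the same conjunction idea to linguistically motivated constraints.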
Abstract:
How toddlers with special needs adjust to the daycare setting: a multiple case study of how relationships with adults and children are built. The aim of this study is to describe how toddlers with special needs adjust to daycare. The toddlers' emotional well-being and involvement in daycare activities are investigated in particular, as is how relationships are built between an adult and a child and between children. Through socio-cultural theory, daycare is examined as a pedagogical institution in which the child adapts by participating in social and cultural activities with others. The development of the child results from experiences gained through the constant relationship between the child, the family, and the social context. According to attachment theory, the inner self-regulation that allows the child to adapt safely to new situations develops chiefly in the relationship between the child under three years of age and the attending adult. Relationships between toddlers in daycare are usually built through incidental encounters in play and daily activities. In these relationships, toddlers gain information about themselves and the other children. The complexity of the rules that organize social action in the setting is challenging for the children, and they need constant support from the adults. The participants in the study were five toddlers with special needs. When applying for daycare they were less than three years old, and they had a specialist's statement of their special needs and a referral to daycare. The children were observed by video recording their attendance in daycare every 3-4 months from their first day. Approximately 15 hours of material was collected and analysed with the Transana program. The qualitative material was analysed by first constructing a descriptive model that explains and theorizes the phenomenon.
From the summary of the narrative, a hypothesis is formulated and tested with quantitative methods, using correlations, analyses of variance, and general linear modeling to examine differences between repeated measures and connections between variables. The results of the study are built up theoretically to reach a consistent conception between the theory and the research findings. The toddlers in the study were all dependent on the support of adults in all daycare situations. They could not associate with the other children without adult support, and their involvement in activities was low. The engagement of an adult in interaction was necessary for the children's involvement in activities and for co-operation with the other children. The engagement of teachers was statistically significantly higher than that of other professions.
Abstract:
With new polymer-based technologies and materials, the need for tailored polymers has grown. Controlled polymerization methods developed at the end of the last millennium have opened new possibilities not only for the synthesis of complex polymers but also for the design and preparation of functional nanostructures based on self-assembly. These can mimic structures found in nature, formed for example by lipids and proteins. Self-assembling molecules are often amphiphilic, i.e. they consist of hydrophilic and hydrophobic parts; in polymers these parts can form separate blocks, in which case one speaks of amphiphilic block copolymers. Depending on the composition and shape of the ordered structures, amphiphilic block copolymers have been studied or already used in nanotechnology, elastomers, lubricants, surfactants, drug delivery, paints, and in the electronics, cosmetics, and food industries. The most commonly used amphiphilic block copolymers have been linear, but recently research has turned toward more complex architectures, such as star polymers. In star-shaped polymers the core-shell structure typical of micelles is retained even at very low polymer concentrations, because the polymer chains are attached to each other at a single point. They are therefore particularly interesting candidates for binding and releasing various hydrophobic organic compounds. In this work, the self-assembly of amphiphilic star polymers in aqueous solutions has been examined both experimentally and by computer simulations. The work consists of two parts: the synthesis of star polymers with macrocyclic initiators, and the study of the properties of amphiphilic star-shaped block copolymers.
Abstract:
Environmentally benign and economical methods for the preparation of industrially important hydroxy acids and diacids were developed. The carboxylic acids, used in polyesters, alkyd resins, and polyamides, were obtained by the oxidation of the corresponding alcohols with hydrogen peroxide or air, catalyzed by sodium tungstate or supported noble metals. These oxidations were carried out using water as the solvent. The alcohols are also a useful alternative to the conventional reactants, hydroxyaldehydes and cycloalkanes. The oxidation of 2,2-disubstituted propane-1,3-diols with hydrogen peroxide catalyzed by sodium tungstate afforded 2,2-disubstituted 3-hydroxypropanoic acids and 1,1-disubstituted ethane-1,2-diols as products. A computational study of the Baeyer-Villiger rearrangement of the intermediate 2,2-disubstituted 3-hydroxypropanals gave in-depth data on the mechanism of the reaction. Linear primary diols with a chain length of at least six carbons were easily oxidized with hydrogen peroxide to linear dicarboxylic acids, catalyzed by sodium tungstate. The Pt/C-catalyzed air oxidation of 2,2-disubstituted propane-1,3-diols and linear primary diols afforded the highest yield of the corresponding hydroxy acids, while the Pt,Bi/C-catalyzed oxidation of the diols afforded the highest yield of the corresponding diacids. The mechanism of the promoted oxidation was best described by the ensemble effect and by the formation of a complex between the hydroxy and carboxy groups of the hydroxy acids and bismuth atoms. The Pt,Bi/C-catalyzed air oxidation of 2-substituted 2-hydroxymethylpropane-1,3-diols gave 2-substituted malonic acids by the decarboxylation of the corresponding triacids. Activated carbon was the best support and bismuth the most efficient promoter in the air oxidation of 2,2-dialkylpropane-1,3-diols to diacids. In oxidations carried out in organic solvents, barium sulfate could be a valuable alternative to activated carbon as a non-flammable support.
In the Pt/C-catalyzed air oxidation of 2,2-disubstituted propane-1,3-diols to 2,2-disubstituted 3-hydroxypropanoic acids, the small size of the 2-substituents enhanced the rate of the oxidation. When the potential of the platinum catalyst was not controlled, the highest yield of the diacids in the Pt,Bi/C-catalyzed air oxidation of 2,2-dialkylpropane-1,3-diols was obtained in the mass-transfer regime. The most favorable pH of the reaction mixture for the promoted oxidation was 10. A reaction temperature of 40°C prevented the decarboxylation of the diacids.
Abstract:
We solve the Dynamic Ehrenfeucht-Fraïssé game on linear orders for both players, yielding a normal form for quantifier-rank equivalence classes of linear orders in first-order logic, infinitary logic, and generalized-infinitary logics with linearly ordered clocks. We show that Scott sentences can be manipulated quickly and classified into local information, and that consistency can be decided effectively in the length of the Scott sentence. We describe a finite set of linked automata moving continuously on a linear order. Running them on ordinals, we compute the ordinal truth predicate and compute truth in the constructible universe of set theory. Among the corollaries are a study of semi-models as efficient databases of both model-theoretic and formulaic information, and a new proof of the atomicity of the Boolean algebra of sentences consistent with the theory of linear order -- i.e., that the finitely axiomatized theories of linear order are dense.
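The k-round Ehrenfeucht-Fraïssé game on finite linear orders admits a short recursion: picking an element splits a linear order into a left and a right part, and Duplicator must answer with a matching split. The Python sketch below is a brute-force check for small sizes (not the normal-form machinery of the abstract), reproducing the classical fact that orders of different sizes are k-round equivalent exactly when both sizes are at least 2^k - 1:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def ef_equiv(m, n, k):
    """True iff Duplicator wins the k-round EF game on linear orders of sizes m and n."""
    if k == 0:
        return True
    if m == 0 or n == 0:
        return m == n
    return _answers(m, n, k) and _answers(n, m, k)

def _answers(p, q, k):
    # Spoiler picks an element of the size-p order, splitting it into parts of
    # sizes (a, p-1-a); Duplicator must find a split (c, q-1-c) of the size-q
    # order that remains winning in both halves with k-1 rounds left.
    return all(
        any(ef_equiv(a, c, k - 1) and ef_equiv(p - 1 - a, q - 1 - c, k - 1)
            for c in range(q))
        for a in range(p)
    )
```

For example, linear orders of sizes 2 and 3 are separated by a rank-2 sentence (a point with something before and after it), while sizes 3 and 4 are rank-2 equivalent.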
Abstract:
Minimum Description Length (MDL) is an information-theoretic principle that can be used for model selection and other statistical inference tasks. There are various ways to use the principle in practice. One theoretically valid way is to use the normalized maximum likelihood (NML) criterion. Due to computational difficulties, this approach has not been used very often. This thesis presents efficient floating-point algorithms that make it possible to compute the NML for multinomial, Naive Bayes and Bayesian forest models. None of the presented algorithms rely on asymptotic analysis and with the first two model classes we also discuss how to compute exact rational number solutions.
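For intuition, the NML normalizer of a K-valued multinomial with sample size n is the sum of maximized likelihoods over all possible count vectors. The brute-force sketch below computes it exactly for tiny cases; it is exponential in K, unlike the efficient algorithms the thesis presents, and is offered only as an illustration of the definition:

```python
from math import factorial

def compositions(n, K):
    """All count vectors (h1, ..., hK) of nonnegative integers summing to n."""
    if K == 1:
        yield (n,)
        return
    for h in range(n + 1):
        for rest in compositions(n - h, K - 1):
            yield (h,) + rest

def nml_normalizer(K, n):
    """C(K, n) = sum over count vectors h of the maximized likelihood P_ML(h)."""
    total = 0.0
    for h in compositions(n, K):
        coef = factorial(n)
        for x in h:
            coef //= factorial(x)           # multinomial coefficient
        p_ml = 1.0
        for x in h:
            if x > 0:
                p_ml *= (x / n) ** x        # ML probability of its own counts
        total += coef * p_ml
    return total

# The NML probability of observed counts h is then P_ML(h) / nml_normalizer(K, n).
```

For a Bernoulli model (K = 2) with n = 2 this gives C = 1 + 0.5 + 1 = 2.5, so the NML code length adds log2(2.5) bits of regret on top of the maximized likelihood.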
Abstract:
We have presented an overview of the FSIG approach and related FSIG grammars to issues of very low complexity and parsing strategy. We ended up with serious optimism, according to which most FSIG grammars could be decomposed in a reasonable way and then processed efficiently.
Abstract:
Financial time series tend to behave in a manner that is not directly drawn from a normal distribution. Asymmetries and nonlinearities are usually present, and these characteristics need to be taken into account. Making forecasts and predictions of future return and risk is rather complicated. The existing models for predicting risk help to a certain degree, but the complexity of financial time series data makes the task difficult. The essays in this dissertation support the introduction of nonlinearities and asymmetries for the purpose of better models and forecasts of both mean and variance. Linear and nonlinear models are consequently introduced. The advantage of nonlinear models is that they can take asymmetries into account. Asymmetric patterns usually mean that large negative returns appear more often than positive returns of the same magnitude. This goes hand in hand with the fact that negative returns are associated with higher risk than positive returns of the same magnitude. These models are important because of their ability to make the best possible estimations and predictions of future returns and risk.
Abstract:
This paper examines how volatility in financial markets can preferably be modeled. The examination investigates how well the models for volatility, both linear and nonlinear, absorb skewness and kurtosis. The examination is done on the Nordic stock markets of Finland, Sweden, Norway, and Denmark. Different linear and nonlinear models are applied, and the results indicate that a linear model can almost always be used for modeling the series under investigation, even though nonlinear models perform slightly better in some cases. These results indicate that the markets under study are exposed to asymmetric patterns only to a certain degree. Negative shocks generally have a more prominent effect on the markets, but these effects are not particularly strong. However, in terms of absorbing skewness and kurtosis, nonlinear models outperform linear ones.
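One standard way to let conditional variance respond asymmetrically to shocks is a GJR-GARCH-style recursion in which negative returns receive an extra loading. The sketch below is an illustrative toy with invented parameter values, not estimates from the Nordic data:

```python
def variance_path(returns, omega, alpha, beta, gamma=0.0, var0=1.0):
    """Conditional variance recursion
    sigma2_t = omega + (alpha + gamma * I[r_{t-1} < 0]) * r_{t-1}**2 + beta * sigma2_{t-1}.
    gamma == 0 gives a symmetric GARCH(1,1); gamma > 0 makes negative shocks
    raise next-period variance more than positive shocks of the same size."""
    sigma2 = [var0]
    for r in returns[:-1]:
        leverage = gamma if r < 0 else 0.0
        sigma2.append(omega + (alpha + leverage) * r * r + beta * sigma2[-1])
    return sigma2

# Shocks of equal magnitude but opposite sign:
after_neg = variance_path([-1.0, 0.0], omega=0.1, alpha=0.05, beta=0.9, gamma=0.1)
after_pos = variance_path([+1.0, 0.0], omega=0.1, alpha=0.05, beta=0.9, gamma=0.1)
```

With these toy parameters the negative shock implies a next-period variance of 1.15 against 1.05 for the positive shock, which is exactly the asymmetric pattern the paper tests for.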
Abstract:
In this dissertation I study language complexity from a typological perspective. Since the structuralist era, it has been assumed that local complexity differences in languages are balanced out in cross-linguistic comparisons and that complexity is not affected by the geopolitical or sociocultural aspects of the speech community. However, these assumptions have seldom been studied systematically from a typological point of view. My objective is to define complexity so that it is possible to compare it across languages and to approach its variation with the methods of quantitative typology. My main empirical research questions are: i) does language complexity vary in any systematic way in local domains, and ii) can language complexity be affected by the geographical or social environment? These questions are studied in three articles, whose findings are summarized in the introduction to the dissertation. In order to enable cross-language comparison, I measure complexity as the description length of the regularities in an entity; I separate it from difficulty, focus on local instead of global complexity, and break it up into different types. This approach helps avoid the problems that plagued earlier metrics of language complexity. My approach to grammar is functional-typological in nature, and the theoretical framework is basic linguistic theory. I delimit the empirical research functionally to the marking of core arguments (the basic participants in the sentence). I assess the distributions of complexity in this domain with multifactorial statistical methods and use different sampling strategies, implementing, for instance, the Greenbergian view of universals as diachronic laws of type preference. My data come from large and balanced samples (up to approximately 850 languages), drawn mainly from reference grammars. 
The results suggest that various significant trends occur in the marking of core arguments in regard to complexity and that complexity in this domain correlates with population size. These results provide evidence that linguistic patterns interact among themselves in terms of complexity, that language structure adapts to the social environment, and that there may be cognitive mechanisms that limit complexity locally. My approach to complexity and language universals can therefore be successfully applied to empirical data and may serve as a model for further research in these areas.
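As a crude illustration of measuring complexity by description length: the dissertation defines description length over grammatical regularities, but a compression-based proxy conveys the intuition that a regular marking pattern has a shorter description than an irregular one. The "patterns" below are invented toy strings, not data from the study:

```python
import random
import zlib

def description_length(text: str) -> int:
    """Approximate description length as the size of a compressed encoding:
    the more regular the string, the shorter its description."""
    return len(zlib.compress(text.encode("utf-8"), 9))

# A maximally regular toy "marking pattern": one rule applied throughout.
regular = "nom-acc " * 100

# An irregular toy pattern: marking varies unpredictably from clause to clause.
rng = random.Random(0)
irregular = " ".join(rng.choice(["nom", "acc", "erg", "abs"]) + "-" +
                     rng.choice(["zero", "case", "agr"]) for _ in range(100))
```

The regular string compresses to a small fraction of its raw length, while the irregular one resists compression; in the same spirit, the dissertation's metric counts the length of the description needed to state the regularities of an argument-marking system.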