16 resultados para Binary Morphology
em Helda - Digital Repository of University of Helsinki
Resumo:
The usual task in music information retrieval (MIR) is to find occurrences of a monophonic query pattern within a music database, which can contain both monophonic and polyphonic content. The so-called query-by-humming systems are a famous instance of content-based MIR. In such a system, the user's hummed query is converted into symbolic form to perform search operations in a similarly encoded database. The symbolic representation (e.g., textual, MIDI or vector data) is typically a quantized and simplified version of the sampled audio data, yielding to faster search algorithms and space requirements that can be met in real-life situations. In this thesis, we investigate geometric approaches to MIR. We first study some musicological properties often needed in MIR algorithms, and then give a literature review on traditional (e.g., string-matching-based) MIR algorithms and novel techniques based on geometry. We also introduce some concepts from digital image processing, namely the mathematical morphology, which we will use to develop and implement four algorithms for geometric music retrieval. The symbolic representation in the case of our algorithms is a binary 2-D image. We use various morphological pre- and post-processing operations on the query and the database images to perform template matching / pattern recognition for the images. The algorithms are basically extensions to classic image correlation and hit-or-miss transformation techniques used widely in template matching applications. They aim to be a future extension to the retrieval engine of C-BRAHMS, which is a research project of the Department of Computer Science at University of Helsinki.
Resumo:
The number of drug substances in formulation development in the pharmaceutical industry is increasing. Some of these are amorphous drugs and have glass transition below ambient temperature, and thus they are usually difficult to formulate and handle. One reason for this is the reduced viscosity, related to the stickiness of the drug, that makes them complicated to handle in unit operations. Thus, the aim in this thesis was to develop a new processing method for a sticky amorphous model material. Furthermore, model materials were characterised before and after formulation, using several characterisation methods, to understand more precisely the prerequisites for physical stability of amorphous state against crystallisation. The model materials used were monoclinic paracetamol and citric acid anhydrate. Amorphous materials were prepared by melt quenching or by ethanol evaporation methods. The melt blends were found to have slightly higher viscosity than the ethanol evaporated materials. However, melt produced materials crystallised more easily upon consecutive shearing than ethanol evaporated materials. The only material that did not crystallise during shearing was a 50/50 (w/w, %) blend regardless of the preparation method and it was physically stable at least two years in dry conditions. Shearing at varying temperatures was established to measure the physical stability of amorphous materials in processing and storage conditions. The actual physical stability of the blends was better than the pure amorphous materials at ambient temperature. Molecular mobility was not related to the physical stability of the amorphous blends, observed as crystallisation. Molecular mobility of the 50/50 blend derived from a spectral linewidth as a function of temperature using solid state NMR correlated better with the molecular mobility derived from a rheometer than that of differential scanning calorimetry data. Based on the results obtained, the effect of molecular interactions, thermodynamic driving force and miscibility of the blends are discussed as the key factors to stabilise the blends. The stickiness was found to be affected glass transition and viscosity. Ultrasound extrusion and cutting were successfully tested to increase the processability of sticky material. Furthermore, it was found to be possible to process the physically stable 50/50 blend in a supercooled liquid state instead of a glassy state. The method was not found to accelerate the crystallisation. This may open up new possibilities to process amorphous materials that are otherwise impossible to manufacture into solid dosage forms.
Resumo:
In this study I offer a diachronic solution for a number of difficult inflectional endings in Old Church Slavic nominal declensions. In this context I address the perhaps most disputed and the most important question of the Slavic nominal inflectional morphology: whether there was in Proto-Slavic an Auslautgesetz (ALG), a law of final syllables, that narrowed the Proto-Indo-European vowel */o/ to */u/ in closed word-final syllables. In addition, the work contains an exhaustive morphological classification of the nouns and adjectives that occur in canonical Old Church Slavic. I argue that Proto-Indo-European */o/ became Proto-Slavic */u/ before word-final */s/ and */N/. This conclusion is based on the impossibility of finding credible analogical (as opposed to phonological) explanations for the forms supporting the ALG hypothesis, and on the survival of the neuter gender in Slavic. It is not likely that the */o/-stem nominative singular ending */-u/ was borrowed from the accusative singular, because the latter would have been the only paradigmatic form with the stem vowel */-u-/. It is equally unlikely that the ending */-u/ was borrowed from the */u/-stems, because the latter constituted a moribund class. The usually stated motivation for such an analogical borrowing, i.e. a need to prevent the merger of */o/-stem masculines with neuters of the same class, is not tenable. Extra-Slavic, as well as intra-Slavic evidence suggests that phonologically-triggered mergers between two semantically opaque genders do not tend to be prevented, but rather that such mergers lead to the loss of the gender opposition in question. On the other hand, if */-os/ had not become */-us/, most nouns and, most importantly, all adjectives and pronouns would have lost the formal distinction between masculines and neuters. This would have necessarily resulted in the loss of the neuter gender. A new explanation is given for the most apparent piece of evidence against the ALG hypothesis, the nominative-accusative singular of the */es/-stem neuters, e.g. nebo 'sky'. I argue that it arose in late Proto-Slavic dialects, replacing regular nebe, under the influence of the */o/- and */yo/-stems where a correlation had emerged between a hard root-final consonant and the termination -o, on the one hand, and a soft root-final consonant and the termination -e, on the other.
Resumo:
Nephrin is a transmembrane protein belonging to the immunoglobulin superfamily and is expressed primarily in the podocytes, which are highly differentiated epithelial cells needed for primary urine formation in the kidney. Mutations leading to nephrin loss abrogate podocyte morphology, and result in massive protein loss into urine and consequent early death in humans carrying specific mutations in this gene. The disease phenotype is closely replicated in respective mouse models. The purpose of this thesis was to generate novel inducible mouse-lines, which allow targeted gene deletion in a time and tissue-specific manner. A proof of principle model for succesful gene therapy for this disease was generated, which allowed podocyte specific transgene replacement to rescue gene deficient mice from perinatal lethality. Furthermore, the phenotypic consequences of nephrin restoration in the kidney and nephrin deficiency in the testis, brain and pancreas in rescued mice were investigated. A novel podocyte-specific construct was achieved by using standard cloning techniques to provide an inducible tool for in vitro and in vivo gene targeting. Using modified constructs and microinjection procedures two novel transgenic mouse-lines were generated. First, a mouse-line with doxycycline inducible expression of Cre recombinase that allows podocyte-specific gene deletion was generated. Second, a mouse-line with doxycycline inducible expression of rat nephrin, which allows podocyte-specific nephrin over-expression was made. Furthermore, it was possible to rescue nephrin deficient mice from perinatal lethality by cross-breeding them with a mouse-line with inducible rat nephrin expression that restored the missing endogenous nephrin only in the kidney after doxycycline treatment. The rescued mice were smaller, infertile, showed genital malformations and developed distinct histological abnormalities in the kidney with an altered molecular composition of the podocytes. Histological changes were also found in the testis, cerebellum and pancreas. The expression of another molecule with limited tissue expression, densin, was localized to the plasma membranes of Sertoli cells in the testis by immunofluorescence staining. Densin may be an essential adherens junction protein between Sertoli cells and developing germ cells and these junctions share similar protein assembly with kidney podocytes. This single, binary conditional construct serves as a cost- and time-efficient tool to increase the understanding of podocyte-specific key proteins in health and disease. The results verified a tightly controlled inducible podocyte-specific transgene expression in vitro and in vivo as expected. These novel mouse-lines with doxycycline inducible Cre recombinase and with rat nephrin expression will be useful for conditional gene targeting of essential podocyte proteins and to study in detail their functions in the adult mice. This is important for future diagnostic and pharmacologic development platforms.
Resumo:
Analyzing statistical dependencies is a fundamental problem in all empirical science. Dependencies help us understand causes and effects, create new scientific theories, and invent cures to problems. Nowadays, large amounts of data is available, but efficient computational tools for analyzing the data are missing. In this research, we develop efficient algorithms for a commonly occurring search problem - searching for the statistically most significant dependency rules in binary data. We consider dependency rules of the form X->A or X->not A, where X is a set of positive-valued attributes and A is a single attribute. Such rules describe which factors either increase or decrease the probability of the consequent A. A classical example are genetic and environmental factors, which can either cause or prevent a disease. The emphasis in this research is that the discovered dependencies should be genuine - i.e. they should also hold in future data. This is an important distinction from the traditional association rules, which - in spite of their name and a similar appearance to dependency rules - do not necessarily represent statistical dependencies at all or represent only spurious connections, which occur by chance. Therefore, the principal objective is to search for the rules with statistical significance measures. Another important objective is to search for only non-redundant rules, which express the real causes of dependence, without any occasional extra factors. The extra factors do not add any new information on the dependence, but can only blur it and make it less accurate in future data. The problem is computationally very demanding, because the number of all possible rules increases exponentially with the number of attributes. In addition, neither the statistical dependency nor the statistical significance are monotonic properties, which means that the traditional pruning techniques do not work. As a solution, we first derive the mathematical basis for pruning the search space with any well-behaving statistical significance measures. The mathematical theory is complemented by a new algorithmic invention, which enables an efficient search without any heuristic restrictions. The resulting algorithm can be used to search for both positive and negative dependencies with any commonly used statistical measures, like Fisher's exact test, the chi-squared measure, mutual information, and z scores. According to our experiments, the algorithm is well-scalable, especially with Fisher's exact test. It can easily handle even the densest data sets with 10000-20000 attributes. Still, the results are globally optimal, which is a remarkable improvement over the existing solutions. In practice, this means that the user does not have to worry whether the dependencies hold in future data or if the data still contains better, but undiscovered dependencies.
Resumo:
The cells of multicellular organisms have differentiated to carry out specific functions that are often accompanied by distinct cell morphology. The actin cytoskeleton is one of the key regulators of cell shape subsequently controlling multiple cellular events including cell migration, cell division, endo- and exocytosis. A large set of actin regulating proteins has evolved to achieve and tightly coordinate this wide range of functions. Some actin regulator proteins have so-called house keeping roles and are essential for all eukaryotic cells, but some have evolved to meet the requirements of more specialized cell-types found in higher organisms enabling complex functions of differentiated organs, such as liver, kidney and brain. Often processes mediated by the actin cytoskeleton, like formation of cellular protrusions during cell migration, are intimately linked to plasma membrane remodeling. Thus, a close cooperation between these two cellular compartments is necessary, yet not much is known about the underlying molecular mechanisms. This study focused on a vertebrate-specific protein called missing-in-metastasis (MIM), which was originally characterized as a metastasis suppressor of bladder cancer. We demonstrated that MIM regulates the dynamics of actin cytoskeleton via its WH2 domain, and is expressed in a cell-type specific manner. Interestingly, further examination showed that the IM-domain of MIM displays a novel membrane tubulation activity, which induces formation of filopodia in cells. Following studies demonstrated that this membrane deformation activity is crucial for cell protrusions driven by MIM. In mammals, there are five members of IM-domain protein family. Functions and expression patterns of these family members have remained poorly characterized. To understand the physiological functions of MIM, we generated MIM knockout mice. MIM-deficient mice display no apparent developmental defects, but instead suffer from progressive renal disease and increased susceptibility to tumors. This indicates that MIM plays a role in the maintenance of specific physiological functions associated with distinct cell morphologies. Taken together, these studies implicate MIM both in the regulation of the actin cytoskeleton and the plasma membrane. Our results thus suggest that members of MIM/IRSp53 protein family coordinate the actin cytoskeleton:plasma membrane interface to control cell and tissue morphogenesis in multicellular organisms.
Resumo:
To a large extent, lakes can be described with a one-dimensional approach, as their main features can be characterized by the vertical temperature profile of the water. The development of the profiles during the year follows the seasonal climate variations. Depending on conditions, lakes become stratified during the warm summer. After cooling, overturn occurs, water cools and an ice cover forms. Typically, water is inversely stratified under the ice, and another overturn occurs in spring after the ice has melted. Features of this circulation have been used in studies to distinguish between lakes in different areas, as basis for observation systems and even as climate indicators. Numerical models can be used to calculate temperature in the lake, on the basis of the meteorological input at the surface. The simple form is to solve the surface temperature. The depth of the lake affects heat transfer, together with other morphological features, the shape and size of the lake. Also the surrounding landscape affects the formation of the meteorological fields over the lake and the energy input. For small lakes the shading by the shores affects both over the lake and inside the water body bringing limitations for the one-dimensional approach. A two-layer model gives an approximation for the basic stratification in the lake. A turbulence model can simulate vertical temperature profile in a more detailed way. If the shape of the temperature profile is very abrupt, vertical transfer is hindered, having many important consequences for lake biology. One-dimensional modelling approach was successfully studied comparing a one-layer model, a two-layer model and a turbulence model. The turbulence model was applied to lakes with different sizes, shapes and locations. Lake models need data from the lakes for model adjustment. The use of the meteorological input data on different scales was analysed, ranging from momentary turbulent changes over the lake to the use of the synoptical data with three hour intervals. Data over about 100 past years were used on the mesoscale at the range of about 100 km and climate change scenarios for future changes. Increasing air temperature typically increases water temperature in epilimnion and decreases ice cover. Lake ice data were used for modelling different kinds of lakes. They were also analyzed statistically in global context. The results were also compared with results of a hydrological watershed model and data from very small lakes for seasonal development.
Resumo:
This thesis studies binary time series models and their applications in empirical macroeconomics and finance. In addition to previously suggested models, new dynamic extensions are proposed to the static probit model commonly used in the previous literature. In particular, we are interested in probit models with an autoregressive model structure. In Chapter 2, the main objective is to compare the predictive performance of the static and dynamic probit models in forecasting the U.S. and German business cycle recession periods. Financial variables, such as interest rates and stock market returns, are used as predictive variables. The empirical results suggest that the recession periods are predictable and dynamic probit models, especially models with the autoregressive structure, outperform the static model. Chapter 3 proposes a Lagrange Multiplier (LM) test for the usefulness of the autoregressive structure of the probit model. The finite sample properties of the LM test are considered with simulation experiments. Results indicate that the two alternative LM test statistics have reasonable size and power in large samples. In small samples, a parametric bootstrap method is suggested to obtain approximately correct size. In Chapter 4, the predictive power of dynamic probit models in predicting the direction of stock market returns are examined. The novel idea is to use recession forecast (see Chapter 2) as a predictor of the stock return sign. The evidence suggests that the signs of the U.S. excess stock returns over the risk-free return are predictable both in and out of sample. The new "error correction" probit model yields the best forecasts and it also outperforms other predictive models, such as ARMAX models, in terms of statistical and economic goodness-of-fit measures. Chapter 5 generalizes the analysis of univariate models considered in Chapters 2 4 to the case of a bivariate model. A new bivariate autoregressive probit model is applied to predict the current state of the U.S. business cycle and growth rate cycle periods. Evidence of predictability of both cycle indicators is obtained and the bivariate model is found to outperform the univariate models in terms of predictive power.
Resumo:
Most of the world’s languages lack electronic word form dictionaries. The linguists who gather such dictionaries could be helped with an efficient morphology workbench that adapts to different environments and uses. A widely usable workbench could be characterized, ideally, as generally applicable, extensible, and freely available (GEA). It seems that such a solution could be implemented in the framework of finite-state methods. The current work defines the GEA desiderata and starts a series of articles concerning these desiderata in finite- state morphology. Subsequent parts will review the state of the art and present an action plan toward creating a widely usable finite-state morphology workbench.
Resumo:
The aim of this thesis was to unravel the functional-structural characteristics of root systems of Betula pendula Roth., Picea abies (L.) Karst., and Pinus sylvestris L. in mixed boreal forest stands differing in their developmental stage and site fertility. The root systems of these species had similar structural regularities: horizontally-oriented shallow roots defined the horizontal area of influence, and within this area, each species placed fine roots in the uppermost soil layers, while sinker roots defined the maximum rooting depth. Large radial spread and high ramification of coarse roots, and the high specific root length (SRL) and root length density (RLD) of fine roots indicated the high belowground competitiveness and root plasticity of B. pendula. Smaller radial root spread and sparser branching of coarse roots, and low SRL and RLD of fine roots of the conifers could indicate their more conservative resource use and high association with and dependence on ectomycorrhiza-forming fungi. The vertical fine root distributions of the species were mostly overlapping, implying the possibility for intense belowground competition for nutrients. In each species, conduits tapered and their frequency increased from distal roots to the stem, from the stem to the branches, and to leaf petioles in B. pendula. Conduit tapering was organ-specific in each species violating the assumptions of the general vascular scaling model (WBE). This reflects the hierarchical organization of a tree and differences between organs in the relative importance of transport, safety, and mechanical demands. The applied root model was capable of depicting the mass, length and spread of coarse roots of B. pendula and P. abies, and to the lesser extent in P. sylvestris. The roots did not follow self-similar fractal branching, because the parameter values varied within the root systems. Model parameters indicate differences in rooting behavior, and therefore different ecophysiological adaptations between species.
Resumo:
Reorganizing a dataset so that its hidden structure can be observed is useful in any data analysis task. For example, detecting a regularity in a dataset helps us to interpret the data, compress the data, and explain the processes behind the data. We study datasets that come in the form of binary matrices (tables with 0s and 1s). Our goal is to develop automatic methods that bring out certain patterns by permuting the rows and columns. We concentrate on the following patterns in binary matrices: consecutive-ones (C1P), simultaneous consecutive-ones (SC1P), nestedness, k-nestedness, and bandedness. These patterns reflect specific types of interplay and variation between the rows and columns, such as continuity and hierarchies. Furthermore, their combinatorial properties are interlinked, which helps us to develop the theory of binary matrices and efficient algorithms. Indeed, we can detect all these patterns in a binary matrix efficiently, that is, in polynomial time in the size of the matrix. Since real-world datasets often contain noise and errors, we rarely witness perfect patterns. Therefore we also need to assess how far an input matrix is from a pattern: we count the number of flips (from 0s to 1s or vice versa) needed to bring out the perfect pattern in the matrix. Unfortunately, for most patterns it is an NP-complete problem to find the minimum distance to a matrix that has the perfect pattern, which means that the existence of a polynomial-time algorithm is unlikely. To find patterns in datasets with noise, we need methods that are noise-tolerant and work in practical time with large datasets. The theory of binary matrices gives rise to robust heuristics that have good performance with synthetic data and discover easily interpretable structures in real-world datasets: dialectical variation in the spoken Finnish language, division of European locations by the hierarchies found in mammal occurrences, and co-occuring groups in network data. In addition to determining the distance from a dataset to a pattern, we need to determine whether the pattern is significant or a mere occurrence of a random chance. To this end, we use significance testing: we deem a dataset significant if it appears exceptional when compared to datasets generated from a certain null hypothesis. After detecting a significant pattern in a dataset, it is up to domain experts to interpret the results in the terms of the application.