24 results for Efficient Solutions

in Helda - Digital Repository of the University of Helsinki


Relevance:

30.00%

Publisher:

Abstract:

Analyzing statistical dependencies is a fundamental problem in all empirical science. Dependencies help us understand causes and effects, create new scientific theories, and devise cures for problems. Nowadays, large amounts of data are available, but efficient computational tools for analyzing the data are missing. In this research, we develop efficient algorithms for a commonly occurring search problem: searching for the statistically most significant dependency rules in binary data. We consider dependency rules of the form X->A or X->not A, where X is a set of positive-valued attributes and A is a single attribute. Such rules describe which factors either increase or decrease the probability of the consequent A. Classic examples are genetic and environmental factors, which can either cause or prevent a disease. The emphasis in this research is that the discovered dependencies should be genuine, i.e. they should also hold in future data. This is an important distinction from traditional association rules, which, in spite of their name and a similar appearance to dependency rules, do not necessarily represent statistical dependencies at all, or represent only spurious connections that occur by chance. Therefore, the principal objective is to search for rules using statistical significance measures. Another important objective is to search only for non-redundant rules, which express the real causes of a dependence without any incidental extra factors. Such extra factors add no new information about the dependence; they can only blur it and make it less accurate in future data. The problem is computationally very demanding, because the number of possible rules increases exponentially with the number of attributes. In addition, neither statistical dependency nor statistical significance is a monotonic property, which means that traditional pruning techniques do not work. As a solution, we first derive the mathematical basis for pruning the search space with any well-behaved statistical significance measure. The mathematical theory is complemented by a new algorithmic invention, which enables an efficient search without any heuristic restrictions. The resulting algorithm can be used to search for both positive and negative dependencies with any commonly used statistical measure, such as Fisher's exact test, the chi-squared measure, mutual information, and z-scores. According to our experiments, the algorithm scales well, especially with Fisher's exact test. It can easily handle even the densest data sets with 10,000-20,000 attributes. Still, the results are globally optimal, which is a remarkable improvement over existing solutions. In practice, this means that the user does not have to worry about whether the dependencies hold in future data or whether the data still contains better, but undiscovered, dependencies.
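To make the search criterion concrete, the following minimal sketch shows how the statistical significance of a single candidate rule X -> A can be scored on a binary data matrix with Fisher's exact test. It illustrates only the scoring of one rule, not the thesis's pruning strategy or search algorithm; the data, column indices, and function name are hypothetical.

# Minimal sketch: scoring one candidate dependency rule X -> A on a binary 0/1
# data matrix with Fisher's exact test. This is not the thesis's search or
# pruning algorithm, only an illustration of the significance measure.
import numpy as np
from scipy.stats import fisher_exact

def rule_significance(data, x_cols, a_col):
    """p-value of the rule X -> A, where X is the set of positive-valued
    attributes in columns x_cols and A is the single attribute in column a_col."""
    x_true = data[:, x_cols].all(axis=1)   # rows where all attributes of X are 1
    a_true = data[:, a_col].astype(bool)   # rows where A is 1
    # 2x2 contingency table of X versus A
    table = [[int(np.sum(x_true & a_true)),  int(np.sum(x_true & ~a_true))],
             [int(np.sum(~x_true & a_true)), int(np.sum(~x_true & ~a_true))]]
    _, p_value = fisher_exact(table, alternative="greater")  # does X increase P(A)?
    return p_value

# Hypothetical data: 6 rows, 3 binary attributes; test the rule {0, 1} -> 2.
data = np.array([[1, 1, 1],
                 [1, 1, 1],
                 [1, 0, 0],
                 [0, 1, 0],
                 [0, 0, 0],
                 [1, 1, 0]])
print(rule_significance(data, x_cols=[0, 1], a_col=2))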

Relevance:

20.00%

Publisher:

Abstract:

Increasing attention has been focused on methods that deliver pharmacologically active compounds (e.g. drugs, peptides and proteins) in a controlled fashion, so that constant, sustained, site-specific or pulsatile action can be attained. Ion-exchange resins have been widely studied in medical and pharmaceutical applications, including controlled drug delivery, leading to the commercialisation of some resin-based formulations. Ion-exchangers provide an efficient means to adjust and control drug delivery, as the electrostatic interactions enable precise control of the ion-exchange process and, thus, more uniform and accurate control of drug release compared to systems that are based only on physical interactions. In contrast to the resins, only a few studies have been reported on ion-exchange fibers in drug delivery. However, ion-exchange fibers have many advantageous properties compared to conventional ion-exchange resins, such as more efficient compound loading into and release from the ion-exchanger, easier incorporation of drug-sized compounds, enhanced control of the ion-exchange process, better mechanical, chemical and thermal stability, and good formulation properties, which make the fibers attractive materials for controlled drug delivery systems. In this study, the factors affecting the nature and strength of the binding/loading of drug-sized model compounds into the ion-exchange fibers were evaluated comprehensively and, moreover, the controllability of subsequent drug release/delivery from the fibers was assessed by modifying the conditions of the external solutions. The feasibility of ion-exchange fibers for the simultaneous delivery of two drugs in combination was also studied by dual loading. Donnan theory and theoretical modelling were applied to gain a mechanistic understanding of these factors. The experimental results imply that incorporation of the model compounds into the ion-exchange fibers was attained mainly as a result of ionic bonding, with an additional contribution from non-specific interactions. Increasing the ion-exchange capacity of the fiber or decreasing the valence of the loaded compounds increased the molar loading, while more efficient release of the compounds was observed consistently under conditions where the valence or concentration of the extracting counter-ion was increased. Donnan theory was capable of fully interpreting the ion-exchange equilibria, and the theoretical modelling precisely supported the experimental observations. The physico-chemical characteristics (lipophilicity, hydrogen bonding ability) of the model compounds and the framework of the fibrous ion-exchanger influenced the affinity of the drugs towards the fibers and may, thus, affect both drug loading and release. It was concluded that precisely controlled drug delivery may be tailored for each compound, in particular by choosing a suitable ion-exchange fiber and optimizing the delivery system to take the external conditions into account, also when delivering two drugs simultaneously.
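As an illustration of the Donnan-theory reasoning mentioned above, the sketch below computes an idealised Donnan partitioning of a monovalent drug cation and a competing counter-ion between a cation-exchange fiber and the external solution. It assumes ideal behaviour (activities equal to concentrations), monovalent ions, and hypothetical concentrations and capacity; it is not the thesis's actual model.

# Idealised Donnan-partitioning sketch (hypothetical values, not the thesis's model):
# monovalent drug cation D+ and competing counter-ion Na+ distributing between a
# cation-exchange fiber (fixed anionic capacity Q) and an external solution.
import math

def donnan_partition(c_drug, c_na, Q):
    """Return the Donnan ratio lambda and the fiber-phase cation concentrations.

    c_drug, c_na : external cation concentrations (mol/L)
    Q            : fixed ion-exchange capacity of the fiber phase (eq/L)
    """
    c_cations = c_drug + c_na   # external cations
    c_anion = c_cations         # external co-ion (e.g. Cl-), electroneutral solution
    # Electroneutrality in the fiber: lam*c_cations = Q + c_anion/lam
    #   -> c_cations*lam^2 - Q*lam - c_anion = 0
    lam = (Q + math.sqrt(Q**2 + 4 * c_cations * c_anion)) / (2 * c_cations)
    return lam, lam * c_drug, lam * c_na

# Raising the competing counter-ion concentration lowers the drug uptake:
for c_na in (0.001, 0.01, 0.1):
    lam, c_drug_fiber, _ = donnan_partition(c_drug=0.005, c_na=c_na, Q=1.0)
    print(f"c_Na={c_na}: lambda={lam:6.1f}, drug in fiber={c_drug_fiber:.3f} mol/L")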

Relevance:

20.00%

Publisher:

Abstract:

The purpose of this study was to extend understanding of how large firms pursuing sustained and profitable growth manage organisational renewal. A multiple-case study was conducted in 27 North American and European wood-industry companies, of which 11 were chosen for closer study. The study combined the organisational-capabilities approach to strategic management with corporate-entrepreneurship thinking. It charted the further development of an identification and classification system for capabilities comprising three dimensions: (i) the dynamism between firm-specific and industry-significant capabilities, (ii) hierarchies of capabilities and capability portfolios, and (iii) their internal structure. Capability building was analysed in the context of the organisational design, the technological systems and the type of resource-bundling process (creating new vs. entrenching existing capabilities). The thesis describes the current capability portfolios and the organisational changes in the case companies. It also clarifies the mechanisms through which companies can influence the balance between knowledge search and the efficiency of knowledge transfer and integration in their daily business activities, and consequently the diversity of their capability portfolios and the breadth and novelty of their product/service range. The largest wood-industry companies of today must develop a seemingly dual strategic focus: they have to combine leading-edge, innovative solutions with cost-efficient, large-scale production. The use of modern technology in production was no longer a primary source of competitiveness in the case companies, but rather belonged to the portfolio of basic capabilities. Knowledge and information management had become an industry imperative, on a par with cost effectiveness. Yet, during the period of this research, the case companies were better at supporting growth in the volume of existing activities than growth through new economic activities. Customer-driven, incremental innovation was preferred over firm-driven innovation through experimentation. The three main constraints on organisational renewal were the lack of slack resources, the aim for lean, centralised designs, and an inward-looking communication climate.

Relevance:

20.00%

Publisher:

Abstract:

Achieving sustainable consumption patterns is a crucial step on the way towards sustainability. The scientific knowledge used to decide which priorities to set and how to enforce them has to converge with societal, political, and economic initiatives on various levels: from individual household decision-making to agreements and commitments in global policy processes. The aim of this thesis is to draw a comprehensive and systematic picture of sustainable consumption; to do this, it develops the concept of Strong Sustainable Consumption Governance. In this concept, consumption is understood as resource consumption, which includes consumption by industries, public consumption, and household consumption. In addition to the availability of resources (including the available sink capacity of the ecosystem) and their use and distribution among the Earth's population, the thesis also considers their contribution to human well-being. This implies giving specific attention to the levels and patterns of consumption. Methods: The thesis introduces the terminology and various concepts of Sustainable Consumption and of Governance. It briefly elaborates on the methodology of Critical Realism and its potential for analysing Sustainable Consumption. It describes the various methods on which the research is based and sets out the political implications a governance approach towards Strong Sustainable Consumption may have. Two models are developed: one for assessing the environmental relevance of consumption activities, the other for identifying the influences of globalisation on the determinants of consumption opportunities. Results: One of the major challenges for Strong Sustainable Consumption is that it is not in line with the current political mainstream, that is, the belief that economic growth can cure all our problems, so its proponents have to battle against a strong headwind. Their motivation, however, is the conviction that there is no alternative. Efforts have to be made on multiple levels by multiple actors, and all of them are needed, as they constitute the individual strands that together make up the rope. Everyone must, however, ensure that they are pulling in the same direction. It might be useful to apply a carrot-and-stick strategy to stimulate public debate. The stick in this case is to create a sense of urgency; the carrot would be to better articulate to the public the message that a shrinking of the economy is not as much of a disaster as mainstream economics tends to suggest. In parallel, it is necessary to demand that governments take responsibility for governance. The dominant strategy is still information provision, but there is ample evidence that hard policies, such as regulatory and economic instruments, are the most effective. As for Civil Society Organizations, it is recommended that they move beyond the habit of promoting Sustainable (in fact, green) Consumption through marketing strategies and instead foster public debate on values and well-being. This includes appreciating the potential of social innovation. Countless such initiatives are under way, but their potential is still insufficiently explored. Beyond the question of how to multiply such approaches, it is also necessary to establish political macro-structures to foster them.

Relevance:

20.00%

Publisher:

Abstract:

The molecular-level structure of mixtures of water and alcohols is very complicated and has been under intense study in the recent past. Both experimental and computational methods have been used in these studies. One method for studying the intra- and intermolecular bindings in the mixtures is the use of so-called difference Compton profiles, which are a way to obtain information about changes in the electron wave functions. In Compton scattering, a photon scatters inelastically from an electron. The Compton profile obtained from the electron wave functions is directly proportional to the probability of a photon scattering at a given energy into a given solid angle. In this work we develop a method to compute Compton profiles numerically for mixtures of liquids. In order to obtain the electronic wave functions necessary to calculate the Compton profiles, we need statistical information about the atomic coordinates. Acquiring this using ab initio molecular dynamics is beyond our computational capabilities, and therefore we use classical molecular dynamics to model the movement of atoms in the mixture. We discuss the validity of the chosen method in view of the results obtained from the simulations. There are some difficulties in using classical molecular dynamics for the quantum mechanical calculations, but these can possibly be overcome by parameter tuning. According to the calculations, clear differences can be seen in the Compton profiles of different mixtures. This prediction needs to be tested in experiments in order to find out whether the approximations made are valid.
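For a concrete picture of the quantity being computed, the sketch below numerically integrates an isotropic electron momentum density rho(p) into a Compton profile, J(q) = 2*pi * integral from |q| to infinity of p*rho(p) dp (the standard form for spherically averaged densities within the impulse approximation), and forms a difference profile between two systems. The Gaussian model densities are purely hypothetical stand-ins for the wave-function-based densities computed in the thesis.

# Minimal sketch of a (difference) Compton profile from an isotropic momentum
# density: J(q) = 2*pi * Int_{|q|}^inf p * rho(p) dp. The Gaussian densities
# below are hypothetical placeholders, not results from the thesis.
import numpy as np

def compton_profile(q_grid, rho, p_grid):
    """Numerically integrate an isotropic momentum density rho(p) into J(q)."""
    J = np.empty_like(q_grid)
    for i, q in enumerate(q_grid):
        mask = p_grid >= abs(q)
        J[i] = 2 * np.pi * np.trapz(p_grid[mask] * rho[mask], p_grid[mask])
    return J

p = np.linspace(0.0, 10.0, 2001)    # momentum grid (atomic units)
rho_a = np.exp(-p**2)               # hypothetical density, system A
rho_b = np.exp(-1.1 * p**2)         # hypothetical density, system B
q = np.linspace(-5.0, 5.0, 201)

difference_profile = compton_profile(q, rho_a, p) - compton_profile(q, rho_b, p)
print(difference_profile[:5])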

Relevance:

20.00%

Publisher:

Abstract:

Event-based systems are seen as good candidates for supporting distributed applications in dynamic and ubiquitous environments because they support decoupled and asynchronous many-to-many information dissemination. Event systems are widely used because asynchronous messaging provides a flexible alternative to RPC (Remote Procedure Call). They are typically implemented using an overlay network of routers. A content-based router forwards event messages based on filters that are installed by subscribers and other routers. The filters are organized into a routing table in order to forward incoming events to the proper subscribers and neighbouring routers. This thesis addresses the optimization of content-based routing tables organized using the covering relation and presents novel data structures and configurations for improving local and distributed operation. Data structures are needed for organizing filters into a routing table that supports efficient matching and runtime operation. We present novel results on dynamic filter merging and the integration of filter merging with content-based routing tables. In addition, the thesis examines the cost of client mobility using different protocols and routing topologies. We also present a new matching technique called temporal subspace matching. The technique combines two new features. The first feature, temporal operation, supports notifications, or content profiles, that persist in time. The second feature, subspace matching, allows more expressive semantics, because notifications may contain intervals and be defined as subspaces of the content space. We also present an application of temporal subspace matching pertaining to metadata-based continuous collection and object tracking.
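To illustrate the covering relation that such routing tables are organized around, the sketch below uses deliberately simplified filters (one numeric interval per attribute) and a routing table that refuses to install a filter already covered by an existing one for the same destination. The filter language, class names, and pruning policy are illustrative assumptions, not the thesis's design.

# Minimal sketch of the covering relation for simple range filters and a routing
# table that prunes covered filters. Not the thesis's filter language or algorithms.
class Filter:
    def __init__(self, constraints):
        # constraints: {attribute: (low, high)} -- the attribute must lie in [low, high]
        self.constraints = constraints

    def matches(self, event):
        return all(attr in event and lo <= event[attr] <= hi
                   for attr, (lo, hi) in self.constraints.items())

    def covers(self, other):
        """self covers other if every event matched by other is also matched by self."""
        for attr, (lo, hi) in self.constraints.items():
            if attr not in other.constraints:
                return False                    # other is less constrained on attr
            olo, ohi = other.constraints[attr]
            if not (lo <= olo and ohi <= hi):
                return False
        return True

class RoutingTable:
    """Keeps only filters that are not covered by an already-installed filter."""
    def __init__(self):
        self.filters = []                       # (filter, destination) pairs

    def add(self, f, dest):
        if any(g.covers(f) and d == dest for g, d in self.filters):
            return                              # redundant: already covered
        # drop installed filters that the new one covers (for the same destination)
        self.filters = [(g, d) for g, d in self.filters
                        if not (f.covers(g) and d == dest)]
        self.filters.append((f, dest))

    def route(self, event):
        return {d for g, d in self.filters if g.matches(event)}

table = RoutingTable()
table.add(Filter({"temp": (0, 100)}), "router-1")
table.add(Filter({"temp": (20, 30)}), "router-1")   # covered, not installed
print(table.route({"temp": 25}))                    # {'router-1'}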

Relevance:

20.00%

Publisher:

Abstract:

Minimum Description Length (MDL) is an information-theoretic principle that can be used for model selection and other statistical inference tasks. There are various ways to use the principle in practice. One theoretically valid way is to use the normalized maximum likelihood (NML) criterion. Due to computational difficulties, this approach has not been used very often. This thesis presents efficient floating-point algorithms that make it possible to compute the NML for multinomial, Naive Bayes and Bayesian forest models. None of the presented algorithms relies on asymptotic analysis, and for the first two model classes we also discuss how to compute exact rational-number solutions.
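As an impression of what such a computation can look like, the sketch below evaluates the multinomial NML normalizing sum with the well-known linear-time recurrence C(K+2, n) = C(K+1, n) + (n/K) C(K, n) and uses it to score observed counts. It is a simple floating-point illustration under that recurrence, not necessarily the exact algorithms developed in the thesis.

# Minimal sketch: multinomial NML normalizing sum via the linear-time recurrence
# C(K+2, n) = C(K+1, n) + (n/K) * C(K, n), and the resulting stochastic complexity.
import math

def multinomial_nml_norm(K, n):
    """Normalizing sum of the NML distribution for a K-category multinomial, n samples."""
    if K == 1:
        return 1.0
    c_prev2 = 1.0                                          # C(1, n)
    c_prev1 = sum(math.comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
                  for h in range(n + 1))                   # C(2, n); 0**0 == 1 in Python
    if K == 2:
        return c_prev1
    for k in range(1, K - 1):                              # builds C(3, n), ..., C(K, n)
        c_prev2, c_prev1 = c_prev1, c_prev1 + (n / k) * c_prev2
    return c_prev1

def stochastic_complexity(counts):
    """-log NML probability of the observed category counts."""
    n = sum(counts)
    log_ml = sum(h * math.log(h / n) for h in counts if h > 0)   # maximized log-likelihood
    return -log_ml + math.log(multinomial_nml_norm(len(counts), n))

print(stochastic_complexity([10, 3, 7]))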

Relevance:

20.00%

Publisher:

Abstract:

This thesis, which consists of an introduction and four peer-reviewed original publications, studies the problems of haplotype inference (haplotyping) and local alignment significance. The problems studied here belong to the broad area of bioinformatics and computational biology. The presented solutions are computationally fast and accurate, which makes them practical in high-throughput sequence data analysis. Haplotype inference is a computational problem where the goal is to estimate haplotypes from a sample of genotypes as accurately as possible. This problem is important because the direct measurement of haplotypes is difficult, whereas the genotypes are easier to quantify. Haplotypes are key players when studying, for example, the genetic causes of diseases. In this thesis, three methods, referred to as HaploParser, HIT, and BACH, are presented for the haplotype inference problem. HaploParser is based on a combinatorial mosaic model and hierarchical parsing that together mimic recombinations and point mutations in a biologically plausible way. In this mosaic model, the current population is assumed to have evolved from a small founder population. Thus, the haplotypes of the current population are recombinations of the (implicit) founder haplotypes with some point mutations. HIT (Haplotype Inference Technique) uses a hidden Markov model for haplotypes, and efficient algorithms are presented for learning this model from genotype data. The model structure of HIT is analogous to the mosaic model of HaploParser with founder haplotypes. Therefore, it can be seen as a probabilistic model of recombinations and point mutations. BACH (Bayesian Context-based Haplotyping) utilizes a context tree weighting algorithm to efficiently sum over all variable-length Markov chains to evaluate the posterior probability of a haplotype configuration. Algorithms are presented that find haplotype configurations with high posterior probability. BACH is the most accurate method presented in this thesis and has performance comparable to the best available software for haplotype inference. Local alignment significance is a computational problem where one is interested in whether the local similarities in two sequences are due to the sequences being related or occur just by chance. The similarity of sequences is measured by their best local alignment score, and from that a p-value is computed. This p-value is the probability of picking two sequences from the null model that have an equally good or better best local alignment score. Local alignment significance is used routinely, for example, in homology searches. In this thesis, a general framework is sketched that allows one to compute a tight upper bound for the p-value of a local pairwise alignment score. Unlike previous methods, the presented framework is not affected by so-called edge effects and can handle gaps (deletions and insertions) without troublesome sampling and curve fitting.
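The "best local alignment score" on which the p-value computation rests can be made concrete with a standard Smith-Waterman dynamic program, sketched below. The match, mismatch, and gap scores are illustrative choices, and the sketch is unrelated to the specific p-value framework developed in the thesis.

# Minimal Smith-Waterman sketch for the best local alignment score of two sequences.
# Scoring parameters are illustrative, not the scheme used in the thesis.
def best_local_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    rows, cols = len(a) + 1, len(b) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, rows):
        curr = [0] * cols
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(0,                    # local alignment: never below zero
                          prev[j - 1] + sub,    # (mis)match
                          prev[j] + gap,        # gap in b
                          curr[j - 1] + gap)    # gap in a
            best = max(best, curr[j])
        prev = curr
    return best

print(best_local_alignment_score("ACACACTA", "AGCACACA"))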

Relevance:

20.00%

Publisher:

Abstract:

The Minimum Description Length (MDL) principle is a general, well-founded theoretical formalization of statistical modeling. The most important notion of MDL is the stochastic complexity, which can be interpreted as the shortest description length of a given sample of data relative to a model class. The exact definition of the stochastic complexity has gone through several evolutionary steps. The latest instantiation is based on the so-called Normalized Maximum Likelihood (NML) distribution, which has been shown to possess several important theoretical properties. However, applications of this modern version of MDL have been quite rare because of computational complexity problems: for discrete data, the definition of NML involves an exponential sum, and for continuous data a multi-dimensional integral that is usually infeasible to evaluate or even approximate accurately. In this doctoral dissertation, we present mathematical techniques for computing NML efficiently for some model families involving discrete data. We also show how these techniques can be used to apply MDL to two practical problems: histogram density estimation and clustering of multi-dimensional data.
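For reference, the NML distribution and the associated stochastic complexity for discrete data can be written as follows (standard textbook form; the notation is ours rather than the dissertation's):

% NML distribution for a model class M over discrete data x^n, and the
% corresponding stochastic complexity (negative log NML probability).
\[
  P_{\mathrm{NML}}(x^n \mid \mathcal{M})
    = \frac{P\bigl(x^n \mid \hat{\theta}(x^n), \mathcal{M}\bigr)}
           {\sum_{y^n} P\bigl(y^n \mid \hat{\theta}(y^n), \mathcal{M}\bigr)},
  \qquad
  \mathrm{SC}(x^n \mid \mathcal{M}) = -\log P_{\mathrm{NML}}(x^n \mid \mathcal{M}),
\]
where \(\hat{\theta}(\cdot)\) denotes the maximum-likelihood parameters within \(\mathcal{M}\) and the denominator is the exponential sum over all data sets \(y^n\) of size \(n\) mentioned above.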

Relevance:

20.00%

Publisher:

Abstract:

In visual object detection and recognition, classifiers have two interesting characteristics: accuracy and speed. Accuracy depends on the complexity of the image features and classifier decision surfaces. Speed depends on the hardware and the computational effort required to use the features and decision surfaces. When attempts to increase accuracy lead to increases in complexity and effort, it is necessary to ask how much we are willing to pay for increased accuracy. For example, if increased computational effort implies quickly diminishing returns in accuracy, then those designing inexpensive surveillance applications cannot aim for maximum accuracy at any cost. It becomes necessary to find trade-offs between accuracy and effort. We study efficient classification of images depicting real-world objects and scenes. Classification is efficient when a classifier can be controlled so that the desired trade-off between accuracy and effort (speed) is achieved and unnecessary computations are avoided on a per-input basis. A framework is proposed for understanding and modeling efficient classification of images. Classification is modeled as a tree-like process. In designing the framework, it is important to recognize what is essential and to avoid structures that are narrow in applicability. Earlier frameworks are lacking in this regard. The overall contribution is twofold. First, the framework is presented, subjected to experiments, and shown to be satisfactory. Second, certain unconventional approaches are experimented with. This allows the essential to be separated from the conventional. To determine whether the framework is satisfactory, three categories of questions are identified: trade-off optimization, classifier tree organization, and rules for delegation and confidence modeling. Questions and problems related to each category are addressed and empirical results are presented. For example, related to trade-off optimization, we address the problem of computational bottlenecks that limit the range of trade-offs. We also ask whether accuracy-versus-effort trade-offs can be controlled after training. As another example, regarding classifier tree organization, we first consider the task of organizing a tree in a problem-specific manner. We then ask whether problem-specific organization is necessary.
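As a toy illustration of trading effort for accuracy on a per-input basis, the sketch below delegates only low-confidence inputs from a cheap classifier to a slower, more accurate one. The scikit-learn models, data set, and confidence threshold are hypothetical choices and do not represent the framework developed in the thesis.

# Minimal two-stage cascade sketch: the cheap stage answers confident inputs,
# the costly stage handles the rest. Models and threshold are illustrative only.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

cheap = LogisticRegression(max_iter=1000).fit(X_train, y_train)   # fast first stage
costly = SVC().fit(X_train, y_train)                              # slower second stage

def cascade_predict(x_batch, threshold=0.9):
    """Use the cheap stage when it is confident; delegate the rest to the costly stage."""
    proba = cheap.predict_proba(x_batch)
    confident = proba.max(axis=1) >= threshold
    preds = cheap.classes_[proba.argmax(axis=1)]
    if (~confident).any():
        preds[~confident] = costly.predict(x_batch[~confident])
    return preds, confident.mean()

preds, cheap_fraction = cascade_predict(X_test)
print(f"accuracy={np.mean(preds == y_test):.3f}, "
      f"handled by cheap stage={cheap_fraction:.2%}")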