7 results for greedy heuristics
in Helda - Digital Repository of University of Helsinki
Abstract:
The aim of this dissertation is to provide the social scientist with conceptual tools for clarifying, evaluating and comparing explanations of social phenomena based on formal mathematical models. The focus is on relatively simple theoretical models and simulations, not statistical models. These studies apply a theory of explanation according to which explanation is about tracing objective relations of dependence, knowledge of which enables answers to contrastive why- and how-questions. This theory is developed further by delineating criteria for evaluating competing explanations and by applying the theory to social scientific modelling practices and to the key concepts of equilibrium and mechanism. The dissertation comprises an introductory essay and six published original research articles. The main theses about model-based explanations in the social sciences argued for in the articles are the following. 1) The concept of explanatory power, often used to argue for the superiority of one explanation over another, encompasses five dimensions which are partially independent and involve some systematic trade-offs. 2) Not all equilibrium explanations causally explain the obtaining of the end equilibrium state with the multiple possible initial states. Instead, they often constitutively explain the macro property of the system with the micro properties of the parts (together with their organization). 3) There is an important ambivalence in the concept of mechanism used in many model-based explanations, and this difference corresponds to a difference between two alternative research heuristics. 4) Whether unrealistic assumptions in a model (such as a rational choice model) are detrimental to an explanation provided by the model depends on whether the representation of the explanatory dependency in the model is itself dependent on the particular unrealistic assumptions.
Thus, evaluating whether a literally false assumption in a model is problematic requires specifying exactly what is supposed to be explained and by what. 5) The question of whether an explanatory relationship depends on particular false assumptions can be explored with the process of derivational robustness analysis, and the importance of robustness analysis accounts for some of the puzzling features of the tradition of model-building in economics. 6) The fact that economists have been relatively reluctant to use true agent-based simulations to formulate explanations can partially be explained by the specific ideal of scientific understanding implicit in the practice of orthodox economics.
Abstract:
This thesis studies human gene expression space using high-throughput gene expression data from DNA microarrays. In molecular biology, high-throughput techniques allow numerical measurements of the expression of tens of thousands of genes simultaneously. In a single study, this data is traditionally obtained from a limited number of sample types with a small number of replicates. For organism-wide analysis, this data has been largely unavailable and the global structure of the human transcriptome has remained unknown. This thesis introduces a human transcriptome map of different biological entities and an analysis of its general structure. The map is constructed from gene expression data from the two largest public microarray data repositories, GEO and ArrayExpress. The creation of this map contributed to the development of ArrayExpress by identifying and retrofitting the previously unusable and missing data and by improving access to its data. It also contributed to the creation of several new tools for microarray data manipulation and to the establishment of data exchange between GEO and ArrayExpress. The data integration for the global map required the creation of a new large ontology of human cell types, disease states, organism parts and cell lines. The ontology was used in a new method, based on text mining and decision trees, for automatic conversion of human-readable free-text microarray data annotations into a categorised format. Data comparability, and minimisation of the systematic measurement errors that are characteristic of each laboratory in this large cross-laboratory integrated dataset, were ensured by computing a range of microarray data quality metrics and excluding incomparable data. The structure of a global map of human gene expression was then explored by principal component analysis and hierarchical clustering, using heuristics and help from another purpose-built sample ontology.
A preface to, and motivation for, the construction and analysis of a global map of human gene expression is given by the analysis of two microarray datasets of human malignant melanoma. The analysis of these sets incorporates an indirect comparison of statistical methods for finding differentially expressed genes and points to the need to study gene expression on a global level.
Abstract:
Matrix decompositions, where a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra. In data mining one needs results that are interpretable, and what is considered interpretable in data mining can be very different from what is considered interpretable in linear algebra. The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices where the factor matrices are assumed to be binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability (the factor matrices are of the same type as the original matrix) and allows the use of Boolean matrix multiplication, which is often more intuitive than normal matrix multiplication with binary matrices. Several other decomposition methods are also described, and the computational complexity of computing them is studied together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted with both synthetic and real-world data.
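To make the contrast between Boolean and normal matrix multiplication concrete, here is a minimal sketch in plain Python. The function names (`boolean_product`, `integer_product`, `reconstruction_error`) and the small example matrices are assumptions made for illustration; they are not taken from the thesis, and the sketch does not implement any of its decomposition algorithms.

```python
def boolean_product(b, c):
    """Boolean matrix product: entry (i, j) is the OR over k of (b[i][k] AND c[k][j])."""
    inner = len(c)
    if any(len(row) != inner for row in b):
        raise ValueError("inner dimensions do not match")
    cols = len(c[0]) if c else 0
    return [
        [int(any(row[k] and c[k][j] for k in range(inner))) for j in range(cols)]
        for row in b
    ]


def integer_product(b, c):
    """Ordinary matrix product over the integers, for comparison."""
    inner = len(c)
    if any(len(row) != inner for row in b):
        raise ValueError("inner dimensions do not match")
    cols = len(c[0]) if c else 0
    return [
        [sum(row[k] * c[k][j] for k in range(inner)) for j in range(cols)]
        for row in b
    ]


def reconstruction_error(a, b, c):
    """Number of cells in which the binary matrix a differs from the Boolean product of b and c."""
    approx = boolean_product(b, c)
    return sum(x != y for row_a, row_p in zip(a, approx) for x, y in zip(row_a, row_p))


if __name__ == "__main__":
    # Hypothetical factors: 3 rows described by 2 binary "patterns" over 3 columns.
    B = [[1, 0], [1, 1], [0, 1]]
    C = [[1, 1, 0], [0, 1, 1]]
    print(boolean_product(B, C))   # stays binary
    print(integer_product(B, C))   # the middle entry becomes 2
```

In the Boolean product a row that uses both patterns simply gets the union of their columns, so the result remains a 0/1 matrix of the same type as the data. The integer product counts overlaps instead, which is why an entry can exceed 1.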
Abstract:
The increasing focus of relationship marketing and customer relationship management (CRM) studies on issues of customer profitability has led to the emergence of an area of research on profitable customer management. Nevertheless, there is a notable lack of empirical research examining the current practices of firms specifically with regard to the profitable management of customer relationships according to the approaches suggested in theory. This thesis fills this research gap by exploring profitable customer management in the retail banking sector. Several topics are covered, including marketing metrics and accountability; challenges in the implementation of profitable customer management approaches in practice; analytic versus heuristic (‘rule of thumb’) decision making; and the modification of costly customer behavior in order to increase customer profitability, customer lifetime value (CLV), and customer equity, i.e. the financial value of the customer base. The thesis critically reviews the concept of customer equity and proposes a Customer Equity Scorecard, providing a starting point for a constructive dialog between marketing and finance concerning the development of appropriate metrics to measure marketing outcomes. Since customer management and measurement issues go hand in hand, profitable customer management is contingent on both marketing management skills and financial measurement skills. A clear gap between marketing theory and practice regarding profitable customer management is also identified. The findings show that key customer management aspects that have been proposed for many years within the literature on profitable customer management are not being actively applied by the banks included in the research. Instead, several areas of customer management decision making are found to be influenced by heuristics.
This dilemma for marketing accountability is addressed by emphasizing that CLV and customer equity, which are aggregate metrics, only provide certain indications regarding the relative value of customers and the approximate value of the customer base (or groups of customers), respectively. The value created by marketing manifests itself in the effect of marketing actions on customer perceptions, behavior, and ultimately the components of CLV, namely revenues, costs, risk, and retention, as well as additional components of customer equity, such as customer acquisition. The thesis also points out that although costs are a crucial component of CLV, they have largely been neglected in prior CRM research. Cost-cutting has often been viewed negatively in customer-focused marketing literature on service quality and customer profitability, but the case studies in this thesis demonstrate that reduced costs do not necessarily have to lead to lower service quality, customer retention, and customer-related revenues. Consequently, this thesis provides an expanded foundation upon which marketers can stake their claim for accountability. By focusing on the range of drivers and all of the components of CLV and customer equity, marketing has the potential to provide specific evidence concerning how various activities have affected the drivers and components of CLV within different groups of customers, and the implications for customer equity on a customer base level.
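To show how the components named in this abstract (revenues, costs, retention) can combine into CLV and customer equity, here is a minimal sketch of a common simplified textbook model. The formula, the constant per-period figures, the discount rate standing in for risk, and all names are assumptions made for illustration; the thesis does not prescribe this particular model.

```python
def customer_lifetime_value(revenue, cost, retention, discount_rate, periods):
    """Simplified CLV: discounted per-period margin weighted by survival probability.

    Assumes constant revenue, cost and retention per period, with the first
    period undiscounted (t = 0). Risk is only reflected through discount_rate.
    """
    if not 0.0 <= retention <= 1.0:
        raise ValueError("retention must be a probability between 0 and 1")
    if discount_rate <= -1.0:
        raise ValueError("discount_rate must be greater than -1")
    margin = revenue - cost
    return sum(
        margin * retention ** t / (1.0 + discount_rate) ** t
        for t in range(periods)
    )


def customer_equity(segments):
    """Customer equity as the sum of CLV over hypothetical customer segments.

    Each segment is a pair (number_of_customers, clv_per_customer).
    """
    return sum(size * clv for size, clv in segments)


if __name__ == "__main__":
    base = customer_lifetime_value(100.0, 40.0, 0.8, 0.1, 3)
    leaner = customer_lifetime_value(100.0, 30.0, 0.8, 0.1, 3)
    # With revenue and retention held fixed, a lower cost gives a higher CLV.
    print(base, leaner)
    print(customer_equity([(1000, base), (200, leaner)]))
```

The comparison holds revenue and retention fixed, which mirrors the abstract's point that reduced costs need not lower retention or revenues. Whether that holds in practice is an empirical question the sketch cannot answer.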
Abstract:
This thesis examines consumers' views on climate change and on the acceptability of a climate impact monitoring and feedback system as an instrument for steering consumption. Fifteen consumers tried a demonstration version of a monitoring and feedback system for the climate impacts of consumption for one month and, on the basis of the trial, took part in an online discussion on the topic. I analyse the online discussion material from the perspective of everyday reasoning, examining consumers' everyday knowledge about climate change and environmental responsibility as well as the heuristics related to the acceptability of the service. Climate change was generally found to be still a rather abstract and ambiguous phenomenon, which is why consumers have difficulty understanding the concrete significance of their own choices for climate change. Although plenty of information on the climate impacts of consumption is available, information produced by companies in particular was perceived as contradictory and partly unreliable. Consumers also criticised the narrow view of the environmental impacts of consumption offered by the climate change debate. In the consumers' view, instead of focusing on carbon dioxide emissions, environmental impacts should be considered as a whole, of which climate impacts form only one part. Climate change affected the consumption habits of the participating consumers to varying degrees. For some, climate change had become a central norm guiding their own consumption, whereas others said they considered climate impacts mainly in connection with their largest purchases. In environmental responsibility, finding a balance and the personal feeling of acting rightly were experienced as meaningful. Climate and environmental issues are weighed flexibly in choices together with other factors. Even when consumers have the knowledge and the will to take climate and environmental impacts into account in their choices, the operating environment crucially limits their opportunities to act in an environmentally responsible way.
The study identified four heuristics that consumers used when considering the conditions for the acceptability of the climate impact monitoring and feedback system and its effectiveness as a steering instrument. First, the service must be quick and effortless to use and must offer information in an illustrative and easily understandable form. Second, the information offered by the service must be absolutely reliable and relevant to consumers' choices, so that the service takes into account different consumers and information needs. Third, the service must be implemented comprehensively and transparently as a collaboration between several retail groups and public actors. Fourth, the implementation must take into account how encouraging the service is and how it connects to other steering instruments.
Abstract:
Reorganizing a dataset so that its hidden structure can be observed is useful in any data analysis task. For example, detecting a regularity in a dataset helps us to interpret the data, compress the data, and explain the processes behind the data. We study datasets that come in the form of binary matrices (tables with 0s and 1s). Our goal is to develop automatic methods that bring out certain patterns by permuting the rows and columns. We concentrate on the following patterns in binary matrices: consecutive-ones (C1P), simultaneous consecutive-ones (SC1P), nestedness, k-nestedness, and bandedness. These patterns reflect specific types of interplay and variation between the rows and columns, such as continuity and hierarchies. Furthermore, their combinatorial properties are interlinked, which helps us to develop the theory of binary matrices and efficient algorithms. Indeed, we can detect all these patterns in a binary matrix efficiently, that is, in time polynomial in the size of the matrix. Since real-world datasets often contain noise and errors, we rarely witness perfect patterns. Therefore we also need to assess how far an input matrix is from a pattern: we count the number of flips (from 0s to 1s or vice versa) needed to bring out the perfect pattern in the matrix. Unfortunately, for most patterns it is an NP-complete problem to find the minimum distance to a matrix that has the perfect pattern, which means that the existence of a polynomial-time algorithm is unlikely. To find patterns in datasets with noise, we need methods that are noise-tolerant and work in practical time with large datasets. The theory of binary matrices gives rise to robust heuristics that perform well with synthetic data and discover easily interpretable structures in real-world datasets: dialectal variation in the spoken Finnish language, division of European locations by the hierarchies found in mammal occurrences, and co-occurring groups in network data.
In addition to determining the distance from a dataset to a pattern, we need to determine whether the pattern is significant or merely a chance occurrence. To this end, we use significance testing: we deem a dataset significant if it appears exceptional when compared to datasets generated from a certain null hypothesis. After detecting a significant pattern in a dataset, it is up to domain experts to interpret the results in terms of the application.
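As a rough illustration of the ideas in this abstract, the sketch below checks perfect nestedness of a binary matrix, counts flips against a given target matrix, and estimates an empirical p-value against a simple null model. The statistic `nested_pair_fraction`, the null model (shuffling all cells, which keeps only the shape and the number of 1s), and every function name are assumptions made for this example; the abstract does not state which statistic or null hypothesis the thesis uses. Note that `flip_distance` only compares against a target that is already known; finding the nearest perfectly patterned matrix is the hard problem the abstract refers to.

```python
import random


def row_supports(matrix):
    """Set of column indices holding a 1, for each row."""
    return [{j for j, v in enumerate(row) if v} for row in matrix]


def is_nested(matrix):
    """True if the row supports form a chain under set inclusion."""
    supports = sorted(row_supports(matrix), key=len, reverse=True)
    return all(small <= big for big, small in zip(supports, supports[1:]))


def flip_distance(matrix, target):
    """Number of cells that must be flipped to turn matrix into target."""
    return sum(a != b for ra, rt in zip(matrix, target) for a, b in zip(ra, rt))


def nested_pair_fraction(matrix):
    """Fraction of row pairs in which one support contains the other."""
    supports = row_supports(matrix)
    pairs = [(a, b) for i, a in enumerate(supports) for b in supports[i + 1:]]
    if not pairs:
        return 1.0
    return sum(a <= b or b <= a for a, b in pairs) / len(pairs)


def shuffled_copy(matrix, rng):
    """Null-model sample: same shape and number of 1s, cells placed at random.

    Assumes a non-empty rectangular matrix.
    """
    cells = [v for row in matrix for v in row]
    rng.shuffle(cells)
    width = len(matrix[0])
    return [cells[i:i + width] for i in range(0, len(cells), width)]


def empirical_p_value(matrix, trials=999, seed=0):
    """Share of null samples at least as nested as the data (add-one estimate)."""
    rng = random.Random(seed)
    observed = nested_pair_fraction(matrix)
    hits = sum(
        nested_pair_fraction(shuffled_copy(matrix, rng)) >= observed
        for _ in range(trials)
    )
    return (hits + 1) / (trials + 1)
```

Sorting the supports by size is enough for the nestedness check because a chain under inclusion is necessarily ordered by size. The shuffle null is deliberately crude: a null model that also preserves row and column sums would be stricter, and the choice of null changes what "significant" means.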
Abstract:
The paper examines the needs, premises and criteria for effective public participation in tactical forest planning. A method for participatory forest planning utilizing the techniques of preference analysis, professional expertise and heuristic optimization is introduced. The techniques do not cover the whole process of participatory planning, but are applied as a tool constituting the numerical core for decision support. The complexity of multi-resource management is addressed by hierarchical decision analysis, which assesses the public values, preferences and decision criteria toward the planning situation. An optimal management plan is sought using heuristic optimization. The plan can further be improved through mutual negotiations, if necessary. The use of the approach is demonstrated with an illustrative example; its merits and challenges for participatory forest planning and decision making are discussed, and a model for applying it in a general forest planning context is depicted. By using the approach, valuable information can be obtained about public preferences and the effects of taking them into consideration on the choice of the combination of standwise treatment proposals for a forest area. Participatory forest planning calculations, carried out by the approach presented in the paper, can be utilized in conflict management and in developing compromises between competing interests.
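The abstract does not say which heuristic optimization technique the paper uses. Purely as an illustration of choosing a combination of standwise treatment proposals under weighted preferences, the sketch below uses a greedy steepest-ascent local search. The utility function (weighted, saturating achievement of per-criterion targets), the criteria names, the targets, and all function names are hypothetical.

```python
def plan_utility(plan, stands, weights, targets):
    """Hypothetical utility: weighted share of each criterion's target reached, capped at 1.

    plan[s] is the index of the treatment chosen for stand s; stands[s][t] maps
    each criterion name to the outcome of treatment t in stand s.
    """
    totals = {
        criterion: sum(stands[s][t][criterion] for s, t in enumerate(plan))
        for criterion in weights
    }
    return sum(
        weight * min(1.0, totals[criterion] / targets[criterion])
        for criterion, weight in weights.items()
    )


def greedy_plan(stands, weights, targets):
    """Greedy local search: repeatedly apply the single treatment change that helps most.

    Starts from the first treatment in every stand and stops when no single
    change improves the utility, so the result is a local optimum only.
    """
    plan = [0] * len(stands)
    best = plan_utility(plan, stands, weights, targets)
    while True:
        candidate = None
        for s, alternatives in enumerate(stands):
            for t in range(len(alternatives)):
                if t == plan[s]:
                    continue
                trial = plan[:s] + [t] + plan[s + 1:]
                value = plan_utility(trial, stands, weights, targets)
                if value > best + 1e-12:
                    best, candidate = value, trial
        if candidate is None:
            return plan, best
        plan = candidate


if __name__ == "__main__":
    # Hypothetical data: three stands, two treatments each (0 = no cutting, 1 = harvest).
    stands = [
        [{"timber": 0, "scenery": 5}, {"timber": 10, "scenery": 1}],
        [{"timber": 0, "scenery": 4}, {"timber": 8, "scenery": 0}],
        [{"timber": 0, "scenery": 6}, {"timber": 12, "scenery": 2}],
    ]
    weights = {"timber": 0.5, "scenery": 0.5}   # e.g. elicited from participants
    targets = {"timber": 12, "scenery": 10}
    print(greedy_plan(stands, weights, targets))
```

In a participatory setting the weights would come from the preference analysis step, and rerunning the search with each group's weights would show how the chosen treatment combination shifts. Because the search accepts only improving moves, it can stop at a local optimum; restarting from different initial plans is one simple remedy.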