9 results for Empirical Bayes Methods
in Helda - Digital Repository of University of Helsinki
Abstract:
This thesis discusses the use of sub- and supercritical fluids as the medium in extraction and chromatography. Super- and subcritical extraction was used to separate essential oils from the herb Angelica archangelica. The effect of the extraction parameters was studied, and sensory analyses of the extracts were carried out by an expert panel. The results of the sensory analyses were compared to the analytically determined contents of the extracts. Sub- and supercritical fluid chromatography (SFC) was used to separate and purify high-value pharmaceuticals. Chiral SFC was used to separate the enantiomers of racemic mixtures of pharmaceutical compounds. Very low (cryogenic) temperatures were applied to substantially enhance the separation efficiency of chiral SFC. The thermodynamic aspects affecting the resolving ability of chiral stationary phases are briefly reviewed. The process production rate, which is a key factor in industrial chromatography, was optimized by empirical multivariate methods. A general linear model was used to optimize the separation of omega-3 fatty acid ethyl esters from esterified fish oil by reversed-phase SFC. Chiral separation of racemic mixtures of guaifenesin and ferulic acid dimer ethyl ester was optimized by the response surface method with three variables at a time. It was found that by optimizing four variables (temperature, load, flow rate and modifier content) the production rate of the chiral resolution of racemic guaifenesin by cryogenic SFC could be increased severalfold compared to published results for a similar application. A novel pressure-compensated design for an industrial high-pressure chromatographic column was introduced, using technology developed in building the deep-sea submersibles Mir 1 and 2. A demonstration SFC plant was built and the immunosuppressant drug cyclosporine A was purified to meet the requirements of the US Pharmacopoeia.
A smaller, semi-pilot-size column of similar design was used for the cryogenic chiral separation of the aromatase inhibitor Finrozole for use in phase 2 of its development.
Abstract:
In this Thesis, we develop theory and methods for computational data analysis. The problems in data analysis are approached from three perspectives: statistical learning theory, the Bayesian framework, and the information-theoretic minimum description length (MDL) principle. Contributions in statistical learning theory address the possibility of generalization to unseen cases, and regression analysis with partially observed data, with an application to mobile device positioning. In the second part of the Thesis, we discuss so-called Bayesian network classifiers and show that they are closely related to logistic regression models. In the final part, we apply the MDL principle to tracing the history of old manuscripts and to noise reduction in digital signals.
Abstract:
Matrix decompositions, where a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra. In data mining one needs results that are interpretable, and what is considered interpretable in data mining can be very different from what is considered interpretable in linear algebra. The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices where the factor matrices are assumed to be binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability (the factor matrices are of the same type as the original matrix) and allows the use of Boolean matrix multiplication, which is often more intuitive than normal matrix multiplication with binary matrices. Several other decomposition methods are also described, and the computational complexity of computing them is studied together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted with both synthetic and real-world data.
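The difference between Boolean and normal matrix multiplication of binary matrices can be illustrated with a short sketch. The matrices A, B and C and the function names below are hypothetical and chosen only for illustration; this is not one of the algorithms proposed in the thesis, only a check of a given factorization.

```python
import numpy as np

def boolean_product(b, c):
    """Boolean matrix product: entry (i, j) is OR over k of (b[i, k] AND c[k, j])."""
    b = np.asarray(b, dtype=bool)
    c = np.asarray(c, dtype=bool)
    # The integer product counts the k for which both entries are 1;
    # any positive count corresponds to a Boolean True.
    return (b.astype(int) @ c.astype(int)) > 0

def reconstruction_error(a, b, c):
    """Number of cells where the Boolean product of the factors differs from a."""
    a = np.asarray(a, dtype=bool)
    return int(np.sum(a != boolean_product(b, c)))

# Hypothetical binary data matrix and binary factor matrices.
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])
B = np.array([[1, 0],
              [1, 1],
              [0, 1]])
C = np.array([[1, 1, 0],
              [0, 1, 1]])

boolean_error = reconstruction_error(A, B, C)  # Boolean product vs. A
normal_product = B @ C                         # ordinary integer product

print(boolean_error)
print(normal_product)
```

In this toy case the Boolean product of B and C reproduces A, whereas the ordinary product contains an entry larger than 1 where two factors overlap, so it is no longer a binary matrix.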
Abstract:
A vast amount of public services and goods are contracted through procurement auctions. Therefore it is very important to design these auctions in an optimal way. Typically, we are interested in two different objectives. The first objective is efficiency. Efficiency means that the contract is awarded to the bidder that values it the most, which in the procurement setting means the bidder that has the lowest cost of providing a service with a given quality. The second objective is to maximize public revenue. Maximizing public revenue means minimizing the costs of procurement. Both of these goals are important from the welfare point of view. In this thesis, I analyze field data from procurement auctions and show how empirical analysis can be used to help design the auctions to maximize public revenue. In particular, I concentrate on how competition, which means the number of bidders, should be taken into account in the design of auctions. In the first chapter, the main policy question is whether the auctioneer should spend resources to induce more competition. The information paradigm is essential in analyzing the effects of competition. We talk of a private values information paradigm when the bidders know their valuations exactly. In a common value information paradigm, the information about the value of the object is dispersed among the bidders. With private values more competition always increases the public revenue but with common values the effect of competition is uncertain. I study the effects of competition in the City of Helsinki bus transit market by conducting tests for common values. I also extend an existing test by allowing bidder asymmetry. The information paradigm seems to be that of common values. The bus companies that have garages close to the contracted routes are influenced more by the common value elements than those whose garages are further away. 
Therefore, attracting more bidders does not necessarily lower procurement costs, and thus the City should not implement costly policies to induce more competition. In the second chapter, I ask how the auctioneer can increase its revenue by changing contract characteristics like contract sizes and durations. I find that the City of Helsinki should shorten the contract duration in the bus transit auctions because that would decrease the importance of common value components and cheaply increase entry which now would have a more beneficial impact on the public revenue. Typically, cartels decrease the public revenue in a significant way. In the third chapter, I propose a new statistical method for detecting collusion and compare it with an existing test. I argue that my test is robust to unobserved heterogeneity unlike the existing test. I apply both methods to procurement auctions that contract snow removal in schools of Helsinki. According to these tests, the bidding behavior of two of the bidders seems consistent with a contract allocation scheme.
Abstract:
Modern-day weather forecasting is highly dependent on Numerical Weather Prediction (NWP) models as the main data source. The evolving state of the atmosphere with time can be numerically predicted by solving a set of hydrodynamic equations, if the initial state is known. However, such a modelling approach always contains approximations that by and large depend on the purpose of use and resolution of the models. Present-day NWP systems operate with horizontal model resolutions in the range from about 40 km to 10 km. Recently, the aim has been to reach scales of 1-4 km operationally. This requires fewer approximations in the model equations, more complex treatment of physical processes and, furthermore, more computing power. This thesis concentrates on the physical parameterization methods used in high-resolution NWP models. The main emphasis is on the validation of the grid-size-dependent convection parameterization in the High Resolution Limited Area Model (HIRLAM) and on a comprehensive intercomparison of radiative-flux parameterizations. In addition, the problems related to wind prediction near the coastline are addressed with high-resolution meso-scale models. The grid-size-dependent convection parameterization is clearly beneficial for NWP models operating with a dense grid. Results show that the current convection scheme in HIRLAM is still applicable down to a 5.6 km grid size. However, with further improved model resolution, the tendency of the model to overestimate strong precipitation intensities increases in all the experiment runs. For the clear-sky longwave radiation parameterization, schemes used in NWP models provide much better results than simple empirical schemes. On the other hand, for the shortwave part of the spectrum, the empirical schemes are more competitive in producing fairly accurate surface fluxes.
Overall, even the complex radiation parameterization schemes used in NWP models seem to be slightly too transparent for both long- and shortwave radiation in clear-sky conditions. For cloudy conditions, simple cloud correction functions are tested. In the case of longwave radiation, the empirical cloud correction methods provide rather accurate results, whereas for shortwave radiation the benefit is only marginal. Idealised high-resolution two-dimensional meso-scale model experiments suggest that the reason for the observed formation of the afternoon low level jet (LLJ) over the Gulf of Finland is an inertial oscillation mechanism, when the large-scale flow is from the south-east or west directions. The LLJ is further enhanced by the sea-breeze circulation. A three-dimensional HIRLAM experiment, with a 7.7 km grid size, is able to generate an LLJ flow structure similar to that suggested by the 2D experiments and observations. It is also pointed out that improved model resolution does not necessarily lead to better wind forecasts in the statistical sense. In nested systems, the quality of the large-scale host model is very important, especially if the inner meso-scale model domain is small.
Abstract:
Governance has been one of the most popular buzzwords in recent political science. As with any term shared by numerous fields of research, as well as everyday language, governance is encumbered by a jungle of definitions and applications. This work elaborates on the concept of network governance. Network governance refers to complex policy-making situations, where a variety of public and private actors collaborate in order to produce and define policy. Governance consists of processes of autonomous, self-organizing networks of organizations exchanging information and deliberating. Network governance is a theoretical concept that corresponds to an empirical phenomenon. Often, this phenomenon is used to describe a historical development: governance is often used to describe changes in the political processes of Western societies since the 1980s. In this work, empirical governance networks are used as an organizing framework, and the concepts of autonomy, self-organization and network structure are developed as tools for empirical analysis of any complex decision-making process. This work develops this framework and explores the governance networks in the case of environmental policy-making in the City of Helsinki, Finland. The crafting of a local ecological sustainability programme required support and knowledge from all sectors of administration, a number of entrepreneurs and companies and the inhabitants of Helsinki. The policy process relied explicitly on networking, with public and private actors collaborating to design policy instruments. Communication between individual organizations led to the development of network structures and patterns. This research analyses these patterns and their effects on policy choice by applying the methods of social network analysis. A variety of social network analysis methods are used to uncover different features of the networked process.
Links between individual network positions, network subgroup structures and macro-level network patterns are compared to the types of organizations involved and final policy instruments chosen. By using governance concepts to depict a policy process, the work aims to assess whether they contribute to models of policy-making. The conclusion is that the governance literature sheds light on events that would otherwise go unnoticed, or whose conceptualization would remain atheoretical. The framework of network governance should be in the toolkit of the policy analyst.
Abstract:
This paper uses the Value-at-Risk approach to define the risk in both long and short trading positions. The investigation covers several major market indices (Japanese, UK, German and US). The performance of models that take skewness and fat tails into account is compared to that of symmetric models, in relation to both the specific model for estimating the variance and the distribution of the variance estimate used as input in the VaR estimation. The results indicate that more flexible models do not necessarily perform better in forecasting the VaR; the reason for this is most probably the complexity of these models. A general result is that different methods for estimating the variance are needed for different confidence levels of the VaR and for the different indices. Also, different models should be used for the left and the right tail of the distribution, respectively.
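Why long and short positions involve different tails can be shown with a minimal sketch. It uses plain empirical quantiles (historical simulation), which is a simpler technique swapped in for illustration and not one of the variance models compared in the paper; the function name and the return series are hypothetical.

```python
import numpy as np

def historical_var(returns, alpha=0.05):
    """Return (long_var, short_var) as positive loss thresholds.

    alpha is the tail probability, so alpha=0.05 corresponds to a 95 % VaR.
    """
    r = np.asarray(returns, dtype=float)
    # A long position loses when returns are negative: use the left tail.
    long_var = -np.quantile(r, alpha)
    # A short position loses when returns are positive: use the right tail.
    short_var = np.quantile(r, 1.0 - alpha)
    return long_var, short_var

# Hypothetical, deliberately asymmetric return series (not real index data).
returns = np.linspace(-0.10, 0.05, 151)
long_var, short_var = historical_var(returns, alpha=0.05)

print(long_var, short_var)
```

Because the illustrative series has a longer left tail than right tail, the two thresholds differ, which is the situation in which separate treatment of the two tails matters.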
Abstract:
The relationship between site characteristics and understorey vegetation composition was analysed with quantitative methods, especially from the viewpoint of site quality estimation. Theoretical models were applied to an empirical data set collected from the upland forests of southern Finland comprising 104 sites dominated by Scots pine (Pinus sylvestris L.), and 165 sites dominated by Norway spruce (Picea abies (L.) Karsten). Site index H100 was used as an independent measure of site quality. A new model for the estimation of site quality at sites with a known understorey vegetation composition was introduced. It is based on the application of Bayes' theorem to the density function of site quality within the study area combined with the species-specific presence-absence response curves. The resulting posterior probability density function may be used for calculating an estimate for the site variable. Using this method, a jackknife estimate of site index H100 was calculated separately for pine- and spruce-dominated sites. The results indicated that the cross-validation root mean squared error (RMSEcv) of the estimates improved from 2.98 m down to 2.34 m relative to the "null" model (standard deviation of the sample distribution) in pine-dominated forests. In spruce-dominated forests RMSEcv decreased from 3.94 m down to 3.16 m. In order to assess these results, four other estimation methods based on understorey vegetation composition were applied to the same data set. The results showed that none of the methods was clearly superior to the others. In pine-dominated forests, RMSEcv varied between 2.34 and 2.47 m, and the corresponding range for spruce-dominated forests was from 3.13 to 3.57 m.
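The Bayesian estimate described above can be sketched numerically on a grid of candidate site index values. Everything specific in the sketch is an assumption made for illustration: the Gaussian prior, the logistic response curves, the two hypothetical species, and the treatment of species as conditionally independent given site quality. The abstract does not state the form of the curves or how species are combined.

```python
import numpy as np

def posterior_site_index(grid, prior, response, observed):
    """Posterior over a site variable from presence-absence observations.

    grid:     (G,) candidate values of the site variable
    prior:    (G,) prior density evaluated on the grid (need not be normalised)
    response: (S, G) probability that each species is present at each grid value
    observed: (S,) 1 if the species was found on the site, else 0
    Species are treated as conditionally independent (an assumption).
    """
    grid = np.asarray(grid, dtype=float)
    prior = np.asarray(prior, dtype=float)
    response = np.asarray(response, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    # Likelihood of the observed pattern: p for present species, 1 - p for absent.
    likelihood = np.where(observed[:, None], response, 1.0 - response).prod(axis=0)
    posterior = prior * likelihood
    posterior = posterior / posterior.sum()  # normalise to probability mass on the grid
    estimate = float(np.sum(grid * posterior))  # posterior mean as the point estimate
    return posterior, estimate

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical setup: site index in metres, prior centred on 24 m.
grid = np.linspace(10.0, 35.0, 251)
prior = np.exp(-0.5 * ((grid - 24.0) / 3.0) ** 2)
response = np.vstack([
    logistic((grid - 24.0) / 2.0),   # species favouring fertile sites
    logistic(-(grid - 24.0) / 2.0),  # species favouring poor sites
])
observed = np.array([1, 0])  # first species present, second absent

posterior, estimate = posterior_site_index(grid, prior, response, observed)
print(estimate)
```

With the fertile-site species present and the poor-site species absent, the posterior mean moves above the prior centre. A jackknife assessment like the one reported would repeat this with each site left out of the fitting data in turn.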
Abstract:
This thesis studies the effect of income inequality on economic growth. This is done by analyzing panel data from several countries with both short and long time dimensions. Two of the chapters study the direct effect of inequality on growth, and one chapter also looks at the possible indirect effect of inequality on growth by assessing the effect of inequality on savings. In Chapter two, the effect of inequality on growth is studied using a panel of 70 countries and a new EHII2008 inequality measure. The chapter addresses two problems that panel econometric studies on the economic effect of inequality have recently encountered: the comparability problem associated with the commonly used Deininger and Squire's Gini index, and the problem relating to the estimation of group-related elasticities in panel data. In this study, a simple way to 'bypass' the vagueness related to the use of parametric methods to estimate group-related parameters is presented. The idea is to estimate the group-related elasticities implicitly using a set of group-related instrumental variables. The estimation results with the new data and method indicate that the relationship between income inequality and growth is likely to be non-linear. Chapter three uses the EHII2.1 inequality measure and a panel with annual time series observations from 38 countries to test the existence of long-run equilibrium relation(s) between inequality and the level of GDP. Panel unit root tests indicate that both the logarithmic EHII2.1 inequality measure and the logarithmic GDP per capita series are I(1) nonstationary processes. They are also found to be cointegrated of order one, which implies that there is a long-run equilibrium relation between them. The long-run growth elasticity of inequality is found to be negative in the middle-income and rich economies, but the results for poor economies are inconclusive.
In the fourth Chapter, macroeconomic data on nine developed economies spanning four decades from 1960 are used to study the effect of changes in the top income share on national and private savings. The income share of the top 1 % of the population is used as a proxy for the distribution of income. The effect of inequality on private savings is found to be positive in the Nordic and Central-European countries, but for the Anglo-Saxon countries the direction of the effect (positive vs. negative) remains somewhat ambiguous. Inequality is found to have an effect on national savings only in the Nordic countries, where it is positive.