8 resultados para Statistical method
em Helda - Digital Repository of University of Helsinki
Resumo:
Socioeconomic health inequalities have been widely documented, with a lower social position being associated with poorer physical and general health and higher mortality. For mental health the results have been more varied. However, the mechanisms by which the various dimensions of socioeconomic circumstances are associated with different domains of health are not yet fully understood. This is related to a lack of studies tackling the interrelations and pathways between multiple dimensions of socioeconomic circumstances and domains of health. In particular, evidence from comparative studies of populations from different national contexts that consider the complexity of the causes of socioeconomic health inequalities is needed. The aim of this study was to examine the associations of multiple socioeconomic circumstances with physical and mental health, more specifically physical functioning and common mental disorders. This was done in a comparative setting of two cohorts of white-collar public sector employees, one from Finland and one from Britain. The study also sought to find explanations for the observed associations between economic difficulties and health by analysing the contribution of health behaviours, living arrangements and work-family conflicts. The survey data were derived from the Finnish Helsinki Health Study baseline surveys in 2000-2002 among the City of Helsinki employees aged 40-60 years, and from the fifth phase of the London-based Whitehall II study (1997-9) which is a prospective study of civil servants aged 35-55 years at the time of recruitment. The data collection in the two countries was harmonised to safeguard maximal comparability. Physical functioning was measured with the Short Form (SF-36) physical component summary and common mental disorders with the General Health Questionnaire (GHQ-12). Socioeconomic circumstances were parental education, childhood economic difficulties, own education, occupational class, household income, housing tenure, and current economic difficulties. Further explanatory factors were health behaviours, living arrangements and work-family conflicts. The main statistical method used was logistic regression analysis. Analyses were conducted separately for the two sexes and two cohorts. Childhood and current economic difficulties were associated with poorer physical functioning and common mental disorders generally in both cohorts and sexes. Conventional dimensions of socioeconomic circumstances i.e. education, occupational class and income were associated with physical functioning and mediated each other’s effects, but in different ways in the two cohorts: education was more important in Helsinki and occupational class in London. The associations of economic difficulties with health were partly explained by work-family conflicts and other socioeconomic circumstances in both cohorts and sexes. In conclusion, this study on two country-specific cohorts confirms that different dimensions of socioeconomic circumstances are related but not interchangeable. They are also somewhat differently associated with physical and mental domains of health. In addition to conventionally measured dimensions of past and present socioeconomic circumstances, economic difficulties should be taken into account in studies and attempts to reduce health inequalities. Further explanatory factors, particularly conflicts between work and family, should also be considered when aiming to reduce inequalities and maintain the health of employees.
Resumo:
The Baltic countries share public health problems typical of most Eastern European transition economies: morbidity and mortality from non-communicable diseases is higher than in Western European countries. This situation has many similarities compared to a neighbouring country, Finland during the late 1960s. There are reasons to expect that health disadvantage may be increasing among the less advantaged population groups in the Baltic countries. The evidence on social differences in health in the Baltic countries is, however, scattered to studies using different methodologies making comparisons difficult. This study aims to bridge the evidence gap by providing comparable standardized cross-sectional and time trend analyses to the social patterning of variation in health and two key health behaviours i.e. smoking and drinking in Estonia, Latvia, Lithuania and Finland in 1994-2004 representing Eastern European transition countries and a stable Western European country. The data consisted of similar cross-sectional postal surveys conducted in 1994, 1996, 1998, 2000, 2002 and 2004 on adult populations (aged 20 64 years) in Estonia (n=9049), Latvia (n=7685), Lithuania (n=11634) and Finland (n=18821) in connection with the Finbalt Health Monitor project. The main statistical method was logistic regression analysis. Perceived health was found to be worse among both men and women in the Baltic countries than in Finland. Poor health was associated with older age and lower education in all countries studied. Urbanization and marital status were not consistently related to health. The existing educational inequalities in health remained generally stable over time from 1994 to 2004. In the Baltic countries, however, improvement in perceived health was mainly found among the better educated men and women. Daily smoking was associated with young age, lower education and psychological distress in all countries. Among women smoking was also associated with urbanisation in all countries except Estonia. Among Lithuanian women, the educational gradient in smoking was weakest, and the overall prevalence of smoking increased over time. Drinking was generally associated with young age among men and women, and with education among women. Better educated women were more often frequent drinkers and less educated binge drinkers. The exception was that in Latvian men and women both frequent drinking and binge drinking were associated with low education. In conclusion, the Baltic countries are likely to resemble Western European countries rather than other transition societies. While health inequalities did not markedly change, substantial inequalities do remain, and there were indications of favourable developments mainly among the better educated. Pressures towards increasing health inequalities may therefore be visible in the future, which would be in accordance with the results on smoking and drinking in this study.
Resumo:
A vast amount of public services and goods are contracted through procurement auctions. Therefore it is very important to design these auctions in an optimal way. Typically, we are interested in two different objectives. The first objective is efficiency. Efficiency means that the contract is awarded to the bidder that values it the most, which in the procurement setting means the bidder that has the lowest cost of providing a service with a given quality. The second objective is to maximize public revenue. Maximizing public revenue means minimizing the costs of procurement. Both of these goals are important from the welfare point of view. In this thesis, I analyze field data from procurement auctions and show how empirical analysis can be used to help design the auctions to maximize public revenue. In particular, I concentrate on how competition, which means the number of bidders, should be taken into account in the design of auctions. In the first chapter, the main policy question is whether the auctioneer should spend resources to induce more competition. The information paradigm is essential in analyzing the effects of competition. We talk of a private values information paradigm when the bidders know their valuations exactly. In a common value information paradigm, the information about the value of the object is dispersed among the bidders. With private values more competition always increases the public revenue but with common values the effect of competition is uncertain. I study the effects of competition in the City of Helsinki bus transit market by conducting tests for common values. I also extend an existing test by allowing bidder asymmetry. The information paradigm seems to be that of common values. The bus companies that have garages close to the contracted routes are influenced more by the common value elements than those whose garages are further away. Therefore, attracting more bidders does not necessarily lower procurement costs, and thus the City should not implement costly policies to induce more competition. In the second chapter, I ask how the auctioneer can increase its revenue by changing contract characteristics like contract sizes and durations. I find that the City of Helsinki should shorten the contract duration in the bus transit auctions because that would decrease the importance of common value components and cheaply increase entry which now would have a more beneficial impact on the public revenue. Typically, cartels decrease the public revenue in a significant way. In the third chapter, I propose a new statistical method for detecting collusion and compare it with an existing test. I argue that my test is robust to unobserved heterogeneity unlike the existing test. I apply both methods to procurement auctions that contract snow removal in schools of Helsinki. According to these tests, the bidding behavior of two of the bidders seems consistent with a contract allocation scheme.
Resumo:
Bacteria play an important role in many ecological systems. The molecular characterization of bacteria using either cultivation-dependent or cultivation-independent methods reveals the large scale of bacterial diversity in natural communities, and the vastness of subpopulations within a species or genus. Understanding how bacterial diversity varies across different environments and also within populations should provide insights into many important questions of bacterial evolution and population dynamics. This thesis presents novel statistical methods for analyzing bacterial diversity using widely employed molecular fingerprinting techniques. The first objective of this thesis was to develop Bayesian clustering models to identify bacterial population structures. Bacterial isolates were identified using multilous sequence typing (MLST), and Bayesian clustering models were used to explore the evolutionary relationships among isolates. Our method involves the inference of genetic population structures via an unsupervised clustering framework where the dependence between loci is represented using graphical models. The population dynamics that generate such a population stratification were investigated using a stochastic model, in which homologous recombination between subpopulations can be quantified within a gene flow network. The second part of the thesis focuses on cluster analysis of community compositional data produced by two different cultivation-independent analyses: terminal restriction fragment length polymorphism (T-RFLP) analysis, and fatty acid methyl ester (FAME) analysis. The cluster analysis aims to group bacterial communities that are similar in composition, which is an important step for understanding the overall influences of environmental and ecological perturbations on bacterial diversity. A common feature of T-RFLP and FAME data is zero-inflation, which indicates that the observation of a zero value is much more frequent than would be expected, for example, from a Poisson distribution in the discrete case, or a Gaussian distribution in the continuous case. We provided two strategies for modeling zero-inflation in the clustering framework, which were validated by both synthetic and empirical complex data sets. We show in the thesis that our model that takes into account dependencies between loci in MLST data can produce better clustering results than those methods which assume independent loci. Furthermore, computer algorithms that are efficient in analyzing large scale data were adopted for meeting the increasing computational need. Our method that detects homologous recombination in subpopulations may provide a theoretical criterion for defining bacterial species. The clustering of bacterial community data include T-RFLP and FAME provides an initial effort for discovering the evolutionary dynamics that structure and maintain bacterial diversity in the natural environment.
Resumo:
Population dynamics are generally viewed as the result of intrinsic (purely density dependent) and extrinsic (environmental) processes. Both components, and potential interactions between those two, have to be modelled in order to understand and predict dynamics of natural populations; a topic that is of great importance in population management and conservation. This thesis focuses on modelling environmental effects in population dynamics and how effects of potentially relevant environmental variables can be statistically identified and quantified from time series data. Chapter I presents some useful models of multiplicative environmental effects for unstructured density dependent populations. The presented models can be written as standard multiple regression models that are easy to fit to data. Chapters II IV constitute empirical studies that statistically model environmental effects on population dynamics of several migratory bird species with different life history characteristics and migration strategies. In Chapter II, spruce cone crops are found to have a strong positive effect on the population growth of the great spotted woodpecker (Dendrocopos major), while cone crops of pine another important food resource for the species do not effectively explain population growth. The study compares rate- and ratio-dependent effects of cone availability, using state-space models that distinguish between process and observation error in the time series data. Chapter III shows how drought, in combination with settling behaviour during migration, produces asymmetric spatially synchronous patterns of population dynamics in North American ducks (genus Anas). Chapter IV investigates the dynamics of a Finnish population of skylark (Alauda arvensis), and point out effects of rainfall and habitat quality on population growth. Because the skylark time series and some of the environmental variables included show strong positive autocorrelation, the statistical significances are calculated using a Monte Carlo method, where random autocorrelated time series are generated. Chapter V is a simulation-based study, showing that ignoring observation error in analyses of population time series data can bias the estimated effects and measures of uncertainty, if the environmental variables are autocorrelated. It is concluded that the use of state-space models is an effective way to reach more accurate results. In summary, there are several biological assumptions and methodological issues that can affect the inferential outcome when estimating environmental effects from time series data, and that therefore need special attention. The functional form of the environmental effects and potential interactions between environment and population density are important to deal with. Other issues that should be considered are assumptions about density dependent regulation, modelling potential observation error, and when needed, accounting for spatial and/or temporal autocorrelation.
Resumo:
An efficient and statistically robust solution for the identification of asteroids among numerous sets of astrometry is presented. In particular, numerical methods have been developed for the short-term identification of asteroids at discovery, and for the long-term identification of scarcely observed asteroids over apparitions, a task which has been lacking a robust method until now. The methods are based on the solid foundation of statistical orbital inversion properly taking into account the observational uncertainties, which allows for the detection of practically all correct identifications. Through the use of dimensionality-reduction techniques and efficient data structures, the exact methods have a loglinear, that is, O(nlog(n)), computational complexity, where n is the number of included observation sets. The methods developed are thus suitable for future large-scale surveys which anticipate a substantial increase in the astrometric data rate. Due to the discontinuous nature of asteroid astrometry, separate sets of astrometry must be linked to a common asteroid from the very first discovery detections onwards. The reason for the discontinuity in the observed positions is the rotation of the observer with the Earth as well as the motion of the asteroid and the observer about the Sun. Therefore, the aim of identification is to find a set of orbital elements that reproduce the observed positions with residuals similar to the inevitable observational uncertainty. Unless the astrometric observation sets are linked, the corresponding asteroid is eventually lost as the uncertainty of the predicted positions grows too large to allow successful follow-up. Whereas the presented identification theory and the numerical comparison algorithm are generally applicable, that is, also in fields other than astronomy (e.g., in the identification of space debris), the numerical methods developed for asteroid identification can immediately be applied to all objects on heliocentric orbits with negligible effects due to non-gravitational forces in the time frame of the analysis. The methods developed have been successfully applied to various identification problems. Simulations have shown that the methods developed are able to find virtually all correct linkages despite challenges such as numerous scarce observation sets, astrometric uncertainty, numerous objects confined to a limited region on the celestial sphere, long linking intervals, and substantial parallaxes. Tens of previously unknown main-belt asteroids have been identified with the short-term method in a preliminary study to locate asteroids among numerous unidentified sets of single-night astrometry of moving objects, and scarce astrometry obtained nearly simultaneously with Earth-based and space-based telescopes has been successfully linked despite a substantial parallax. Using the long-term method, thousands of realistic 3-linkages typically spanning several apparitions have so far been found among designated observation sets each spanning less than 48 hours.
Resumo:
In this study we explore the concurrent, combined use of three research methods, statistical corpus analysis and two psycholinguistic experiments (a forced-choice and an acceptability rating task), using verbal synonymy in Finnish as a case in point. In addition to supporting conclusions from earlier studies concerning the relationships between corpus-based and ex- perimental data (e. g., Featherston 2005), we show that each method adds to our understanding of the studied phenomenon, in a way which could not be achieved through any single method by itself. Most importantly, whereas relative rareness in a corpus is associated with dispreference in selection, such infrequency does not categorically always entail substantially lower acceptability. Furthermore, we show that forced-choice and acceptability rating tasks pertain to distinct linguistic processes, with category-wise in- commensurable scales of measurement, and should therefore be merged with caution, if at all.
Resumo:
Modern sample surveys started to spread after statistician at the U.S. Bureau of the Census in the 1940s had developed a sampling design for the Current Population Survey (CPS). A significant factor was also that digital computers became available for statisticians. In the beginning of 1950s, the theory was documented in textbooks on survey sampling. This thesis is about the development of the statistical inference for sample surveys. For the first time the idea of statistical inference was enunciated by a French scientist, P. S. Laplace. In 1781, he published a plan for a partial investigation in which he determined the sample size needed to reach the desired accuracy in estimation. The plan was based on Laplace s Principle of Inverse Probability and on his derivation of the Central Limit Theorem. They were published in a memoir in 1774 which is one of the origins of statistical inference. Laplace s inference model was based on Bernoulli trials and binominal probabilities. He assumed that populations were changing constantly. It was depicted by assuming a priori distributions for parameters. Laplace s inference model dominated statistical thinking for a century. Sample selection in Laplace s investigations was purposive. In 1894 in the International Statistical Institute meeting, Norwegian Anders Kiaer presented the idea of the Representative Method to draw samples. Its idea was that the sample would be a miniature of the population. It is still prevailing. The virtues of random sampling were known but practical problems of sample selection and data collection hindered its use. Arhtur Bowley realized the potentials of Kiaer s method and in the beginning of the 20th century carried out several surveys in the UK. He also developed the theory of statistical inference for finite populations. It was based on Laplace s inference model. R. A. Fisher contributions in the 1920 s constitute a watershed in the statistical science He revolutionized the theory of statistics. In addition, he introduced a new statistical inference model which is still the prevailing paradigm. The essential idea is to draw repeatedly samples from the same population and the assumption that population parameters are constants. Fisher s theory did not include a priori probabilities. Jerzy Neyman adopted Fisher s inference model and applied it to finite populations with the difference that Neyman s inference model does not include any assumptions of the distributions of the study variables. Applying Fisher s fiducial argument he developed the theory for confidence intervals. Neyman s last contribution to survey sampling presented a theory for double sampling. This gave the central idea for statisticians at the U.S. Census Bureau to develop the complex survey design for the CPS. Important criterion was to have a method in which the costs of data collection were acceptable, and which provided approximately equal interviewer workloads, besides sufficient accuracy in estimation.