13 resultados para random regression
em Helda - Digital Repository of University of Helsinki
Resumo:
The aim of the present study was to determine relationships between insurance status and utilization of oral health care and its characteristics and to identify factors related to insured patients’ selection of dental clinic or dentist. The study was based on cross-sectional data obtained through phone interviews. The target population included adults in the city of Tehran. Using a two-stage stratified random technique, 3,200 seven-digit numbers resembling real phone numbers were drawn; when calling, 1,669 numbers were unavailable (busy, no answer, fax, line blocked). Of the 1,531 subjects who answered the phone call, 224 were outside the target age (under 18), and 221 refused to respond, leaving 1,086 subjects in the final sample. The interviews were carried out using a structured questionnaire and covered characteristics of dental visits, the respondent’s reason for selecting a particular dentist or clinic and demographic and socio-economic background (gender, age, level of education, income, and insurance status). Data analysis included the Chi-square test, ANOVA, and logistic regression and the corresponding odds ratios (OR). Of all the 1,086 respondents, 57% were women, 62% were under age 35, 46% had a medium and 34% a high level of education, 13% were under the poverty line, and 70% had insurance coverage; 64% with the public, and 6% with a commercial insurance. Having insurance coverage was more likely for women (OR=1.5), for those in the oldest age group (OR=2.0), and for those with a high level of education (OR=2.5). Of those with dental insurance, 54% reported having had a dental visit within the past 12 months ; more often by those with commercial insurance in comparison with public (65% vs. 53% p<0.001). Check-up as the reason for the most recent visit occurred most frequently among those with commercial insurance (28%) compared with those having public insurance (16%) or being non-insured (13%) (p<0.001). Having had two or more dental visits within the past 12 months was most common among insured respondents, when compared with the non-insured (31% vs. 22% p=0.01). The non-insured respondents reported tooth extractions almost twice as frequently as did the insured ones (p<0.001). Of the 726 insured subjects, 60% selected fully out-of-pocket-paid services (FOP), and 53% were unaware of their insurance benefits. Of those who selected FOP, good interpersonal aspects (OR=4.6), being unaware of dental insurance benefits (OR=4.6), and good technical aspects (OR=2.3) as a reason had greater odds of selecting FOP. The present study revealed that dental insurance was positively related to demand for oral health care as well as to utilization of services, but to the latter with a minor extent. Among insured respondents, despite their opportunity to use fully or highly subsidized oral health care services, good interpersonal relationship and high quality of services were the most important factors when an insured patient selected a dentist or a clinic. The present findings indicate a clear need to modify dental insurance systems in Iran to facilitate optimal use of oral health care services to maximize the oral health of the population. A special emphasis in the insurance schemes should be focused on preventive care.
Resumo:
The focus of this study is on statistical analysis of categorical responses, where the response values are dependent of each other. The most typical example of this kind of dependence is when repeated responses have been obtained from the same study unit. For example, in Paper I, the response of interest is the pneumococcal nasopharengyal carriage (yes/no) on 329 children. For each child, the carriage is measured nine times during the first 18 months of life, and thus repeated respones on each child cannot be assumed independent of each other. In the case of the above example, the interest typically lies in the carriage prevalence, and whether different risk factors affect the prevalence. Regression analysis is the established method for studying the effects of risk factors. In order to make correct inferences from the regression model, the associations between repeated responses need to be taken into account. The analysis of repeated categorical responses typically focus on regression modelling. However, further insights can also be gained by investigating the structure of the association. The central theme in this study is on the development of joint regression and association models. The analysis of repeated, or otherwise clustered, categorical responses is computationally difficult. Likelihood-based inference is often feasible only when the number of repeated responses for each study unit is small. In Paper IV, an algorithm is presented, which substantially facilitates maximum likelihood fitting, especially when the number of repeated responses increase. In addition, a notable result arising from this work is the freely available software for likelihood-based estimation of clustered categorical responses.
Resumo:
Planar curves arise naturally as interfaces between two regions of the plane. An important part of statistical physics is the study of lattice models. This thesis is about the interfaces of 2D lattice models. The scaling limit is an infinite system limit which is taken by letting the lattice mesh decrease to zero. At criticality, the scaling limit of an interface is one of the SLE curves (Schramm-Loewner evolution), introduced by Oded Schramm. This family of random curves is parametrized by a real variable, which determines the universality class of the model. The first and the second paper of this thesis study properties of SLEs. They contain two different methods to study the whole SLE curve, which is, in fact, the most interesting object from the statistical physics point of view. These methods are applied to study two symmetries of SLE: reversibility and duality. The first paper uses an algebraic method and a representation of the Virasoro algebra to find common martingales to different processes, and that way, to confirm the symmetries for polynomial expected values of natural SLE data. In the second paper, a recursion is obtained for the same kind of expected values. The recursion is based on stationarity of the law of the whole SLE curve under a SLE induced flow. The third paper deals with one of the most central questions of the field and provides a framework of estimates for describing 2D scaling limits by SLE curves. In particular, it is shown that a weak estimate on the probability of an annulus crossing implies that a random curve arising from a statistical physics model will have scaling limits and those will be well-described by Loewner evolutions with random driving forces.
Resumo:
Population dynamics are generally viewed as the result of intrinsic (purely density dependent) and extrinsic (environmental) processes. Both components, and potential interactions between those two, have to be modelled in order to understand and predict dynamics of natural populations; a topic that is of great importance in population management and conservation. This thesis focuses on modelling environmental effects in population dynamics and how effects of potentially relevant environmental variables can be statistically identified and quantified from time series data. Chapter I presents some useful models of multiplicative environmental effects for unstructured density dependent populations. The presented models can be written as standard multiple regression models that are easy to fit to data. Chapters II IV constitute empirical studies that statistically model environmental effects on population dynamics of several migratory bird species with different life history characteristics and migration strategies. In Chapter II, spruce cone crops are found to have a strong positive effect on the population growth of the great spotted woodpecker (Dendrocopos major), while cone crops of pine another important food resource for the species do not effectively explain population growth. The study compares rate- and ratio-dependent effects of cone availability, using state-space models that distinguish between process and observation error in the time series data. Chapter III shows how drought, in combination with settling behaviour during migration, produces asymmetric spatially synchronous patterns of population dynamics in North American ducks (genus Anas). Chapter IV investigates the dynamics of a Finnish population of skylark (Alauda arvensis), and point out effects of rainfall and habitat quality on population growth. Because the skylark time series and some of the environmental variables included show strong positive autocorrelation, the statistical significances are calculated using a Monte Carlo method, where random autocorrelated time series are generated. Chapter V is a simulation-based study, showing that ignoring observation error in analyses of population time series data can bias the estimated effects and measures of uncertainty, if the environmental variables are autocorrelated. It is concluded that the use of state-space models is an effective way to reach more accurate results. In summary, there are several biological assumptions and methodological issues that can affect the inferential outcome when estimating environmental effects from time series data, and that therefore need special attention. The functional form of the environmental effects and potential interactions between environment and population density are important to deal with. Other issues that should be considered are assumptions about density dependent regulation, modelling potential observation error, and when needed, accounting for spatial and/or temporal autocorrelation.
Resumo:
Time-dependent backgrounds in string theory provide a natural testing ground for physics concerning dynamical phenomena which cannot be reliably addressed in usual quantum field theories and cosmology. A good, tractable example to study is the rolling tachyon background, which describes the decay of an unstable brane in bosonic and supersymmetric Type II string theories. In this thesis I use boundary conformal field theory along with random matrix theory and Coulomb gas thermodynamics techniques to study open and closed string scattering amplitudes off the decaying brane. The calculation of the simplest example, the tree-level amplitude of n open strings, would give us the emission rate of the open strings. However, even this has been unknown. I will organize the open string scattering computations in a more coherent manner and will argue how to make further progress.
Resumo:
This study examines the properties of Generalised Regression (GREG) estimators for domain class frequencies and proportions. The family of GREG estimators forms the class of design-based model-assisted estimators. All GREG estimators utilise auxiliary information via modelling. The classic GREG estimator with a linear fixed effects assisting model (GREG-lin) is one example. But when estimating class frequencies, the study variable is binary or polytomous. Therefore logistic-type assisting models (e.g. logistic or probit model) should be preferred over the linear one. However, other GREG estimators than GREG-lin are rarely used, and knowledge about their properties is limited. This study examines the properties of L-GREG estimators, which are GREG estimators with fixed-effects logistic-type models. Three research questions are addressed. First, I study whether and when L-GREG estimators are more accurate than GREG-lin. Theoretical results and Monte Carlo experiments which cover both equal and unequal probability sampling designs and a wide variety of model formulations show that in standard situations, the difference between L-GREG and GREG-lin is small. But in the case of a strong assisting model, two interesting situations arise: if the domain sample size is reasonably large, L-GREG is more accurate than GREG-lin, and if the domain sample size is very small, estimation of assisting model parameters may be inaccurate, resulting in bias for L-GREG. Second, I study variance estimation for the L-GREG estimators. The standard variance estimator (S) for all GREG estimators resembles the Sen-Yates-Grundy variance estimator, but it is a double sum of prediction errors, not of the observed values of the study variable. Monte Carlo experiments show that S underestimates the variance of L-GREG especially if the domain sample size is minor, or if the assisting model is strong. Third, since the standard variance estimator S often fails for the L-GREG estimators, I propose a new augmented variance estimator (A). The difference between S and the new estimator A is that the latter takes into account the difference between the sample fit model and the census fit model. In Monte Carlo experiments, the new estimator A outperformed the standard estimator S in terms of bias, root mean square error and coverage rate. Thus the new estimator provides a good alternative to the standard estimator.
Resumo:
Hormone therapy (HT) is widely used to relieve climacteric symptoms in order to increase the well-being of the women. The benefits as well as side-effects of HT are well documented. The principal menopausal oral symptoms are dry mouth (DM) and sensation of painful mouth (PM) due to various causes. Profile studies have indicated that HT users are more health-conscious than non-users. The hypothesis of the present study was that there are differences in oral health between woman using HT and those not using HT. A questionnaire study of 3173 women of menopausal age (50-58 years old) was done to investigate the prevalence of self-assessed sensations of PM and DM. Of those women participating in the questionnaire study, a random sample of 400 (200 using, 200 not using HT) was examined clinically in a 2-year follow-up study. Oral status was recorded according to WHO methods using DMFT and CPITN indices. The saliva flows were measured, salivary total protein, albumin and immunoglobulin concentrations and selected periodontal micro-organisms were analysed, and panoramic tomography of the jaws was taken. The patients filled in a structured questionnaire on their systemic health, medication and health habits. According to our questionnaire study there was no significant difference in the occurrence of self- assessed PM or DM between the HT users and non-users. According to logistic regression analyses, climacteric complaints significantly correlated with the occurrence of PM (p=0.000) and DM (p=0.000) irrespective of the use of HT, indicating that PM and DM are associated with climacteric symptoms in general. There was no difference between the groups in DMFT index values at follow up. The number of filled teeth (FT) showed a significant (p<0.05) increase in the HT group at follow-up. Periodontitis was diagnosed in 79% of HT users at baseline and in 71% at the follow-up. The values for non-HT users were 80% vs. 76%, respectively (Ns.). The mean numbers of ≥ 6 mm deep periodontal pockets were 0.9 ± 1.7 at baseline vs. 1.1 ± 2.1 two years later in the HT group, and 1.0 ± 1.7 vs. 1.2 ± 1.9, respectively, in the non-HT group. In a large Finnish national health survey, the prevalence of peridontitis of women of this age group was lower, but the prevalence of severe periodontitis seemed to be higher than in our study. Salivary albumin, IgG and IgM concentrations decreased in the HT group during the 2-year follow up (p<0.05), possibly indicating an improvement in epithelial integrity. No difference was found in any other salivary parameters or in the prevalence of the periodontal bacteria between or within the groups. In conclusion, the present findings showed that 50 to 58 year old women living in Helsinki have fairly good oral and dental health. The occurrence of PM and DM seemed to be associated with climacteric symptoms in general, and the use of HT did not affect the oral symptoms studied.
Resumo:
Detecting Earnings Management Using Neural Networks. Trying to balance between relevant and reliable accounting data, generally accepted accounting principles (GAAP) allow, to some extent, the company management to use their judgment and to make subjective assessments when preparing financial statements. The opportunistic use of the discretion in financial reporting is called earnings management. There have been a considerable number of suggestions of methods for detecting accrual based earnings management. A majority of these methods are based on linear regression. The problem with using linear regression is that a linear relationship between the dependent variable and the independent variables must be assumed. However, previous research has shown that the relationship between accruals and some of the explanatory variables, such as company performance, is non-linear. An alternative to linear regression, which can handle non-linear relationships, is neural networks. The type of neural network used in this study is the feed-forward back-propagation neural network. Three neural network-based models are compared with four commonly used linear regression-based earnings management detection models. All seven models are based on the earnings management detection model presented by Jones (1991). The performance of the models is assessed in three steps. First, a random data set of companies is used. Second, the discretionary accruals from the random data set are ranked according to six different variables. The discretionary accruals in the highest and lowest quartiles for these six variables are then compared. Third, a data set containing simulated earnings management is used. Both expense and revenue manipulation ranging between -5% and 5% of lagged total assets is simulated. Furthermore, two neural network-based models and two linear regression-based models are used with a data set containing financial statement data from 110 failed companies. Overall, the results show that the linear regression-based models, except for the model using a piecewise linear approach, produce biased estimates of discretionary accruals. The neural network-based model with the original Jones model variables and the neural network-based model augmented with ROA as an independent variable, however, perform well in all three steps. Especially in the second step, where the highest and lowest quartiles of ranked discretionary accruals are examined, the neural network-based model augmented with ROA as an independent variable outperforms the other models.
Resumo:
Markov random fields (MRF) are popular in image processing applications to describe spatial dependencies between image units. Here, we take a look at the theory and the models of MRFs with an application to improve forest inventory estimates. Typically, autocorrelation between study units is a nuisance in statistical inference, but we take an advantage of the dependencies to smooth noisy measurements by borrowing information from the neighbouring units. We build a stochastic spatial model, which we estimate with a Markov chain Monte Carlo simulation method. The smooth values are validated against another data set increasing our confidence that the estimates are more accurate than the originals.
Resumo:
Light scattering, or scattering and absorption of electromagnetic waves, is an important tool in all remote-sensing observations. In astronomy, the light scattered or absorbed by a distant object can be the only source of information. In Solar-system studies, the light-scattering methods are employed when interpreting observations of atmosphereless bodies such as asteroids, atmospheres of planets, and cometary or interplanetary dust. Our Earth is constantly monitored from artificial satellites at different wavelengths. With remote sensing of Earth the light-scattering methods are not the only source of information: there is always the possibility to make in situ measurements. The satellite-based remote sensing is, however, superior in the sense of speed and coverage if only the scattered signal can be reliably interpreted. The optical properties of many industrial products play a key role in their quality. Especially for products such as paint and paper, the ability to obscure the background and to reflect light is of utmost importance. High-grade papers are evaluated based on their brightness, opacity, color, and gloss. In product development, there is a need for computer-based simulation methods that could predict the optical properties and, therefore, could be used in optimizing the quality while reducing the material costs. With paper, for instance, pilot experiments with an actual paper machine can be very time- and resource-consuming. The light-scattering methods presented in this thesis solve rigorously the interaction of light and material with wavelength-scale structures. These methods are computationally demanding, thus the speed and accuracy of the methods play a key role. Different implementations of the discrete-dipole approximation are compared in the thesis and the results provide practical guidelines in choosing a suitable code. In addition, a novel method is presented for the numerical computations of orientation-averaged light-scattering properties of a particle, and the method is compared against existing techniques. Simulation of light scattering for various targets and the possible problems arising from the finite size of the model target are discussed in the thesis. Scattering by single particles and small clusters is considered, as well as scattering in particulate media, and scattering in continuous media with porosity or surface roughness. Various techniques for modeling the scattering media are presented and the results are applied to optimizing the structure of paper. However, the same methods can be applied in light-scattering studies of Solar-system regoliths or cometary dust, or in any remote-sensing problem involving light scattering in random media with wavelength-scale structures.
Resumo:
Energiataseen mallinnus on osa KarjaKompassi-hankkeeseen liittyvää kehitystyötä. Tutkielman tavoitteena oli kehittää lypsylehmän energiatasetta etukäteen ennustavia ja tuotoskauden aikana saatavia tietoja hyödyntäviä matemaattisia malleja. Selittävinä muuttujina olivat dieetti-, rehu-, maitotuotos-, koelypsy-, elopaino- ja kuntoluokkatiedot. Tutkimuksen aineisto kerättiin 12 Suomessa tehdyistä 8 – 28 laktaatioviikon pituisesta ruokintakokeesta, jotka alkoivat heti poikimisen jälkeen. Mukana olleista 344 lypsylehmästä yksi neljäsosa oli friisiläis- ja loput ayshire-rotuisia. Vanhempien lehmien päätiedosto sisälsi 2647 havaintoa (koe * lehmä * laktaatioviikko) ja ensikoiden 1070. Aineisto käsiteltiin SAS-ohjelmiston Mixed-proseduuria käyttäen ja poikkeavat havainnot poistettiin Tukeyn menetelmällä. Korrelaatioanalyysillä tarkasteltiin energiataseen ja selittävien muuttujien välisiä yhteyksiä. Energiatase mallinnettiin regressioanalyysillä. Laktaatiopäivän vaikutusta energiataseeseen selitettiin viiden eri funktion avulla. Satunnaisena tekijänä mallissa oli lehmä kokeen sisällä. Mallin sopivuutta aineistoon tarkasteltiin jäännösvirheen, selitysasteen ja Bayesin informaatiokriteerin avulla. Parhaat mallit testattiin riippumattomassa aineistossa. Laktaatiopäivän vaikutusta energiataseeseen selitti hyvin Ali-Schaefferin funktio, jota käytettiin perusmallina. Kaikissa energiatasemalleissa vaihtelu kasvoi laktaatioviikosta 12. alkaen, kun havaintojen määrä väheni ja energiatase muuttui positiiviseksi. Ennen poikimista käytettävissä olevista muuttujista dieetin väkirehuosuus ja väkirehun syönti-indeksi paransivat selitysastetta ja pienensivät jäännösvirhettä. Ruokinnan onnistumista voidaan seurata maitotuotoksen, maidon rasvapitoisuuden ja rasva-valkuaissuhteen tai EKM:n sisältävillä malleilla. EKM:n vakiointi pienensi mallin jäännösvirhettä. Elopaino ja kuntoluokka olivat heikkoja selittäjiä. Malleja voidaan hyödyntää karjatason ruokinnan suunnittelussa ja seurannassa, mutta yksittäisen lehmän energiataseen ennustamiseen ne eivät sovellu.