50 results for penalized likelihood
in Helda - Digital Repository of University of Helsinki
Abstract:
We propose an efficient and parameter-free scoring criterion, the factorized conditional log-likelihood (f̂CLL), for learning Bayesian network classifiers. The proposed score is an approximation of the conditional log-likelihood criterion. The approximation is devised to guarantee decomposability over the network structure, as well as efficient estimation of the optimal parameters, achieving the same time and space complexity as the traditional log-likelihood scoring criterion. The resulting criterion has an information-theoretic interpretation based on interaction information, which exhibits its discriminative nature. To evaluate the performance of the proposed criterion, we present an empirical comparison with state-of-the-art classifiers. Results on a large suite of benchmark data sets from the UCI repository show that f̂CLL-trained classifiers achieve at least as good accuracy as the best compared classifiers, using significantly fewer computational resources.
Abstract:
Aptitude-based student selection: A study concerning the admission processes of some technically oriented healthcare degree programmes in Finland (Orthotics and Prosthetics, Dental Technology and Optometry). The data studied consisted of convenience samples of preadmission information and the results of the admission processes of three technically oriented healthcare degree programmes (Orthotics and Prosthetics, Dental Technology and Optometry) in Finland during the years 1977-1986 and 2003. The numbers of subjects tested and interviewed in the first samples were 191, 615 and 606, and in the second 67, 64 and 89, respectively. The questions of the six studies were: I. How were different kinds of preadmission data related to each other? II. Which were the major determinants of the admission decisions? III. Did the graduated students and those who dropped out differ from each other? IV. Was it possible to predict how well students would perform in the programmes? V. How was the student selection executed in the year 2003? VI. Should clinical vs. statistical prediction or both be used? (Some remarks are presented on Meehl's argument: "Always, we might as well face it, the shadow of the statistician hovers in the background; always the actuary will have the final word.") The main results of the study were as follows: Ability tests, dexterity tests and judgements of personality traits (communication skills, initiative, stress tolerance and motivation) provided unique, non-redundant information about the applicants. Available demographic variables did not bias the judgements of personality traits. In all three programme settings, four-factor solutions (personality, reasoning, gender-technical and age-vocational with factor scores) could be extracted by the Maximum Likelihood method with graphical Varimax rotation. The personality factor dominated the final aptitude judgements and very strongly affected the selection decisions.
There were no clear differences between graduated students and those who had dropped out in regard to the four factors. In addition, the factor scores did not predict how well the students performed in the programmes. Meehl's argument on the uncertainty of clinical prediction was supported by the results, which on the other hand did not provide any relevant data for rules on statistical prediction. No clear arguments for or against aptitude-based student selection were presented. However, the structure of the aptitude measures and their impact on the admission process are now better known. The concept of "personal aptitude" is not necessarily included in the values and preferences of those in charge of organizing the schooling. Thus, obviously the most well-founded and cost-effective way to execute student selection is to rely on e.g. the grade point averages of the matriculation examination and/or written entrance exams. This procedure, according to the present study, would result in a student group which has a quite different makeup (60%) from the group selected on the basis of aptitude tests. For the recruiting organizations, instead, "personal aptitude" may be a matter of great importance. The employers, of course, decide on personnel selection. The psychologists, if consulted, are responsible for the proper use of psychological measures.
Abstract:
In clinical settings impulsivity refers to a symptom of psychiatric disorder, but nonclinically oriented research treats impulsivity as a personality and temperament dimension. This prospective study examined whether impulsivity predicts adverse health-related behaviour and increased risk of health problems in a large, nonclinical sample of 5433 subjects working in 12 Finnish hospitals. The data were collected using two questionnaire surveys at a 2-year interval. After controlling for alcohol use at baseline, higher impulsivity predicted increased alcohol consumption at follow-up in both genders (p < .01) and was associated with increased likelihood of becoming a heavy drinker or taking up smoking (p < .05). Impulsivity also predicted an increased number of cigarettes smoked per day at follow-up among women (p < .001), but not among men, although adjustment for the number of cigarettes smoked at baseline attenuated these associations (p = .08 for women). In men, higher impulsivity was associated with shorter sleep duration and waking up several times per night independent of baseline characteristics (p < .01), whereas in women, higher impulsivity predicted difficulty in falling asleep and waking up feeling tired after the usual amount of sleep (p < .05). In women, these associations became nonsignificant after adjustment for pre-existing somatic and psychiatric diseases. Finally, higher impulsivity was associated with an increased 2-year incidence of physician-diagnosed peptic ulcer disease (adjusted odds ratio (OR) = 2.42, 95% confidence interval (CI) = 1.21 - 4.82) and onset of depression (OR = 1.95, 95% CI = 1.28 - 2.97) after adjustment for a variety of baseline covariates. In conclusion, this study shows that in a nonclinical population, impulsivity appears to be a risk factor for various unhealthy behaviours and health problems.
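As a reading aid for the odds ratios reported above: if the intervals are Wald-type intervals built on the log-odds scale (an assumption; the abstract does not say how they were computed), the point estimate is the geometric mean of the interval bounds, and the standard error of the log odds ratio can be backed out from the interval width. A minimal sketch follows; the function name `wald_summary` is hypothetical.

```python
import math


def wald_summary(ci_lower: float, ci_upper: float, z: float = 1.96):
    """Recover the implied OR and SE(log OR) from a 95% Wald interval.

    Assumes the interval is exp(log(OR) +/- z * SE), which is an
    assumption about the analysis, not something stated in the abstract.
    """
    log_lower = math.log(ci_lower)
    log_upper = math.log(ci_upper)
    # Midpoint on the log scale corresponds to the geometric mean of the bounds.
    implied_or = math.exp((log_lower + log_upper) / 2.0)
    # Half the interval width on the log scale equals z * SE.
    se_log_or = (log_upper - log_lower) / (2.0 * z)
    return implied_or, se_log_or


# Interval bounds copied from the abstract (peptic ulcer and depression).
ulcer_or, ulcer_se = wald_summary(1.21, 4.82)
depression_or, depression_se = wald_summary(1.28, 2.97)
print(ulcer_or, ulcer_se)
print(depression_or, depression_se)
```

Under that assumption the implied point estimates land close to the reported 2.42 and 1.95, which is a quick consistency check on the published figures rather than a reanalysis.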
Abstract:
In line with major demographic changes in other Northern European and North American countries and Australia, being nonmarried is becoming increasingly common in Finland, and the proportion of cohabiters and of persons living alone has grown in recent decades. Official marital status no longer reflects an individual's living arrangement, as single, divorced and widowed persons may live alone, with a partner, with children, with parents, with siblings, or with unrelated persons. Thus, more than official marital status, living arrangements may be a stronger discriminator of one's social bonds and health. The general purpose of this study was to deepen our current understanding of the magnitude, trends, and determinants of ill health by living arrangements in the Finnish working-age population. Distinct measures of different dimensions of poor health, as well as an array of associated factors, provided a comprehensive picture of health differences by living arrangements and helped to assess the role of other factors in the interpretation of these differences. Mortality analyses were based on Finnish census records at the end of 1995 linked with cause-of-death registers for 1996–2000. The data included all persons aged 30 and over. Morbidity analyses were based on two comparable cross-sectional studies conducted twenty years apart (the Mini-Finland Survey in 1978–80 and the Health 2000 Survey in 2000–01). Both surveys were based on nationally representative samples of Finns aged 30 and over, and benefited from high participation rates. With the exception of mortality analyses, this study focused on health differences among the working-age population (mortality in age groups 30-64 and 65 and over, self-rated health and mental health in the age group 30-64, and unhealthy alcohol use in the age group 30-54).
Compared with all nonmarried groups, married men and women exhibited the best health in terms of mortality, self-rated health, mental health and unhealthy alcohol use. Cohabiters did not differ from married persons in terms of self-rated health or mental health, but did exhibit excess unhealthy alcohol use and high mortality, particularly from alcohol-related causes. Compared with the married, persons living alone or with someone other than a partner exhibited elevated mortality as well as excess poor mental health and unhealthy alcohol use. By all measures of health, men and women living alone tended to be in the worst position. Over the past twenty years, self-rated health had improved least among single men and women and widowed women, and most among cohabiting women. The association between living arrangements and health has many possible explanations. The health-related selection theory suggests that healthy people are more likely to enter and maintain a marriage or a consensual union than those who are unhealthy (direct selection) or that a variety of health-damaging behavioural and social factors increase the likelihood of ill health and the probability of remaining without a partner or becoming separated from one's partner (indirect selection). According to the social causation theory, marriage or cohabitation has a health-promoting effect, whereas living alone or with others than a partner has a detrimental effect on health. In this study, the role of other factors that are mainly assumed to reflect selection appeared to be rather modest. Social support, which reflects social causation, contributed only modestly to differences in unhealthy alcohol use by living arrangements, but had a larger effect on differences in poor mental health.
Socioeconomic factors and health-related behaviour, which reflect both selection and causation, appeared to play a more important role in the excess poor health of cohabiters and of persons living alone or with someone other than a partner, than of married persons. Living arrangements were strongly connected to various dimensions of ill health. In particular, alcohol consumption appeared to be of great importance in the association between living arrangements and health. To the extent that the proportion of nonmarried persons continues to grow and their health does not improve at the same rate as that of married persons, the challenges that currently nonmarried persons pose to public health will likely increase.
Abstract:
Prescribing for older patients is challenging. The prevalence of diseases increases with advancing age and causes extensive drug use. Impairments in cognitive, sensory, social and physical functioning, multimorbidity and comorbidities, as well as age-related changes in pharmacokinetics and pharmacodynamics all add to the complexity of prescribing. This study is a cross-sectional assessment of all long-term residents aged ≥ 65 years in all nursing homes in Helsinki, Finland. The residents’ health status was assessed and data on their demographic factors, health and medications were collected from their medical records in February 2003. This study assesses some essential issues in prescribing for older people: psychotropic drugs (Paper I), laxatives (Paper II), vitamin D and calcium supplements (Paper III), potentially inappropriate drugs for older adults (PIDs) and drug-drug interactions (DDIs)(Paper IV), as well as prescribing in public and private nursing homes. A resident was classified as a medication user if his or her medication record indicated a regular sequence for its dosage. Others were classified as non-users. Mini Nutritional Assessment (MNA) was used to assess residents’ nutritional status, Beers 2003 criteria to assess the use of PIDs, and the Swedish, Finnish, INteraction X-referencing database (SFINX) to evaluate their exposure to DDIs. Of all nursing home residents in Helsinki, 82% (n=1987) participated in studies I, II, and IV and 87% (n=2114) participated in the study III. The residents’ mean age was 84 years, 81% were female, and 70% were diagnosed with dementia. The mean number of drugs was 7.9 per resident; 40% of the residents used ≥ 9 drugs per day, and were thus exposed to polypharmacy. Eighty percent of the residents received psychotropics; 43% received antipsychotics, and 45% used antidepressants. Anxiolytics were prescribed to 26%, and hypnotics to 28% of the residents. 
Of those residents diagnosed with dementia, 11% received antidementia drugs. Fifty five percent of the residents used laxatives regularly. In multivariate analysis, those factors associated with regular laxative use were advanced age, immobility, poor nutritional status, chewing problems, Parkinson’s disease, and a high number of drugs. Eating snacks between meals was associated with lower risk for laxative use. Of all participants, 33% received vitamin D supplementation, 28% received calcium supplementation, and 20% received both vitamin D and calcium. The dosage of vitamin D was rather low: 21% received vitamin D 400 IU (10 µg) or more, and only 4% received 800 IU (20 µg) or more. In multivariate analysis, residents who received vitamin D supplementation enjoyed better nutritional status, ate snacks between meals, suffered no constipation, and received regular weight monitoring. Those residents receiving PIDs (34% of all residents) more often used psychotropic medication and were more often exposed to polypharmacy than residents receiving no PIDs. Residents receiving PIDs were less often diagnosed with dementia than were residents receiving no PIDs. The three most prevalent PIDs were short-acting benzodiazepine in greater dosages than recommended, hydroxyzine, and nitrofurantoin. These three drugs accounted for nearly 77% of all PID use. Of all residents, less than 5% were susceptible to a clinically significant DDI. The most common DDIs were related to the use of potassium-sparing diuretics, carbamazepine, and codeine. Residents exposed to potential DDIs were younger, had more often suffered a previous stroke, more often used psychotropics, and were more often exposed to PIDs and polypharmacy than were residents not exposed to DDIs. Residents in private nursing homes were less often exposed to polypharmacy than were residents in public nursing homes. Long-term residents in nursing homes in Helsinki use, on average, nearly eight drugs daily. 
The use of psychotropic drugs in our study was notably more common than in international studies. The prevalence of laxative use was similar to that reported in prior international studies. Despite the known benefit and recommendation of vitamin D supplementation for elderly people residing mostly indoors, the proportion of nursing home residents receiving vitamin D and calcium was surprisingly low. The use of PIDs was common among nursing home residents. PIDs increased the likelihood of DDIs. However, DDIs did not seem to be a major concern among the nursing home population. Monitoring PIDs and potential drug interactions could improve the quality of prescribing.
Abstract:
In genetic epidemiology, population-based disease registries are commonly used to collect genotype or other risk factor information concerning affected subjects and their relatives. This work presents two new approaches for the statistical inference of ascertained data: conditional and full likelihood approaches for a disease with a variable age-at-onset phenotype, using familial data obtained from a population-based registry of incident cases. The aim is to obtain statistically reliable estimates of the general population parameters. The statistical analysis of familial data with variable age at onset becomes more complicated when some of the study subjects are non-susceptible, that is to say, these subjects never get the disease. A statistical model for a variable age at onset with long-term survivors is proposed for studies of familial aggregation, using a latent variable approach, as well as for prospective genetic association studies with candidate genes. In addition, we explore the possibility of a genetic explanation of the observed increase in the incidence of Type 1 diabetes (T1D) in Finland in recent decades and the hypothesis of non-Mendelian transmission of T1D-associated genes. Both classical and Bayesian statistical inference were used in the modelling and estimation. Although this work contains five studies with different statistical models, they all concern data obtained from nationwide registries of T1D and the genetics of T1D. In the analyses of T1D data, non-Mendelian transmission of T1D susceptibility alleles was not observed. In addition, non-Mendelian transmission of T1D susceptibility genes did not provide a plausible explanation for the increase in T1D incidence in Finland. Instead, the Human Leucocyte Antigen associations with T1D were confirmed in the population-based analysis, which combines T1D registry information, a reference sample of healthy subjects and birth cohort information of the Finnish population.
Finally, a substantial familial variation in the susceptibility of T1D nephropathy was observed. The presented studies show the benefits of sophisticated statistical modelling to explore risk factors for complex diseases.
Abstract:
Sepsis is associated with a systemic inflammatory response. It is characterised by an early proinflammatory response followed by a state of immunosuppression. In order to improve the outcome of patients with infection and sepsis, novel therapies that influence the systemic inflammatory response are being developed and utilised. Thus, an accurate and early diagnosis of infection and evaluation of immune state are crucial. In this thesis, various markers of systemic inflammation were studied with respect to enhancing the diagnostics of infection and predicting outcome in patients with suspected community-acquired infection. A total of 1092 acutely ill patients admitted to a university hospital medical emergency department were evaluated, and 531 patients with a suspicion of community-acquired infection were included in the analysis. Markers of systemic inflammation were determined from a blood sample obtained simultaneously with a blood culture sample on admission to hospital. Levels of phagocyte CD11b/CD18 and CD14 expression were measured by whole blood flow cytometry. Concentrations of soluble CD14, interleukin (IL)-8, and soluble IL-2 receptor α (sIL-2Rα) were determined by ELISA, those of sIL-2R, IL-6, and IL-8 by a chemiluminescent immunoassay, that of procalcitonin by immunoluminometric assay, and that of C-reactive protein by immunoturbidimetric assay. Clinical data were collected retrospectively from the medical records. None of the markers of systemic inflammation (CRP, PCT, IL-6, IL-8 or sIL-2R) predicted bacteraemia better than did the clinical signs of infection, i.e., the presence of infectious focus or fever or both. IL-6 and PCT had the highest positive likelihood ratios to identify patients with hidden community-acquired infection. However, the use of a single marker failed to detect all patients with infection.
A combination of markers including a fast-responding reactant (CD11b expression), a later-peaking reactant (CRP), and a reactant originating from inflamed tissues (IL-8) detected all patients with infection. The majority of patients (86.5%) with possible but not verified infection showed levels exceeding at least one cut-off limit of combination, supporting the view that infection was the cause of their acute illness. The 28-day mortality of patients with community-acquired infection was low (3.4%). On admission to hospital, the low expression of cell-associated lipopolysaccharide receptor CD14 (mCD14) was predictive for 28-day mortality. In the patients with severe forms of community-acquired infection, namely pneumonia and sepsis, high levels of soluble CD14 alone did not predict mortality, but a high sCD14 level measured simultaneously with a low mCD14 raised the possibility of poor prognosis. In conclusion, to further enhance the diagnostics of hidden community-acquired infection, a combination of inflammatory markers is useful; 28-day mortality is associated with low levels of mCD14 expression at an early phase of the disease.
Abstract:
Claw disorders are a growing problem on dairy farms. In addition to premature culling, claw and leg defects cause economic losses by lowering milk yield and increasing veterinary and hoof-trimming costs. The aim of this work was to study the heritability of claw disorders and the factors affecting them. The data were obtained from the Terveet Sorkat (Healthy Hooves) programme, in which participation is voluntary. Hoof trimmers had classified the claw disorders in 2003–2004. The claw disorders (sole haemorrhage, chronic laminitis, white line separation, sole ulcer, interdigital dermatitis, heel horn erosion, digital dermatitis, corkscrew claw and other claw disorders) were recorded in the data as binary (yes/no) traits. The WSYS software was used for data preprocessing, preliminary analyses and testing the statistical significance of fixed effects with the F-test. In addition, the significance of the fixed effects was tested with a logit model in SAS. Variance components were estimated by the Restricted Maximum Likelihood (REML) method using the VCE4 software. A repeatability animal model gave the following heritability estimates: sole haemorrhage 0.05, white line separation 0.04, corkscrew claw 0.05, heel horn erosion 0.01, sole ulcer 0.03 and claw disorders as a single trait 0.06. Converted to heritabilities of liability to claw disorders, the estimates were: sole haemorrhage 0.11, white line separation 0.12, corkscrew claw 0.15, heel horn erosion 0.03, sole ulcer 0.17 and claw disorders as a single trait 0.09. The genetic correlations between claw disorders were positive, except for the correlation between white line separation and heel horn erosion, which was slightly negative. The genetic correlations between claw disorders and 305-day milk yield ranged from -0.20 to 0.27. Based on this study and earlier studies, the genetic contribution to claw disorders is not very large.
Because environmental factors have a large influence on the occurrence of claw disorders, their prevention should pay particular attention to barn conditions, regular hoof trimming and correct feeding.
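The abstract above reports heritabilities of binary (yes/no) traits and their counterparts on the liability scale, but does not say which transformation was used. One widely used option is the Dempster-Lerner formula, h²(liability) = h²(observed) · p(1 − p) / z², where p is the trait prevalence and z is the standard normal density at the threshold that cuts off a fraction p. The sketch below assumes that formula; the function name and the prevalence values are hypothetical, since the abstract gives no prevalences.

```python
from statistics import NormalDist


def liability_heritability(h2_observed: float, prevalence: float) -> float:
    """Dempster-Lerner conversion from the observed 0/1 scale to the liability scale.

    Assumption: this is an illustrative choice of transformation, not
    necessarily the one used in the thesis.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Liability threshold above which an animal is recorded as affected.
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    # Height of the standard normal density at that threshold.
    z = std_normal.pdf(threshold)
    return h2_observed * prevalence * (1.0 - prevalence) / z ** 2


# Hypothetical prevalences, chosen only to show how the conversion behaves.
h2_common = liability_heritability(0.05, prevalence=0.50)
h2_rare = liability_heritability(0.05, prevalence=0.10)
print(h2_common, h2_rare)
```

The conversion factor grows as the trait becomes rarer, which is why a small observed-scale estimate can correspond to a noticeably larger liability-scale estimate.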
Abstract:
This study analysed whether the land tenure insecurity problem has led to a decline in long-term land improvements (liming and phosphorus fertilization) under the Common Agricultural Policy (CAP) and Nordic production conditions in European Union (EU) countries such as Finland. The results suggest that under traditional cash lease contracts, which are encouraged by the existing land leasing regulations and agricultural subsidy programs, the land tenure insecurity problem on leased land reduces land improvements that have a long pay-back period. In particular, soil pH was found to be significantly lower on land cultivated under a lease contract compared to land owned by the farmers themselves. The results also indicate that land improvements could not be reversed by land markets, because land owners would otherwise have carried out land improvements even if not farming by themselves. To reveal the causality between land tenure and land improvements, the dynamic optimisation problem was solved by a stochastic dynamic programming routine with known parameters for one-period returns and transition equations. The model parameters represented Finnish soil quality and production conditions. The decision rules were solved for alternative likelihood scenarios over the continuation of the fixed-term lease contract. The results suggest that as the probability of non-renewal of the lease contract increases, farmers quickly reduce investments in irreversible land improvements and, thereafter, yields gradually decline. The simulations highlighted the observed trends of a decline in land improvements on land parcels that are cultivated under lease contracts. Land tenure has resulted in the neglect of land improvement in Finland. This study aimed to analyze whether these challenges could be resolved by a tax policy that encourages land sales.
Using Finnish data, real estate tax and a temporal relaxation on the taxation of capital gains showed some potential for the restructuring of land ownership. Potential sellers who could not be revealed by traditional logit models were identified with the latent class approach. Those landowners with an intention to sell even without a policy change were sensitive to temporal relaxation in the taxation of capital gains. In the long term, productivity and especially productivity growth are necessary conditions for the survival of farms and the food industry in Finland. Technical progress was found to drive the increase in productivity. The scale had only a moderate effect and for the whole study period (1976–2006) the effect was close to zero. Total factor productivity (TFP) increased, depending on the model, by 0.6–1.7% per year. The results demonstrated that the increase in productivity was hindered by the policy changes introduced in 1995. It is also evidenced that the increase in land leasing is connected to these policy changes. Land institutions and land tenure questions are essential in agricultural and rural policies on all levels, from local to international. Land ownership and land titles are commonly tied to fundamental political, economic and social questions. A fair resolution calls for innovative and new solutions both on national and international levels. However, this seems to be a problem when considering the application of EU regulations to member states inheriting divergent landownership structures and farming cultures. The contribution of this study is in describing the consequences of fitting EU agricultural policy to Finnish agricultural land tenure conditions and heritage.
Abstract:
In the past decade, the Finnish agricultural sector has undergone rapid structural changes. The number of farms has decreased and the average farm size has increased, while the number of farms transferred to new entrants has decreased. Part of the structural change in agriculture is manifested in early retirement programmes. In studying farmers' exit behaviour in different countries, institutional differences, incentive programmes and constraints are found to matter. In Finland, farmers' early retirement programmes were first introduced in 1974 and, during the last ten years, they have been carried out within the European Union framework for these programmes. The early retirement benefits are farmer specific and depend on the level of pension insurance the farmer has paid over his active farming years. In order to predict the future development of the agricultural sector, farmers have been frequently asked about their future plans and their plans for succession. However, the plans the farmers made for succession have been found to be time inconsistent. This study estimates the value of farmers' stated succession plans in predicting revealed succession decisions. A stated succession plan exists when a farmer answers in a survey questionnaire that the farm is going to be transferred to a new entrant within a five-year period. The succession is revealed when the farm is transferred to a successor. Stated and revealed behaviour was estimated as a recursive Binomial Probit Model, which accounts for the censoring of the decision variables and controls for a potential correlation between the two equations. The results suggest that the succession plans, as stated by elderly farmers in the questionnaires, do not provide information that is significant and valuable in predicting true, completed successions. Therefore, farmer exit should be analysed based on observed behaviour rather than on stated plans and intentions.
As farm retirement plays a crucial role in determining the characteristics of structural change in agriculture, it is important to establish the factors which determine an exit from farming among elderly farmers and how off-farm income and income losses affect their exit choices. In this study, the observed choice of pension scheme by elderly farmers was analysed by a bivariate probit model. Despite some variations in significance and the effects of each factor, the ages of the farmer and spouse, the age and number of potential successors, farm size, income loss when retiring and the location of the farm together with the production line were found to be the most important determinants of early retirement and the transfer or closure of farms. Recently, the labour status of the spouse has been found to contribute significantly to individual retirement decisions. In this study, the effect of spousal retirement and economic incentives related to the timing of a farming couple's early retirement decision were analysed with a duration model. The results suggest that an expected pension in particular advances farm transfers. It was found that on farms operated by a couple, both early retirement and farm succession took place more often than on farms operated by a single person. However, the existence of a spouse delayed the timing of early retirement. Farming couples were found to co-ordinate their early retirement decisions when they both exit through agricultural retirement programmes, but such a co-ordination did not exist when one of the spouses retired under other pension schemes. Besides changes in the agricultural structure, the share and amount of off-farm income of a farm family's total income has also increased. In the study, the effect of off-farm income on farmers' retirement decisions, in addition to other financial factors, was analysed.
The unknown parameters were first estimated by a switching-type multivariate probit model and then by the simulated maximum likelihood (SML) method, controlling for farmer specific fixed effects and serial correlation of the errors. The results suggest that elderly farmers' off-farm income is a significant determinant in a farmer's choice to exit and close down the farm. However, off-farm income only has a short term effect on structural changes in agriculture since it does not significantly contribute to the timing of farm successions.
Abstract:
The Taita Hills in southeastern Kenya form the northernmost part of Africa’s Eastern Arc Mountains, which have been identified by Conservation International as one of the top ten biodiversity hotspots on Earth. As with many areas of the developing world, over recent decades the Taita Hills have experienced significant population growth leading to associated major changes in land use and land cover (LULC), as well as escalating land degradation, particularly soil erosion. Multi-temporal medium resolution multispectral optical satellite data, such as imagery from the SPOT HRV, HRVIR, and HRG sensors, provides a valuable source of information for environmental monitoring and modelling at a landscape level at local and regional scales. However, utilization of multi-temporal SPOT data in quantitative remote sensing studies requires the removal of atmospheric effects and the derivation of surface reflectance factor. Furthermore, for areas of rugged terrain, such as the Taita Hills, topographic correction is necessary to derive comparable reflectance throughout a SPOT scene. Reliable monitoring of LULC change over time and modelling of land degradation and human population distribution and abundance are of crucial importance to sustainable development, natural resource management, biodiversity conservation, and understanding and mitigating climate change and its impacts. The main purpose of this thesis was to develop and validate enhanced processing of SPOT satellite imagery for use in environmental monitoring and modelling at a landscape level, in regions of the developing world with limited ancillary data availability. 
The Taita Hills formed the application study site, whilst the Helsinki metropolitan region was used as a control site for validation and assessment of the applied atmospheric correction techniques, where multiangular reflectance field measurements were taken and where horizontal visibility meteorological data concurrent with image acquisition were available. The proposed historical empirical line method (HELM) for absolute atmospheric correction was found to be the only applied technique that could derive surface reflectance factor within an RMSE of < 0.02 ρs in the SPOT visible and near-infrared bands, an accuracy level identified as a benchmark for successful atmospheric correction. A multi-scale segmentation/object relationship modelling (MSS/ORM) approach was applied to map LULC in the Taita Hills from the multi-temporal SPOT imagery. This object-based procedure was shown to yield significant improvements over a uni-scale maximum-likelihood technique. The derived LULC data was used in combination with low-cost GIS geospatial layers describing elevation, rainfall and soil type to model degradation in the Taita Hills in the form of potential soil loss, utilizing the simple universal soil loss equation (USLE). Furthermore, human population distribution and abundance were modelled with satisfactory results using only SPOT- and GIS-derived data and non-Gaussian predictive modelling techniques. The SPOT-derived LULC data was found to be unnecessary as a predictor because the first- and second-order image texture measurements had greater power to explain variation in dwelling unit occurrence and abundance. The ability of the procedures to be implemented locally in the developing world using low-cost or freely available data and software was considered. The techniques discussed in this thesis are considered equally applicable to other medium- and high-resolution optical satellite imagery, as well as the utilized SPOT data.
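As a rough illustration of the USLE step mentioned above, the following sketch multiplies the standard USLE factors cell by cell. It is not the thesis's implementation: the function names, the way the factors would be derived from the elevation, rainfall, soil and LULC layers, and all sample values are assumptions made here for illustration only.

```python
from typing import Sequence


def usle_soil_loss(r: float, k: float, ls: float, c: float, p: float = 1.0) -> float:
    """Potential soil loss A = R * K * LS * C * P for a single grid cell.

    r  : rainfall erosivity factor (e.g. derived from a rainfall layer)
    k  : soil erodibility factor (e.g. derived from a soil type layer)
    ls : slope length and steepness factor (e.g. derived from elevation)
    c  : cover-management factor (e.g. assigned per LULC class)
    p  : support practice factor (1.0 means no conservation practice)
    """
    for name, value in (("r", r), ("k", k), ("ls", ls), ("c", c), ("p", p)):
        if value < 0:
            raise ValueError(f"USLE factor {name} must be non-negative, got {value}")
    return r * k * ls * c * p


def usle_grid(r: Sequence[float], k: Sequence[float], ls: Sequence[float],
              c: Sequence[float]) -> list:
    """Apply the USLE to co-registered layers flattened into equal-length lists."""
    if not (len(r) == len(k) == len(ls) == len(c)):
        raise ValueError("all factor layers must have the same number of cells")
    return [usle_soil_loss(*cell) for cell in zip(r, k, ls, c)]


if __name__ == "__main__":
    # Hypothetical factor values for three cells; not data from the thesis.
    losses = usle_grid(
        r=[100.0, 100.0, 120.0],
        k=[0.5, 0.25, 0.5],
        ls=[2.0, 4.0, 1.0],
        c=[0.1, 0.5, 0.0],
    )
    print(losses)
```

Because the model is a plain product, a cell with a cover factor of zero (full protective cover) has zero potential soil loss regardless of the other layers, which is why the LULC map matters so much as an input.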
Abstract:
Whether a statistician wants to complement a probability model for observed data with a prior distribution and carry out fully probabilistic inference, or base the inference only on the likelihood function, may be a fundamental question in theory, but in practice it may well be of less importance if the likelihood contains much more information than the prior. Maximum likelihood inference can be justified as a Gaussian approximation at the posterior mode, using flat priors. However, in situations where parametric assumptions in standard statistical models would be too rigid, a more flexible model formulation, combined with fully probabilistic inference, can be achieved using hierarchical Bayesian parametrization. This work includes five articles, all of which apply probability modeling to various problems involving incomplete observation. Three of the papers apply maximum likelihood estimation and two of them hierarchical Bayesian modeling. Because maximum likelihood may be presented as a special case of Bayesian inference, but not the other way round, in the introductory part of this work we present a framework for probability-based inference using only Bayesian concepts. We also re-derive some results presented in the original articles using the tools developed here, to show that they are also justifiable under this more general framework. Here the assumption of exchangeability and de Finetti's representation theorem are applied repeatedly to justify the use of standard parametric probability models with conditionally independent likelihood contributions. It is argued that this same reasoning can also be applied under sampling from a finite population. The main emphasis here is on probability-based inference under incomplete observation due to study design. This is illustrated using a generic two-phase cohort sampling design as an example.
The alternative approaches presented for analysis of such a design are full likelihood, which utilizes all observed information, and conditional likelihood, which is restricted to a completely observed set, conditioning on the rule that generated that set. Conditional likelihood inference is also applied for a joint analysis of prevalence and incidence data, a situation subject to both left censoring and left truncation. Other topics covered are model uncertainty and causal inference using posterior predictive distributions. We formulate a non-parametric monotonic regression model for one or more covariates and a Bayesian estimation procedure, and apply the model in the context of optimal sequential treatment regimes, demonstrating that inference based on posterior predictive distributions is feasible also in this case.
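The claim in this abstract that maximum likelihood can be read as a Gaussian approximation at the posterior mode under flat priors can be sketched as follows. The notation (parameter \theta, data y, observed information J) is introduced here for illustration and is not taken from the thesis; the usual regularity conditions are assumed.

```latex
% Flat prior: the posterior is proportional to the likelihood,
% so the posterior mode coincides with the maximum likelihood estimate.
p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta), \qquad
p(\theta) \propto 1
\;\Longrightarrow\;
\hat{\theta} = \arg\max_{\theta} p(\theta \mid y) = \arg\max_{\theta} p(y \mid \theta).

% Second-order Taylor expansion of the log posterior around the mode
% (the first-order term vanishes at \hat{\theta}):
\log p(\theta \mid y) \approx \log p(\hat{\theta} \mid y)
  - \tfrac{1}{2} (\theta - \hat{\theta})^{\top} J(\hat{\theta}) (\theta - \hat{\theta}),
\qquad
J(\hat{\theta}) = - \left. \frac{\partial^{2} \log p(y \mid \theta)}
  {\partial \theta \, \partial \theta^{\top}} \right|_{\theta = \hat{\theta}}.

% Exponentiating gives the Gaussian approximation:
\theta \mid y \;\overset{\text{approx.}}{\sim}\;
\mathcal{N}\!\left( \hat{\theta},\; J(\hat{\theta})^{-1} \right).
```

Read this way, the maximum likelihood estimate and the inverse observed information play the roles of an approximate posterior mean and covariance, which is the sense in which maximum likelihood appears as a special case of Bayesian inference.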
Abstract:
The focus of this study is on the statistical analysis of categorical responses where the response values are dependent on each other. The most typical example of this kind of dependence arises when repeated responses have been obtained from the same study unit. For example, in Paper I, the response of interest is pneumococcal nasopharyngeal carriage (yes/no) in 329 children. For each child, carriage is measured nine times during the first 18 months of life, and thus the repeated responses on each child cannot be assumed independent of each other. In the case of the above example, the interest typically lies in the carriage prevalence and in whether different risk factors affect the prevalence. Regression analysis is the established method for studying the effects of risk factors. In order to make correct inferences from the regression model, the associations between repeated responses need to be taken into account. The analysis of repeated categorical responses typically focuses on regression modelling. However, further insights can also be gained by investigating the structure of the association. The central theme of this study is the development of joint regression and association models. The analysis of repeated, or otherwise clustered, categorical responses is computationally difficult. Likelihood-based inference is often feasible only when the number of repeated responses for each study unit is small. In Paper IV, an algorithm is presented that substantially facilitates maximum likelihood fitting, especially when the number of repeated responses increases. In addition, a notable result arising from this work is the freely available software for likelihood-based estimation of clustered categorical responses.
Abstract:
Minimum Description Length (MDL) is an information-theoretic principle that can be used for model selection and other statistical inference tasks. There are various ways to use the principle in practice. One theoretically valid way is to use the normalized maximum likelihood (NML) criterion. Due to computational difficulties, this approach has not been used very often. This thesis presents efficient floating-point algorithms that make it possible to compute the NML for multinomial, Naive Bayes and Bayesian forest models. None of the presented algorithms rely on asymptotic analysis and with the first two model classes we also discuss how to compute exact rational number solutions.
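To make the multinomial case concrete, here is a minimal sketch of computing the NML normalizing term in floating point. It uses a linear-time recurrence known from the NML literature, C(K+2, n) = C(K+1, n) + (n/K) C(K, n); whether this matches the algorithms presented in the thesis is an assumption, and the function names are invented for this example.

```python
import math
from typing import Sequence


def _xlogy(x: float, y: float) -> float:
    """Return x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def multinomial_nml_normalizer(num_categories: int, n: int) -> float:
    """Normalizing term C(K, n) of the multinomial NML distribution.

    C(1, n) = 1, C(2, n) is a direct sum over binomial counts, and larger K
    follow from the recurrence C(K+2, n) = C(K+1, n) + (n / K) * C(K, n).
    """
    if num_categories < 1 or n < 0:
        raise ValueError("need num_categories >= 1 and n >= 0")
    if n == 0 or num_categories == 1:
        return 1.0
    c_prev = 1.0  # C(1, n)
    c_curr = 0.0  # accumulates C(2, n)
    for h in range(n + 1):
        # log of binom(n, h) * (h/n)^h * ((n-h)/n)^(n-h), computed in log space
        log_term = (math.lgamma(n + 1) - math.lgamma(h + 1) - math.lgamma(n - h + 1)
                    + _xlogy(h, h / n) + _xlogy(n - h, (n - h) / n))
        c_curr += math.exp(log_term)
    for k in range(1, num_categories - 1):
        # Step from (C(k, n), C(k+1, n)) to (C(k+1, n), C(k+2, n)).
        c_prev, c_curr = c_curr, c_curr + (n / k) * c_prev
    return c_curr


def nml_code_length(counts: Sequence[int]) -> float:
    """NML code length (in nats) of data summarized by category counts."""
    n = sum(counts)
    if n == 0:
        return 0.0
    log_max_likelihood = sum(_xlogy(h, h / n) for h in counts)
    return -log_max_likelihood + math.log(multinomial_nml_normalizer(len(counts), n))


if __name__ == "__main__":
    print(multinomial_nml_normalizer(3, 2))
    print(nml_code_length([5, 3, 2]))
```

The direct sum for two categories costs O(n) and each recurrence step costs O(1), so the whole computation is O(n + K). For model selection, the candidate with the shortest NML code length would be preferred.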