4 results for seemingly unrelated regression

in Helda - Digital Repository of University of Helsinki


Relevance:

100.00%

Publisher:

Abstract:

This thesis aims to improve models for predicting forest stand structure for practical use, e.g. forest management planning (FMP) in Finland. Comparisons were made between the Weibull and Johnson's SB distributions and between alternative regression estimation methods. Data used for preliminary studies were local, but the final models were based on representative data. Models were validated mainly in terms of bias and RMSE in the main stand characteristics (e.g. volume) using independent data. The bivariate SBB distribution model was used to mimic realistic variations in tree dimensions by including within-diameter-class height variation. The traditional method, a diameter distribution combined with the expected height, resulted in reduced height variation, whereas the alternative bivariate method utilized the error term of the height model. The lack of models for FMP was covered to some extent by the models for peatland and juvenile stands. The validation of these models showed that the more sophisticated regression estimation methods provided slightly improved accuracy. A flexible prediction application for stand structure consisted of seemingly unrelated regression models for eight stand characteristics, the parameters of three optional distributions and Näslund's height curve. The cross-model covariance structure was used for a linear prediction application, in which the expected values of the models were calibrated with the known stand characteristics. This provided a framework to validate the optional distributions and the optional set of stand characteristics. Height distribution is recommended for the earliest state of stands because of its continuous feature. From a mean height of about 4 m, the Weibull dbh-frequency distribution is recommended in young stands if the input variables consist of arithmetic stand characteristics. In advanced stands, basal area-dbh distribution models are recommended. Näslund's height curve proved useful. 
Some efficient transformations of stand characteristics are introduced, e.g. the shape index, which combines the basal area, the stem number and the median diameter. The shape index enabled the SB model for peatland stands to detect large variation in stand densities. This model also demonstrated reasonable behaviour for stands on mineral soils.
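The calibration step mentioned in the abstract above (adjusting the expected values of a set of jointly estimated models with known stand characteristics via the cross-model covariance) can be illustrated with the standard best linear predictor. The sketch below is not the thesis's implementation; the function name `calibrate`, the three-variable example and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def calibrate(mu, sigma, known_idx, known_values):
    """Best linear prediction of unobserved characteristics.

    mu           : expected values from the regression models (length p)
    sigma        : cross-model residual covariance matrix (p x p)
    known_idx    : positions of the characteristics that were measured
    known_values : the measured values at those positions
    (All names and inputs are illustrative assumptions.)
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    known_idx = np.asarray(known_idx, dtype=int)
    known_values = np.asarray(known_values, dtype=float)
    unknown_idx = np.setdiff1d(np.arange(mu.size), known_idx)

    s11 = sigma[np.ix_(known_idx, known_idx)]    # covariance among known
    s21 = sigma[np.ix_(unknown_idx, known_idx)]  # unknown vs. known
    resid = known_values - mu[known_idx]         # observed minus expected

    out = mu.copy()
    out[known_idx] = known_values
    # mu_2 + Sigma_21 Sigma_11^{-1} (y_1 - mu_1)
    out[unknown_idx] = mu[unknown_idx] + s21 @ np.linalg.solve(s11, resid)
    return out

# Hypothetical example: three characteristics, the first one measured.
mu = [10.0, 20.0, 30.0]
sigma = [[4.0, 2.0, 0.0],
         [2.0, 9.0, 0.0],
         [0.0, 0.0, 1.0]]
calibrated = calibrate(mu, sigma, known_idx=[0], known_values=[12.0])
```

In this toy example the second characteristic is shifted because its model error is correlated with that of the measured one, while the third, being uncorrelated, keeps its expected value.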

Relevance:

20.00%

Publisher:

Abstract:

The focus of this study is on the statistical analysis of categorical responses whose values are dependent on each other. The most typical example of this kind of dependence arises when repeated responses are obtained from the same study unit. For example, in Paper I, the response of interest is pneumococcal nasopharyngeal carriage (yes/no) in 329 children. For each child, carriage is measured nine times during the first 18 months of life, and thus repeated responses on each child cannot be assumed independent of each other. In this example, the interest typically lies in the carriage prevalence, and whether different risk factors affect the prevalence. Regression analysis is the established method for studying the effects of risk factors. In order to make correct inferences from the regression model, the associations between repeated responses need to be taken into account. The analysis of repeated categorical responses typically focuses on regression modelling. However, further insights can also be gained by investigating the structure of the association. The central theme of this study is the development of joint regression and association models. The analysis of repeated, or otherwise clustered, categorical responses is computationally difficult. Likelihood-based inference is often feasible only when the number of repeated responses for each study unit is small. In Paper IV, an algorithm is presented that substantially facilitates maximum likelihood fitting, especially when the number of repeated responses increases. In addition, a notable result arising from this work is freely available software for likelihood-based estimation of clustered categorical responses.

Relevance:

20.00%

Publisher:

Abstract:

This study examines the properties of Generalised Regression (GREG) estimators for domain class frequencies and proportions. The family of GREG estimators forms the class of design-based model-assisted estimators. All GREG estimators utilise auxiliary information via modelling. The classic GREG estimator with a linear fixed-effects assisting model (GREG-lin) is one example. But when estimating class frequencies, the study variable is binary or polytomous. Therefore logistic-type assisting models (e.g. a logistic or probit model) should be preferred over the linear one. However, GREG estimators other than GREG-lin are rarely used, and knowledge about their properties is limited. This study examines the properties of L-GREG estimators, which are GREG estimators with fixed-effects logistic-type models. Three research questions are addressed. First, I study whether and when L-GREG estimators are more accurate than GREG-lin. Theoretical results and Monte Carlo experiments, which cover both equal and unequal probability sampling designs and a wide variety of model formulations, show that in standard situations the difference between L-GREG and GREG-lin is small. But in the case of a strong assisting model, two interesting situations arise: if the domain sample size is reasonably large, L-GREG is more accurate than GREG-lin, and if the domain sample size is very small, estimation of assisting model parameters may be inaccurate, resulting in bias for L-GREG. Second, I study variance estimation for the L-GREG estimators. The standard variance estimator (S) for all GREG estimators resembles the Sen-Yates-Grundy variance estimator, but it is a double sum of prediction errors, not of the observed values of the study variable. Monte Carlo experiments show that S underestimates the variance of L-GREG, especially if the domain sample size is small or if the assisting model is strong. 
Third, since the standard variance estimator S often fails for the L-GREG estimators, I propose a new augmented variance estimator (A). The difference between S and the new estimator A is that the latter takes into account the difference between the sample fit model and the census fit model. In Monte Carlo experiments, the new estimator A outperformed the standard estimator S in terms of bias, root mean square error and coverage rate. Thus the new estimator provides a good alternative to the standard estimator.
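As a rough illustration of the GREG-lin idea described in the abstract above, the following sketch computes a model-assisted estimate of a domain total: predictions from a linear assisting model are summed over the domain population and corrected by design-weighted sample residuals. It is a minimal sketch under stated assumptions, not the thesis's code: the function name `greg_lin_domain_total`, the single auxiliary variable, the weighted least squares fit and the toy data are all hypothetical. An L-GREG variant would replace the linear predictions with fitted probabilities from a logistic-type model.

```python
import numpy as np

def greg_lin_domain_total(x_pop, in_domain_pop, sample_idx, y_sample, pi_sample):
    """GREG estimate of a domain total with a linear assisting model.

    x_pop         : auxiliary variable(s), known for every population unit
    in_domain_pop : boolean domain membership for every population unit
    sample_idx    : population positions of the sampled units
    y_sample      : study variable observed on the sample
    pi_sample     : inclusion probabilities of the sampled units
    (All argument names are illustrative assumptions.)
    """
    x_pop = np.asarray(x_pop, dtype=float)
    X = np.column_stack([np.ones(len(x_pop)), x_pop])  # add intercept
    sample_idx = np.asarray(sample_idx, dtype=int)
    y_sample = np.asarray(y_sample, dtype=float)
    w = 1.0 / np.asarray(pi_sample, dtype=float)       # design weights

    # Design-weighted least squares fit of the assisting model on the sample.
    Xs = X[sample_idx]
    beta = np.linalg.solve(Xs.T @ (w[:, None] * Xs), Xs.T @ (w * y_sample))

    y_hat = X @ beta                                   # predictions for all units
    d_pop = np.asarray(in_domain_pop, dtype=bool)
    d_s = d_pop[sample_idx]                            # sampled units in the domain
    resid = y_sample - y_hat[sample_idx]

    # Sum of predictions over the domain + weighted residual correction.
    return y_hat[d_pop].sum() + (w[d_s] * resid[d_s]).sum()

# Hypothetical toy population where y is exactly linear in x.
x = np.arange(10.0)
domain = x < 5
idx = [0, 3, 6, 9]
estimate = greg_lin_domain_total(x, domain, idx, 2 + 3 * x[idx], [0.4] * 4)
```

Because the toy study variable is an exact linear function of the auxiliary variable, the residual correction vanishes and the estimate reproduces the true domain total; with real data the correction term is what keeps the estimator approximately design-unbiased.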

Relevance:

20.00%

Publisher:

Abstract:

The progressive myoclonic epilepsies (PMEs) are a clinically and etiologically heterogeneous group of symptomatic epilepsies characterized by myoclonus, tonic-clonic seizures, psychomotor regression and ataxia. Different disorders have been classified as PMEs. Of these, the group of neuronal ceroid lipofuscinoses (NCLs) comprises an entity that has onset in childhood, being the most common cause of neurodegeneration in children. The primary aim of this thesis was to dissect the molecular genetic background of patients with childhood-onset PME by studying candidate genes and attempting to identify novel PME-associated genes. Another specific aim was to study the primary protein properties of the most recently identified member of the NCL-causing proteins, MFSD8. To dissect the genetic background of a cohort of Turkish patients with childhood-onset PME, a screen of the NCL-associated genes PPT1, TPP1, CLN3, CLN5, CLN6, MFSD8, CLN8 and CTSD was performed. Altogether 49 novel mutations were identified, which together with 56 mutations found by collaborators raised the total number of known NCL mutations to 364. Fourteen of the novel mutations affect the recently identified MFSD8 gene, which had originally been identified in a subset of mainly Turkish patients as the underlying cause of CLN7 disease. To investigate the distribution of MFSD8 defects, a total of 211 patients of different ethnic origins were evaluated for mutations in the gene. Altogether 45 patients from nine different countries were provided with a CLN7 molecular diagnosis, denoting the wide geographical occurrence of MFSD8 defects. The mutations are private, with only one having been established by a founder effect in the Roma population from the former Czechoslovakia. All mutations identified except one are associated with the typical clinical picture of variant late-infantile NCL. 
To address the trafficking properties of MFSD8, lysosomal targeting of the protein was confirmed in both neuronal and non-neuronal cells. The major determinant for this lysosomal sorting was identified to be an N-terminal dileucine-based signal (9-EQEPLL-14), recognized by heterotetrameric AP-1 adaptor proteins, suggesting that MFSD8 takes the direct trafficking pathway en route to the lysosomes. Expression studies revealed neurons as the primary cell type and the hippocampus and cerebellar granular cell layer as the predominant regions in which MFSD8 is expressed. To identify novel genes associated with childhood-onset PME, a genome-wide single nucleotide polymorphism (SNP) scan was performed in three small families and 18 sporadic patients, followed by homozygosity mapping to determine the candidate loci. One of the families and a sporadic patient were positive for mutations in PLA2G6, a gene that had previously been shown to cause infantile neuroaxonal dystrophy. Application of next-generation sequencing of candidate regions in the remaining two families led to identification of a homozygous missense mutation in USP19 for the first and TXNDC6 for the second family. Analysis of the 18 sporadic cases mapped the best candidate interval in a 1.5 Mb region on chromosome 7q21. Screening of the positional candidate KCTD7 revealed six mutations in seven unrelated families. All patients with mutations in KCTD7 were reported to have early-onset PME, rapid disease progression leading to dementia and no pathologic hallmarks. The identification of KCTD7 mutations in nine patients and the clinical delineation of their phenotype establish KCTD7 as a gene for early-onset PME. The findings presented in this thesis denote MFSD8 and KCTD7 as genes commonly associated with childhood-onset symptomatic epilepsy. The disease-associated role of TXNDC6 awaits verification through identification of additional mutations in patients with similar phenotypes. 
Completing the genetic spectrum underlying childhood-onset PMEs and understanding the functions of the gene products will be important steps towards understanding the underlying pathogenetic mechanisms. They may also shed light on the general processes of neurodegeneration and nervous system regulation, facilitating the diagnosis, classification and possibly treatment of affected cases.