13 results for Error estimate.

in Helda - Digital Repository of University of Helsinki


Relevance:

20.00%

Publisher:

Abstract:

In this thesis, two separate single nucleotide polymorphism (SNP) genotyping techniques were set up at the Finnish Genome Center, pooled genotyping was evaluated as a screening method for large-scale association studies, and finally, the former approaches were used to identify genetic factors predisposing to two distinct complex diseases by utilizing large epidemiological cohorts and also taking environmental factors into account. The first genotyping platform was based on traditional but improved restriction-fragment-length-polymorphism (RFLP) utilizing 384-microtiter well plates, multiplexing, small reaction volumes (5 µl), and automated genotype calling. We participated in the development of the second genotyping method, based on single nucleotide primer extension (SNuPe™ by Amersham Biosciences), by carrying out the alpha and beta tests for the chemistry and the allele-calling software. Both techniques proved to be accurate, reliable, and suitable for projects with thousands of samples and tens of markers. Pooled genotyping (genotyping of pooled instead of individual DNA samples) was evaluated with Sequenom's MassArray MALDI-TOF, in addition to SNuPe™ and PCR-RFLP techniques. We used MassArray mainly as a point of comparison, because it is known to be well suited for pooled genotyping. All three methods were shown to be accurate, the standard deviations between measurements being 0.017 for the MassArray, 0.022 for the PCR-RFLP, and 0.026 for the SNuPe™. The largest source of error in the process of pooled genotyping was shown to be the volumetric error, i.e., the preparation of pools. We also demonstrated that it would have been possible to narrow down the genetic locus underlying congenital chloride diarrhea (CLD), an autosomal recessive disorder, by using the pooling technique instead of genotyping individual samples. 
Although the approach seems to be well suited for traditional case-control studies, it is difficult to apply if any kind of stratification based on environmental factors is needed. Therefore we chose to continue with individual genotyping in the following association studies. Samples in the two separate large epidemiological cohorts were genotyped with the PCR-RFLP and SNuPe™ techniques. The first of these association studies concerned various pregnancy complications among 100,000 consecutive pregnancies in Finland, of which we genotyped 2292 patients and controls, in addition to a population sample of 644 blood donors, with 7 polymorphisms in the potentially thrombotic genes. In this thesis, the analysis of a sub-study of pregnancy-related venous thromboses was included. We showed that the impact of the factor V Leiden polymorphism, but not of the other tested polymorphisms, on pregnancy-related venous thrombosis was fairly large (odds ratio 11.6; 95% CI 3.6-33.6), and increased multiplicatively when combined with other risk factors such as obesity or advanced age. Owing to our study design, we were also able to estimate the risks at the population level. The second epidemiological cohort was the Helsinki Birth Cohort of men and women who were born during 1924-1933 in Helsinki. The aim was to identify genetic factors that might modify the well-known link between small birth size and adult metabolic diseases, such as type 2 diabetes and impaired glucose tolerance. Among ~500 individuals with detailed birth measurements and current metabolic profile, we found that an insertion/deletion polymorphism of the angiotensin converting enzyme (ACE) gene was associated with the duration of gestation, and weight and length at birth. Interestingly, the ACE insertion allele was also associated with higher indices of insulin secretion (p=0.0004) in adult life, but only among individuals who were born small (those among the lowest third of birth weight). 
Likewise, low birth weight was associated with higher indices of insulin secretion (p=0.003), but only among carriers of the ACE insertion allele. The association with birth measurements was also found with a common haplotype of the glucocorticoid receptor (GR) gene. Furthermore, the association between short length at birth and adult impaired glucose tolerance was confined to carriers of this haplotype (p=0.007). These associations exemplify the interaction between environmental factors and genotype, which, possibly due to altered gene expression, predisposes to complex metabolic diseases. Indeed, we showed that the common GR gene haplotype was associated with reduced mRNA expression in the thymus of three individuals (p=0.0002).
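
An odds ratio with a 95% confidence interval, such as the one quoted for factor V Leiden, is typically derived from a 2x2 table of cases and controls by carrier status. The sketch below uses Woolf's log-based interval, which is an assumption: the abstract does not say which method the thesis used, and the counts in the usage note are hypothetical, not thesis data.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: carriers among cases      b: non-carriers among cases
    c: carriers among controls   d: non-carriers among controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under Woolf's approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

For example, `odds_ratio_ci(10, 90, 5, 95)` returns an odds ratio of about 2.11 together with its interval bounds. With small cell counts the interval becomes wide, which is consistent with the broad interval reported above.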

Relevance:

20.00%

Publisher:

Abstract:

This study evaluates how the advection of precipitation, or wind drift, between the radar volume and ground affects radar measurements of precipitation. Normally precipitation is assumed to fall vertically to the ground from the contributing volume, and thus the radar measurement represents the geographical location immediately below. In this study radar measurements are corrected using hydrometeor trajectories calculated from measured and forecasted winds, and the effect of trajectory correction on the radar measurements is evaluated. Wind drift statistics for Finland are compiled using sounding data from two weather stations spanning two years. For each sounding, the hydrometeor phase at ground level is estimated and drift distance calculated using different originating level heights. This way the drift statistics are constructed as a function of range from radar and elevation angle. On average, wind drift of 1 km was exceeded at approximately 60 km distance, while drift of 10 km was exceeded at 100 km distance. Trajectories were calculated using model winds in order to produce a trajectory-corrected ground field from radar PPI images. It was found that on the upwind side of the radar the effective measuring area was reduced as some trajectories exited the radar volume scan. On the downwind side, areas near the edge of the radar measuring area experienced improved precipitation detection. The effect of trajectory correction is most prominent in instantaneous measurements and diminishes when accumulating over longer time periods. Furthermore, measurements of intense, small-scale precipitation patterns benefit most from wind drift correction. The contribution of wind drift to the uncertainty of the estimated Ze(S) relationship was studied by simulating the effect of different error sources on the uncertainty in the relationship coefficients a and b. 
The overall uncertainty was assumed to consist of systematic errors of both the radar and the gauge, as well as errors by turbulence at the gauge orifice and by wind drift of precipitation. The focus of the analysis is error associated with wind drift, which was determined by describing the spatial structure of the reflectivity field using spatial autocovariance (or variogram). This spatial structure was then used with calculated drift distances to estimate the variance in radar measurement produced by precipitation drift, relative to the other error sources. It was found that error by wind drift was of similar magnitude to error by turbulence at the gauge orifice at all ranges from the radar, with systematic errors of the instruments being a minor issue. The correction method presented in the study could be used in radar nowcasting products to improve the estimation of visibility and local precipitation intensities. The method, however, only considers pure snow, and for operational purposes some improvements are desirable, such as melting layer detection, VPR correction and taking solid state hydrometeor type into account, which would improve the estimation of vertical velocities of the hydrometeors.
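
The drift distance described above can be sketched as a sum over atmospheric layers: a hydrometeor spends `thickness / fall_speed` seconds in each layer and is carried horizontally by that layer's wind. This is a minimal illustration, not the study's implementation. The layer structure, the constant fall speed, and the function name are all assumptions.

```python
import math

def drift_distance(layers, fall_speed):
    """Horizontal drift of a hydrometeor falling through wind layers.

    layers: iterable of (thickness_m, u_ms, v_ms) from the originating
            level down to the ground (u eastward, v northward wind).
    fall_speed: assumed constant terminal fall speed in m/s.
    Returns (east_m, north_m, total_m).
    """
    east = north = 0.0
    for thickness, u, v in layers:
        time_in_layer = thickness / fall_speed  # seconds spent in this layer
        east += u * time_in_layer
        north += v * time_in_layer
    return east, north, math.hypot(east, north)
```

With an assumed snow fall speed of 1 m/s, two 1000 m layers with winds of (10, 0) and (0, 5) m/s give a drift of roughly 11.2 km. This shows why slowly falling snow originating from high beam levels at long range can land many kilometres from the point directly below the measurement.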

Relevance:

20.00%

Publisher:

Abstract:

Digital elevation models (DEMs) have been an important topic in geography and surveying sciences for decades due to their geomorphological importance as the reference surface for gravitation-driven material flow, as well as the wide range of uses and applications. When a DEM is used in terrain analysis, for example in automatic drainage basin delineation, errors of the model accumulate in the analysis results. Investigation of this phenomenon is known as error propagation analysis, which has a direct influence on the decision-making process based on interpretations and applications of terrain analysis. Additionally, it may have an indirect influence on data acquisition and the DEM generation. The focus of the thesis was on the fine toposcale DEMs, which are typically represented in a 5-50 m grid and used in the application scale 1:10 000-1:50 000. The thesis presents a three-step framework for investigating error propagation in DEM-based terrain analysis. The framework includes methods for visualising the morphological gross errors of DEMs, exploring the statistical and spatial characteristics of the DEM error, making analytical and simulation-based error propagation analysis and interpreting the error propagation analysis results. The DEM error model was built using geostatistical methods. The results show that appropriate and exhaustive reporting of various aspects of fine toposcale DEM error is a complex task. This is due to the high number of outliers in the error distribution and morphological gross errors, which are detectable with presented visualisation methods. In addition, the use of global characterisation of DEM error is a gross generalisation of reality due to the small extent of the areas in which the decision of stationarity is not violated. This was shown using exhaustive high-quality reference DEM based on airborne laser scanning and local semivariogram analysis. 
The error propagation analysis revealed that, as expected, an increase in the DEM vertical error will increase the error in surface derivatives. However, contrary to expectations, the spatial autocorrelation of the model appears to have varying effects on the error propagation analysis depending on the application. The use of a spatially uncorrelated DEM error model has been considered as a 'worst-case scenario', but this opinion is now challenged because none of the DEM derivatives investigated in the study had maximum variation with spatially uncorrelated random error. Significant performance improvement was achieved in simulation-based error propagation analysis by applying process convolution in generating realisations of the DEM error model. In addition, a typology of uncertainty in drainage basin delineations is presented.

Relevance:

20.00%

Publisher:

Abstract:

This thesis addresses modeling of financial time series, especially stock market returns and daily price ranges. Modeling data of this kind can be approached with so-called multiplicative error models (MEM). These models nest several well known time series models such as GARCH, ACD and CARR models. They are able to capture many well established features of financial time series including volatility clustering and leptokurtosis. In contrast to these phenomena, different kinds of asymmetries have received relatively little attention in the existing literature. In this thesis asymmetries arise from various sources. They are observed in both conditional and unconditional distributions, for variables with non-negative values and for variables that have values on the real line. In the multivariate context asymmetries can be observed in the marginal distributions as well as in the relationships of the variables modeled. New methods for all these cases are proposed. Chapter 2 considers GARCH models and modeling of returns of two stock market indices. The chapter introduces the so-called generalized hyperbolic (GH) GARCH model to account for asymmetries in both conditional and unconditional distribution. In particular, two special cases of the GARCH-GH model which describe the data most accurately are proposed. They are found to improve the fit of the model when compared to symmetric GARCH models. The advantages of accounting for asymmetries are also observed through Value-at-Risk applications. Both theoretical and empirical contributions are provided in Chapter 3 of the thesis. In this chapter the so-called mixture conditional autoregressive range (MCARR) model is introduced, examined and applied to daily price ranges of the Hang Seng Index. The conditions for the strict and weak stationarity of the model as well as an expression for the autocorrelation function are obtained by writing the MCARR model as a first order autoregressive process with random coefficients. 
The chapter also introduces the inverse gamma (IG) distribution to CARR models. The advantages of CARR-IG and MCARR-IG specifications over conventional CARR models are found in the empirical application both in- and out-of-sample. Chapter 4 discusses the simultaneous modeling of absolute returns and daily price ranges. In this part of the thesis a vector multiplicative error model (VMEM) with an asymmetric Gumbel copula is found to provide substantial benefits over the existing VMEM models based on elliptical copulas. The proposed specification is able to capture the highly asymmetric dependence of the modeled variables, thereby improving the performance of the model considerably. The economic significance of the results obtained is established when the information content of the volatility forecasts derived is examined.
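
To make the multiplicative error structure concrete, the sketch below simulates a plain CARR(1,1) process, in which the observed range is the conditional mean range times a non-negative, unit-mean innovation. It is an illustration under stated assumptions: the exponential innovation and the parameter values are chosen for simplicity and are not the inverse gamma or mixture specifications the thesis proposes.

```python
import random

def simulate_carr(n, omega, alpha, beta, seed=0):
    """Simulate a CARR(1,1) series of non-negative price ranges.

    range_t  = lambda_t * eps_t,  eps_t ~ Exponential(mean 1)  (assumed)
    lambda_t = omega + alpha * range_{t-1} + beta * lambda_{t-1}
    Requires alpha + beta < 1 so that the unconditional mean exists.
    """
    rng = random.Random(seed)
    cond_mean = omega / (1.0 - alpha - beta)  # start at the unconditional mean
    last_range = cond_mean
    series = []
    for _ in range(n):
        cond_mean = omega + alpha * last_range + beta * cond_mean
        last_range = cond_mean * rng.expovariate(1.0)
        series.append(last_range)
    return series
```

With `omega=0.1, alpha=0.2, beta=0.7` the unconditional mean is `0.1 / (1 - 0.9) = 1.0`, and a long simulated series should average close to that value. Replacing the exponential draw with another unit-mean positive distribution changes the tail behaviour without altering the recursion.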

Relevance:

20.00%

Publisher:

Abstract:

Visual acuities at the time of referral and on the day before surgery were compared in 124 patients operated on for cataract in Vaasa Central Hospital, Finland. Preoperative visual acuity and the occurrence of ocular and general disease were compared in samples of consecutive cataract extractions performed in 1982, 1985, 1990, 1995 and 2000 in two hospitals in the Vaasa region in Finland. The repeatability and standard deviation of random measurement error in visual acuity and refractive error determination in a clinical environment in cataractous, pseudophakic and healthy eyes were estimated by re-examining visual acuity and refractive error of patients referred for cataract surgery or consultation by ophthalmic professionals. Altogether 99 eyes of 99 persons (41 cataractous, 36 pseudophakic and 22 healthy eyes) with a visual acuity range of Snellen 0.3 to 1.3 (0.52 to -0.11 logMAR) were examined. During an average waiting time of 13 months, visual acuity in the study eye decreased from 0.68 logMAR to 0.96 logMAR (from 0.2 to 0.1 in Snellen decimal values). The average decrease in vision was 0.27 logMAR per year. In the fastest quartile, visual acuity change per year was 0.75 logMAR, and in the second fastest 0.29 logMAR; the third and fourth quartiles were virtually unaffected. From 1982 to 2000, the incidence of cataract surgery increased from 1.0 to 7.2 operations per 1000 inhabitants per year in the Vaasa region. The average preoperative visual acuity in the operated eye increased by 0.85 logMAR (in decimal values from 0.03 to 0.2) and in the better eye 0.27 logMAR (in decimal values from 0.23 to 0.43) over this period. The proportion of patients profoundly visually handicapped (VA in the better eye <0.1) before the operation fell from 15% to 4%, and that of patients less profoundly visually handicapped (VA in the better eye 0.1 to <0.3) from 47% to 15%. 
The repeatability of visual acuity measurement, estimated as a coefficient of repeatability for all 99 eyes, was ±0.18 logMAR, and the standard deviation of measurement error was 0.06 logMAR. Eyes with the lowest visual acuity (0.3-0.45) had the largest variability, the coefficient of repeatability being ±0.24 logMAR, and eyes with a visual acuity of 0.7 or better had the smallest, ±0.12 logMAR. The repeatability of refractive error measurement was studied in the same patient material as the repeatability of visual acuity. Differences between measurements 1 and 2 were calculated as three-dimensional vector values and spherical equivalents and expressed by coefficients of repeatability. Coefficients of repeatability for all eyes for vertical, torsional and horizontal vectors were ±0.74D, ±0.34D and ±0.93D, respectively, and for spherical equivalent for all eyes ±0.74D. Eyes with lower visual acuity (0.3-0.45) had larger variability in vector and spherical equivalent values (±1.14), but the difference between visual acuity groups was not statistically significant. The difference in the mean defocus equivalent between measurements 1 and 2 was, however, significantly greater in the lower visual acuity group. If a change of ±0.5D (measured in defocus equivalents) is accepted as a basis for change of spectacles for eyes with good vision, the basis for eyes in the visual acuity range of 0.3-0.65 would be ±1D. Differences in repeated visual acuity measurements are partly explained by errors in refractive error measurements.
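
The abstract switches between Snellen decimal acuity and logMAR, which are related by logMAR = -log10(decimal acuity). The sketch below shows that conversion together with one common definition of the coefficient of repeatability, 1.96 times the standard deviation of the paired test-retest differences. That definition is an assumption here, since the abstract does not state the exact formula the thesis used.

```python
import math
import statistics

def decimal_to_logmar(decimal_acuity):
    """Convert Snellen decimal visual acuity to logMAR."""
    return -math.log10(decimal_acuity)

def coefficient_of_repeatability(first, second):
    """1.96 x SD of paired differences between two measurement rounds.

    first, second: equal-length sequences of repeated measurements
    (for example logMAR values from examination 1 and examination 2).
    """
    differences = [a - b for a, b in zip(first, second)]
    return 1.96 * statistics.stdev(differences)
```

As a check against the figures above, `decimal_to_logmar(0.3)` rounds to 0.52 and `decimal_to_logmar(1.3)` rounds to -0.11, matching the stated acuity range of the examined eyes.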

Relevance:

20.00%

Publisher:

Abstract:

Aims: Develop and validate tools to estimate residual noise covariance in Planck frequency maps. Quantify signal error effects and compare different techniques to produce low-resolution maps. Methods: We derive analytical estimates of covariance of the residual noise contained in low-resolution maps produced using a number of map-making approaches. We test these analytical predictions using Monte Carlo simulations and their impact on angular power spectrum estimation. We use simulations to quantify the level of signal errors incurred in different resolution downgrading schemes considered in this work. Results: We find an excellent agreement between the optimal residual noise covariance matrices and Monte Carlo noise maps. For destriping map-makers, the extent of agreement is dictated by the knee frequency of the correlated noise component and the chosen baseline offset length. Signal striping is shown to be insignificant when properly dealt with. In map resolution downgrading, we find that a carefully selected window function is required to reduce aliasing to the sub-percent level at multipoles ℓ > 2Nside, where Nside is the HEALPix resolution parameter. We show that sufficient characterization of the residual noise is unavoidable if one is to draw reliable constraints on large scale anisotropy. Conclusions: We have described how to compute the low-resolution maps, with a controlled sky signal level, and a reliable estimate of covariance of the residual noise. We have also presented a method to smooth the residual noise covariance matrices to describe the noise correlations in smoothed, bandwidth limited maps.

Relevance:

20.00%

Publisher:

Abstract:

This paper is concerned with using the bootstrap to obtain improved critical values for the error correction model (ECM) cointegration test in dynamic models. In the paper we investigate the effects of dynamic specification on the size and power of the ECM cointegration test with bootstrap critical values. The results from a Monte Carlo study show that the size of the bootstrap ECM cointegration test is close to the nominal significance level. We find that overspecification of the lag length results in a loss of power. Underspecification of the lag length results in size distortion. The performance of the bootstrap ECM cointegration test deteriorates if the correct lag length is not used in the ECM. The bootstrap ECM cointegration test is therefore not robust to model misspecification.

Relevance:

20.00%

Publisher:

Abstract:

In recent years, thanks to developments in information technology, large-dimensional datasets have been increasingly available. Researchers now have access to thousands of economic series and the information contained in them can be used to create accurate forecasts and to test economic theories. To exploit this large amount of information, researchers and policymakers need an appropriate econometric model. Usual time series models, vector autoregression for example, cannot incorporate more than a few variables. There are two ways to solve this problem: use variable selection procedures or gather the information contained in the series to create an index model. This thesis focuses on one of the most widespread index models, the dynamic factor model (the theory behind this model, based on previous literature, is the core of the first part of this study), and its use in forecasting Finnish macroeconomic indicators (which is the focus of the second part of the thesis). In particular, I forecast economic activity indicators (e.g. GDP) and price indicators (e.g. consumer price index), from 3 large Finnish datasets. The first dataset contains a large series of aggregated data obtained from the Statistics Finland database. The second dataset is composed of economic indicators from the Bank of Finland. The last dataset is formed by disaggregated data from Statistics Finland, which I call the micro dataset. The forecasts are computed following a two-step procedure: in the first step I estimate a set of common factors from the original dataset. The second step consists in formulating forecasting equations including the factors extracted previously. The predictions are evaluated using relative mean squared forecast error, where the benchmark model is a univariate autoregressive model. The results are dataset-dependent. 
The forecasts based on factor models are very accurate for the first dataset (the Statistics Finland one), while they are considerably worse for the Bank of Finland dataset. The forecasts derived from the micro dataset are still good, but less accurate than the ones obtained in the first case. This work leads to multiple research developments. The results obtained here can be replicated for longer datasets. The non-aggregated data can be represented in an even more disaggregated form (firm level). Finally, the use of the micro data, one of the major contributions of this thesis, can be useful in the imputation of missing values and the creation of flash estimates of macroeconomic indicators (nowcasting).
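
The evaluation criterion mentioned above, relative mean squared forecast error, is the ratio of the candidate model's MSFE to that of the benchmark, so values below 1 mean the factor model beats the autoregressive benchmark. A minimal sketch follows. The function names are illustrative and the numbers in the usage note are hypothetical, not results from the thesis.

```python
def msfe(actual, forecast):
    """Mean squared forecast error over paired observations."""
    errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)

def relative_msfe(actual, model_forecast, benchmark_forecast):
    """MSFE of the model divided by MSFE of the benchmark.

    A value below 1 indicates the model forecasts more accurately
    than the benchmark (here assumed to be a univariate AR model).
    """
    return msfe(actual, model_forecast) / msfe(actual, benchmark_forecast)
```

For instance, with actual values `[1, 2, 3]`, model forecasts `[1.1, 1.9, 3.1]` and benchmark forecasts `[1.5, 2.5, 2.5]`, the model MSFE is 0.01 and the benchmark MSFE is 0.25, giving a relative MSFE of 0.04.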

Relevance:

20.00%

Publisher:

Abstract:

The relationship between site characteristics and understorey vegetation composition was analysed with quantitative methods, especially from the viewpoint of site quality estimation. Theoretical models were applied to an empirical data set collected from the upland forests of southern Finland comprising 104 sites dominated by Scots pine (Pinus sylvestris L.), and 165 sites dominated by Norway spruce (Picea abies (L.) Karsten). Site index H100 was used as an independent measure of site quality. A new model for the estimation of site quality at sites with a known understorey vegetation composition was introduced. It is based on the application of Bayes' theorem to the density function of site quality within the study area combined with the species-specific presence-absence response curves. The resulting posterior probability density function may be used for calculating an estimate for the site variable. Using this method, a jackknife estimate of site index H100 was calculated separately for pine- and spruce-dominated sites. The results indicated that the cross-validation root mean squared error (RMSEcv) of the estimates improved from 2.98 m down to 2.34 m relative to the "null" model (standard deviation of the sample distribution) in pine-dominated forests. In spruce-dominated forests RMSEcv decreased from 3.94 m down to 3.16 m. In order to assess these results, four other estimation methods based on understorey vegetation composition were applied to the same data set. The results showed that none of the methods was clearly superior to the others. In pine-dominated forests, RMSEcv varied between 2.34 and 2.47 m, and the corresponding range for spruce-dominated forests was from 3.13 to 3.57 m.
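
The Bayesian estimate described above can be sketched on a discrete grid of site index values: the prior density is multiplied by each species' probability of presence (or absence) at that site index, and the normalised product is the posterior. The Gaussian-shaped response curve, the species, and all parameter values below are assumptions for illustration. The abstract does not specify the form of the response curves.

```python
import math

def gaussian_response(h, optimum, tolerance, max_prob):
    """Assumed bell-shaped probability of species presence at site index h."""
    return max_prob * math.exp(-0.5 * ((h - optimum) / tolerance) ** 2)

def posterior_site_index(grid, prior, response_curves, observed):
    """Posterior over site index given presence/absence observations.

    grid: candidate site index values (e.g. H100 in metres)
    prior: prior weights for each grid value (need not be normalised)
    response_curves: dict name -> function h -> P(present | h)
    observed: dict name -> True if the species was present
    Returns (posterior_weights, posterior_mean).
    """
    weights = []
    for h, prior_weight in zip(grid, prior):
        likelihood = 1.0
        for name, curve in response_curves.items():
            p = curve(h)
            likelihood *= p if observed[name] else (1.0 - p)
        weights.append(prior_weight * likelihood)
    total = sum(weights)
    posterior = [w / total for w in weights]
    mean = sum(h * w for h, w in zip(grid, posterior))
    return posterior, mean
```

The posterior mean is one possible point estimate of the site variable. A jackknife evaluation in the spirit of the text would refit the prior and response curves with each site left out in turn and compare the resulting estimate with that site's measured H100.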

Relevance:

20.00%

Publisher:

Abstract:

In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available open-source implementation of Finnish morphology, made with traditional finite-state morphology tools, and demonstrate rapid building of Northern Sámi and English spell checkers from tools and resources available from the Internet.

Relevance:

20.00%

Publisher:

Abstract:

Benthic processes were measured at a coastal deposition area in the northern Baltic Sea, covering all seasons. The N2 production rates, 90-400 µmol N m⁻² d⁻¹, were highest in autumn-early winter and lowest in spring. Heterotrophic bacterial production peaked unexpectedly late in the year, indicating that in addition to the temperature, the availability of carbon compounds suitable for the heterotrophic bacteria also plays a major role in regulating the denitrification rate. Anaerobic ammonium oxidation (anammox) was measured in spring and autumn and contributed 10% and 15%, respectively, to the total N2 production. The low percentage did, however, result in a significant error in the total N2 production rate estimate, calculated using the isotope pairing technique. Anammox must be taken into account in the Gulf of Finland in future sediment nitrogen cycling research.