34 results for Error Estimation
in Helda - Digital Repository of the University of Helsinki
Abstract:
There is an increasing need to compare the results obtained with different methods of estimating tree biomass in order to reduce the uncertainty in the assessment of forest biomass carbon. In this study, tree biomass was investigated in a 30-year-old Scots pine (Pinus sylvestris) stand (Young-Stand) and a 130-year-old mixed Norway spruce (Picea abies)-Scots pine stand (Mature-Stand) located in southern Finland (61°50' N, 24°22' E). In particular, the results of different estimation methods were compared to assess their reliability and the suitability of their applications. For the trees in Mature-Stand, annual stem biomass increment varied following a sigmoid equation, and the fitted curves reached a maximum level (from about 1 kg/yr for understorey spruce to 7 kg/yr for dominant pine) when the trees were 100 years old. Tree biomass was estimated to be about 70 Mg/ha in Young-Stand and about 220 Mg/ha in Mature-Stand. In the region (58.00-62.13 °N, 14-34 °E, ≤ 300 m a.s.l.) surrounding the study stands, tree biomass accumulation in Norway spruce and Scots pine stands followed a sigmoid equation with stand age, with a maximum of 230 Mg/ha at the age of 140 years. In Mature-Stand, lichen biomass on the trees was 1.63 Mg/ha, with more than half of the biomass occurring on dead branches, and the standing crop of litter lichen on the ground was about 0.09 Mg/ha. There were substantial differences among the results estimated by different methods in the stands. These results imply that possible estimation error should be taken into account when calculating tree biomass in a stand with an indirect approach.
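As a hedged illustration of the sigmoid growth relationship described above, the sketch below fits a logistic curve of stand biomass against stand age. The data points are invented for demonstration; only the asymptote (roughly 230 Mg/ha near 140 years) echoes the figures reported in the abstract.

```python
# Illustrative sketch: fitting a sigmoid (logistic) biomass-age curve.
# All observations below are made up; only the asymptote mirrors the abstract.
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(age, asymptote, rate, midpoint):
    """Logistic growth curve: biomass approaches `asymptote` with stand age."""
    return asymptote / (1.0 + np.exp(-rate * (age - midpoint)))

# Hypothetical stand ages (years) and biomass values (Mg/ha).
age = np.array([20, 40, 60, 80, 100, 120, 140, 160])
biomass = np.array([15, 60, 120, 170, 200, 220, 228, 230])

params, _ = curve_fit(sigmoid, age, biomass, p0=[230, 0.05, 70])
print("asymptote=%.1f Mg/ha, rate=%.3f, midpoint=%.1f yr" % tuple(params))
```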
Abstract:
This study examines the properties of Generalised Regression (GREG) estimators for domain class frequencies and proportions. The family of GREG estimators forms the class of design-based, model-assisted estimators. All GREG estimators utilise auxiliary information via modelling. The classic GREG estimator with a linear fixed-effects assisting model (GREG-lin) is one example. When estimating class frequencies, however, the study variable is binary or polytomous, so logistic-type assisting models (e.g. logistic or probit models) should be preferred over the linear one. However, GREG estimators other than GREG-lin are rarely used, and knowledge about their properties is limited. This study examines the properties of L-GREG estimators, which are GREG estimators with fixed-effects logistic-type models. Three research questions are addressed. First, I study whether and when L-GREG estimators are more accurate than GREG-lin. Theoretical results and Monte Carlo experiments, which cover both equal and unequal probability sampling designs and a wide variety of model formulations, show that in standard situations the difference between L-GREG and GREG-lin is small. In the case of a strong assisting model, however, two interesting situations arise: if the domain sample size is reasonably large, L-GREG is more accurate than GREG-lin, and if the domain sample size is very small, estimation of the assisting model parameters may be inaccurate, resulting in bias for L-GREG. Second, I study variance estimation for the L-GREG estimators. The standard variance estimator (S) for all GREG estimators resembles the Sen-Yates-Grundy variance estimator, but it is a double sum of prediction errors, not of the observed values of the study variable. Monte Carlo experiments show that S underestimates the variance of L-GREG, especially if the domain sample size is small or the assisting model is strong. Third, since the standard variance estimator S often fails for the L-GREG estimators, I propose a new augmented variance estimator (A). The difference between S and the new estimator A is that the latter takes into account the difference between the sample-fit model and the census-fit model. In Monte Carlo experiments, the new estimator A outperformed the standard estimator S in terms of bias, root mean square error and coverage rate. Thus the new estimator provides a good alternative to the standard estimator.
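A minimal sketch of the L-GREG idea follows, assuming a binary study variable, a population frame with known auxiliary variables, and scikit-learn's logistic regression as a stand-in for the assisting model fit; it is not the thesis implementation, and the design weights are used only in the residual correction term.

```python
# Sketch of a model-assisted GREG total with a logistic assisting model:
# fit on the sample, predict for every frame unit, then add a
# design-weighted sum of the sample residuals.
import numpy as np
from sklearn.linear_model import LogisticRegression

def lgreg_total(x_pop, x_sample, y_sample, inclusion_prob):
    """GREG-style estimate of the population total of a binary study variable."""
    model = LogisticRegression().fit(x_sample, y_sample)
    y_hat_pop = model.predict_proba(x_pop)[:, 1]        # predictions over the frame
    y_hat_sample = model.predict_proba(x_sample)[:, 1]  # predictions for sampled units
    residuals = y_sample - y_hat_sample
    return y_hat_pop.sum() + np.sum(residuals / inclusion_prob)

# Hypothetical example: 1,000-unit frame, simple random sample of 100.
rng = np.random.default_rng(0)
x_pop = rng.normal(size=(1000, 2))
sample_idx = rng.choice(1000, size=100, replace=False)
y_sample = rng.binomial(1, 1 / (1 + np.exp(-x_pop[sample_idx, 0])))
print(lgreg_total(x_pop, x_pop[sample_idx], y_sample, np.full(100, 0.1)))
```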
Abstract:
This study contributes to the neglect effect literature by looking at relative trading volume in terms of value. The results for the Swedish market show a significant positive relationship between the accuracy of estimation and relative trading volume. Market capitalisation and analyst coverage have been used as proxies for neglect in prior studies. These measures, however, do not take into account the effort analysts put in when estimating corporate pre-tax profits. I also find evidence that the industry of the firm influences the accuracy of estimation. In addition, supporting earlier findings, loss-making firms are associated with larger forecasting errors. Further, I find that the average forecast error in Sweden increased in the year 2000.
Abstract:
The forest simulator is a computerized model for predicting forest growth and future development as well as the effects of forest harvests and treatments. The forest planning system is a decision support tool, usually including a forest simulator and an optimisation model, for finding the optimal forest management actions. The information produced by forest simulators and forest planning systems is used for various analytical purposes and in support of decision making. However, the quality and reliability of this information can often be questioned. Natural variation in forest growth and estimation errors in forest inventory, among other things, cause uncertainty in predictions of forest growth and development. This uncertainty, stemming from different sources, has various undesirable effects. In many cases, the outcomes of decisions based on uncertain information differ from those desired. The objective of this thesis was to study various sources of uncertainty and their effects in forest simulators and forest planning systems. The study focused on three notable sources of uncertainty: errors in forest growth predictions, errors in forest inventory data, and stochastic fluctuation of timber assortment prices. Effects of uncertainty were studied using two types of forest growth models, individual tree-level models and stand-level models, and with various error simulation methods. A new method for simulating more realistic forest inventory errors was introduced and tested. In addition, the three notable sources of uncertainty were combined and their joint effects on stand-level net present value estimates were simulated. According to the results, the various sources of uncertainty can have distinct effects in different forest growth simulators. The new forest inventory error simulation method proved to produce more realistic errors. The analysis of the joint effects of the various sources of uncertainty provided insight into how uncertainty accumulates in forest simulators.
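The sketch below is a hedged illustration of the kind of Monte Carlo experiment described above: three sources of uncertainty are combined into a distribution of stand-level net present value. All models, distributions and parameter values are invented placeholders, not those of the thesis.

```python
# Combining inventory error, growth prediction error and price fluctuation
# into a simulated distribution of stand-level net present value (NPV).
import numpy as np

rng = np.random.default_rng(42)
N = 10_000                      # Monte Carlo replications
true_volume = 180.0             # m3/ha at harvest, assumed
rate = 0.03                     # discount rate, assumed
years = 20                      # years to harvest, assumed

inventory_error = rng.normal(0.0, 0.15, N)      # relative error in initial volume
growth_error = rng.normal(0.0, 0.10, N)         # relative error in growth prediction
price = rng.lognormal(np.log(55.0), 0.2, N)     # EUR/m3, stochastic timber price

volume = true_volume * (1 + inventory_error) * (1 + growth_error)
npv = volume * price / (1 + rate) ** years

print("mean NPV %.0f EUR/ha, 90%% interval [%.0f, %.0f]"
      % (npv.mean(), np.percentile(npv, 5), np.percentile(npv, 95)))
```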
Abstract:
This thesis examines the feasibility of a forest inventory method based on two-phase sampling for estimating forest attributes at the stand or substand level for forest management purposes. The method is based on multi-source forest inventory combining auxiliary data, consisting of remote sensing imagery or other geographic information, with field measurements. Auxiliary data are utilized as first-phase data covering all inventory units. Various methods were examined for improving the accuracy of the forest estimates. Pre-processing of auxiliary data in the form of correcting the spectral properties of aerial imagery was examined (I), as was the selection of aerial image features for estimating forest attributes (II). Various spatial units were compared for extracting image features in a remote sensing aided forest inventory utilizing very high resolution imagery (III). A number of data sources were combined and different weighting procedures were tested in estimating forest attributes (IV, V). Correction of the spectral properties of aerial images proved to be a straightforward and advantageous method for improving the correlation between the image features and the measured forest attributes. Testing the different image features that can be extracted from aerial photographs (and other very high resolution images) showed that the images contain a wealth of relevant information that can be extracted only by utilizing the spatial organization of the image pixel values. Furthermore, careful selection of image features for the inventory task generally gives better results than inputting all extractable features to the estimation procedure. When the spatial units for extracting very high resolution image features were examined, an approach based on image segmentation generally showed advantages compared with a traditional sample plot-based approach. Combining several data sources resulted in more accurate estimates than any of the individual data sources alone. The best combined estimate can be derived by weighting the estimates produced by the individual data sources by the inverse values of their mean square errors. Although the plot-level estimation accuracy in two-phase sampling inventory can be improved in many ways, forest estimates based mainly on single-view satellite and aerial imagery remain a relatively poor basis for making stand-level management decisions.
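A minimal sketch of the inverse-MSE combination rule mentioned above, with illustrative numbers only:

```python
# Combine estimates from several data sources with weights proportional to
# the inverse of each source's mean square error.
import numpy as np

def combine_inverse_mse(estimates, mses):
    """Weighted mean of estimates with weights 1/MSE."""
    estimates = np.asarray(estimates, dtype=float)
    weights = 1.0 / np.asarray(mses, dtype=float)
    return np.sum(weights * estimates) / np.sum(weights)

# Hypothetical plot-level volume estimates (m3/ha) from three data sources
# with hypothetical MSEs; the most accurate source gets the largest weight.
print(combine_inverse_mse([210.0, 185.0, 230.0], [400.0, 900.0, 2500.0]))
```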
Abstract:
This study evaluates how the advection of precipitation, or wind drift, between the radar volume and the ground affects radar measurements of precipitation. Normally precipitation is assumed to fall vertically to the ground from the contributing volume, so that the radar measurement represents the geographical location immediately below. In this study radar measurements are corrected using hydrometeor trajectories calculated from measured and forecasted winds, and the effect of trajectory correction on the radar measurements is evaluated. Wind drift statistics for Finland are compiled using sounding data from two weather stations spanning two years. For each sounding, the hydrometeor phase at ground level is estimated and the drift distance calculated using different originating-level heights. In this way the drift statistics are constructed as a function of range from the radar and elevation angle. On average, wind drift of 1 km was exceeded at approximately 60 km distance, while drift of 10 km was exceeded at 100 km distance. Trajectories were calculated using model winds in order to produce a trajectory-corrected ground field from radar PPI images. It was found that on the upwind side of the radar the effective measuring area was reduced, as some trajectories exited the radar volume scan. On the downwind side, areas near the edge of the radar measuring area experienced improved precipitation detection. The effect of trajectory correction is most prominent in instantaneous measurements and diminishes when accumulating over longer time periods. Furthermore, measurements of intense and small-scale precipitation patterns benefit most from wind drift correction. The contribution of wind drift to the uncertainty of the estimated Ze(S) relationship was studied by simulating the effect of different error sources on the uncertainty in the relationship coefficients a and b. The overall uncertainty was assumed to consist of systematic errors of both the radar and the gauge, as well as errors caused by turbulence at the gauge orifice and by wind drift of precipitation. The focus of the analysis was the error associated with wind drift, which was determined by describing the spatial structure of the reflectivity field using spatial autocovariance (or the variogram). This spatial structure was then used together with the calculated drift distances to estimate the variance in the radar measurement produced by precipitation drift, relative to the other error sources. It was found that the error caused by wind drift was of similar magnitude to the error caused by turbulence at the gauge orifice at all ranges from the radar, with systematic errors of the instruments being a minor issue. The correction method presented in the study could be used in radar nowcasting products to improve the estimation of visibility and local precipitation intensities. The method, however, only considers pure snow, and for operational purposes some improvements are desirable, such as melting layer detection, VPR correction and taking the solid-state hydrometeor type into account, which would improve the estimation of the vertical velocities of the hydrometeors.
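As a hedged illustration of the trajectory idea, the sketch below estimates the horizontal drift of a falling hydrometeor from a layered wind profile: in each layer, drift equals the horizontal wind speed times the time spent falling through the layer. The sounding values and fall speed are invented; the thesis method is more elaborate (it uses measured and forecast winds and estimates the hydrometeor phase).

```python
# Horizontal drift of a particle falling through stacked wind layers.
import numpy as np

def drift_distance(layer_tops_m, wind_speed_ms, fall_speed_ms=1.0):
    """Horizontal drift (m) of a particle falling from the highest layer top."""
    layer_tops_m = np.asarray(layer_tops_m, dtype=float)
    layer_bottoms = np.concatenate(([0.0], layer_tops_m[:-1]))
    thickness = layer_tops_m - layer_bottoms
    time_in_layer = thickness / fall_speed_ms          # seconds spent in each layer
    return float(np.sum(np.asarray(wind_speed_ms) * time_in_layer))

# Hypothetical sounding: layer tops (m) and mean wind speeds (m/s). A snowflake
# falling at ~1 m/s from 3 km can drift tens of kilometres in strong winds.
print(drift_distance([1000.0, 2000.0, 3000.0], [8.0, 12.0, 15.0]))
```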
Abstract:
Remote sensing provides methods to infer land cover information over large geographical areas at a variety of spatial and temporal resolutions. Land cover is input data for a range of environmental models, and information on land cover dynamics is required for monitoring the implications of global change. Such data are also essential in support of environmental management and policymaking. Boreal forests are a key component of the global climate and a major sink of carbon. The northern latitudes are expected to experience disproportionate and rapid warming, which can have a major impact on vegetation at forest limits. This thesis examines the use of optical remote sensing for estimating aboveground biomass, leaf area index (LAI), tree cover and tree height in the boreal forests and the tundra-taiga transition zone in Finland. Continuous fields of forest attributes are required, for example, to improve the mapping of forest extent. The thesis focuses on studying the feasibility of satellite data at multiple spatial resolutions, assessing the potential of multispectral, multiangular and multitemporal information, and providing a regional evaluation of global land cover data. Preprocessed ASTER, MISR and MODIS products are the principal satellite data. The reference data consist of field measurements, forest inventory data and fine resolution land cover maps. Fine resolution studies demonstrate how statistical relationships between biomass and satellite data are relatively strong in single-species, low-biomass mountain birch forests in comparison to higher-biomass coniferous stands. The combination of forest stand data and fine resolution ASTER images provides a method for biomass estimation using medium resolution MODIS data. Multiangular data improve the accuracy of land cover mapping in the sparsely forested tundra-taiga transition zone, particularly in mires. Similarly, multitemporal data improve the accuracy of coarse resolution tree cover estimates in comparison to single-date data. Furthermore, the peak of the growing season is not necessarily the optimal time for land cover mapping in the northern boreal regions. The evaluated coarse resolution land cover data sets have considerable shortcomings in northernmost Finland and should be used with caution in similar regions. Quantitative reference data and upscaling methods for integrating multiresolution data are required for calibrating statistical models and evaluating land cover data sets. The preprocessed image products have potential for wider use, as they can considerably reduce the time and effort used for data processing.
Abstract:
Digital elevation models (DEMs) have been an important topic in geography and the surveying sciences for decades due to their geomorphological importance as the reference surface for gravitation-driven material flow, as well as their wide range of uses and applications. When a DEM is used in terrain analysis, for example in automatic drainage basin delineation, errors of the model propagate into the analysis results. Investigation of this phenomenon is known as error propagation analysis, which has a direct influence on the decision-making process based on interpretations and applications of terrain analysis. Additionally, it may have an indirect influence on data acquisition and DEM generation. The focus of the thesis was on fine toposcale DEMs, which are typically represented in a 5-50 m grid and used at the application scale of 1:10 000-1:50 000. The thesis presents a three-step framework for investigating error propagation in DEM-based terrain analysis. The framework includes methods for visualising the morphological gross errors of DEMs, exploring the statistical and spatial characteristics of the DEM error, making analytical and simulation-based error propagation analyses and interpreting the error propagation analysis results. The DEM error model was built using geostatistical methods. The results show that appropriate and exhaustive reporting of the various aspects of fine toposcale DEM error is a complex task. This is due to the high number of outliers in the error distribution and the morphological gross errors, which are detectable with the presented visualisation methods. In addition, the use of a global characterisation of DEM error is a gross generalisation of reality, due to the small extent of the areas in which the assumption of stationarity is not violated. This was shown using an exhaustive high-quality reference DEM based on airborne laser scanning and local semivariogram analysis. The error propagation analysis revealed that, as expected, an increase in the DEM vertical error increases the error in surface derivatives. However, contrary to expectations, the spatial autocorrelation of the model appears to have varying effects on the error propagation analysis depending on the application. The use of a spatially uncorrelated DEM error model has been considered a 'worst-case scenario', but this opinion is now challenged because none of the DEM derivatives investigated in the study had maximum variation with spatially uncorrelated random error. Significant performance improvement was achieved in simulation-based error propagation analysis by applying process convolution in generating realisations of the DEM error model. In addition, a typology of uncertainty in drainage basin delineations is presented.
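A hedged sketch of simulation-based error propagation in the spirit described above: spatially autocorrelated error surfaces are generated by convolving white noise with a Gaussian kernel (a simple process-convolution scheme), added to a synthetic DEM, and the spread of a surface derivative (slope) is observed. The DEM, kernel width and error magnitude are assumptions for illustration only.

```python
# Monte Carlo error propagation from a spatially correlated DEM error model
# to a surface derivative (slope).
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(1)
cell = 10.0                                       # grid resolution, m (assumed)
x, y = np.meshgrid(np.arange(100), np.arange(100))
dem = 0.05 * cell * x + 2.0 * np.sin(y / 10.0)    # synthetic reference surface

def slope_deg(z):
    gy, gx = np.gradient(z, cell)
    return np.degrees(np.arctan(np.hypot(gx, gy)))

slopes = []
for _ in range(200):                              # Monte Carlo realisations
    noise = rng.normal(0.0, 1.0, dem.shape)
    error = gaussian_filter(noise, sigma=3.0)     # spatially correlated error field
    error *= 0.5 / error.std()                    # rescale to 0.5 m vertical std
    slopes.append(slope_deg(dem + error))

print("mean std of slope estimates (deg):", np.std(np.stack(slopes), axis=0).mean())
```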
Abstract:
This thesis addresses the modeling of financial time series, especially stock market returns and daily price ranges. Modeling data of this kind can be approached with so-called multiplicative error models (MEM). These models nest several well-known time series models such as the GARCH, ACD and CARR models. They are able to capture many well-established features of financial time series, including volatility clustering and leptokurtosis. In contrast to these phenomena, different kinds of asymmetries have received relatively little attention in the existing literature. In this thesis asymmetries arise from various sources. They are observed in both conditional and unconditional distributions, for variables with non-negative values and for variables that have values on the real line. In the multivariate context asymmetries can be observed in the marginal distributions as well as in the relationships of the variables modeled. New methods for all these cases are proposed. Chapter 2 considers GARCH models and the modeling of returns of two stock market indices. The chapter introduces the so-called generalized hyperbolic (GH) GARCH model to account for asymmetries in both the conditional and the unconditional distribution. In particular, two special cases of the GARCH-GH model which describe the data most accurately are proposed. They are found to improve the fit of the model when compared to symmetric GARCH models. The advantages of accounting for asymmetries are also observed through Value-at-Risk applications. Both theoretical and empirical contributions are provided in Chapter 3 of the thesis. In this chapter the so-called mixture conditional autoregressive range (MCARR) model is introduced, examined and applied to daily price ranges of the Hang Seng Index. The conditions for the strict and weak stationarity of the model, as well as an expression for the autocorrelation function, are obtained by writing the MCARR model as a first-order autoregressive process with random coefficients. The chapter also introduces the inverse gamma (IG) distribution to CARR models. The advantages of the CARR-IG and MCARR-IG specifications over conventional CARR models are found in the empirical application both in- and out-of-sample. Chapter 4 discusses the simultaneous modeling of absolute returns and daily price ranges. In this part of the thesis a vector multiplicative error model (VMEM) with an asymmetric Gumbel copula is found to provide substantial benefits over the existing VMEM models based on elliptical copulas. The proposed specification is able to capture the highly asymmetric dependence of the modeled variables, thereby improving the performance of the model considerably. The economic significance of the results obtained is established by examining the information content of the derived volatility forecasts.
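A minimal sketch of the multiplicative-error idea behind CARR-type models follows, assuming a simple one-component recursion with hand-picked parameters rather than the estimated MCARR or VMEM specifications of the thesis: the observed daily range x_t equals a conditional mean mu_t times a positive error term.

```python
# Conditional-mean recursion of a basic CARR-style multiplicative error model.
import numpy as np

def carr_conditional_mean(x, omega=0.05, alpha=0.2, beta=0.7):
    """Recursion mu_t = omega + alpha * x_{t-1} + beta * mu_{t-1}."""
    mu = np.empty_like(x)
    mu[0] = x.mean()                      # simple initialisation
    for t in range(1, len(x)):
        mu[t] = omega + alpha * x[t - 1] + beta * mu[t - 1]
    return mu

# Hypothetical daily high-low ranges (in %); if the recursion captures the
# range dynamics, the multiplicative errors x_t / mu_t are roughly unit-mean.
rng = np.random.default_rng(7)
ranges = rng.gamma(shape=2.0, scale=0.5, size=250)
mu = carr_conditional_mean(ranges)
print("mean of x_t / mu_t:", (ranges / mu).mean())
```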
Abstract:
The Minimum Description Length (MDL) principle is a general, well-founded theoretical formalization of statistical modeling. The most important notion of MDL is the stochastic complexity, which can be interpreted as the shortest description length of a given sample of data relative to a model class. The exact definition of the stochastic complexity has gone through several evolutionary steps. The latest instantiation is based on the so-called Normalized Maximum Likelihood (NML) distribution, which has been shown to possess several important theoretical properties. However, applications of this modern version of the MDL have been quite rare because of computational complexity problems: for discrete data, the definition of NML involves an exponential sum, and in the case of continuous data, a multi-dimensional integral that is usually infeasible to evaluate or even approximate accurately. In this doctoral dissertation, we present mathematical techniques for computing NML efficiently for some model families involving discrete data. We also show how these techniques can be used to apply MDL in two practical applications: histogram density estimation and clustering of multi-dimensional data.
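As a hedged illustration of the exponential sum in the discrete case, the sketch below computes the NML parametric complexity and the stochastic complexity for the Bernoulli model class, where the sum over all datasets collapses to a sum over the number of successes; this is a textbook-style example, not the dissertation's efficient algorithms.

```python
# NML stochastic complexity for the Bernoulli model class.
import math

def bernoulli_nml_log_complexity(n):
    """log of C_n = sum_k C(n,k) (k/n)^k ((n-k)/n)^(n-k) (parametric complexity)."""
    total = 0.0
    for k in range(n + 1):
        p_hat = k / n
        lik = (p_hat ** k) * ((1 - p_hat) ** (n - k)) if 0 < k < n else 1.0
        total += math.comb(n, k) * lik
    return math.log(total)

def stochastic_complexity(k, n):
    """Code length (nats) of a sequence with k ones out of n under NML."""
    p_hat = k / n
    log_max_lik = (k * math.log(p_hat) if k else 0.0) + \
                  ((n - k) * math.log(1 - p_hat) if n - k else 0.0)
    return -log_max_lik + bernoulli_nml_log_complexity(n)

print(stochastic_complexity(30, 100))   # example: 30 ones in 100 binary observations
```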
Abstract:
Visual acuities at the time of referral and on the day before surgery were compared in 124 patients operated on for cataract in Vaasa Central Hospital, Finland. Preoperative visual acuity and the occurrence of ocular and general disease were compared in samples of consecutive cataract extractions performed in 1982, 1985, 1990, 1995 and 2000 in two hospitals in the Vaasa region of Finland. The repeatability and standard deviation of random measurement error in visual acuity and refractive error determination in a clinical environment in cataractous, pseudophakic and healthy eyes were estimated by re-examining the visual acuity and refractive error of patients referred for cataract surgery or consultation by ophthalmic professionals. Altogether 99 eyes of 99 persons (41 cataractous, 36 pseudophakic and 22 healthy eyes) with a visual acuity range of Snellen 0.3 to 1.3 (0.52 to -0.11 logMAR) were examined. During an average waiting time of 13 months, visual acuity in the study eye decreased from 0.68 logMAR to 0.96 logMAR (from 0.2 to 0.1 in Snellen decimal values). The average decrease in vision was 0.27 logMAR per year. In the fastest quartile, the visual acuity change per year was 0.75 logMAR, and in the second fastest 0.29 logMAR; the third and fourth quartiles were virtually unaffected. From 1982 to 2000, the incidence of cataract surgery increased from 1.0 to 7.2 operations per 1000 inhabitants per year in the Vaasa region. The average preoperative visual acuity in the operated eye improved by 0.85 logMAR (in decimal values from 0.03 to 0.2) and in the better eye by 0.27 logMAR (in decimal values from 0.23 to 0.43) over this period. The proportion of patients profoundly visually handicapped (VA in the better eye <0.1) before the operation fell from 15% to 4%, and that of patients less profoundly visually handicapped (VA in the better eye 0.1 to <0.3) from 47% to 15%. The repeatability of visual acuity measurement, estimated as a coefficient of repeatability for all 99 eyes, was ±0.18 logMAR, and the standard deviation of measurement error was 0.06 logMAR. Eyes with the lowest visual acuity (0.3-0.45) had the largest variability, with a coefficient of repeatability of ±0.24 logMAR, and eyes with a visual acuity of 0.7 or better had the smallest, ±0.12 logMAR. The repeatability of refractive error measurement was studied in the same patient material as the repeatability of visual acuity. Differences between measurements 1 and 2 were calculated as three-dimensional vector values and spherical equivalents and expressed as coefficients of repeatability. The coefficients of repeatability for all eyes for the vertical, torsional and horizontal vectors were ±0.74 D, ±0.34 D and ±0.93 D, respectively, and for the spherical equivalent for all eyes ±0.74 D. Eyes with lower visual acuity (0.3-0.45) had larger variability in vector and spherical equivalent values (±1.14 D), but the difference between visual acuity groups was not statistically significant. The difference in the mean defocus equivalent between measurements 1 and 2 was, however, significantly greater in the lower visual acuity group. If a change of ±0.5 D (measured in defocus equivalents) is accepted as a basis for a change of spectacles for eyes with good vision, the corresponding basis for eyes in the visual acuity range of 0.3-0.65 would be ±1 D. Differences in repeated visual acuity measurements are partly explained by errors in refractive error measurements.
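A minimal sketch, under the common Bland-Altman convention, of computing a coefficient of repeatability from two repeated visual acuity measurements per eye (1.96 times the standard deviation of the within-eye differences); the logMAR values below are invented, and the exact convention used in the thesis is not reproduced here.

```python
# Coefficient of repeatability from paired repeated measurements.
import numpy as np

def coefficient_of_repeatability(first, second):
    """1.96 * SD of the paired differences (Bland-Altman convention assumed)."""
    diffs = np.asarray(first) - np.asarray(second)
    return 1.96 * diffs.std(ddof=1)

va_1 = np.array([0.10, 0.30, 0.00, 0.52, 0.22])   # logMAR, measurement 1 (hypothetical)
va_2 = np.array([0.16, 0.22, 0.06, 0.46, 0.30])   # logMAR, measurement 2 (hypothetical)
print("CoR = ±%.2f logMAR" % coefficient_of_repeatability(va_1, va_2))
```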
Abstract:
Data assimilation provides an initial atmospheric state, called the analysis, for Numerical Weather Prediction (NWP). This analysis consists of pressure, temperature, wind, and humidity on a three-dimensional NWP model grid. Data assimilation blends meteorological observations with the NWP model in a statistically optimal way. The objective of this thesis is to describe methodological development carried out in order to allow data assimilation of ground-based measurements of the Global Positioning System (GPS) into the High Resolution Limited Area Model (HIRLAM) NWP system. Geodetic processing produces observations of tropospheric delay. These observations can be processed either for vertical columns at each GPS receiver station, or for the individual propagation paths of the microwave signals. These alternative processing methods result in Zenith Total Delay (ZTD) and Slant Delay (SD) observations, respectively. ZTD and SD observations are of use in the analysis of atmospheric humidity. A method is introduced for estimating the horizontal error covariance of ZTD observations. The method makes use of observation minus model background (OmB) sequences of ZTD and conventional observations. It is demonstrated that the ZTD observation error covariance is relatively large at station separations shorter than 200 km, but non-zero covariances also appear at considerably larger station separations. The relatively low density of radiosonde observing stations limits the ability of the proposed estimation method to resolve the shortest length scales of the error covariance. SD observations are shown to contain a statistically significant signal on the asymmetry of the atmospheric humidity field. However, the asymmetric component of SD is found to be nearly always smaller than the standard deviation of the SD observation error. SD observation modelling is described in detail, and other issues relating to SD data assimilation are also discussed. These include the determination of error statistics, the tuning of observation quality control and allowing local observation error correlations to be taken into account. The experiments made show that the data assimilation system is able to retrieve the asymmetric information content of hypothetical SD observations at a single receiver station. Moreover, the impact of real SD observations on the humidity analysis is comparable to that of other observing systems.
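As a hedged illustration of estimating a horizontal error covariance from OmB statistics, the sketch below bins station-pair OmB covariances by separation distance; the station coordinates and OmB series are synthetic placeholders, and the actual method, which also draws on conventional observations, is more involved.

```python
# Binned station-pair covariance of OmB time series as a function of distance.
import numpy as np

def binned_omb_covariance(coords_km, omb, bin_edges_km):
    """Mean pairwise OmB covariance binned by station separation distance."""
    n_stations = omb.shape[0]
    cov = np.cov(omb)                                   # station-pair covariances over time
    dists = np.linalg.norm(coords_km[:, None, :] - coords_km[None, :, :], axis=-1)
    result = []
    for lo, hi in zip(bin_edges_km[:-1], bin_edges_km[1:]):
        mask = (dists >= lo) & (dists < hi) & ~np.eye(n_stations, dtype=bool)
        result.append(cov[mask].mean() if mask.any() else np.nan)
    return np.array(result)

rng = np.random.default_rng(3)
coords = rng.uniform(0, 500, size=(30, 2))              # 30 stations in a 500 km domain
omb_series = rng.normal(0, 5, size=(30, 365))           # daily ZTD OmB values (mm)
print(binned_omb_covariance(coords, omb_series, np.array([0, 100, 200, 400])))
```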