909 results for Bias correction


Relevance: 60.00%

Abstract:

Climatic changes are most pronounced in northern high-latitude regions. Yet observational data are sparse, both spatially and temporally, so regional-scale dynamics are not fully captured, limiting our ability to make reliable projections. In this study, a set of dynamical downscaling products was created for the period 1950 to 2100 to better understand climate change and its impacts on hydrology, permafrost, and ecosystems at a resolution suitable for northern Alaska. The ERA-Interim reanalysis and the Community Earth System Model (CESM) served as the forcing data in this dynamical downscaling framework, and the Weather Research and Forecasting (WRF) model with optimizations for the Arctic (Polar WRF) served as the Regional Climate Model (RCM). The downscaled output consists of multiple climatic variables (precipitation, temperature, wind speed, dew point temperature, and surface air pressure) on a 10 km grid at three-hour intervals. The modeling products were evaluated and calibrated using a bias-correction approach. The ERA-Interim-forced WRF (ERA-WRF) produced reasonable climatic variables, with the temperature field correlating more closely with the forcing and observational data than the precipitation field when long-term monthly climatologies were compared. A linear scaling method, based on the ERA-Interim monthly climatology, then further corrected the bias, and the bias-corrected ERA-WRF fields served as the reference for calibrating both the historical and the projected CESM-forced WRF (CESM-WRF) products. Biases that CESM holds over northern Alaska, such as a cold bias in summer temperature, a warm bias in winter temperature, and a wet bias in annual precipitation, persisted in the CESM-WRF runs. Linear scaling of CESM-WRF, together with the calibrated ERA-WRF run, ultimately produced high-resolution downscaling products for the Alaskan North Slope for hydrological and ecological research, and their potential applications extend well beyond these uses. Further climatic research has been proposed, including exploration of historical and projected climatic extreme events and their possible connections to low-frequency sea-atmosphere oscillations, as well as near-surface permafrost degradation and shifts in lake ice regimes. These dynamically downscaled, bias-corrected climatic datasets provide the improved spatial and temporal resolution necessary for ongoing modeling efforts in northern Alaska focused on reconstructing and projecting hydrologic changes, ecosystem processes and responses, and permafrost thermal regimes. The dynamical downscaling methods presented in this study can also be used to create more suitable model input datasets for other sub-regions of the Arctic.
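The linear scaling step can be illustrated with a short sketch. The function name, the pandas-based monthly grouping, and the choice of an additive correction for temperature and a multiplicative one for precipitation are illustrative assumptions, not the study's actual code.

```python
import pandas as pd

def linear_scaling(model, reference, variable="temperature"):
    """Monthly linear-scaling bias correction (illustrative sketch).

    model, reference: pandas Series with a DatetimeIndex covering a common
    baseline period (e.g. WRF output and the ERA-Interim forcing).
    Returns a corrected copy of `model`.
    """
    corrected = model.copy()
    for month in range(1, 13):
        m = model[model.index.month == month]
        r = reference[reference.index.month == month]
        if variable == "precipitation":
            # multiplicative factor keeps corrected precipitation non-negative
            corrected[corrected.index.month == month] = m * (r.mean() / max(m.mean(), 1e-9))
        else:
            # additive offset for temperature-like variables
            corrected[corrected.index.month == month] = m + (r.mean() - m.mean())
    return corrected
```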

Relevance: 60.00%

Abstract:

In large epidemiological studies, missing data can be a problem, especially if information is sought on a sensitive topic or when a composite measure is calculated from several variables each affected by missing values. Multiple imputation is the method of choice for 'filling in' missing data based on associations among variables. Using an example about body mass index from the Australian Longitudinal Study on Women's Health, we identify a subset of variables that are particularly useful for imputing values for the target variables. We then illustrate two uses of multiple imputation. The first is to examine and correct for bias when data are not missing completely at random. The second is to impute missing values for an important covariate; in this case, omitting variables to be used in the analysis from the imputation process may introduce bias. We conclude with several recommendations for handling issues of missing data. Copyright (C) 2004 John Wiley & Sons, Ltd.
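As a hedged sketch of the general workflow (not the paper's actual analysis, which depends on variables specific to the Australian Longitudinal Study on Women's Health), chained-equation imputation can be run several times and the results combined; the column names `bmi` and `age` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LinearRegression

def multiply_impute_slope(df, target="bmi", predictor="age", m=5):
    """Create m completed datasets and combine a regression slope (sketch).

    df must contain only numeric columns; the imputation model uses the
    associations among all of them to fill in missing values.
    """
    slopes = []
    for seed in range(m):
        imputer = IterativeImputer(sample_posterior=True, random_state=seed)
        completed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
        fit = LinearRegression().fit(completed[[predictor]], completed[target])
        slopes.append(fit.coef_[0])
    # point estimate is the average across imputations; Rubin's rules would
    # also combine within- and between-imputation variances
    return float(np.mean(slopes)), float(np.var(slopes, ddof=1))
```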

Relevance: 60.00%

Abstract:

Surveys can collect important data that inform policy decisions and drive social science research. Large government surveys collect information from the U.S. population on a wide range of topics, including demographics, education, employment, and lifestyle. Analysis of survey data presents unique challenges. In particular, one needs to account for missing data, for complex sampling designs, and for measurement error. Conceptually, a survey organization could devote substantial resources to obtaining high-quality responses from a simple random sample, resulting in survey data that are easy to analyze. However, this scenario often is not realistic. To address these practical issues, survey organizations can leverage the information available from other sources of data. For example, in longitudinal studies that suffer from attrition, they can use the information from refreshment samples to correct for potential attrition bias. They can use information from known marginal distributions or survey design to improve inferences. They can use information from gold standard sources to correct for measurement error.

This thesis presents novel approaches to combining information from multiple sources that address the three problems described above.

The first method addresses nonignorable unit nonresponse and attrition in a panel survey with a refreshment sample. Panel surveys typically suffer from attrition, which can lead to biased inference when basing analysis only on cases that complete all waves of the panel. Unfortunately, the panel data alone cannot inform the extent of the bias due to attrition, so analysts must make strong and untestable assumptions about the missing data mechanism. Many panel studies also include refreshment samples, which are data collected from a random sample of new individuals during some later wave of the panel. Refreshment samples offer information that can be utilized to correct for biases induced by nonignorable attrition while reducing reliance on strong assumptions about the attrition process. To date, these bias correction methods have not dealt with two key practical issues in panel studies: unit nonresponse in the initial wave of the panel and in the refreshment sample itself. As we illustrate, nonignorable unit nonresponse can significantly compromise the analyst's ability to use the refreshment samples for attrition bias correction. Thus, it is crucial for analysts to assess how sensitive their inferences, corrected for panel attrition, are to different assumptions about the nature of the unit nonresponse. We present an approach that facilitates such sensitivity analyses, both for suspected nonignorable unit nonresponse in the initial wave and in the refreshment sample. We illustrate the approach using simulation studies and an analysis of data from the 2007-2008 Associated Press/Yahoo News election panel study.

The second method incorporates informative prior beliefs about marginal probabilities into Bayesian latent class models for categorical data. The basic idea is to append synthetic observations to the original data such that (i) the empirical distributions of the desired margins match those of the prior beliefs, and (ii) the values of the remaining variables are left missing. The degree of prior uncertainty is controlled by the number of augmented records. Posterior inferences can be obtained via typical MCMC algorithms for latent class models, tailored to deal efficiently with the missing values in the concatenated data. We illustrate the approach using a variety of simulations based on data from the American Community Survey, including an example of how augmented records can be used to fit latent class models to data from stratified samples.
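A minimal sketch of the augmentation step for a single margin, assuming a plain pandas DataFrame of categorical variables; the thesis embeds this in Bayesian latent class models fit by MCMC, and the names below are hypothetical.

```python
import numpy as np
import pandas as pd

def augment_with_prior(data, variable, margin_probs, n_aug):
    """Append synthetic records encoding a prior belief about one margin.

    margin_probs: dict mapping each category of `variable` to its prior
    probability. The synthetic records reproduce that margin exactly (up to
    rounding) and leave every other variable missing; a larger n_aug
    expresses a stronger prior.
    """
    categories = list(margin_probs)
    probs = np.array([margin_probs[c] for c in categories], dtype=float)
    counts = np.rint(n_aug * probs / probs.sum()).astype(int)
    values = np.repeat(categories, counts)
    synthetic = pd.DataFrame({col: [pd.NA] * len(values) for col in data.columns})
    synthetic[variable] = values
    return pd.concat([data, synthetic], ignore_index=True)
```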

The third method leverages the information from a gold standard survey to model reporting error. Survey data are subject to reporting error when respondents misunderstand the question or accidentally select the wrong response. Sometimes survey respondents knowingly select the wrong response, for example, by reporting a higher level of education than they actually have attained. We present an approach that allows an analyst to model reporting error by incorporating information from a gold standard survey. The analyst can specify various reporting error models and assess how sensitive their conclusions are to different assumptions about the reporting error process. We illustrate the approach using simulations based on data from the 1993 National Survey of College Graduates. We use the method to impute error-corrected educational attainments in the 2010 American Community Survey using the 2010 National Survey of College Graduates as the gold standard survey.

Relevance: 60.00%

Abstract:

A regional cross-calibration between the first Delay-Doppler altimetry dataset from CryoSat-2 and a retracked Envisat dataset is presented here, in order to test the benefits of the Delay-Doppler processing and to extend the Envisat time series in the coastal ocean. The Indonesian Seas are chosen for the calibration, since the availability of altimetry data in this region is particularly beneficial given the lack of in-situ measurements and the region's importance for the global ocean circulation. The Envisat data in the region are retracked with the Adaptive Leading Edge Subwaveform (ALES) retracker, which has previously been validated and applied successfully to coastal sea level research. The study demonstrates that CryoSat-2 decreases the 1-Hz noise of sea level estimates by 0.3 cm within 50 km of the coast, when compared to the ALES-reprocessed Envisat dataset. It also shows that Envisat can be used with confidence for detailed oceanographic research after the orbit change of October 2010. Cross-calibration at the crossover points indicates that, in the region of study, a sea state bias correction equal to 5% of the significant wave height is an acceptable approximation for Delay-Doppler altimetry. The analysis of the joint sea level time series reveals the geographic extent of the semiannual signal caused by Kelvin waves during the monsoon transitions, the larger amplitudes of the annual signal due to the Java Coastal Current, and the impact of the strong La Niña event of 2010 on rising sea level trends.
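As a back-of-the-envelope illustration of that 5% rule with hypothetical numbers (the sign convention for applying the correction depends on the processing chain; here the bias is assumed to lower the measured surface, so its magnitude is added back):

```python
import numpy as np

swh = np.array([1.2, 1.8, 2.5, 3.1])          # significant wave height [m]
sla_raw = np.array([0.12, 0.10, 0.15, 0.18])  # uncorrected sea level anomaly [m]

ssb = 0.05 * swh                              # sea state bias magnitude: 6-16 cm
sla_corrected = sla_raw + ssb                 # assumed sign convention, see above
print(np.round(ssb, 3))
```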


Relevance: 60.00%

Abstract:

Global climate change is predicted to affect the frequency and severity of flood events. In this study, output from General Circulation Models (GCMs) for a range of possible future climate scenarios was used to force hydrologic models for four case-study watersheds built using the Soil and Water Assessment Tool (SWAT). GCM output was applied with either the "delta change" method or a bias correction. Potential changes in flood risk are assessed based on the modeling results and possible relationships to watershed characteristics. Differences in model outputs arising from the two methods of adjusting GCM output are also compared. Preliminary results indicate that watersheds exhibiting higher proportions of runoff in streamflow are more vulnerable to changes in flood risk. The delta change method appears more useful for simulating extreme events because it preserves daily climate variability better than bias-corrected GCM output does.
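A minimal sketch of the delta change idea, assuming monthly-capable pandas Series rather than the actual SWAT forcing preparation: observed weather is perturbed with monthly GCM change factors, additive for temperature-like variables and multiplicative for precipitation, so the day-to-day variability comes entirely from the observations.

```python
import pandas as pd

def delta_change(obs, gcm_hist, gcm_fut, multiplicative=False):
    """Perturb an observed series with monthly GCM change factors (sketch)."""
    out = obs.copy()
    for month in range(1, 13):
        hist = gcm_hist[gcm_hist.index.month == month].mean()
        fut = gcm_fut[gcm_fut.index.month == month].mean()
        mask = obs.index.month == month
        if multiplicative:
            out[mask] = obs[mask] * (fut / hist)   # e.g. precipitation
        else:
            out[mask] = obs[mask] + (fut - hist)   # e.g. temperature
    return out
```

By contrast, the bias-correction route adjusts the GCM series itself toward observations, so the projected daily sequence, and hence its variability, is the GCM's own.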

Relevance: 60.00%

Abstract:

Strong convective events can produce extreme precipitation, hail, lightning or gusts, potentially inducing severe socio-economic impacts. These events have a relatively small spatial extent and, in most cases, a short lifetime. In this study, a model is developed for estimating convective extreme events from large-scale conditions. It is shown that strong convective events can be characterized by a Weibull distribution of radar-based rainfall with a low shape and a high scale parameter value. A radius of 90 km around a station reporting a convective situation turned out to be suitable. A methodology is developed to estimate the Weibull parameters, and thus the occurrence probability of convective events, from large-scale atmospheric instability and enhanced near-surface humidity, which are usually found on a larger scale than the convective event itself. Here, the probability of the occurrence of extreme convective events is estimated from the KO index, indicating stability, and relative humidity at 1000 hPa. Both variables are computed from the ERA-Interim reanalysis. In a first version of the methodology, these two variables are used to estimate the spatial rainfall distribution and, from it, the occurrence of a convective event. The method shows significant skill in estimating the occurrence of convective events as recorded by synoptic stations, lightning measurements, and severe weather reports. In order to take frontal influences into account, a scheme for the detection of atmospheric fronts is implemented. While generally higher instability is found in the vicinity of fronts, the skill of this approach is largely unchanged. Additional improvements were achieved with a bias correction and the use of ERA-Interim precipitation. The resulting estimation method is applied to the ERA-Interim period (1979-2014) to establish a ranking of estimated convective extreme events. Two strong estimated events that reveal a frontal influence are analysed in detail. As a second application, the method is applied to GCM-based decadal predictions for the period 1979-2014, which were initialized every year. It is shown that decadal predictive skill for convective event frequencies over Germany is found for the first 3-4 years after initialization.
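A sketch of the Weibull characterization step, assuming radar-based rain rates within 90 km of a station have already been collected into an array; the numbers are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical radar-based rain rates [mm/h] within 90 km of a station
rain = np.array([0.4, 1.1, 2.7, 5.3, 8.9, 14.2, 22.5, 0.8, 3.6, 40.1])

# Fit a two-parameter Weibull distribution (location fixed at zero)
shape, _, scale = stats.weibull_min.fit(rain[rain > 0], floc=0)

# Strong convective situations are flagged by a low shape and a high scale
# parameter, i.e. a heavy-tailed rainfall distribution around the station
print(f"shape={shape:.2f}, scale={scale:.2f}")
```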

Relevance: 40.00%

Abstract:

In a statistical downscaling model, it is important to remove the bias of General Circulation Model (GCM) outputs resulting from various assumptions about the geophysical processes. One conventional method for correcting such bias is standardisation, which is used prior to statistical downscaling to reduce systematic bias in the means and variances of GCM predictors relative to observations or National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) reanalysis data. A major drawback of standardisation is that, while it may reduce the bias in the mean and variance of a predictor variable, it is much harder to accommodate bias in the large-scale patterns of atmospheric circulation in GCMs (e.g. shifts in the dominant storm track relative to observed data) or unrealistic inter-variable relationships. When predicting hydrologic scenarios, such uncorrected bias must be taken care of; otherwise it will propagate into the computations for subsequent years. In this study, a statistical method based on an equi-probability transformation is applied after downscaling to remove the bias in the predicted hydrologic variable relative to the observed hydrologic variable over a baseline period. The model is applied to the prediction of monsoon streamflow of the Mahanadi River in India from GCM-generated large-scale climatological data.
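The equi-probability transformation amounts to an empirical quantile mapping between the predicted and observed variable over the baseline period; a hedged sketch with illustrative names:

```python
import numpy as np

def equiprobability_transform(predicted, predicted_baseline, observed_baseline):
    """Map each predicted value to the observed quantile with the same
    non-exceedance probability in the baseline period (sketch)."""
    sorted_baseline = np.sort(predicted_baseline)
    # empirical CDF position of each predicted value in the model baseline
    probs = np.searchsorted(sorted_baseline, predicted, side="right") / len(sorted_baseline)
    # corresponding quantiles of the observed baseline distribution
    return np.quantile(observed_baseline, probs)
```

Applied month by month (or season by season), this removes the residual bias before the downscaled flows are carried forward into subsequent years.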

Relevance: 30.00%

Abstract:

Quantum computing offers powerful new techniques for speeding up the calculation of many classically intractable problems. Quantum algorithms can allow for the efficient simulation of physical systems, with applications to basic research, chemical modeling, and drug discovery; other algorithms have important implications for cryptography and internet security.

At the same time, building a quantum computer is a daunting task, requiring the coherent manipulation of systems with many quantum degrees of freedom while preventing environmental noise from interacting too strongly with the system. Fortunately, we know that, under reasonable assumptions, we can use the techniques of quantum error correction and fault tolerance to achieve an arbitrary reduction in the noise level.

In this thesis, we look at how additional information about the structure of noise, or "noise bias," can improve or alter the performance of techniques in quantum error correction and fault tolerance. In Chapter 2, we explore the possibility of designing certain quantum gates to be extremely robust with respect to errors in their operation. This naturally leads to structured noise where certain gates can be implemented in a protected manner, allowing the user to focus their protection on the noisier unprotected operations.

In Chapter 3, we examine how to tailor error-correcting codes and fault-tolerant quantum circuits in the presence of dephasing biased noise, where dephasing errors are far more common than bit-flip errors. By using an appropriately asymmetric code, we demonstrate the ability to improve the amount of error reduction and decrease the physical resources required for error correction.
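A toy numerical illustration of why noise bias matters (not the codes or circuits analysed in the chapter): under dephasing-dominated noise, even the 3-qubit phase-flip repetition code, which corrects a single Z error but converts any single X error into a logical phase error, can outperform an unencoded qubit.

```python
import numpy as np

def logical_error_rate(p_z, p_x, trials=200_000, seed=0):
    """Monte Carlo estimate for the 3-qubit phase-flip repetition code.

    A logical error occurs if two or more qubits suffer Z errors
    (majority-vote correction fails) or an odd number of qubits suffer
    X errors (each single X acts as a logical phase flip).
    """
    rng = np.random.default_rng(seed)
    z_errs = rng.random((trials, 3)) < p_z
    x_errs = rng.random((trials, 3)) < p_x
    fail = (z_errs.sum(axis=1) >= 2) | (x_errs.sum(axis=1) % 2 == 1)
    return fail.mean()

p_z, p_x = 1e-2, 1e-4          # strongly dephasing-biased noise
print("unencoded ~", p_z + p_x)
print("encoded   ~", logical_error_rate(p_z, p_x))
```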

In Chapter 4, we analyze a variety of protocols for distilling magic states, which enable universal quantum computation, in the presence of faulty Clifford operations. Here again there is a hierarchy of noise levels, with a fixed error rate for faulty gates and a second rate for errors in the distilled states, which decreases as the states are distilled to better quality. The interplay of these different rates sets limits on the achievable distillation and on how quickly states converge to that limit.

Relevance: 30.00%

Abstract:

A major problem with the engineering entrance examination is the exclusion of certain sections of society along social, economic, regional, and gender dimensions. This has seldom been analysed with a view to policy correction. To lessen this problem, a minor policy shift was made in 2011, giving academic marks and entrance marks a 50-50 weighting. The impact of this change is yet to be scrutinized. The data for the study are obtained from the Nodal Centre of Kerala, functioning at Cochin University of Science and Technology under the National Technical Manpower Information System, and are also estimated from the Centralized Allotment Process. The article focuses on two aspects of exclusion based on the engineering entrance examination: gender-centred and caste-linked. Rank order spectral density and the Lorenz ratio are used to examine exclusion and inequality at the community and gender levels across various performance scales. The article shows that social status, coupled with the economic ability to afford quality education, appears to have a significant influence on students' performance in the Kerala engineering entrance examinations. It also shows a wide gender disparity in performance at the high-ranking levels, irrespective of social group.
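The Lorenz ratio (Gini coefficient) used in the article can be computed from any non-negative distribution of marks or representation counts; the data in this sketch are hypothetical.

```python
import numpy as np

def lorenz_ratio(values):
    """Gini coefficient from a sample of non-negative values (sketch)."""
    v = np.sort(np.asarray(values, dtype=float))
    n = len(v)
    # Lorenz curve: cumulative share of the total held by the lowest i/n units
    lorenz = np.concatenate(([0.0], np.cumsum(v) / v.sum()))
    # Gini = 1 - 2 * area under the Lorenz curve (trapezoidal approximation)
    area = ((lorenz[1:] + lorenz[:-1]) / 2.0).sum() / n
    return 1.0 - 2.0 * area

# Hypothetical counts of qualified candidates across five social groups
print(round(lorenz_ratio([5, 12, 30, 80, 200]), 3))
```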

Relevance: 30.00%

Abstract:

Subjective measures of health tend to suffer from bias due to reporting heterogeneity. However, some methodologies can be used to correct this bias so that self-assessed health can be compared across respondents with different sociodemographic characteristics. One such method is the hierarchical ordered probit (HOPIT), which incorporates ratings of vignettes (hypothetical individuals with a fixed health state) and requires two assumptions to hold: vignette equivalence and response consistency. This methodology is applied to self-reported work disability for a sample from the United States for 2011. The results show that, even though sociodemographic variables influence rating scales, adjusting for this does not change their effect on work disability, which is influenced only by income. The inclusion of variables related to ethnicity or place of birth does not influence true work disability; however, when only one of them is excluded, it becomes significant and affects the true level of work disability as well as income.

Relevance: 30.00%

Abstract:

Estimation of a population size by means of capture-recapture techniques is an important problem occurring in many areas of the life and social sciences. We consider the frequencies-of-frequencies situation, in which a count variable is used to summarize how often a unit has been identified in the target population of interest. The distribution of this count variable is zero-truncated, since zero identifications do not occur in the sample. As an application we consider the surveillance of scrapie in Great Britain. In this case study, holdings with scrapie that are not identified (zero counts) do not enter the surveillance database. The count variable of interest is the number of scrapie cases per holding. For count distributions a common model is the Poisson distribution, and, to adjust for potential heterogeneity, a discrete mixture of Poisson distributions is used. Mixtures of Poissons usually provide an excellent fit, as will be demonstrated in the application of interest. However, as has recently been demonstrated, mixtures also suffer from the so-called boundary problem, resulting in overestimation of population size. It is suggested here that the mixture model be selected on the basis of the Bayesian Information Criterion. This strategy is further refined by employing a bagging procedure, leading to a series of estimates of population size. Using the median of this series, highly influential size estimates are avoided. In limited simulation studies, it is shown that the procedure leads to estimates with remarkably small bias.
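A minimal sketch of the simplest building block, a single zero-truncated Poisson fit with a Horvitz-Thompson style correction for the unobserved zero class; the paper's full procedure instead uses Poisson mixtures selected by BIC plus bagging, and the data here are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq

def population_size_ztp(counts):
    """Estimate total population size from zero-truncated Poisson counts.

    counts: array of positive identification counts (e.g. scrapie cases per
    observed holding). Units with zero counts are unobserved. Requires the
    sample mean to exceed 1 so that the MLE equation has a root.
    """
    counts = np.asarray(counts, dtype=float)
    n, mean = len(counts), counts.mean()
    # MLE of lambda solves  lambda / (1 - exp(-lambda)) = sample mean
    lam = brentq(lambda l: l / (1.0 - np.exp(-l)) - mean, 1e-9, 1e3)
    p_observed = 1.0 - np.exp(-lam)          # probability of at least one case
    return n / p_observed                    # Horvitz-Thompson style estimate

# Hypothetical frequencies-of-frequencies data: most holdings are seen once
counts = np.repeat([1, 2, 3, 4], [60, 20, 8, 2])
print(round(population_size_ztp(counts)))
```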

Relevance: 30.00%

Abstract:

Current commercially available Doppler lidars provide an economical and robust solution for measuring vertical and horizontal wind velocities, together with the ability to provide co- and cross-polarised backscatter profiles. The high temporal resolution of these instruments allows turbulent properties to be obtained from the variation in radial velocities. However, the instrument specifications mean that certain characteristics, especially the background noise behaviour, become a limiting factor for instrument sensitivity in regions where the aerosol load is low. Turbulence calculations require an accurate estimate of the contribution from velocity uncertainty, which is directly related to the signal-to-noise ratio. Any bias in the signal-to-noise ratio will propagate through as a bias in the turbulent properties. In this paper we present a method to correct for artefacts in the background noise behaviour of commercially available Doppler lidars and to reduce the signal-to-noise ratio threshold used to discriminate between noise and cloud or aerosol signals. We show that, for Doppler lidars operating continuously at a number of locations in Finland, data availability can be increased by as much as 50% after performing this background correction and the subsequent reduction in the threshold. The reduction in bias also greatly improves subsequent calculations of turbulent properties in weak-signal regimes.
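A hedged sketch of the kind of per-profile background correction described here, under the assumption that the uppermost range gates are aerosol- and cloud-free; the paper's actual method works on the raw background behaviour of the specific instruments, and the parameter values below are illustrative.

```python
import numpy as np

def correct_background(snr, n_top_gates=100, threshold=1.005):
    """Remove a smooth background artefact from lidar SNR profiles (sketch).

    snr: 2-D array (time, range) of signal-to-noise ratio, where pure noise
    has an expected value of 1. The uppermost `n_top_gates` gates are assumed
    signal-free and used to fit a linear background for each profile.
    Returns the corrected SNR and a signal mask using a reduced threshold.
    """
    n_gates = snr.shape[1]
    gates = np.arange(n_gates)
    top = gates[-n_top_gates:]
    corrected = np.empty_like(snr, dtype=float)
    for i, profile in enumerate(snr):
        # linear fit to the signal-free region captures the range-dependent bias
        slope, intercept = np.polyfit(top, profile[top], 1)
        corrected[i] = profile - (slope * gates + intercept) + 1.0  # restore noise level of 1
    return corrected, corrected > threshold
```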