971 results for Predictive Mean Squared Efficiency


Relevance: 100.00%

Abstract:

Attrition in longitudinal studies can lead to biased results. The study is motivated by the unexpected observation that alcohol consumption decreased despite increased availability, which may be due to sample attrition of heavy drinkers. Several imputation methods have been proposed, but rarely compared in longitudinal studies of alcohol consumption. The imputation of consumption level measurements is computationally particularly challenging because alcohol consumption is a semi-continuous variable (dichotomous drinking status and continuous volume among drinkers) and the data in the continuous part are non-normal. Data come from a longitudinal study in Denmark with four waves (2003-2006) and 1771 individuals at baseline. Five techniques for missing data are compared: last value carried forward (LVCF) as a single imputation method, and Hotdeck, Heckman modelling, multivariate imputation by chained equations (MICE), and a Bayesian approach as multiple imputation methods. Predictive mean matching was used to account for non-normality: instead of imputing regression estimates, "real" observed values from similar cases are imputed. The methods were also compared on a simulated dataset. The simulation showed that the Bayesian approach yielded the least biased imputation estimates. The finding of no increase in consumption levels despite higher availability remained unaltered. Copyright (C) 2011 John Wiley & Sons, Ltd.
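The predictive mean matching step described above can be sketched in a few lines. This is a minimal illustration (one linear model, the k nearest donors by predicted mean), not the article's full multiple-imputation procedure; the function name and the donor-pool size are our assumptions:

```python
import numpy as np

def pmm_impute(y, X, missing, k=5, rng=None):
    """Impute missing y values by predictive mean matching: fit a
    linear model on complete cases, then for each missing case copy
    the observed y of one of the k donors whose predicted means are
    closest to the case's own predicted mean."""
    rng = np.random.default_rng(rng)
    obs = ~missing
    # Least-squares fit on complete cases (intercept included).
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd[obs], y[obs], rcond=None)
    pred = Xd @ beta
    y_imp = y.copy()
    for i in np.where(missing)[0]:
        # Indices (within the observed subset) of the k closest predicted means.
        donors = np.argsort(np.abs(pred[obs] - pred[i]))[:k]
        y_imp[i] = y[obs][rng.choice(donors)]
    return y_imp
```

Because the imputed values are drawn from actually observed responses, the method cannot produce impossible values (e.g. negative consumption volumes), which is the point made above for non-normal data.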

Abstract:

The quantitative estimation of Sea Surface Temperatures from fossil assemblages is a fundamental issue in palaeoclimatic and palaeoceanographic investigations. The Modern Analogue Technique, a widely adopted method based on direct comparison of fossil assemblages with modern coretop samples, was revised with the aim of conforming it to compositional data analysis. The new CODAMAT method was developed by adopting the Aitchison metric as distance measure. Modern coretop datasets are characterised by a large amount of zeros. The zero replacement was carried out by adopting a Bayesian approach, based on a posterior estimation of the parameter of the multinomial distribution. The number of modern analogues from which to reconstruct the SST was determined by means of a multiple approach considering the proxies correlation matrix, Standardized Residual Sum of Squares and Mean Squared Distance. This new CODAMAT method was applied to the planktonic foraminiferal assemblages of a core recovered in the Tyrrhenian Sea. Key words: Modern analogues, Aitchison distance, proxies correlation matrix, Standardized Residual Sum of Squares
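The Aitchison metric adopted by CODAMAT is the Euclidean distance between the centred log-ratio (clr) transforms of two compositions. A minimal sketch, assuming strictly positive parts (i.e. after the zero replacement described above):

```python
import numpy as np

def aitchison_distance(x, y):
    """Aitchison distance between two compositions (strictly positive
    vectors with an arbitrary constant sum): the Euclidean distance
    between their centred log-ratio (clr) transforms."""
    clr = lambda v: np.log(v) - np.log(v).mean()
    return np.linalg.norm(clr(np.asarray(x, float)) -
                          clr(np.asarray(y, float)))
```

A key property, and the reason the metric suits compositional data, is scale invariance: rescaling one composition (e.g. counts vs. percentages) leaves the distance unchanged.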

Abstract:

Nonlinear regression problems can often be reduced to linearity by transforming the response variable (e.g., using the Box-Cox family of transformations). The classic estimates of the parameter defining the transformation as well as of the regression coefficients are based on the maximum likelihood criterion, assuming homoscedastic normal errors for the transformed response. These estimates are nonrobust in the presence of outliers and can be inconsistent when the errors are nonnormal or heteroscedastic. This article proposes new robust estimates that are consistent and asymptotically normal for any unimodal and homoscedastic error distribution. For this purpose, a robust version of conditional expectation is introduced for which the prediction mean squared error is replaced with an M scale. This concept is then used to develop a nonparametric criterion to estimate the transformation parameter as well as the regression coefficients. A finite sample estimate of this criterion based on a robust version of smearing is also proposed. Monte Carlo experiments show that the new estimates compare favorably with respect to the available competitors.
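For context, the classic (non-robust) maximum-likelihood Box-Cox fit that the article takes as its baseline can be illustrated with `scipy.stats.boxcox`. The log-normal sample below is an assumption for the demo, chosen so that the true transformation parameter is 0 (the log transform):

```python
import numpy as np
from scipy import stats

# Log-normal sample: the true Box-Cox parameter is 0 (log transform).
rng = np.random.default_rng(42)
y = np.exp(rng.normal(loc=1.0, scale=0.4, size=500))

# Classic maximum-likelihood estimate of lambda; the transform is
# (y**lam - 1)/lam for lam != 0 and log(y) at lam = 0.
y_t, lam = stats.boxcox(y)
```

The article's point is that this likelihood-based choice of `lam` can be badly distorted by a few outliers in `y`, which motivates replacing the mean squared prediction error with an M scale.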

Abstract:

We compare a set of empirical Bayes and composite estimators of the population means of the districts (small areas) of a country, and show that the natural modelling strategy of searching for a well fitting empirical Bayes model and using it for estimation of the area-level means can be inefficient.
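A composite estimator of the kind compared above is, in its simplest form, a convex combination of the direct area estimate and a synthetic one. The sketch below is a generic textbook form (with the overall weighted mean as the synthetic component), not the paper's specific estimators:

```python
import numpy as np

def composite_estimate(area_means, area_ns, gamma):
    """Composite small-area estimator: a convex combination of the
    direct area mean and a synthetic estimator (here the overall mean
    pooled across areas). gamma = 1 recovers the direct estimator,
    gamma = 0 the fully synthetic one."""
    area_means = np.asarray(area_means, float)
    area_ns = np.asarray(area_ns, float)
    overall = np.average(area_means, weights=area_ns)
    return gamma * area_means + (1 - gamma) * overall
```

Empirical Bayes methods effectively choose `gamma` from the data, shrinking small areas more strongly toward the synthetic component; the paper's finding is that the best-fitting such model need not yield the most efficient area-level estimates.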

Abstract:

In this article we propose using small area estimators to improve the estimates of both the small and large area parameters. When the objective is to estimate parameters at both levels accurately, optimality is achieved by a mixed sample design of fixed and proportional allocations. In the mixed sample design, once a sample size has been determined, one fraction of it is distributed proportionally among the different small areas while the rest is evenly distributed among them. We use Monte Carlo simulations to assess the performance of the direct estimator and two composite covariant-free small area estimators, for different sample sizes and different sample distributions. Performance is measured in terms of Mean Squared Errors (MSE) of both small and large area parameters. It is found that the adoption of small area composite estimators opens the possibility of 1) reducing sample size when precision is given, or 2) improving precision for a given sample size.
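The mixed allocation described above can be written down directly. The sketch below is illustrative; the `fixed_fraction` parameter name is ours, not the article's notation:

```python
import numpy as np

def mixed_allocation(total_n, area_pop, fixed_fraction=0.5):
    """Split a total sample size between an even and a proportional
    allocation, as in the mixed design: one fraction of the sample is
    shared equally among the areas, the rest proportionally to each
    area's population size."""
    area_pop = np.asarray(area_pop, float)
    k = len(area_pop)
    even = fixed_fraction * total_n / k
    prop = (1 - fixed_fraction) * total_n * area_pop / area_pop.sum()
    return even + prop
```

Setting `fixed_fraction` to 1 gives equal allocation (good for small-area precision) and 0 gives proportional allocation (good for the large-area estimate); the mixed design trades off between the two.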

Abstract:

This paper estimates a translog stochastic frontier production function for a panel of 150 mixed Catalan farms over the period 1989-1993, in order to measure and explain variation in technical inefficiency scores with a one-stage approach. The model uses gross value added as the aggregate output measure. Total employment, fixed capital, current assets, specific costs and overhead costs are introduced into the model as inputs. Stochastic frontier estimates are compared with those obtained using a linear programming method with a two-stage approach. The translog stochastic frontier specification appears to be an appropriate representation of the data: technical change was rejected and the technical inefficiency effects were statistically significant. The mean technical efficiency in the period analysed was estimated at 64.0%. Farm inefficiency levels were significant at the 5% level and positively correlated with the number of economic size units.

Abstract:

A national survey designed for estimating a specific population quantity is sometimes used for estimation of this quantity also for a small area, such as a province. Budget constraints do not allow a greater sample size for the small area, and so other means of improving estimation have to be devised. We investigate such methods and assess them by a Monte Carlo study. We explore how a complementary survey can be exploited in small area estimation. We use the context of the Spanish Labour Force Survey (EPA) and the Barometer in Spain for our study.

Abstract:

The n-octanol/water partition coefficient (log Po/w) is a key physicochemical parameter for drug discovery, design, and development. Here, we present a physics-based approach that shows a strong linear correlation between the computed solvation free energy in implicit solvents and the experimental log Po/w on a cleansed data set of more than 17,500 molecules. After internal validation by five-fold cross-validation and data randomization, the predictive power of the most interesting multiple linear model, based solely on two GB/SA parameters, was tested on two different external sets of molecules. On the Martel druglike test set, the predictive power of the best model (N = 706, r = 0.64, MAE = 1.18, and RMSE = 1.40) is similar to six well-established empirical methods. On the 17-drug test set, our model outperformed all compared empirical methodologies (N = 17, r = 0.94, MAE = 0.38, and RMSE = 0.52). The physical basis of our original GB/SA approach together with its predictive capacity, computational efficiency (1 to 2 s per molecule), and tridimensional molecular graphics capability lay the foundations for a promising predictor, the implicit log P method (iLOGP), to complement the portfolio of drug design tools developed and provided by the SIB Swiss Institute of Bioinformatics.

Abstract:

We consider adaptive sequential lossy coding of bounded individual sequences when the performance is measured by the sequentially accumulated mean squared distortion. The encoder and the decoder are connected via a noiseless channel of capacity $R$ and both are assumed to have zero delay. No probabilistic assumptions are made on how the sequence to be encoded is generated. For any bounded sequence of length $n$, the distortion redundancy is defined as the normalized cumulative distortion of the sequential scheme minus the normalized cumulative distortion of the best scalar quantizer of rate $R$ which is matched to this particular sequence. We demonstrate the existence of a zero-delay sequential scheme which uses common randomization in the encoder and the decoder such that the normalized maximum distortion redundancy converges to zero at a rate $n^{-1/5}\log n$ as the length of the encoded sequence $n$ increases without bound.
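In symbols, the distortion redundancy described above can be sketched as follows; the notation ($\hat{x}_i$ for the sequential reconstruction, $\mathcal{Q}_R$ for the class of rate-$R$ scalar quantizers) is an assumption, not taken verbatim from the paper:

```latex
D_n(x^n) \;=\; \frac{1}{n}\sum_{i=1}^{n}\bigl(x_i-\hat{x}_i\bigr)^2
\;-\; \min_{Q\in\mathcal{Q}_R}\,\frac{1}{n}\sum_{i=1}^{n}\bigl(x_i-Q(x_i)\bigr)^2 ,
```

and the result states that, for the randomized zero-delay scheme, $\sup_{x^n} \mathbf{E}\, D_n(x^n) = O\!\left(n^{-1/5}\log n\right)$ uniformly over bounded sequences.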

Abstract:

We evaluated the accuracy of skinfold thicknesses, BMI and waist circumference for the prediction of percentage body fat (PBF) in a representative sample of 372 Swiss children aged 6-13 years. PBF was measured using dual-energy X-ray absorptiometry. On the basis of a preliminary bootstrap selection of predictors, seven regression models were evaluated. All models included sex, age and pubertal stage plus one of the following predictors: (1) log-transformed triceps skinfold (logTSF); (2) logTSF and waist circumference; (3) log-transformed sum of triceps and subscapular skinfolds (logSF2); (4) log-transformed sum of triceps, biceps, subscapular and supra-iliac skinfolds (logSF4); (5) BMI; (6) waist circumference; (7) BMI and waist circumference. The adjusted determination coefficient (R² adj) and the root mean squared error (RMSE; kg) were calculated for each model. LogSF4 (R² adj 0.85; RMSE 2.35) and logSF2 (R² adj 0.82; RMSE 2.54) were similarly accurate at predicting PBF and superior to logTSF (R² adj 0.75; RMSE 3.02), logTSF combined with waist circumference (R² adj 0.78; RMSE 2.85), BMI (R² adj 0.62; RMSE 3.73), waist circumference (R² adj 0.58; RMSE 3.89), and BMI combined with waist circumference (R² adj 0.63; RMSE 3.66) (P < 0.001 for all values of R² adj). The finding that logSF4 was only modestly superior to logSF2 and that logTSF was better than BMI and waist circumference at predicting PBF has important implications for paediatric epidemiological studies aimed at disentangling the effect of body fat on health outcomes.

Abstract:

Interest in applying long-memory models, particularly ARFIMA models, to economic variables has grown considerably in recent years. The most widely used method for estimating these models in economic analysis is undoubtedly the one proposed by Geweke and Porter-Hudak (GPH), even though recent work has shown that in certain cases this estimator exhibits a substantial bias. We therefore propose an extension of this estimator based on the exponential model introduced by Bloomfield, which corrects this bias. We then analyse and compare the behaviour of both estimators in moderately sized samples, and show that the proposed estimator has a smaller mean squared error than the GPH estimator.
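As a point of reference, the baseline GPH log-periodogram estimator discussed above can be sketched in a few lines. The bandwidth choice m = √n and the periodogram normalisation are conventional assumptions, not taken from the article:

```python
import numpy as np

def gph_estimate(x, m=None):
    """Geweke/Porter-Hudak log-periodogram estimate of the long-memory
    parameter d of a series x: regress the log periodogram at the first
    m Fourier frequencies on log(4 sin^2(lambda_j / 2)); d is minus the
    slope of that regression."""
    x = np.asarray(x, float)
    n = len(x)
    if m is None:
        m = int(np.sqrt(n))  # common bandwidth choice
    lam = 2 * np.pi * np.arange(1, m + 1) / n
    # Periodogram at the Fourier frequencies lambda_j, j = 1..m.
    I = np.abs(np.fft.fft(x - x.mean())[1:m + 1]) ** 2 / (2 * np.pi * n)
    reg = np.log(4 * np.sin(lam / 2) ** 2)
    slope = np.polyfit(reg, np.log(I), 1)[0]
    return -slope
```

The bias the article targets arises when the short-memory part of the spectrum contaminates the low-frequency ordinates used in this regression; the proposed Bloomfield-based extension adds terms to absorb that contamination.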


Abstract:

The effects of flow induced by a random acceleration field (g-jitter) are considered in two related situations that are of interest for microgravity fluid experiments: the random motion of isolated buoyant particles, and diffusion driven coarsening of a solid-liquid mixture. We start by analyzing in detail actual accelerometer data gathered during a recent microgravity mission, and obtain the values of the parameters defining a previously introduced stochastic model of this acceleration field. The diffusive motion of a single solid particle suspended in an incompressible fluid that is subjected to such random accelerations is considered, and mean squared velocities and effective diffusion coefficients are explicitly given. We next study the flow induced by an ensemble of such particles, and show the existence of a hydrodynamically induced attraction between pairs of particles at distances large compared with their radii, and repulsion at short distances. Finally, a mean field analysis is used to estimate the effect of g-jitter on diffusion controlled coarsening of a solid-liquid mixture. Corrections to classical coarsening rates due to the induced fluid motion are calculated, and estimates are given for coarsening of Sn-rich particles in a Sn-Pb eutectic fluid, an experiment to be conducted in microgravity in the near future.

Abstract:

We develop a covariant quantum theory of fluctuations on vacuum domain walls and strings. The fluctuations are described by a scalar field defined on the classical world sheet of the defects. We consider the following cases: straight strings and planar walls in flat space, true vacuum bubbles nucleating in false vacuum, and strings and walls nucleating during inflation. The quantum state for the perturbations is constructed so that it respects the original symmetries of the classical solution. In particular, for the case of vacuum bubbles and nucleating strings and walls, the geometry of the world sheet is that of a lower-dimensional de Sitter space, and the problem reduces to the quantization of a scalar field of tachyonic mass in de Sitter space. In all cases, the root-mean-squared fluctuation is evaluated in detail, and the physical implications are briefly discussed.

Abstract:

Taking into account the nature of the hydrological processes involved in in situ measurement of Field Capacity (FC), this study proposes a variation of the definition of FC aiming not only at minimizing the inadequacies of its determination, but also at maintaining its original, practical meaning. Analysis of FC data for 22 Brazilian soils and additional FC data from the literature, all measured according to the proposed definition, which is based on a 48-h drainage time after infiltration by shallow ponding, indicates a weak dependency on the amount of infiltrated water, antecedent moisture level, soil morphology, and the level of the groundwater table, but a strong dependency on basic soil properties. The dependence on basic soil properties allowed determination of FC of the 22 soil profiles by pedotransfer functions (PTFs) using the input variables usually adopted in prediction of soil water retention. Among the input variables, soil moisture content θ (6 kPa) had the greatest impact. Indeed, a linear PTF based only on it resulted in an FC with a root mean squared residue less than 0.04 m³ m-3 for most soils individually. Such a PTF proved to be a better FC predictor than the traditional method of using moisture content at an arbitrary suction. Our FC data were compatible with an equivalent and broader USA database found in the literature, mainly for medium-texture soil samples. One reason for differences between FCs of the two data sets of fine-textured soils is due to their different drainage times. Thus, a standardized procedure for in situ determination of FC is recommended.
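Fitting a linear PTF of the form FC = a + b·θ(6 kPa), as described above, is a one-line least-squares problem. The data and the resulting coefficients below are purely hypothetical illustrations, not the article's fitted values:

```python
import numpy as np

# Hypothetical paired measurements (m3/m3): water content at 6 kPa
# suction and field capacity measured in situ after 48 h of drainage.
theta_6kpa  = np.array([0.12, 0.18, 0.25, 0.31, 0.38, 0.44])
fc_measured = np.array([0.10, 0.16, 0.22, 0.28, 0.34, 0.40])

# Linear pedotransfer function FC = a + b * theta_6kpa by least squares.
b, a = np.polyfit(theta_6kpa, fc_measured, 1)
fc_pred = a + b * theta_6kpa
rmse = np.sqrt(np.mean((fc_pred - fc_measured) ** 2))
```

The article's criterion of a root mean squared residue below 0.04 m³ m⁻³ corresponds to checking `rmse` against that threshold for each soil.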