910 results for Missing Data


Relevance:

40.00%

Publisher:

Abstract:

Whether a statistician wants to complement a probability model for observed data with a prior distribution and carry out fully probabilistic inference, or to base the inference only on the likelihood function, may be a fundamental question in theory, but in practice it may well be of less importance if the likelihood contains much more information than the prior. Maximum likelihood inference can be justified as a Gaussian approximation at the posterior mode under flat priors. However, in situations where the parametric assumptions of standard statistical models would be too rigid, more flexible model formulations, combined with fully probabilistic inference, can be achieved using hierarchical Bayesian parametrization. This work includes five articles, all of which apply probability modeling to various problems involving incomplete observation. Three of the papers apply maximum likelihood estimation and two apply hierarchical Bayesian modeling. Because maximum likelihood may be presented as a special case of Bayesian inference, but not the other way round, in the introductory part of this work we present a framework for probability-based inference using only Bayesian concepts. We also re-derive some results presented in the original articles using the toolbox developed herein, to show that they are also justifiable under this more general framework. The assumption of exchangeability and de Finetti's representation theorem are applied repeatedly to justify the use of standard parametric probability models with conditionally independent likelihood contributions. It is argued that the same reasoning also applies under sampling from a finite population. The main emphasis is on probability-based inference under incomplete observation due to study design, illustrated using a generic two-phase cohort sampling design as an example.
The alternative approaches presented for the analysis of such a design are full likelihood, which utilizes all observed information, and conditional likelihood, which is restricted to a completely observed set, conditioning on the rule that generated that set. Conditional likelihood inference is also applied to a joint analysis of prevalence and incidence data, a situation subject to both left censoring and left truncation. Other topics covered are model uncertainty and causal inference using posterior predictive distributions. We formulate a non-parametric monotonic regression model for one or more covariates, together with a Bayesian estimation procedure, and apply the model in the context of optimal sequential treatment regimes, demonstrating that inference based on posterior predictive distributions is feasible in this case as well.
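The abstract's claim that maximum likelihood can be read as a special case of Bayesian inference is easy to illustrate: under a flat prior the log-posterior differs from the log-likelihood only by a constant, so the posterior mode coincides with the MLE. A minimal sketch for a Bernoulli model (grid search; all names are illustrative, not taken from the thesis):

```python
import math

def bernoulli_loglik(p, k, n):
    # Log-likelihood of k successes in n Bernoulli trials.
    return k * math.log(p) + (n - k) * math.log(1.0 - p)

def flat_log_prior(p):
    # Uniform(0, 1) prior: log-density is 0 everywhere on the support.
    return 0.0

k, n = 7, 10
grid = [i / 1000.0 for i in range(1, 1000)]

# Maximum likelihood estimate over the grid.
mle = max(grid, key=lambda p: bernoulli_loglik(p, k, n))

# Posterior mode under the flat prior: the log-posterior differs from
# the log-likelihood only by a normalizing constant, so the mode agrees.
post_mode = max(grid, key=lambda p: bernoulli_loglik(p, k, n) + flat_log_prior(p))
```

With an informative prior the two estimates would separate, which is the abstract's point about the likelihood dominating only when it carries much more information than the prior.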

Relevance:

40.00%

Publisher:

Abstract:

We propose a novel second order cone programming formulation for designing robust classifiers which can handle uncertainty in observations. Similar formulations are also derived for designing regression functions which are robust to uncertainties in the regression setting. The proposed formulations are independent of the underlying distribution, requiring only the existence of second order moments. These formulations are then specialized to the case of missing values in observations for both classification and regression problems. Experiments show that the proposed formulations outperform imputation.
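Distribution-free formulations of this kind typically replace a probabilistic margin requirement with a deterministic second order cone constraint via a Chebyshev bound on the first two moments. The sketch below checks such a constraint for a fixed linear classifier; the function name and exact constraint form are illustrative assumptions, not the paper's formulation, which optimizes over the classifier as well.

```python
import math

def robust_margin_holds(w, b, x_mean, x_cov, y, eta):
    # Distribution-free chance constraint for a linear classifier:
    # require y * (w . x + b) >= 1 with probability at least eta for
    # every distribution matching the given mean and covariance.
    # The multivariate Chebyshev bound gives the multiplier
    # kappa = sqrt(eta / (1 - eta)), turning this into the
    # deterministic second order cone constraint
    #   y * (w . x_mean + b) >= 1 + kappa * sqrt(w' Sigma w).
    kappa = math.sqrt(eta / (1.0 - eta))
    d = len(w)
    quad = sum(w[i] * x_cov[i][j] * w[j] for i in range(d) for j in range(d))
    margin = y * (sum(wi * xi for wi, xi in zip(w, x_mean)) + b)
    return margin >= 1.0 + kappa * math.sqrt(quad)
```

For missing values, the per-point covariance models the uncertainty of the imputed coordinates, so points with more missing entries must sit further from the separating hyperplane.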

Relevance:

40.00%

Publisher:

Abstract:

The Himalayas presently hold the largest ice masses outside the polar regions and thus (temporarily) store important freshwater resources. In contrast to glaciers, runoff from the snow cover has received comparably little attention in the past, although (i) its contribution is thought to be at least as important as, or even more important than, that of ice melt in many Himalayan catchments, and (ii) climate change is expected to have widespread and significant consequences for snowmelt runoff. Here, we show that assessing changes in snowmelt runoff and its timing is not as straightforward as often postulated, mainly because larger partial pressures of H2O, CO2, CH4, and other greenhouse gases might increase the net long-wave input for snowmelt quite significantly in a future atmosphere. In addition, changes in the short-wave energy balance, such as pollution of the snow cover by black carbon, or in the sensible or latent heat contribution to snowmelt, are likely to alter future snowmelt and runoff characteristics as well. For the assessment of snow cover extent and depletion, but also for monitoring over the extremely large areas of the Himalayas, remote sensing has been used in the past and is likely to become even more important in the future. However, for the calibration and validation of remotely sensed data, and even more so in light of possible changes in the snow-cover energy balance, we strongly call for more in-situ measurements across the Himalayas, in particular daily data on new snow and snow water equivalent, and the respective energy-balance components. Moreover, data should be made accessible to the scientific community so that climate change impacts on Himalayan snow cover, and their possible consequences for runoff, can be estimated more accurately. (C) 2013 Elsevier B.V. All rights reserved.

Relevance:

40.00%

Publisher:

Abstract:

Electromagnetic articulography (EMA) is used to record the kinematics of different articulators while a person speaks. EMA data often contain missing segments due to sensor failure. In this work, we propose a maximum a posteriori (MAP) estimation scheme with a continuity constraint to recover the missing samples in articulatory trajectories recorded using EMA. This approach combines the benefits of statistical MAP estimation with the temporal continuity of articulatory trajectories. Experiments on an articulatory corpus with different missing-segment durations show that the proposed continuity constraint yields a 30% reduction in average root mean squared estimation error over statistical estimation of the missing segments without any continuity constraint.
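The continuity idea can be sketched independently of the statistical prior: treat the missing samples as free variables and minimize the sum of squared first differences, holding observed samples fixed. This is only a toy stand-in for the continuity constraint (the paper combines it with a MAP estimate learned from data, which this sketch omits); the function name and parameters are hypothetical.

```python
def fill_missing_smooth(signal, iters=2000):
    # signal: list of floats, with None marking missing samples.
    # Holds observed samples fixed and repeatedly replaces each
    # missing sample by the average of its neighbours, which
    # minimizes the sum of squared first differences over the gap
    # (Gauss-Seidel iteration on the normal equations).
    x = [v if v is not None else 0.0 for v in signal]
    missing = [i for i, v in enumerate(signal) if v is None]
    for _ in range(iters):
        for i in missing:
            left = x[i - 1] if i > 0 else x[i + 1]
            right = x[i + 1] if i < len(x) - 1 else x[i - 1]
            x[i] = 0.5 * (left + right)
    return x
```

On an isolated gap this reduces to linear interpolation between the bracketing observed samples; the paper's contribution is to weight such continuity against a statistical estimate rather than rely on smoothness alone.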

Relevance:

40.00%

Publisher:

Abstract:

A data sample corresponding to an integrated luminosity of 2.1 fb⁻¹ collected by the D0 detector at the Fermilab Tevatron Collider was analyzed to search for squarks and gluinos produced in pp̄ collisions at a center-of-mass energy of 1.96 TeV. No evidence for the production of such particles was observed in topologies involving jets and missing transverse energy, and 95% C.L. lower limits of 379 GeV and 308 GeV were set on the squark and gluino masses, respectively, within the framework of minimal supergravity with tan β = 3, A₀ = 0, and μ < 0. The corresponding previous limits are improved by 54 GeV and 67 GeV. (c) 2008 Elsevier B.V. All rights reserved.

Relevance:

40.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance:

40.00%

Publisher:

Abstract:

Applying location-focused data protection law in the context of a location-agnostic cloud computing framework is fraught with difficulties. While the proposed EU Data Protection Regulation introduces many changes to the current data protection framework, data processing in the cloud involves multiple layers of actors and intermediaries that have not been properly addressed. This leaves gaps in the Regulation when it is analyzed in cloud scenarios. This paper gives a brief overview of the provisions of the Regulation that will have an impact on cloud transactions and addresses the missing links. It is hoped that these loopholes will be reconsidered before the final version of the law is passed, in order to avoid unintended consequences.