941 results for Kernel density estimation
Abstract:
We document the existence of a Crime Kuznets Curve in US states since the 1970s. As income levels have risen, crime has followed an inverted U-shaped pattern, first increasing and then dropping. The Crime Kuznets Curve is not explained by income inequality. In fact, we show that during the sample period inequality has risen monotonically with income, ruling out the traditional Kuznets Curve. Our finding is robust to adding a large set of controls that are used in the literature to explain the incidence of crime, as well as to controlling for state and year fixed effects. The Curve is also revealed in nonparametric specifications. The Crime Kuznets Curve exists for property crime and for some categories of violent crime.
Abstract:
Many kernel classifier construction algorithms adopt classification accuracy as the performance metric in model evaluation. Moreover, equal weighting is often applied to each data sample in parameter estimation. These modeling practices often become problematic if the data sets are imbalanced. We present a kernel classifier construction algorithm using orthogonal forward selection (OFS) in order to optimize the model generalization for imbalanced two-class data sets. This kernel classifier identification algorithm is based on a new regularized orthogonal weighted least squares (ROWLS) estimator and the model selection criterion of maximal leave-one-out area under curve (LOO-AUC) of the receiver operating characteristics (ROCs). It is shown that, owing to the orthogonalization procedure, the LOO-AUC can be calculated via an analytic formula based on the ROWLS parameter estimator, without actually splitting the estimation data set. The proposed algorithm can achieve minimal computational expense via a set of forward recursive updating formulae in searching model terms with maximal incremental LOO-AUC value. Numerical examples are used to demonstrate the efficacy of the algorithm.
Abstract:
Estimating snow mass at continental scales is difficult but important for understanding land-atmosphere interactions, biogeochemical cycles and Northern latitudes’ hydrology. Remote sensing provides the only consistent global observations, but the uncertainty in measurements is poorly understood. Existing techniques for the remote sensing of snow mass are based on the Chang algorithm, which relates the absorption of Earth-emitted microwave radiation by a snow layer to the snow mass within the layer. The absorption also depends on other factors such as the snow grain size and density, which are assumed and fixed within the algorithm. We examine the assumptions, compare them to field measurements made at the NASA Cold Land Processes Experiment (CLPX) Colorado field site in 2002–3, and evaluate the consequences of deviation and variability for snow mass retrieval. The accuracy of the emission model used to devise the algorithm also has an impact on its accuracy, so we test this with the CLPX measurements of snow properties against SSM/I and AMSR-E satellite measurements.
Abstract:
Accurate estimates for the fall speed of natural hydrometeors are vital if their evolution in clouds is to be understood quantitatively. In this study, laboratory measurements of the terminal velocity v_t for a variety of ice particle models settling in viscous fluids, along with wind-tunnel and field measurements of ice particles settling in air, have been analyzed and compared to common methods of computing v_t from the literature. It is observed that while these methods work well for a number of particle types, they fail for particles with open geometries, specifically those particles for which the area ratio A_r is small (A_r is defined as the area of the particle projected normal to the flow divided by the area of a circumscribing disc). In particular, the fall speeds of stellar and dendritic crystals, needles, open bullet rosettes, and low-density aggregates are all overestimated. These particle types are important in many cloud types: aggregates in particular often dominate snow precipitation at the ground and vertically pointing Doppler radar measurements. Based on the laboratory data, a simple modification to previous computational methods is proposed, based on the area ratio. This new method collapses the available drag data onto an approximately universal curve, and the resulting errors in the computed fall speeds relative to the tank data are less than 25% in all cases. Comparison with the (much more scattered) measurements of ice particles falling in air shows strong support for this new method, with the area ratio bias apparently eliminated.
Abstract:
This paper discusses how numerical gradient estimation methods may be used in order to reduce the computational demands on a class of multidimensional clustering algorithms. The study is motivated by the recognition that several current point-density based cluster identification algorithms could benefit from a reduction of computational demand if approximate a-priori estimates of the cluster centres present in a given data set could be supplied as starting conditions for these algorithms. In this particular presentation, the algorithm shown to benefit from the technique is the Mean-Tracking (M-T) cluster algorithm, but the results obtained from the gradient estimation approach may also be applied to other clustering algorithms and their related disciplines.
Abstract:
Identifying a periodic time-series model from environmental records, without imposing the positivity of the growth rate, does not necessarily respect the time order of the data observations. Consequently, subsequent observations, sampled in the environmental archive, can be inverted on the time axis, resulting in a non-physical signal model. In this paper an optimization technique with linear constraints on the signal model parameters is proposed that prevents time inversions. The activation conditions for this constrained optimization are based upon the physical constraint of the growth rate, namely, that it cannot take values smaller than zero. The actual constraints are defined for polynomials and first-order splines as basis functions for the nonlinear contribution in the distance-time relationship. The method is compared with an existing method that eliminates the time inversions, and its noise sensitivity is tested by means of Monte Carlo simulations. Finally, the usefulness of the method is demonstrated on measurements of the vessel density in a mangrove tree, Rhizophora mucronata, and of Mg/Ca ratios in a bivalve, Mytilus trossulus.
Abstract:
Estimating snow mass at continental scales is difficult, but important for understanding land-atmosphere interactions, biogeochemical cycles and the hydrology of the Northern latitudes. Remote sensing provides the only consistent global observations, but with unknown errors. We test the theoretical performance of the Chang algorithm for estimating snow mass from passive microwave measurements using the Helsinki University of Technology (HUT) snow microwave emission model. The algorithm's dependence upon assumptions of fixed and uniform snow density and grain size is determined, and measurements of these properties made at the Cold Land Processes Experiment (CLPX) Colorado field site in 2002–2003 are used to quantify the retrieval errors caused by differences between the algorithm assumptions and measurements. Deviation from the Chang algorithm snow density and grain size assumptions gives rise to an error of a factor of between two and three in calculating snow mass. The possibility that the algorithm performs more accurately over large areas than at points is tested by simulating emission from a 25 km diameter area of snow with a distribution of properties derived from the snow pit measurements, using the Chang algorithm to calculate mean snow mass from the simulated emission. The snow mass estimation from a site exhibiting the heterogeneity of the CLPX Colorado site proves only marginally different than that from a similarly-simulated homogeneous site. The estimation accuracy predictions are tested using the CLPX field measurements of snow mass, and simultaneous SSM/I and AMSR-E measurements.
Abstract:
We present a new method to determine mesospheric electron densities from partially reflected medium frequency radar pulses. The technique uses an optimal estimation inverse method and retrieves both an electron density profile and a gradient electron density profile. As well as accounting for the absorption of the two magnetoionic modes formed by ionospheric birefringence of each radar pulse, the forward model of the retrieval parameterises possible Fresnel scatter of each mode by fine electronic structure, phase changes of each mode due to Faraday rotation and the dependence of the amplitudes of the backscattered modes upon pulse width. Validation results indicate that known profiles can be retrieved and that χ² tests upon retrieval parameters satisfy validity criteria. Application to measurements shows that retrieved electron density profiles are consistent with accepted ideas about seasonal variability of electron densities and their dependence upon nitric oxide production and transport.
Abstract:
The estimation of the long-term wind resource at a prospective site based on a relatively short on-site measurement campaign is an indispensable task in the development of a commercial wind farm. The typical industry approach is based on the measure-correlate-predict (MCP) method where a relational model between the site wind velocity data and the data obtained from a suitable reference site is built from concurrent records. In a subsequent step, a long-term prediction for the prospective site is obtained from a combination of the relational model and the historic reference data. In the present paper, a systematic study is presented where three new MCP models, together with two published reference models (a simple linear regression and the variance ratio method), have been evaluated based on concurrent synthetic wind speed time series for two sites, simulating the prospective and the reference site. The synthetic method has the advantage of generating time series with the desired statistical properties, including Weibull scale and shape factors, required to evaluate the five methods under all plausible conditions. In this work, first a systematic discussion of the statistical fundamentals behind MCP methods is provided and three new models, one based on a nonlinear regression and two (termed kernel methods) derived from the use of conditional probability density functions, are proposed. All models are evaluated by using five metrics under a wide range of values of the correlation coefficient, the Weibull scale, and the Weibull shape factor. Only one of all models, a kernel method based on bivariate Weibull probability functions, is capable of accurately predicting all performance metrics studied.
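As an illustration of the two published reference models named in this abstract, the following Python sketch fits a simple linear regression and the variance ratio method to concurrent synthetic wind speeds. The function names, the Weibull parameters and the way the site series is generated are assumptions made for the example, not details taken from the paper. The variance ratio form used here, site = mean_site + (std_site / std_ref) * (ref - mean_ref), is the commonly cited one.

```python
import random
import statistics


def fit_linear_regression(ref, site):
    """Ordinary least squares fit of site ~ a + b * ref; returns (a, b)."""
    mean_ref, mean_site = statistics.fmean(ref), statistics.fmean(site)
    sxx = sum((x - mean_ref) ** 2 for x in ref)
    sxy = sum((x - mean_ref) * (y - mean_site) for x, y in zip(ref, site))
    slope = sxy / sxx
    return mean_site - slope * mean_ref, slope


def fit_variance_ratio(ref, site):
    """Variance ratio fit: the slope is the ratio of standard deviations."""
    mean_ref, mean_site = statistics.fmean(ref), statistics.fmean(site)
    slope = statistics.pstdev(site) / statistics.pstdev(ref)
    return mean_site - slope * mean_ref, slope


def predict(model, ref_series):
    """Apply a fitted (intercept, slope) model to reference-site data."""
    intercept, slope = model
    return [intercept + slope * x for x in ref_series]


# Hypothetical concurrent records: Weibull reference winds (scale 8, shape 2)
# and a noisy, scaled copy standing in for the prospective site.
random.seed(0)
ref = [random.weibullvariate(8.0, 2.0) for _ in range(2000)]
site = [0.9 * x + random.gauss(0.0, 1.0) for x in ref]

regression_model = fit_linear_regression(ref, site)
variance_ratio_model = fit_variance_ratio(ref, site)

for name, model in [("regression", regression_model),
                    ("variance ratio", variance_ratio_model)]:
    pred = predict(model, ref)
    print(name, round(statistics.fmean(pred), 3), round(statistics.pstdev(pred), 3))
```

Both fits reproduce the site mean on the concurrent data. The regression prediction has a smaller spread than the observed site series, whereas the variance ratio prediction matches the observed standard deviation by construction.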
Abstract:
Simultaneous scintillometer measurements at multiple wavelengths (pairing visible or infrared with millimetre or radio waves) have the potential to provide estimates of path-averaged surface fluxes of sensible and latent heat. Traditionally, the equations to deduce fluxes from measurements of the refractive index structure parameter at the two wavelengths have been formulated in terms of absolute humidity. Here, it is shown that formulation in terms of specific humidity has several advantages. Specific humidity satisfies the requirement for a conserved variable in similarity theory and inherently accounts for density effects misapportioned through the use of absolute humidity. The validity and interpretation of both formulations are assessed and the analogy with open-path infrared gas analyser density corrections is discussed. Original derivations using absolute humidity to represent the influence of water vapour are shown to misrepresent the latent heat flux. The errors in the flux, which depend on the Bowen ratio (larger for drier conditions), may be of the order of 10%. The sensible heat flux is shown to remain unchanged. It is also verified that use of a single scintillometer at optical wavelengths is essentially unaffected by these new formulations. Where it may not be possible to reprocess two-wavelength results, a density correction to the latent heat flux is proposed for scintillometry, which can be applied retrospectively to reduce the error.
Abstract:
A class of identification algorithms is introduced for Gaussian process (GP) models. The fundamental approach is to propose a new kernel function which leads to a covariance matrix with low rank, a property that is consequently exploited for computational efficiency for both model parameter estimation and model predictions. The objective of either maximizing the marginal likelihood or the Kullback–Leibler (K–L) divergence between the estimated output probability density function (pdf) and the true pdf has been used as the respective cost function. For each cost function, an efficient coordinate descent algorithm is proposed to estimate the kernel parameters using a one-dimensional derivative-free search, and the noise variance using a fast gradient descent algorithm. Numerical examples are included to demonstrate the effectiveness of the new identification approaches.
Abstract:
A procedure (concurrent multiplicative-additive objective analysis scheme [CMA-OAS]) is proposed for operational rainfall estimation using rain gauges and radar data. On the basis of a concurrent multiplicative-additive (CMA) decomposition of the spatially nonuniform radar bias, within-storm variability of rainfall and fractional coverage of rainfall are taken into account. Thus both spatially nonuniform radar bias, given that rainfall is detected, and bias in radar detection of rainfall are handled. The interpolation procedure of CMA-OAS is built on Barnes' objective analysis scheme (OAS), whose purpose is to estimate a filtered spatial field of the variable of interest through a successive correction of residuals resulting from a Gaussian kernel smoother applied on spatial samples. The CMA-OAS, first, poses an optimization problem at each gauge-radar support point to obtain both a local multiplicative-additive radar bias decomposition and a regionalization parameter. Second, local biases and regionalization parameters are integrated into an OAS to estimate the multisensor rainfall at the ground level. The procedure is suited to relatively sparse rain gauge networks. To show the procedure, six storms are analyzed at hourly steps over 10,663 km². Results generally indicated an improved quality with respect to other methods evaluated: a standard mean-field bias adjustment, a spatially variable adjustment with multiplicative factors, and ordinary cokriging.
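The successive correction idea behind Barnes' objective analysis scheme, as described in this abstract, can be sketched in a few lines of Python. This is only the plain Barnes scheme with a Gaussian kernel smoother, not the CMA-OAS procedure itself. The function names, the convergence factor `gamma` and the two-gauge example are assumptions chosen for illustration.

```python
import math


def kernel_smooth(target, points, values, length_scale):
    """Gaussian-kernel weighted average of `values` evaluated at `target`."""
    weights = [
        math.exp(-((target[0] - p[0]) ** 2 + (target[1] - p[1]) ** 2)
                 / length_scale ** 2)
        for p in points
    ]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def barnes_analysis(points, values, targets, length_scale, passes=3, gamma=0.3):
    """Estimate a field at `targets` by successive correction of residuals.

    The first pass smooths the observations. Each later pass smooths the
    residuals left at the observation points with a narrower kernel and
    adds the result to the current estimate.
    """
    est_obs = [kernel_smooth(p, points, values, length_scale) for p in points]
    est_tgt = [kernel_smooth(t, points, values, length_scale) for t in targets]
    scale = length_scale
    for _ in range(passes - 1):
        scale *= math.sqrt(gamma)  # shrink the squared length scale by gamma
        residuals = [v - e for v, e in zip(values, est_obs)]
        est_obs = [e + kernel_smooth(p, points, residuals, scale)
                   for p, e in zip(points, est_obs)]
        est_tgt = [e + kernel_smooth(t, points, residuals, scale)
                   for t, e in zip(targets, est_tgt)]
    return est_tgt


# Hypothetical gauges (x, y in km) and hourly rainfall totals (mm).
gauges = [(0.0, 0.0), (1.0, 0.0)]
rain = [0.0, 2.0]
print(barnes_analysis(gauges, rain, [(0.5, 0.0)], length_scale=1.0))
```

With more passes the analysis is drawn closer to the gauge values at the gauge locations, at the cost of less smoothing.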
Abstract:
A new class of parameter estimation algorithms is introduced for Gaussian process regression (GPR) models. It is shown that the integration of the GPR model with probability distance measures of (i) the integrated square error and (ii) Kullback–Leibler (K–L) divergence is analytically tractable. An efficient coordinate descent algorithm is proposed to iteratively estimate the kernel width using golden section search, which includes a fast gradient descent algorithm as an inner loop to estimate the noise variance. Numerical examples are included to demonstrate the effectiveness of the new identification approaches.
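For readers unfamiliar with the golden section search mentioned in this abstract, the Python sketch below shows the generic one-dimensional, derivative-free procedure. The cost function is a hypothetical unimodal stand-in with its minimum at a width of 0.7. It is not the GPR cost from the paper, and the inner gradient descent loop for the noise variance is omitted.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # 1 / golden ratio, about 0.618


def golden_section_minimize(cost, lo, hi, tol=1e-6):
    """Minimize a unimodal `cost` on [lo, hi] without using derivatives."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = cost(c), cost(d)
    while hi - lo > tol:
        if fc < fd:
            # The minimum lies in [lo, d]; reuse c as the new upper probe.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = cost(c)
        else:
            # The minimum lies in [c, hi]; reuse d as the new lower probe.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = cost(d)
    return (lo + hi) / 2.0


def toy_cost(width):
    """Hypothetical unimodal cost in the kernel width (minimum at 0.7)."""
    return (math.log(width) - math.log(0.7)) ** 2


best_width = golden_section_minimize(toy_cost, 0.01, 10.0)
print(round(best_width, 4))
```

Each iteration needs only one new cost evaluation, which is the main attraction when every evaluation involves refitting a model.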
Abstract:
Data from 58 strong-lensing events surveyed by the Sloan Lens ACS Survey are used to estimate the projected galaxy mass inside their Einstein radii by two independent methods: stellar dynamics and strong gravitational lensing. We perform a joint analysis of these two estimates inside models with up to three degrees of freedom with respect to the lens density profile, stellar velocity anisotropy, and line-of-sight (LOS) external convergence, which incorporates the effect of the large-scale structure on strong lensing. A Bayesian analysis is employed to estimate the model parameters, evaluate their significance, and compare models. We find that the data favor Jaffe's light profile over Hernquist's, but that any particular choice between these two does not change the qualitative conclusions with respect to the features of the system that we investigate. The density profile is compatible with an isothermal profile, being slightly steeper and having an uncertainty in the logarithmic slope of the order of 5% in models that take into account a prior ignorance on anisotropy and external convergence. We identify a considerable degeneracy between the density profile slope and the anisotropy parameter, which largely increases the uncertainties in the estimates of these parameters, but we encounter no evidence in favor of an anisotropic velocity distribution on average for the whole sample. An LOS external convergence following a prior probability distribution given by cosmology has a small effect on the estimation of the lens density profile, but can increase the dispersion of its value by nearly 40%.
Abstract:
This paper deals with the testing of autoregressive conditional duration (ACD) models by gauging the distance between the parametric density and hazard rate functions implied by the duration process and their non-parametric estimates. We derive the asymptotic justification using the functional delta method for fixed and gamma kernels, and then investigate the finite-sample properties through Monte Carlo simulations. Although our tests display some size distortion, bootstrapping suffices to correct the size without compromising their excellent power. We show the practical usefulness of such testing procedures for the estimation of intraday volatility patterns.
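To make the non-parametric density estimate with a gamma kernel concrete, here is a minimal Python sketch. It assumes the estimator of Chen (2000), in which the density at x is the sample average of a gamma density with shape x / b + 1 and scale b evaluated at each observation. The abstract does not state which gamma kernel form the authors use, so this choice, the function names, the bandwidth and the exponential test data are all assumptions.

```python
import math
import random


def gamma_pdf(t, shape, scale):
    """Gamma density at t > 0, computed in log space for stability."""
    if t <= 0.0:
        return 0.0  # durations are assumed strictly positive
    log_pdf = ((shape - 1.0) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def gamma_kernel_density(x, sample, bandwidth):
    """Gamma-kernel density estimate at x >= 0 (assumed Chen-type form).

    The kernel shape depends on the evaluation point x, so no probability
    mass is assigned to negative values, unlike a fixed symmetric kernel.
    """
    shape = x / bandwidth + 1.0
    return sum(gamma_pdf(xi, shape, bandwidth) for xi in sample) / len(sample)


# Hypothetical durations drawn from a unit exponential distribution.
random.seed(1)
durations = [random.expovariate(1.0) for _ in range(2000)]

for x in (0.5, 1.0, 2.0):
    estimate = gamma_kernel_density(x, durations, bandwidth=0.1)
    print(x, round(estimate, 3), round(math.exp(-x), 3))
```

The printed pairs compare the estimate with the true exponential density. A distance between the two curves of this kind is what the tests described in the abstract gauge, with a parametric ACD-implied density in place of the known exponential one.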