968 results for order estimation
Abstract:
This paper discusses how numerical gradient estimation methods may be used in order to reduce the computational demands on a class of multidimensional clustering algorithms. The study is motivated by the recognition that several current point-density based cluster identification algorithms could benefit from a reduction of computational demand if approximate a-priori estimates of the cluster centres present in a given data set could be supplied as starting conditions for these algorithms. In this particular presentation, the algorithm shown to benefit from the technique is the Mean-Tracking (M-T) cluster algorithm, but the results obtained from the gradient estimation approach may also be applied to other clustering algorithms and their related disciplines.
Abstract:
Identifying a periodic time-series model from environmental records, without imposing the positivity of the growth rate, does not necessarily respect the time order of the data observations. Consequently, subsequent observations, sampled in the environmental archive, can be inverted on the time axis, resulting in a non-physical signal model. In this paper an optimization technique with linear constraints on the signal model parameters is proposed that prevents time inversions. The activation conditions for this constrained optimization are based upon the physical constraint of the growth rate, namely, that it cannot take values smaller than zero. The actual constraints are defined for polynomials and first-order splines as basis functions for the nonlinear contribution in the distance-time relationship. The method is compared with an existing method that eliminates the time inversions, and its noise sensitivity is tested by means of Monte Carlo simulations. Finally, the usefulness of the method is demonstrated on measurements of the vessel density in a mangrove tree, Rhizophora mucronata, and measurements of Mg/Ca ratios in a bivalve, Mytilus trossulus.
Abstract:
Statistical methods of inference typically require the likelihood function to be computable in a reasonable amount of time. The class of “likelihood-free” methods termed Approximate Bayesian Computation (ABC) is able to eliminate this requirement, replacing the evaluation of the likelihood with simulation from it. Likelihood-free methods have gained in efficiency and popularity in the past few years, following their integration with Markov Chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) in order to better explore the parameter space. They have been applied primarily to estimating the parameters of a given model, but can also be used to compare models. Here we present novel likelihood-free approaches to model comparison, based upon the independent estimation of the evidence of each model under study. Key advantages of these approaches over previous techniques are that they allow the exploitation of MCMC or SMC algorithms for exploring the parameter space, and that they do not require a sampler able to mix between models. We validate the proposed methods using a simple exponential family problem before providing a realistic problem from human population genetics: the comparison of different demographic models based upon genetic data from the Y chromosome.
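The idea of replacing likelihood evaluation with simulation can be illustrated with plain rejection ABC. The sketch below is not the MCMC- or SMC-based approach the abstract proposes. It is a minimal toy, assuming an exponential model with a uniform prior on the rate, the sample mean as summary statistic, and a hand-picked tolerance; all names (`abc_rejection`, `prior_sampler`, `simulate`, `summary`) are hypothetical.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulate, summary, tolerance, n_draws, rng):
    """Keep prior draws whose simulated summary lies within `tolerance` of the observed one."""
    target = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)                   # draw a parameter from the prior
        fake = simulate(theta, len(observed), rng)   # simulate data instead of evaluating a likelihood
        if abs(summary(fake) - target) <= tolerance:
            accepted.append(theta)
    return accepted


# Toy problem (an assumption for illustration): exponential data with rate 2.
rng = random.Random(0)
observed = [rng.expovariate(2.0) for _ in range(200)]

n_draws = 5000
accepted = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(0.1, 10.0),
    simulate=lambda rate, n, r: [r.expovariate(rate) for _ in range(n)],
    summary=statistics.fmean,
    tolerance=0.05,
    n_draws=n_draws,
    rng=rng,
)

posterior_mean = statistics.fmean(accepted)
# In rejection ABC the acceptance rate is proportional to an approximation
# of the model evidence, so rates from two models can be compared as a ratio.
acceptance_rate = len(accepted) / n_draws
print(posterior_mean, acceptance_rate)
```

The accepted draws approximate the posterior over the rate. Running the same procedure under a second model and comparing acceptance rates gives a crude evidence ratio, which is the quantity the abstract's methods estimate more efficiently.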
Abstract:
We introduce an algorithm (called REDFITmc2) for spectrum estimation in the presence of timescale errors. It is based on the Lomb-Scargle periodogram for unevenly spaced time series, in combination with the Welch's Overlapped Segment Averaging procedure, bootstrap bias correction and persistence estimation. The timescale errors are modelled parametrically and included in the simulations for determining (1) the upper levels of the spectrum of the red-noise AR(1) alternative and (2) the uncertainty of the frequency of a spectral peak. Application of REDFITmc2 to ice core and stalagmite records of palaeoclimate allowed a more realistic evaluation of spectral peaks than when ignoring this source of uncertainty. The results support qualitatively the intuition that stronger effects on the spectrum estimate (decreased detectability and increased frequency uncertainty) occur for higher frequencies. The surplus information brought by algorithm REDFITmc2 is that those effects are quantified. Regarding timescale construction, not only the fixpoints, dating errors and the functional form of the age-depth model play a role. Also the joint distribution of all time points (serial correlation, stratigraphic order) determines spectrum estimation.
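The building block named in the abstract, the Lomb-Scargle periodogram for unevenly spaced series, can be sketched in a few lines. This is only the classical periodogram. It omits the segment averaging, bootstrap bias correction, AR(1) red-noise test and timescale-error simulation that REDFITmc2 is said to add. The synthetic series (a 0.1-cycle-per-unit sine with Gaussian noise) and the frequency grid are assumptions for illustration.

```python
import math
import random


def lomb_scargle(times, values, frequencies):
    """Classical Lomb-Scargle power at each frequency (cycles per time unit)."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    power = []
    for f in frequencies:
        w = 2.0 * math.pi * f
        # The time offset tau makes the sine and cosine terms orthogonal.
        tau = math.atan2(
            sum(math.sin(2.0 * w * t) for t in times),
            sum(math.cos(2.0 * w * t) for t in times),
        ) / (2.0 * w)
        c = [math.cos(w * (t - tau)) for t in times]
        s = [math.sin(w * (t - tau)) for t in times]
        yc = sum(a * b for a, b in zip(y, c))
        ys = sum(a * b for a, b in zip(y, s))
        cc = sum(a * a for a in c)
        ss = sum(a * a for a in s)
        power.append(0.5 * (yc * yc / cc + ys * ys / ss))
    return power


# Synthetic, unevenly sampled series (illustrative values only).
rng = random.Random(1)
times = sorted(rng.uniform(0.0, 100.0) for _ in range(120))
values = [math.sin(2.0 * math.pi * 0.1 * t) + rng.gauss(0.0, 0.3) for t in times]

frequencies = [0.005 * k for k in range(2, 101)]  # 0.01 to 0.5
power = lomb_scargle(times, values, frequencies)
peak_frequency = frequencies[power.index(max(power))]
print(peak_frequency)
```

Perturbing `times` according to a dating-error model and recomputing the periodogram many times would be one simple way to see how timescale uncertainty smears a spectral peak.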
Abstract:
We develop a new sparse kernel density estimator using a forward constrained regression framework, within which the nonnegative and summing-to-unity constraints of the mixing weights can easily be satisfied. Our main contribution is to derive a recursive algorithm to select significant kernels one at a time based on the minimum integrated square error (MISE) criterion for both the selection of kernels and the estimation of mixing weights. The proposed approach is simple to implement and the associated computational cost is very low. Specifically, the complexity of our algorithm is of the order of the number of training data N, which is much lower than the order of N² offered by the best existing sparse kernel density estimators. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with comparable accuracy to those of the classical Parzen window estimate and other existing sparse kernel density estimators.
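To make the two objects being compared concrete, the sketch below evaluates a Parzen window estimate (equal weights on every training point) and a sparse mixture of the same form (most weights zero), both under the nonnegativity and sum-to-one constraints. It does not implement the recursive kernel selection described in the abstract. The data, the bandwidth `h` and the hand-picked sparse weights are assumptions for illustration.

```python
import math


def gaussian_kernel(x, centre, h):
    """One-dimensional Gaussian kernel with bandwidth h."""
    z = (x - centre) / h
    return math.exp(-0.5 * z * z) / (h * math.sqrt(2.0 * math.pi))


def mixture_density(x, centres, weights, h):
    """Kernel mixture with nonnegative weights that sum to one."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to one")
    return sum(w * gaussian_kernel(x, c, h) for w, c in zip(weights, centres) if w > 0)


data = [-1.2, -0.8, -1.0, 0.9, 1.1, 1.0]      # illustrative training points
h = 0.5                                        # illustrative bandwidth
parzen_weights = [1.0 / len(data)] * len(data)
sparse_weights = [0.0, 0.0, 0.5, 0.0, 0.0, 0.5]  # two kernels kept, chosen by hand

step = 0.01
grid = [-6.0 + step * k for k in range(1201)]
parzen = [mixture_density(x, data, parzen_weights, h) for x in grid]
sparse = [mixture_density(x, data, sparse_weights, h) for x in grid]

parzen_mass = sum(parzen) * step
sparse_mass = sum(sparse) * step
# Integrated square error between the two estimates (Riemann sum).
ise = sum((p - s) ** 2 for p, s in zip(parzen, sparse)) * step
print(parzen_mass, sparse_mass, ise)
```

Evaluating the sparse estimate costs two kernel evaluations per query point instead of six, which is the practical appeal of sparsity when N is large.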
Abstract:
Optimal estimation (OE) improves sea surface temperature (SST) estimated from satellite infrared imagery in the “split-window”, in comparison to SST retrieved using the usual multi-channel (MCSST) or non-linear (NLSST) estimators. This is demonstrated using three months of observations of the Advanced Very High Resolution Radiometer (AVHRR) on the first Meteorological Operational satellite (Metop-A), matched in time and space to drifter SSTs collected on the global telecommunications system. There are 32,175 matches. The prior for the OE is forecast atmospheric fields from the Météo-France global numerical weather prediction system (ARPEGE), the forward model is RTTOV8.7, and a reduced state vector comprising SST and total column water vapour (TCWV) is used. Operational NLSST coefficients give mean and standard deviation (SD) of the difference between satellite and drifter SSTs of 0.00 and 0.72 K. The “best possible” NLSST and MCSST coefficients, empirically regressed on the data themselves, give zero mean difference and SDs of 0.66 K and 0.73 K respectively. Significant contributions to the global SD arise from regional systematic errors (biases) of several tenths of kelvin in the NLSST. With no bias corrections to either prior fields or forward model, the SSTs retrieved by OE minus drifter SSTs have mean and SD of −0.16 and 0.49 K respectively. The reduction in SD below the “best possible” regression results shows that OE deals with structural limitations of the NLSST and MCSST algorithms. Using simple empirical bias corrections to improve the OE, retrieved minus drifter SSTs are obtained with mean and SD of −0.06 and 0.44 K respectively. Regional biases are greatly reduced, such that the absolute bias is less than 0.1 K in 61% of 10°-latitude by 30°-longitude cells. OE also allows a statistic of the agreement between modelled and measured brightness temperatures to be calculated. We show that this measure is more efficient than the current system of confidence levels at identifying reliable retrievals, and that the best 75% of satellite SSTs by this measure have negligible bias and retrieval error of order 0.25 K.
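For orientation, the textbook form of a linearised optimal estimation retrieval can be written as follows. The symbols are assumptions, not taken from the abstract: $x_a$ is the prior state (here SST and TCWV from the forecast fields), $F$ the forward model, $K$ its Jacobian at $x_a$, $S_a$ the prior error covariance and $S_\epsilon$ the combined observation and forward-model error covariance. The agreement statistic shown is one common choice and may differ from the one used in the paper.

```latex
\[
x = \begin{pmatrix} \mathrm{SST} \\ \mathrm{TCWV} \end{pmatrix},
\qquad
\hat{x} = x_a + \left( K^{T} S_\epsilon^{-1} K + S_a^{-1} \right)^{-1}
          K^{T} S_\epsilon^{-1} \left( y - F(x_a) \right),
\]
\[
\chi^{2} = \left( y - F(\hat{x}) \right)^{T} S_\epsilon^{-1} \left( y - F(\hat{x}) \right).
\]
```

A large $\chi^{2}$ indicates that no state near the prior reproduces the measured brightness temperatures $y$, which is why such a statistic can flag unreliable retrievals.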
Abstract:
Simultaneous scintillometer measurements at multiple wavelengths (pairing visible or infrared with millimetre or radio waves) have the potential to provide estimates of path-averaged surface fluxes of sensible and latent heat. Traditionally, the equations to deduce fluxes from measurements of the refractive index structure parameter at the two wavelengths have been formulated in terms of absolute humidity. Here, it is shown that formulation in terms of specific humidity has several advantages. Specific humidity satisfies the requirement for a conserved variable in similarity theory and inherently accounts for density effects misapportioned through the use of absolute humidity. The validity and interpretation of both formulations are assessed and the analogy with open-path infrared gas analyser density corrections is discussed. Original derivations using absolute humidity to represent the influence of water vapour are shown to misrepresent the latent heat flux. The errors in the flux, which depend on the Bowen ratio (larger for drier conditions), may be of the order of 10%. The sensible heat flux is shown to remain unchanged. It is also verified that use of a single scintillometer at optical wavelengths is essentially unaffected by these new formulations. Where it may not be possible to reprocess two-wavelength results, a density correction to the latent heat flux is proposed for scintillometry, which can be applied retrospectively to reduce the error.
Abstract:
To overcome divergent estimates produced from the same data, the proposed digital costing process adopts an integrated information-system design in which the process knowledge and the costing system are designed together. By employing and extending a widely used international standard, Industry Foundation Classes, the system provides an integrated process that can harvest the information and knowledge embodied in current quantity surveying practice, covering both costing methods and data. Knowledge of quantification is encoded from the literature, a motivating case and standards. This can reduce the time consumed by current manual practice. Further development will represent the pricing process using a Bayesian-network-based knowledge representation. The hybrid knowledge representation can produce reliable estimates for construction projects. In practical terms, knowledge management of quantity surveying can improve construction estimation systems. The theoretical significance of this study is that its content and conclusions make it possible to develop an automatic estimation system based on a hybrid knowledge representation approach.
Abstract:
Background: Dietary assessment methods are important tools for nutrition research. Online dietary assessment tools have the potential to become invaluable methods of assessing dietary intake because, compared with traditional methods, they have many advantages including the automatic storage of input data and the immediate generation of nutritional outputs. Objective: The aim of this study was to develop an online food frequency questionnaire (FFQ) for dietary data collection in the “Food4Me” study and to compare this with the validated European Prospective Investigation of Cancer (EPIC) Norfolk printed FFQ. Methods: The Food4Me FFQ used in this analysis was developed to consist of 157 food items. Standardized color photographs were incorporated in the development of the Food4Me FFQ to facilitate accurate quantification of the portion size of each food item. Participants were recruited in two centers (Dublin, Ireland and Reading, United Kingdom) and each received the online Food4Me FFQ and the printed EPIC-Norfolk FFQ in random order. Participants completed the Food4Me FFQ online and, for most food items, participants were requested to choose their usual serving size among seven possibilities from a range of portion size pictures. The level of agreement between the two methods was evaluated for both nutrient and food group intakes using the Bland and Altman method and classification into quartiles of daily intake. Correlations were calculated for nutrient and food group intakes. Results: A total of 113 participants were recruited with a mean age of 30 (SD 10) years (40.7% male, 46/113; 59.3% female, 67/113). Cross-classification into exact plus adjacent quartiles ranged from 77% to 97% at the nutrient level and 77% to 99% at the food group level. Agreement at the nutrient level was highest for alcohol (97%) and lowest for percent energy from polyunsaturated fatty acids (77%). Crude unadjusted correlations for nutrients ranged between .43 and .86. Agreement at the food group level was highest for “other fruits” (eg, apples, pears, oranges) and lowest for “cakes, pastries, and buns”. For food groups, correlations ranged between .41 and .90. Conclusions: The results demonstrate that the online Food4Me FFQ has good agreement with the validated printed EPIC-Norfolk FFQ for assessing both nutrient and food group intakes, rendering it a useful tool for ranking individuals based on nutrient and food group intakes.
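The Bland and Altman method mentioned in the Methods summarises agreement between two instruments by the mean of the paired differences (the bias) and the limits of agreement, bias ± 1.96 SD. A minimal sketch follows. The intake values are made up for illustration and are not from the study, and the function name `bland_altman` is hypothetical.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up daily energy intakes (kcal) for five participants, illustration only.
online_ffq = [2100, 1850, 2400, 1990, 2250]
printed_ffq = [2000, 1900, 2300, 2050, 2200]

bias, lower, upper = bland_altman(online_ffq, printed_ffq)
print(bias, lower, upper)
```

A bias near zero with narrow limits indicates that the two questionnaires can be used interchangeably for that nutrient. Wide limits indicate poor agreement at the individual level even when the correlation is high.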
Abstract:
A new sparse kernel density estimator is introduced based on the minimum integrated square error criterion for the finite mixture model. Since the constraint on the mixing coefficients of the finite mixture model is on the multinomial manifold, we use the well-known Riemannian trust-region (RTR) algorithm for solving this problem. The first- and second-order Riemannian geometry of the multinomial manifold are derived and utilized in the RTR algorithm. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with an accuracy competitive with those of existing kernel density estimators.
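The optimisation problem described here can be written out as follows. This is a sketch under assumed notation, not the paper's own: $K_\sigma(x, x_j)$ is a kernel centred on training point $x_j$, $\beta$ the vector of mixing coefficients and $p$ the unknown true density.

```latex
\[
\hat{p}(x;\beta) = \sum_{j=1}^{N} \beta_j K_\sigma(x, x_j),
\qquad
\beta \in \Big\{ \beta \in \mathbb{R}^{N} : \beta_j > 0,\ \sum_{j=1}^{N} \beta_j = 1 \Big\},
\]
\[
\mathrm{ISE}(\beta) = \int \big( \hat{p}(x;\beta) - p(x) \big)^{2} \, dx
 = \beta^{T} Q \beta - 2\, \beta^{T} q + \int p(x)^{2} \, dx,
\]
\[
Q_{ij} = \int K_\sigma(x, x_i)\, K_\sigma(x, x_j) \, dx,
\qquad
q_j = \int K_\sigma(x, x_j)\, p(x) \, dx .
\]
```

The last term of the ISE does not depend on $\beta$, so the problem reduces to minimising $\beta^{T} Q \beta - 2\beta^{T} q$ over the constraint set, with $q$ estimated from the data because $p$ is unknown. The set of strictly positive coefficients summing to one is the multinomial manifold on which the Riemannian trust-region algorithm operates. For Gaussian kernels $Q_{ij}$ has a closed form, $Q_{ij} = K_{\sqrt{2}\sigma}(x_i, x_j)$.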
Abstract:
High bandwidth-efficiency quadrature amplitude modulation (QAM) signaling, widely adopted in high-rate communication systems, suffers from the drawback of a high peak-to-average power ratio, which may cause nonlinear saturation of the high power amplifier (HPA) at the transmitter. Thus, practical high-throughput QAM communication systems exhibit nonlinear and dispersive channel characteristics that must be modeled as a Hammerstein channel. Standard linear equalization becomes inadequate for such Hammerstein communication systems. In this paper, we advocate an adaptive B-spline neural network based nonlinear equalizer. Specifically, during the training phase, an efficient alternating least squares (LS) scheme is employed to estimate the parameters of the Hammerstein channel, including both the channel impulse response (CIR) coefficients and the parameters of the B-spline neural network that models the HPA’s nonlinearity. In addition, another B-spline neural network is used to model the inversion of the nonlinear HPA, and the parameters of this inverting B-spline model can easily be estimated using the standard LS algorithm based on the pseudo training data obtained as a natural byproduct of the Hammerstein channel identification. Nonlinear equalization of the Hammerstein channel is then accomplished by linear equalization based on the estimated CIR together with the inverse B-spline neural network model. Furthermore, during the data communication phase, decision-directed LS channel estimation is adopted to track the time-varying CIR. Extensive simulation results demonstrate the effectiveness of our proposed B-spline neural network based nonlinear equalization scheme.
Abstract:
Optimal state estimation is a method that requires minimising a weighted, nonlinear, least-squares objective function in order to obtain the best estimate of the current state of a dynamical system. Often the minimisation is non-trivial due to the large scale of the problem, the relative sparsity of the observations and the nonlinearity of the objective function. To simplify the problem the solution is often found via a sequence of linearised objective functions. The condition number of the Hessian of the linearised problem is an important indicator of the convergence rate of the minimisation and the expected accuracy of the solution. In the standard formulation the convergence is slow, indicating an ill-conditioned objective function. A transformation to different variables is often used to ameliorate the conditioning of the Hessian by changing, or preconditioning, the Hessian. There is only sparse information in the literature for describing the causes of ill-conditioning of the optimal state estimation problem and explaining the effect of preconditioning on the condition number. This paper derives descriptive theoretical bounds on the condition number of both the unpreconditioned and preconditioned system in order to better understand the conditioning of the problem. We use these bounds to explain why the standard objective function is often ill-conditioned and why a standard preconditioning reduces the condition number. We also use the bounds on the preconditioned Hessian to understand the main factors that affect the conditioning of the system. We illustrate the results with simple numerical experiments.
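The quantities discussed can be made concrete using the standard variational data assimilation form of the problem. The notation is an assumption, not given in the abstract: $x_b$ is the background state with error covariance $B$, $y$ the observations with error covariance $R$, $\mathcal{H}$ the observation operator and $H$ its linearisation.

```latex
\[
J(x) = \tfrac{1}{2} (x - x_b)^{T} B^{-1} (x - x_b)
     + \tfrac{1}{2} \big( y - \mathcal{H}(x) \big)^{T} R^{-1} \big( y - \mathcal{H}(x) \big),
\]
\[
S = B^{-1} + H^{T} R^{-1} H,
\qquad
\kappa(S) = \frac{\lambda_{\max}(S)}{\lambda_{\min}(S)} .
\]
With the change of variable $x - x_b = B^{1/2} z$, the Hessian becomes
\[
\hat{S} = I + B^{1/2} H^{T} R^{-1} H B^{1/2},
\qquad
\lambda_{\min}(\hat{S}) \ge 1
\;\Rightarrow\;
\kappa(\hat{S}) \le \lambda_{\max}(\hat{S}) .
\]
```

In the unpreconditioned Hessian the term $B^{-1}$ inherits the conditioning of $B$, which is typically poor for smooth background error correlations. After the transformation the smallest eigenvalue is bounded below by one, so the condition number is controlled by the largest eigenvalue alone.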
Abstract:
A new sparse kernel density estimator is introduced based on the minimum integrated square error criterion, combined with local component analysis, for the finite mixture model. We start with a Parzen window estimator that has Gaussian kernels with a common covariance matrix; local component analysis is first applied to find the covariance matrix using the expectation-maximization algorithm. Since the constraint on the mixing coefficients of a finite mixture model is on the multinomial manifold, we then use the well-known Riemannian trust-region algorithm to find the set of sparse mixing coefficients. The first- and second-order Riemannian geometry of the multinomial manifold are utilized in the Riemannian trust-region algorithm. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with accuracy competitive with that of existing kernel density estimators.
Abstract:
We estimate the conditions for detectability of two planets in a 2/1 mean-motion resonance from radial velocity data, as a function of their masses, number of observations and the signal-to-noise ratio. Even for a data set of the order of 100 observations and standard deviations of the order of a few meters per second, we find that Jovian-size resonant planets are difficult to detect if the masses of the planets differ by a factor larger than ~4. This is consistent with the present population of real exosystems in the 2/1 commensurability, most of which have resonant pairs with similar minimum masses, and could indicate that many other resonant systems exist, but are currently beyond the detectability limit. Furthermore, we analyze the error distribution in masses and orbital elements of orbital fits from synthetic data sets for resonant planets in the 2/1 commensurability. For various mass ratios and numbers of data points we find that the eccentricity of the outer planet is systematically overestimated, although the inner planet's eccentricity suffers a much smaller effect. If the initial conditions correspond to small-amplitude oscillations around stable apsidal corotation resonances, the amplitudes estimated from the orbital fits are biased toward larger amplitudes, in accordance with results found in real resonant extrasolar systems.
Abstract:
The purpose of this paper is to develop a Bayesian analysis for nonlinear regression models under scale mixtures of skew-normal distributions. This novel class of models provides a useful generalization of the symmetrical nonlinear regression models since the error distributions cover both skewed and heavy-tailed distributions such as the skew-t, skew-slash and skew-contaminated normal distributions. The main advantage of this class of distributions is that it has a convenient hierarchical representation that allows the implementation of Markov chain Monte Carlo (MCMC) methods to simulate samples from the joint posterior distribution. In order to examine the robustness of this flexible class against outlying and influential observations, we present Bayesian case-deletion influence diagnostics based on the Kullback-Leibler divergence. Further, some discussion of model selection criteria is given. The newly developed procedures are illustrated with two simulation studies and a real data set previously analyzed under normal and skew-normal nonlinear regression models. (C) 2010 Elsevier B.V. All rights reserved.