953 results for function estimation
Abstract:
Using the classical Parzen window estimate as the target function, the kernel density estimation is formulated as a regression problem and the orthogonal forward regression technique is adopted to construct sparse kernel density estimates. The proposed algorithm incrementally minimises a leave-one-out test error score to select a sparse kernel model, and a local regularisation method is incorporated into the density construction process to further enforce sparsity. The kernel weights are finally updated using the multiplicative nonnegative quadratic programming algorithm, which has the ability to reduce the model size further. Except for the kernel width, the proposed algorithm has no other parameters that need tuning, and the user is not required to specify any additional criterion to terminate the density construction procedure. Two examples are used to demonstrate the ability of this regression-based approach to effectively construct a sparse kernel density estimate with comparable accuracy to that of the full-sample optimised Parzen window density estimate.
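For reference, the Parzen window estimate used as the regression target places one equally weighted kernel on every training sample. A minimal one-dimensional sketch follows, assuming Gaussian kernels; the function name `parzen_window`, the toy samples, and the kernel width are illustrative assumptions and are not taken from the paper.

```python
import math

def parzen_window(samples, width):
    """Return a 1-D Parzen window density estimate with Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * width * math.sqrt(2.0 * math.pi))

    def density(x):
        # Equal weight 1/n on a kernel centred at every training sample.
        return norm * sum(math.exp(-0.5 * ((x - s) / width) ** 2) for s in samples)

    return density

# Hypothetical toy data, for illustration only.
samples = [-1.2, -0.4, 0.0, 0.3, 1.1]
pdf = parzen_window(samples, width=0.5)

# Riemann-sum check that the estimate integrates to roughly one.
step = 0.01
area = sum(pdf(-6.0 + i * step) * step for i in range(1201))
```

A sparse estimate keeps the same functional form but assigns nonzero weights to only a few of these kernels.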
Abstract:
A new sparse kernel probability density function (pdf) estimator based on a zero-norm constraint is constructed using the classical Parzen window (PW) estimate as the target function. The so-called zero-norm of the parameters is used to achieve enhanced model sparsity, and it is suggested to minimize an approximate function of the zero-norm. It is shown that, under certain conditions, the kernel weights of the proposed pdf estimator based on the zero-norm approximation can be updated using the multiplicative nonnegative quadratic programming algorithm. Numerical examples are employed to demonstrate the efficacy of the proposed approach.
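The abstract does not spell out the weight update, so the following is only a sketch of one multiplicative nonnegative quadratic programming iteration of the kind used in the sparse kernel density literature. It assumes the problem is to minimise 0.5 βᵀBβ − vᵀβ subject to β ≥ 0 and Σβ = 1, with B having strictly positive entries and v nonnegative; the names `mnqp_update`, `B`, `v`, and the toy values are hypothetical.

```python
def mnqp_update(B, v, beta, iterations=200):
    """Multiplicative update for: min 0.5*b'Bb - v'b  s.t. b >= 0, sum(b) = 1.

    Assumes every entry of B is positive and every entry of v is nonnegative.
    """
    n = len(beta)
    beta = list(beta)
    for _ in range(iterations):
        # c_i = beta_i / (B beta)_i; a weight that reaches zero stays at zero.
        Bb = [sum(B[i][j] * beta[j] for j in range(n)) for i in range(n)]
        c = [beta[i] / Bb[i] for i in range(n)]
        # h plays the role of the Lagrange multiplier that restores sum(beta) = 1.
        h = (1.0 - sum(c[i] * v[i] for i in range(n))) / sum(c)
        beta = [c[i] * (v[i] + h) for i in range(n)]
    return beta

# Hypothetical two-kernel problem, for illustration only.
B = [[1.0, 0.5], [0.5, 1.0]]
v = [0.6, 0.4]
beta = mnqp_update(B, v, [0.5, 0.5])
```

Because each step rescales the current weight, a weight driven to zero never becomes active again, which is how this kind of update can shrink the model.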
Abstract:
We develop a new sparse kernel density estimator using a forward constrained regression framework, within which the nonnegativity and summing-to-unity constraints on the mixing weights can easily be satisfied. Our main contribution is to derive a recursive algorithm that selects significant kernels one at a time, using the minimum integrated square error (MISE) criterion for both the selection of kernels and the estimation of mixing weights. The proposed approach is simple to implement and the associated computational cost is very low. Specifically, the complexity of our algorithm is of the order of the number of training data N, which is much lower than the order of N^2 required by the best existing sparse kernel density estimators. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with comparable accuracy to those of the classical Parzen window estimate and other existing sparse kernel density estimators.
Abstract:
In this paper we propose methods for smooth hazard estimation of a time variable where that variable is interval censored. These methods allow one to model the transformed hazard in terms of either smooth (smoothing splines) or linear functions of time and other relevant time varying predictor variables. We illustrate the use of this method on a dataset of hemophiliacs where the outcome, time to seroconversion for HIV, is interval censored and left-truncated.
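As background, the standard likelihood contribution of an interval-censored, left-truncated observation can be written in terms of the hazard as below. This is the textbook form, not necessarily the exact parameterisation of the paper; the symbols V_i (truncation time) and (L_i, R_i] (censoring interval) are notation assumed here.

```latex
S(t) = \exp\!\left( -\int_0^{t} h(u)\, du \right),
\qquad
\mathcal{L}_i = \frac{S(L_i) - S(R_i)}{S(V_i)},
\qquad V_i \le L_i < R_i .
```

The numerator is the probability that the event falls in the observed interval, and dividing by S(V_i) conditions on the subject being event-free at entry.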
Abstract:
A generalized or tunable-kernel model is proposed for probability density function estimation based on an orthogonal forward regression procedure. Each stage of the density estimation process determines a tunable kernel, namely, its center vector and diagonal covariance matrix, by minimizing a leave-one-out test criterion. The kernel mixing weights of the constructed sparse density estimate are finally updated using the multiplicative nonnegative quadratic programming algorithm to ensure the nonnegative and unity constraints, and this weight-updating process additionally has the desired ability to further reduce the model size. The proposed tunable-kernel model has advantages, in terms of model generalization capability and model sparsity, over the standard fixed-kernel model that restricts kernel centers to the training data points and employs a single common kernel variance for every kernel. On the other hand, it does not optimize all the model parameters together and thus avoids the problems of high-dimensional ill-conditioned nonlinear optimization associated with the conventional finite mixture model. Several examples are included to demonstrate the ability of the proposed novel tunable-kernel model to effectively construct a very compact density estimate accurately.
Abstract:
In this work, we present results from teleseismic P-wave receiver functions (PRFs) obtained in Portugal, Western Iberia. A dense seismic station deployment conducted between 2010 and 2012, in the scope of the WILAS project and covering the entire country, allowed the most spatially extensive probing of the bulk crustal seismic properties of Portugal to date. Applying the H-kappa stacking algorithm to the PRFs enabled us to estimate the crustal thickness (H) and the average crustal P- to S-wave velocity ratio Vp/Vs (kappa) for the region. Observations of Moho conversions indicate that this interface is relatively smooth, with crustal thickness ranging between 24 and 34 km and averaging 30 km. The highest Vp/Vs values are found in the Mesozoic-Cenozoic crust beneath the western and southern coastal domain of Portugal, whereas the lowest values correspond to Paleozoic crust underlying the remaining part of the study area. The average Vp/Vs is 1.72, ranging from 1.63 to 1.86 across the study area, indicating a predominantly felsic composition. Overall, we systematically observe a decrease of Vp/Vs with increasing crustal thickness. Taken as a whole, our results indicate a clear distinction between the geological zones of the Variscan Iberian Massif in Portugal, the overall shape of the anomalies conditioned by the shape of the Ibero-Armorican Arc and associated Late Paleozoic suture zones, and the Meso-Cenozoic basin associated with Atlantic rifting stages. Thickened crust (30-34 km) across the studied region may be inherited from continental collision during the Paleozoic Variscan orogeny. An anomalous crustal thinning to around 28 km is observed beneath the central part of the Central Iberian Zone and the eastern part of the South Portuguese Zone.
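H-kappa stacking relies on the predicted delay times of the Moho-converted phases for a trial pair (H, kappa). The sketch below computes those delays for a single flat layer; the function name `hk_arrival_times`, the average crustal P velocity of 6.3 km/s, and the ray parameter are illustrative assumptions, not values reported in the abstract.

```python
import math

def hk_arrival_times(H, kappa, p, vp=6.3):
    """Delays (s) after direct P of Ps, PpPs and PpSs+PsPs for one flat layer.

    H: crustal thickness (km); kappa: Vp/Vs; p: ray parameter (s/km);
    vp: assumed average crustal P velocity (km/s), an illustrative value.
    """
    vs = vp / kappa
    eta_s = math.sqrt(1.0 / vs**2 - p**2)  # vertical S slowness
    eta_p = math.sqrt(1.0 / vp**2 - p**2)  # vertical P slowness
    t_ps = H * (eta_s - eta_p)
    t_ppps = H * (eta_s + eta_p)
    t_ppss = 2.0 * H * eta_s
    return t_ps, t_ppps, t_ppss

# Trial values close to the regional averages quoted above.
t_ps, t_ppps, t_ppss = hk_arrival_times(H=30.0, kappa=1.72, p=0.06)
```

In the usual formulation, the receiver-function amplitudes at these three times are combined in a weighted sum (with the third phase entering negatively), and the (H, kappa) pair that maximises the stack over all events is taken as the estimate.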
Abstract:
It has not yet been established whether the spatial variation of particle number concentration (PNC) within a microscale environment can have an effect on exposure estimation results. In general, the degree of spatial variation within microscale environments remains unclear, since previous studies have only focused on spatial variation within macroscale environments. The aims of this study were to determine the spatial variation of PNC within microscale school environments, in order to assess the importance of the number of monitoring sites on exposure estimation. Furthermore, this paper aims to identify which parameters have the largest influence on spatial variation, as well as the relationship between those parameters and spatial variation. Air quality measurements were conducted for two consecutive weeks at each of the 25 schools across Brisbane, Australia. PNC was measured at three sites within the grounds of each school, along with the measurement of meteorological and several other air quality parameters. Traffic density was recorded for the busiest road adjacent to the school. Spatial variation at each school was quantified using coefficient of variation (CV). The portion of CV associated with instrument uncertainty was found to be 0.3 and therefore, CV was corrected so that only non-instrument uncertainty was analysed in the data. The median corrected CV (CVc) ranged from 0 to 0.35 across the schools, with 12 schools found to exhibit spatial variation. The study determined the number of required monitoring sites at schools with spatial variability and tested the deviation in exposure estimation arising from using only a single site. Nine schools required two measurement sites and three schools required three sites. Overall, the deviation in exposure estimation from using only one monitoring site was as much as one order of magnitude. 
The study also tested the association of spatial variation with wind speed/direction and traffic density, using partial correlation coefficients to identify sources of variation and non-parametric function estimation to quantify the level of variability. Traffic density and road-to-school wind direction were found to have a positive effect on CVc and, therefore, on spatial variation. Wind speed had a decreasing effect on spatial variation when it exceeded a threshold of 1.5 m/s, and no effect below this threshold. The positive effect of traffic density on spatial variation increased up to a density of 70 vehicles per five minutes, at which point it plateaued.
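The coefficient of variation used to quantify spatial variation is the standard deviation of the site readings divided by their mean. A minimal sketch follows; the readings are invented, the use of the sample standard deviation is an assumption, and the study's correction for instrument uncertainty is not reproduced because the abstract does not give its formula.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical simultaneous PNC readings (particles per cm^3) at three sites.
site_readings = [9800.0, 12100.0, 15400.0]
cv = coefficient_of_variation(site_readings)
```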
Abstract:
We consider the analysis of longitudinal data when the covariance function is modeled by additional parameters to the mean parameters. In general, inconsistent estimators of the covariance (variance/correlation) parameters will be produced when the "working" correlation matrix is misspecified, which may result in a great loss of efficiency of the mean parameter estimators (although their consistency is preserved). We consider using different "working" correlation models for the variance and the mean parameters. In particular, we find that an independence working model should be used for estimating the variance parameters to ensure their consistency in case the correlation structure is misspecified. The designated "working" correlation matrices should be used for estimating the mean and the correlation parameters to attain high efficiency for estimating the mean parameters. Simulation studies indicate that the proposed algorithm performs very well. We also applied different estimation procedures to a data set from a clinical trial for illustration.
Abstract:
As a promising method for pattern recognition and function estimation, least squares support vector machines (LS-SVM) express training as the solution of a linear system instead of a quadratic programming problem, as in conventional support vector machines (SVM). In this paper, by using the information provided by the equality constraint, we transform the minimization problem with a single equality constraint in LS-SVM into an unconstrained minimization problem, and then propose reduced formulations for LS-SVM. With this transformation, the conjugate gradient (CG) method, which is by far the most time-consuming step in obtaining the numerical solution, needs to be run only once instead of twice as proposed by Suykens et al. (1999). A comparison of the computational speed of our method with the CG method proposed by Suykens et al. and with the first-order and second-order SMO methods on several benchmark data sets shows a reduction of training time by up to 44%. © 2011 Elsevier B.V. All rights reserved.
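For context, the linear system behind LS-SVM and the two-solve elimination attributed to Suykens et al. can be written as follows for the function estimation (regression) case. This is a sketch of the standard formulation, with K the kernel matrix, gamma the regularisation constant, and H, eta, nu as notation assumed here; it does not reproduce the reduced formulation proposed in the paper.

```latex
\begin{bmatrix} 0 & \mathbf{1}^{\top} \\ \mathbf{1} & K + \gamma^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \boldsymbol{\alpha} \end{bmatrix}
=
\begin{bmatrix} 0 \\ \mathbf{y} \end{bmatrix},
\qquad
f(x) = \sum_{i=1}^{N} \alpha_i \, k(x, x_i) + b .

% Two-solve elimination with H = K + \gamma^{-1} I (symmetric positive definite):
H \boldsymbol{\eta} = \mathbf{1}, \qquad
H \boldsymbol{\nu} = \mathbf{y}, \qquad
b = \frac{\mathbf{1}^{\top} \boldsymbol{\nu}}{\mathbf{1}^{\top} \boldsymbol{\eta}}, \qquad
\boldsymbol{\alpha} = \boldsymbol{\nu} - b \, \boldsymbol{\eta} .
```

Each of the two systems in H is typically solved by CG, which is why removing one of them translates directly into shorter training time.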
Abstract:
Supervised learning of large-scale hierarchical networks is currently enjoying tremendous success. Despite this enthusiasm, unsupervised learning remains, according to many researchers, a key element of Artificial Intelligence, where agents must learn from a potentially limited amount of data. This thesis follows that line of thought and addresses several research topics related to the density estimation problem through Boltzmann machines (BMs), probabilistic graphical models at the heart of deep learning. Our contributions span sampling, partition function estimation, optimization, and the learning of invariant representations. The thesis opens with a new adaptive sampling algorithm that automatically adjusts the temperature of the simulated Markov chains in order to maintain a high convergence speed throughout learning. When used in the context of stochastic maximum likelihood (SML) learning, our algorithm yields increased robustness to the choice of learning rate, as well as faster convergence. Our results are presented for BMs, but the method is general and applies to learning any probabilistic model that relies on Markov chain sampling. While the maximum likelihood gradient can be approximated by sampling, evaluating the log-likelihood requires an estimate of the partition function. In contrast to traditional approaches that treat a given model as a black box, we propose instead to exploit the learning dynamics by estimating the successive changes in log-partition incurred at each parameter update.
The estimation problem is reformulated as an inference problem similar to the Kalman filter, but on a two-dimensional graph whose dimensions correspond to the time axis and the temperature parameter. On the topic of optimization, we also present an algorithm for efficiently applying the natural gradient to Boltzmann machines with thousands of units. Until now, its adoption had been limited by its high computational cost and memory requirements. Our algorithm, Metric-Free Natural Gradient (MFNG), avoids explicitly computing the Fisher information matrix (and its inverse) by using a linear solver combined with an efficient matrix-vector product. The algorithm is promising: in terms of the number of function evaluations, MFNG converges faster than SML. Unfortunately, its implementation remains inefficient in computation time. This work also explores the mechanisms underlying the learning of invariant representations. To this end, we use the family of "spike & slab" restricted Boltzmann machines (ssRBM), which we modify so that they can model binary and sparse distributions. The binary latent variables of the ssRBM can be made invariant to a vector subspace by associating with each of them a vector of continuous latent variables (called "slabs"). This results in increased invariance at the representation level and a better classification rate when little labeled data is available. We close the thesis with an ambitious topic: learning representations that can separate the factors of variation present in the input signal. We propose a solution based on a bilinear ssRBM (with two groups of latent factors) and formulate the problem as one of "pooling" in complementary vector subspaces.
Abstract:
Technology involving genetic modification of crops has the potential to make a contribution to rural poverty reduction in many developing countries. Thus far, insecticide-producing 'Bt' varieties of cotton have been the main GM crops under cultivation in developing nations. Several studies have evaluated the farm-level performance of Bt varieties in comparison to conventional ones by estimating production technology, and have mostly found Bt technology to be very successful in raising output and/or reducing insecticide input. However, the production risk properties of this technology have not been studied, although they are likely to be important to risk-averse smallholders. This study investigates the output risk aspects of Bt technology using a three-year farm-level dataset on smallholder cotton production in Makhathini Flats, KwaZulu-Natal, South Africa. Stochastic dominance and stochastic production function estimation methods are used to examine the risk properties of the two technologies. Results indicate that Bt technology increases output risk by being most effective when crop growth conditions are good, but less effective when conditions are less favourable. However, in spite of its risk-increasing effect, the mean output performance of Bt cotton is good enough to make it preferable to conventional technology even for risk-averse smallholders.
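A stochastic dominance comparison of this kind can be illustrated with a first-order check on empirical distribution functions: one technology dominates another if its cumulative distribution lies at or below the other's everywhere, and strictly below somewhere. The sketch below uses invented yield figures that say nothing about the study's data or findings; the names `ecdf`, `first_order_dominates`, `bt_yield`, and `conv_yield` are hypothetical.

```python
def ecdf(sample, x):
    """Empirical cumulative distribution function of sample evaluated at x."""
    return sum(1 for s in sample if s <= x) / len(sample)

def first_order_dominates(a, b):
    """True if sample a first-order stochastically dominates sample b."""
    grid = sorted(set(a) | set(b))
    # a dominates b when F_a(x) <= F_b(x) everywhere, strictly somewhere.
    diffs = [ecdf(b, x) - ecdf(a, x) for x in grid]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)

# Hypothetical yields (kg/ha), invented purely for illustration.
bt_yield = [420.0, 510.0, 650.0, 700.0]
conv_yield = [380.0, 450.0, 500.0, 640.0]
bt_dominates = first_order_dominates(bt_yield, conv_yield)
```

First-order dominance is the strictest criterion; analyses aimed at risk-averse decision makers usually also examine second-order dominance, which compares the integrated distribution functions.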