852 results for kernel estimation





Summary: Following recent technological advances, digital image archives have seen unprecedented qualitative and quantitative growth. Despite the enormous possibilities they offer, these advances raise new questions about how to process the masses of data acquired. This question is the basis of this Thesis: problems of processing digital information at very high spatial and/or spectral resolution are addressed using statistical learning approaches, namely kernel methods. This Thesis studies image classification problems, that is, the categorization of pixels into a small number of classes reflecting the spectral and contextual properties of the objects they represent. The emphasis is on the efficiency of the algorithms and on their simplicity, so as to increase their potential for implementation by users. Moreover, the challenge of this Thesis is to remain close to the concrete problems of satellite image users without losing sight of the interest of the proposed methods for the machine learning community from which they originate. In this sense, this work is deliberately transdisciplinary, maintaining a strong link between the two sciences in all the developments proposed. Four models are proposed: the first addresses the problem of high dimensionality and data redundancy with a model that optimizes classification performance by adapting to the particularities of the image. This is made possible by a ranking system for the variables (the bands) that is optimized at the same time as the base model: in this way, only the variables important for solving the problem are used by the classifier.
The lack of labeled information and the uncertainty about its relevance to the problem are at the root of the next two models, based respectively on active learning and semi-supervised methods: the first improves the quality of a training set through direct interaction between the user and the machine, while the second uses unlabeled pixels to improve the description of the available data and the robustness of the model. Finally, the last model proposed considers the more theoretical question of structure among the outputs: the integration of this source of information, never before considered in remote sensing, opens up new research challenges.
Advanced kernel methods for remote sensing image classification. Devis Tuia, Institut de Géomatique et d'Analyse du Risque, September 2009. Abstract: The technical developments in recent years have brought the quantity and quality of digital information to an unprecedented level, as enormous archives of satellite images are available to the users. However, even if these advances open more and more possibilities in the use of digital imagery, they also raise several problems of storage and processing. The latter is considered in this Thesis: the processing of very high spatial and spectral resolution images is treated with approaches based on data-driven algorithms relying on kernel methods. In particular, the problem of image classification, i.e. the categorization of the image's pixels into a reduced number of classes reflecting spectral and contextual properties, is studied through the different models presented. The emphasis is on algorithmic efficiency and the simplicity of the approaches proposed, to avoid overly complex models that users would not adopt.
The major challenge of the Thesis is to remain close to concrete remote sensing problems, without losing the methodological interest from the machine learning viewpoint: in this sense, this work aims at building a bridge between the machine learning and remote sensing communities, and all the models proposed have been developed keeping in mind the need for such a synergy. Four models are proposed: first, an adaptive model learning the relevant image features has been proposed to solve the problem of high dimensionality and collinearity of the image features. This model automatically provides an accurate classifier and a ranking of the relevance of the single features. The scarcity and unreliability of labeled information were the common root of the second and third models proposed: when confronted with such problems, the user can either construct the labeled set iteratively by direct interaction with the machine or use the unlabeled data to increase robustness and quality of the description of data. Both solutions have been explored, resulting in two methodological contributions, based respectively on active learning and semi-supervised learning. Finally, the more theoretical issue of structured outputs has been considered in the last model, which, by integrating output similarity into a model, opens new challenges and opportunities for remote sensing image processing.





Preface The starting point for this work, and eventually the subject of the whole thesis, was the question of how to estimate the parameters of affine stochastic volatility jump-diffusion models. These models are very important for contingent claim pricing. Their major advantage, the availability of analytical solutions for characteristic functions, made them the models of choice for many theoretical constructions and practical applications. At the same time, estimating the parameters of stochastic volatility jump-diffusion models is not a straightforward task. The problem comes from the variance process, which is non-observable. There are several estimation methodologies that deal with estimation problems of latent variables. One appeared to be particularly interesting. It proposes an estimator that, in contrast to the other methods, requires neither discretization nor simulation of the process: the Continuous Empirical Characteristic Function (ECF) estimator based on the unconditional characteristic function. However, the procedure was derived only for stochastic volatility models without jumps. Thus, it has become the subject of my research. This thesis consists of three parts. Each one is written as an independent and self-contained article. At the same time, the questions answered by the second and third parts of this work arise naturally from the issues investigated and results obtained in the first one. The first chapter is the theoretical foundation of the thesis. It proposes an estimation procedure for stochastic volatility models with jumps both in the asset price and variance processes. The estimation procedure is based on the joint unconditional characteristic function for the stochastic process. The major analytical result of this part, as well as of the whole thesis, is the closed-form expression for the joint unconditional characteristic function for stochastic volatility jump-diffusion models.
The empirical part of the chapter suggests that, besides stochastic volatility, jumps both in the mean and the volatility equation are relevant for modelling returns of the S&P500 index, which has been chosen as a general representative of the stock asset class. Hence, the next question is: which jump process to use to model returns of the S&P500. The decision about the jump process in the framework of affine jump-diffusion models boils down to defining the intensity of the compound Poisson process, a constant or some function of state variables, and to choosing the distribution of the jump size. While the jump in the variance process is usually assumed to be exponential, there are at least three distributions of the jump size which are currently used for the asset log-prices: normal, exponential and double exponential. The second part of this thesis shows that normal jumps in the asset log-returns should be used if we are to model the S&P500 index by a stochastic volatility jump-diffusion model. This is a surprising result. The exponential distribution has fatter tails, and for this reason either an exponential or a double exponential jump size was expected to provide the best fit of the stochastic volatility jump-diffusion models to the data. The idea of testing the efficiency of the Continuous ECF estimator on simulated data had already appeared when the first estimation results of the first chapter were obtained. In the absence of a benchmark or any ground for comparison, it is unreasonable to be sure that our parameter estimates and the true parameters of the models coincide. The conclusion of the second chapter provides one more reason to do that kind of test. Thus, the third part of this thesis concentrates on the estimation of parameters of stochastic volatility jump-diffusion models on the basis of asset price time series simulated from various "true" parameter sets.
The goal is to show that the Continuous ECF estimator based on the joint unconditional characteristic function is capable of finding the true parameters, and the third chapter shows that our estimator indeed has the ability to do so. Once it is clear that the Continuous ECF estimator based on the unconditional characteristic function is working, the next question arises immediately. The question is whether the computational effort can be reduced without affecting the efficiency of the estimator, or whether the efficiency of the estimator can be improved without dramatically increasing the computational burden. The efficiency of the Continuous ECF estimator depends on the number of dimensions of the joint unconditional characteristic function which is used for its construction. Theoretically, the more dimensions there are, the more efficient the estimation procedure is. In practice, however, this relationship is not so straightforward due to the increasing computational difficulties. The second chapter, for example, in addition to the choice of the jump process, discusses the possibility of using the marginal, i.e. one-dimensional, unconditional characteristic function in the estimation instead of the joint, bi-dimensional, unconditional characteristic function. As a result, the preference for one or the other depends on the model to be estimated. Thus, the computational effort can be reduced in some cases without affecting the efficiency of the estimator. Improving the estimator's efficiency by increasing its dimensionality faces more difficulties. The third chapter of this thesis, in addition to what was discussed above, compares the performance of the estimators with bi- and three-dimensional unconditional characteristic functions on the simulated data.
It shows that the theoretical efficiency of the Continuous ECF estimator based on the three-dimensional unconditional characteristic function is not attainable in practice, at least for the moment, due to the limitations on the computer power and optimization toolboxes available to the general public. Thus, the Continuous ECF estimator based on the joint, bi-dimensional, unconditional characteristic function has all the reasons to exist and to be used for the estimation of parameters of the stochastic volatility jump-diffusion models.





Numerous sources of evidence point to the fact that heterogeneity within the Earth's deep crystalline crust is complex and hence may be best described through stochastic rather than deterministic approaches. As seismic reflection imaging arguably offers the best means of sampling deep crustal rocks in situ, much interest has been expressed in using such data to characterize the stochastic nature of crustal heterogeneity. Previous work on this problem has shown that the spatial statistics of seismic reflection data are indeed related to those of the underlying heterogeneous seismic velocity distribution. As of yet, however, the nature of this relationship has remained elusive due to the fact that most of the work was either strictly empirical or based on incorrect methodological approaches. Here, we introduce a conceptual model, based on the assumption of weak scattering, that allows us to quantitatively link the second-order statistics of a 2-D seismic velocity distribution with those of the corresponding processed and depth-migrated seismic reflection image. We then perform a sensitivity study in order to investigate what information regarding the stochastic model parameters describing crustal velocity heterogeneity might potentially be recovered from the statistics of a seismic reflection image using this model. Finally, we present a Monte Carlo inversion strategy to estimate these parameters and we show examples of its application at two different source frequencies and using two different sets of prior information. Our results indicate that the inverse problem is inherently non-unique and that many different combinations of the vertical and lateral correlation lengths describing the velocity heterogeneity can yield seismic images with the same 2-D autocorrelation structure. 
The ratio of all of these possible combinations of vertical and lateral correlation lengths, however, remains roughly constant which indicates that, without additional prior information, the aspect ratio is the only parameter describing the stochastic seismic velocity structure that can be reliably recovered.





The nutritional status of cystic fibrosis (CF) patients has to be regularly evaluated and alimentary support instituted when indicated. Bio-electrical impedance analysis (BIA) is a recent method for determining body composition. The present study evaluates its use in CF patients without any clinical sign of malnutrition. Thirty-nine patients with CF and 39 healthy subjects aged 6-24 years were studied. Body density and mid-arm muscle circumference were determined by anthropometry and skinfold measurements. Fat-free mass was calculated taking into account the body density. Muscle mass was obtained from the urinary creatinine excretion rate. The resistance index was calculated by dividing the square of the subject's height by the body impedance. We show that fat-free mass, mid-arm muscle circumference and muscle mass are each linearly correlated to the resistance index and that the regression equations are similar for both CF patients and healthy subjects.
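As an illustration of the resistance index described above (height squared divided by body impedance), here is a minimal sketch. The function name, the units (centimetres and ohms) and the example subject are assumptions made for illustration; the abstract does not state units or give the regression coefficients, so none are shown.

```python
def resistance_index(height_cm: float, impedance_ohm: float) -> float:
    """Resistance index = (height squared) / body impedance.

    Units are an assumption (cm and ohm); the abstract does not specify them.
    """
    if impedance_ohm <= 0:
        raise ValueError("impedance must be positive")
    return height_cm ** 2 / impedance_ohm


# Hypothetical subject: 150 cm tall with a body impedance of 600 ohm.
ri = resistance_index(150.0, 600.0)
print(ri)  # 37.5
```

In the study, quantities such as fat-free mass are then regressed linearly on this index; fitting such a regression would require the measured data, which the abstract does not provide.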





In this work we show that, in the class of assignment games with a dominant diagonal (Solymosi and Raghavan, 2001), the Thompson allocation (which coincides with the tau value) is the unique core point that is maximal with respect to the Lorenz dominance relation, and that it also coincides with the solution of Dutta and Ray (1989), also known as the egalitarian solution. Second, by means of a condition stronger than the dominant diagonal, we introduce a new class of assignment games in which each agent obtains with their optimal partner at least twice what they would obtain with any other partner. For these assignment games with a 2-dominant diagonal, the Thompson allocation is the unique point of the kernel, and therefore the nucleolus.





We propose an iterative procedure to minimize the sum of squares function which avoids the nonlinear nature of estimating the first-order moving average parameter and provides a closed form of the estimator. The asymptotic properties of the method are discussed and the consistency of the linear least squares estimator is proved for the invertible case. We perform various Monte Carlo experiments in order to compare the sample properties of the linear least squares estimator with its nonlinear counterpart for the conditional and unconditional cases. Some examples are also discussed.





PURPOSE: To derive a prediction rule by using prospectively obtained clinical and bone ultrasonographic (US) data to identify elderly women at risk for osteoporotic fractures. MATERIALS AND METHODS: The study was approved by the Swiss Ethics Committee. A prediction rule was computed by using data from a 3-year prospective multicenter study to assess the predictive value of heel-bone quantitative US in 6174 Swiss women aged 70-85 years. A quantitative US device to calculate the stiffness index at the heel was used. Baseline characteristics, known risk factors for osteoporosis and fall, and the quantitative US stiffness index were used to elaborate a predictive rule for osteoporotic fracture. Predictive values were determined by using a univariate Cox model and were adjusted with multivariate analysis. RESULTS: There were five risk factors for the incidence of osteoporotic fracture: older age (>75 years) (P < .001), low heel quantitative US stiffness index (<78%) (P < .001), history of fracture (P = .001), recent fall (P = .001), and a failed chair test (P = .029). The score points assigned to these risk factors were as follows: age, 2 (3 if age > 80 years); low quantitative US stiffness index, 5 (7.5 if stiffness index < 60%); history of fracture, 1; recent fall, 1.5; and failed chair test, 1. The cutoff value to obtain a high sensitivity (90%) was 4.5. With this cutoff, 1464 women were at lower risk (score, <4.5) and 4710 were at higher risk (score, ≥4.5) for fracture. Among the higher-risk women, 6.1% had an osteoporotic fracture, versus 1.8% of women at lower risk. Among the women who had a hip fracture, 90% were in the higher-risk group. CONCLUSION: A prediction rule obtained by using quantitative US stiffness index and four clinical risk factors helped discriminate, with high sensitivity, women at higher versus those at lower risk for osteoporotic fracture.
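The point scheme in the RESULTS can be written out as a short scoring function. This is a sketch of the arithmetic as stated in the abstract, not a validated clinical tool; the function and parameter names are hypothetical, and the two example patients are invented for illustration.

```python
CUTOFF = 4.5  # cutoff reported in the abstract for 90% sensitivity


def fracture_risk_score(age_years, stiffness_index_pct, history_of_fracture,
                        recent_fall, failed_chair_test):
    """Sum the score points listed in the abstract for one patient."""
    score = 0.0
    # Age: 2 points if > 75 years, 3 points if > 80 years.
    if age_years > 80:
        score += 3
    elif age_years > 75:
        score += 2
    # Heel stiffness index: 5 points if < 78%, 7.5 points if < 60%.
    if stiffness_index_pct < 60:
        score += 7.5
    elif stiffness_index_pct < 78:
        score += 5
    if history_of_fracture:
        score += 1
    if recent_fall:
        score += 1.5
    if failed_chair_test:
        score += 1
    return score


def is_higher_risk(score):
    """Higher risk means a score at or above the cutoff."""
    return score >= CUTOFF


# Hypothetical patient A: 82 years, stiffness index 70%, recent fall only.
score_a = fracture_risk_score(82, 70, False, True, False)
print(score_a, is_higher_risk(score_a))  # 9.5 True

# Hypothetical patient B: 77 years, stiffness index 85%, prior fracture only.
score_b = fracture_risk_score(77, 85, True, False, False)
print(score_b, is_higher_risk(score_b))  # 3.0 False
```

Note that a low stiffness index alone (5 points) already exceeds the 4.5 cutoff, which is consistent with most of the cohort (4710 of 6174 women) falling in the higher-risk group.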





While the incidence of sleep disorders is continuously increasing in western societies, there is a clear demand for technologies to assess sleep-related parameters in ambulatory scenarios. The present study introduces a novel concept for an accurate sensor that measures RR intervals via the analysis of photo-plethysmographic signals recorded at the wrist. In a cohort of 26 subjects undergoing full-night polysomnography, the wrist device provided RR interval estimates in agreement with RR intervals as measured from standard electrocardiographic time series. The study showed an overall agreement between both approaches of 0.05 ± 18 ms. The novel wrist sensor opens the door towards a new generation of comfortable and easy-to-use sleep monitors.





The amino acid composition of the protein from three strains of rat (Wistar, Zucker lean and Zucker obese), subjected to reference and high-fat diets, has been used to determine the mean empirical formula, molecular weight and N content of whole-rat protein. The combined whole protein of the rat was uniform for the six experimental groups, containing an estimated 17.3% N and a mean aminoacyl residue molecular weight of 103.7. This suggests that the appropriate protein factor for the calculation of rat protein from its N content should be 5.77 instead of the classical 6.25. In addition, an estimate of the size of the non-protein N mass in the whole rat gave a figure in the range of 5.5% of all N. The combination of the two calculations gives a protein factor of 5.5 for the conversion of total N into rat protein.
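The two protein factors follow from simple arithmetic, sketched below under the assumption that the factor is the reciprocal of the N mass fraction of protein and that non-protein N is excluded by scaling total N. Variable names are illustrative. Using the rounded 17.3% gives about 5.78 rather than the reported 5.77; the small difference presumably comes from rounding of the N content in the abstract.

```python
# Values reported in the abstract.
n_fraction_of_protein = 0.173   # protein contains an estimated 17.3% N
non_protein_n_share = 0.055     # about 5.5% of all N is non-protein N

# Factor converting protein N into protein mass: 1 / (N mass fraction).
factor_protein_n = 1 / n_fraction_of_protein

# Factor converting TOTAL N into protein mass: only the protein share of
# total N (1 - 0.055) should be multiplied by the factor above.
factor_total_n = factor_protein_n * (1 - non_protein_n_share)

print(round(factor_protein_n, 2))  # 5.78
print(round(factor_total_n, 2))    # 5.46
```

Rounded to one decimal, the second value is the 5.5 quoted for converting total N into rat protein.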





In this paper we propose an innovative methodology for automated profiling of illicit tablets by their surface granularity, a feature previously unexamined for this purpose. We make use of the tiny inconsistencies at the tablet surface, referred to as speckles, to generate a quantitative granularity profile of tablets. Euclidean distance is used as a measurement of (dis)similarity between granularity profiles. The frequency of observed distances is then modelled by kernel density estimation in order to generalize the observations and to calculate likelihood ratios (LRs). The resulting LRs are used to evaluate the potential of granularity profiles to differentiate between same-batch and different-batch tablets. Furthermore, we use the LRs as a similarity metric to refine database queries. We are able to derive reliable LRs within a scope that represents the true evidential value of the granularity feature. These metrics are used to refine candidate hit-lists from a database containing physical features of illicit tablets. We observe improved or identical ranking of candidate tablets in 87.5% of cases when granularity is considered.
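The step from observed distances to likelihood ratios can be sketched as follows. This is a minimal illustration assuming a Gaussian kernel with a fixed, hand-picked bandwidth; the abstract does not specify the kernel or the bandwidth selection, and the distance values below are invented, not data from the paper.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from samples with a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Invented Euclidean distances between granularity profiles.
same_batch_distances = [0.8, 0.9, 1.0, 1.1, 1.3]
different_batch_distances = [2.5, 2.8, 3.0, 3.4, 3.9]

BANDWIDTH = 0.5  # assumption: fixed bandwidth chosen by hand
f_same = gaussian_kde(same_batch_distances, BANDWIDTH)
f_diff = gaussian_kde(different_batch_distances, BANDWIDTH)


def likelihood_ratio(distance):
    """LR = p(distance | same batch) / p(distance | different batches)."""
    return f_same(distance) / f_diff(distance)


# A small distance supports the same-batch hypothesis (LR > 1),
# a large distance supports the different-batches hypothesis (LR < 1).
print(likelihood_ratio(1.0) > 1.0)  # True
print(likelihood_ratio(3.0) < 1.0)  # True
```

A production version would need a data-driven bandwidth and a guard for distances where the denominator density is numerically zero; both are left out of this sketch.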





This paper introduces a nonlinear measure of dependence between random variables in the context of remote sensing data analysis. The Hilbert-Schmidt Independence Criterion (HSIC) is a kernel method for evaluating statistical dependence. HSIC is based on computing the Hilbert-Schmidt norm of the cross-covariance operator of mapped samples in the corresponding Hilbert spaces. The HSIC empirical estimator is very easy to compute and has good theoretical and practical properties. We exploit the capabilities of HSIC to explain nonlinear dependences in two remote sensing problems: temperature estimation and chlorophyll concentration prediction from spectra. Results show that, when the relationship between random variables is nonlinear or when few data are available, the HSIC criterion outperforms other standard methods, such as the linear correlation or mutual information.
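To make the empirical estimator concrete, the sketch below uses one common biased form, HSIC = trace(K H L H) / (n - 1)^2, where K and L are kernel matrices of the two variables and H is the centering matrix. The Gaussian (RBF) kernel, the bandwidths and the toy data are assumptions for illustration; the abstract does not state which kernels or estimator variant the authors used.

```python
import math


def rbf_gram(values, sigma):
    """Gaussian (RBF) kernel matrix for a list of scalars."""
    return [
        [math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in values]
        for a in values
    ]


def center(gram):
    """Double-center a symmetric kernel matrix, i.e. compute H K H."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(x, y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(x)
    k_centered = center(rbf_gram(x, sigma_x))
    l_gram = rbf_gram(y, sigma_y)
    # trace(K H L H) = sum_ij (H K H)_ij * L_ij because L is symmetric.
    trace = sum(k_centered[i][j] * l_gram[i][j] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


def pearson(x, y):
    """Linear (Pearson) correlation coefficient, for comparison."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy data: y depends on x purely nonlinearly (y = x^2).
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v * v for v in x]

print(pearson(x, y))      # linear correlation misses the dependence
print(hsic(x, y))         # HSIC is clearly positive
print(hsic(x, [1.0] * 5)) # constant y: no dependence, HSIC is about zero
```

On this symmetric toy example the linear correlation vanishes while HSIC stays positive, which is the kind of nonlinear dependence the abstract refers to.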





Recommendations for statin use for primary prevention of coronary heart disease (CHD) are based on estimation of the 10-year CHD risk. It is unclear which risk algorithm and guidelines should be used in European populations. Using data from a population-based study in Switzerland, we first assessed 10-year CHD risk and eligibility for statins in 5,683 women and men 35 to 75 years of age without cardiovascular disease by comparing recommendations by the European Society of Cardiology without and with extrapolation of risk to age 60 years, the International Atherosclerosis Society, and the US Adult Treatment Panel III. The proportions of participants classified as high-risk for CHD were 12.5% (15.4% with extrapolation), 3.0%, and 5.8%, respectively. Proportions of participants eligible for statins were 9.2% (11.6% with extrapolation), 13.7%, and 16.7%, respectively. Assuming full compliance to each guideline, expected relative decreases in CHD deaths in Switzerland over a 10-year period would be 16.4% (17.5% with extrapolation), 18.7%, and 19.3%, respectively; the corresponding numbers needed to treat to prevent 1 CHD death would be 285 (340 with extrapolation), 380, and 440, respectively. In conclusion, the proportion of subjects classified as high risk for CHD varied over a fivefold range across recommendations. Following the International Atherosclerosis Society and the Adult Treatment Panel III recommendations might prevent more CHD deaths at the cost of higher numbers needed to treat compared with European Society of Cardiology guidelines.





The temporal dynamics of species diversity are shaped by variations in the rates of speciation and extinction, and there is a long history of inferring these rates using first and last appearances of taxa in the fossil record. Understanding diversity dynamics critically depends on unbiased estimates of the unobserved times of speciation and extinction for all lineages, but the inference of these parameters is challenging due to the complex nature of the available data. Here, we present a new probabilistic framework to jointly estimate species-specific times of speciation and extinction and the rates of the underlying birth-death process based on the fossil record. The rates are allowed to vary through time independently of each other, and the probability of preservation and sampling is explicitly incorporated in the model to estimate the true lifespan of each lineage. We implement a Bayesian algorithm to assess the presence of rate shifts by exploring alternative diversification models. Tests on a range of simulated data sets reveal the accuracy and robustness of our approach against violations of the underlying assumptions and various degrees of data incompleteness. Finally, we demonstrate the application of our method with the diversification of the mammal family Rhinocerotidae and reveal a complex history of repeated and independent temporal shifts of both speciation and extinction rates, leading to the expansion and subsequent decline of the group. The estimated parameters of the birth-death process implemented here are directly comparable with those obtained from dated molecular phylogenies. Thus, our model represents a step towards integrating phylogenetic and fossil information to infer macroevolutionary processes.





Summary: Estimating nitrate nitrogen in the soil with a simulation model