924 results for sampling methods
Abstract:
In order to evaluate the influence of ambient aerosol particles on cloud formation, climate and human health, detailed information about the concentration and composition of ambient aerosol particles is needed. The durations of aerosol formation, growth and removal processes in the atmosphere range from minutes to hours, which highlights the need for high-time-resolution data in order to understand the underlying processes. This thesis focuses on the characterization of ambient levels, size distributions and sources of water-soluble organic carbon (WSOC) in ambient aerosols. The results show that at the location of this study typically 50-60 % of organic carbon in fine particles is water-soluble. The amount of WSOC was observed to increase as aerosols age, likely due to further oxidation of organic compounds. In the boreal region the main sources of WSOC were biomass burning during the winter and secondary aerosol formation during the summer. WSOC was mainly attributed to a fine particle mode between 0.1 and 1 μm, although different size distributions were measured for different sources. The WSOC concentrations and size distributions had a clear seasonal variation. Another main focus of this thesis was to test and further develop high-time-resolution methods for the chemical characterization of ambient aerosol particles. The concentrations of the main chemical components (ions, OC, EC) of ambient aerosol particles were measured online during a year-long intensive measurement campaign conducted at the SMEAR III station in Southern Finland. The results were compared to those of traditional filter collections in order to study the sampling artifacts and limitations of each method. To achieve a better time resolution for the WSOC and ion measurements, a particle-into-liquid sampler (PILS) was coupled with a total organic carbon analyzer (TOC) and two ion chromatographs (IC). The PILS-TOC-IC provided important data on diurnal variations and short-lived plumes, which cannot be resolved from filter samples. In summary, the measurements made for this thesis provide new information on the concentrations, size distributions and sources of WSOC in ambient aerosol particles in the boreal region. The analytical and collection methods needed for the online characterization of aerosol chemical composition were further developed in order to provide more reliable high-time-resolution measurements.
Abstract:
Sampling-based planners have been successful in path planning for robots with many degrees of freedom, but remain ineffective when the configuration space has a narrow passage. We present a new technique based on a random walk strategy to generate samples in narrow regions quickly, thus improving the efficiency of Probabilistic Roadmap Planners. The algorithm substantially reduces the number of collision checks and thereby decreases computational time. The method is powerful even for cases where the structure of the narrow passage is not known, thus giving a significant improvement over other known methods.
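The abstract does not spell out the sampling routine, so the following is only a minimal sketch of the general idea behind random-walk samplers for narrow passages: small random steps are taken from a collision-free seed, and the step size shrinks whenever a step collides, so samples accumulate inside the corridor. The in_collision predicate and all parameters are illustrative stand-ins, not the paper's algorithm.

import numpy as np

def random_walk_samples(seed_config, in_collision, n_steps=50, step_scale=0.05, rng=None):
    """Sketch of a random-walk sampler for narrow passages: starting from a
    collision-free seed configuration, take small random steps; when a step
    collides, reject it and shrink the step size, so the walk tends to stay
    inside and explore narrow corridors of the configuration space."""
    rng = np.random.default_rng() if rng is None else rng
    q = np.asarray(seed_config, dtype=float)
    samples = [q.copy()]
    scale = step_scale
    for _ in range(n_steps):
        q_new = q + rng.normal(scale=scale, size=q.shape)
        if in_collision(q_new):
            scale *= 0.5          # shrink steps after bumping into an obstacle
        else:
            q = q_new
            samples.append(q.copy())
            scale = step_scale    # reset after a successful move
    return samples

# Usage with a toy 2D corridor of half-width 0.05 around the x-axis:
if __name__ == "__main__":
    in_collision = lambda q: abs(q[1]) > 0.05     # hypothetical collision checker
    pts = random_walk_samples([0.0, 0.0], in_collision)
    print(len(pts), "collision-free samples generated inside the corridor")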
Abstract:
Many downscaling techniques have been developed in the past few years for projection of station-scale hydrological variables from large-scale atmospheric variables simulated by general circulation models (GCMs) to assess the hydrological impacts of climate change. This article compares the performance of three downscaling methods, viz. conditional random field (CRF), K-nearest neighbour (KNN) and support vector machine (SVM) methods, in downscaling precipitation in the Punjab region of India, belonging to the monsoon regime. The CRF model is a recently developed method for downscaling hydrological variables in a probabilistic framework, while the SVM model is a popular machine learning tool useful for its ability to generalize and capture nonlinear relationships between predictors and predictand. The KNN model is an analogue-type method that queries days similar to a given feature vector from the training data and classifies future days by random sampling from a weighted set of the K closest training examples. The models are applied for downscaling monsoon (June to September) daily precipitation at six locations in Punjab. Model performance with respect to reproduction of various statistics such as dry and wet spell length distributions, daily rainfall distribution, and intersite correlations is examined. It is found that the CRF and KNN models perform slightly better than the SVM model in reproducing most daily rainfall statistics. These models are then used to project future precipitation at the six locations. Output from the Canadian global climate model (CGCM3) for three scenarios, viz. A1B, A2, and B1, is used for projection of future precipitation. The projections show a change in probability density functions of daily rainfall amount and changes in the wet and dry spell distributions of daily precipitation. Copyright (C) 2011 John Wiley & Sons, Ltd.
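The KNN analogue step described above lends itself to a short sketch. The following is a hedged illustration, not the authors' exact implementation: the K training days closest to a given large-scale predictor vector are found, and one of their observed station rainfalls is resampled with rank-based weights (a common choice); the variable names and synthetic data are made up.

import numpy as np

def knn_analog_downscale(predictor, train_predictors, train_rainfall, k=10, rng=None):
    """Sketch of the K-nearest-neighbour analogue step: find the k training days
    whose large-scale predictor vectors are closest to the given day, then
    resample one of their observed station rainfalls with weights that decrease
    with the neighbour's rank (w_j proportional to 1/j is a common choice)."""
    rng = np.random.default_rng() if rng is None else rng
    dists = np.linalg.norm(train_predictors - predictor, axis=1)
    nearest = np.argsort(dists)[:k]                  # indices of the k closest days
    ranks = np.arange(1, k + 1)
    weights = (1.0 / ranks) / np.sum(1.0 / ranks)    # rank-based resampling weights
    return train_rainfall[rng.choice(nearest, p=weights)]

# Usage with synthetic data: 5 predictors, 1000 training days, 6 stations.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(1000, 5))
    y_train = rng.gamma(shape=0.8, scale=5.0, size=(1000, 6))   # daily rainfall at 6 sites
    x_future = rng.normal(size=5)
    print(knn_analog_downscale(x_future, X_train, y_train, k=10, rng=rng))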
Abstract:
Compressive Sampling Matching Pursuit (CoSaMP) is one of the popular greedy methods in the emerging field of Compressed Sensing (CS). In addition to its appealing empirical performance, CoSaMP also has splendid theoretical guarantees for convergence. In this paper, we propose a modification of CoSaMP that adaptively chooses the dimension of the search space in each iteration, using a threshold-based approach. Using Monte Carlo simulations, we show that this modification improves the reconstruction capability of the CoSaMP algorithm in both clean and noisy measurement cases. From empirical observations, we also propose an optimum value of the threshold to use in applications.
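The abstract does not state the exact threshold rule, so the sketch below only illustrates the general idea: standard CoSaMP is kept, but the fixed 2s-term selection from the signal proxy is replaced by keeping all proxy entries above a fraction tau of the largest one, so the search-space dimension adapts from iteration to iteration. The value of tau and the stopping rule are illustrative assumptions.

import numpy as np

def cosamp_adaptive(A, y, s, tau=0.3, n_iter=30):
    """Sketch of CoSaMP with a threshold-based search space: instead of always
    adding the 2s largest entries of the proxy A^T r to the support, keep the
    proxy entries that exceed a fraction tau of the largest proxy magnitude,
    so the search-space dimension adapts to the residual at each iteration."""
    n = A.shape[1]
    x = np.zeros(n)
    r = y.copy()
    for _ in range(n_iter):
        proxy = np.abs(A.T @ r)
        candidates = np.flatnonzero(proxy >= tau * proxy.max())   # adaptive selection
        support = np.union1d(candidates, np.flatnonzero(x))
        sol, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)   # least squares on merged support
        b = np.zeros(n)
        b[support] = sol
        keep = np.argsort(np.abs(b))[-s:]                          # prune to the s largest
        x = np.zeros(n)
        x[keep] = b[keep]
        r = y - A @ x
        if np.linalg.norm(r) < 1e-10:
            break
    return x

# Usage on a small synthetic problem (60 measurements, 200 unknowns, sparsity 8):
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.normal(size=(60, 200)) / np.sqrt(60)
    x_true = np.zeros(200)
    x_true[rng.choice(200, 8, replace=False)] = rng.normal(size=8)
    x_hat = cosamp_adaptive(A, A @ x_true, s=8)
    print("recovery error:", np.linalg.norm(x_hat - x_true))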
Abstract:
Monte Carlo simulation methods involving splitting of Markov chains have been used to evaluate multi-fold integrals in different application areas. In this paper we examine the performance of these methods in the context of evaluating reliability integrals, from the point of view of characterizing the sampling fluctuations. The methods discussed include the Au-Beck subset simulation, the Holmes-Diaconis-Ross method, and the generalized splitting algorithm. A few improvisations based on the first-order reliability method are suggested to select the algorithmic parameters of the latter two methods. The bias and sampling variance of the alternative estimators are discussed. Also, an approximation to the sampling distribution of some of these estimators is obtained. Illustrative examples involving component and series system reliability analyses are presented with a view to bringing out the relative merits of the alternative methods. (C) 2015 Elsevier Ltd. All rights reserved.
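As a point of reference for the splitting methods mentioned, here is a compact, simplified sketch of subset simulation for a reliability integral P(g(X) <= 0) with standard-normal inputs; the random-walk Metropolis move, the level probability p0 and all parameters are illustrative choices rather than the paper's improvisations.

import numpy as np
from math import erfc, sqrt

def subset_simulation(g, dim, n_per_level=1000, p0=0.1, max_levels=10, rng=None):
    """Compact sketch of subset simulation for P_f = P(g(X) <= 0) with X standard
    normal: each level keeps the p0-fraction of samples with the smallest g as
    seeds and grows Markov chains from them with a simple random-walk Metropolis
    move restricted to the current intermediate event."""
    rng = np.random.default_rng() if rng is None else rng
    n_seeds = int(p0 * n_per_level)
    chain_len = int(1.0 / p0)
    x = rng.standard_normal((n_per_level, dim))
    gx = np.array([g(xi) for xi in x])
    p_f = 1.0
    for _ in range(max_levels):
        order = np.argsort(gx)
        threshold = gx[order[n_seeds - 1]]              # p0-quantile of the current level
        if threshold <= 0.0:
            break                                       # failure region reached
        p_f *= p0
        seeds, seed_g = x[order[:n_seeds]], gx[order[:n_seeds]]
        x, gx = [], []
        for xi, gi in zip(seeds, seed_g):               # one short chain per seed
            for _ in range(chain_len):
                cand = xi + 0.8 * rng.standard_normal(dim)
                log_accept = 0.5 * (xi @ xi - cand @ cand)   # standard-normal Metropolis ratio
                if np.log(rng.uniform()) < log_accept:
                    g_cand = g(cand)
                    if g_cand <= threshold:             # stay inside the conditional event
                        xi, gi = cand, g_cand
                x.append(xi.copy())
                gx.append(gi)
        x, gx = np.array(x), np.array(gx)
    return p_f * np.mean(gx <= 0.0)

# Usage: linear limit state g(x) = beta - x1, whose exact failure probability is Phi(-beta).
if __name__ == "__main__":
    beta = 3.5
    est = subset_simulation(lambda x: beta - x[0], dim=2, rng=np.random.default_rng(0))
    print(f"subset simulation estimate {est:.2e}  vs exact {0.5 * erfc(beta / sqrt(2.0)):.2e}")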
Abstract:
Standard approaches to ellipse fitting are based on the minimization of an algebraic or geometric distance between the given data and a template ellipse. When the data are noisy and come from a partial ellipse, the state-of-the-art methods tend to produce biased ellipses. We rely on the sampling structure of the underlying signal and show that the x- and y-coordinate functions of an ellipse are finite-rate-of-innovation (FRI) signals, and that their parameters are estimable from partial data. We consider both uniform and nonuniform sampling scenarios in the presence of noise and show that the data can be modeled as a sum of random amplitude-modulated complex exponentials. A low-pass filter is used to suppress noise and approximate the data as a sum of weighted complex exponentials. The annihilating filter used in FRI approaches is applied to estimate the sampling interval in closed form. We perform experiments on simulated and real data, and assess both objective and subjective performance in comparison with the state-of-the-art ellipse fitting methods. The proposed method produces ellipses with less bias. Furthermore, the mean-squared error is lower by about 2 to 10 dB. We show applications of ellipse fitting to iris images, starting from partial edge contours, and to free-hand ellipses drawn on a touch-screen tablet.
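The paper's closed-form estimate of the sampling interval is not reproduced here; the sketch below only illustrates the annihilating-filter building block that FRI methods rely on: recovering the frequencies of a sum of complex exponentials from uniform samples by finding the null vector of a convolution matrix and taking the angles of its roots.

import numpy as np

def annihilating_filter_freqs(samples, K):
    """Generic annihilating-filter (Prony-type) sketch: recover the K frequencies
    of x[n] = sum_k a_k exp(1j*w_k*n) from uniform samples. A filter h of length
    K+1 annihilates x (its convolution with x is zero), so h is the null vector
    of a convolution matrix built from the samples, and the roots of h are the
    values exp(1j*w_k)."""
    x = np.asarray(samples, dtype=complex)
    n = len(x)
    rows = np.array([x[m - np.arange(K + 1)] for m in range(K, n)])   # [x[m], x[m-1], ..., x[m-K]]
    _, _, Vh = np.linalg.svd(rows)
    h = Vh[-1].conj()                        # right singular vector of the smallest singular value
    return np.sort(np.angle(np.roots(h)))    # angles of the filter roots are the frequencies

# Usage: two complex exponentials, noiseless uniform samples.
if __name__ == "__main__":
    w_true = np.array([0.4, 1.3])
    n = np.arange(32)
    x = 2.0 * np.exp(1j * w_true[0] * n) + 0.7 * np.exp(1j * w_true[1] * n)
    print("estimated:", annihilating_filter_freqs(x, K=2), " true:", np.sort(w_true))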
Abstract:
Restricted Boltzmann Machines (RBMs) can be used either as classifiers or as generative models. The quality of a generative RBM is measured through the average log-likelihood on test data. Due to the high computational complexity of evaluating the partition function, exact calculation of the test log-likelihood is very difficult. In recent years several estimation methods have been suggested for approximate computation of the test log-likelihood. In this paper we present an empirical comparison of the main estimation methods, namely the AIS algorithm for estimating the partition function, the CSL method for directly estimating the log-likelihood, and the RAISE algorithm that combines these two ideas.
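Of the three estimators compared, AIS is the easiest to sketch; the following is a minimal, illustrative implementation of AIS for a binary RBM's log partition function (annealing from a uniform base model with one Gibbs sweep per temperature), together with the resulting average test log-likelihood. The RBM sizes, schedule and data are made up, and the CSL and RAISE estimators are not shown.

import numpy as np

def softplus(x):
    return np.logaddexp(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ais_log_partition(W, b, c, n_runs=100, n_betas=1000, rng=None):
    """Sketch of Annealed Importance Sampling (AIS) for an RBM's log partition
    function, annealing from a uniform base model (zero weights and biases) to
    the target RBM with weights W (nv x nh), visible bias b and hidden bias c."""
    rng = np.random.default_rng() if rng is None else rng
    nv, nh = W.shape
    betas = np.linspace(0.0, 1.0, n_betas + 1)

    def log_pstar(v, beta):
        # log of the unnormalized marginal p*_beta(v), with hidden units summed out
        return beta * (v @ b) + softplus(beta * (c + v @ W)).sum(axis=-1)

    v = (rng.uniform(size=(n_runs, nv)) < 0.5).astype(float)    # samples from the base model
    log_w = np.zeros(n_runs)
    for b_prev, b_next in zip(betas[:-1], betas[1:]):
        log_w += log_pstar(v, b_next) - log_pstar(v, b_prev)
        # One Gibbs sweep leaving the intermediate distribution at b_next invariant
        h = (rng.uniform(size=(n_runs, nh)) < sigmoid(b_next * (c + v @ W))).astype(float)
        v = (rng.uniform(size=(n_runs, nv)) < sigmoid(b_next * (b + h @ W.T))).astype(float)
    log_z0 = (nv + nh) * np.log(2.0)                             # partition of the base model
    m = log_w.max()                                              # log-mean-exp of the weights
    return log_z0 + m + np.log(np.mean(np.exp(log_w - m)))

def avg_test_log_likelihood(V_test, W, b, c, log_z):
    """Average log p(v) over test vectors, given an estimate of log Z."""
    free = V_test @ b + softplus(c + V_test @ W).sum(axis=1)     # = log p*(v)
    return float(np.mean(free) - log_z)

# Usage on a tiny random RBM (small enough that the estimate could be checked
# against brute-force enumeration of all visible configurations):
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    nv, nh = 8, 4
    W, b, c = 0.3 * rng.normal(size=(nv, nh)), np.zeros(nv), np.zeros(nh)
    log_z = ais_log_partition(W, b, c, rng=rng)
    V_test = (rng.uniform(size=(20, nv)) < 0.5).astype(float)
    print("estimated log Z:", log_z)
    print("avg test log-likelihood:", avg_test_log_likelihood(V_test, W, b, c, log_z))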
Abstract:
We present methods for fixed-lag smoothing using Sequential Importance Sampling (SIS) on a discrete nonlinear, non-Gaussian state-space system with unknown parameters. Our particular application is in the field of digital communication systems. Each input data point is taken from a finite set of symbols. We represent the transmission medium as a fixed filter with a finite impulse response (FIR), hence a discrete state-space system is formed. Conventional Markov chain Monte Carlo (MCMC) techniques such as the Gibbs sampler are unsuitable for this task because they can only process data in batches. Data arrive sequentially, so it is sensible to process them in this way. In addition, many communication systems are interactive, so there is a maximum level of latency that can be tolerated before a symbol is decoded. We demonstrate this method by simulation and compare its performance to existing techniques.
An overview of sequential Monte Carlo methods for parameter estimation in general state-space models
Abstract:
Nonlinear non-Gaussian state-space models arise in numerous applications in control and signal processing. Sequential Monte Carlo (SMC) methods, also known as Particle Filters, are numerical techniques based on Importance Sampling for solving the optimal state estimation problem. The task of calibrating the state-space model is an important problem frequently faced by practitioners, and the observed data may be used to estimate the parameters of the model. The aim of this paper is to present a comprehensive overview of SMC methods that have been proposed for this task, accompanied by a discussion of their advantages and limitations.
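As background for the methods surveyed, the following is a minimal bootstrap particle filter (sequential importance resampling) for state estimation in a generic scalar state-space model; the parameter estimation schemes the paper surveys build on filters of this kind. The toy nonlinear growth model in the usage example is illustrative and not taken from the paper.

import numpy as np

def bootstrap_particle_filter(y, f, g_loglik, sample_x0, n_particles=500, rng=None):
    """Minimal bootstrap particle filter (sequential importance resampling).

    f(x, rng)        : samples x_t given x_{t-1} (state transition)
    g_loglik(y, x)   : log-likelihood of observation y given state x
    sample_x0(n, rng): samples n particles from the prior on x_0
    Returns the filtering means E[x_t | y_{1:t}] for each t."""
    rng = np.random.default_rng() if rng is None else rng
    x = sample_x0(n_particles, rng)
    means = []
    for y_t in y:
        x = f(x, rng)                                  # propagate through the dynamics
        logw = g_loglik(y_t, x)                        # weight by the likelihood
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means.append(np.sum(w * x))
        idx = rng.choice(n_particles, size=n_particles, p=w)
        x = x[idx]                                     # multinomial resampling
    return np.array(means)

# Usage on a toy nonlinear growth model (illustrative, not from the paper):
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, sig_v, sig_w = 50, 1.0, 1.0
    x_true, y = np.zeros(T), np.zeros(T)
    for t in range(T):
        prev = x_true[t - 1] if t > 0 else 0.0
        x_true[t] = 0.5 * prev + 25 * prev / (1 + prev**2) + rng.normal(0, sig_v)
        y[t] = x_true[t]**2 / 20 + rng.normal(0, sig_w)
    f = lambda x, rng: 0.5 * x + 25 * x / (1 + x**2) + rng.normal(0, sig_v, size=x.shape)
    g = lambda y_t, x: -0.5 * ((y_t - x**2 / 20) / sig_w) ** 2
    x0 = lambda n, rng: rng.normal(0, 2, size=n)
    est = bootstrap_particle_filter(y, f, g, x0, rng=rng)
    print("RMSE:", np.sqrt(np.mean((est - x_true) ** 2)))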
Abstract:
Methods for generating a new population are a fundamental component of estimation of distribution algorithms (EDAs). They serve to transfer the information contained in the probabilistic model to the newly generated population. In EDAs based on Markov networks, methods for generating new populations usually discard information contained in the model to gain efficiency. Other methods, like Gibbs sampling, use information about all interactions in the model but are computationally very costly. In this paper we propose new methods for generating new solutions in EDAs based on Markov networks. We introduce approaches based on inference methods for computing the most probable configurations and on model-based template recombination. We show that the application of different variants of inference methods can increase the EDAs' convergence rate and reduce the number of function evaluations needed to find the optimum of binary and non-binary discrete functions.
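The paper's methods target general Markov networks; as a hedged, much simpler illustration of the two ingredients mentioned (sampling from the model versus using its most probable configuration), the sketch below fits a chain-structured model to a selected population, generates new solutions by ancestral sampling, and computes the most probable configuration by max-product dynamic programming. The OneMax selection step and all sizes are illustrative.

import numpy as np

def fit_chain_model(pop, eps=1e-3):
    """Fit a chain model p(x) = p(x_1) * prod_i p(x_{i+1} | x_i) to a binary
    population (rows are selected solutions)."""
    n = pop.shape[1]
    p1 = float(np.clip(pop[:, 0].mean(), eps, 1 - eps))
    trans = np.zeros((n - 1, 2, 2))
    for i in range(n - 1):
        for a in (0, 1):
            rows = pop[pop[:, i] == a]
            p = rows[:, i + 1].mean() if len(rows) else 0.5
            p = np.clip(p, eps, 1 - eps)
            trans[i, a] = [1 - p, p]        # p(x_{i+1} = 0 or 1 | x_i = a)
    return p1, trans

def most_probable_configuration(p1, trans):
    """Max-product (Viterbi-style) dynamic programming over the chain."""
    n = trans.shape[0] + 1
    logp = np.log([1 - p1, p1])
    back = np.zeros((n - 1, 2), dtype=int)
    for i in range(n - 1):
        scores = logp[:, None] + np.log(trans[i])
        back[i] = scores.argmax(axis=0)
        logp = scores.max(axis=0)
    x = np.zeros(n, dtype=int)
    x[-1] = logp.argmax()
    for i in range(n - 2, -1, -1):
        x[i] = back[i, x[i + 1]]
    return x

def sample_chain(p1, trans, n_samples, rng):
    """Ancestral sampling of new solutions from the chain model."""
    n = trans.shape[0] + 1
    out = np.zeros((n_samples, n), dtype=int)
    out[:, 0] = rng.uniform(size=n_samples) < p1
    for i in range(n - 1):
        out[:, i + 1] = rng.uniform(size=n_samples) < trans[i, out[:, i], 1]
    return out

# Usage: select the best half of a random population on OneMax, refit the model,
# and seed the next generation with the model's most probable configuration.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pop = (rng.uniform(size=(200, 20)) < 0.5).astype(int)
    selected = pop[np.argsort(pop.sum(axis=1))[-100:]]
    p1, trans = fit_chain_model(selected)
    new_pop = np.vstack([most_probable_configuration(p1, trans),
                         sample_chain(p1, trans, 199, rng)])
    print("best fitness in new population:", new_pop.sum(axis=1).max())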
Abstract:
Introduction: The National Oceanic and Atmospheric Administration’s Biogeography Branch has conducted surveys of reef fish in the Caribbean since 1999. Surveys were initially undertaken to identify essential fish habitat, but later were used to characterize and monitor reef fish populations and benthic communities over time. The Branch’s goals are to develop knowledge and products on the distribution and ecology of living marine resources and provide resource managers, scientists and the public with an improved ecosystem basis for making decisions. The Biogeography Branch monitors reef fishes and benthic communities in three study areas: (1) St. John, USVI, (2) Buck Island, St. Croix, USVI, and (3) La Parguera, Puerto Rico. In addition, the Branch has characterized the reef fish and benthic communities in the Flower Garden Banks National Marine Sanctuary, Gray’s Reef National Marine Sanctuary and around the island of Vieques, Puerto Rico. Reef fish data are collected using a stratified random sampling design and stringent measurement protocols. Over time, the sampling design has changed in order to meet different management objectives (i.e. identification of essential fish habitat vs. monitoring), but the designs have always remained:
• Probabilistic – to allow inferences to a larger targeted population,
• Objective – to satisfy management objectives, and
• Stratified – to reduce sampling costs and obtain population estimates for strata.
There are two aspects of the sampling design which are now under consideration and are the focus of this report: first, the application of a sample frame, identified as a set of points or grid elements from which a sample is selected; and second, the application of subsampling in a two-stage sampling design. To evaluate these considerations, the pros and cons of implementing a sampling frame and subsampling are discussed. Particular attention is paid to the impacts of each design on accuracy (bias), feasibility and sampling cost (precision). Further, this report presents an analysis of data to determine the optimal number of subsamples to collect if subsampling were used. (PDF contains 19 pages)
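For context, the following is a minimal sketch of the standard stratified estimator that underlies such designs: the overall mean density is the area-weighted mean of the stratum means, with a standard error built from the within-stratum sample variances. Finite-population corrections and the second sampling stage are ignored, and the counts and strata are made up.

import numpy as np

def stratified_estimate(counts_by_stratum, stratum_areas):
    """Sketch of the basic stratified estimator: the overall mean density is the
    area-weighted mean of the stratum means, and its variance combines the
    within-stratum sample variances (no finite-population correction).

    counts_by_stratum : list of 1-D arrays, fish counts per survey unit in each stratum
    stratum_areas     : array of stratum sizes (number of grid elements or area)"""
    areas = np.asarray(stratum_areas, dtype=float)
    weights = areas / areas.sum()
    means = np.array([np.mean(c) for c in counts_by_stratum])
    variances = np.array([np.var(c, ddof=1) / len(c) for c in counts_by_stratum])
    overall_mean = np.sum(weights * means)
    se = np.sqrt(np.sum(weights**2 * variances))
    return overall_mean, se

# Usage with made-up counts from three habitat strata:
if __name__ == "__main__":
    strata = [np.array([12, 8, 15, 10]), np.array([3, 5, 2, 4, 6]), np.array([20, 25, 18])]
    mean, se = stratified_estimate(strata, stratum_areas=[100, 250, 50])
    print(f"mean density {mean:.1f} +/- {1.96 * se:.1f} (approx. 95% CI)")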
Abstract:
The Mallows and Generalized Mallows models are compact yet powerful and natural ways of representing a probability distribution over the space of permutations. In this paper we deal with the problems of sampling and learning (estimating) such distributions when the metric on permutations is the Cayley distance. We propose new methods for both operations, whose performance is shown through several experiments. We also introduce novel procedures to count and randomly generate permutations at a given Cayley distance both with and without certain structural restrictions. An application in the field of biology is given to motivate the interest of this model.
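One of the counting procedures mentioned has a well-known unrestricted special case that can be sketched directly: the Cayley distance of a permutation from the identity equals n minus its number of cycles, so the number of permutations at distance d is the unsigned Stirling number of the first kind s(n, n - d). The sketch below computes these counts; the restricted variants and the Mallows sampling and learning procedures themselves are not shown.

def stirling_first_unsigned(n):
    """Table of unsigned Stirling numbers of the first kind s[m][k]: the number of
    permutations of m items with exactly k cycles, via the recurrence
    s(m, k) = s(m-1, k-1) + (m-1) * s(m-1, k)."""
    s = [[0] * (n + 1) for _ in range(n + 1)]
    s[0][0] = 1
    for m in range(1, n + 1):
        for k in range(1, m + 1):
            s[m][k] = s[m - 1][k - 1] + (m - 1) * s[m - 1][k]
    return s

def count_at_cayley_distance(n, d):
    """Number of permutations of n items at Cayley distance d from the identity:
    Cayley distance (minimum number of transpositions) equals n minus the number
    of cycles, so this is the Stirling number s(n, n - d)."""
    return stirling_first_unsigned(n)[n][n - d]

# Usage: distance distribution for n = 5 (the counts sum to 5! = 120).
if __name__ == "__main__":
    n = 5
    counts = [count_at_cayley_distance(n, d) for d in range(n)]
    print(counts, "sum =", sum(counts))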
Abstract:
Atlantic menhaden, Brevoortia tyrannus, the object of a major purse-seine fishery along the U.S. east coast, are landed at plants from northern Florida to central Maine. The National Marine Fisheries Service has sampled these landings since 1955 for length, weight, and age. Together with records of landings at each plant, the samples are used to estimate numbers of fish landed at each age. This report analyzes the sampling design in terms of probability sampling theory. The design is classified as two-stage cluster sampling, the first stage consisting of purse-seine sets randomly selected from the population of all sets landed, and the second stage consisting of fish randomly selected from each sampled set. Implicit assumptions of this design are discussed with special attention to current sampling procedures. Methods are developed for estimating mean fish weight, numbers of fish landed, and age composition of the catch, with approximate 95% confidence intervals. Based on specific results from three ports (Port Monmouth, N.J., Reedville, Va., and Beaufort, N.C.) for the 1979 fishing season, recommendations are made for improving sampling procedures to comply more exactly with assumptions of the sampling design. These recommendations include adopting more formal methods for randomizing set and fish selection, increasing the number of sets sampled, considering the bias introduced by unequal set sizes, and developing methods to optimize the use of funds and personnel. (PDF file contains 22 pages.)
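A heavily simplified sketch of the two-stage estimates described above (sets as first-stage clusters, fish within sets as the second stage) is given below; it assumes equal set sizes and bases the confidence interval only on between-set variability, whereas the report's estimators also handle unequal set sizes and within-set variance. All numbers are made up.

import numpy as np

def two_stage_mean_weight(set_samples, total_landed_weight_kg):
    """Simplified two-stage cluster estimate: purse-seine sets are the first-stage
    clusters and fish within each set the second stage.

    set_samples           : list of 1-D arrays of individual fish weights (kg), one per sampled set
    total_landed_weight_kg: total weight landed, taken from plant records

    Returns the estimated mean fish weight with an approximate 95% CI based on
    between-set variability, plus the implied number of fish landed."""
    set_means = np.array([np.mean(s) for s in set_samples])
    n_sets = len(set_means)
    mean_w = set_means.mean()
    se = set_means.std(ddof=1) / np.sqrt(n_sets)
    ci = (mean_w - 1.96 * se, mean_w + 1.96 * se)
    est_numbers = total_landed_weight_kg / mean_w       # estimated number of fish landed
    return mean_w, ci, est_numbers

# Usage with made-up samples of 10 fish from each of 4 sets:
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sets = [rng.normal(0.25, 0.05, size=10) for _ in range(4)]   # fish weights in kg
    mean_w, ci, n_fish = two_stage_mean_weight(sets, total_landed_weight_kg=5.0e6)
    print(f"mean weight {mean_w:.3f} kg, 95% CI {ci}, ~{n_fish:.3e} fish landed")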
Abstract:
A central objective in signal processing is to infer meaningful information from a set of measurements or data. While most signal models have an overdetermined structure (fewer unknowns than equations), traditionally very few statistical estimation problems have considered a data model which is underdetermined (more unknowns than equations). However, in recent times, an explosion of theoretical and computational methods has been developed primarily to study underdetermined systems by imposing sparsity on the unknown variables. This is motivated by the observation that in spite of the huge volume of data that arises in sensor networks, genomics, imaging, particle physics, web search, etc., the information content is often much smaller than the number of raw measurements. This has given rise to the possibility of reducing the number of measurements by downsampling the data, which automatically gives rise to underdetermined systems.
In this thesis, we provide new directions for estimation in an underdetermined system, both for a class of parameter estimation problems and for the problem of sparse recovery in compressive sensing. There are two main contributions of the thesis: the design of new sampling and statistical estimation algorithms for array processing, and the development of improved guarantees for sparse reconstruction by introducing a statistical framework to the recovery problem.
We consider underdetermined observation models in array processing where the number of unknown sources simultaneously received by the array can be considerably larger than the number of physical sensors. We study new sparse spatial sampling schemes (array geometries) as well as propose new recovery algorithms that can exploit priors on the unknown signals and unambiguously identify all the sources. The proposed sampling structure is generic enough to be extended to multiple dimensions as well as to exploit different kinds of priors in the model such as correlation, higher order moments, etc.
Recognizing the role of correlation priors and suitable sampling schemes for underdetermined estimation in array processing, we introduce a correlation aware framework for recovering sparse support in compressive sensing. We show that it is possible to strictly increase the size of the recoverable sparse support using this framework provided the measurement matrix is suitably designed. The proposed nested and coprime arrays are shown to be appropriate candidates in this regard. We also provide new guarantees for convex and greedy formulations of the support recovery problem and demonstrate that it is possible to strictly improve upon existing guarantees.
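The difference-coarray idea behind the nested and coprime arrays mentioned above can be illustrated with a short sketch: a coprime array built from two uniform subarrays with coprime spacings has far fewer physical sensors than the number of unique correlation lags it generates, which is what correlation-aware recovery exploits. The construction below follows the standard coprime geometry and is not claimed to reproduce the thesis's exact arrays.

import numpy as np

def coprime_array(M, N):
    """Sensor positions (in units of the unit inter-element spacing) of a coprime
    array built from two uniform subarrays: N sensors with spacing M and 2M
    sensors with spacing N, following the standard coprime geometry."""
    return np.unique(np.concatenate([M * np.arange(N), N * np.arange(2 * M)]))

def difference_coarray(positions):
    """All unique pairwise differences of sensor positions; with correlation
    (second-order) priors these lags act as a much larger virtual array."""
    diffs = positions[:, None] - positions[None, :]
    return np.unique(diffs)

# Usage: with M = 3 and N = 5 the physical array has only a handful of sensors,
# yet its difference coarray contains many more unique lags.
if __name__ == "__main__":
    pos = coprime_array(3, 5)
    lags = difference_coarray(pos)
    print(len(pos), "physical sensors ->", len(lags), "unique coarray lags")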
This new paradigm of underdetermined estimation that explicitly establishes the fundamental interplay between sampling, statistical priors and the underlying sparsity, leads to exciting future research directions in a variety of application areas, and also gives rise to new questions that can lead to stand-alone theoretical results in their own right.
Abstract:
The author explains some aspects of sampling phytoplankton blooms and the evaluation of results obtained from different methods. Qualitative and quantitative sampling are covered, as well as filtration, freeze-drying and toxin separation.