Biblioteca Digital

5 resultados para statistical methods

em Duke University

New Advancements of Scalable Statistical Methods for Learning Latent Structures in Big Data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Constant technology advances have caused data explosion in recent years. Accord- ingly modern statistical and machine learning methods must be adapted to deal with complex and heterogeneous data types. This phenomenon is particularly true for an- alyzing biological data. For example DNA sequence data can be viewed as categorical variables with each nucleotide taking four different categories. The gene expression data, depending on the quantitative technology, could be continuous numbers or counts. With the advancement of high-throughput technology, the abundance of such data becomes unprecedentedly rich. Therefore efficient statistical approaches are crucial in this big data era.

Previous statistical methods for big data often aim to find low dimensional struc- tures in the observed data. For example in a factor analysis model a latent Gaussian distributed multivariate vector is assumed. With this assumption a factor model produces a low rank estimation of the covariance of the observed variables. Another example is the latent Dirichlet allocation model for documents. The mixture pro- portions of topics, represented by a Dirichlet distributed variable, is assumed. This dissertation proposes several novel extensions to the previous statistical methods that are developed to address challenges in big data. Those novel methods are applied in multiple real world applications including construction of condition specific gene co-expression networks, estimating shared topics among newsgroups, analysis of pro- moter sequences, analysis of political-economics risk data and estimating population structure from genotype data.

Veja mais

Using Image Processing Methods to Improve the Detection of Buried Explosive Threats in GPR Data

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Current state of the art techniques for landmine detection in ground penetrating radar (GPR) utilize statistical methods to identify characteristics of a landmine response. This research makes use of 2-D slices of data in which subsurface landmine responses have hyperbolic shapes. Various methods from the field of visual image processing are adapted to the 2-D GPR data, producing superior landmine detection results. This research goes on to develop a physics-based GPR augmentation method motivated by current advances in visual object detection. This GPR specific augmentation is used to mitigate issues caused by insufficient training sets. This work shows that augmentation improves detection performance under training conditions that are normally very difficult. Finally, this work introduces the use of convolutional neural networks as a method to learn feature extraction parameters. These learned convolutional features outperform hand-designed features in GPR detection tasks. This work presents a number of methods, both borrowed from and motivated by the substantial work in visual image processing. The methods developed and presented in this work show an improvement in overall detection performance and introduce a method to improve the robustness of statistical classification.

Veja mais

Learning and Exploiting Camera Geometry for Computer Vision

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This work explores the use of statistical methods in describing and estimating camera poses, as well as the information feedback loop between camera pose and object detection. Surging development in robotics and computer vision has pushed the need for algorithms that infer, understand, and utilize information about the position and orientation of the sensor platforms when observing and/or interacting with their environment.

The first contribution of this thesis is the development of a set of statistical tools for representing and estimating the uncertainty in object poses. A distribution for representing the joint uncertainty over multiple object positions and orientations is described, called the mirrored normal-Bingham distribution. This distribution generalizes both the normal distribution in Euclidean space, and the Bingham distribution on the unit hypersphere. It is shown to inherit many of the convenient properties of these special cases: it is the maximum-entropy distribution with fixed second moment, and there is a generalized Laplace approximation whose result is the mirrored normal-Bingham distribution. This distribution and approximation method are demonstrated by deriving the analytical approximation to the wrapped-normal distribution. Further, it is shown how these tools can be used to represent the uncertainty in the result of a bundle adjustment problem.

Another application of these methods is illustrated as part of a novel camera pose estimation algorithm based on object detections. The autocalibration task is formulated as a bundle adjustment problem using prior distributions over the 3D points to enforce the objects' structure and their relationship with the scene geometry. This framework is very flexible and enables the use of off-the-shelf computational tools to solve specialized autocalibration problems. Its performance is evaluated using a pedestrian detector to provide head and foot location observations, and it proves much faster and potentially more accurate than existing methods.

Finally, the information feedback loop between object detection and camera pose estimation is closed by utilizing camera pose information to improve object detection in scenarios with significant perspective warping. Methods are presented that allow the inverse perspective mapping traditionally applied to images to be applied instead to features computed from those images. For the special case of HOG-like features, which are used by many modern object detection systems, these methods are shown to provide substantial performance benefits over unadapted detectors while achieving real-time frame rates, orders of magnitude faster than comparable image warping methods.

The statistical tools and algorithms presented here are especially promising for mobile cameras, providing the ability to autocalibrate and adapt to the camera pose in real time. In addition, these methods have wide-ranging potential applications in diverse areas of computer vision, robotics, and imaging.

Veja mais

Essays on Energy Economics and Industrial Organization

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The dissertation consists of three chapters related to the low-price guarantee marketing strategy and energy efficiency analysis. The low-price guarantee is a marketing strategy in which firms promise to charge consumers the lowest price among their competitors. Chapter 1 addresses the research question "Does a Low-Price Guarantee Induce Lower Prices'' by looking into the retail gasoline industry in Quebec where there was a major branded firm which started a low-price guarantee back in 1996. Chapter 2 does a consumer welfare analysis of low-price guarantees to drive police indications and offers a new explanation of the firms' incentives to adopt a low-price guarantee. Chapter 3 develops the energy performance indicators (EPIs) to measure energy efficiency of the manufacturing plants in pulp, paper and paperboard industry.

Chapter 1 revisits the traditional view that a low-price guarantee results in higher prices by facilitating collusion. Using accurate market definitions and station-level data from the retail gasoline industry in Quebec, I conducted a descriptive analysis based on stations and price zones to compare the price and sales movement before and after the guarantee was adopted. I find that, contrary to the traditional view, the stores that offered the guarantee significantly decreased their prices and increased their sales. I also build a difference-in-difference model to quantify the decrease in posted price of the stores that offered the guarantee to be 0.7 cents per liter. While this change is significant, I do not find the response in comeptitors' prices to be significant. The sales of the stores that offered the guarantee increased significantly while the competitors' sales decreased significantly. However, the significance vanishes if I use the station clustered standard errors. Comparing my observations and the predictions of different theories of modeling low-price guarantees, I conclude the empirical evidence here supports that the low-price guarantee is a simple commitment device and induces lower prices.

Chapter 2 conducts a consumer welfare analysis of low-price guarantees to address the antitrust concerns and potential regulations from the government; explains the firms' potential incentives to adopt a low-price guarantee. Using station-level data from the retail gasoline industry in Quebec, I estimated consumers' demand of gasoline by a structural model with spatial competition incorporating the low-price guarantee as a commitment device, which allows firms to pre-commit to charge the lowest price among their competitors. The counterfactual analysis under the Bertrand competition setting shows that the stores that offered the guarantee attracted a lot more consumers and decreased their posted price by 0.6 cents per liter. Although the matching stores suffered a decrease in profits from gasoline sales, they are incentivized to adopt the low-price guarantee to attract more consumers to visit the store likely increasing profits at attached convenience stores. Firms have strong incentives to adopt a low-price guarantee on the product that their consumers are most price-sensitive about, while earning a profit from the products that are not covered in the guarantee. I estimate that consumers earn about 0.3% more surplus when the low-price guarantee is in place, which suggests that the authorities should not be concerned and regulate low-price guarantees. In Appendix B, I also propose an empirical model to look into how low-price guarantees would change consumer search behavior and whether consumer search plays an important role in estimating consumer surplus accurately.

Chapter 3, joint with Gale Boyd, describes work with the pulp, paper, and paperboard (PP&PB) industry to provide a plant-level indicator of energy efficiency for facilities that produce various types of paper products in the United States. Organizations that implement strategic energy management programs undertake a set of activities that, if carried out properly, have the potential to deliver sustained energy savings. Energy performance benchmarking is a key activity of strategic energy management and one way to enable companies to set energy efficiency targets for manufacturing facilities. The opportunity to assess plant energy performance through a comparison with similar plants in its industry is a highly desirable and strategic method of benchmarking for industrial energy managers. However, access to energy performance data for conducting industry benchmarking is usually unavailable to most industrial energy managers. The U.S. Environmental Protection Agency (EPA), through its ENERGY STAR program, seeks to overcome this barrier through the development of manufacturing sector-based plant energy performance indicators (EPIs) that encourage U.S. industries to use energy more efficiently. In the development of the energy performance indicator tools, consideration is given to the role that performance-based indicators play in motivating change; the steps necessary for indicator development, from interacting with an industry in securing adequate data for the indicator; and actual application and use of an indicator when complete. How indicators are employed in EPA’s efforts to encourage industries to voluntarily improve their use of energy is discussed as well. The chapter describes the data and statistical methods used to construct the EPI for plants within selected segments of the pulp, paper, and paperboard industry: specifically pulp mills and integrated paper & paperboard mills. The individual equations are presented, as are the instructions for using those equations as implemented in an associated Microsoft Excel-based spreadsheet tool.

Veja mais

On the Advancement of Probabilistic Models of Decompression Sickness

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The work presented in this dissertation is focused on applying engineering methods to develop and explore probabilistic survival models for the prediction of decompression sickness in US NAVY divers. Mathematical modeling, computational model development, and numerical optimization techniques were employed to formulate and evaluate the predictive quality of models fitted to empirical data. In Chapters 1 and 2 we present general background information relevant to the development of probabilistic models applied to predicting the incidence of decompression sickness. The remainder of the dissertation introduces techniques developed in an effort to improve the predictive quality of probabilistic decompression models and to reduce the difficulty of model parameter optimization.

The first project explored seventeen variations of the hazard function using a well-perfused parallel compartment model. Models were parametrically optimized using the maximum likelihood technique. Model performance was evaluated using both classical statistical methods and model selection techniques based on information theory. Optimized model parameters were overall similar to those of previously published Results indicated that a novel hazard function definition that included both ambient pressure scaling and individually fitted compartment exponent scaling terms.

We developed ten pharmacokinetic compartmental models that included explicit delay mechanics to determine if predictive quality could be improved through the inclusion of material transfer lags. A fitted discrete delay parameter augmented the inflow to the compartment systems from the environment. Based on the observation that symptoms are often reported after risk accumulation begins for many of our models, we hypothesized that the inclusion of delays might improve correlation between the model predictions and observed data. Model selection techniques identified two models as having the best overall performance, but comparison to the best performing model without delay and model selection using our best identified no delay pharmacokinetic model both indicated that the delay mechanism was not statistically justified and did not substantially improve model predictions.

Our final investigation explored parameter bounding techniques to identify parameter regions for which statistical model failure will not occur. When a model predicts a no probability of a diver experiencing decompression sickness for an exposure that is known to produce symptoms, statistical model failure occurs. Using a metric related to the instantaneous risk, we successfully identify regions where model failure will not occur and identify the boundaries of the region using a root bounding technique. Several models are used to demonstrate the techniques, which may be employed to reduce the difficulty of model optimization for future investigations.

Veja mais

5 resultados para statistical methods

em Duke University

Filtro por publicador

New Advancements of Scalable Statistical Methods for Learning Latent Structures in Big Data

Using Image Processing Methods to Improve the Detection of Buried Explosive Threats in GPR Data

Learning and Exploiting Camera Geometry for Computer Vision

Essays on Energy Economics and Industrial Organization

On the Advancement of Probabilistic Models of Decompression Sickness