6 results for Statistical Model
in Digital Commons - Michigan Tech
                                
Abstract:
The report explores the problem of detecting complex point targets in a MIMO radar system. A complex point target is a mathematical and statistical model for a radar target that is not resolved in space but exhibits varying complex reflectivity across the different bistatic view angles. The complex reflectivity can be modeled as a complex stochastic process whose index set is the set of all bistatic view angles, and the parameters of the stochastic process follow from an analysis of a target model comprising a number of ideal point scatterers randomly located within some radius of the target's center of mass. The proposed complex point targets may be applicable to statistical inference in multistatic or MIMO radar systems. Six different target models are summarized here: three 2-dimensional (Gaussian, Uniform Square, and Uniform Circle) and three 3-dimensional (Gaussian, Uniform Cube, and Uniform Sphere). They differ in the assumed distribution of the point scatterer locations within the target. We develop data models for the received signals from such targets in a MIMO radar system with distributed assets and partially correlated signals, and consider the resulting detection problem, which reduces to the familiar Gauss-Gauss detection problem. We illustrate that the target parameters and the transmit signal influence detector performance through the target extent and the SNR, respectively. A series of receiver operating characteristic (ROC) curves is generated to show the impact of varying SNR on the detector. Kullback–Leibler (KL) divergence is applied to obtain the approximate mean difference between the density functions that the scatterers assume inside the target models, showing how detector performance changes with the target extent of the point scatterers.
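As a rough illustration of the Gauss-Gauss detection problem mentioned in this abstract, the sketch below tests a zero-mean Gaussian with noise-only covariance against a zero-mean Gaussian with noise-plus-target covariance, using the log-likelihood ratio as the detector statistic. Everything concrete here is an assumption made for illustration and is not taken from the report: real-valued (not complex) data, an identity noise covariance, a hypothetical exponentially decaying signal covariance, and the use of KL divergence between the two hypothesis densities as a separability measure.

```python
import numpy as np

def kl_zero_mean_gaussians(r0, r1):
    """KL( N(0, r0) || N(0, r1) ) for real, zero-mean Gaussian vectors."""
    k = r0.shape[0]
    _, logdet0 = np.linalg.slogdet(r0)
    _, logdet1 = np.linalg.slogdet(r1)
    return 0.5 * (np.trace(np.linalg.solve(r1, r0)) - k + logdet1 - logdet0)

def log_likelihood_ratio(x, r0, r1):
    """log p1(x)/p0(x) for each row of x, with p_i = N(0, r_i)."""
    q = np.linalg.inv(r0) - np.linalg.inv(r1)
    _, logdet0 = np.linalg.slogdet(r0)
    _, logdet1 = np.linalg.slogdet(r1)
    quad = np.einsum("ni,ij,nj->n", x, q, x)
    return 0.5 * quad + 0.5 * (logdet0 - logdet1)

rng = np.random.default_rng(0)
k = 4
idx = np.arange(k)
# Hypothetical target covariance shape (assumption, not from the report).
signal_cov = 0.5 ** np.abs(idx[:, None] - idx[None, :])
noise_cov = np.eye(k)

results = {}
for snr in (0.5, 2.0):
    target_cov = noise_cov + snr * signal_cov
    x0 = rng.multivariate_normal(np.zeros(k), noise_cov, size=2000)
    x1 = rng.multivariate_normal(np.zeros(k), target_cov, size=2000)
    s0 = log_likelihood_ratio(x0, noise_cov, target_cov)
    s1 = log_likelihood_ratio(x1, noise_cov, target_cov)
    threshold = np.quantile(s0, 0.9)          # empirical P_FA of about 0.1
    p_detect = float((s1 > threshold).mean())  # one point on the ROC curve
    kl = float(kl_zero_mean_gaussians(noise_cov, target_cov))
    results[snr] = (kl, p_detect)
    print(f"SNR={snr}: KL={kl:.3f}, P_D at P_FA~0.1 = {p_detect:.3f}")
```

Sweeping the threshold instead of fixing it at one quantile would trace out a full ROC curve for each SNR. For circular complex Gaussian data the factors of 0.5 in both functions would drop out.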
                                
Abstract:
Standard procedures for forecasting flood risk (Bulletin 17B) assume annual maximum flood (AMF) series are stationary, meaning the distribution of flood flows is not significantly affected by climatic trends/cycles, or anthropogenic activities within the watershed. Historical flood events are therefore considered representative of future flood occurrences, and the risk associated with a given flood magnitude is modeled as constant over time. However, in light of increasing evidence to the contrary, this assumption should be reconsidered, especially as the existence of nonstationarity in AMF series can have significant impacts on planning and management of water resources and relevant infrastructure. Research presented in this thesis quantifies the degree of nonstationarity evident in AMF series for unimpaired watersheds throughout the contiguous U.S., identifies meteorological, climatic, and anthropogenic causes of this nonstationarity, and proposes an extension of the Bulletin 17B methodology which yields forecasts of flood risk that reflect climatic influences on flood magnitude. To appropriately forecast flood risk, it is necessary to consider the driving causes of nonstationarity in AMF series. Herein, large-scale climate patterns—including El Niño-Southern Oscillation (ENSO), Pacific Decadal Oscillation (PDO), North Atlantic Oscillation (NAO), and Atlantic Multidecadal Oscillation (AMO)—are identified as influencing factors on flood magnitude at numerous stations across the U.S. Strong relationships between flood magnitude and associated precipitation series were also observed for the majority of sites analyzed in the Upper Midwest and Northeastern regions of the U.S. Although relationships between flood magnitude and associated temperature series are not apparent, results do indicate that temperature is highly correlated with the timing of flood peaks. 
Despite consideration of watersheds classified as unimpaired, analyses also suggest that identified change-points in AMF series are due to dam construction and other types of regulation and diversion. Although not explored herein, trends in AMF series are also likely to be partially explained by changes in land use and land cover over time. Results obtained herein suggest that improved forecasts of flood risk may be obtained using a simple modification of the Bulletin 17B framework, wherein the mean and standard deviation of the log-transformed flows are modeled as functions of climate indices associated with oceanic-atmospheric patterns (e.g., AMO, ENSO, NAO, and PDO) with lead times between 3 and 9 months. Herein, one-year-ahead forecasts of the mean and standard deviation, and subsequently flood risk, are obtained by applying site-specific multivariate regression models, which reflect the phase and intensity of a given climate pattern, as well as possible impacts of coupling of the climate cycles. These forecasts of flood risk are compared with forecasts derived using the existing Bulletin 17B model; large differences in the one-year-ahead forecasts are observed in some locations. The increased knowledge of the inherent structure of AMF series and an improved understanding of physical and/or climatic causes of nonstationarity gained from this research should serve as insight for the formulation of a physical-causal based statistical model, incorporating both climatic variations and human impacts, for flood risk over longer planning horizons (e.g., 10-, 50-, and 100-year) necessary for water resources design, planning, and management.
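The idea of letting the mean of the log-transformed flows depend on a climate index can be sketched as follows. This is a minimal illustration under strong simplifying assumptions, not the thesis's model: a single hypothetical climate index, a simple linear regression for the mean, a constant standard deviation taken from the regression residuals, and zero skew (so the log-Pearson Type III distribution used in Bulletin 17B reduces to a log-normal and the frequency factor is a standard normal quantile). All data values are invented.

```python
from math import sqrt
from statistics import NormalDist, fmean

def fit_line(x, y):
    """Ordinary least squares fit y = intercept + slope * x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def flood_quantile(log_mean, log_sd, exceedance_prob):
    """Flow with the given annual exceedance probability (zero-skew case)."""
    k = NormalDist().inv_cdf(1.0 - exceedance_prob)  # frequency factor
    return 10.0 ** (log_mean + k * log_sd)

# Hypothetical lagged climate index and log10 annual maximum flows.
climate_index = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
log10_peak_flow = [2.90, 2.87, 2.97, 2.96, 3.06, 3.13, 3.11]

intercept, slope = fit_line(climate_index, log10_peak_flow)
residuals = [y - (intercept + slope * x)
             for x, y in zip(climate_index, log10_peak_flow)]
residual_sd = sqrt(sum(r * r for r in residuals) / (len(residuals) - 2))

# Conditional 100-year flood (1% annual exceedance) for two index states.
q100_low = flood_quantile(intercept + slope * -1.0, residual_sd, 0.01)
q100_high = flood_quantile(intercept + slope * 1.0, residual_sd, 0.01)
print(f"100-year flood, index=-1: {q100_low:.0f}")
print(f"100-year flood, index=+1: {q100_high:.0f}")
```

In the stationary case the same quantile would be reported regardless of the index value. Here the estimate shifts with the climate state, which is the behavior the abstract describes.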
Analysis of spring break-up and its effects on a biomass feedstock supply chain in northern Michigan
                                
Abstract:
Demand for bio-fuels is expected to increase, due to rising prices of fossil fuels and concerns over greenhouse gas emissions and energy security. The overall cost of biomass energy generation is primarily related to biomass harvesting activity, transportation, and storage. With a commercial-scale cellulosic ethanol processing facility in Kinross Township of Chippewa County, Michigan, about to be built, models including a simulation model and an optimization model have been developed to provide decision support for the facility. Both models track cost, emissions and energy consumption. While the optimization model provides guidance for a long-term strategic plan, the simulation model aims to present detailed output for specified operational scenarios over an annual period. Most importantly, the simulation model considers the uncertainty of spring break-up timing, i.e., seasonal road restrictions. Spring break-up timing is important because it will impact the feasibility of harvesting activity and the time duration of transportation restrictions, which significantly changes the availability of feedstock for the processing facility. This thesis focuses on the statistical model of spring break-up used in the simulation model. Spring break-up timing depends on various factors, including temperature, road conditions and soil type, as well as individual decision-making processes at the county level. The spring break-up model, based on the historical spring break-up data from 27 counties over the period of 2002-2010, starts by specifying the probability distribution of a particular county’s spring break-up start day and end day, and then relates the spring break-up timing of the other counties in the harvesting zone to the first county. In order to estimate the dependence relationship between counties, regression analyses, including standard linear regression and reduced major axis regression, are conducted.
Using realizations (scenarios) of spring break-up generated by the statistical spring break-up model, the simulation model is able to probabilistically evaluate different harvesting and transportation plans to help the bio-fuel facility select the most effective strategy. For early spring break-up, which usually indicates a longer than average break-up period, more log storage is required, total cost increases, and the probability of plant closure increases. The risk of plant closure may be partially offset through increased use of rail transportation, which is not subject to spring break-up restrictions. However, rail availability and rail yard storage may then become limiting factors in the supply chain. Rail use will impact total cost, energy consumption, system-wide CO2 emissions, and the reliability of providing feedstock to the bio-fuel processing facility.
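The two regression types named in this abstract differ in how they treat error in the predictor. Ordinary least squares assumes the reference county's dates are error-free, while reduced major axis (RMA) regression treats both counties symmetrically and uses the ratio of standard deviations as the slope. The sketch below compares the two on invented break-up start days (day of year) for two hypothetical counties over nine years; none of the numbers come from the thesis.

```python
from math import copysign, sqrt
from statistics import fmean

def _sums(x, y):
    """Means and centered sums of squares and cross-products."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return mx, my, sxx, syy, sxy

def ols_fit(x, y):
    """Standard linear regression of y on x: returns (slope, intercept)."""
    mx, my, sxx, _, sxy = _sums(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx

def rma_fit(x, y):
    """Reduced major axis regression: slope = sign(r) * sd(y) / sd(x)."""
    mx, my, sxx, syy, sxy = _sums(x, y)
    slope = copysign(sqrt(syy / sxx), sxy)
    return slope, my - slope * mx

# Hypothetical break-up start days for a reference county and a neighbor.
county_a = [68, 75, 62, 80, 71, 66, 77, 73, 64]
county_b = [70, 79, 66, 83, 72, 70, 78, 77, 65]

ols_slope, ols_intercept = ols_fit(county_a, county_b)
rma_slope, rma_intercept = rma_fit(county_a, county_b)
print(f"OLS: day_b = {ols_intercept:.2f} + {ols_slope:.3f} * day_a")
print(f"RMA: day_b = {rma_intercept:.2f} + {rma_slope:.3f} * day_a")
```

Because the OLS slope equals the correlation coefficient times the RMA slope, the RMA line is never flatter than the OLS line, and the two coincide only when the points are perfectly collinear.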
                                
Abstract:
Several deterministic and probabilistic methods are used to evaluate the probability of seismically induced liquefaction of a soil. Probabilistic models usually carry both model uncertainty and uncertainty in the parameters used to develop the model. These model uncertainties vary from one statistical model to another. Most of the model uncertainties are epistemic and can be addressed through appropriate knowledge of the statistical model. One such epistemic model uncertainty in evaluating liquefaction potential with a probabilistic model such as logistic regression is sampling bias. Sampling bias is the difference between the class distribution in the sample used for developing the statistical model and the true population distribution of liquefaction and non-liquefaction instances. Recent studies have shown that sampling bias can significantly affect the probability predicted by a statistical model. To address this epistemic uncertainty, a new approach was developed for evaluating the probability of seismically induced soil liquefaction, in which a logistic regression model was used in combination with the Hosmer-Lemeshow statistic. This approach was used to estimate the population (true) distribution of liquefaction to non-liquefaction instances for the most recently updated standard penetration test (SPT) and cone penetration test (CPT) based case histories. Apart from this, other model uncertainties, such as the distribution of explanatory variables and the significance of explanatory variables, were also addressed using the KS test and the Wald statistic, respectively. Moreover, based on the estimated population distribution, logistic regression equations were proposed to calculate the probability of liquefaction for both SPT- and CPT-based case histories. Additionally, the proposed probability curves were compared with existing probability curves based on SPT and CPT case histories.
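To see why sampling bias matters for a logistic regression model, the sketch below applies the standard prior correction of the intercept, which rescales the fitted odds by the ratio of population odds to sample odds. This is a swapped-in, generic technique chosen for illustration; it is not the Hosmer-Lemeshow based approach described in the abstract. The coefficients, feature values, and class fractions are all hypothetical.

```python
from math import exp, log

def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + exp(-z))

def prior_corrected_intercept(intercept, sample_fraction, population_fraction):
    """Shift the intercept so predictions reflect the population class mix.

    sample_fraction: share of liquefaction cases in the fitting sample.
    population_fraction: assumed true share of liquefaction cases.
    """
    ybar, tau = sample_fraction, population_fraction
    return intercept - log(((1.0 - tau) / tau) * (ybar / (1.0 - ybar)))

def liquefaction_probability(intercept, coefficients, features):
    """Predicted probability from a fitted logistic regression model."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return logistic(z)

# Hypothetical fitted model: two explanatory variables, invented values.
fitted_intercept = 1.2
coefficients = [-0.15, 3.0]
site = [12.0, 0.25]  # hypothetical site measurements

biased = liquefaction_probability(fitted_intercept, coefficients, site)
corrected_intercept = prior_corrected_intercept(
    fitted_intercept, sample_fraction=0.6, population_fraction=0.3)
corrected = liquefaction_probability(corrected_intercept, coefficients, site)
print(f"uncorrected: {biased:.3f}, prior-corrected: {corrected:.3f}")
```

When liquefaction cases are over-represented in the case-history sample relative to the assumed population, the correction lowers the predicted probability. When the two fractions are equal, the intercept is unchanged.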
                                
Abstract:
Turbulence affects traditional free space optical communication by causing speckle to appear in the received beam profile. This occurs due to changes in the refractive index of the atmosphere that are caused by fluctuations in temperature and pressure, resulting in an inhomogeneous medium. The Gaussian-Schell model of partial coherence has been suggested as a means of mitigating these atmospheric inhomogeneities on the transmission side. This dissertation analyzed the Gaussian-Schell model of partial coherence by verifying the Gaussian-Schell model in the far-field, investigated the number of independent phase control screens necessary to approach the ideal Gaussian-Schell model, and showed experimentally that the Gaussian-Schell model of partial coherence is achievable in the far-field using a liquid crystal spatial light modulator. A method for optimizing the statistical properties of the Gaussian-Schell model was developed to maximize the coherence of the field while ensuring that it does not exhibit the same statistics as a fully coherent source. Finally, a technique to estimate the minimum spatial resolution necessary in a spatial light modulator was developed to effectively propagate the Gaussian-Schell model through a range of atmospheric turbulence strengths. This work showed that regardless of turbulence strength or receiver aperture, transmitting the Gaussian-Schell model of partial coherence instead of a fully coherent source will yield a reduction in the intensity fluctuations of the received field. By measuring the variance of the intensity fluctuations and the received mean, it is shown through the scintillation index that using the Gaussian-Schell model of partial coherence is a simple and straightforward method to mitigate atmospheric turbulence instead of traditional adaptive optics in free space optical communications.
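The scintillation index referred to in this abstract is the variance of the received intensity normalized by the square of its mean. A minimal sketch of that calculation follows. The two sample sets are invented values with the same mean intensity, standing in for a strongly fluctuating and a weakly fluctuating received beam; they are not measurements from the dissertation.

```python
from statistics import fmean, pvariance

def scintillation_index(intensities):
    """Normalized intensity variance: var(I) / mean(I)**2."""
    mean_intensity = fmean(intensities)
    if mean_intensity == 0:
        raise ValueError("mean intensity must be nonzero")
    return pvariance(intensities, mu=mean_intensity) / mean_intensity ** 2

# Hypothetical received intensity samples (arbitrary units, equal means).
strongly_fluctuating = [0.2, 1.9, 0.4, 2.6, 0.1, 0.8]
weakly_fluctuating = [0.8, 1.2, 0.9, 1.1, 1.0, 1.0]

si_strong = scintillation_index(strongly_fluctuating)
si_weak = scintillation_index(weakly_fluctuating)
print(f"scintillation index (strong fluctuations): {si_strong:.3f}")
print(f"scintillation index (weak fluctuations):   {si_weak:.3f}")
```

Because the index is normalized by the mean, it allows sources with different received power levels to be compared on the same scale. A lower value indicates smaller relative intensity fluctuations.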
                                
Abstract:
The developmental processes and functions of an organism are controlled by the genes and the proteins that are derived from these genes. The identification of key genes and the reconstruction of gene networks can provide a model to help us understand the regulatory mechanisms for the initiation and progression of biological processes or functional abnormalities (e.g., diseases) in living organisms. In this dissertation, I have developed statistical methods to identify the genes and transcription factors (TFs) involved in biological processes, constructed their regulatory networks, and also evaluated some existing association methods to find robust methods for coexpression analyses. Two kinds of data sets were used for this work: genotype data and gene expression microarray data. On the basis of these data sets, this dissertation has two major parts, together forming six chapters. The first part deals with developing association methods for rare variants using genotype data (chapters 4 and 5). The second part deals with developing and/or evaluating statistical methods to identify genes and TFs involved in biological processes, and construction of their regulatory networks using gene expression data (chapters 2, 3, and 6). For the first part, I have developed two methods to find the groupwise association of rare variants with given diseases or traits. The first method is based on kernel machine learning and can be applied to both quantitative and qualitative traits. Simulation results showed that the proposed method has improved power over the existing weighted sum method (WS) in most settings. The second method uses multiple phenotypes to select a few top significant genes. It then finds the association of each gene with each phenotype while controlling for population stratification by adjusting the data for ancestry using principal components. This method was applied to GAW 17 data and was able to find several disease risk genes.
For the second part, I have worked on three problems. The first problem involved the evaluation of eight gene association methods. A very comprehensive comparison of these methods, with further analysis, clearly demonstrates the distinct and common performance of these eight gene association methods. For the second problem, an algorithm named the bottom-up graphical Gaussian model was developed to identify the TFs that regulate pathway genes and reconstruct their hierarchical regulatory networks. This algorithm has produced very significant results, and it is the first report to produce such hierarchical networks for these pathways. The third problem dealt with developing another algorithm, called the top-down graphical Gaussian model, that identifies the network governed by a specific TF. The network produced by the algorithm is proven to be of very high accuracy.
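Graphical Gaussian models rest on partial correlations, which measure the association between two variables after removing the linear effect of all the others. The sketch below shows only that underlying quantity, computed from the inverse covariance (precision) matrix. It is not the bottom-up or top-down algorithm from the dissertation. The three simulated "genes" and their chain structure (a regulator driving an intermediate gene, which drives a target) are hypothetical.

```python
import numpy as np

def partial_correlations(data):
    """Partial correlation matrix from the precision matrix.

    data: array of shape (samples, variables).
    Entry (i, j) is the correlation of i and j given all other variables.
    """
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

rng = np.random.default_rng(1)
n = 5000
# Hypothetical chain: regulator -> intermediate -> target.
regulator = rng.normal(size=n)
intermediate = regulator + rng.normal(size=n)
target = intermediate + rng.normal(size=n)
expression = np.column_stack([regulator, intermediate, target])

marginal = np.corrcoef(expression, rowvar=False)
partial = partial_correlations(expression)
print("marginal corr(regulator, target):", round(float(marginal[0, 2]), 3))
print("partial  corr(regulator, target):", round(float(partial[0, 2]), 3))
```

The regulator and target are clearly correlated marginally, but their partial correlation is near zero once the intermediate gene is accounted for. This is why a graphical Gaussian model would draw no direct edge between them.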
 
                    