2 results for Statistical evaluation

in Digital Commons - Michigan Tech


Abstract:

Standard procedures for forecasting flood risk (Bulletin 17B) assume annual maximum flood (AMF) series are stationary, meaning the distribution of flood flows is not significantly affected by climatic trends/cycles, or anthropogenic activities within the watershed. Historical flood events are therefore considered representative of future flood occurrences, and the risk associated with a given flood magnitude is modeled as constant over time. However, in light of increasing evidence to the contrary, this assumption should be reconsidered, especially as the existence of nonstationarity in AMF series can have significant impacts on planning and management of water resources and relevant infrastructure. Research presented in this thesis quantifies the degree of nonstationarity evident in AMF series for unimpaired watersheds throughout the contiguous U.S., identifies meteorological, climatic, and anthropogenic causes of this nonstationarity, and proposes an extension of the Bulletin 17B methodology which yields forecasts of flood risk that reflect climatic influences on flood magnitude. To appropriately forecast flood risk, it is necessary to consider the driving causes of nonstationarity in AMF series. Herein, large-scale climate patterns—including El Niño-Southern Oscillation (ENSO), Pacific Decadal Oscillation (PDO), North Atlantic Oscillation (NAO), and Atlantic Multidecadal Oscillation (AMO)—are identified as influencing factors on flood magnitude at numerous stations across the U.S. Strong relationships between flood magnitude and associated precipitation series were also observed for the majority of sites analyzed in the Upper Midwest and Northeastern regions of the U.S. Although relationships between flood magnitude and associated temperature series are not apparent, results do indicate that temperature is highly correlated with the timing of flood peaks. 
Although the watersheds considered are classified as unimpaired, analyses also suggest that identified change-points in AMF series are due to dam construction and other types of regulation and diversion. Although not explored herein, trends in AMF series are also likely to be partially explained by changes in land use and land cover over time. Results obtained herein suggest that improved forecasts of flood risk may be obtained using a simple modification of the Bulletin 17B framework, wherein the mean and standard deviation of the log-transformed flows are modeled as functions of climate indices associated with oceanic-atmospheric patterns (e.g., AMO, ENSO, NAO, and PDO) with lead times between 3 and 9 months. One-year-ahead forecasts of the mean and standard deviation, and subsequently of flood risk, are obtained by applying site-specific multivariate regression models, which reflect the phase and intensity of a given climate pattern as well as possible impacts of coupling between the climate cycles. These forecasts of flood risk are compared with forecasts derived using the existing Bulletin 17B model; large differences in the one-year-ahead forecasts are observed in some locations. The increased knowledge of the inherent structure of AMF series and the improved understanding of the physical and/or climatic causes of nonstationarity gained from this research should inform the formulation of a physical-causal statistical model, incorporating both climatic variations and human impacts, for flood risk over the longer planning horizons (e.g., 10, 50, and 100 years) necessary for water resources design, planning, and management.
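The idea of conditioning the log-space moments on a climate index can be illustrated with a minimal sketch. It is not the thesis's model: it assumes a single hypothetical climate index, conditions only the mean (the residual standard deviation is held constant), and assumes zero skew so that the frequency factor reduces to a standard normal quantile, whereas Bulletin 17B fits a log-Pearson Type III distribution. The function names and the sample numbers are made up for illustration.

```python
import math
from statistics import NormalDist, mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    xb, yb = mean(x), mean(y)
    sxx = sum((xi - xb) ** 2 for xi in x)
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    b = sxy / sxx
    return yb - b * xb, b


def conditional_flood_quantile(index, log_flows, index_next, exceed_prob):
    """Flood magnitude with the given annual exceedance probability,
    conditional on next year's (lagged) climate index value.

    Assumptions (not from the source): log10 flows are normal around a
    mean that is linear in one index; the spread is the constant
    residual standard deviation of that regression.
    """
    a, b = fit_line(index, log_flows)
    resid = [y - (a + b * x) for x, y in zip(index, log_flows)]
    s = math.sqrt(sum(r * r for r in resid) / (len(resid) - 2))
    mu = a + b * index_next                      # conditional mean of log10(Q)
    k = NormalDist().inv_cdf(1.0 - exceed_prob)  # zero-skew frequency factor
    return 10 ** (mu + k * s)


if __name__ == "__main__":
    # Hypothetical index values and log10 annual maximum flows.
    idx = [-1.2, -0.4, 0.3, 1.1, 0.8, -0.9, 0.0, 1.5, -1.6, 0.5]
    logq = [2.85, 2.95, 3.05, 3.20, 3.05, 2.90, 3.02, 3.18, 2.80, 3.10]
    for phase in (-1.5, 1.5):
        q100 = conditional_flood_quantile(idx, logq, phase, 0.01)
        print(f"index={phase:+.1f}: 1% annual-exceedance flow ~ {q100:.0f}")
```

A stationary analysis would return the same 1% flow every year; here the estimate shifts with the index value, which is the kind of year-to-year difference in forecast risk the abstract reports.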

Abstract:

The developmental processes and functions of an organism are controlled by its genes and the proteins derived from them. Identifying key genes and reconstructing gene networks can provide a model for understanding the regulatory mechanisms behind the initiation and progression of biological processes or functional abnormalities (e.g., diseases) in living organisms. In this dissertation, I have developed statistical methods to identify the genes and transcription factors (TFs) involved in biological processes, constructed their regulatory networks, and evaluated some existing association methods to find robust methods for coexpression analyses. Two kinds of data sets were used for this work: genotype data and gene expression microarray data. On the basis of these data sets, this dissertation has two major parts, together forming six chapters. The first part deals with developing association methods for rare variants using genotype data (chapters 4 and 5). The second part deals with developing and/or evaluating statistical methods to identify genes and TFs involved in biological processes, and with constructing their regulatory networks using gene expression data (chapters 2, 3, and 6). For the first part, I have developed two methods to find the groupwise association of rare variants with given diseases or traits. The first method is based on kernel machine learning and can be applied to both quantitative and qualitative traits. Simulation results showed that the proposed method has improved power over the existing weighted sum (WS) method in most settings. The second method uses multiple phenotypes to select a few top significant genes. It then tests the association of each gene with each phenotype while controlling for population stratification by adjusting the data for ancestry using principal components. This method was applied to GAW 17 data and was able to find several disease risk genes.
For the second part, I have worked on three problems. The first problem involved the evaluation of eight gene association methods. A comprehensive comparison, together with further analysis, shows where the performance of these eight methods differs and where it coincides. For the second problem, an algorithm named the bottom-up graphical Gaussian model was developed to identify the TFs that regulate pathway genes and to reconstruct their hierarchical regulatory networks. This algorithm produced highly significant results, and this is the first report of such hierarchical networks for these pathways. The third problem dealt with developing another algorithm, called the top-down graphical Gaussian model, that identifies the network governed by a specific TF. The network produced by this algorithm proved to be highly accurate.
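Graphical Gaussian models in general place an edge between two genes only when their partial correlation, given the other variables, is nonzero. The sketch below shows that building block for three variables using the standard first-order partial correlation formula. It is not the bottom-up or top-down algorithm of the dissertation, whose details the abstract does not give; the function names and the toy expression values are assumptions chosen so that the arithmetic is easy to follow.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Toy data (hypothetical): one TF drives two genes, each with its own
# independent noise, so the genes are linked only through the TF.
tf = [1, 1, 1, 1, -1, -1, -1, -1]
gene_a = [1.5, 1.5, 0.5, 0.5, -0.5, -0.5, -1.5, -1.5]
gene_b = [1.5, 0.5, 1.5, 0.5, -0.5, -1.5, -0.5, -1.5]

if __name__ == "__main__":
    print("marginal correlation:", round(pearson(gene_a, gene_b), 3))
    print("partial correlation given TF:",
          round(partial_corr(gene_a, gene_b, tf), 3))
```

In this toy case the two genes are strongly correlated, yet their partial correlation given the TF vanishes, so a graphical Gaussian model would connect each gene to the TF and draw no direct gene-gene edge.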