3 results for Model Output Statistics
at Duke University
Abstract:
Forests change with changes in their environment based on the physiological responses of individual trees. These short-term reactions have cumulative impacts on long-term demographic performance. For a tree in a forest community, success depends on biomass growth to capture above- and belowground resources and reproductive output to establish future generations. Here we examine aspects of how forests respond to changes in moisture and light availability and how these responses are related to tree demography and physiology.
First we address the long-term pattern of tree decline before death and its connection with drought. Increasing drought stress and chronic morbidity could have pervasive impacts on forest composition in many regions. We use long-term, whole-stand inventory data from southeastern U.S. forests to show that trees exposed to drought experience multiyear declines in growth prior to mortality. Following a severe, multiyear drought, 72% of trees that did not recover their pre-drought growth rates died within 10 years. This pattern was mediated by local moisture availability. As an index of morbidity prior to death, we calculated the difference in cumulative growth after drought relative to surviving conspecifics. The strength of drought-induced morbidity varied among species and was correlated with species drought tolerance.
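The morbidity index described above can be illustrated with a minimal sketch. The function name, the use of the mean across survivors as the reference, and the sign convention (negative values mean the focal tree lags behind) are assumptions for illustration, not details taken from the dissertation.

```python
import numpy as np


def morbidity_index(focal_growth, survivor_growth):
    """Cumulative post-drought growth of one tree minus that of surviving conspecifics.

    focal_growth:    annual growth increments of the focal tree, shape (years,)
    survivor_growth: annual growth increments of survivors, shape (trees, years)
    The reference is the mean cumulative growth across survivors (an assumption).
    """
    focal = np.cumsum(np.asarray(focal_growth, dtype=float))
    reference = np.cumsum(np.asarray(survivor_growth, dtype=float), axis=1).mean(axis=0)
    return focal - reference
```

Under this convention, a tree whose index keeps falling year after year is one that never recovers its growth relative to its surviving neighbors.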
Next, we investigate differences among tree species in reproductive output relative to biomass growth with changes in light availability. Previous studies reach conflicting conclusions about the constraints on reproductive allocation relative to growth and how they vary through time, across species, and between environments. We test the hypothesis that canopy exposure to light, a critical resource, limits reproductive allocation by comparing long-term relationships between reproduction and growth for trees from 21 species in forests throughout the southeastern U.S. We found that species had divergent responses to light availability, with shade-intolerant species experiencing an alleviation of trade-offs between growth and reproduction at high light. Shade-tolerant species showed no changes in reproductive output across light environments.
Given that the above patterns depend on the maintenance of transpiration, we next developed an approach for predicting whole-tree water use from sap flux observations. Accurately scaling these observations to tree- or stand-levels requires accounting for variation in sap flux between wood types and with depth into the tree. We compared different models with sap flux data to test the hypotheses that radial sap flux profiles differ by wood type and tree size. We show that radial variation in sap flux is dependent on wood type but independent of tree size for a range of temperate trees. The best-fitting model predicted out-of-sample sap flux observations and independent estimates of sapwood area with small errors, suggesting robustness in new settings. We outline a method for predicting whole-tree water use with this model and include computer code for simple implementation in other studies.
Finally, we estimated tree water balances during drought with a statistical time-series analysis. Moisture limitation in forest stands comes predominantly from water use by the trees themselves, a drought-stand feedback. We show that drought impacts on tree fitness and forest composition can be predicted by tracking the moisture reservoir available to each tree in a mass balance. We apply this model to multiple seasonal droughts in a temperate forest with measurements of tree water use to demonstrate how species and size differences modulate moisture availability across landscapes. As trees deplete their soil moisture reservoir during droughts, a transpiration deficit develops, leading to reduced biomass growth and reproductive output.
This dissertation draws connections between the physiological condition of individual trees and their behavior in crowded, diverse, and continually changing forest stands. The analyses take advantage of growing data sets on both the physiology and demography of trees, as well as novel statistical techniques that allow us to link these observations to realistic quantitative models. The results can be used to scale up tree measurements to entire stands and to address questions about the future composition of forests and the land’s balance of water and carbon.
Abstract:
Uncertainty quantification (UQ) is both an old and a new concept. The current novelty lies in the interaction and synthesis of mathematical models, computer experiments, statistics, field/real experiments, and probability theory, with particular emphasis on large-scale simulations by computer models. The challenges come not only from the complexity of the scientific questions but also from the sheer volume of information. This thesis focuses on providing statistical models that scale to the massive data produced in computer experiments and real experiments, through fast and robust statistical inference.
Chapter 2 provides a practical approach for simultaneously emulating (approximating) a massive number of functions, with an application to hazard quantification for the Soufrière Hills volcano on the island of Montserrat. Chapter 3 discusses another massive-data problem, in which the number of observations of a function is large. An exact algorithm that runs in linear time is developed for the problem of interpolating methylation levels. Chapters 4 and 5 both concern robust inference for these models. Chapter 4 introduces new criteria for robust parameter estimation and shows that several methods of inference satisfy them. Chapter 5 develops a new prior that satisfies some additional criteria and is therefore proposed for use in practice.
Abstract:
Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach for scaling down the problem is to first partition the dataset into subsets and then fit the model using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space), and the challenge arises in defining an algorithm with low communication, theoretical guarantees, and excellent practical performance in general settings. For sample space partitioning, I propose a MEdian Selection Subset AGgregation Estimator (message) algorithm to address these issues. The algorithm applies feature selection in parallel for each subset using regularized regression or a Bayesian variable selection method, calculates the 'median' feature inclusion index, estimates coefficients for the selected features in parallel for each subset, and then averages these estimates. The algorithm is simple, involves minimal communication, scales efficiently in sample size, and has theoretical guarantees. I provide extensive experiments to show excellent performance in feature selection, estimation, prediction, and computation time relative to the usual competitors.
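The four steps of the message algorithm can be sketched as follows. This is a minimal single-process illustration under stated assumptions: the lasso (solved here by a small coordinate-descent routine) stands in for the regularized regression, ordinary least squares is used for the per-subset coefficient estimates, rows are split in contiguous blocks, and a tie in the median counts as inclusion. The names `lasso_cd`, `message_fit`, `n_subsets`, and `lam` are hypothetical and do not come from the thesis.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=100):
    """Lasso via cyclic coordinate descent on (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = np.asarray(y, dtype=float).copy()  # equals y - X @ beta while beta = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (feature j removed).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def message_fit(X, y, n_subsets=5, lam=0.2):
    """Sketch of median-selection subset aggregation (assumed details, see lead-in)."""
    n, p = X.shape
    # Partition the rows; in practice one would shuffle them first.
    parts = np.array_split(np.arange(n), n_subsets)
    # Step 1: feature selection on each subset (parallelizable).
    inclusion = np.array([lasso_cd(X[idx], y[idx], lam) != 0.0 for idx in parts])
    # Step 2: 'median' inclusion index, i.e. keep features chosen by at least half.
    selected = np.median(inclusion.astype(float), axis=0) >= 0.5
    beta = np.zeros(p)
    if selected.any():
        # Step 3: least-squares coefficients for the selected features, per subset.
        coefs = [np.linalg.lstsq(X[idx][:, selected], y[idx], rcond=None)[0]
                 for idx in parts]
        # Step 4: average the per-subset estimates.
        beta[selected] = np.mean(coefs, axis=0)
    return selected, beta
```

Only the binary inclusion vectors and the short coefficient vectors would need to be exchanged between workers, which is where the low communication cost comes from.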
While sample space partitioning is useful for handling datasets with a large sample size, feature space partitioning is more effective when the data dimension is high. Existing methods for partitioning features, however, are either vulnerable to high correlations or inefficient in reducing the model dimension. In this thesis, I propose a new embarrassingly parallel framework named DECO for distributed variable selection and parameter estimation. In DECO, variables are first partitioned and allocated to m distributed workers. The decorrelated subset data within each worker are then fitted via any algorithm designed for high-dimensional problems. We show that by incorporating the decorrelation step, DECO can achieve consistent variable selection and parameter estimation on each subset with (almost) no assumptions. In addition, the convergence rate is nearly minimax optimal for both sparse and weakly sparse models and does not depend on the partition number m. Extensive numerical experiments are provided to illustrate the performance of the new framework.
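A rough sketch of the decorrelate-then-partition idea follows. The abstract does not give the form of the decorrelation step; the sketch assumes it is a whitening of the rows by F = sqrt(p) (X X^T + r I)^(-1/2), applied to both the design matrix and the response. The lasso solver (a plain proximal-gradient loop), the contiguous feature blocks, and the names `lasso_ista`, `decorrelate`, `deco_fit`, `r`, and `lam` are likewise illustrative assumptions, and any further refinement stage is omitted.

```python
import numpy as np


def lasso_ista(X, y, lam, n_iter=500):
    """Lasso via proximal gradient (ISTA) on (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n = X.shape[0]
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = beta - step * (X.T @ (X @ beta - y)) / n
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta


def decorrelate(X, y, r=1e-8):
    """Assumed decorrelation: premultiply by F = sqrt(p) * (X X^T + r I)^(-1/2)."""
    n, p = X.shape
    vals, vecs = np.linalg.eigh(X @ X.T + r * np.eye(n))
    F = np.sqrt(p) * (vecs * vals ** -0.5) @ vecs.T
    return F @ X, F @ y


def deco_fit(X, y, m=4, lam=0.3, fit=lasso_ista):
    """Decorrelate once, split the features into m blocks, fit each block alone."""
    Xt, yt = decorrelate(X, y)
    blocks = np.array_split(np.arange(X.shape[1]), m)
    beta = np.zeros(X.shape[1])
    for cols in blocks:  # each iteration could run on a separate worker
        beta[cols] = fit(Xt[:, cols], yt, lam)
    return beta
```

The `fit` argument is left pluggable because the abstract states that any algorithm designed for high-dimensional problems can be used within each worker.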
For datasets with both large sample sizes and high dimensionality, I propose a new "divide-and-conquer" framework, DEME (DECO-message), which leverages both the DECO and the message algorithms. The new framework first partitions the dataset in the sample space into row cubes using message and then partitions the feature space of the cubes using DECO. This procedure is equivalent to partitioning the original data matrix into multiple small blocks, each of a feasible size that can be stored and fitted on a computer in parallel. The results are then synthesized via the DECO and message algorithms in reverse order to produce the final output. The whole framework is extremely scalable.