881 results for Geographic Regression Discontinuity
Abstract:
We develop fast fitting methods for generalized functional linear models. An undersmooth of the functional predictor is obtained by projecting on a large number of smooth eigenvectors and the coefficient function is estimated using penalized spline regression. Our method can be applied to many functional data designs including functions measured with and without error, sparsely or densely sampled. The methods also extend to the case of multiple functional predictors or functional predictors with a natural multilevel structure. Our approach can be implemented using standard mixed effects software and is computationally fast. Our methodology is motivated by a diffusion tensor imaging (DTI) study. The aim of this study is to analyze differences between various cerebral white matter tract property measurements of multiple sclerosis (MS) patients and controls. While the statistical developments proposed here were motivated by the DTI study, the methodology is designed and presented in generality and is applicable to many other areas of scientific research. An online appendix provides R implementations of all simulations.
Abstract:
Background mortality is an essential component of any forest growth and yield model. Forecasts of mortality contribute largely to the variability and accuracy of model predictions at the tree, stand and forest level. In the present study, I implement and evaluate state-of-the-art techniques to increase the accuracy of individual tree mortality models, similar to those used in many of the current variants of the Forest Vegetation Simulator, using data from North Idaho and Montana. The first technique addresses methods to correct for bias induced by measurement error typically present in competition variables. The second implements survival regression and evaluates its performance against the traditional logistic regression approach. I selected the regression calibration (RC) algorithm as a good candidate for addressing the measurement error problem. Two logistic regression models were fitted for each species: one ignoring the measurement error, which is the “naïve” approach, and the other applying RC. The models fitted with RC outperformed the naïve models in terms of discrimination when the competition variable was found to be statistically significant. The effect of RC was more obvious where measurement error variance was large and for more shade-intolerant species. The process of model fitting and variable selection revealed that past emphasis on DBH as a predictor variable for mortality, while producing models with strong metrics of fit, may make models less generalizable. The evaluation of the error variance estimator developed by Stage and Wykoff (1998), which is core to the implementation of RC, under different spatial patterns and diameter distributions revealed that the Stage and Wykoff estimate notably overestimated the true variance in all simulated stands except those that are clustered. Results show a systematic bias even when all the assumptions made by the authors are guaranteed.
I argue that this is the result of the Poisson-based estimate ignoring the overlapping area of potential plots around a tree. The effects of the variance estimate, especially in the application phase, justify future efforts to improve its accuracy. The second technique implemented and evaluated is a survival regression model that accounts for the time-dependent nature of variables, such as diameter and competition variables, and the interval-censored nature of data collected from remeasured plots. The performance of the model is compared with the traditional logistic regression model as a tool to predict individual tree mortality. Validation of both approaches shows that the survival regression approach discriminates better between dead and alive trees for all species. In conclusion, I showed that the proposed techniques do increase the accuracy of individual tree mortality models, and are a promising first step towards the next generation of background mortality models. I have also identified the next steps to undertake in order to advance mortality models further.
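As a rough illustration of the regression calibration idea mentioned in the abstract above, the sketch below replaces each error-prone measurement with its best linear prediction of the true value under a classical additive error model. The function name `regression_calibrate`, the toy numbers, and the assumption that the error variance is supplied directly are all hypothetical; the thesis's own variance estimator and logistic models are not reproduced here.

```python
from statistics import mean, variance

def regression_calibrate(observed, error_variance):
    """Shrink error-prone measurements toward their mean.

    Assumes the classical model W = X + U, with U independent of X and
    Var(U) = error_variance known. The calibrated value is
    mean(W) + lambda * (W - mean(W)), where
    lambda = (Var(W) - Var(U)) / Var(W) is the reliability ratio.
    """
    mu = mean(observed)
    var_w = variance(observed)  # sample variance of the observed values
    if var_w == 0:
        raise ValueError("observed values have zero variance")
    # Clamp at zero in case the supplied error variance exceeds Var(W).
    reliability = max(0.0, (var_w - error_variance) / var_w)
    return [mu + reliability * (w - mu) for w in observed]

# Made-up competition-variable measurements (not data from the study).
observed = [10.0, 12.0, 14.0, 16.0, 18.0]
calibrated = regression_calibrate(observed, error_variance=2.5)
```

The calibrated values would then stand in for the raw measurements as the predictor in the mortality model; a larger error variance gives a smaller reliability ratio and therefore stronger shrinkage toward the mean.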
Abstract:
A post-classification change detection technique based on a hybrid classification approach (unsupervised and supervised) was applied to Landsat Thematic Mapper (TM), Landsat Enhanced Thematic Mapper Plus (ETM+), and ASTER images acquired in 1987, 2000 and 2004, respectively, to map land use/cover changes in the Pic Macaya National Park in the southern region of Haiti. Each image was classified individually into six land use/cover classes (built-up, agriculture, herbaceous, open pine forest, mixed forest, and barren land) using unsupervised ISODATA and maximum likelihood supervised classifiers with the aid of ground truth data collected in the field. Ground truth information, collected in the field in December 2007 and including equalized stratified random points that were visually interpreted, was used to assess the accuracy of the classification results. The overall accuracy of the land classification for each image was 82% (1987), 82% (2000), and 87% (2004). A post-classification change detection technique was used to produce change images for 1987 to 2000, 1987 to 2004, and 2000 to 2004. It was found that significant changes in the land use/cover occurred over the 17-year period. The results showed increases in built-up (from 10% to 17%) and herbaceous (from 5% to 14%) areas between 1987 and 2004. The increase in herbaceous area was mostly caused by the abandonment of exhausted agricultural lands. At the same time, open pine forest and mixed forest areas lost 75% and 83% of their area, respectively, to other land use/cover types. Open pine forest (from 20% to 14%) and mixed forest (from 18% to 12%) were transformed into agricultural area or barren land. This study illustrated the continuing deforestation, land degradation and soil erosion in the region, which in turn is leading to a decrease in vegetative cover.
The study also showed the importance of Remote Sensing (RS) and Geographic Information System (GIS) technologies to estimate timely changes in the land use/cover, and to evaluate their causes in order to design an ecological based management plan for the park.
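Post-classification change detection, as described in the abstract above, compares two independently classified maps pixel by pixel. A minimal sketch of that comparison follows; the function names, class labels, and 3x3 toy maps are illustrative assumptions, not data or code from the study.

```python
from collections import Counter

def change_matrix(before, after):
    """Cross-tabulate two classified maps pixel by pixel.

    `before` and `after` are equally sized 2-D lists of class labels.
    Returns a Counter mapping (class_before, class_after) -> pixel count.
    """
    if len(before) != len(after):
        raise ValueError("maps must have the same number of rows")
    counts = Counter()
    for row_b, row_a in zip(before, after):
        if len(row_b) != len(row_a):
            raise ValueError("rows must have the same length")
        counts.update(zip(row_b, row_a))
    return counts

def class_share(class_map, label):
    """Fraction of pixels in `class_map` that carry `label`."""
    total = sum(len(row) for row in class_map)
    hits = sum(row.count(label) for row in class_map)
    return hits / total

# Toy maps with made-up labels (hypothetical, for illustration only).
map_1987 = [["pine", "pine", "mixed"],
            ["pine", "agri", "mixed"],
            ["agri", "agri", "barren"]]
map_2004 = [["pine", "agri", "agri"],
            ["herb", "herb", "mixed"],
            ["herb", "agri", "barren"]]

changes = change_matrix(map_1987, map_2004)
```

Diagonal entries such as `("pine", "pine")` count unchanged pixels, while off-diagonal entries such as `("agri", "herb")` count transitions between classes. Because each map is classified separately, errors in either classification carry over into the change matrix.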
Abstract:
To investigate the appearance of geographic atrophy in high-resolution optical coherence tomography (OCT) images, the fundus autofluorescence (FAF) pattern, and infrared images simultaneously recorded with a novel combined OCT-scanning laser ophthalmoscopy (SLO) system.
Abstract:
In this thesis, we consider Bayesian inference for variance change-point models with scale mixtures of normal (SMN) distributions. This class of distributions is symmetric and thick-tailed and includes as special cases the Gaussian, Student-t, contaminated normal, and slash distributions. The proposed models provide greater flexibility for analyzing practical data, which are often heavy-tailed and may not satisfy the normality assumption. For the Bayesian analysis, we specify prior distributions for the unknown parameters in the variance change-point models with SMN distributions. Due to the complexity of the joint posterior distribution, we propose an efficient Gibbs-type sampling algorithm with Metropolis-Hastings steps for posterior inference. Thereafter, following the idea of [1], we consider the problems of single and multiple change-point detection. The performance of the proposed procedures is illustrated and analyzed through simulation studies. A real application to closing price data from the U.S. stock market is analyzed for illustrative purposes.
Abstract:
This morning Dr. Battle will introduce descriptive statistics and linear regression and how to apply these concepts in mathematical modeling. You will also learn how to use a spreadsheet to help with statistical analysis and to create graphs.
Abstract:
OBJECTIVES: This paper is concerned with checking goodness-of-fit of binary logistic regression models. For the practitioners of data analysis, the broad classes of procedures for checking goodness-of-fit available in the literature are described. The challenges of model checking in the context of binary logistic regression are reviewed. As a viable solution, a simple graphical procedure for checking goodness-of-fit is proposed. METHODS: The graphical procedure proposed relies on pieces of information available from any logistic analysis; the focus is on combining and presenting these in an informative way. RESULTS: The information gained using this approach is presented with three examples. In the discussion, the proposed method is put into context and compared with other graphical procedures for checking goodness-of-fit of binary logistic models available in the literature. CONCLUSION: A simple graphical method can significantly improve the understanding of any logistic regression analysis and help to prevent faulty conclusions.
Abstract:
Background: Access to health care can be described along four dimensions: geographic accessibility, availability, financial accessibility and acceptability. Geographic accessibility measures how physically accessible resources are for the population, while availability reflects what resources are available and in what amount. Combining these two types of measure into a single index provides a measure of geographic (or spatial) coverage, which is an important measure for assessing the degree of accessibility of a health care network. Results: This paper describes the latest version of AccessMod, an extension to the Geographical Information System ArcView 3.x, and provides an example of application of this tool. AccessMod 3 allows one to compute geographic coverage to health care using terrain information and population distribution. Four major types of analysis are available in AccessMod: (1) modeling the coverage of catchment areas linked to an existing health facility network based on travel time, to provide a measure of physical accessibility to health care; (2) modeling geographic coverage according to the availability of services; (3) projecting the coverage of a scaling-up of an existing network; (4) providing information for cost-effectiveness analysis when little information about the existing network is available. In addition to integrating travelling time, population distribution and the population coverage capacity specific to each health facility in the network, AccessMod can incorporate the influence of landscape components (e.g. topography, river and road networks, vegetation) that impact travelling time to and from facilities. Topographical constraints can be taken into account through an anisotropic analysis that considers the direction of movement. We provide an example of the application of AccessMod in the southern part of Malawi that shows the influences of the landscape constraints and of the modes of transportation on geographic coverage.
Conclusion: By incorporating the demand (population) and the supply (capacities of health care centers), AccessMod provides a unifying tool to efficiently assess the geographic coverage of a network of health care facilities. This tool should be of particular interest to developing countries that have relatively good geographic information on population distribution, terrain, and health facility locations.
Abstract:
INTRODUCTION: It is unclear to which level mean arterial blood pressure (MAP) should be increased during septic shock in order to improve outcome. In this study we investigated the association between MAP values of 70 mmHg or higher, vasopressor load, 28-day mortality and disease-related events in septic shock. METHODS: This is a post hoc analysis of data of the control group of a multicenter trial and includes 290 septic shock patients in whom a mean MAP ≥ 70 mmHg could be maintained during shock. Demographic and clinical data, MAP, vasopressor requirements during the shock period, disease-related events and 28-day mortality were documented. Logistic regression models adjusted for the geographic region of the study center, age, presence of chronic arterial hypertension, simplified acute physiology score (SAPS) II and the mean vasopressor load during the shock period were calculated to investigate the association between MAP or MAP quartiles ≥ 70 mmHg and mortality or the frequency and occurrence of disease-related events. RESULTS: There was no association between MAP or MAP quartiles and mortality or the occurrence of disease-related events. These associations were not influenced by age or pre-existent arterial hypertension (all P > 0.05). The mean vasopressor load was associated with mortality (relative risk (RR), 1.83; confidence interval (CI) 95%, 1.4-2.38; P < 0.001), the number of disease-related events (P < 0.001) and the occurrence of acute circulatory failure (RR, 1.64; CI 95%, 1.28-2.11; P < 0.001), metabolic acidosis (RR, 1.79; CI 95%, 1.38-2.32; P < 0.001), renal failure (RR, 1.49; CI 95%, 1.17-1.89; P = 0.001) and thrombocytopenia (RR, 1.33; CI 95%, 1.06-1.68; P = 0.01). CONCLUSIONS: MAP levels of 70 mmHg or higher do not appear to be associated with improved survival in septic shock. Elevating MAP >70 mmHg by augmenting vasopressor dosages may increase mortality.
Future trials are needed to identify the lowest acceptable MAP level to ensure tissue perfusion and avoid unnecessarily high catecholamine infusions.
Abstract:
A combinatorial protocol (CP) is introduced here and interfaced with multiple linear regression (MLR) for variable selection. The efficiency of CP-MLR rests primarily on restricting the entry of correlated variables into the model development stage. It has been used for the analysis of the Selwood et al. data set [16], and the models obtained are compared with those reported from the GFA [8] and MUSEUM [9] approaches. For this data set, CP-MLR identified three highly independent models (27, 28 and 31) with Q2 values in the range 0.632-0.518. These models are also divergent and unique. Even though the present study does not share any models with the GFA [8] and MUSEUM [9] results, several descriptors are common to all these studies, including the present one. A simulation is also carried out on the same data set to explain model formation in CP-MLR. The results demonstrate that the proposed method should be able to offer solutions for data sets with 50 to 60 descriptors in a reasonable time frame. By carefully selecting the inter-parameter correlation cutoff values in CP-MLR, one can identify divergent models and handle data sets larger than the present one without excessive computer time.
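The correlation restriction described in the abstract above can be pictured as a filter over descriptor combinations: a subset goes forward to regression only if no pair of its descriptors is correlated beyond a cutoff. The sketch below shows only that filtering step under stated assumptions; the names `candidate_subsets` and `pearson`, the cutoff of 0.8, and the toy descriptor values are hypothetical, and the protocol's other stages and the MLR fit itself are not reproduced.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def candidate_subsets(descriptors, size, cutoff):
    """Yield descriptor-name tuples of length `size` in which every
    pairwise absolute correlation is below `cutoff`."""
    for names in combinations(sorted(descriptors), size):
        if all(abs(pearson(descriptors[a], descriptors[b])) < cutoff
               for a, b in combinations(names, 2)):
            yield names

# Made-up descriptor columns (hypothetical, not the Selwood data).
descriptors = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d2": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly collinear with d1
    "d3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
subsets = list(candidate_subsets(descriptors, size=2, cutoff=0.8))
```

Here the pair `("d1", "d2")` is rejected because the two columns are almost perfectly correlated, so only the two pairs that include `d3` would be passed on to regression. Lowering the cutoff prunes more combinations, which is one way such a filter can keep the search tractable as the number of descriptors grows.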