889 results for Heterogeneous regression


Relevance:

20.00% 20.00%

Publisher:

Abstract:

We propose a novel class of models for functional data exhibiting skewness or other shape characteristics that vary with spatial or temporal location. We use copulas so that the marginal distributions and the dependence structure can be modeled independently. Dependence is modeled with a Gaussian or t-copula, so that there is an underlying latent Gaussian process. We model the marginal distributions using the skew t family. The mean, variance, and shape parameters are modeled nonparametrically as functions of location. A computationally tractable inferential framework for estimating heterogeneous asymmetric or heavy-tailed marginal distributions is introduced. This framework provides a new set of tools for increasingly complex data collected in medical and public health studies. Our methods were motivated by and are illustrated with a state-of-the-art study of neuronal tracts in multiple sclerosis patients and healthy controls. Using the tools we have developed, we were able to find those locations along the tract most affected by the disease. However, our methods are general and highly relevant to many functional data sets. In addition to the application to one-dimensional tract profiles illustrated here, higher-dimensional extensions of the methodology could have direct applications to other biological data including functional and structural MRI.
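
As a rough illustration of how a Gaussian copula lets the marginals and the dependence be specified separately, the sketch below draws correlated latent normals, maps them to uniforms, and then applies an arbitrary inverse CDF to each coordinate. It is not the authors' model: exponential marginals stand in for the skew t family (which the Python standard library does not provide), and all function names are hypothetical.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def exponential_inv_cdf(rate):
    """Inverse CDF of an exponential marginal (a stand-in for a skewed marginal)."""
    return lambda u: -math.log(1.0 - u) / rate


def sample_gaussian_copula(n, rho, inv_cdf_a, inv_cdf_b, seed=0):
    """Draw n pairs whose dependence comes from a bivariate Gaussian copula
    with latent correlation rho and whose marginals are set by the inverse CDFs."""
    rng = random.Random(seed)
    eps = 1e-12  # keep uniforms strictly inside (0, 1)
    pairs = []
    for _ in range(n):
        # Latent Gaussian pair with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Probability integral transform: each coordinate becomes uniform.
        u1 = min(max(normal_cdf(z1), eps), 1.0 - eps)
        u2 = min(max(normal_cdf(z2), eps), 1.0 - eps)
        # Marginals are imposed independently of the dependence structure.
        pairs.append((inv_cdf_a(u1), inv_cdf_b(u2)))
    return pairs


pairs = sample_gaussian_copula(2000, 0.8, exponential_inv_cdf(1.0), exponential_inv_cdf(0.5))
```

Swapping either inverse CDF changes that marginal without touching the latent correlation, which is the separation the abstract describes.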

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Latent class regression models are useful tools for assessing associations between covariates and latent variables. However, evaluation of key model assumptions cannot be performed using methods from standard regression models due to the unobserved nature of latent outcome variables. This paper presents graphical diagnostic tools to evaluate whether or not latent class regression models adhere to standard assumptions of the model: conditional independence and non-differential measurement. An integral part of these methods is the use of a Markov Chain Monte Carlo estimation procedure. Unlike standard maximum likelihood implementations for latent class regression model estimation, the MCMC approach allows us to calculate posterior distributions and point estimates of any functions of parameters. It is this convenience that allows us to provide the diagnostic methods that we introduce. As a motivating example we present an analysis focusing on the association between depression and socioeconomic status, using data from the Epidemiologic Catchment Area study. We consider a latent class regression analysis investigating the association between depression and socioeconomic status measures, where the latent variable depression is regressed on education and income indicators, in addition to age, gender, and marital status variables. While the fitted latent class regression model yields interesting results, the model parameters are found to be invalid due to the violation of model assumptions. The violation of these assumptions is clearly identified by the presented diagnostic plots. These methods can be applied to standard latent class and latent class regression models, and the general principle can be extended to evaluate model assumptions in other types of models.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We develop fast fitting methods for generalized functional linear models. An undersmooth of the functional predictor is obtained by projecting on a large number of smooth eigenvectors and the coefficient function is estimated using penalized spline regression. Our method can be applied to many functional data designs including functions measured with and without error, sparsely or densely sampled. The methods also extend to the case of multiple functional predictors or functional predictors with a natural multilevel structure. Our approach can be implemented using standard mixed effects software and is computationally fast. Our methodology is motivated by a diffusion tensor imaging (DTI) study. The aim of this study is to analyze differences between various cerebral white matter tract property measurements of multiple sclerosis (MS) patients and controls. While the statistical developments proposed here were motivated by the DTI study, the methodology is designed and presented in generality and is applicable to many other areas of scientific research. An online appendix provides R implementations of all simulations.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

BACKGROUND: Mannose-binding lectin (MBL) and MBL-associated serine protease-2 (MASP-2) are key components of the lectin pathway of complement activation. Their serum concentrations show a wide interindividual variability. This study investigated whether the concentration of MBL and MASP-2 is associated with prognosis in pediatric patients with cancer. METHODS: In this retrospective multicenter study, MBL and MASP-2 were measured by commercially available ELISA in frozen remnants of serum taken at diagnosis. Associations of overall survival (OS) and event-free survival (EFS) with MBL and MASP-2 were assessed by multivariate Cox regression accounting for prognostically relevant clinical variables. RESULTS: In the 372 patients studied, median serum concentration of MBL was 2,808 µg/L (range, 2-10,060) and 391 µg/L (46-2,771) for MASP-2. The estimated 4-year EFS was 0.60 (OS, 0.78). In the entire, heterogeneous sample, MBL and MASP-2 were not significantly associated with OS or EFS. In patients with hematologic malignancies, however, higher MASP-2 was associated with better EFS in a significant and clinically relevant way (hazard ratio per tenfold increase (HR), 0.22; 95% CI, 0.09-0.54; P = 0.001). This was due to patients with lymphoma (HR, 0.11; 95% CI, 0.03-0.47; P = 0.003), but less for those with acute leukemia (HR, 0.35; 95% CI, 0.11-1.15; P = 0.083). CONCLUSION: In this study, higher MASP-2 was associated with better EFS in pediatric patients with hematologic malignancies, especially lymphoma. Whether MASP-2 is an independent prognostic factor affecting risk stratification and anticancer therapy needs to be assessed in prospective, disease-specific studies.
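
To make the "hazard ratio per tenfold increase" concrete, the sketch below back-calculates the Cox coefficient, its standard error, and a two-sided Wald P value from a reported HR and 95% CI, and rescales the HR to other fold changes. It assumes a Wald-type interval on the log scale and a log10-transformed covariate; neither is stated in the abstract, and the function names are hypothetical. Because the inputs are rounded, the results can only approximate the published values.

```python
import math


def wald_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover (beta, se, z, p) from a hazard ratio and its 95% CI,
    assuming the CI is exp(beta +/- z_crit * se)."""
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return beta, se, z, p


def hr_for_fold_change(hr_per_tenfold, fold):
    """Rescale an HR per tenfold increase to an arbitrary fold increase,
    assuming the covariate entered the model as log10(concentration)."""
    return hr_per_tenfold ** math.log10(fold)


# Hematologic malignancies: HR 0.22 (95% CI 0.09-0.54) as reported above.
beta, se, z, p = wald_from_hr(0.22, 0.09, 0.54)
hr_doubling = hr_for_fold_change(0.22, 2.0)
```

Under these assumptions the recovered P value for the hematologic subgroup should land near the reported P = 0.001, and `hr_doubling` expresses the same effect per doubling of MASP-2 rather than per tenfold increase.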

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background mortality is an essential component of any forest growth and yield model. Forecasts of mortality contribute largely to the variability and accuracy of model predictions at the tree, stand and forest level. In the present study, I implement and evaluate state-of-the-art techniques to increase the accuracy of individual tree mortality models, similar to those used in many of the current variants of the Forest Vegetation Simulator, using data from North Idaho and Montana. The first technique addresses methods to correct for bias induced by measurement error typically present in competition variables. The second implements survival regression and evaluates its performance against the traditional logistic regression approach. I selected the regression calibration (RC) algorithm as a good candidate for addressing the measurement error problem. Two logistic regression models for each species were fitted, one ignoring the measurement error, which is the “naïve” approach, and the other applying RC. The models fitted with RC outperformed the naïve models in terms of discrimination when the competition variable was found to be statistically significant. The effect of RC was more obvious where measurement error variance was large and for more shade-intolerant species. The process of model fitting and variable selection revealed that past emphasis on DBH as a predictor variable for mortality, while producing models with strong metrics of fit, may make models less generalizable. The evaluation of the error variance estimator developed by Stage and Wykoff (1998), which is core to the implementation of RC, in different spatial patterns and diameter distributions revealed that the Stage and Wykoff estimate notably overestimated the true variance in all simulated stands except those that are clustered. Results show a systematic bias even when all the assumptions made by the authors are guaranteed.
I argue that this is the result of the Poisson-based estimate ignoring the overlapping area of potential plots around a tree. Effects, especially in the application phase, of the variance estimate justify suggested future efforts of improving the accuracy of the variance estimate. The second technique implemented and evaluated is a survival regression model that accounts for the time dependent nature of variables, such as diameter and competition variables, and the interval-censored nature of data collected from remeasured plots. The performance of the model is compared with the traditional logistic regression model as a tool to predict individual tree mortality. Validation of both approaches shows that the survival regression approach discriminates better between dead and alive trees for all species. In conclusion, I showed that the proposed techniques do increase the accuracy of individual tree mortality models, and are a promising first step towards the next generation of background mortality models. I have also identified the next steps to undertake in order to advance mortality models further.
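
A minimal sketch of the regression calibration step, assuming the classical additive error model W = X + U with a known error variance (in the thesis this variance would come from an estimator such as Stage and Wykoff's; here it is simply passed in). Each error-prone competition value is replaced by the linear approximation of E[X | W] before the mortality model is fitted. The function name and the example numbers are hypothetical.

```python
def regression_calibration(w_values, error_variance):
    """Shrink error-prone measurements toward their mean.

    Under W = X + U with Var(U) = error_variance, the best linear
    predictor of X given W is mean_w + reliability * (W - mean_w),
    where reliability = Var(X) / Var(W).
    """
    n = len(w_values)
    mean_w = sum(w_values) / n
    var_w = sum((w - mean_w) ** 2 for w in w_values) / (n - 1)
    # Var(X) = Var(W) - Var(U); truncate at zero if the error variance is overstated.
    var_x = max(var_w - error_variance, 0.0)
    reliability = var_x / var_w if var_w > 0 else 0.0
    return [mean_w + reliability * (w - mean_w) for w in w_values]


# Illustrative values only: sample variance is 10, so an error variance of 5
# gives reliability 0.5 and halves every deviation from the mean.
calibrated = regression_calibration([0.0, 2.0, 4.0, 6.0, 8.0], error_variance=5.0)
```

An overestimated error variance lowers the reliability ratio and shrinks the calibrated values too strongly, which is one way a biased variance estimate could carry through to the application phase.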

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The primary challenge in groundwater and contaminant transport modeling is obtaining the data needed for constructing, calibrating and testing the models. Large amounts of data are necessary for describing the hydrostratigraphy in areas with complex geology. Increasingly, states are making spatial data available that can be used for input to groundwater flow models. The appropriateness of these data for large-scale flow systems has not been tested. This study focuses on modeling a plume of 1,4-dioxane in a heterogeneous aquifer system in Scio Township, Washtenaw County, Michigan. The analysis consisted of: (1) characterization of hydrogeology of the area and construction of a conceptual model based on publicly available spatial data, (2) development and calibration of a regional flow model for the site, (3) conversion of the regional model to a more highly resolved local model, (4) simulation of the dioxane plume, and (5) evaluation of the model's ability to simulate field data and estimation of the possible dioxane sources and subsequent migration until maximum concentrations are at or below the Michigan Department of Environmental Quality's residential cleanup standard for groundwater (85 ppb). MODFLOW-2000 and MT3D programs were utilized to simulate the groundwater flow and the development and movement of the 1,4-dioxane plume, respectively. MODFLOW simulates transient groundwater flow in a quasi-3-dimensional sense, subject to a variety of boundary conditions that can simulate recharge, pumping, and surface-/groundwater interactions. MT3D simulates solute advection with groundwater flow (using the flow solution from MODFLOW), dispersion, source/sink mixing, and chemical reaction of contaminants. This modeling approach was successful at simulating the groundwater flows by calibrating recharge and hydraulic conductivities.
The plume transport was adequately simulated using literature dispersivity and sorption coefficients, although the plume geometries were not well constrained.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Heterogeneous materials are ubiquitous in nature and as synthetic materials. These materials provide a unique combination of desirable mechanical properties emerging from their heterogeneities at different length scales. Future structural and technological applications will require the development of advanced lightweight materials with superior strength and toughness. Cost-effective design of advanced high-performance synthetic materials by tailoring their microstructure is the challenge facing the materials design community. Prior knowledge of structure-property relationships for these materials is imperative for optimal design. Thus, understanding such relationships for heterogeneous materials is of primary interest. Furthermore, computational burden is becoming a critical concern in several areas of heterogeneous materials design. Therefore, computationally efficient and accurate predictive tools are essential. In the present study, we mainly focus on the mechanical behavior of soft cellular materials and tough biological materials such as mussel byssus thread. Cellular materials exhibit microstructural heterogeneity through an interconnected network of the same material phase. Mussel byssus thread, however, comprises two distinct material phases. A robust numerical framework is developed to investigate the micromechanisms behind the macroscopic response of both of these materials. Using this framework, the effect of microstructural parameters on the stress state of cellular specimens during the split Hopkinson pressure bar test has been addressed. A Voronoi tessellation based algorithm has been developed to simulate the cellular microstructure. Micromechanisms (microinertia, microbuckling and microbending) governing the macroscopic behavior of cellular solids are investigated thoroughly with respect to various microstructural and loading parameters.
To understand the origin of the high toughness of mussel byssus thread, a Genetic Algorithm (GA) based optimization framework has been developed. It is found that the two different material phases (collagens) of mussel byssus thread are optimally distributed along the thread. These applications demonstrate that the presence of heterogeneity in the system demands high computational resources for simulation and modeling. Thus, a Higher Dimensional Model Representation (HDMR) based surrogate modeling concept has been proposed to reduce computational complexity. The applicability of this methodology has been demonstrated in failure envelope construction and in multiscale finite element techniques. It is observed that a surrogate-based model can capture the behavior of complex material systems with sufficient accuracy. The computational algorithms presented in this thesis will further pave the way for accurate prediction of the macroscopic deformation behavior of various classes of advanced materials from their measurable microstructural features at a reasonable computational cost.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

While nucleation of solids in supercooled liquids is ubiquitous [15, 65, 66], surface crystallization, the tendency for freezing to begin preferentially at the liquid-gas interface, has remained puzzling [74, 18, 68, 69, 51, 64, 72, 16]. Here we employ high-speed imaging of supercooled water drops to study the phenomenon of heterogeneous surface crystallization. Our geometry avoids the "point-like contact" of prior experiments by providing a simple, symmetric contact line (triple line defined by the substrate-liquid-air interface) for a drop resting on a homogeneous silicon substrate. We examine three possible mechanisms that might explain these laboratory observations: (i) Line Tension at the triple line, (ii) Thermal Gradients within the droplets and (iii) Surface Texture. In our first study we record nearly perfect spatial uniformity in the immersed (liquid-substrate) region and, thereby, no preference for nucleation at the triple line. In our second study, no influence of thermal gradients on the preference for freezing at the triple line was observed. Motivated by the conjectured importance of line tension (τ) [1, 66] for heterogeneous nucleation, we also searched for evidence of a transition to surface crystallization at length scales on the order of δ ∼ τ/σ, where σ is the surface tension [14]; poorly constrained τ [49] leads to δ ranging from microns to nanometers. We demonstrate that nano-scale texture causes a shift in the nucleation to the three-phase contact line, while micro-scale texture does not. The possibility of a critical length scale has implications for the effectiveness of nucleation catalysts, including formation of ice in atmospheric clouds [7].

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this thesis, we consider Bayesian inference for variance change-point models with scale mixtures of normal (SMN) distributions. This class of distributions is symmetric and thick-tailed and includes as special cases the Gaussian, Student-t, contaminated normal, and slash distributions. The proposed models provide greater flexibility for analyzing practical data, which are often heavy-tailed and may not satisfy the normality assumption. For the Bayesian analysis, we specify prior distributions for the unknown parameters in the variance change-point models with SMN distributions. Due to the complexity of the joint posterior distribution, we propose an efficient Gibbs-type sampling algorithm with Metropolis-Hastings steps for posterior inference. Thereafter, following the idea of [1], we consider the problems of single and multiple change-point detection. The performance of the proposed procedures is illustrated and analyzed by simulation studies. A real application to closing price data from the U.S. stock market is analyzed for illustrative purposes.
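
For orientation, the sketch below shows what a single variance change-point looks like in the simplest setting. It is deliberately not the Bayesian SMN procedure of the thesis: it swaps in a plain Gaussian profile-likelihood scan, assumes the data are already centered at zero, and uses a hypothetical function name.

```python
import math


def variance_change_point(x, min_seg=2):
    """Return k, the size of the first segment, that maximizes the Gaussian
    log-likelihood when x[:k] and x[k:] have different variances.

    Assumes zero-mean data, so each segment variance is its mean square.
    """
    n = len(x)
    squares = [v * v for v in x]
    total = sum(squares)
    left = sum(squares[:min_seg - 1])  # running sum of squares for x[:k]
    best_k, best_ll = None, -math.inf
    for k in range(min_seg, n - min_seg + 1):
        left += squares[k - 1]
        var1 = left / k
        var2 = (total - left) / (n - k)
        if var1 <= 0.0 or var2 <= 0.0:
            continue  # a degenerate segment has no finite likelihood
        # Profile log-likelihood up to an additive constant.
        ll = -0.5 * (k * math.log(var1) + (n - k) * math.log(var2))
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k


# Toy series: variance 1 for the first 20 points, variance 25 afterwards.
series = [1, -1] * 10 + [5, -5] * 10
k_hat = variance_change_point(series)
```

A heavy-tailed SMN likelihood would replace the Gaussian term here, and a sampler would replace the exhaustive scan; the scan is only meant to show the quantity being located.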

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This morning Dr. Battle will introduce descriptive statistics and linear regression and how to apply these concepts in mathematical modeling. You will also learn how to use a spreadsheet to help with statistical analysis and to create graphs.
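
As a small illustration of the two topics named above, the sketch below computes descriptive statistics and an ordinary least-squares line in Python rather than a spreadsheet. The data and function names are made up for the example.

```python
import statistics


def describe(values):
    """Basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 1 + 2x.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
summary = describe(y)
slope, intercept = fit_line(x, y)
```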

Relevance:

20.00% 20.00%

Publisher:

Abstract:

As microgrid power systems gain prevalence and renewable energy comprises greater and greater portions of distributed generation, energy storage becomes important to offset the higher variance of renewable energy sources and maximize their usefulness. One of the emerging techniques is to utilize a combination of lead-acid batteries and ultracapacitors to provide both short and long-term stabilization to microgrid systems. The different energy and power characteristics of batteries and ultracapacitors imply that they ought to be utilized in different ways. Traditional linear controls can use these energy storage systems to stabilize a power grid, but cannot effect more complex interactions. This research explores a fuzzy logic approach to microgrid stabilization. The ability of a fuzzy logic controller to regulate a dc bus in the presence of source and load fluctuations, in a manner comparable to traditional linear control systems, is explored and demonstrated. Furthermore, the expanded capabilities (such as storage balancing, self-protection, and battery optimization) of a fuzzy logic system over a traditional linear control system are shown. System simulation results are presented and validated through hardware-based experiments. These experiments confirm the capabilities of the fuzzy logic control system to regulate bus voltage, balance storage elements, optimize battery usage, and effect self-protection.
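
The sketch below shows, in very reduced form, how a fuzzy rule base can combine bus regulation with a self-protection rule. It is not the controller from this research: the membership ranges, the three rules, the state-of-charge thresholds, and every name are hypothetical choices made only to illustrate the mechanism.

```python
def triangle(x, left, peak, right):
    """Triangular membership function with support (left, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_storage_command(bus_error_v, soc):
    """Map dc bus error (setpoint minus measured, in volts) and battery
    state of charge (0..1) to a storage command in [-1, 1].

    +1 means discharge at full power, -1 means charge at full power.
    All ranges below are illustrative assumptions.
    """
    e = max(-10.0, min(10.0, bus_error_v))
    bus_high = triangle(e, -20.0, -10.0, 0.0)  # error negative: bus above setpoint
    bus_ok = triangle(e, -10.0, 0.0, 10.0)
    bus_low = triangle(e, 0.0, 10.0, 20.0)     # error positive: bus sagging
    # Shoulder membership: fully "low" at 10% charge, not low at 30%.
    soc_low = max(0.0, min(1.0, (0.3 - soc) / 0.2))

    # Rules (AND = min, OR = max):
    #   bus low AND battery not low -> discharge
    #   bus ok OR (bus low AND battery low) -> hold (self-protection)
    #   bus high -> charge
    discharge = min(bus_low, 1.0 - soc_low)
    hold = max(bus_ok, min(bus_low, soc_low))
    charge = bus_high

    weight = discharge + hold + charge
    if weight == 0.0:
        return 0.0
    # Weighted average of singleton outputs +1, 0, -1.
    return (discharge - charge) / weight
```

With a healthy battery the output rises linearly with the bus error, much as a proportional controller would, while the second rule overrides that behavior when the state of charge is low, which a single linear gain cannot express.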

Relevance:

20.00% 20.00%

Publisher:

Abstract:

OBJECTIVES: This paper is concerned with checking goodness-of-fit of binary logistic regression models. For the practitioners of data analysis, the broad classes of procedures for checking goodness-of-fit available in the literature are described. The challenges of model checking in the context of binary logistic regression are reviewed. As a viable solution, a simple graphical procedure for checking goodness-of-fit is proposed. METHODS: The graphical procedure proposed relies on pieces of information available from any logistic analysis; the focus is on combining and presenting these in an informative way. RESULTS: The information gained using this approach is presented with three examples. In the discussion, the proposed method is put into context and compared with other graphical procedures for checking goodness-of-fit of binary logistic models available in the literature. CONCLUSION: A simple graphical method can significantly improve the understanding of any logistic regression analysis and help to prevent faulty conclusions.
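
The abstract does not specify the proposed graphical procedure, so the sketch below shows a generic alternative instead: a binned comparison of predicted probabilities with observed event rates, using only quantities available from any fitted logistic model. The function name and the toy inputs are hypothetical.

```python
def calibration_table(probs, outcomes, n_bins=5):
    """Group observations by predicted probability and compare the mean
    prediction with the observed event rate in each group.

    Returns a list of (mean_predicted, observed_rate, group_size) tuples,
    ordered from lowest to highest predicted probability.
    """
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # more bins than observations
        mean_p = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_p, observed, len(chunk)))
    return rows


# Toy example: fitted probabilities and 0/1 outcomes.
table = calibration_table([0.1, 0.9, 0.1, 0.9], [0, 1, 0, 1], n_bins=2)
```

Plotting observed rate against mean prediction for each row gives a simple calibration plot; points far from the diagonal suggest lack of fit in that probability range.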