921 results for Multivariate Equations
Abstract:
This paper considers a wide class of semiparametric problems with a parametric part for some covariate effects and repeated evaluations of a nonparametric function. Special cases in our approach include marginal models for longitudinal/clustered data, conditional logistic regression for matched case-control studies, multivariate measurement error models, generalized linear mixed models with a semiparametric component, and many others. We propose profile-kernel and backfitting estimation methods for these problems, derive their asymptotic distributions, and show that in likelihood problems the methods are semiparametric efficient. Although profiling and backfitting are not asymptotically equivalent in general, they are equivalent with our methods. We also consider pseudolikelihood methods where some nuisance parameters are estimated by a different algorithm. The proposed methods are evaluated using simulation studies and applied to the Kenya hemoglobin data.
Abstract:
There is an emerging interest in modeling spatially correlated survival data in biomedical and epidemiological studies. In this paper, we propose a new class of semiparametric normal transformation models for right-censored spatially correlated survival data. This class of models assumes that survival outcomes marginally follow a Cox proportional hazards model with unspecified baseline hazard, and their joint distribution is obtained by transforming survival outcomes to normal random variables, whose joint distribution is assumed to be multivariate normal with a spatial correlation structure. A key feature of the class of semiparametric normal transformation models is that it provides a rich class of spatial survival models where regression coefficients have a population-average interpretation and the spatial dependence of survival times is conveniently modeled using the transformed variables by flexible normal random fields. We study the relationship between the spatial correlation structure of the transformed normal variables and the dependence measures of the original survival times. Direct nonparametric maximum likelihood estimation in such models is practically prohibitive because of the high-dimensional intractable integration of the likelihood function and the infinite-dimensional nuisance baseline hazard parameter. We hence develop a class of spatial semiparametric estimating equations, which conveniently estimate the population-level regression coefficients and the dependence parameters simultaneously. We study the asymptotic properties of the proposed estimators, and show that they are consistent and asymptotically normal. The proposed method is illustrated with an analysis of data from the East Boston Asthma Study and its performance is evaluated using simulations.
Abstract:
Visualization and exploratory analysis are an important part of any data analysis and are made more challenging when the data are voluminous and high-dimensional. One such example is environmental monitoring data, which are often collected over time and at multiple locations, resulting in a geographically indexed multivariate time series. Financial data, although not necessarily containing a geographic component, present another source of high-volume multivariate time series data. We present the mvtsplot function, which provides a method for visualizing multivariate time series data. We outline the basic design concepts and provide some examples of its usage by applying it to a database of ambient air pollution measurements in the United States and to a hypothetical portfolio of stocks.
Abstract:
Objective. To examine effects of primary care physicians (PCPs) and patients on the association between charges for primary care and specialty care in a point-of-service (POS) health plan. Data Source. Claims from 1996 for 3,308 adult male POS plan members, each of whom was assigned to one of the 50 family practitioner-PCPs with the largest POS plan member-loads. Study Design. A hierarchical multivariate two-part model was fitted using a Gibbs sampler to estimate PCPs' effects on patients' annual charges for two types of services, primary care and specialty care, the associations among PCPs' effects, and within-patient associations between charges for the two services. Adjusted Clinical Groups (ACGs) were used to adjust for case-mix. Principal Findings. PCPs with higher case-mix adjusted rates of specialist use were less likely to see their patients at least once during the year (estimated correlation: –.40; 95% CI: –.71, –.008) and provided fewer services to patients that they saw (estimated correlation: –.53; 95% CI: –.77, –.21). Ten of 11 PCPs whose case-mix adjusted effects on primary care charges were significantly less than or greater than zero (p < .05) had estimated, case-mix adjusted effects on specialty care charges that were of opposite sign (but not significantly different than zero). After adjustment for ACG and PCP effects, the within-patient, estimated odds ratio for any use of primary care given any use of specialty care was .57 (95% CI: .45, .73). Conclusions. PCPs and patients contributed independently to a trade-off between utilization of primary care and specialty care. The trade-off appeared to partially offset significant differences in the amount of care provided by PCPs. These findings were possible because we employed a hierarchical multivariate model rather than separate univariate models.
Abstract:
Many seemingly disparate approaches for marginal modeling have been developed in recent years. We demonstrate that many current approaches for marginal modeling of correlated binary outcomes produce likelihoods that are equivalent to the copula-based models proposed herein. These general copula models of underlying latent threshold random variables yield likelihood-based models for marginal fixed effects estimation and interpretation in the analysis of correlated binary data. Moreover, we propose a nomenclature and set of model relationships that substantially elucidate the complex area of marginalized models for binary data. A diverse collection of didactic mathematical and numerical examples is given to illustrate concepts.
Abstract:
In the diagnosis of diabetic autonomic neuropathy (DAN) various autonomic tests are used. We took a novel statistical approach to find a combination of autonomic tests that best separates normal controls from patients with DAN.
Abstract:
Flammability zone boundaries are important properties for preventing explosions in the process industries. Within the boundaries, a flame or explosion can occur, so understanding them is essential to preventing fires and explosions. Very little work has been reported in the literature to model the flammability zone boundaries. Two boundaries are defined and studied: the upper flammability zone boundary and the lower flammability zone boundary. Three methods are presented to predict the upper and lower flammability zone boundaries: the linear model, the extended linear model, and an empirical model. The linear model is a thermodynamic model that uses the upper flammability limit (UFL) and lower flammability limit (LFL) to calculate two adiabatic flame temperatures. When the proper assumptions are applied, the linear model can be reduced to the well-known equation y_LOC = z y_LFL for estimation of the limiting oxygen concentration. The extended linear model attempts to account for the changes in the reactions along the UFL boundary. Finally, the empirical method fits the boundaries with linear equations between the UFL or LFL and the intercept with the oxygen axis. Comparison of the models to experimental data of the flammability zone shows that the best model for estimating the flammability zone boundaries is the empirical method. It is shown that it fits the limiting oxygen concentration (LOC), upper oxygen limit (UOL), and lower oxygen limit (LOL) quite well. The regression coefficient values for the fits to the LOC, UOL, and LOL are 0.672, 0.968, and 0.959, respectively. This is better than the fit of the "z y_LFL" method for the LOC, for which the regression coefficient is 0.416.
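As a rough illustration of the y_LOC = z y_LFL estimate mentioned above, the sketch below assumes that z is the stoichiometric oxygen coefficient (moles of O2 per mole of fuel for complete combustion), which the abstract does not define. The function names and the methane input are hypothetical, chosen only for the example; this is not the thesis's implementation.

```python
def stoichiometric_oxygen_moles(carbon, hydrogen, oxygen=0):
    """Moles of O2 needed to burn one mole of a fuel C_m H_x O_y completely.

    Assumes combustion to CO2 and H2O only: z = m + x/4 - y/2.
    """
    return carbon + hydrogen / 4 - oxygen / 2


def estimate_loc(lfl_percent, z):
    """Estimate the limiting oxygen concentration (vol %) as z * LFL.

    lfl_percent: lower flammability limit of the fuel in vol % (an input
                 the caller must supply from measured data).
    z:           stoichiometric oxygen coefficient (assumed meaning of z).
    """
    return z * lfl_percent


# Illustrative input only: methane (CH4) with an LFL taken as 5 vol %.
z_methane = stoichiometric_oxygen_moles(carbon=1, hydrogen=4)
print(estimate_loc(5.0, z_methane))
```

The abstract reports that this simple product fits measured LOC data less well than the empirical method, so the value it returns is best read as a first estimate rather than a design figure.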
Abstract:
The maximum principle is an important property of solutions to PDEs. Correspondingly, there is great interest in designing high order numerical schemes for PDEs that maintain this property. In this thesis, our particular interest is in solving convection-dominated diffusion equations. We first review a nonconventional maximum principle preserving (MPP) high order finite volume (FV) WENO scheme, and then propose a new parametrized MPP high order finite difference (FD) WENO framework, which is generalized from the one for solving hyperbolic conservation laws. A formal analysis is presented to show that a third order finite difference scheme with these parametrized MPP flux limiters maintains third order accuracy without an extra CFL constraint when the low order monotone flux is chosen appropriately. Numerical tests in both one- and two-dimensional cases are performed on the simulation of the incompressible Navier-Stokes equations in vorticity stream-function formulation and on several other problems to show the effectiveness of the proposed method.
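For orientation, the property in question can be written out for a scalar model problem. The equation and notation below are an assumed illustration, not taken from the thesis: a one-dimensional convection-diffusion equation with constant viscosity, together with the discrete bound an MPP scheme is meant to reproduce.

```latex
% Model problem (assumed form): scalar convection-diffusion in 1D
\[
  u_t + f(u)_x = \varepsilon\, u_{xx}, \qquad \varepsilon > 0, \qquad u(x,0) = u_0(x).
\]
% Maximum principle: the solution stays within the bounds of the initial data
\[
  u_m \le u(x,t) \le u_M, \qquad u_m = \min_x u_0(x), \quad u_M = \max_x u_0(x).
\]
% Discrete analogue required of an MPP scheme with grid values u_j^n
\[
  u_m \le u_j^n \le u_M \qquad \text{for all } j \text{ and } n.
\]
```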
Abstract:
An important problem in unsupervised data clustering is how to determine the number of clusters. Here we investigate how this can be achieved in an automated way by using interrelation matrices of multivariate time series. Two nonparametric and purely data driven algorithms are expounded and compared. The first exploits the eigenvalue spectra of surrogate data, while the second employs the eigenvector components of the interrelation matrix. Compared to the first algorithm, the second approach is computationally faster and not limited to linear interrelation measures.
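The general idea of comparing eigenvalue spectra against surrogate data can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it assumes the Pearson correlation matrix as the interrelation measure, independently shuffled columns as surrogates, and a plain "larger than the largest surrogate eigenvalue" rule. The function name and the synthetic data are hypothetical.

```python
import numpy as np


def count_significant_eigenvalues(data, n_surrogates=20, seed=0):
    """Count correlation-matrix eigenvalues that exceed a surrogate threshold.

    data: array of shape (n_samples, n_series), one time series per column.
    Surrogates are built by shuffling each column independently, which keeps
    each series' values but destroys the correlations between series.
    """
    rng = np.random.default_rng(seed)
    eigenvalues = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))

    surrogate_max = 0.0
    for _ in range(n_surrogates):
        shuffled = np.column_stack([rng.permutation(column) for column in data.T])
        surrogate_eigs = np.linalg.eigvalsh(np.corrcoef(shuffled, rowvar=False))
        surrogate_max = max(surrogate_max, surrogate_eigs.max())

    # Each eigenvalue above the surrogate maximum is read as one cluster.
    return int(np.sum(eigenvalues > surrogate_max))


# Synthetic example: 3 groups of 5 series, each group driven by a shared factor.
rng = np.random.default_rng(42)
factors = rng.standard_normal((2000, 3))
series = np.repeat(factors, 5, axis=1) + 0.5 * rng.standard_normal((2000, 15))
print(count_significant_eigenvalues(series))
```

With strongly correlated, well-separated groups such as these, the count would be expected to match the number of planted groups. As the abstract notes, an eigenvalue-spectrum approach of this kind is tied to linear interrelation measures.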