994 results for Distributions for Correlated Variables
Abstract:
In this paper, we propose nonlinear elliptical models for correlated data with heteroscedastic and/or autoregressive structures. Our aim is to extend the models proposed by Russo et al. [22] by considering a more sophisticated scale structure to deal with variations in data dispersion and/or a possible autocorrelation among measurements taken throughout the same experimental unit. Moreover, to avoid the possible influence of outlying observations or to take into account the non-normal symmetric tails of the data, we assume elliptical contours for the joint distribution of random effects and errors, which allows us to attribute different weights to the observations. We propose an iterative algorithm to obtain the maximum-likelihood estimates for the parameters and derive the local influence curvatures for some specific perturbation schemes. The motivation for this work comes from a pharmacokinetic indomethacin data set, which was analysed previously by Bocheng and Xuping [1] under normality.
Abstract:
This paper analyzes concepts of independence and assumptions of convexity in the theory of sets of probability distributions. The starting point is Kyburg and Pittarelli's discussion of "convex Bayesianism" (in particular their proposals concerning E-admissibility, independence, and convexity). The paper offers an organized review of the literature on independence for sets of probability distributions; new results on graphoid properties and on the justification of "strong independence" (using exchangeability) are presented. Finally, the connection between Kyburg and Pittarelli's results and recent developments on the axiomatization of non-binary preferences, and its impact on "complete" independence, are described.
Abstract:
Dinoflagellates of the genus Ceratium are chiefly marine, but they occur rarely in freshwater. In this study we analyze the invasion and progressive establishment of Ceratium furcoides, an exotic species, in the Furnas Reservoir. Samples were taken at 36 points in the reservoir during March, June, September and December 2007. Physical and chemical variables were measured simultaneously at each site. C. furcoides was recorded at 20 sites, with densities varying between 0.57 and 28,564,913.0 ind. m⁻³. Blooms of this species were recorded at points classified as mesotrophic, coinciding with places receiving large amounts of untreated domestic sewage. C. furcoides density was correlated with temperature, nutrients (nitrate and nitrite) and the electrical conductivity of the water. The highest density was recorded in June, when temperature was low. The presence of Ceratium furcoides apparently has not yet affected the reservoir's water quality or its other plankton communities. However, if it becomes fully established it could become a problem in the reservoir or even spread to other reservoirs in the Rio Grande basin.
Abstract:
The measurement of charged-particle event shape variables is presented in inclusive inelastic pp collisions at a center-of-mass energy of 7 TeV using the ATLAS detector at the LHC. The observables studied are the transverse thrust, thrust minor, and transverse sphericity, each defined using the final-state charged particles' momentum components perpendicular to the beam direction. Events with at least six charged particles are selected by a minimum-bias trigger. In addition to the differential distributions, the evolution of each event shape variable as a function of the leading charged-particle transverse momentum, charged-particle multiplicity, and summed transverse momentum is presented. Predictions from several Monte Carlo models show significant deviations from data.
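As an illustration of one of the observables named above, the following Python sketch computes transverse sphericity from charged-particle transverse momentum components. It assumes the commonly used definition S_T = 2*lam2 / (lam1 + lam2), where lam1 >= lam2 are the eigenvalues of the normalized transverse momentum tensor; the abstract itself gives no formula, and the function name and the two toy events are hypothetical.

```python
import math


def transverse_sphericity(particles):
    """Return S_T = 2*lam2 / (lam1 + lam2) for a list of (px, py) pairs.

    Assumed definition: lam1 >= lam2 are the eigenvalues of the 2x2 tensor
    sum_i [[px^2, px*py], [px*py, py^2]], normalized by sum_i pT_i^2.
    """
    norm = sum(px * px + py * py for px, py in particles)
    if norm == 0.0:
        raise ValueError("event has no transverse momentum")
    sxx = sum(px * px for px, _ in particles) / norm
    syy = sum(py * py for _, py in particles) / norm
    sxy = sum(px * py for px, py in particles) / norm
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    trace = sxx + syy
    det = sxx * syy - sxy * sxy
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    lam1 = 0.5 * (trace + disc)
    lam2 = 0.5 * (trace - disc)
    return 2.0 * lam2 / (lam1 + lam2)


# Toy events (hypothetical): a back-to-back pair and a four-fold symmetric event.
pencil = [(1.0, 0.0), (-1.0, 0.0)]
isotropic = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
```

Under this definition the back-to-back event has sphericity 0 and the symmetric event has sphericity 1, which are the two limits of the observable.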
Abstract:
In a matched experimental design, the effectiveness of matching in reducing bias and increasing power depends on the strength of the association between the matching variable and the outcome of interest. In particular, in the design of a community health intervention trial, the effectiveness of a matched design, where communities are matched according to some community characteristic, depends on the strength of the correlation between the matching characteristic and the change in the health behavior being measured. We attempt to estimate the correlation between community characteristics and changes in health behaviors in four datasets from community intervention trials and observational studies. Community characteristics that are highly correlated with changes in health behaviors would potentially be effective matching variables in studies of health intervention programs designed to change those behaviors. Among the community characteristics considered, the urban-rural character of the community was the most highly correlated with changes in health behaviors. The correlations between Per Capita Income, Percent Low Income & Percent aged over 65 and changes in health behaviors were marginally statistically significant (p < 0.08).
Abstract:
There is an emerging interest in modeling spatially correlated survival data in biomedical and epidemiological studies. In this paper, we propose a new class of semiparametric normal transformation models for right-censored, spatially correlated survival data. This class of models assumes that survival outcomes marginally follow a Cox proportional hazards model with unspecified baseline hazard, and their joint distribution is obtained by transforming survival outcomes to normal random variables, whose joint distribution is assumed to be multivariate normal with a spatial correlation structure. A key feature of this class is that it provides a rich family of spatial survival models in which regression coefficients have a population-average interpretation and the spatial dependence of survival times is conveniently modeled on the transformed variables by flexible normal random fields. We study the relationship between the spatial correlation structure of the transformed normal variables and the dependence measures of the original survival times. Direct nonparametric maximum likelihood estimation in such models is practically infeasible because of the high-dimensional intractable integration of the likelihood function and the infinite-dimensional nuisance baseline hazard parameter. We hence develop a class of spatial semiparametric estimating equations, which conveniently estimate the population-level regression coefficients and the dependence parameters simultaneously. We study the asymptotic properties of the proposed estimators and show that they are consistent and asymptotically normal. The proposed method is illustrated with an analysis of data from the East Boston Asthma Study, and its performance is evaluated using simulations.
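The transformation step described above can be sketched with a probability integral transform. The abstract does not state the exact form, so this Python sketch assumes the mapping T* = Phi^{-1}(1 - exp(-Lambda(T))), where Lambda is the marginal cumulative hazard. For simplicity it uses a hypothetical exponential marginal with rate 0.2 in place of a fitted Cox model; all names are illustrative.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def to_normal(t, cum_hazard):
    """Map a survival time to the normal scale: Phi^{-1}(1 - exp(-Lambda(t))).

    Raises StatisticsError if the marginal CDF rounds to exactly 0 or 1.
    """
    return _STD_NORMAL.inv_cdf(1.0 - math.exp(-cum_hazard(t)))


def from_normal(z, inv_cum_hazard):
    """Inverse map: t = Lambda^{-1}(-log(1 - Phi(z)))."""
    return inv_cum_hazard(-math.log(1.0 - _STD_NORMAL.cdf(z)))


RATE = 0.2  # hypothetical constant marginal hazard


def cum_hazard(t):
    return RATE * t


def inv_cum_hazard(h):
    return h / RATE


median_time = math.log(2.0) / RATE  # marginal median survival time
```

With this mapping the marginal median survival time is sent to 0 on the normal scale, and the two functions invert each other. Spatial dependence would then be imposed by giving the transformed values a multivariate normal distribution with a spatial correlation matrix.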
Abstract:
Many seemingly disparate approaches for marginal modeling have been developed in recent years. We demonstrate that many current approaches for marginal modeling of correlated binary outcomes produce likelihoods that are equivalent to the copula-based models proposed herein. These general copula models of underlying latent threshold random variables yield likelihood-based models for marginal fixed effects estimation and interpretation in the analysis of correlated binary data. Moreover, we propose a nomenclature and set of model relationships that substantially elucidates the complex area of marginalized models for binary data. A diverse collection of didactic mathematical and numerical examples is given to illustrate concepts.
Abstract:
Nuclear morphometry (NM) uses image analysis to measure features of the cell nucleus, which are classified as bulk properties, shape or form, and DNA distribution. Studies have used these measurements as diagnostic and prognostic indicators of disease, with inconclusive results. The distributional properties of these variables have not been systematically investigated, although much of the medical data exhibit non-normal distributions. Measurements are done on several hundred cells per patient, so summary measurements reflecting the underlying distribution are needed. Distributional characteristics of 34 NM variables from prostate cancer cells were investigated using graphical and analytical techniques. Cells per sample ranged from 52 to 458. A small sample of patients with benign prostatic hyperplasia (BPH), representing non-cancer cells, was used for general comparison with the cancer cells. Data transformations such as log, square root and 1/x did not yield normality as measured by the Shapiro-Wilk test for normality. A modulus transformation, used for distributions having abnormal kurtosis values, also did not produce normality. Kernel density histograms of the 34 variables exhibited non-normality, and 18 variables also exhibited bimodality. A bimodality coefficient was calculated, and 3 variables (DNA concentration, shape and elongation) showed the strongest evidence of bimodality and were studied further. Two analytical approaches were used to obtain a summary measure for each variable for each patient: cluster analysis to determine significant clusters, and a mixture model analysis using a two-component model having Gaussian distributions with equal variances. The mixture component parameters were used to bootstrap the log likelihood ratio to determine the significant number of components, 1 or 2. These summary measures were used as predictors of disease severity in several proportional odds logistic regression models. The disease severity scale had 5 levels and was constructed of 3 components: extracapsular penetration (ECP), lymph node involvement (LN+) and seminal vesicle involvement (SV+), which represent surrogate measures of prognosis. The summary measures were not strong predictors of disease severity. There was some indication from the mixture model results that there were changes in mean levels and proportions of the components in the lower severity levels.
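The abstract mentions a bimodality coefficient without giving its formula. The Python sketch below uses one common moment-based form as an assumption: BC = (skewness^2 + 1) / kurtosis, with ordinary (non-excess) kurtosis. Values above 5/9, the value for a uniform distribution, are often read as a hint of bimodality. The function name and the simulated samples are hypothetical and are not the study's data.

```python
import random


def bimodality_coefficient(values):
    """Moment-based bimodality coefficient (skew^2 + 1) / kurtosis.

    Uses population central moments and non-excess kurtosis; this is an
    assumed form, not necessarily the one used in the study.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0.0:
        raise ValueError("all values are identical")
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return (skew * skew + 1.0) / kurt


rng = random.Random(0)
# Hypothetical samples: one Gaussian, and an equal mixture of two Gaussians
# with the same variance and well-separated means.
unimodal = [rng.gauss(0.0, 1.0) for _ in range(5000)]
bimodal = [rng.choice((-3.0, 3.0)) + rng.gauss(0.0, 1.0) for _ in range(5000)]
two_point = [-1.0, 1.0] * 50  # extreme case: all mass on two values
```

For a Gaussian sample the coefficient is near 1/3. The well-separated mixture exceeds the 5/9 benchmark, and the two-point sample reaches the maximum value of 1.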
Abstract:
Theoretical models predict lognormal species abundance distributions (SADs) in stable and productive environments, with log-series SADs in less stable, dispersal driven communities. We studied patterns of relative species abundances of perennial vascular plants in global dryland communities to: (i) assess the influence of climatic and soil characteristics on the observed SADs, (ii) infer how environmental variability influences relative abundances, and (iii) evaluate how colonisation dynamics and environmental filters shape abundance distributions. We fitted lognormal and log-series SADs to 91 sites containing at least 15 species of perennial vascular plants. The dependence of species relative abundances on soil and climate variables was assessed using general linear models. Irrespective of habitat type and latitude, the majority of the SADs (70.3%) were best described by a lognormal distribution. Lognormal SADs were associated with low annual precipitation, higher aridity, high soil carbon content, and higher variability of climate variables and soil nitrate. Our results do not corroborate models predicting the prevalence of log-series SADs in dryland communities. As lognormal SADs were particularly associated with sites with drier conditions and a higher environmental variability, we reject models linking lognormality to environmental stability and high productivity conditions. Instead our results point to the prevalence of lognormal SADs in heterogeneous environments, allowing for more evenly distributed plant communities, or in stressful ecosystems, which are generally shaped by strong habitat filters and limited colonisation. This suggests that drylands may be resilient to environmental changes because the many species with intermediate relative abundances could take over ecosystem functioning if the environment becomes suboptimal for dominant species.
Abstract:
Five years (1979-1983) of Coastal Zone Color Scanner satellite ocean color data are used to examine seasonal patterns of phytoplankton pigment concentration along the Chilean coast from 20 degrees S to 45 degrees S. Four-kilometer-resolution, 2-4 day composites document the presence of filaments of elevated pigment concentration extending offshore throughout the study area, with maximum offshore extension at higher latitudes. In three years, 1979, 1981, and 1983, sufficient data exist in monthly composites to allow reconstruction of portions of the seasonal cycle. Data in 1979 are the most complete. Near-shore concentrations and cross-shelf extension of pigment concentrations in 1979 are maximum in austral winter throughout the study area and minimum in summer. Available data from 1981 and 1983 are consistent with this temporal pattern but with concentrations approximately double those of 1979. Seasonal, spatial patterns within 10 km of shore and 50 km offshore indicate a latitudinal discontinuity both in absolute concentration and in the magnitude of the seasonal cycle at approximately 33 degrees S in both 1979 and in the climatological time series. The discontinuity is strongest in fall-winter and weakest in summer. South of this latitude, concentrations are relatively high (2-3 mg m⁻³ in 1979), a strong seasonal cycle is present, and patterns 50 km offshore are correlated with those within 10 km of shore. North of 33 degrees S, concentrations are < 1.5 mg m⁻³ (in 1979), and the seasonal cycle within 10 km of shore is present but much weaker and less obviously correlated with that 50 km offshore. The seasonal cycle of pigment concentrations is 180 degrees out of phase with monthly averaged upwelling-favorable winds. Noncoincident Pathfinder sea surface temperature data show that over most latitudes, coastal low surface temperatures lag wind forcing by 1-2 months, but these too are out of phase with the pigment seasonal cycle. These data point to control of pigment patterns along the Chilean coast by the interaction of upwelling with circulation patterns unconnected to local wind forcing.
Abstract:
In this study we examined the spatial and temporal variability of particulate organic material (POM) off Oregon during the upwelling season. High-resolution vertical profiling of beam attenuation was conducted along two cross-shelf transects. One transect was located in a region where the shelf is relatively uniform and narrow (off Cascade Head (CH)); the second transect was located in a region where the shelf is shallow and wide (off Cape Perpetua (CP)). In addition, water samples were collected for direct analysis of chlorophyll, particulate organic carbon (POC), and particulate organic nitrogen (PON). Beam attenuation was highly correlated with POC and PON. Striking differences in distribution patterns and characteristics of POM were observed between CH and CP. Off CH, elevated concentrations of chlorophyll and POC were restricted to the inner shelf and were highly variable in time. The magnitude of the observed short-term temporal variability was of the same order as that of the seasonal variability reported in previous studies. Elevated concentrations of nondegraded chlorophyll and POM were observed near the bottom. Downwelling and rapid sinking are two mechanisms by which phytoplankton cells can be delivered to the bottom before being degraded. POM may then be transported across the shelf via the benthic nepheloid layer. Along the CP transect, concentrations of POM were generally higher than they were along the CH transect and extended farther across the shelf. Characteristics of surface POM, namely C:N ratios and carbon:chlorophyll ratios, differed between the two sites. These differences can be attributed to differences in shelf circulation.
Abstract:
In order to maintain pond-breeding amphibian species richness, it is important to understand how both natural and anthropogenic disturbances affect species assemblages and individual species distributions, both at the scale of individual ponds and at a larger landscape scale. The goal of this project was to investigate which characteristics of ponds and the surrounding wetland landscape were most effective in predicting pond-breeding species richness and the individual occurrence of wood frog (Rana sylvatica), bullfrog (Rana catesbeiana) and pickerel frog (Rana palustris) breeding sites in a beaver-modified landscape, and how this landscape has changed over time. The wetland landscape of Acadia National Park was historically modified by the natural disturbance cycles of beaver (Castor canadensis), and since their reintroduction to the island in 1921, beaver have played a large role in creating and maintaining palustrine wetlands. In 2000 and 2001, I studied pond-breeding amphibian assemblages at 71 palustrine wetlands in Acadia National Park, Mount Desert Island, Maine. I determined breeding presence of 7 amphibian species and quantified 15 variables describing local pond conditions and characteristics of the wetland landscape. I developed a priori models to predict sites with high amphibian species richness and used model selection with Akaike's Information Criterion (AIC) to identify important variables. Single-species models were also developed to predict wood frog, bullfrog and pickerel frog breeding presence. The variables for wetland connectivity by stream corridors and the presence of beaver disturbance were the most effective predictors of sites with high amphibian richness. Wood frog breeding was best predicted by local-scale variables describing temporary, fishless wetlands and the absence of active beaver disturbance. Abandoned beaver sites provided wood frog breeding habitat (70%) in a similar proportion to that found in non-beaver-influenced sites (79%). In contrast, bullfrog breeding presence was limited to active beaver wetlands with fish and permanent water, and 80% of breeding sites were large (≥2 ha in size). Pickerel frog breeding site selection was predicted best by the connectivity of sites in the landscape by stream corridors. Models including the presence of beaver disturbance, greater wetland perimeter and greater depth were included in the confidence set of pickerel frog models but showed considerably less support. Analysis of historic aerial photographs showed an 89% increase in the total number of ponded wetlands available in the landscape between 1944 and 1997. Beaver colonization generally converted forested wetlands and riparian areas to open water and emergent wetlands. Temporal colonization of beaver wetlands favored large sites low in the watersheds, and sites that were impounded later were generally smaller, higher in the watershed, and more likely to be abandoned. These results suggest that beaver have not only increased the number of available breeding sites in the landscape for pond-breeding amphibians, but the resulting mosaic of active and abandoned beaver wetlands also provides suitable breeding habitat for species with differing habitat requirements.
Abstract:
A multivariate frailty hazard model is developed for joint modeling of three correlated time-to-event outcomes: (1) local recurrence, (2) distant recurrence, and (3) overall survival. The frailty term is introduced to model population heterogeneity. The dependence is modeled by conditioning on a shared frailty that is included in the three hazard functions. Independent variables can be included in the model as covariates. Markov chain Monte Carlo methods are used to estimate the posterior distributions of model parameters. The algorithm used in the present application is the hybrid Metropolis-Hastings algorithm, which simultaneously updates all parameters using evaluations of the gradient of the log posterior density. The performance of this approach is examined in simulation studies using Exponential and Weibull distributions. We apply the proposed methods to a study of patients with soft tissue sarcoma, which motivated this research. Our results indicate that patients who received chemotherapy had better overall survival, with a hazard ratio of 0.242 (95% CI: 0.094 - 0.564), and lower risk of distant recurrence, with a hazard ratio of 0.636 (95% CI: 0.487 - 0.860), but no significant improvement in local recurrence, with a hazard ratio of 0.799 (95% CI: 0.575 - 1.054). The advantages and limitations of the proposed models and future research directions are discussed.
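To make the shared-frailty idea concrete, the Python sketch below simulates three event times per subject whose hazards are all multiplied by one subject-level frailty. It covers only the data-generating mechanism, not the MCMC estimation. The gamma frailty with mean 1, the exponential baseline rates and the function names are assumptions for illustration, and censoring is ignored.

```python
import math
import random


def simulate_shared_frailty(n, base_rates, theta, seed=0):
    """Simulate n subjects with one exponential event time per baseline rate.

    Each subject draws a frailty w ~ Gamma(shape=1/theta, scale=theta),
    which has mean 1 and variance theta; conditional on w, event k is
    exponential with rate w * base_rates[k].
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = rng.gammavariate(1.0 / theta, theta)
        rows.append(tuple(rng.expovariate(w * rate) for rate in base_rates))
    return rows


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical rates for local recurrence, distant recurrence and death.
times = simulate_shared_frailty(4000, (0.5, 0.3, 0.1), theta=0.5)
log_local = [math.log(t[0]) for t in times]
log_death = [math.log(t[2]) for t in times]
corr = pearson(log_local, log_death)
```

Because every outcome for a subject shares the same frailty, the log event times are positively correlated even though they are independent given the frailty. That is the dependence mechanism the abstract describes.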
Abstract:
Current statistical methods for estimation of parametric effect sizes from a series of experiments are generally restricted to univariate comparisons of standardized mean differences between two treatments. Multivariate methods are presented for the case in which effect size is a vector of standardized multivariate mean differences and the number of treatment groups is two or more. The proposed methods employ a vector of independent sample means for each response variable, which leads to a covariance structure that depends only on correlations among the $p$ responses on each subject. Using weighted least squares theory and the assumption that the observations are from normally distributed populations, multivariate hypotheses analogous to common hypotheses used for testing effect sizes were formulated and tested for treatment effects that are correlated through a common control group, through multiple response variables observed on each subject, or both. The asymptotic multivariate distribution for correlated effect sizes is obtained by extending univariate methods for estimating effect sizes that are correlated through common control groups. The joint distributions of vectors of effect sizes (from $p$ responses on each subject) from one treatment and one control group, and from several treatment groups sharing a common control group, are derived. Methods are given for estimation of linear combinations of effect sizes when certain homogeneity conditions are met, and for estimation of vectors of effect sizes and confidence intervals from $p$ responses on each subject. Computational illustrations are provided using data from studies of the effects of electric field exposure on small laboratory animals.
Abstract:
Body fat distribution is a cardiovascular health risk factor in adults. Body fat distribution can be measured through various methods, including anthropometry. It is not clear which anthropometric index is suitable for epidemiologic studies of fat distribution and cardiovascular disease. The purpose of the present study was to select a measure of body fat distribution from among a series of indices (those traditionally used in the literature and others constructed from the analysis) that is most highly correlated with lipid-related variables and is independent of overall fatness. Subjects were Mexican-American men and women (N = 1004) from a study of gallbladder disease in Starr County, Texas. Multivariate associations were sought between lipid profile measures (lipids, lipoproteins, and apolipoproteins) and two sets of anthropometric variables (4 circumferences and 6 skinfolds). This was done to assess the association between lipid-related measures and the two sets of anthropometric variables and to guide the construction of indices. Two indices emerged from the analysis that seemed to be highly correlated with lipid profile measures independent of obesity. These indices are: 2*arm circumference-thigh skinfold in pre- and post-menopausal women and arm/thigh circumference ratio in men. Next, using the sum of all skinfolds to represent obesity and the selected body fat distribution indices, the following hypotheses were tested: (1) state of obesity and centrally/upper distributed body fat are equally predictive of lipids, lipoproteins and apolipoproteins, and (2) the correlation among the lipid-related measures is not altered by obesity and body fat distribution. With respect to the first hypothesis, the present study found that most lipids, lipoproteins and apolipoproteins were significantly associated with both overall fatness and anatomical location of body fat in both sex and menopausal groups. However, within men and post-menopausal women, certain lipid profile measures (triglyceride and HDLT among post-menopausal women and apos C-II, C-III, and E among men) had substantially higher correlation with body fat distribution than with overall fatness. With respect to the second hypothesis, both obesity and body fat distribution were found to alter the association among plasma lipid variables in men and women. There was a suggestion from the data that the patterns of correlations among men and post-menopausal women are more comparable. Among men, correlations involving apo A-I, HDLT, and HDL$_2$ seemed greatly influenced by obesity, and A-II by fat distribution; among post-menopausal women, correlations involving apos A-I and A-II were highly affected by the location of body fat. Thus, these data indicate that obesity and fat distribution not only can affect levels of single measures but also can markedly influence the pattern of relationship among measures. The fact that such changes are seen for both obesity and fat distribution is significant, since the indices employed were chosen because they were independent of one another.