987 results for correlated binary regression
Abstract:
OBJECTIVES: This paper is concerned with checking goodness-of-fit of binary logistic regression models. For practitioners of data analysis, the broad classes of procedures for checking goodness-of-fit available in the literature are described. The challenges of model checking in the context of binary logistic regression are reviewed. As a viable solution, a simple graphical procedure for checking goodness-of-fit is proposed. METHODS: The proposed graphical procedure relies on pieces of information available from any logistic analysis; the focus is on combining and presenting these in an informative way. RESULTS: The information gained using this approach is presented with three examples. In the discussion, the proposed method is put into context and compared with other graphical procedures for checking goodness-of-fit of binary logistic models available in the literature. CONCLUSION: A simple graphical method can significantly improve the understanding of any logistic regression analysis and help to prevent faulty conclusions.
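The abstract does not spell out the plot itself, so the following is only a minimal sketch of one common graphical goodness-of-fit check for a binary logistic model (observed event rates plotted against mean fitted probabilities within deciles of risk); the simulated data and the decile grouping are assumptions for illustration, not necessarily the procedure proposed in the paper.

```python
# Hypothetical illustration: a grouped observed-vs-fitted plot for a binary
# logistic model. This is a generic graphical goodness-of-fit check, not
# necessarily the specific procedure proposed in the abstract.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
p_true = 1 / (1 + np.exp(-(-0.5 + 1.2 * x)))   # assumed true model
y = rng.binomial(1, p_true)

X = sm.add_constant(x)
fit = sm.Logit(y, X).fit(disp=False)
p_hat = fit.predict(X)

# Group observations into deciles of fitted probability and compare the
# observed event rate with the mean fitted probability in each group.
groups = pd.qcut(p_hat, 10, labels=False, duplicates="drop")
summary = pd.DataFrame({"p_hat": p_hat, "y": y, "g": groups}).groupby("g").mean()

plt.plot(summary["p_hat"], summary["y"], "o")
plt.plot([0, 1], [0, 1], "--", color="grey")    # reference line: perfect calibration
plt.xlabel("Mean fitted probability")
plt.ylabel("Observed proportion of events")
plt.show()
```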
Abstract:
Logistic regression is one of the most important tools in the analysis of epidemiological and clinical data. Such data often contain missing values for one or more variables. Common practice is to eliminate all individuals for whom any information is missing. This deletion approach does not make efficient use of the available information and often introduces bias. Two methods were developed to estimate logistic regression coefficients for mixed dichotomous and continuous covariates, including partially observed binary covariates. The data were assumed missing at random (MAR). One method (PD) used the predictive distribution as a weight to average the logistic regressions performed over all possible values of the missing observations, and the second method (RS) used a variant of a resampling technique. Seven additional methods were compared with these two approaches in a simulation study: (1) analysis based on only the complete cases, (2) substituting the mean of the observed values for the missing value, (3) an imputation technique based on the proportions of observed data, (4) regressing the partially observed covariates on the remaining continuous covariates, (5) regressing the partially observed covariates on the remaining continuous covariates conditional on the response variable, (6) regressing the partially observed covariates on the remaining continuous covariates and the response variable, and (7) the EM algorithm. Both proposed methods showed smaller standard errors (s.e.) for the coefficient involving the partially observed covariate, as well as for the other coefficients. However, both methods, especially PD, are computationally demanding; thus, for the analysis of large data sets with partially observed covariates, further refinement of these approaches is needed.
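As a rough illustration of the PD idea described above (averaging logistic fits over all possible values of the missing binary covariate, weighted by a predictive distribution), the sketch below uses a small number of missing values so that every completion can be enumerated; the predictive model for the missing covariate and the simulated data are assumptions, not the authors' exact procedure.

```python
# Hypothetical sketch of the "PD" idea: average logistic-regression fits over
# all possible completions of a partially observed binary covariate, weighting
# each completion by a predictive distribution estimated from complete cases.
from itertools import product
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)                       # fully observed continuous covariate
z = rng.binomial(1, 1 / (1 + np.exp(-x)))    # binary covariate, partially observed
p_y = 1 / (1 + np.exp(-(-0.3 + 0.8 * x + 1.0 * z)))
y = rng.binomial(1, p_y)

miss = rng.choice(n, size=3, replace=False)  # few missing values so that all
obs = np.setdiff1d(np.arange(n), miss)       # 2**3 completions are feasible

# Predictive model for z given x and y, estimated from complete cases (assumption).
pred = sm.Logit(z[obs], sm.add_constant(np.column_stack([x[obs], y[obs]]))).fit(disp=False)
p_z1 = pred.predict(sm.add_constant(np.column_stack([x[miss], y[miss]])))

coefs, weights = [], []
for completion in product([0, 1], repeat=len(miss)):
    z_full = z.copy()
    z_full[miss] = completion                # plug in one possible completion
    # Weight of this completion under the predictive distribution.
    w = np.prod([p if v == 1 else 1 - p for v, p in zip(completion, p_z1)])
    fit = sm.Logit(y, sm.add_constant(np.column_stack([x, z_full]))).fit(disp=False)
    coefs.append(fit.params)
    weights.append(w)

beta_pd = np.average(np.array(coefs), axis=0, weights=np.array(weights))
print("PD-averaged coefficients:", beta_pd)
```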
Abstract:
In this article we investigate the asymptotic and finite-sample properties of predictors of regression models with autocorrelated errors. We prove new theorems associated with the predictive efficiency of generalized least squares (GLS) and incorrectly structured GLS predictors. We also establish the form associated with their predictive mean squared errors as well as the magnitude of these errors relative to each other and to those generated from the ordinary least squares (OLS) predictor. A large simulation study is used to evaluate the finite-sample performance of forecasts generated from models using different corrections for the serial correlation.
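A small simulation along the lines described above can be put together with statsmodels: one-step-ahead forecasts from OLS, which ignores the serial correlation, are compared with forecasts from an AR(1)-corrected GLS fit. The AR(1) structure, parameter values and sample sizes are illustrative assumptions, not the paper's simulation design.

```python
# Compare one-step-ahead predictive MSE of OLS and AR(1)-corrected GLS
# forecasts under autocorrelated errors (illustrative simulation only).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
rho, n_train, n_rep = 0.7, 100, 500
mse_ols, mse_gls = [], []

for _ in range(n_rep):
    x = rng.normal(size=n_train + 1)
    e = np.zeros(n_train + 1)
    for t in range(1, n_train + 1):                  # AR(1) errors
        e[t] = rho * e[t - 1] + rng.normal(scale=1.0)
    y = 1.0 + 2.0 * x + e

    X = sm.add_constant(x)
    X_tr, y_tr = X[:n_train], y[:n_train]

    ols = sm.OLS(y_tr, X_tr).fit()
    gls = sm.GLSAR(y_tr, X_tr, rho=1).iterative_fit(maxiter=5)

    # OLS forecast ignores the serial correlation in the errors.
    f_ols = X[n_train] @ ols.params
    # GLS forecast adds the AR(1) correction based on the last residual.
    resid_last = y_tr[-1] - X_tr[-1] @ gls.params
    f_gls = X[n_train] @ gls.params + gls.model.rho[0] * resid_last

    mse_ols.append((y[n_train] - f_ols) ** 2)
    mse_gls.append((y[n_train] - f_gls) ** 2)

print("OLS predictive MSE:", np.mean(mse_ols))
print("GLS predictive MSE:", np.mean(mse_gls))
```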
Abstract:
Count data with excess zeros relative to a Poisson distribution are common in many biomedical applications. A popular approach to the analysis of such data is to use a zero-inflated Poisson (ZIP) regression model. Often, because of the hierarchical study design or the data collection procedure, zero-inflation and lack of independence may occur simultaneously, which renders the standard ZIP model inadequate. To account for the preponderance of zero counts and the inherent correlation of observations, a class of multi-level ZIP regression models with random effects is presented. Model fitting is facilitated using an expectation-maximization algorithm, whereas variance components are estimated via residual maximum likelihood estimating equations. A score test for zero-inflation is also presented. The multi-level ZIP model is then generalized to cope with a more complex correlation structure. Application to the analysis of correlated count data from a longitudinal infant feeding study illustrates the usefulness of the approach.
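For orientation, a standard single-level ZIP model can be fitted with statsmodels as sketched below; the multi-level random-effects extension, the REML variance-component estimation and the score test described in the abstract are not part of this sketch, and the simulated data and parameter values are assumptions.

```python
# Minimal single-level zero-inflated Poisson fit (illustration only).
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedPoisson

rng = np.random.default_rng(3)
n = 1000
x = rng.normal(size=n)
lam = np.exp(0.5 + 0.4 * x)               # Poisson mean for the count component
pi = 0.3                                   # probability of a structural zero
structural_zero = rng.binomial(1, pi, size=n)
y = np.where(structural_zero == 1, 0, rng.poisson(lam))

X = sm.add_constant(x)
# Intercept-only inflation model via a column of ones.
zip_fit = ZeroInflatedPoisson(y, X, exog_infl=np.ones((n, 1)), inflation="logit").fit(disp=False)
print(zip_fit.summary())
```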
Abstract:
Optimal design for generalized linear models has primarily focused on univariate data. Often experiments are performed that have multiple dependent responses described by regression-type models, and it is of interest and of value to design the experiment for all of these responses. This requires a multivariate distribution underlying a pre-chosen model for the data. Here, we consider the design of experiments for bivariate binary data which are dependent. We explore copula functions, which provide a rich and flexible class of structures for deriving joint distributions for bivariate binary data. We present methods for deriving optimal experimental designs for dependent bivariate binary data using copulas, and demonstrate that, by including the dependence between responses in the design process, more efficient parameter estimates are obtained than by the usual practice of simply designing for a single variable only. Further, we investigate the robustness of designs with respect to initial parameter estimates and the copula function, and also show the performance of compound criteria within this bivariate binary setting.
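One concrete way to obtain such a joint distribution is a Gaussian copula: the two marginal success probabilities define thresholds on a latent bivariate normal, and the copula correlation induces the dependence. The sketch below is a hedged illustration with arbitrary parameter values; the design-optimisation step from the abstract is not shown.

```python
# Joint distribution for dependent bivariate binary responses via a Gaussian
# copula (illustrative parameter values; not the paper's designs).
import numpy as np
from scipy.stats import norm, multivariate_normal

def bivariate_binary_joint(p1, p2, rho):
    """Joint probabilities P(Y1=a, Y2=b) under a Gaussian copula."""
    # Thresholds on the latent bivariate normal matching the marginals.
    t1, t2 = norm.ppf(p1), norm.ppf(p2)
    cov = [[1.0, rho], [rho, 1.0]]
    p11 = multivariate_normal(mean=[0, 0], cov=cov).cdf([t1, t2])  # P(Y1=1, Y2=1)
    p10 = p1 - p11
    p01 = p2 - p11
    p00 = 1.0 - p11 - p10 - p01
    return {(1, 1): p11, (1, 0): p10, (0, 1): p01, (0, 0): p00}

print(bivariate_binary_joint(p1=0.6, p2=0.4, rho=0.5))
```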
Abstract:
PURPOSE: To examine the relationship between contact lens (CL) case contamination and various potential predictive factors. METHODS: 74 subjects were fitted with lotrafilcon B (CIBA Vision) CLs on a daily wear basis for 1 month. Subjects were randomly assigned one of two polyhexamethylene biguanide (PHMB) preserved disinfecting solutions with the corresponding regular lens case. Clinical evaluations were conducted at lens delivery and after 1 month, when cases were collected for microbial culture. A CL care non-compliance score was determined through administration of a questionnaire, and the volume of solution used was calculated for each subject. Data were examined using backward stepwise binary logistic regression. RESULTS: 68% of cases were contaminated; 35% were moderately or heavily contaminated and 36% contained gram-negative bacteria. Case contamination was significantly associated with subjective dryness symptoms (OR 4.22, CI 1.37–13.01) (P<0.05). There was no association between contamination and subject age, ethnicity, gender, average wearing time, amount of solution used, non-compliance score, CL power or subjective redness (P>0.05). The effect of lens care system on case contamination approached significance (P=0.07). Failure to rinse the case with disinfecting solution following CL insertion (OR 2.51, CI 0.52–12.09) and not air drying the case (OR 2.31, CI 0.39–13.35) were positively associated with contamination; however, these associations did not reach statistical significance. CONCLUSIONS: Our results suggest that case contamination may influence subjective comfort. It is difficult to predict the development of case contamination from a variety of clinical factors. The efficacy of CL solutions, bacterial resistance to disinfection and biofilm formation are likely to play a role. Further evaluation of these factors will improve our understanding of the development of case contamination and its clinical impact.
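A generic backward stepwise binary logistic regression can be sketched as below: drop the predictor with the largest p-value until all remaining predictors fall below a chosen threshold. The variable names, the 0.05 threshold and the simulated data are assumptions for illustration, not the study's actual protocol or data.

```python
# Generic backward stepwise binary logistic regression (illustration only).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 150
df = pd.DataFrame({
    "dryness": rng.binomial(1, 0.4, n),          # hypothetical predictors
    "age": rng.normal(35, 10, n),
    "wearing_time": rng.normal(12, 3, n),
})
logit_p = -1.0 + 1.2 * df["dryness"] + 0.01 * df["age"]
df["contaminated"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

predictors = ["dryness", "age", "wearing_time"]
while predictors:
    X = sm.add_constant(df[predictors])
    fit = sm.Logit(df["contaminated"], X).fit(disp=False)
    pvals = fit.pvalues.drop("const")
    if pvals.max() < 0.05:                 # stop once every remaining predictor is significant
        break
    predictors.remove(pvals.idxmax())      # otherwise drop the least significant one

print("Retained predictors:", predictors)
print(np.exp(fit.params))                  # odds ratios for the final model
```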
Abstract:
Quality-oriented management systems and methods have become the dominant business and governance paradigm. From this perspective, satisfying customers' expectations by supplying reliable, good-quality products and services is the key factor for an organization and even a government. During recent decades, Statistical Quality Control (SQC) methods have been developed as the technical core of quality management and the continuous improvement philosophy, and they are now applied widely to improve the quality of products and services in the industrial and business sectors. Recently, SQC tools, in particular quality control charts, have been used in healthcare surveillance. In some cases, these tools have been modified and developed to better suit the characteristics and needs of the health sector. It seems that some of the work in the healthcare area has evolved independently of the development of industrial statistical process control methods. Therefore, analysing and comparing the paradigms and characteristics of quality control charts and techniques across the different sectors presents opportunities for transferring knowledge and for future development in each sector. Meanwhile, the capabilities of the Bayesian approach, particularly Bayesian hierarchical models and computational techniques in which all uncertainty is expressed in terms of probability, facilitate decision making and cost-effectiveness analyses. Therefore, this research investigates the use of the quality improvement cycle in a health setting using clinical data from a hospital. The need for clinical data for monitoring purposes is investigated in two respects. A framework and appropriate tools from the industrial context are proposed and applied to evaluate and improve data quality in available datasets and data flows; then a data-capturing algorithm using Bayesian decision-making methods is developed to determine an economical sample size for statistical analyses within the quality improvement cycle. Having ensured clinical data quality, some characteristics of control charts in the health context, including the necessity of monitoring attribute data and correlated quality characteristics, are considered. To this end, multivariate control charts from an industrial context are adapted to monitor radiation delivered to patients undergoing diagnostic coronary angiograms, and various risk-adjusted control charts are constructed and investigated for monitoring binary outcomes of clinical interventions as well as post-intervention survival time. Meanwhile, the adoption of a Bayesian approach is proposed as a new framework for estimating the change point following a control chart signal. This estimate aims to facilitate root-cause efforts in the quality improvement cycle, since it narrows the search for the potential causes of detected changes to a tighter time-frame prior to the signal. This approach yields highly informative estimates of the change point parameters, since the results are based on probability distributions. Using Bayesian hierarchical models and Markov chain Monte Carlo computational methods, Bayesian estimators of the time and magnitude of various change scenarios, including step changes, linear trends and multiple changes in a Poisson process, are developed and investigated.
The benefits of change point investigation are revisited and promoted in monitoring hospital outcomes, where the developed Bayesian estimator reports the true time of the shifts, compared with a priori known causes, detected by control charts monitoring the rate of excess usage of blood products and major adverse events during and after cardiac surgery in a local hospital. The development of the Bayesian change point estimators is then extended to healthcare surveillance of processes in which pre-intervention characteristics of patients affect the outcomes. In this setting, the Bayesian estimator is first extended to capture the patient mix (covariates) through the risk models underlying risk-adjusted control charts. Variations of the estimator are developed to estimate the true time of step changes and linear trends in the odds ratio of intensive care unit outcomes in a local hospital. Secondly, the Bayesian estimator is extended to identify the time of a shift in mean survival time after a clinical intervention which is being monitored by risk-adjusted survival time control charts. In this context, the survival time after a clinical intervention is also affected by patient mix, and the survival function is constructed using a survival prediction model. The simulation studies undertaken in each research component, and the results obtained, strongly recommend the developed Bayesian estimators as an alternative for change point estimation within the quality improvement cycle in healthcare surveillance as well as in industrial and business contexts. The superiority of the proposed Bayesian framework and estimators is further enhanced when the probability quantification, flexibility and generalizability of the developed model are also considered. The advantages of the Bayesian approach seen in the general context of quality control may also extend to the industrial and business domains where quality monitoring was initially developed.
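A generic Bayesian step-change estimator for a Poisson process, in the spirit of the estimators described above, can be sketched with PyMC; the risk-adjusted and survival-time extensions are not shown, and the priors and simulated data are illustrative assumptions rather than the thesis's models.

```python
# Generic Bayesian step-change estimation in a Poisson process with PyMC
# (illustration only; priors and data are assumed for the sketch).
import numpy as np
import pymc as pm

rng = np.random.default_rng(5)
n, true_tau = 60, 40
counts = np.concatenate([rng.poisson(5.0, true_tau),       # in-control rate
                         rng.poisson(9.0, n - true_tau)])   # rate after the shift

with pm.Model():
    tau = pm.DiscreteUniform("tau", lower=0, upper=n - 1)    # change point
    lam1 = pm.Exponential("lam1", 1.0)                       # rate before the change
    lam2 = pm.Exponential("lam2", 1.0)                       # rate after the change
    rate = pm.math.switch(np.arange(n) < tau, lam1, lam2)
    pm.Poisson("obs", mu=rate, observed=counts)
    trace = pm.sample(2000, tune=1000, progressbar=False)

print("Posterior mean of the change time:", trace.posterior["tau"].values.mean())
```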
Abstract:
The conventional mechanical properties of articular cartilage, such as compressive stiffness, have been demonstrated to be limited in their capacity to distinguish intact (visually normal) from degraded cartilage samples. In this paper, we explore the correlation between a new mechanical parameter, namely the reswelling of articular cartilage following unloading from a given compressive load, and the near infrared (NIR) spectrum. The capacity to distinguish mechanically intact from proteoglycan-depleted tissue relative to the "reswelling" characteristic was first established, and the result was subsequently correlated with the NIR spectral data of the respective tissue samples. To achieve this, normal intact and enzymatically degraded samples were subjected to both NIR probing and mechanical compression based on a load-unload-reswelling protocol. The parameter δ(r), characteristic of the osmotic "reswelling" of the matrix after unloading to a constant small load of the order of the osmotic pressure of cartilage, was obtained for the different sample types. Multivariate statistical analysis was employed to determine the degree of correlation between δ(r) and the NIR absorption spectrum of the relevant specimens using Partial Least Squares (PLS) regression. The results show a strong relationship (R² = 95.89%, p<0.0001) between the spectral data and δ(r). This correlation of δ(r) with NIR spectral data suggests the potential for determining the reswelling characteristics non-destructively. It was also observed that δ(r) values bear a significant relationship with cartilage matrix integrity, indicated by its proteoglycan content, and can therefore differentiate between normal and artificially degraded proteoglycan-depleted cartilage samples. It is therefore argued that the reswelling of cartilage, which is both biochemical (osmotic) and mechanical (hydrostatic pressure) in origin, could be a strong candidate for characterizing the tissue, especially in regions surrounding focal cartilage defects in joints.
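A minimal PLS regression of a spectral matrix on a scalar response can be set up with scikit-learn as below; the simulated "spectra", the response and the number of components are placeholders, not the study's NIR data or δ(r) measurements.

```python
# Minimal PLS regression of a spectral matrix on a scalar response
# (simulated placeholder data; illustration only).
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)
n_samples, n_wavelengths = 40, 200
spectra = rng.normal(size=(n_samples, n_wavelengths))      # stand-in for NIR spectra
delta_r = spectra[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=n_samples)

X_tr, X_te, y_tr, y_te = train_test_split(spectra, delta_r, test_size=0.25, random_state=0)
pls = PLSRegression(n_components=3).fit(X_tr, y_tr)

print("R^2 (training):", pls.score(X_tr, y_tr))
print("R^2 (held out):", pls.score(X_te, y_te))
```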
Abstract:
Environmental data usually include measurements, such as water quality data, which fall below detection limits because of limitations of the instruments or of certain analytical methods used. The fact that some responses are not detected needs to be properly taken into account in the statistical analysis of such data. However, it is well known that it is challenging to analyze a data set with detection limits, and we often have to rely on traditional parametric methods or simple imputation methods. Distributional assumptions can lead to biased inference, and justification of distributions is often not possible when the data are correlated and there is a large proportion of data below detection limits. The extent of bias is usually unknown. To draw valid conclusions, and hence provide useful advice for environmental management authorities, it is essential to develop and apply an appropriate statistical methodology. This paper proposes rank-based procedures for analyzing non-normally distributed data collected at different sites over a period of time in the presence of multiple detection limits. To take account of temporal correlations within each site, we propose an optimal linear combination of estimating functions and apply the induced smoothing method to reduce the computational burden. Finally, we apply the proposed method to water quality data collected in the Susquehanna River Basin in the United States of America, which clearly demonstrates the advantages of the rank regression models.
Abstract:
We investigate methods for data-based selection of working covariance models in the analysis of correlated data with generalized estimating equations. We study two selection criteria: Gaussian pseudolikelihood and a geodesic distance based on discrepancy between model-sensitive and model-robust regression parameter covariance estimators. The Gaussian pseudolikelihood is found in simulation to be reasonably sensitive for several response distributions and noncanonical mean-variance relations for longitudinal data. Application is also made to a clinical dataset. Assessment of adequacy of both correlation and variance models for longitudinal data should be routine in applications, and we describe open-source software supporting this practice.
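As a hedged illustration with statsmodels GEE, the sketch below fits two working correlation structures and computes a geodesic-type distance between the model-based ("naive") and robust covariance estimators of the regression parameters, following the general idea of the second criterion; the exact criterion and the open-source software described in the paper are not reproduced here, and the simulated data are an assumption.

```python
# Comparing working covariance models for GEE via a geodesic-type distance
# between model-based and robust covariance estimates (illustration only).
import numpy as np
import statsmodels.api as sm
from statsmodels.genmod.cov_struct import Independence, Exchangeable
from scipy.linalg import solve, eigvals

rng = np.random.default_rng(7)
n_subj, n_obs = 100, 4
subj = np.repeat(np.arange(n_subj), n_obs)
x = rng.normal(size=n_subj * n_obs)
b = np.repeat(rng.normal(scale=1.0, size=n_subj), n_obs)   # subject-level random effect
y = 1.0 + 0.5 * x + b + rng.normal(size=n_subj * n_obs)

X = sm.add_constant(x)

def geodesic_distance(result):
    """Distance between model-based and robust covariance of the estimates."""
    lam = np.real(eigvals(solve(result.cov_naive, result.cov_robust)))
    return np.sqrt(np.sum(np.log(lam) ** 2))

for cov_struct in (Independence(), Exchangeable()):
    res = sm.GEE(y, X, groups=subj, family=sm.families.Gaussian(),
                 cov_struct=cov_struct).fit()
    print(type(cov_struct).__name__, "distance:", geodesic_distance(res))
```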
Abstract:
The number of drug substances in formulation development in the pharmaceutical industry is increasing. Some of these are amorphous drugs with a glass transition below ambient temperature, and thus they are usually difficult to formulate and handle. One reason for this is the reduced viscosity, related to the stickiness of the drug, which makes them complicated to handle in unit operations. Thus, the aim of this thesis was to develop a new processing method for a sticky amorphous model material. Furthermore, model materials were characterised before and after formulation, using several characterisation methods, to understand more precisely the prerequisites for the physical stability of the amorphous state against crystallisation. The model materials used were monoclinic paracetamol and citric acid anhydrate. Amorphous materials were prepared by melt quenching or by ethanol evaporation. The melt blends were found to have slightly higher viscosity than the ethanol-evaporated materials. However, melt-produced materials crystallised more easily upon consecutive shearing than ethanol-evaporated materials. The only material that did not crystallise during shearing was a 50/50 (w/w, %) blend, regardless of the preparation method, and it was physically stable for at least two years in dry conditions. Shearing at varying temperatures was established as a way to measure the physical stability of amorphous materials under processing and storage conditions. The actual physical stability of the blends was better than that of the pure amorphous materials at ambient temperature. Molecular mobility was not related to the physical stability of the amorphous blends, observed as crystallisation. The molecular mobility of the 50/50 blend, derived from spectral linewidth as a function of temperature using solid-state NMR, correlated better with the molecular mobility derived from a rheometer than did that derived from differential scanning calorimetry data. Based on the results obtained, the effect of molecular interactions, the thermodynamic driving force and the miscibility of the blends are discussed as the key factors for stabilising the blends. Stickiness was found to be affected by glass transition and viscosity. Ultrasound extrusion and cutting were successfully tested to increase the processability of the sticky material. Furthermore, it was found to be possible to process the physically stable 50/50 blend in a supercooled liquid state instead of a glassy state. The method was not found to accelerate crystallisation. This may open up new possibilities to process amorphous materials that are otherwise impossible to manufacture into solid dosage forms.
Abstract:
The article describes a generalized estimating equations approach that was used to investigate the impact of technology on vessel performance in a trawl fishery during 1988-96, while accounting for spatial and temporal correlations in the catch-effort data. Robust estimation of parameters in the presence of several levels of clustering depended more on the choice of cluster definition than on the choice of correlation structure within the cluster. Models with smaller cluster sizes produced stable results, while models with larger cluster sizes, which may have had complex within-cluster correlation structures and which had within-cluster covariates, produced estimates sensitive to the correlation structure. The preferred model arising from this dataset assumed that catches from a vessel were correlated in the same years and the same areas, but independent in different years and areas. The model that assumed catches from a vessel were correlated in all years and areas, equivalent to a random effects term for vessel, produced spurious results. This was an unexpected finding that highlighted the need to adopt a systematic strategy for modelling. The article proposes a modelling strategy of selecting the best cluster definition first, and the working correlation structure (within clusters) second. The article discusses the selection and interpretation of the model in the light of background knowledge of the data and the utility of the model, and the potential for this modelling approach to apply in similar statistical situations.
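With statsmodels GEE, the cluster definition is controlled by the groups argument, so the strategy above (choose the cluster definition first, then the working correlation structure) can be explored as sketched below; the simulated fishery-style data and variable names are placeholders, not the article's dataset.

```python
# Comparing GEE cluster definitions via the `groups` argument
# (simulated placeholder data; illustration only).
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.genmod.cov_struct import Exchangeable

rng = np.random.default_rng(8)
n = 600
df = pd.DataFrame({
    "vessel": rng.integers(0, 30, n).astype(str),
    "year": rng.integers(1988, 1997, n).astype(str),
    "area": rng.integers(0, 5, n).astype(str),
    "tech": rng.binomial(1, 0.5, n),
})
df["log_catch"] = 2.0 + 0.3 * df["tech"] + rng.normal(size=n)

X = sm.add_constant(df[["tech"]].astype(float))

# Cluster definition 1: observations correlated only within vessel-year-area cells.
g_small = df["vessel"] + "_" + df["year"] + "_" + df["area"]
# Cluster definition 2: all observations from a vessel form one cluster.
g_large = df["vessel"]

for name, groups in [("vessel-year-area", g_small), ("vessel", g_large)]:
    res = sm.GEE(df["log_catch"], X, groups=groups,
                 family=sm.families.Gaussian(), cov_struct=Exchangeable()).fit()
    print(name, "tech effect:", round(res.params["tech"], 3),
          "robust SE:", round(res.bse["tech"], 3))
```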