973 results for CORRELATED DATA
Abstract:
In this paper, we study panel count data with informative observation times. We assume nonparametric and semiparametric proportional rate models for the underlying recurrent event process, where the form of the baseline rate function is left unspecified and a subject-specific frailty variable inflates or deflates the rate function multiplicatively. The proposed models allow the recurrent event processes and observation times to be correlated through their connections with the unobserved frailty; moreover, the distributions of both the frailty variable and observation times are considered as nuisance parameters. The baseline rate function and the regression parameters are estimated by maximizing a conditional likelihood function of observed event counts and solving estimation equations. Large sample properties of the proposed estimators are studied. Numerical studies demonstrate that the proposed estimation procedures perform well for moderate sample sizes. An application to a bladder tumor study is presented to illustrate the use of the proposed methods.
Abstract:
Many seemingly disparate approaches for marginal modeling have been developed in recent years. We demonstrate that many current approaches for marginal modeling of correlated binary outcomes produce likelihoods that are equivalent to the copula-based models proposed herein. These general copula models of underlying latent threshold random variables yield likelihood-based models for marginal fixed effects estimation and interpretation in the analysis of correlated binary data. Moreover, we propose a nomenclature and set of model relationships that substantially elucidates the complex area of marginalized models for binary data. A diverse collection of didactic mathematical and numerical examples is given to illustrate concepts.
Abstract:
OBJECTIVE: The factors that induce remission of RA during pregnancy and the relapse occurring after delivery remain an enigma. In a previous study, we investigated gene-expression profiles of peripheral blood mononuclear cells (PBMC) in patients with RA and healthy women in late pregnancy and postpartum. Profiles of samples from both groups were similar in late pregnancy with elevated monocyte and decreased lymphocyte signatures. Postpartum, in RA PBMC the high level of monocyte transcripts persisted. A further increase was observed in adhesion, migration and signalling processes related to monocytes but also in lymphocytes despite similar clinical activity due to intensified drug treatment. This prompted us to investigate correlations between clinical parameters of disease activity and gene profiles. METHODS: Transcriptome data were correlated with RADAI, CRP, monocyte and lymphocyte counts. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway annotations and monocyte and lymphocyte signatures were used as reference information. RESULTS: Comparative analysis of PBMC expression profiles from RA patients during and after pregnancy with RADAI and CRP revealed a correlation of these disease activity parameters predominantly with monocyte transcripts. Genes related to cellular programs of adhesion, migration and response to infections were upregulated. Comparing clinically active and not-active RA patients postpartum revealed a cluster of 19 genes that could also identify active disease during pregnancy. CONCLUSION: The data suggest that an increase of the RADAI and an elevation of CRP are a consequence of molecular activation of monocytes. Furthermore, they indicate that molecular activation of T lymphocytes may remain clinically unrecognized postpartum. It is conceivable that a set of 19 genes may qualify as a molecular disease activity marker.
Abstract:
BACKGROUND: Periodontitis is the major cause of tooth loss in adults and is linked to systemic illnesses, such as cardiovascular disease and stroke. The development of rapid point-of-care (POC) chairside diagnostics has the potential for the early detection of periodontal infection and progression to identify incipient disease and reduce health care costs. However, validation of effective diagnostics requires the identification and verification of biomarkers correlated with disease progression. This clinical study sought to determine the ability of putative host- and microbially derived biomarkers to identify periodontal disease status from whole saliva and plaque biofilm. METHODS: One hundred human subjects were equally recruited into a healthy/gingivitis group or a periodontitis population. Whole saliva was collected from all subjects and analyzed using antibody arrays to measure the levels of multiple proinflammatory cytokines and bone resorptive/turnover markers. RESULTS: Salivary biomarker data were correlated to comprehensive clinical, radiographic, and microbial plaque biofilm levels measured by quantitative polymerase chain reaction (qPCR) for the generation of models for periodontal disease identification. Significantly elevated levels of matrix metalloproteinase (MMP)-8 and -9 were found in subjects with advanced periodontitis with Random Forest importance scores of 7.1 and 5.1, respectively. The generation of receiver operating characteristic curves demonstrated that permutations of salivary biomarkers and pathogen biofilm values augmented the prediction of disease category. Multiple combinations of salivary biomarkers (especially MMP-8 and -9 and osteoprotegerin) combined with red-complex anaerobic periodontal pathogens (such as Porphyromonas gingivalis or Treponema denticola) provided highly accurate predictions of periodontal disease category. Elevated salivary MMP-8 and T. denticola biofilm levels displayed robust combinatorial characteristics in predicting periodontal disease severity (area under the curve = 0.88; odds ratio = 24.6; 95% confidence interval: 5.2 to 116.5). CONCLUSIONS: Using qPCR and sensitive immunoassays, we identified host- and bacterially derived biomarkers correlated with periodontal disease. This approach offers significant potential for the discovery of biomarker signatures useful in the development of rapid POC chairside diagnostics for oral and systemic diseases. Studies are ongoing to apply this approach to the longitudinal predictions of disease activity.
Abstract:
High density spatial and temporal sampling of EEG data enhances the quality of results of electrophysiological experiments. Because EEG sources typically produce widespread electric fields (see Chapter 3) and operate at frequencies well below the sampling rate, increasing the number of electrodes and time samples will not necessarily increase the number of observed processes, but will mainly increase the accuracy of the representation of these processes. This is particularly the case when inverse solutions are computed. As a consequence, increasing the sampling in space and time increases the redundancy of the data (in space, because electrodes are correlated due to volume conduction, and in time, because neighboring time points are correlated), while the degrees of freedom of the data change only slightly. This has to be taken into account when statistical inferences are to be made from the data. However, in many ERP studies, the intrinsic correlation structure of the data has been disregarded. Often, some electrodes or groups of electrodes are a priori selected as the analysis entity and considered as repeated (within subject) measures that are analyzed using standard univariate statistics. The increased spatial resolution obtained with more electrodes is thus poorly represented by the resulting statistics. In addition, the assumptions made (e.g. in terms of what constitutes a repeated measure) are not supported by what we know about the properties of EEG data. From the point of view of physics (see Chapter 3), the natural “atomic” analysis entity of EEG and ERP data is the scalp electric field
Abstract:
A combinatorial protocol (CP) is introduced here and interfaced with multiple linear regression (MLR) for variable selection. The efficiency of CP-MLR is primarily based on restricting the entry of correlated variables to the model development stage. It has been used for the analysis of the Selwood et al. data set [16], and the obtained models are compared with those reported from the GFA [8] and MUSEUM [9] approaches. For this data set, CP-MLR could identify three highly independent models (27, 28 and 31) with Q2 values in the range of 0.632-0.518. These models are also divergent and unique. Even though the present study does not share any models with the GFA [8] and MUSEUM [9] results, there are several descriptors common to all these studies, including the present one. A simulation is also carried out on the same data set to explain the model formation in CP-MLR. The results demonstrate that the proposed method should be able to offer solutions to data sets with 50 to 60 descriptors in a reasonable time frame. By carefully selecting the inter-parameter correlation cutoff values in CP-MLR, one can identify divergent models and handle data sets larger than the present one without involving excessive computer time.
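The idea of keeping mutually correlated descriptors out of the same candidate model can be illustrated with a minimal sketch. This is not the CP-MLR implementation from the abstract above; the function names, the toy descriptors and the 0.8 cutoff are all assumptions made for illustration. It enumerates descriptor subsets and keeps only those whose pairwise absolute Pearson correlation stays below the cutoff.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def admissible_subsets(descriptors, size, cutoff):
    """Return descriptor subsets whose pairwise |r| is below `cutoff`.

    `descriptors` maps a (hypothetical) descriptor name to its values.
    """
    kept = []
    for subset in combinations(sorted(descriptors), size):
        pairs = combinations(subset, 2)
        if all(abs(pearson(descriptors[a], descriptors[b])) < cutoff
               for a, b in pairs):
            kept.append(subset)
    return kept


# Toy data (invented): d2 is nearly collinear with d1, d3 is not.
descriptors = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d2": [2.1, 3.9, 6.2, 8.0, 9.9],
    "d3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
subsets = admissible_subsets(descriptors, size=2, cutoff=0.8)
print(subsets)
```

Only the subsets that survive this filter would then be passed to a regression step, which is why a stricter cutoff reduces the number of models that have to be fitted.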
Abstract:
The recent liberalization of the German energy market has forced the energy industry to develop and install new information systems to support agents on the energy trading floors in their analytical tasks. Besides classical approaches of building a data warehouse giving insight into the time series to understand market and pricing mechanisms, it is crucial to provide a variety of external data from the web. Weather information as well as political news or market rumors are relevant to give the appropriate interpretation to the variables of a volatile energy market. Starting from a multidimensional data model and a collection of buy and sell transactions a data warehouse is built that gives analytical support to the agents. Following the idea of web farming we harvest the web, match the external information sources after a filtering and evaluation process to the data warehouse objects, and present this qualified information on a user interface where market values are correlated with those external sources over the time axis.
Abstract:
A digital camera was used to obtain digital images of beef carcasses moving on the rail in commercial beef packing plants. These images were satisfactory for measurement of backfat thickness and area of ribeye. The measurements were closely correlated with the same two measurements taken from tracings on acetate paper of fat thickness and area of ribeye made on carcasses moving on the rail.
Abstract:
When observers are presented with two visual targets appearing in the same position in close temporal proximity, a marked reduction in detection performance of the second target has often been reported, the so-called attentional blink phenomenon. Several studies found a similar decrement of P300 amplitudes during the attentional blink period as observed with detection performances of the second target. However, whether the parallel courses of second target performances and corresponding P300 amplitudes resulted from the same underlying mechanisms remained unclear. The aim of our study was therefore to investigate whether the mechanisms underlying the AB can be assessed by fixed-links modeling and whether this kind of assessment would reveal the same or at least related processes in the behavioral and electrophysiological data. On both levels of observation three highly similar processes could be identified: an increasing, a decreasing and a u-shaped trend. Corresponding processes from the behavioral and electrophysiological data were substantially correlated, with the two u-shaped trends showing the strongest association with each other. Our results provide evidence for the assumption that the same mechanisms underlie attentional blink task performance at the electrophysiological and behavioral levels as assessed by fixed-links models.
Abstract:
A non-parametric method was developed and tested to compare the partial areas under two correlated Receiver Operating Characteristic (ROC) curves. Based on the theory of generalized U-statistics, mathematical formulas have been derived for computing the ROC area and the variance and covariance between the portions of two ROC curves. A practical SAS application has also been developed to facilitate the calculations. The accuracy of the non-parametric method was evaluated by comparing it to other methods. When we applied our method to the data from a published ROC analysis of CT images, our results were very close to the published ones. A hypothetical example was used to demonstrate the effects of two crossed ROC curves. The two ROC areas are the same. However, each portion of the area between the two ROC curves was found to be significantly different by the partial ROC curve analysis. For computation of ROC curves with large scales, such as a logistic regression model, we applied our method to the breast cancer study with Medicare claims data. It yielded the same ROC area computation as the SAS Logistic procedure. Our method also provides an alternative to the global summary of ROC area comparison by directly comparing the true-positive rates for two regression models and by determining the range of false-positive values where the models differ.
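As a minimal sketch of the U-statistic view of the ROC area mentioned in the abstract above, the empirical area can be written as the average of a pairwise kernel over all (diseased, healthy) score pairs. This covers only the full-area point estimate, not the partial-area or variance and covariance formulas of the study; the function name and the scores are invented for illustration.

```python
from itertools import product


def auc_u_statistic(pos_scores, neg_scores):
    """Empirical ROC area as a two-sample U-statistic.

    The kernel scores 1 when the diseased case ranks above the healthy
    case, 0.5 for a tie, and 0 otherwise.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both groups need at least one score")
    total = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            total += 1.0
        elif p == n:
            total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


# Invented test scores for four diseased and four healthy subjects.
diseased = [0.9, 0.8, 0.6, 0.55]
healthy = [0.7, 0.5, 0.4, 0.3]
auc = auc_u_statistic(diseased, healthy)
print(auc)  # 14 of the 16 pairs are correctly ordered -> 0.875
```

Because the estimate is an average over pairs, two curves computed on the same subjects share those pairs, which is the source of the correlation that a comparison of the two areas has to account for.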
Abstract:
Recent studies identified unexpected expression and transcriptional complexity of the hemoprotein myoglobin (MB) in human breast cancer but its role in prostate cancer is still unclear. Expression of MB was immunohistochemically analyzed in three independent cohorts of radical prostatectomy specimens (n = 409, n = 625, and n = 237). MB expression data were correlated with clinicopathological parameters and molecular parameters of androgen and hypoxia signaling. Expression levels of novel tumor-associated MB transcript variants and the VEGF gene as a hypoxia marker were analyzed using qRT-PCR. Fifty-three percent of the prostate cancer cases were MB positive and significantly correlated with androgen receptor (AR) expression (p < 0.001). The positive correlation with CAIX (p < 0.001) and FASN (p = 0.008) as well as the paralleled increased expression of the tumor-associated MB transcript variants and VEGF suggest that hypoxia participates in MB expression regulation. Analogous to breast cancer, MB expression in prostate cancer is associated with steroid hormone signaling and markers of hypoxia. Further studies must elucidate the novel functional roles of MB in human carcinomas, which probably extend beyond its classic intramuscular function in oxygen storage.
Abstract:
Motivation: Population allele frequencies are correlated when populations have a shared history or when they exchange genes. Unfortunately, most models for allele frequency and inference about population structure ignore this correlation. Recent analytical results show that among populations, correlations can be very high, which could affect estimates of population genetic structure. In this study, we propose a mixture beta model to characterize the allele frequency distribution among populations. This formulation incorporates the correlation among populations as well as extending the model to data with different clusters of populations. Results: Using simulated data, we show that in general, the mixture model provides a good approximation of the among-population allele frequency distribution and a good estimate of correlation among populations. Results from fitting the mixture model to a dataset of genotypes at 377 autosomal microsatellite loci from human populations indicate high correlation among populations, which may not be appropriate to neglect. Traditional measures of population structure tend to over-estimate the amount of genetic differentiation when correlation is neglected. Inference is performed in a Bayesian framework.
Abstract:
In the United States, “binge” drinking among college students is an emerging public health concern due to the significant physical and psychological effects on young adults. The focus is on identifying interventions that can help decrease high-risk drinking behavior among this group of drinkers. One such intervention is Motivational interviewing (MI), a client-centered therapy that aims at resolving client ambivalence by developing discrepancy and engaging the client in change talk. Of late, there is a growing interest in determining the active ingredients that influence the alliance between the therapist and the client. This study is a secondary analysis of the data obtained from the Southern Methodist Alcohol Research Trial (SMART) project, a dismantling trial of MI and feedback among heavy drinking college students. The present project examines the relationship between therapist and client language in MI sessions on a sample of “binge” drinking college students. Of the 126 SMART tapes, 30 tapes (‘MI with feedback’ group = 15, ‘MI only’ group = 15) were randomly selected for this study. MISC 2.1, a mutually exclusive and exhaustive coding system, was used to code the audio/videotaped MI sessions. Therapist and client language were analyzed for communication characteristics. Overall, therapists adopted a MI consistent style and clients were found to engage in change talk. Counselor acceptance, empathy, spirit, and complex reflections were all significantly related to client change talk (p-values ranged from 0.001 to 0.047). Additionally, therapist ‘advice without permission’ and MI Inconsistent therapist behaviors were strongly correlated with client sustain talk (p-values ranged from 0.006 to 0.048). 
Simple linear regression models showed a significant correlation between MI consistent (MICO) therapist language (independent variable) and change talk (dependent variable), and between MI inconsistent (MIIN) therapist language (independent variable) and sustain talk (dependent variable). The study has several limitations, such as small sample size, self-selection bias, poor inter-rater reliability for the global scales and the lack of a temporal measure of therapist and client language. Future studies might consider a larger sample size to obtain more statistical power. In addition, the correlation between therapist language, client language and drinking outcome needs to be explored.
Abstract:
Interaction effects are of scientific interest in many areas of research. A common approach for investigating the interaction effect of two continuous covariates on a response variable is through a cross-product term in multiple linear regression. In epidemiological studies, the two-way analysis of variance (ANOVA) type of method has also been utilized to examine the interaction effect by replacing the continuous covariates with their discretized levels. However, the implications of the model assumptions of either approach have not been examined, and the statistical validation has only focused on the general method, not specifically on the interaction effect. In this dissertation, we investigated the validity of both approaches based on the mathematical assumptions for non-skewed data. We showed that linear regression may not be an appropriate model when the interaction effect exists because it implies a highly skewed distribution for the response variable. We also showed that the normality and constant variance assumptions required by ANOVA are not satisfied in the model where the continuous covariates are replaced with their discretized levels. Therefore, naïve application of the ANOVA method may lead to an incorrect conclusion. Given the problems identified above, we proposed a novel method, modified from the traditional ANOVA approach, to rigorously evaluate the interaction effect. The analytical expression of the interaction effect was derived based on the conditional distribution of the response variable given the discretized continuous covariates. A testing procedure that combines the p-values from each level of the discretized covariates was developed to test the overall significance of the interaction effect. According to the simulation study, the proposed method is more powerful than the least squares regression and the ANOVA method in detecting the interaction effect when data come from a trivariate normal distribution.
The proposed method was applied to a dataset from the National Institute of Neurological Disorders and Stroke (NINDS) tissue plasminogen activator (t-PA) stroke trial, and the baseline age-by-weight interaction effect was found to be significant in predicting the change from baseline in NIHSS at Month-3 among patients who received t-PA therapy.
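For orientation, the conventional cross-product approach that the abstract above takes as its starting point can be sketched as an ordinary least squares fit with an added x1*x2 column. This is not the dissertation's proposed method; the helper names and the noise-free toy data are assumptions chosen so that the fitted interaction coefficient is easy to check.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_interaction(x1, x2, y):
    """Least squares fit of y = b0 + b1*x1 + b2*x2 + b3*x1*x2
    via the normal equations; returns [b0, b1, b2, b3]."""
    rows = [[1.0, a, b, a * b] for a, b in zip(x1, x2)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Noise-free toy data generated with a known interaction of 0.5.
x1 = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
x2 = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
y = [1.0 + 2.0 * a - 1.0 * b + 0.5 * a * b for a, b in zip(x1, x2)]
beta = fit_interaction(x1, x2, y)
print(beta)  # last entry is the cross-product (interaction) coefficient
```

In this formulation the interaction is tested through the last coefficient alone, which is the modeling choice whose assumptions the abstract calls into question.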