905 results for LONGITUDINAL DATA-ANALYSIS
Abstract:
Estimation for bivariate right-censored data has been studied extensively over the past 15 years. In this paper we propose a new class of estimators for the bivariate survival function based on locally efficient estimation. We introduce the locally efficient estimator for bivariate right-censored data, state an asymptotic theorem, report the results of simulation studies, and illustrate the estimator's use in a brief data analysis.
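The locally efficient estimator itself is technically involved; as a point of reference only, a much simpler inverse-probability-of-censoring-weighted (IPCW) estimate of the bivariate survival function can be sketched for the special case of a common, fully observed censoring time. This is a baseline illustration under stated assumptions, not the authors' method, and the function name is hypothetical.

```python
import numpy as np

def ipcw_bivariate_survival(x1, x2, c, s, u):
    """Estimate S(s, u) = P(T1 > s, T2 > u) when both failure times are
    cut off by a common, fully observed censoring time C (for example,
    administrative censoring).  x1 and x2 are the censored times
    min(T1, C) and min(T2, C)."""
    x1, x2, c = (np.asarray(a, dtype=float) for a in (x1, x2, c))
    m = max(s, u)
    g_hat = np.mean(c > m)                # empirical P(C > max(s, u))
    if g_hat == 0:
        raise ValueError("no follow-up beyond max(s, u)")
    joint = np.mean((x1 > s) & (x2 > u))  # pairs with both observed times above the cutoffs
    return joint / g_hat                  # reweight for pairs lost to censoring

# quick check with independent unit exponentials: S(0.5, 0.5) = exp(-1) ~ 0.368
rng = np.random.default_rng(0)
t1, t2 = rng.exponential(1.0, 5000), rng.exponential(1.0, 5000)
cc = rng.uniform(0.6, 3.0, 5000)
print(ipcw_bivariate_survival(np.minimum(t1, cc), np.minimum(t2, cc), cc, 0.5, 0.5))
```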
Abstract:
We propose robust and efficient tests and estimators for gene-environment/gene-drug interactions in family-based association studies. The methodology is designed for studies in which haplotypes, quantitative phenotypes and complex exposure/treatment variables are analyzed. Using causal inference methodology, we derive family-based association tests and estimators for the genetic main effects and the interactions. The tests and estimators are robust against population admixture and stratification without requiring adjustment for confounding variables. We illustrate the practical relevance of our approach by an application to a COPD study. The data analysis suggests a gene-environment interaction between a SNP in the Serpine gene and smoking status/pack years of smoking that reduces the FEV1 volume by about 0.02 liter per pack year of smoking. Simulation studies show that the proposed methodology is sufficiently powered for realistic sample sizes and that it provides valid tests and effect size estimators in the presence of admixture and stratification.
Abstract:
Background: The recent development of semi-automated techniques for staining and analyzing flow cytometry samples has presented new challenges. Quality control and quality assessment are critical when developing new high throughput technologies and their associated information services. Our experience suggests that significant bottlenecks remain in the development of high throughput flow cytometry methods for data analysis and display. In particular, data quality control and quality assessment are crucial steps in processing and analyzing high throughput flow cytometry data. Methods: We propose a variety of graphical exploratory data analytic tools for exploring ungated flow cytometry data. We have implemented a number of specialized functions and methods in the Bioconductor package rflowcyt. We demonstrate the use of these approaches by investigating two independent sets of high throughput flow cytometry data. Results: We found that graphical representations can reveal substantial non-biological differences in samples. Empirical cumulative distribution function (ECDF) plots and summary scatterplots were especially useful in the rapid identification of problems not identified by manual review. Conclusions: Graphical exploratory data analytic tools are a quick and useful means of assessing data quality. We propose that the described visualizations be used as quality assessment tools and, where possible, for quality control.
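The tooling described is the R/Bioconductor package rflowcyt; the Python sketch below only mimics the ECDF-overlay idea for ungated data, and the default channel name is a made-up placeholder.

```python
import numpy as np
import matplotlib.pyplot as plt

def overlay_ecdfs(samples, labels, channel="FSC-H"):
    """Overlay the empirical CDF of one ungated channel across samples;
    curves that stand apart from the bundle flag samples worth review."""
    for values, label in zip(samples, labels):
        x = np.sort(np.asarray(values, dtype=float))
        y = np.arange(1, len(x) + 1) / len(x)   # fraction of events <= x
        plt.step(x, y, where="post", label=label)
    plt.xlabel(channel)
    plt.ylabel("ECDF")
    plt.legend(fontsize="small")
    plt.show()
```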
Abstract:
Visualization and exploratory analysis are important parts of any data analysis and become more challenging when the data are voluminous and high-dimensional. One such example is environmental monitoring data, which are often collected over time and at multiple locations, resulting in a geographically indexed multivariate time series. Financial data, although not necessarily containing a geographic component, are another source of high-volume multivariate time series. We present the mvtsplot function, which provides a method for visualizing multivariate time series data. We outline the basic design concepts and provide some examples of its usage by applying it to a database of ambient air pollution measurements in the United States and to a hypothetical portfolio of stocks.
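mvtsplot itself is an R function; a rough Python analogue of its central display (one categorized color band per series over time) might look like the sketch below. The data layout and function name are assumptions for illustration, not taken from the paper.

```python
import numpy as np
import matplotlib.pyplot as plt

def mvts_band_plot(data, names):
    """data: (n_series, n_time) array.  Each row is cut into low/medium/high
    terciles and drawn as one horizontal color band, roughly mirroring the
    central panel of mvtsplot."""
    data = np.asarray(data, dtype=float)
    lo, hi = np.nanpercentile(data, [100 / 3, 200 / 3], axis=1, keepdims=True)
    coded = (data > lo).astype(int) + (data > hi).astype(int)  # 0/1/2 per cell
    plt.imshow(coded, aspect="auto", interpolation="nearest", cmap="viridis")
    plt.yticks(range(len(names)), names)
    plt.xlabel("time")
    plt.show()
```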
Abstract:
INTRODUCTION: A recent report described a possible interaction between tenofovir (TFV) and efavirenz (EFV). Patients developed neuropsychiatric manifestations upon the introduction of TFV into a stable EFV-containing regimen. We evaluated the possibility of a pharmacokinetic interaction between TFV and EFV by assessing cross-sectional and longitudinal data in 169 individuals receiving EFV. RESULTS: EFV plasma area-under-the-curve (AUC) levels were comparable among individuals receiving TFV (n=18) and those not receiving it (n=151): 57,962 versus 52,293 ng*h/ml. However, under conditions of limited EFV metabolism, that is, in the group of 23 individuals carrying two copies of CYP2B6 loss/diminished-function alleles, plasma AUC values were highest among individuals receiving TFV (n=5, 353,031 ng*h/ml), compared with those not receiving TFV (n=18, 180,689 ng*h/ml). Statistical analysis identified both a global, sixfold effect of CYP2B6 loss/diminished function (P < 0.0001) and a significant interaction between the number of loss/diminished-function alleles and co-medication with TFV (P = 0.009). CONCLUSION: Although there is no clear evidence for a pharmacokinetic interaction between TFV and EFV, we cannot rule out an interaction between these drugs restricted to individuals who are slow EFV metabolizers.
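The published analysis is more involved, but the reported interaction test can be approximated by a linear model on log AUC with an allele-count-by-TFV term. A minimal statsmodels sketch, with hypothetical column names:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def fit_efv_interaction(df: pd.DataFrame):
    """df columns (hypothetical): auc (EFV AUC, ng*h/ml), n_lof (0-2 copies
    of CYP2B6 loss/diminished-function alleles), tfv (1 if on TFV, else 0)."""
    # log-transform AUC so allele and co-medication effects act multiplicatively
    fit = smf.ols("np.log(auc) ~ n_lof * tfv", data=df).fit()
    # 'n_lof:tfv' is the interaction term; its p-value is the interaction test
    return fit.params, fit.pvalues
```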
Abstract:
BACKGROUND: High intercoder reliability (ICR) is required in qualitative content analysis for assuring quality when more than one coder is involved in data analysis. The literature offers few standardized procedures for ICR assessment in qualitative content analysis. OBJECTIVE: To illustrate how ICR assessment can be used to improve coding in qualitative content analysis. METHODS: Key steps of the procedure are presented, drawing on data from a qualitative study on patients' perspectives on low back pain. RESULTS: First, a coding scheme was developed using a comprehensive inductive and deductive approach. Second, 10 transcripts were coded independently by two researchers, and ICR was calculated. The resulting kappa value of .67 can be regarded as satisfactory to solid. Moreover, varying agreement rates helped to identify problems in the coding scheme: low agreement rates, for instance, indicated that the respective codes were defined too broadly and needed clarification. In a third step, the results of the analysis were used to improve the coding scheme, leading to consistent and high-quality results. DISCUSSION: The quantitative approach of ICR assessment is a viable instrument for quality assurance in qualitative content analysis. Kappa values and close inspection of agreement rates help to estimate and increase the quality of coding. This approach facilitates good practice in coding and enhances the credibility of the analysis, especially when large samples are interviewed, different coders are involved, and quantitative results are presented.
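For reference, Cohen's kappa corrects the observed agreement p_o for the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). A minimal computation on invented toy coding vectors:

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders assigning nominal codes to the same segments."""
    n = len(codes_a)
    assert n == len(codes_b) and n > 0
    p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n   # observed agreement
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # chance agreement: sum over categories of the product of marginal proportions
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

print(cohens_kappa(list("AABBC" * 2), list("AABBB" * 2)))  # two toy coding vectors
```

On these toy vectors p_o = 0.8 and p_e = 0.4, giving kappa of about 0.67, the same range as the value reported in the study above.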
Abstract:
Exposimeters are increasingly applied in bioelectromagnetic research to determine personal radiofrequency electromagnetic field (RF-EMF) exposure. The main advantages of exposimeter measurements are their convenient handling for study participants and the large amount of personal exposure data, which can be obtained for several RF-EMF sources. However, the large proportion of measurements below the detection limit is a challenge for data analysis. With the robust ROS (regression on order statistics) method, summary statistics can be calculated by fitting an assumed distribution to the observed data. We used a preliminary sample of 109 weekly exposimeter measurements from the QUALIFEX study to compare summary statistics computed by robust ROS with a naïve approach, where values below the detection limit were replaced by the value of the detection limit. For the total RF-EMF exposure, differences between the naïve approach and the robust ROS were moderate for the 90th percentile and the arithmetic mean. However, exposure contributions from minor RF-EMF sources were considerably overestimated with the naïve approach. This results in an underestimation of the exposure range in the population, which may bias the evaluation of potential exposure-response associations. We conclude from our analyses that summary statistics of exposimeter data calculated by robust ROS are more reliable and more informative than estimates based on a naïve approach. Nevertheless, estimates of source-specific medians or even lower percentiles depend on the assumed data distribution and should be considered with caution.
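A simplified, single-detection-limit version of the ROS idea can be sketched as follows; the robust ROS used in the study handles multiple detection limits, and the function name and plotting-position choice here are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def simple_ros(detected, n_below, dl):
    """Simplified regression on order statistics for one detection limit:
    fit a lognormal to the detected values via their normal scores, impute
    the nondetects from the fitted line, then summarize the combined sample."""
    detected = np.sort(np.asarray(detected, dtype=float))
    assert np.all(detected >= dl)
    n = len(detected) + n_below
    pp = np.arange(1, n + 1) / (n + 1)       # Weibull plotting positions
    # with a single limit, the detected values occupy the upper ranks
    slope, intercept, *_ = stats.linregress(stats.norm.ppf(pp[n_below:]),
                                            np.log(detected))
    imputed = np.exp(intercept + slope * stats.norm.ppf(pp[:n_below]))
    full = np.concatenate([imputed, detected])
    return full.mean(), np.percentile(full, 90)  # arithmetic mean, 90th percentile
```

The two returned statistics correspond to the arithmetic mean and 90th percentile discussed in the abstract; unlike the naive substitution approach, the imputed nondetects fall below the detection limit rather than piling up on it.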
Abstract:
BACKGROUND: Falls are common and serious problems in older adults. The goal of this study was to examine whether preclinical disability predicts incident falls in a European population of community-dwelling older adults. METHODS: Secondary data analysis was performed on a population-based longitudinal study of 1644 community-dwelling older adults living in London, U.K., Hamburg, Germany, and Solothurn, Switzerland. Data were collected at baseline and 1-year follow-up using a self-administered multidimensional health risk appraisal questionnaire, including validated questions on falls, mobility disability status (high function, preclinical disability, task difficulty), and demographic and health-related characteristics. Associations were evaluated using bivariate and multivariate logistic regression analyses. RESULTS: Overall incidence of falls was 24% and increased with worsening mobility disability status: high function (17%), preclinical disability (32%), task difficulty (40%); test of trend p < .003. In multivariate analysis adjusting for other fall risk factors, preclinical disability (odds ratio [OR] = 1.7; 95% confidence interval [CI], 1.1-2.5), task difficulty (OR = 1.7; 95% CI, 1.1-2.6), and history of falls (OR = 4.7; 95% CI, 3.5-6.3) were the strongest significant predictors of falls. In stratified multivariate analyses, preclinical disability predicted falls equally in participants with (OR = 1.7; 95% CI, 1.0-3.0) and without a history of falls (OR = 1.8; 95% CI, 1.1-3.0). CONCLUSIONS: This study provides longitudinal evidence that self-reported preclinical disability predicts incident falls at 1-year follow-up independent of other self-reported fall risk factors. Multidimensional geriatric assessment that includes preclinical disability may provide a unique early warning system as well as potential targets for intervention.
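The odds ratios reported here come from multivariate logistic regression; a generic model of that kind (variable and column names assumed for illustration, not taken from the study's data set) could be fit as:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def fit_fall_model(df: pd.DataFrame):
    """df columns (hypothetical): fall_1y (incident fall at follow-up, 0/1),
    mobility ('high', 'preclinical', or 'task_difficulty'), prior_falls (0/1),
    plus demographic and health covariates such as age and sex."""
    fit = smf.logit(
        "fall_1y ~ C(mobility, Treatment('high')) + prior_falls + age + sex",
        data=df,
    ).fit()
    # exponentiated coefficients are odds ratios; conf_int() gives the 95% CIs
    return np.exp(fit.params), np.exp(fit.conf_int())
```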
Abstract:
Turrialba is one of the largest and most active stratovolcanoes in the Central Cordillera of Costa Rica and an excellent target for validating satellite data with ground-based measurements, owing to its high elevation, relative ease of access, and persistent elevated SO2 degassing. The Ozone Monitoring Instrument (OMI) aboard the Aura satellite makes daily global observations of atmospheric trace gases and is used in this investigation to obtain volcanic SO2 retrievals in the Turrialba volcanic plume. We present and evaluate the relative accuracy of two OMI SO2 data analysis procedures: the automatic Band Residual Index (BRI) technique and the manual Normalized Cloud-mass (NCM) method. We find a linear correlation and good quantitative agreement between SO2 burdens derived from the BRI and NCM techniques, with an improved correlation when wet-season data are excluded. We also present the first comparisons between volcanic SO2 emission rates obtained from ground-based mini-DOAS measurements at Turrialba and three new OMI SO2 data analysis techniques: the MODIS smoke estimation, OMI SO2 lifetime, and OMI SO2 transect techniques. A robust validation of OMI SO2 retrievals was made, with both qualitative and quantitative agreement under specific atmospheric conditions, demonstrating the utility of satellite measurements for estimating accurate SO2 emission rates and monitoring passively degassing volcanoes.
DIMENSION REDUCTION FOR POWER SYSTEM MODELING USING PCA METHODS CONSIDERING INCOMPLETE DATA READINGS
Abstract:
Principal Component Analysis (PCA) is a popular method for dimension reduction that can be used in many fields, including data compression, image processing, and exploratory data analysis. However, the traditional PCA method has several drawbacks: it is not efficient for dealing with high-dimensional data, and it cannot compute sufficiently accurate principal components when a relatively large portion of the data is missing. In this report, we propose to use the EM-PCA method for dimension reduction of power system measurement data with missing values, and we provide a comparative study of the traditional PCA and EM-PCA methods. Our extensive experimental results show that the EM-PCA method is more effective and more accurate than traditional PCA for dimension reduction of power system measurement data when a large portion of the data set is missing.
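A common textbook formulation of EM-PCA alternates between a rank-k reconstruction of the data and re-imputation of the missing entries from that reconstruction. The sketch below follows that generic scheme under stated assumptions; it is not the report's actual implementation.

```python
import numpy as np

def em_pca(X, k, n_iter=100, tol=1e-6):
    """EM-style PCA with missing data (NaN entries): alternately reconstruct
    the data from a rank-k SVD and re-impute the missing entries."""
    X = np.array(X, dtype=float)
    missing = np.isnan(X)
    # initialization: fill missing entries with column means
    col_means = np.nanmean(X, axis=0)
    X[missing] = np.take(col_means, np.nonzero(missing)[1])
    for _ in range(n_iter):
        mu = X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
        # M-step: rank-k reconstruction from the leading components
        X_hat = mu + (U[:, :k] * s[:k]) @ Vt[:k]
        delta = np.abs(X[missing] - X_hat[missing]).max() if missing.any() else 0.0
        X[missing] = X_hat[missing]       # E-step: re-impute missing entries
        if delta < tol:
            break
    return X, Vt[:k]                      # completed data and principal directions
```

Convergence is declared when the imputed entries stop changing; with no missing data the loop reduces to ordinary PCA via the SVD.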
Abstract:
Dr. Rossi discusses the common errors made when fitting statistical models to data. The discussion focuses on the planning, data analysis, and interpretation phases of a statistical analysis and highlights the errors commonly made by researchers in each of these phases. The implications of these errors are discussed, along with methods that can be used to prevent them, and a prescription for carrying out a correct statistical analysis is presented.
Abstract:
Can one observe an increasing level of individual lack of orientation because of rapid social change in modern societies? This question is examined using data from a representative longitudinal survey in Germany conducted in 2002–04. The study examines the role of education, age, sex, region (east/west), and political orientation for the explanation of anomia and its development. First we present the different sources of anomie in modern societies, based on the theoretical foundations of Durkheim and Merton, and introduce the different definitions of anomia, including our own cognitive version. Then we deduce several hypotheses from the theory, which we test by means of longitudinal data for the period 2002–04 in Germany using the latent growth curve model as our statistical method. The empirical findings show that all the sociodemographic variables, including political orientation, are strong predictors of the initial level of anomia. Regarding the development of anomia over time (2002–04), only the region (west) has a significant impact. In particular, the results of a multi-group analysis show that western German people with a right-wing political orientation become more anomic over this period. The article concludes with some theoretical implications.
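For readers unfamiliar with the method, a standard linear latent growth curve model expresses each respondent's anomia trajectory through a random intercept (initial level) and a random slope (development over time). This is the generic formulation, not necessarily the paper's exact specification:

```latex
% linear latent growth curve for anomia measured at occasions t = 0, 1, 2
y_{it} = \alpha_i + \beta_i t + \varepsilon_{it}, \qquad
\begin{pmatrix} \alpha_i \\ \beta_i \end{pmatrix}
\sim \mathcal{N}\!\left(
  \begin{pmatrix} \mu_\alpha \\ \mu_\beta \end{pmatrix},
  \begin{pmatrix} \sigma^2_\alpha & \sigma_{\alpha\beta} \\
                  \sigma_{\alpha\beta} & \sigma^2_\beta \end{pmatrix}
\right)
```

Predictors such as education, age, sex, region, and political orientation then enter as covariates of the latent intercept (initial level of anomia) and the latent slope (its change over 2002-04).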
Abstract:
BACKGROUND Among children with wheeze and recurrent cough there is great variation in clinical presentation and time course of the disease. We previously distinguished 5 phenotypes of wheeze and cough in early childhood by applying latent class analysis to longitudinal data from a population-based cohort (original cohort). OBJECTIVE To validate previously identified phenotypes of childhood cough and wheeze in an independent cohort. METHODS We included 903 children reporting wheeze or recurrent cough from an independent population-based cohort (validation cohort). As in the original cohort, we used latent class analysis to identify phenotypes on the basis of symptoms of wheeze and cough at 2 time points (preschool and school age) and objective measurements of atopy, lung function, and airway responsiveness (school age). Prognostic outcomes (wheeze, bronchodilator use, cough apart from colds) 5 years later were compared across phenotypes. RESULTS When using a 5-phenotype model, the analysis distinguished 3 phenotypes of wheeze and 2 of cough, as in the original cohort. Two phenotypes were closely similar in both cohorts: atopic persistent wheeze (persistent multiple-trigger wheeze and chronic cough, atopy and reduced lung function, poor prognosis) and transient viral wheeze (early-onset transient wheeze with viral triggers, favorable prognosis). The other phenotypes differed more between cohorts; these differences might be explained by differences in age at measurement. CONCLUSIONS Applying the same method to 2 different cohorts, we consistently identified 2 phenotypes of wheeze (atopic persistent wheeze, transient viral wheeze), suggesting that these represent distinct disease processes. Differences found in other phenotypes suggest that the age when features are assessed is critical and should be considered carefully when defining phenotypes.
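Latent class models of this kind are typically fit by expectation-maximization. A bare-bones sketch for binary symptom indicators follows; the published model also incorporates continuous measurements such as lung function, and all names here are illustrative.

```python
import numpy as np

def lca_em(Y, k, n_iter=200, seed=0):
    """EM for a latent class model with binary indicators.
    Y: (n_children, n_symptoms) 0/1 array; k: number of classes."""
    rng = np.random.default_rng(seed)
    n, p = Y.shape
    pi = np.full(k, 1.0 / k)                  # class prevalences
    theta = rng.uniform(0.25, 0.75, (k, p))   # P(symptom j | class c)
    for _ in range(n_iter):
        # E-step: class responsibilities from Bernoulli log-likelihoods
        log_lik = (Y[:, None, :] * np.log(theta)
                   + (1 - Y[:, None, :]) * np.log(1 - theta)).sum(axis=2)
        log_post = np.log(pi) + log_lik
        log_post -= log_post.max(axis=1, keepdims=True)   # for numerical stability
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update prevalences and symptom probabilities
        pi = resp.mean(axis=0)
        theta = (resp.T @ Y + 1e-6) / (resp.sum(axis=0)[:, None] + 2e-6)
        theta = np.clip(theta, 1e-6, 1 - 1e-6)
    return pi, theta, resp   # prevalences, class profiles, posterior memberships
```

Each fitted class profile (a row of theta) plays the role of a phenotype; children are assigned to the class with the highest posterior membership.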