39 results for zero-inflated data
at Université de Lausanne, Switzerland
Abstract:
BACKGROUND: Inflammatory bowel disease can decrease the quality of life and induce work disability. We sought to (1) identify and quantify the predictors of disease-specific work disability in patients with inflammatory bowel disease and (2) assess the suitability of using cross-sectional data to predict future outcomes, using the Swiss Inflammatory Bowel Disease Cohort Study data. METHODS: A total of 1187 patients were enrolled and followed up for an average of 13 months. Predictors included patient and disease characteristics and drug utilization. Potential predictors were identified through an expert panel and published literature. We calculated adjusted effect estimates with 95% confidence intervals using logistic and zero-inflated Poisson regression. RESULTS: Overall, 699 patients (58.9%) had Crohn's disease and 488 (41.1%) had ulcerative colitis. The most important predictors for temporary work disability in patients with Crohn's disease included gender, disease duration, disease activity, C-reactive protein level, smoking, depressive symptoms, fistulas, extraintestinal manifestations, and the use of immunosuppressants/steroids. Temporary work disability in patients with ulcerative colitis was associated with age, disease duration, disease activity, and the use of steroids/antibiotics. In all patients, disease activity emerged as the only predictor of permanent work disability. Comparing data at enrollment versus follow-up yielded substantial differences regarding disability and predictors, with follow-up data showing greater predictor effects. CONCLUSIONS: We identified predictors of work disability in patients with Crohn's disease and ulcerative colitis. Our findings can help in forecasting these disease courses and guide the choice of appropriate measures to prevent adverse outcomes. Comparing cross-sectional and longitudinal data showed that cohort studies are indispensable for examining disability.
Abstract:
OBJECTIVES: Patients with inflammatory bowel disease (IBD) have a high resource consumption, with considerable costs for the healthcare system. In a system with sparse resources, treatment is influenced not only by clinical judgement but also by resource consumption. We aimed to determine the resource consumption of IBD patients and to identify its significant predictors. MATERIALS AND METHODS: Data from the prospective Swiss Inflammatory Bowel Disease Cohort Study were analysed for the resource consumption endpoints hospitalization and outpatient consultations at enrolment [1187 patients; 41.1% ulcerative colitis (UC), 58.9% Crohn's disease (CD)] and at 1-year follow-up (794 patients). Predictors of interest were chosen through an expert panel and a review of the relevant literature. Logistic regressions were used for binary endpoints, and negative binomial regressions and zero-inflated Poisson regressions were used for count data. RESULTS: For CD, fistula, use of biologics and disease activity were significant predictors for hospitalization days (all P-values <0.001); age, sex, steroid therapy and biologics were significant predictors for the number of outpatient visits (P=0.0368, 0.023, 0.0002, 0.0003, respectively). For UC, biologics, C-reactive protein, smoke quitters, age and sex were significantly predictive for hospitalization days (P=0.0167, 0.0003, 0.0003, 0.0076 and 0.0175 respectively); disease activity and immunosuppressive therapy predicted the number of outpatient visits (P=0.0009 and 0.0017, respectively). The results of multivariate regressions are shown in detail. CONCLUSION: Several highly significant clinical predictors for resource consumption in IBD were identified that might be considered in medical decision-making. In terms of resource consumption and its predictors, CD and UC show a different behaviour.
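The zero-inflated Poisson model named above for count endpoints such as hospitalization days can be sketched as a mixture: with probability pi an observation is a structural zero, otherwise it follows a Poisson distribution with rate lam. The function names and parameter values below are illustrative assumptions, not taken from the study; in a regression setting lam and pi would each depend on covariates (typically through a log link and a logit link).

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with rate lam and extra-zero probability pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from two sources: structural zeros and Poisson zeros.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_log_likelihood(counts, lam, pi):
    """Log-likelihood of observed counts under fixed lam and pi."""
    return sum(math.log(zip_pmf(k, lam, pi)) for k in counts)


# Illustrative values only: 30% structural zeros, Poisson rate of 2.
LAM, PI = 2.0, 0.3
total_probability = sum(zip_pmf(k, LAM, PI) for k in range(61))
mean_count = sum(k * zip_pmf(k, LAM, PI) for k in range(61))
```

The mean of the mixture is (1 - pi) * lam, which is lower than the plain Poisson mean because of the excess zeros.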
Abstract:
Time-lapse crosshole ground-penetrating radar (GPR) data, collected while infiltration occurs, can provide valuable information regarding the hydraulic properties of the unsaturated zone. In particular, the stochastic inversion of such data provides estimates of parameter uncertainties, which are necessary for hydrological prediction and decision making. Here, we investigate the effect of different infiltration conditions on the stochastic inversion of time-lapse, zero-offset-profile, GPR data. Inversions are performed using a Bayesian Markov-chain-Monte-Carlo methodology. Our results clearly indicate that considering data collected during a forced infiltration test helps to better refine soil hydraulic properties compared to data collected under natural infiltration conditions.
Abstract:
There are far-reaching conceptual similarities between bi-static surface georadar and post-stack, "zero-offset" seismic reflection data, which is expressed in largely identical processing flows. One important difference is, however, that standard deconvolution algorithms routinely used to enhance the vertical resolution of seismic data are notoriously problematic or even detrimental to the overall signal quality when applied to surface georadar data. We have explored various options for alleviating this problem and have tested them on a geologically well-constrained surface georadar dataset. Standard stochastic and direct deterministic deconvolution approaches proved to be largely unsatisfactory. While least-squares-type deterministic deconvolution showed some promise, the inherent uncertainties involved in estimating the source wavelet introduced some artificial "ringiness". In contrast, we found spectral balancing approaches to be effective, practical and robust means for enhancing the vertical resolution of surface georadar data, particularly, but not exclusively, in the uppermost part of the georadar section, which is notoriously plagued by the interference of the direct air- and groundwaves. For the data considered in this study, it can be argued that band-limited spectral blueing may provide somewhat better results than standard band-limited spectral whitening, particularly in the uppermost part of the section affected by the interference of the air- and groundwaves. Interestingly, this finding is consistent with the fact that the amplitude spectrum resulting from least-squares-type deterministic deconvolution is characterized by a systematic enhancement of higher frequencies at the expense of lower frequencies and hence is blue rather than white. It is also consistent with increasing evidence that spectral "blueness" is a seemingly universal, albeit enigmatic, property of the distribution of reflection coefficients in the Earth. 
Our results therefore indicate that spectral balancing techniques in general and spectral blueing in particular represent simple, yet effective means of enhancing the vertical resolution of surface georadar data and, in many cases, could turn out to be a preferable alternative to standard deconvolution approaches.
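As a rough illustration of the spectral balancing idea discussed above, the sketch below flattens the amplitude spectrum of a trace inside a frequency band (whitening) or tilts it towards higher frequencies (blueing) while keeping the phase. It is a simplified assumption of how such a filter could look, not the authors' processing flow: the bin limits, the power-law exponent and the synthetic trace are hypothetical, a plain discrete Fourier transform stands in for an FFT library, and practical implementations typically smooth the amplitude spectrum rather than normalising each bin.

```python
import cmath
import math


def dft(samples):
    """Plain O(n^2) discrete Fourier transform (adequate for short traces)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def spectral_balance(trace, low_bin, high_bin, exponent=0.0, floor=1e-12):
    """Band-limited balancing: exponent 0 whitens, exponent > 0 gives a blue spectrum."""
    n = len(trace)
    shaped = []
    for k, coefficient in enumerate(dft(trace)):
        frequency = min(k, n - k)  # positive and negative frequencies share a bin index
        amplitude = abs(coefficient)
        if low_bin <= frequency <= high_bin and amplitude > floor:
            # Keep the phase, replace the amplitude by frequency ** exponent.
            shaped.append(coefficient / amplitude * frequency ** exponent)
        else:
            shaped.append(0j)  # outside the pass band
    return [value.real for value in inverse_dft(shaped)]


# Hypothetical 16-sample trace: a decaying oscillation.
TRACE = [math.exp(-0.3 * t) * math.cos(0.8 * t) for t in range(16)]
whitened = spectral_balance(TRACE, low_bin=1, high_bin=5)
blued = spectral_balance(TRACE, low_bin=1, high_bin=5, exponent=1.0)
whitened_amplitudes = [abs(c) for c in dft(whitened)]
blued_amplitudes = [abs(c) for c in dft(blued)]
```

After whitening, every bin inside the band has unit amplitude; after blueing with exponent 1, the amplitude grows linearly with the bin index.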
Abstract:
Time-lapse geophysical data acquired during transient hydrological experiments are being increasingly employed to estimate subsurface hydraulic properties at the field scale. In particular, crosshole ground-penetrating radar (GPR) data, collected while water infiltrates into the subsurface either by natural or artificial means, have been demonstrated in a number of studies to contain valuable information concerning the hydraulic properties of the unsaturated zone. Previous work in this domain has considered a variety of infiltration conditions and different amounts of time-lapse GPR data in the estimation procedure. However, the particular benefits and drawbacks of these different strategies as well as the impact of a variety of key and common assumptions remain unclear. Using a Bayesian Markov-chain-Monte-Carlo stochastic inversion methodology, we examine in this paper the information content of time-lapse zero-offset-profile (ZOP) GPR traveltime data, collected under three different infiltration conditions, for the estimation of van Genuchten-Mualem (VGM) parameters in a layered subsurface medium. Specifically, we systematically analyze synthetic and field GPR data acquired under natural loading and two rates of forced infiltration, and we consider the value of incorporating different amounts of time-lapse measurements into the estimation procedure. Our results confirm that, for all infiltration scenarios considered, the ZOP GPR traveltime data contain important information about subsurface hydraulic properties as a function of depth, with forced infiltration offering the greatest potential for VGM parameter refinement because of the higher stressing of the hydrological system. Considering greater amounts of time-lapse data in the inversion procedure is also found to help refine VGM parameter estimates. 
Quite importantly, however, inconsistencies observed in the field results point to the strong possibility that posterior uncertainties are being influenced by model structural errors, which in turn underlines the fundamental importance of a systematic analysis of such errors in future related studies.
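To make the Bayesian Markov-chain-Monte-Carlo idea concrete, here is a deliberately reduced sketch: a random-walk Metropolis sampler that infers a single relative permittivity from zero-offset traveltimes, using the standard relation that radar velocity equals the vacuum speed of light divided by the square root of the permittivity. It is not the inversion used in the study, which estimates VGM parameters through a hydrological model; the borehole spacing, noise level, prior bounds and traveltimes below are made-up values for illustration.

```python
import math
import random

C_VACUUM = 0.3  # speed of light in m/ns


def traveltime(permittivity, distance):
    """Straight-ray traveltime (ns) across `distance` metres of uniform material."""
    return distance * math.sqrt(permittivity) / C_VACUUM


def log_posterior(permittivity, observed, distance, sigma, bounds=(1.0, 40.0)):
    """Gaussian log-likelihood plus a uniform prior within hypothetical bounds."""
    if not bounds[0] <= permittivity <= bounds[1]:
        return -math.inf
    predicted = traveltime(permittivity, distance)
    return -0.5 * sum(((t - predicted) / sigma) ** 2 for t in observed)


def metropolis(observed, distance, sigma, start, step, n_steps, seed=0):
    """Random-walk Metropolis sampler; `start` must lie inside the prior bounds."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, observed, distance, sigma)
    chain = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, observed, distance, sigma)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


# Made-up traveltimes (ns) scattered around 50 ns, i.e. a permittivity near 9 at 5 m spacing.
OBSERVED_NS = [50.5, 49.7, 50.2, 49.6, 50.1]
chain = metropolis(OBSERVED_NS, distance=5.0, sigma=0.5, start=12.0, step=0.1, n_steps=5000)
posterior = chain[1000:]  # discard burn-in
posterior_mean = sum(posterior) / len(posterior)
posterior_std = math.sqrt(sum((v - posterior_mean) ** 2 for v in posterior) / len(posterior))
```

The spread of the retained samples is what provides the parameter uncertainty estimate; adding more traveltime observations narrows it, which mirrors the abstract's point about the value of additional time-lapse data.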
Abstract:
The OLS estimator of the intergenerational earnings correlation is biased towards zero, while the instrumental variables estimator is biased upwards. The first of these results arises because of measurement error, while the latter rests on the presumption that the education of the parent family is an invalid instrument. We propose a panel data framework for quantifying the asymptotic biases of these estimators, as well as a mis-specification test for the IV estimator. [Author]
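The direction of both biases can be seen in the textbook single-regressor case. The notation below is an assumption for illustration and is not taken from the paper: child earnings are y = βx + ε, parental earnings x are observed with classical measurement error as x̃ = x + u (u uncorrelated with x, ε and the instrument), and z is the instrument (parental education).

```latex
% Illustrative notation, not the paper's: y = \beta x + \varepsilon, \tilde{x} = x + u
\[
\operatorname{plim}\hat{\beta}_{\mathrm{OLS}}
  = \beta\,\frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2},
\qquad
\operatorname{plim}\hat{\beta}_{\mathrm{IV}}
  = \beta + \frac{\operatorname{Cov}(z,\varepsilon)}{\operatorname{Cov}(z,\tilde{x})}.
\]
```

With β > 0, the first factor is below one whenever the measurement error variance is positive, so OLS is attenuated towards zero. The IV estimator is pushed upwards when the instrument is positively correlated with both the regressor and the error term, which is what an invalid instrument of this kind would imply.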
Abstract:
Volumes of data used in science and industry are growing rapidly. When researchers face the challenge of analyzing them, their format is often the first obstacle. The lack of standardized ways of exploring different data layouts means the problem must be solved from scratch each time. The ability to access data in a rich, uniform manner, e.g. using Structured Query Language (SQL), would offer expressiveness and user-friendliness. Comma-separated values (CSV) files are one of the most common data storage formats. Despite the format's simplicity, handling it becomes non-trivial as file size grows. Importing CSVs into existing databases is time-consuming and troublesome, or even impossible if the horizontal dimension reaches thousands of columns. Most databases are optimized for handling large numbers of rows rather than columns; therefore, performance for datasets with non-typical layouts is often unacceptable. Other challenges include schema creation, updates and repeated data imports. To address the above-mentioned problems, I present a system for accessing very large CSV-based datasets by means of SQL. It is characterized by: a "no copy" approach, in which data stay mostly in the CSV files; "zero configuration", with no need to specify a database schema; an implementation in C++ with boost [1], SQLite [2] and Qt [3] that doesn't require installation and has a very small size; query rewriting, dynamic creation of indices for appropriate columns and static data retrieval directly from CSV files, which ensure efficient plan execution; effortless support for millions of columns; per-value typing, which makes mixed text/number data easy to use; and a very simple network protocol that provides an efficient interface for MATLAB and reduces implementation time for other languages. The software is available as freeware along with educational videos on its website [4]. It doesn't need any prerequisites to run, as all of the libraries are included in the distribution package.
I test it against existing database solutions using a battery of benchmarks and discuss the results.
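The goal described above, querying CSV data with SQL without declaring a schema, can be illustrated with Python's standard `csv` and `sqlite3` modules. This is a sketch of the idea only, not the presented system: unlike its "no copy" design, the code below imports the rows into an in-memory SQLite table, so it inherits SQLite's column limit (2000 by default). The helper names and the sample data are hypothetical.

```python
import csv
import io
import sqlite3


def parse_value(text):
    """Per-value typing: try int, then float, otherwise keep the text."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def load_csv_into_sqlite(csv_text, table="data"):
    """Build an in-memory table whose columns come from the CSV header row."""
    rows = csv.reader(io.StringIO(csv_text))
    header = next(rows)
    conn = sqlite3.connect(":memory:")
    # Quote identifiers so arbitrary header names work; no column types are declared.
    columns = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
    conn.execute('CREATE TABLE "{}" ({})'.format(table, columns))
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(
        'INSERT INTO "{}" VALUES ({})'.format(table, placeholders),
        ([parse_value(cell) for cell in row] for row in rows),
    )
    return conn


# Hypothetical sample with mixed text and numbers in one column.
SAMPLE = "gene,sample_a,sample_b\nBRCA1,1.5,2\nTP53,n/a,4\n"
conn = load_csv_into_sqlite(SAMPLE)
result = conn.execute(
    "SELECT gene, typeof(sample_a), sample_b FROM data ORDER BY gene"
).fetchall()
total_b = conn.execute("SELECT SUM(sample_b) FROM data").fetchone()[0]
```

Because the columns are created without declared types, SQLite stores each value with its own type, so a column can hold both `1.5` and `n/a` without a failed import.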
Abstract:
OBJECTIVES: This study aimed at investigating whether data from medical teleconsultations may contribute to influenza surveillance. METHODS: International Classification of Primary Care 2nd Edition (ICPC-2) codes were used to analyse the proportion of teleconsultations due to influenza-related symptoms. Results were compared with the weekly Swiss Sentinel reports. RESULTS: When using the ICPC-2 code for fever we could reproduce the seasonal influenza peaks of the winter seasons 07/08, 08/09 and 09/10 as depicted by the Sentinel data. For the pandemic influenza 09/10, we detected a much higher first peak in summer 2009 which correlated with a potential underreporting in the Sentinel system. CONCLUSIONS: ICPC-2 data from medical teleconsultations allow influenza surveillance in real time and correlate very well with the Swiss Sentinel system.
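The surveillance indicator described in the methods, the share of teleconsultations with an influenza-related code, amounts to a weekly proportion. A minimal sketch follows, assuming each consultation is recorded as a week label plus a set of ICPC-2 codes; the record layout and the toy records are hypothetical, and A03 is the ICPC-2 code for fever.

```python
from collections import defaultdict

FEVER_CODE = "A03"  # ICPC-2 code for fever


def weekly_proportion(consultations, code=FEVER_CODE):
    """Return {week: share of that week's consultations carrying `code`}."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for week, codes in consultations:
        totals[week] += 1
        if code in codes:
            matches[week] += 1
    return {week: matches[week] / totals[week] for week in sorted(totals)}


# Toy records for illustration only: (ISO week, ICPC-2 codes assigned to the call).
RECORDS = [
    ("2009-W28", {"A03", "R05"}),
    ("2009-W28", {"D01"}),
    ("2009-W28", {"A03"}),
    ("2009-W29", {"R05"}),
    ("2009-W29", {"A03"}),
]
proportions = weekly_proportion(RECORDS)
```

Plotting these weekly shares against an external reference series is then a matter of aligning the week labels.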
Abstract:
This letter describes a data telemetry biomedical experiment. An implant, consisting of a biometric data sensor, electronics, an antenna, and a biocompatible capsule, is described. All the elements were co-designed in order to maximize the transmission distance. The device was implanted in a pig for an in vivo experiment of temperature monitoring.
Abstract:
To make full use of research data, the bioscience community needs to adopt technologies and reward mechanisms that support interoperability and promote the growth of an open 'data commoning' culture. Here we describe the prerequisites for data commoning and present an established and growing ecosystem of solutions using the shared 'Investigation-Study-Assay' framework to support that vision.
Abstract:
Tobacco control has been recognized as a main public health concern in Seychelles for the past two decades. Tobacco advertising, sponsoring and promotion have been banned for years, tobacco products are subject to high taxes, high-profile awareness programs are organized regularly, and several other control measures have been implemented. The Republic of Seychelles was the first country to ratify the WHO Framework Convention on Tobacco Control (FCTC) in the African region. Three population-based surveys have been conducted in adults in Seychelles and results showed a substantial decrease in the prevalence of smoking among adults between 1989 and 2004. A first survey in adolescents was conducted in Seychelles in 2002 (the Global Youth Tobacco Survey, GYTS) in a representative sample of 1321 girls and boys aged 13-15 years. The results show that approximately half of students had tried smoking and a quarter of both boys and girls had smoked at least one cigarette during the past 30 days. Although "current smoking" is defined differently in adolescents (≥1 cigarette during the past 30 days) and in adults (≥1 cigarette per day), which precludes direct comparison, the high smoking prevalence in youth in Seychelles likely predicts an increasing prevalence of tobacco use in the next adult generation, particularly in women. GYTS 2002 also provides important data on a wide range of specific individual and societal factors influencing tobacco use. Hence, GYTS can be a powerful tool for monitoring the situation of tobacco use in adolescents, for highlighting the need for new policy and programs, and for evaluating the impact of current and future programs.
Abstract:
A computerized handheld procedure is presented in this paper. It is intended as a database complementary tool, to enhance prospective risk analysis in the field of occupational health. The Pendragon forms software (version 3.2) has been used to implement acquisition procedures on Personal Digital Assistants (PDAs) and to transfer data to a computer in an MS-Access format. The proposed data acquisition strategy relies on the risk assessment method practiced at the Institute of Occupational Health Sciences (IST). It involves the use of a systematic hazard list and semi-quantitative risk assessment scales. A set of 7 modular forms has been developed to cover the basic needs of field audits. Despite the minor drawbacks observed, the results obtained so far show that handhelds are adequate to support field risk assessment and follow-up activities. Further improvements must still be made in order to increase the tool's effectiveness and field adequacy.