934 results for Data anonymization and sanitization
Abstract:
In situ diffusion experiments are performed in geological formations at underground research laboratories to overcome the limitations of laboratory diffusion experiments and investigate scale effects. Tracer concentrations are monitored at the injection interval during the experiment (dilution data) and measured from host rock samples around the injection interval at the end of the experiment (overcoring data). Diffusion and sorption parameters are derived from the inverse numerical modeling of the measured tracer data. The identifiability and the uncertainties of tritium and Na-22(+) diffusion and sorption parameters are studied here by synthetic experiments having the same characteristics as the in situ diffusion and retention (DR) experiment performed on Opalinus Clay. Contrary to previous identifiability analyses of in situ diffusion experiments, which used either dilution or overcoring data at approximate locations, our analysis of the parameter identifiability relies simultaneously on dilution and overcoring data, accounts for the actual position of the overcoring samples in the claystone, uses realistic values of the standard deviation of the measurement errors, relies on model identification criteria to select the most appropriate hypothesis about the existence of a borehole disturbed zone and addresses the effect of errors in the location of the sampling profiles. The simultaneous use of dilution and overcoring data provides accurate parameter estimates in the presence of measurement errors, allows the identification of the right hypothesis about the borehole disturbed zone and diminishes other model uncertainties such as those caused by errors in the volume of the circulation system and the effective diffusion coefficient of the filter. The proper interpretation of the experiment requires the right hypothesis about the borehole disturbed zone. A wrong assumption leads to large estimation errors. 
The use of model identification criteria helps in the selection of the best model. Small errors in the depth of the overcoring samples lead to large parameter estimation errors. Therefore, attention should be paid to minimize the errors in positioning the depth of the samples. The results of the identifiability analysis do not depend on the particular realization of random numbers. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Background: Erythropoiesis-stimulating agents (ESAs) reduce the need for red blood cell transfusions; however, they increase the risk of thromboembolic events and mortality. The impact of ESAs on quality of life (QoL) is controversial and has led to different recommendations from medical societies and authorities in the USA and Europe. We aimed to critically evaluate and quantify the effects of ESAs on QoL in cancer patients. Methods: We included data from randomised controlled trials (RCTs) on the effects of ESAs on QoL in cancer patients. Randomised controlled trials were identified by searching electronic databases and other sources up to January 2011. To reduce publication and outcome reporting biases, we included unreported results from clinical study reports. We conducted meta-analyses on fatigue- and anaemia-related symptoms measured with the Functional Assessment of Cancer Therapy-Fatigue (FACT-F) and FACT-Anaemia (FACT-An) subscales (primary outcomes) or other validated instruments. Results: We identified 58 eligible RCTs. Clinical study reports were available for 27% (4 out of 15) of the investigator-initiated trials and 95% (41 out of 43) of the industry-initiated trials. We excluded 21 RCTs as we could not use their QoL data for meta-analyses, either because of incomplete reporting (17 RCTs) or because of premature closure of the trial (4 RCTs). We included 37 RCTs with 10 581 patients; 21 RCTs were placebo controlled. Chemotherapy was given in 27 of the 37 RCTs. The median baseline haemoglobin (Hb) level was 10.1 g dl⁻¹; in 8 studies ESAs were stopped at Hb levels below 13 g dl⁻¹ and in 27 above 13 g dl⁻¹. For FACT-F, the mean difference (MD) was 2.41 (95% confidence interval (95% CI) 1.39-3.43; P<0.0001; 23 studies, n=6108) in all cancer patients and 2.81 (95% CI 1.73-3.90; P<0.0001; 19 RCTs, n=4697) in patients receiving chemotherapy, which was below the threshold (⩾3) for a clinically important difference (CID).
Erythropoiesis-stimulating agents had a positive effect on anaemia-related symptoms (MD 4.09; 95% CI 2.37-5.80; P=0.001; 14 studies, n=2765) in all cancer patients and 4.50 (95% CI 2.55-6.45; P<0.0001; 11 RCTs, n=2436) in patients receiving chemotherapy, which was above the threshold (⩾4) for a CID. Of note, this effect persisted when we restricted the analysis to placebo-controlled RCTs in patients receiving chemotherapy. There was some evidence that the MDs for FACT-F were above the threshold for a CID in RCTs including cancer patients receiving chemotherapy with Hb levels below 12 g dl⁻¹ at baseline and in RCTs stopping ESAs at Hb levels above 13 g dl⁻¹. However, these findings for FACT-F were not confirmed when we restricted the analysis to placebo-controlled RCTs in patients receiving chemotherapy. Conclusions: In cancer patients, particularly those receiving chemotherapy, we found that ESAs provide a small but clinically important improvement in anaemia-related symptoms (FACT-An). For fatigue-related symptoms (FACT-F), the overall effect did not reach the threshold for a CID. British Journal of Cancer advance online publication, 17 April 2014; doi:10.1038/bjc.2014.171 www.bjcancer.com.
Abstract:
Aims: The reported rate of stent thrombosis (ST) after drug-eluting stent (DES) implantation varies among registries. To investigate differences in baseline characteristics and clinical outcome in European and Japanese all-comers registries, we performed a pooled analysis of patient-level data. Methods and results: The j-Cypher registry (JC) is a multicentre observational study conducted in Japan, including 12,824 patients undergoing SES implantation. From the Bern-Rotterdam registry (BR) enrolled at two academic hospitals in Switzerland and the Netherlands, 3,823 patients with SES were included in the current analysis. Patients in BR were younger, more frequently smokers and presented more frequently with ST-elevation myocardial infarction (MI). Conversely, JC patients more frequently had diabetes and hypertension. At five years, the definite ST rate was significantly lower in JC than BR (JC 1.6% vs. BR 3.3%, p<0.001), while the unadjusted mortality tended to be lower in BR than in JC (BR 13.2% vs. JC 14.4%, log-rank p=0.052). After adjustment, the j-Cypher registry was associated with a significantly lower risk of all-cause mortality (HR 0.56, 95% CI: 0.49-0.64) as well as definite stent thrombosis (HR 0.46, 95% CI: 0.35-0.61). Conclusions: The baseline characteristics of the two large registries were different. After statistical adjustment, JC was associated with lower mortality and ST.
Abstract:
CONTEXT Subclinical hypothyroidism has been associated with increased risk of coronary heart disease (CHD), particularly with thyrotropin levels of 10.0 mIU/L or greater. The measurement of thyroid antibodies helps predict the progression to overt hypothyroidism, but it is unclear whether thyroid autoimmunity independently affects CHD risk. OBJECTIVE The objective of the study was to compare the CHD risk of subclinical hypothyroidism with and without thyroid peroxidase antibodies (TPOAbs). DATA SOURCES AND STUDY SELECTION A MEDLINE and EMBASE search from 1950 to 2011 was conducted for prospective cohorts, reporting baseline thyroid function, antibodies, and CHD outcomes. DATA EXTRACTION Individual data of 38 274 participants from six cohorts for CHD mortality followed up for 460 333 person-years and 33 394 participants from four cohorts for CHD events. DATA SYNTHESIS Among 38 274 adults (median age 55 y, 63% women), 1691 (4.4%) had subclinical hypothyroidism, of whom 775 (45.8%) had positive TPOAbs. During follow-up, 1436 participants died of CHD and 3285 had CHD events. Compared with euthyroid individuals, age- and gender-adjusted risks of CHD mortality in subclinical hypothyroidism were similar among individuals with and without TPOAbs [hazard ratio (HR) 1.15, 95% confidence interval (CI) 0.87-1.53 vs HR 1.26, CI 1.01-1.58, P for interaction = .62], as were risks of CHD events (HR 1.16, CI 0.87-1.56 vs HR 1.26, CI 1.02-1.56, P for interaction = .65). Risks of CHD mortality and events increased with higher thyrotropin, but within each stratum, risks did not differ by TPOAb status. CONCLUSIONS CHD risk associated with subclinical hypothyroidism did not differ by TPOAb status, suggesting that biomarkers of thyroid autoimmunity do not add independent prognostic information for CHD outcomes.
Abstract:
Deep space probes are most commonly navigated using the spacecraft Doppler tracking technique. Orbital parameters are determined from a series of repeated measurements of the frequency shift of a microwave carrier over a given integration time. Currently, both ESA and NASA operate antennas at several sites around the world to ensure the tracking of deep space probes. Only a small number of software packages are currently used to process Doppler observations. The Astronomical Institute of the University of Bern (AIUB) has recently started the development of Doppler data processing capabilities within the Bernese GNSS Software. This software has been extensively used for Precise Orbit Determination of Earth-orbiting satellites using GPS data collected by on-board receivers and for subsequent determination of the Earth gravity field. In this paper, we present the current status of the Doppler data modeling and orbit determination capabilities in the Bernese GNSS Software using GRAIL data. In particular, we focus on the implemented orbit determination procedure used for the combined analysis of Doppler and intersatellite Ka-band data. We show that even at this early stage of the development we can achieve an accuracy of a few mHz on two-way S-band Doppler observations and of 2 µm/s on KBRR data from the GRAIL primary mission phase.
Abstract:
OBJECTIVE The objective was to determine the risk of stroke associated with subclinical hypothyroidism. DATA SOURCES AND STUDY SELECTION Published prospective cohort studies were identified through a systematic search through November 2013 without restrictions in several databases. Unpublished studies were identified through the Thyroid Studies Collaboration. We collected individual participant data on thyroid function and stroke outcome. Euthyroidism was defined as TSH levels of 0.45-4.49 mIU/L, and subclinical hypothyroidism was defined as TSH levels of 4.5-19.9 mIU/L with normal T4 levels. DATA EXTRACTION AND SYNTHESIS We collected individual participant data on 47 573 adults (3451 subclinical hypothyroidism) from 17 cohorts and followed up from 1972-2014 (489 192 person-years). Age- and sex-adjusted pooled hazard ratios (HRs) for participants with subclinical hypothyroidism compared to euthyroidism were 1.05 (95% confidence interval [CI], 0.91-1.21) for stroke events (combined fatal and nonfatal stroke) and 1.07 (95% CI, 0.80-1.42) for fatal stroke. Stratified by age, the HR for stroke events was 3.32 (95% CI, 1.25-8.80) for individuals aged 18-49 years. There was an increased risk of fatal stroke in the age groups 18-49 and 50-64 years, with a HR of 4.22 (95% CI, 1.08-16.55) and 2.86 (95% CI, 1.31-6.26), respectively (p trend 0.04). We found no increased risk for those 65-79 years old (HR, 1.00; 95% CI, 0.86-1.18) or ≥ 80 years old (HR, 1.31; 95% CI, 0.79-2.18). There was a pattern of increased risk of fatal stroke with higher TSH concentrations. CONCLUSIONS Although no overall effect of subclinical hypothyroidism on stroke could be demonstrated, an increased risk in subjects younger than 65 years and those with higher TSH concentrations was observed.
Abstract:
We analyzed more than 200 OSIRIS NAC images with a pixel scale of 0.9-2.4 m/pixel of comet 67P/Churyumov-Gerasimenko (67P) that were acquired from onboard the Rosetta spacecraft in August and September 2014 using stereo-photogrammetric methods (SPG). We derived improved spacecraft position and pointing data for the OSIRIS images and a high-resolution shape model that consists of about 16 million facets (2 m horizontal sampling) and has a typical vertical accuracy at the decimeter scale. From this model, we derive a volume for the northern hemisphere of 9.35 km³ ± 0.1 km³. With the assumption of a homogeneous density distribution and taking into account the current uncertainty of the position of the comet's center-of-mass, we extrapolated this value to an overall volume of 18.7 km³ ± 1.2 km³, and, with a current best estimate of 1.0 × 10¹³ kg for the mass, we derive a bulk density of 535 kg/m³ ± 35 kg/m³. Furthermore, we used SPG methods to analyze the rotational elements of 67P. The rotational period for August and September 2014 was determined to be 12.4041 ± 0.0004 h. For the orientation of the rotational axis (z-axis of the body-fixed reference frame) we derived a precession model with a half-cone angle of 0.14 degrees, a cone center position at 69.54 degrees/64.11 degrees (RA/Dec J2000 equatorial coordinates), and a precession period of 10.7 days. For the definition of zero longitude (x-axis orientation), we finally selected the boulder-like Cheops feature on the big lobe of 67P and fixed its spherical coordinates to 142.35 degrees right-hand-rule eastern longitude and -0.28 degrees latitude. This completes the definition of the new Cheops reference frame for 67P. Finally, we defined cartographic mapping standards for common use and combined analyses of scientific results that have been obtained not only within the OSIRIS team, but also within other groups of the Rosetta mission.
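The bulk density quoted in this abstract follows from dividing the mass estimate by the extrapolated volume. The short Python sketch below reproduces that arithmetic from the numbers in the abstract. The function name and the error treatment are illustrative assumptions: the mass is treated as exact and only the volume uncertainty is propagated to first order, which may differ from the authors' own error budget.

```python
# Values quoted in the abstract.
TOTAL_VOLUME_KM3 = 18.7       # extrapolated overall volume
TOTAL_VOLUME_ERR_KM3 = 1.2    # quoted uncertainty of that volume
MASS_KG = 1.0e13              # current best estimate of the mass

KM3_TO_M3 = 1.0e9             # 1 km^3 = 10^9 m^3


def bulk_density(mass_kg: float, volume_km3: float) -> float:
    """Return the bulk density in kg/m^3 for a mass in kg and a volume in km^3."""
    return mass_kg / (volume_km3 * KM3_TO_M3)


density = bulk_density(MASS_KG, TOTAL_VOLUME_KM3)

# Assumption: mass is exact, so the relative density error equals the
# relative volume error (first-order propagation).
density_err = density * TOTAL_VOLUME_ERR_KM3 / TOTAL_VOLUME_KM3

print(f"bulk density ~ {density:.0f} kg/m^3, uncertainty ~ {density_err:.0f} kg/m^3")
```

Under these assumptions the result lands close to the quoted 535 kg/m³, with an uncertainty of the same order as the quoted 35 kg/m³.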
Abstract:
The purpose of this study is to investigate the effects of predictor variable correlations and patterns of missingness with dichotomous and/or continuous data in small samples when missing data is multiply imputed. Missing data of predictor variables is multiply imputed under three different multivariate models: the multivariate normal model for continuous data, the multinomial model for dichotomous data and the general location model for mixed dichotomous and continuous data. Subsequent to the multiple imputation process, Type I error rates of the regression coefficients obtained with logistic regression analysis are estimated under various conditions of correlation structure, sample size, type of data and patterns of missing data. The distributional properties of average mean, variance and correlations among the predictor variables are assessed after the multiple imputation process. For continuous predictor data under the multivariate normal model, Type I error rates are generally within the nominal values with samples of size n = 100. Smaller samples of size n = 50 resulted in more conservative estimates (i.e., lower than the nominal value). Correlation and variance estimates of the original data are retained after multiple imputation with less than 50% missing continuous predictor data. For dichotomous predictor data under the multinomial model, Type I error rates are generally conservative, which in part is due to the sparseness of the data. The correlation structure for the predictor variables is not well retained on multiply-imputed data from small samples with more than 50% missing data with this model. For mixed continuous and dichotomous predictor data, the results are similar to those found under the multivariate normal model for continuous data and under the multinomial model for dichotomous data.
With all data types, a fully-observed variable included with variables subject to missingness in the multiple imputation process and subsequent statistical analysis provided liberal (larger than nominal values) Type I error rates under a specific pattern of missing data. It is suggested that future studies focus on the effects of multiple imputation in multivariate settings with more realistic data characteristics and a variety of multivariate analyses, assessing both Type I error and power.
Abstract:
Data management and sharing are relatively new concepts in the health and life sciences fields. This presentation will cover some basic policies as well as the impediments to data sharing unique to health and life sciences data.
Abstract:
With most clinical trials, missing data presents a statistical problem in evaluating a treatment's efficacy. There are many methods commonly used to assess missing data; however, these methods leave room for bias to enter the study. This thesis was a secondary analysis of data taken from TIME, a phase 2 randomized clinical trial conducted to evaluate the safety and effect of the administration timing of bone marrow mononuclear cells (BMMNC) for subjects with acute myocardial infarction (AMI). We evaluated the effect of missing data by comparing the variance inflation factor (VIF) of the effect of therapy between all subjects and only subjects with complete data. Through the general linear model, an unbiased solution was derived for the VIF of the treatment's efficacy using the weighted least squares method to incorporate missing data. Two groups were identified from the TIME data: 1) all subjects and 2) subjects with complete data (baseline and follow-up measurements). After the general solution was found for the VIF, it was migrated to Excel 2010 to evaluate data from TIME. The resulting numerical values from the two groups were compared to assess the effect of missing data. The VIF values from the TIME study were considerably lower in the group with missing data. By design, we varied the correlation factor in order to evaluate the VIFs of both groups. As the correlation factor increased, the VIF values increased at a faster rate in the group with only complete data. Furthermore, while varying the correlation factor, the number of subjects with missing data was also varied to see how missing data affects the VIF. When the number of subjects with only baseline data was increased, we saw a significant rate increase in VIF values in the group with only complete data, while the group with missing data saw a steady and consistent increase in the VIF. The same was seen when we varied the group with follow-up-only data.
This essentially showed that the VIFs increased steadily when missing data was not ignored. When missing data was ignored, as with our comparison group, the VIF values increased sharply as correlation increased.
Abstract:
Maximizing data quality may be especially difficult in trauma-related clinical research. Strategies are needed to improve data quality and assess the impact of data quality on clinical predictive models. This study had two objectives. The first was to compare missing data between two multi-center trauma transfusion studies: a retrospective study (RS) using medical chart data with minimal data quality review and the PRospective Observational Multi-center Major Trauma Transfusion (PROMMTT) study with standardized quality assurance. The second objective was to assess the impact of missing data on clinical prediction algorithms by evaluating blood transfusion prediction models using PROMMTT data. RS (2005-06) and PROMMTT (2009-10) investigated trauma patients receiving ≥ 1 unit of red blood cells (RBC) from ten Level I trauma centers. Missing data were compared for 33 variables collected in both studies using mixed effects logistic regression (including random intercepts for study site). Massive transfusion (MT) patients received ≥ 10 RBC units within 24h of admission. Correct classification percentages for three MT prediction models were evaluated using complete case analysis and multiple imputation based on the multivariate normal distribution. A sensitivity analysis for missing data was conducted to estimate the upper and lower bounds of correct classification using assumptions about missing data under best and worst case scenarios. Most variables (17/33=52%) had <1% missing data in RS and PROMMTT. Of the remaining variables, 50% demonstrated less missingness in PROMMTT, 25% had less missingness in RS, and 25% were similar between studies. Missing percentages for MT prediction variables in PROMMTT ranged from 2.2% (heart rate) to 45% (respiratory rate). For variables missing >1%, study site was associated with missingness (all p≤0.021). Survival time predicted missingness for 50% of RS and 60% of PROMMTT variables. 
Complete case proportions for the MT models ranged from 41% to 88%. Complete case analysis and multiple imputation demonstrated similar correct classification results. Sensitivity analysis upper-lower bound ranges for the three MT models were 59-63%, 36-46%, and 46-58%. Prospective collection of ten-fold more variables with data quality assurance reduced overall missing data. Study site and patient survival were associated with missingness, suggesting that data were not missing completely at random, and complete case analysis may lead to biased results. Evaluating clinical prediction model accuracy may be misleading in the presence of missing data, especially with many predictor variables. The proposed sensitivity analysis estimating correct classification under upper (best case scenario)/lower (worst case scenario) bounds may be more informative than multiple imputation, which provided results similar to complete case analysis.
Abstract:
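The best-case/worst-case sensitivity analysis described in this abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the exact procedure, so the reading used here is an assumption: patients who cannot be classified because of missing predictors are counted as all correct for the upper bound and as all incorrect for the lower bound. The function name and the counts are hypothetical.

```python
def classification_bounds(n_correct: int, n_incorrect: int, n_missing: int) -> tuple[float, float]:
    """Return (lower, upper) bounds on the proportion correctly classified.

    n_correct / n_incorrect: patients with complete predictors, by outcome
    of the prediction; n_missing: patients the model could not classify
    because at least one predictor was missing.
    """
    n_total = n_correct + n_incorrect + n_missing
    if n_total == 0:
        raise ValueError("at least one patient is required")
    # Worst case: every unclassifiable patient would have been misclassified.
    lower = n_correct / n_total
    # Best case: every unclassifiable patient would have been classified correctly.
    upper = (n_correct + n_missing) / n_total
    return lower, upper


# Hypothetical counts, chosen only for illustration (not taken from PROMMTT).
lower, upper = classification_bounds(n_correct=50, n_incorrect=30, n_missing=20)
print(f"correct classification between {lower:.0%} and {upper:.0%}")
```

The width of the interval equals the share of patients with missing predictors, which is why models with lower complete case proportions produce wider bounds.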
Pteropods are a group of holoplanktonic gastropods for which global biomass distribution patterns remain poorly resolved. The aim of this study was to collect and synthesize existing pteropod (Gymnosomata, Thecosomata and Pseudothecosomata) abundance and biomass data, in order to evaluate the global distribution of pteropod carbon biomass, with a particular emphasis on its seasonal, temporal and vertical patterns. We collected 25 902 data points from several online databases and a number of scientific articles. The biomass data were gridded onto a 360 x 180° grid, with a vertical resolution of 33 WOA depth levels, and converted to NetCDF format. Data were collected between 1951 and 2010, with sampling depths ranging from 0 to 1000 m. Pteropod biomass data were either extracted directly or derived by converting abundance to biomass with pteropod-specific length-to-weight conversions. In the Northern Hemisphere (NH) the data were distributed evenly throughout the year, whereas sampling in the Southern Hemisphere (SH) was biased towards the austral summer months. 86% of all biomass values were located in the NH, most (42%) within the latitudinal band of 30-50° N. The range of global biomass values spanned over three orders of magnitude, with a mean and median biomass concentration of 8.2 mg C l⁻¹ (SD = 61.4) and 0.25 mg C l⁻¹, respectively, for all data points, and with a mean of 9.1 mg C l⁻¹ (SD = 64.8) and a median of 0.25 mg C l⁻¹ for non-zero biomass values. The highest mean and median biomass concentrations in the NH were located between 40-50° N (mean biomass: 68.8 mg C l⁻¹ (SD = 213.4); median biomass: 2.5 mg C l⁻¹), while in the SH they were within the 70-80° S latitudinal band (mean: 10.5 mg C l⁻¹ (SD = 38.8); median: 0.2 mg C l⁻¹). Biomass values were lowest in the equatorial regions.
A broad range of biomass concentrations was observed at all depths, with the biomass peak located in the surface layer (0-25 m) and values generally decreasing with depth. However, biomass peaks were located at different depths in different ocean basins: 0-25 m depth in the N Atlantic, 50-100 m in the Pacific, 100-200 m in the Arctic, 200-500 m in the Brazilian region and >500 m in the Indo-Pacific region. Biomass in the NH was relatively invariant over the seasonal cycle, but more seasonally variable in the SH. The collected database provides a valuable tool for modellers for the study of ecosystem processes and global biogeochemical cycles.
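Gridding point observations onto a 360 x 180 grid, as this abstract describes, amounts to mapping each latitude/longitude pair to a cell index. The Python sketch below shows one way to do that. It assumes 1° x 1° cells with the grid origin at 90° S, 180° W and longitudes given in degrees east; the abstract does not state the actual convention, and the function name is hypothetical.

```python
import math


def grid_cell(lat: float, lon: float) -> tuple[int, int]:
    """Map a position in degrees to (row, col) on a 180-row x 360-column grid.

    Assumed convention: row 0 starts at 90 S, col 0 starts at 180 W,
    and every cell spans 1 degree in each direction.
    """
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    # Shift longitude so that 180 W maps to 0, wrapping values outside [-180, 180).
    shifted_lon = (lon + 180.0) % 360.0
    # Clamp so that the North Pole (lat = 90) and float round-off at the
    # wrap boundary fall into the last cell instead of one past it.
    row = min(int(math.floor(lat + 90.0)), 179)
    col = min(int(math.floor(shifted_lon)), 359)
    return row, col


# Example: a sample taken at 45.5 N, 30.2 W.
print(grid_cell(45.5, -30.2))
```

Biomass values falling in the same cell (and the same depth level) could then be averaged, which is one common way to build a gridded product from scattered samples.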
Abstract:
The large discrepancy between field and laboratory measurements of mineral reaction rates is a long-standing problem in earth sciences, often attributed to factors extrinsic to the mineral itself. Nevertheless, differences in reaction rate are also observed within laboratory measurements, raising the possibility of intrinsic variations as well. Critical insight is available from analysis of the relationship between the reaction rate and its distribution over the mineral surface. This analysis recognizes the fundamental variance of the rate. The resulting anisotropic rate distributions are completely obscured by the common practice of surface area normalization. In a simple experiment using a single crystal and its polycrystalline counterpart, we demonstrate the sensitivity of dissolution rate to grain size, a result that undermines the use of "classical" rate constants. Comparison of selected published crystal surface step retreat velocities (Jordan and Rammensee, 1998) and large single crystal dissolution data (Busenberg and Plummer, 1986) provides further evidence of this fundamental variability. Our key finding is that the use of a single-valued "mean" rate or rate constant as a function of environmental conditions is unsubstantiated. Reactivity predictions and long-term reservoir stability calculations based on laboratory measurements are thus not directly applicable to natural settings without a probabilistic approach. Such a probabilistic approach must incorporate both the variation of surface energy as a general range (intrinsic variation) and constraints on this variation owing to the heterogeneity of complex material (e.g., density of domain borders). We suggest the introduction of surface energy spectra (or the resulting rate spectra) containing information about the probability of existing rate ranges and the critical modes of surface energy.