962 results for variable data printing
Abstract:
In biostatistical applications, interest often focuses on the estimation of the distribution of the time T between two consecutive events. If the initial event time is observed and the subsequent event time is only known to be larger or smaller than an observed monitoring time C, then the data are described by the well-known singly censored current status model, also known as interval-censored data, case I. We extend this current status model by allowing the presence of a time-dependent process, which is partly observed, and by allowing C to depend on T through the observed part of this time-dependent process. Because of the high dimension of the covariate process, no globally efficient estimators exist with a good practical performance at moderate sample sizes. We follow the approach of Robins and Rotnitzky (1992) by modeling the censoring variable, given the time variable and the covariate process, i.e., the missingness process, under the restriction that it satisfies coarsening at random. We propose a generalization of the simple current status estimator of the distribution of T and of smooth functionals of the distribution of T, which is based on an estimate of the missingness process. In this estimator the covariates enter only through the estimate of the missingness process. Due to the coarsening at random assumption, the estimator has the interesting property that if we estimate the missingness process more nonparametrically, then we improve its efficiency. We show that by local estimation of an optimal model or optimal function of the covariates for the missingness process, the generalized current status estimator for smooth functionals becomes locally efficient, meaning that it is efficient if the right model or covariate is consistently estimated and it is consistent and asymptotically normal in general. Estimation of the optimal model requires estimation of the conditional distribution of T, given the covariates. Any (prior) knowledge of this conditional distribution can be used at this stage without any risk of losing root-n consistency. We also propose locally efficient one-step estimators. Finally, we show some simulation results.
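As a point of reference for the abstract above, the following is a minimal sketch of the classical, covariate-free nonparametric maximum likelihood estimator for current status data, not the generalized estimator the abstract proposes. It relies on the standard result that this estimator of F(t) = P(T ≤ t) at the monitoring times is the non-decreasing (isotonic) regression of the indicators 1{T ≤ C} on C, computed here with the pool-adjacent-violators algorithm. The function name and the toy data are hypothetical.

```python
def current_status_npmle(monitor_times, status):
    """Isotonic-regression estimate of F at the sorted monitoring times.

    monitor_times: observed monitoring times C_i
    status: indicators, 1 if the event had occurred by C_i, else 0
    Returns (sorted_times, fitted_values).
    """
    # Sort by time; within tied times put 1s before 0s so that the
    # pooling step below merges each tie group into a single value.
    order = sorted(range(len(monitor_times)),
                   key=lambda i: (monitor_times[i], -status[i]))
    times = [monitor_times[i] for i in order]

    # Each block stores [sum of indicators, number of observations].
    blocks = []
    for i in order:
        blocks.append([float(status[i]), 1])
        # Pool adjacent blocks while their means violate monotonicity.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return times, fitted


# Hypothetical toy data: four subjects monitored at times 1..4.
toy_times, toy_fit = current_status_npmle([3, 1, 4, 2], [0, 0, 1, 1])
```

In the toy data the indicators sorted by time are 0, 1, 0, 1, so the middle two observations violate monotonicity and are pooled to a common value.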
Abstract:
In many applications the observed data can be viewed as a censored high-dimensional full-data random variable X. By the curse of dimensionality it is typically not possible to construct estimators that are asymptotically efficient at every probability distribution in a semiparametric censored data model of such a high-dimensional censored data structure. We provide a general method for construction of one-step estimators that are efficient at a chosen submodel of the full-data model, are still well behaved off this submodel and can be chosen to always improve on a given initial estimator. These one-step estimators rely on good estimators of the censoring mechanism and thus will require a parametric or semiparametric model for the censoring mechanism. We present a general theorem that provides a template for proving the desired asymptotic results. We illustrate the general one-step estimation methods by constructing locally efficient one-step estimators of marginal distributions and regression parameters with right-censored data, current status data and bivariate right-censored data, in all models allowing the presence of time-dependent covariates. The conditions of the asymptotics theorem are rigorously verified in one of the examples and the key condition of the general theorem is verified for all examples.
Abstract:
In estimation of a survival function, current status data arises when the only information available on individuals is their survival status at a single monitoring time. Here we briefly review extensions of this form of data structure in two directions: (i) doubly censored current status data, where there is incomplete information on the origin of the failure time random variable, and (ii) current status information on more complicated stochastic processes. Simple examples of these data forms are presented for motivation.
Abstract:
We consider nonparametric missing data models for which the censoring mechanism satisfies coarsening at random and which allow complete observations on the variable X of interest. We show that beyond some empirical process conditions the only essential condition for efficiency of an NPMLE of the distribution of X is that the regions associated with incomplete observations on X contain enough complete observations. This is heuristically explained by describing the EM-algorithm. We prove identifiability of the self-consistency equation and efficiency of the NPMLE in order to make this statement rigorous. The usual kind of differentiability conditions in the proof are avoided by using an identity which holds for the NPMLE of linear parameters in convex models. We provide a bivariate censoring application in which the condition and hence the NPMLE fails, but where other estimators, not based on the NPMLE principle, are highly inefficient. It is shown how to slightly reduce the data so that the conditions hold for the reduced data. The conditions are verified for the univariate censoring, doubly censored, and Ibragimov-Has'minski models.
Abstract:
In this paper we propose methods for smooth hazard estimation of a time variable where that variable is interval censored. These methods allow one to model the transformed hazard in terms of either smooth (smoothing splines) or linear functions of time and other relevant time varying predictor variables. We illustrate the use of this method on a dataset of hemophiliacs where the outcome, time to seroconversion for HIV, is interval censored and left-truncated.
Abstract:
Traffic particle concentrations show considerable spatial variability within a metropolitan area. We consider latent variable semiparametric regression models for modeling the spatial and temporal variability of black carbon and elemental carbon concentrations in the greater Boston area. Measurements of these pollutants, which are markers of traffic particles, were obtained from several individual exposure studies conducted at specific household locations as well as 15 ambient monitoring sites in the city. The models allow for both flexible, nonlinear effects of covariates and for unexplained spatial and temporal variability in exposure. In addition, the different individual exposure studies recorded different surrogates of traffic particles, with some recording only outdoor concentrations of black or elemental carbon, some recording indoor concentrations of black carbon, and others recording both indoor and outdoor concentrations of black carbon. A joint model for outdoor and indoor exposure that specifies a spatially varying latent variable provides greater spatial coverage in the area of interest. We propose a penalised spline formulation of the model that relates to generalised kriging of the latent traffic pollution variable and leads to a natural Bayesian Markov Chain Monte Carlo algorithm for model fitting. We propose methods that allow us to control the degrees of freedom of the smoother in a Bayesian framework. Finally, we present results from an analysis that applies the model to data from summer and winter separately.
Abstract:
In this paper, we study panel count data with informative observation times. We assume nonparametric and semiparametric proportional rate models for the underlying recurrent event process, where the form of the baseline rate function is left unspecified and a subject-specific frailty variable inflates or deflates the rate function multiplicatively. The proposed models allow the recurrent event processes and observation times to be correlated through their connections with the unobserved frailty; moreover, the distributions of both the frailty variable and observation times are considered as nuisance parameters. The baseline rate function and the regression parameters are estimated by maximizing a conditional likelihood function of observed event counts and solving estimation equations. Large sample properties of the proposed estimators are studied. Numerical studies demonstrate that the proposed estimation procedures perform well for moderate sample sizes. An application to a bladder tumor study is presented to illustrate the use of the proposed methods.
Abstract:
The purpose of this study is to develop statistical methodology to facilitate indirect estimation of the concentration of antiretroviral drugs and viral loads in the prostate gland and the seminal vesicle. The differences in antiretroviral drug concentrations in these organs may lead to suboptimal concentrations in one gland compared to the other. Suboptimal levels of the antiretroviral drugs will not be able to fully suppress the virus in that gland, will lead to a source of sexually transmissible virus and will increase the chance of selecting for drug-resistant virus. This information may be useful in selecting an antiretroviral drug regimen that will achieve optimal concentrations in most of the male genital tract glands. Using fractionally collected semen ejaculates, Lundquist (1949) measured levels of surrogate markers in each fraction that are uniquely produced by specific male accessory glands. To determine the original glandular concentrations of the surrogate markers, Lundquist solved a simultaneous series of linear equations. This method has several limitations. In particular, it does not yield a unique solution, it does not address measurement error, and it disregards inter-subject variability in the parameters. To cope with these limitations, we developed a mechanistic latent variable model based on the physiology of the male genital tract and surrogate markers. We employ a Bayesian approach and perform a sensitivity analysis with regard to the distributional assumptions on the random effects and priors. The model and Bayesian approach are validated on experimental data where the concentration of a drug should be (biologically) differentially distributed between the two glands. In this example, the Bayesian model-based conclusions are found to be robust to model specification and this hierarchical approach leads to more scientifically valid conclusions than the original methodology. In particular, unlike existing methods, the proposed model-based approach was not affected by a common form of outliers.
Abstract:
Engine manufacturers need computationally efficient and accurate predictive combustion modeling tools that can be integrated into engine simulation software for the assessment of combustion system hardware designs and early development of engine calibrations. This thesis discusses the process for the development and validation, from experimental data, of a combustion modeling tool for a gasoline direct-injected spark-ignited engine with variable valve timing, lift and duration valvetrain hardware. Data was correlated and regressed from accepted methods for calculating the turbulent flow and flame propagation characteristics for an internal combustion engine. A non-linear regression modeling method was utilized to develop a combustion model to determine the fuel mass burn rate at multiple points during the combustion process. The computational fluid dynamics software Converge © was used to simulate and correlate the 3-D combustion system, port and piston geometry to the turbulent flow development within the cylinder to properly predict the experimental data turbulent flow parameters through the intake, compression and expansion processes. The engine simulation software GT-Power © is then used to determine the 1-D flow characteristics of the engine hardware being tested to correlate the regressed combustion modeling tool to experimental data to determine accuracy. The results of the combustion modeling tool show accurate trends capturing the combustion sensitivities to turbulent flow, thermodynamic and internal residual effects with changes in intake and exhaust valve timing, lift and duration.
Abstract:
The accuracy of simulating the aerodynamics and structural properties of the blades is crucial in wind-turbine technology. Hence the models used to implement these features need to be very precise and their level of detail needs to be high. With the variety of blade designs being developed, the models should be versatile enough to adapt to the changes required by every design. We implement a combination of numerical models associated with the structural and the aerodynamic parts of the simulation, using the computational power of a parallel HPC cluster. The structural part models the heterogeneous internal structure of the beam based on a novel implementation of the Generalized Timoshenko Beam Model technique. Using this technique the 3-D structure of the blade is reduced to a 1-D beam which is asymptotically equivalent. This reduces the computational cost of the model without compromising its accuracy. This structural model interacts with the flow model, which is a modified version of the Blade Element Momentum (BEM) theory. The modified version of the BEM accounts for the large deflections of the blade and also considers the pre-defined structure of the blade. The model computes the coning and sweep of the blade, the tilt of the nacelle and the twist of the sections along the blade length, none of which are considered in the classical BEM theory. Each of these two models provides feedback to the other and the interactive computations lead to more accurate outputs. We successfully implemented the computational models to analyze and simulate the structural and aerodynamic aspects of the blades. The interactive nature of these models and their ability to recompute data using the feedback from each other make this code more efficient than the commercial codes available. In this thesis we start off with the verification of these models by testing them on the well-known benchmark blade for the NREL-5MW Reference Wind Turbine, an alternative fixed-speed stall-controlled blade design proposed by Delft University, and a novel alternative design that we proposed for a variable-speed stall-controlled turbine, which offers the potential for more uniform power control and improved annual energy production. To optimize the power output of the stall-controlled blade we modify the existing designs and study their behavior using the aforementioned aeroelastic model.
Abstract:
BACKGROUND: The extent to which mortality differs following individual acquired immunodeficiency syndrome (AIDS)-defining events (ADEs) has not been assessed among patients initiating combination antiretroviral therapy. METHODS: We analyzed data from 31,620 patients with no prior ADEs who started combination antiretroviral therapy. Cox proportional hazards models were used to estimate mortality hazard ratios for each ADE that occurred in >50 patients, after stratification by cohort and adjustment for sex, HIV transmission group, number of antiretroviral drugs initiated, regimen, age, date of starting combination antiretroviral therapy, and CD4+ cell count and HIV RNA load at initiation of combination antiretroviral therapy. ADEs that occurred in <50 patients were grouped together to form a "rare ADEs" category. RESULTS: During a median follow-up period of 43 months (interquartile range, 19-70 months), 2880 ADEs were diagnosed in 2262 patients; 1146 patients died. The most common ADEs were esophageal candidiasis (in 360 patients), Pneumocystis jiroveci pneumonia (320 patients), and Kaposi sarcoma (308 patients). The greatest mortality hazard ratio was associated with non-Hodgkin's lymphoma (hazard ratio, 17.59; 95% confidence interval, 13.84-22.35) and progressive multifocal leukoencephalopathy (hazard ratio, 10.0; 95% confidence interval, 6.70-14.92). Three groups of ADEs were identified on the basis of the ranked hazard ratios with bootstrapped confidence intervals: severe (non-Hodgkin's lymphoma and progressive multifocal leukoencephalopathy [hazard ratio, 7.26; 95% confidence interval, 5.55-9.48]), moderate (cryptococcosis, cerebral toxoplasmosis, AIDS dementia complex, disseminated Mycobacterium avium complex, and rare ADEs [hazard ratio, 2.35; 95% confidence interval, 1.76-3.13]), and mild (all other ADEs [hazard ratio, 1.47; 95% confidence interval, 1.08-2.00]). 
CONCLUSIONS: In the combination antiretroviral therapy era, mortality rates subsequent to an ADE depend on the specific diagnosis. The proposed classification of ADEs may be useful in clinical end point trials, prognostic studies, and patient management.
Abstract:
Persons with Down syndrome (DS) uniquely have an increased frequency of leukemias but a decreased total frequency of solid tumors. The distribution and frequency of specific types of brain tumors have never been studied in DS. We evaluated the frequency of primary neural cell embryonal tumors and gliomas in a large international data set. The observed number of children with DS having a medulloblastoma, central nervous system primitive neuroectodermal tumor (CNS-PNET) or glial tumor was compared to the expected number. Data were collected from cancer registries or brain tumor registries in 13 countries of Europe, America, Asia and Oceania. The number of DS children with each category of tumor was treated as a Poisson variable with mean equal to 0.000884 times the total number of registrations in that category. Among 8,043 neural cell embryonal tumors (6,882 medulloblastomas and 1,161 CNS-PNETs), only one patient with medulloblastoma had DS, while 7.11 children in total and 6.08 with medulloblastoma were expected to have DS (p = 0.016 and 0.0066, respectively). Among 13,797 children with glioma, 10 had DS, whereas 12.2 were expected. Children with DS appear to be specifically protected against primary neural cell embryonal tumors of the CNS, whereas gliomas occur at the same frequency as in the general population. A similar protection against neuroblastoma, the principal extracranial neural cell embryonal tumor, has been observed in children with DS. Additional genetic material on the supernumerary chromosome 21 may protect against embryonal neural cell tumor development.
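The expected counts in the abstract above follow directly from the stated rate of 0.000884 and the registration totals, and they can be reproduced with a few lines of Python. The tail probability is a sketch under an assumption the abstract does not state: that the test is one-sided, P(X ≤ observed), under the Poisson model. The function and variable names are hypothetical.

```python
import math

DS_RATE = 0.000884  # multiplier stated in the abstract


def poisson_lower_tail(observed, mean):
    """P(X <= observed) for X ~ Poisson(mean)."""
    return sum(math.exp(-mean) * mean ** k / math.factorial(k)
               for k in range(observed + 1))


# Expected numbers of children with DS in each tumor category.
expected_embryonal = DS_RATE * 8043         # all neural cell embryonal tumors
expected_medulloblastoma = DS_RATE * 6882   # medulloblastomas only
expected_glioma = DS_RATE * 13797           # gliomas

# One child with DS was observed in both embryonal categories.
p_embryonal = poisson_lower_tail(1, expected_embryonal)
p_medulloblastoma = poisson_lower_tail(1, expected_medulloblastoma)

print(round(expected_embryonal, 2), round(expected_medulloblastoma, 2),
      round(expected_glioma, 1))
print(p_embryonal, p_medulloblastoma)
```

Under this one-sided assumption the lower-tail probability is about 0.0066 for all embryonal tumors (expected 7.11) and about 0.016 for medulloblastoma alone (expected 6.08); this pairing comes from the computation itself rather than from the wording of the abstract.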
Abstract:
Mycobacterium bovis populations in countries with persistent bovine tuberculosis usually show a prevalent spoligotype with a wide geographical distribution. This study applied mycobacterial interspersed repetitive-unit-variable-number tandem-repeat (MIRU-VNTR) typing to a random panel of 115 M. bovis isolates that are representative of the most frequent spoligotype in the Iberian Peninsula, SB0121. VNTR typing targeted nine loci: ETR-A (alias VNTR2165), ETR-B (VNTR2461), ETR-D (MIRU4, VNTR580), ETR-E (MIRU31, VNTR3192), MIRU26 (VNTR2996), QUB11a (VNTR2163a), QUB11b (VNTR2163b), QUB26 (VNTR4052), and QUB3232 (VNTR3232). We found a high degree of diversity among the studied isolates (discriminatory index [D] = 0.9856), which were split into 65 different MIRU-VNTR types. An alternative short-format MIRU-VNTR typing targeting only the four loci with the highest variability values was found to offer an equivalent discriminatory index. Minimum spanning trees using the MIRU-VNTR data showed the hypothetical evolution of an apparent clonal group. MIRU-VNTR analysis was also applied to the isolates of 176 animals from 15 farms infected by M. bovis SB0121; in 10 farms, the analysis revealed the coexistence of two to five different MIRU types differing in one to six loci, which highlights the frequency of undetected heterogeneity.
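The abstract above does not say how the discriminatory index D was computed. A common choice in molecular typing studies is the Hunter-Gaston index, D = 1 − Σ n_j(n_j − 1) / (N(N − 1)), where N is the number of isolates and n_j is the number of isolates of type j. The sketch below assumes that definition; the function name and the toy type labels are hypothetical, since the abstract does not list the per-type frequencies.

```python
from collections import Counter


def hunter_gaston_index(type_labels):
    """Hunter-Gaston discriminatory index for a list of type labels.

    Returns the probability that two isolates drawn without
    replacement belong to different types.
    """
    n_isolates = len(type_labels)
    if n_isolates < 2:
        raise ValueError("at least two isolates are required")
    type_counts = Counter(type_labels)
    same_type_pairs = sum(c * (c - 1) for c in type_counts.values())
    return 1.0 - same_type_pairs / (n_isolates * (n_isolates - 1))


# Hypothetical toy panel: four isolates, two of which share a type.
toy_index = hunter_gaston_index(["type1", "type1", "type2", "type3"])
```

With many types of roughly equal size, as in the 65 types among 115 isolates reported above, an index of this form approaches 1.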
Abstract:
Three samples of the skarn mineral rustumite Ca10(Si2O7)2(SiO4)(OH)2Cl2, space group C2/c, a ≈ 7.6, b ≈ 18.5, c ≈ 15.5 Å, β ≈ 104°, with variable OH, Cl, F content were investigated by electron microprobe, single-crystal X-ray structure refinements, and Raman spectroscopy. “Rust1LCl” is a low-chlorine rustumite Ca10(Si2O7)2(SiO4)(OH1.88F0.12)(Cl1.28,OH0.72) from skarns associated with the Rize batholith near Ikizedere, Turkey. “Rust2F” is an F-bearing rustumite Ca10(Si2O7)2(SiO4)(OH1.13F0.87)(Cl1.96OH0.04) from xenoliths in ignimbrites of the Upper Chegem Caldera, Northern Caucasus, Russia. “Rust3LClF” represents a low-Cl, F-bearing rustumite Ca10(Si2O7)2(SiO4)0.87(H4O4)0.13(OH1.01F0.99)(Cl1.00OH1.00) from altered merwinite skarns of the Birkhin massif, Baikal Lake area, Eastern Siberia, Russia. Rustumite from the Birkhin massif is characterized by a significant hydrogarnet-like or fluorine substitution at the apices of the orthosilicate group, leading to specific atomic displacements. The crystal structures including hydrogen positions have been refined from single-crystal X-ray data to R1 = 0.0205 (Rust1_LCl), R1 = 0.0295 (Rust2_F), and R1 = 0.0243 (Rust3_LCl_F), respectively. Depletion in Cl and replacement by OH is associated with smaller unit-cell dimensions. The substitution of OH by F leads to shorter hydrogen bonds O-H⋯F instead of O-H⋯OH. Raman spectra for all samples have been measured and confirm slight strengthening of the hydrogen bonds with uptake of F. This study discusses the complex crystal chemistry of the skarn mineral rustumite and may provide a wider understanding of the chemical reactions related to contact metamorphism of limestones.