897 results for Numerical Analysis and Scientific Computing
Abstract:
Jewell and Kalbfleisch (1992) consider the use of marker processes for applications related to estimation of the survival distribution of time to failure. Marker processes were assumed to be stochastic processes that, at a given point in time, provide information about the current hazard and consequently on the remaining time to failure. Particular attention was paid to calculations based on a simple additive model for the relationship between the hazard function at time t and the history of the marker process up until time t. Specific applications to the analysis of AIDS data included the use of markers as surrogate responses for onset of AIDS with censored data and as predictors of the time elapsed since infection in prevalent individuals. Here we review recent work on the use of marker data to tackle these kinds of problems with AIDS data. The Poisson marker process with an additive model, introduced in Jewell and Kalbfleisch (1992), may be a useful "test" example for comparison of various procedures.
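To make the additive hazard idea concrete, here is a minimal sketch under stated assumptions. The abstract does not give the model's exact form, so the sketch assumes the hazard at time t is a constant baseline plus a coefficient times the number of marker jumps observed so far. The function name and the parameters `baseline_hazard` and `beta` are hypothetical and not taken from Jewell and Kalbfleisch (1992).

```python
import math

def survival_given_marker_path(t, jump_times, baseline_hazard, beta):
    """Survival probability to time t, conditional on a known marker path.

    Assumed model (illustrative only): hazard(s) = baseline_hazard + beta * N(s),
    where N(s) counts the marker jumps that occurred at or before time s.
    """
    # Integrating the hazard over [0, t]: the baseline contributes
    # baseline_hazard * t, and a jump at time tau contributes
    # beta * (t - tau) once it has occurred.
    cumulative_hazard = baseline_hazard * t + beta * sum(
        max(0.0, t - tau) for tau in jump_times
    )
    return math.exp(-cumulative_hazard)

# Illustrative values: one marker jump at time 1, evaluated at time 2.
s_with_jump = survival_given_marker_path(2.0, [1.0], baseline_hazard=0.1, beta=0.2)
s_no_jump = survival_given_marker_path(2.0, [], baseline_hazard=0.1, beta=0.2)
```

Under this assumed form, each marker jump lowers the conditional survival probability, which is the sense in which the marker history carries information about the remaining time to failure.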
Abstract:
For various reasons, it is important, if not essential, to integrate the computations and code used in data analyses, methodological descriptions, simulations, etc. with the documents that describe and rely on them. This integration allows readers to both verify and adapt the statements in the documents. Authors can easily reproduce them in the future, and they can present the document's contents in a different medium, e.g. with interactive controls. This paper describes a software framework for authoring and distributing these integrated, dynamic documents that contain text, code, data, and any auxiliary content needed to recreate the computations. The documents are dynamic in that the contents, including figures, tables, etc., can be recalculated each time a view of the document is generated. Our model treats a dynamic document as a master or "source" document from which one can generate different views in the form of traditional, derived documents for different audiences. We introduce the concept of a compendium as both a container for the different elements that make up the document and its computations (i.e. text, code, data, ...), and as a means for distributing, managing and updating the collection. The step from disseminating analyses via a compendium to reproducible research is a small one. By reproducible research, we mean research papers with accompanying software tools that allow the reader to directly reproduce the results and employ the methods that are presented in the research paper. Some of the issues involved in paradigms for the production, distribution and use of such reproducible research are discussed.
Abstract:
AIMS: A registry mandated by the European Society of Cardiology collects data on trends in interventional cardiology within Europe. Special interest focuses on relative increases and ratios in new techniques and their distributions across Europe. We report the data through 2004 and give an overview of the development of coronary interventions since the first data collection in 1992. METHODS AND RESULTS: Questionnaires were distributed yearly to delegates of all national societies of cardiology represented in the European Society of Cardiology. The goal was to collect the case numbers of all local institutions and operators. The overall number of coronary angiographies increased between 1992 and 2004 from 684 000 to 2 238 000 (from 1250 to 3930 per million inhabitants). The respective numbers for percutaneous coronary interventions (PCIs) and coronary stenting procedures increased from 184 000 to 885 000 (from 335 to 1550) and from 3000 to 770 000 (from 5 to 1350), respectively. Germany was the most active country with 712 000 angiographies (8600), 249 000 angioplasties (3000), and 200 000 stenting procedures (2400) in 2004. The indication has shifted towards acute coronary syndromes, as demonstrated by rising rates of interventions for acute myocardial infarction over the last decade. The procedures are more readily performed and perceived as safer, as shown by an increasing rate of "ad hoc" PCIs and a decreasing need for emergency coronary artery bypass grafting (CABG). In 2004, the use of drug-eluting stents continued to rise. However, an enormous variability is reported, with the highest rate in Switzerland (70%). If the rate of progression remains constant until 2010, the projected number of coronary angiographies will be over three million, and the number of PCIs about 1.5 million with a stenting rate of almost 100%. CONCLUSION: Interventional cardiology in Europe is ever expanding. New coronary revascularization procedures, alternative or complementary to balloon angioplasty, have come and gone. Only stenting has stood the test of time and matured into the default technique. Facilitated access to PCI and more complete and earlier detection of coronary artery disease promise continued growth of the procedure despite the uncontested success of prevention.
Abstract:
Use of microarray technology often leads to high-dimensional, low-sample-size data settings. Over the past several years, a variety of novel approaches have been proposed for variable selection in this context. However, only a small number of these have been adapted for time-to-event data where censoring is present. Among standard variable selection methods shown both to have good predictive accuracy and to be computationally efficient is the elastic net penalization approach. In this paper, adaptation of the elastic net approach is presented for variable selection both under the Cox proportional hazards model and under an accelerated failure time (AFT) model. Assessment of the two methods is conducted through simulation studies and through analysis of microarray data obtained from a set of patients with diffuse large B-cell lymphoma where time to survival is of interest. The approaches are shown to match or exceed the predictive performance of a Cox-based and an AFT-based variable selection method. The methods are moreover shown to be much more computationally efficient than their respective Cox- and AFT-based counterparts.
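As an illustration of what elastic net penalization under the Cox model optimizes, the sketch below evaluates a penalized negative log partial likelihood in plain Python. It is not the paper's algorithm: the function name, the Breslow-style risk sets with no special handling of ties, and the parameterization `lam * (alpha * L1 + (1 - alpha) / 2 * L2)` are assumptions chosen for illustration.

```python
import math

def penalized_neg_log_partial_likelihood(beta, X, time, event, lam, alpha):
    """Cox negative log partial likelihood plus an elastic net penalty.

    beta  : list of coefficients
    X     : list of covariate rows, one per subject
    time  : observed times (event or censoring)
    event : 1 if the event was observed, 0 if censored
    lam   : overall penalty strength (assumed parameterization)
    alpha : mixing weight, 1.0 = pure lasso, 0.0 = pure ridge
    """
    # Linear predictor for every subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]

    loglik = 0.0
    for i, is_event in enumerate(event):
        if not is_event:
            continue  # censored subjects only appear in risk sets
        # Risk set: everyone still under observation at subject i's event time.
        risk_sum = sum(
            math.exp(eta[j]) for j in range(len(time)) if time[j] >= time[i]
        )
        loglik += eta[i] - math.log(risk_sum)

    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    return -loglik + lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)

# Toy data: three subjects, two covariates, the last subject censored.
X_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
objective_at_zero = penalized_neg_log_partial_likelihood(
    [0.0, 0.0], X_toy, time=[1.0, 2.0, 3.0], event=[1, 1, 0], lam=1.0, alpha=0.5
)
```

Minimizing this objective over `beta` is what shrinks many coefficients to exactly zero (through the L1 term) while keeping correlated covariates together (through the L2 term), which is why the approach suits settings with many more genes than patients.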
Abstract:
This paper introduces a novel approach to making inference about the regression parameters in the accelerated failure time (AFT) model for current status and interval-censored data. The estimator is constructed by inverting a Wald-type test for testing a null proportional hazards model. A numerically efficient Markov chain Monte Carlo (MCMC) based resampling method is proposed to simultaneously obtain the point estimator and a consistent estimator of its variance-covariance matrix. We illustrate our approach with interval-censored data sets from two clinical studies. Extensive numerical studies are conducted to evaluate the finite sample performance of the new estimators.
Abstract:
The ability to make scientific findings reproducible is increasingly important in areas where substantive results are the product of complex statistical computations. Reproducibility can allow others to verify the published findings and conduct alternate analyses of the same data. A question that arises naturally is how can one conduct and distribute reproducible research? This question is relevant from the point of view of both the authors who want to make their research reproducible and readers who want to reproduce relevant findings reported in the scientific literature. We present a framework in which reproducible research can be conducted and distributed via cached computations and describe specific tools for both authors and readers. As a prototype implementation we introduce three software packages written in the R language. The cacheSweave and stashR packages together provide tools for caching computational results in a key-value style database which can be published to a public repository for readers to download. The SRPM package provides tools for generating and interacting with "shared reproducibility packages" (SRPs) which can facilitate the distribution of the data and code. As a case study we demonstrate the use of the toolkit on a national study of air pollution exposure and mortality.
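The idea of conducting research via cached computations can be sketched as follows. This is a toy Python analogue, not the cacheSweave or stashR API: the class `ComputationCache`, its methods, and the choice of a SHA-1 digest over the expression text plus its inputs as the key are all assumptions made for illustration.

```python
import hashlib

class ComputationCache:
    """Toy key-value cache for the results of evaluated expressions."""

    def __init__(self):
        self._store = {}       # key -> cached result
        self.evaluations = 0   # how many times an expression was really evaluated

    def key_for(self, source, env):
        # Key depends on both the expression text and its inputs, so a change
        # to either one forces a fresh evaluation.
        payload = source + "|" + repr(sorted(env.items()))
        return hashlib.sha1(payload.encode("utf-8")).hexdigest()

    def evaluate(self, source, env):
        key = self.key_for(source, env)
        if key not in self._store:
            # Cache miss: run the expression once and remember the result.
            self._store[key] = eval(source, {}, dict(env))
            self.evaluations += 1
        return self._store[key]

cache = ComputationCache()
first = cache.evaluate("sum(xs) / len(xs)", {"xs": [1, 2, 3]})
second = cache.evaluate("sum(xs) / len(xs)", {"xs": [1, 2, 3]})  # served from cache
```

Publishing the contents of such a store alongside the code is what would let a reader load an intermediate result directly instead of rerunning a long computation.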
Abstract:
Latent class regression models are useful tools for assessing associations between covariates and latent variables. However, evaluation of key model assumptions cannot be performed using methods from standard regression models due to the unobserved nature of latent outcome variables. This paper presents graphical diagnostic tools to evaluate whether or not latent class regression models adhere to standard assumptions of the model: conditional independence and non-differential measurement. An integral part of these methods is the use of a Markov Chain Monte Carlo estimation procedure. Unlike standard maximum likelihood implementations for latent class regression model estimation, the MCMC approach allows us to calculate posterior distributions and point estimates of any functions of parameters. It is this convenience that allows us to provide the diagnostic methods that we introduce. As a motivating example we present an analysis focusing on the association between depression and socioeconomic status, using data from the Epidemiologic Catchment Area study. We consider a latent class regression analysis investigating the association between depression and socioeconomic status measures, where the latent variable depression is regressed on education and income indicators, in addition to age, gender, and marital status variables. While the fitted latent class regression model yields interesting results, the model parameters are found to be invalid due to the violation of model assumptions. The violation of these assumptions is clearly identified by the presented diagnostic plots. These methods can be applied to standard latent class and latent class regression models, and the general principle can be extended to evaluate model assumptions in other types of models.
Abstract:
Objective. To examine effects of primary care physicians (PCPs) and patients on the association between charges for primary care and specialty care in a point-of-service (POS) health plan. Data Source. Claims from 1996 for 3,308 adult male POS plan members, each of whom was assigned to one of the 50 family practitioner-PCPs with the largest POS plan member-loads. Study Design. A hierarchical multivariate two-part model was fitted using a Gibbs sampler to estimate PCPs' effects on patients' annual charges for two types of services, primary care and specialty care, the associations among PCPs' effects, and within-patient associations between charges for the two services. Adjusted Clinical Groups (ACGs) were used to adjust for case-mix. Principal Findings. PCPs with higher case-mix adjusted rates of specialist use were less likely to see their patients at least once during the year (estimated correlation: –.40; 95% CI: –.71, –.008) and provided fewer services to patients that they saw (estimated correlation: –.53; 95% CI: –.77, –.21). Ten of 11 PCPs whose case-mix adjusted effects on primary care charges were significantly less than or greater than zero (p < .05) had estimated, case-mix adjusted effects on specialty care charges that were of opposite sign (but not significantly different than zero). After adjustment for ACG and PCP effects, the within-patient, estimated odds ratio for any use of primary care given any use of specialty care was .57 (95% CI: .45, .73). Conclusions. PCPs and patients contributed independently to a trade-off between utilization of primary care and specialty care. The trade-off appeared to partially offset significant differences in the amount of care provided by PCPs. These findings were possible because we employed a hierarchical multivariate model rather than separate univariate models.
Abstract:
We present the cacher and CodeDepends packages for R, which provide tools for (1) caching and analyzing the code for statistical analyses and (2) distributing these analyses to others in an efficient manner over the web. The cacher package takes objects created by evaluating R expressions and stores them in key-value databases. These databases of cached objects can subsequently be assembled into “cache packages” for distribution over the web. The cacher package also provides tools to help readers examine the data and code in a statistical analysis and reproduce, modify, or improve upon the results. In addition, readers can easily conduct alternate analyses of the data. The CodeDepends package provides complementary tools for analyzing and visualizing the code for a statistical analysis and this functionality has been integrated into the cacher package. In this chapter we describe the cacher and CodeDepends packages and provide examples of how they can be used for reproducible research.
Abstract:
The stashR package (a Set of Tools for Administering SHared Repositories) for R implements a simple key-value style database where character string keys are associated with data values. The key-value databases can be either stored locally on the user's computer or accessed remotely via the Internet. Methods specific to the stashR package allow users to share data repositories or access previously created remote data repositories. In particular, methods are available for the S4 classes localDB and remoteDB to insert, retrieve, or delete data from the database as well as to synchronize local copies of the data to the remote version of the database. Users efficiently access information from a remote database by retrieving only the data files indexed by user-specified keys and caching this data in a local copy of the remote database. The local and remote counterparts of the stashR package offer the potential to enhance reproducible research by allowing users of Sweave to cache their R computations for a research paper in a localDB database. This database can then be stored on the Internet as a remoteDB database. When readers of the research paper wish to reproduce the computations involved in creating a specific figure or calculating a specific numeric value, they can access the remoteDB database and obtain the R objects involved in the computation.
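The access pattern described above, retrieving only the requested keys from a remote database and caching them locally, might look roughly like this in Python. The classes `RemoteStore` and `LocalMirror` and their methods are hypothetical stand-ins for the remoteDB and localDB roles; they do not reproduce the stashR interface, and the "remote" side here is just an in-memory dictionary.

```python
class RemoteStore:
    """Stand-in for a remote key-value repository (hypothetical)."""

    def __init__(self, data):
        self._data = dict(data)
        self.requests = 0  # counts simulated downloads

    def download(self, key):
        self.requests += 1
        return self._data[key]


class LocalMirror:
    """Local copy that fetches a value only the first time its key is requested."""

    def __init__(self, remote):
        self._remote = remote
        self._cache = {}

    def fetch(self, key):
        if key not in self._cache:
            # Only the data indexed by the requested key is transferred.
            self._cache[key] = self._remote.download(key)
        return self._cache[key]

    def cached_keys(self):
        return sorted(self._cache)


remote = RemoteStore({"figure1_fit": [0.2, 0.5], "table2_mean": 3.5})
mirror = LocalMirror(remote)
fit = mirror.fetch("figure1_fit")        # downloaded once
fit_again = mirror.fetch("figure1_fit")  # read from the local copy
```

A reader interested in a single figure would thus transfer only the objects behind that figure, leaving the rest of the repository on the server.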
Abstract:
The cause of porcine congenital progressive ataxia and spastic paresis (CPA) is unknown. This severe neuropathy manifests shortly after birth and is lethal. The disease is inherited as a single autosomal recessive allele, designated cpa. In a previous study, we demonstrated close linkage of cpa to microsatellite SW902 on porcine chromosome 3 (SSC3), which corresponds syntenically to human chromosome 2. This latter chromosome contains ion channel genes (Ca(2+), K(+) and Na(+)), a cholinergic receptor gene and the spastin (SPG4) gene, which cause human epilepsy and ataxia when mutated. We mapped porcine CACNB4, KCNJ3, SCN2A and CHRNA1 to SSC15 and SPG4 to SSC3 with the INRA-Minnesota porcine radiation hybrid panel (IMpRH) and we sequenced the entire open reading frames of CACNB4 and SPG4 without finding any differences between healthy and affected piglets. An anti-epileptic drug treatment with ethosuximide did not change the severity of the disease, and pigs with CPA did not exhibit the corticospinal tract axonal degeneration found in humans suffering from hereditary spastic paraplegia, which is associated with mutations in SPG4. For all these reasons, the hypothesis that CACNB4, CHRNA1, KCNJ3, SCN2A or SPG4 are identical with the CPA gene was rejected.
Abstract:
Recent developments in clinical radiology have resulted in additional developments in the field of forensic radiology. After implementation of cross-sectional radiology and optical surface documentation in forensic medicine, difficulties in the validation and analysis of the acquired data were experienced. To address this problem and for the comparison of autopsy and radiological data, a centralized database with internet technology for forensic cases was created. The main goals of the database are (1) creation of a digital and standardized documentation tool for forensic-radiological and pathological findings; (2) establishing a basis for validation of forensic cross-sectional radiology as a non-invasive examination method in forensic medicine, that is, comparing and evaluating the radiological and autopsy data and analyzing the accuracy of such data; and (3) providing a conduit for continuing research and education in forensic medicine. Considering the infrequent availability of CT or MRI for forensic institutions and the heterogeneous nature of case material in forensic medicine, an evaluation of benefits and limitations of cross-sectional imaging concerning certain forensic features by a single institution may be of limited value. A centralized database permitting international forensic and cross-disciplinary collaborations may provide important support for forensic-radiological casework and research.