964 resultados para data fitting


Relevância:

30.00% 30.00%

Publicador:

Resumo:

In natural history studies of chronic disease, it is of interest to understand the evolution of key variables that measure aspects of disease progression. This is particularly true for immunological variables in persons infected with the Human Immunodeficiency Virus (HIV). The natural timescale for such studies is time since infection. However, most data available for analysis arise from prevalent cohorts, where the date of infection is unknown for most or all individuals. As a result, standard curve fitting algorithms are not immediately applicable. Here we propose two methods to circumvent this difficulty. The first uses repeated measurement data to provide information not only on the level of the variable of interest, but also on its rate of change, while the second uses an estimate of the expected time since infection. Both methods are based on the principal curves algorithm of Hastie and Stuetzle, and are applied to data from a prevalent cohort of HIV-infected homosexual men, giving estimates of the average pattern of CD4+ lymphocyte decline. These methods are applicable to natural history studies using data from prevalent cohorts where the time of disease origin is uncertain, provided certain ancillary information is available from external sources.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Estimation of the number of mixture components (k) is an unsolved problem. Available methods for estimation of k include bootstrapping the likelihood ratio test statistics and optimizing a variety of validity functionals such as AIC, BIC/MDL, and ICOMP. We investigate the minimization of distance between fitted mixture model and the true density as a method for estimating k. The distances considered are Kullback-Leibler (KL) and “L sub 2”. We estimate these distances using cross validation. A reliable estimate of k is obtained by voting of B estimates of k corresponding to B cross validation estimates of distance. This estimation methods with KL distance is very similar to Monte Carlo cross validated likelihood methods discussed by Smyth (2000). With focus on univariate normal mixtures, we present simulation studies that compare the cross validated distance method with AIC, BIC/MDL, and ICOMP. We also apply the cross validation estimate of distance approach along with AIC, BIC/MDL and ICOMP approach, to data from an osteoporosis drug trial in order to find groups that differentially respond to treatment.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We propose a new method for fitting proportional hazards models with error-prone covariates. Regression coefficients are estimated by solving an estimating equation that is the average of the partial likelihood scores based on imputed true covariates. For the purpose of imputation, a linear spline model is assumed on the baseline hazard. We discuss consistency and asymptotic normality of the resulting estimators, and propose a stochastic approximation scheme to obtain the estimates. The algorithm is easy to implement, and reduces to the ordinary Cox partial likelihood approach when the measurement error has a degenerative distribution. Simulations indicate high efficiency and robustness. We consider the special case where error-prone replicates are available on the unobserved true covariates. As expected, increasing the number of replicate for the unobserved covariates increases efficiency and reduces bias. We illustrate the practical utility of the proposed method with an Eastern Cooperative Oncology Group clinical trial where a genetic marker, c-myc expression level, is subject to measurement error.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Multiple outcomes data are commonly used to characterize treatment effects in medical research, for instance, multiple symptoms to characterize potential remission of a psychiatric disorder. Often either a global, i.e. symptom-invariant, treatment effect is evaluated. Such a treatment effect may over generalize the effect across the outcomes. On the other hand individual treatment effects, varying across all outcomes, are complicated to interpret, and their estimation may lose precision relative to a global summary. An effective compromise to summarize the treatment effect may be through patterns of the treatment effects, i.e. "differentiated effects." In this paper we propose a two-category model to differentiate treatment effects into two groups. A model fitting algorithm and simulation study are presented, and several methods are developed to analyze heterogeneity presenting in the treatment effects. The method is illustrated using an analysis of schizophrenia symptom data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We are concerned with the estimation of the exterior surface of tube-shaped anatomical structures. This interest is motivated by two distinct scientific goals, one dealing with the distribution of HIV microbicide in the colon and the other with measuring degradation in white-matter tracts in the brain. Our problem is posed as the estimation of the support of a distribution in three dimensions from a sample from that distribution, possibly measured with error. We propose a novel tube-fitting algorithm to construct such estimators. Further, we conduct a simulation study to aid in the choice of a key parameter of the algorithm, and we test our algorithm with validation study tailored to the motivating data sets. Finally, we apply the tube-fitting algorithm to a colon image produced by single photon emission computed tomography (SPECT)and to a white-matter tract image produced using diffusion tensor `imaging (DTI).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Time-averaged discharge rates (TADR) were calculated for five lava flows at Pacaya Volcano (Guatemala), using an adapted version of a previously developed satellite-based model. Imagery acquired during periods of effusive activity between the years 2000 and 2010 were obtained from two sensors of differing temporal and spatial resolutions; the Moderate Resolution Imaging Spectroradiometer (MODIS), and the Geostationary Operational Environmental Satellites (GOES) Imager. A total of 2873 MODIS and 2642 GOES images were searched manually for volcanic “hot spots”. It was found that MODIS imagery, with superior spatial resolution, produced better results than GOES imagery, so only MODIS data were used for quantitative analyses. Spectral radiances were transformed into TADR via two methods; first, by best-fitting some of the parameters (i.e. density, vesicularity, crystal content, temperature change) of the TADR estimation model to match flow volumes previously estimated from ground surveys and aerial photographs, and second by measuring those parameters from lava samples to make independent estimates. A relatively stable relationship was defined using the second method, which suggests the possibility of estimating lava discharge rates in near-real-time during future volcanic crises at Pacaya.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Volcán Pacaya is one of three currently active volcanoes in Guatemala. Volcanic activity originates from the local tectonic subduction of the Cocos plate beneath the Caribbean plate along the Pacific Guatemalan coast. Pacaya is characterized by generally strombolian type activity with occasional larger vulcanian type eruptions approximately every ten years. One particularly large eruption occurred on May 27, 2010. Using GPS data collected for approximately 8 years before this eruption and data from an additional three years of collection afterwards, surface movement covering the period of the eruption can be measured and used as a tool to help understand activity at the volcano. Initial positions were obtained from raw data using the Automatic Precise Positioning Service provided by the NASA Jet Propulsion Laboratory. Forward modeling of observed 3-D displacements for three time periods (before, covering and after the May 2010 eruption) revealed that a plausible source for deformation is related to a vertical dike or planar surface trending NNW-SSE through the cone. For three distinct time periods the best fitting models describe deformation of the volcano: 0.45 right lateral movement and 0.55 m tensile opening along the dike mentioned above from October 2001 through January 2009 (pre-eruption); 0.55 m left lateral slip along the dike mentioned above for the period from January 2009 and January 2011 (covering the eruption); -0.025 m dip slip along the dike for the period from January 2011 through March 2013 (post-eruption). In all bestfit models the dike is oriented with a 75° westward dip. These data have respective RMS misfit values of 5.49 cm, 12.38 cm and 6.90 cm for each modeled period. During the time period that includes the eruption the volcano most likely experienced a combination of slip and inflation below the edifice which created a large scar at the surface down the northern flank of the volcano. All models that a dipping dike may be experiencing a combination of inflation and oblique slip below the edifice which augments the possibility of a westward collapse in the future.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Absolute quantitation of clinical (1)H-MR spectra is virtually always incomplete for single subjects because the separate determination of spectrum, baseline, and transverse and longitudinal relaxation times in single subjects is prohibitively long. Integrated Processing and Acquisition of Data (IPAD) based on a combined 2-dimensional experimental and fitting strategy is suggested to substantially improve the information content from a given measurement time. A series of localized saturation-recovery spectra was recorded and combined with 2-dimensional prior-knowledge fitting to simultaneously determine metabolite T(1) (from analysis of the saturation-recovery time course), metabolite T(2) (from lineshape analysis based on metabolite and water peak shapes), macromolecular baseline (based on T(1) differences and analysis of the saturation-recovery time course), and metabolite concentrations (using prior knowledge fitting and conventional procedures of absolute standardization). The procedure was tested on metabolite solutions and applied in 25 subjects (15-78 years old). Metabolite content was comparable to previously found values. Interindividual variation was larger than intraindividual variation in repeated spectra for metabolite content as well as for some relaxation times. Relaxation times were different for various metabolite groups. Parts of the interindividual variation could be explained by significant age dependence of relaxation times.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Localized short-echo-time (1)H-MR spectra of human brain contain contributions of many low-molecular-weight metabolites and baseline contributions of macromolecules. Two approaches to model such spectra are compared and the data acquisition sequence, optimized for reproducibility, is presented. Modeling relies on prior knowledge constraints and linear combination of metabolite spectra. Investigated was what can be gained by basis parameterization, i.e., description of basis spectra as sums of parametric lineshapes. Effects of basis composition and addition of experimentally measured macromolecular baselines were investigated also. Both fitting methods yielded quantitatively similar values, model deviations, error estimates, and reproducibility in the evaluation of 64 spectra of human gray and white matter from 40 subjects. Major advantages of parameterized basis functions are the possibilities to evaluate fitting parameters separately, to treat subgroup spectra as independent moieties, and to incorporate deviations from straightforward metabolite models. It was found that most of the 22 basis metabolites used may provide meaningful data when comparing patient cohorts. In individual spectra, sums of closely related metabolites are often more meaningful. Inclusion of a macromolecular basis component leads to relatively small, but significantly different tissue content for most metabolites. It provides a means to quantitate baseline contributions that may contain crucial clinical information.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Brain tumor is one of the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of the patients surviving more than 5 years after disease diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. In order to evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e. short vs. long-term survival) and a small subset of genes whose expression values can be used to predict survival time is selected. Following, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Due to a large heterogeneity observed within prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death with gene expression profile directly. We propose a Bayesian ensemble model for survival prediction which is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model which is flexible to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy using a latent variable formulation to model the covariate effects which decreases computation time for model fitting. Also, our proposed models provides a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares results obtained by Random Forest and Bayesian ensemble methods under the biological/clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Purpose Ophthalmologists are confronted with a set of different image modalities to diagnose eye tumors e.g., fundus photography, CT and MRI. However, these images are often complementary and represent pathologies differently. Some aspects of tumors can only be seen in a particular modality. A fusion of modalities would improve the contextual information for diagnosis. The presented work attempts to register color fundus photography with MRI volumes. This would complement the low resolution 3D information in the MRI with high resolution 2D fundus images. Methods MRI volumes were acquired from 12 infants under the age of 5 with unilateral retinoblastoma. The contrast-enhanced T1-FLAIR sequence was performed with an isotropic resolution of less than 0.5mm. Fundus images were acquired with a RetCam camera. For healthy eyes, two landmarks were used: the optic disk and the fovea. The eyes were detected and extracted from the MRI volume using a 3D adaption of the Fast Radial Symmetry Transform (FRST). The cropped volume was automatically segmented using the Split Bregman algorithm. The optic nerve was enhanced by a Frangi vessel filter. By intersection the nerve with the retina the optic disk was found. The fovea position was estimated by constraining the position with the angle between the optic and the visual axis as well as the distance from the optic disk. The optical axis was detected automatically by fitting a parable on to the lens surface. On the fundus, the optic disk and the fovea were detected by using the method of Budai et al. Finally, the image was projected on to the segmented surface using the lens position as the camera center. In tumor affected eyes, the manually segmented tumors were used instead of the optic disk and macula for the registration. Results In all of the 12 MRI volumes that were tested the 24 eyes were found correctly, including healthy and pathological cases. In healthy eyes the optic nerve head was found in all of the tested eyes with an error of 1.08 +/- 0.37mm. A successful registration can be seen in figure 1. Conclusions The presented method is a step toward automatic fusion of modalities in ophthalmology. The combination enhances the MRI volume with higher resolution from the color fundus on the retina. Tumor treatment planning is improved by avoiding critical structures and disease progression monitoring is made easier.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

It is a challenge to measure the impact of releasing data to the public since the effects may not be directly linked to particular open data activities or substantial impact may only occur several years after publishing the data. This paper proposes a framework to assess the impact of releasing open data by applying the Social Return on Investment (SROI) approach. SROI was developed for organizations intended to generate social and environmental benefits thus fitting the purpose of most open data initiatives. We link the four steps of SROI (input, output, outcome, impact) with the 14 high-value data categories of the G8 Open Data Charter to create a matrix of open data examples, activities, and impacts in each of the data categories. This Impact Monitoring Framework helps data providers to navigate the impact space of open data laying out the conceptual basis for further research.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper proposed an automated three-dimensional (3D) lumbar intervertebral disc (IVD) segmentation strategy from Magnetic Resonance Imaging (MRI) data. Starting from two user supplied landmarks, the geometrical parameters of all lumbar vertebral bodies and intervertebral discs are automatically extracted from a mid-sagittal slice using a graphical model based template matching approach. Based on the estimated two-dimensional (2D) geometrical parameters, a 3D variable-radius soft tube model of the lumbar spine column is built by model fitting to the 3D data volume. Taking the geometrical information from the 3D lumbar spine column as constraints and segmentation initialization, the disc segmentation is achieved by a multi-kernel diffeomorphic registration between a 3D template of the disc and the observed MRI data. Experiments on 15 patient data sets showed the robustness and the accuracy of the proposed algorithm.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present a framework for fitting multiple random walks to animal movement paths consisting of ordered sets of step lengths and turning angles. Each step and turn is assigned to one of a number of random walks, each characteristic of a different behavioral state. Behavioral state assignments may be inferred purely from movement data or may include the habitat type in which the animals are located. Switching between different behavioral states may be modeled explicitly using a state transition matrix estimated directly from data, or switching probabilities may take into account the proximity of animals to landscape features. Model fitting is undertaken within a Bayesian framework using the WinBUGS software. These methods allow for identification of different movement states using several properties of observed paths and lead naturally to the formulation of movement models. Analysis of relocation data from elk released in east-central Ontario, Canada, suggests a biphasic movement behavior: elk are either in an "encamped" state in which step lengths are small and turning angles are high, or in an "exploratory" state, in which daily step lengths are several kilometers and turning angles are small. Animals encamp in open habitat (agricultural fields and opened forest), but the exploratory state is not associated with any particular habitat type.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

When choosing among models to describe categorical data, the necessity to consider interactions makes selection more difficult. With just four variables, considering all interactions, there are 166 different hierarchical models and many more non-hierarchical models. Two procedures have been developed for categorical data which will produce the "best" subset or subsets of each model size where size refers to the number of effects in the model. Both procedures are patterned after the Leaps and Bounds approach used by Furnival and Wilson for continuous data and do not generally require fitting all models. For hierarchical models, likelihood ratio statistics (G('2)) are computed using iterative proportional fitting and "best" is determined by comparing, among models with the same number of effects, the Pr((chi)(,k)('2) (GREATERTHEQ) G(,ij)('2)) where k is the degrees of freedom for ith model of size j. To fit non-hierarchical as well as hierarchical models, a weighted least squares procedure has been developed.^ The procedures are applied to published occupational data relating to the occurrence of byssinosis. These results are compared to previously published analyses of the same data. Also, the procedures are applied to published data on symptoms in psychiatric patients and again compared to previously published analyses.^ These procedures will make categorical data analysis more accessible to researchers who are not statisticians. The procedures should also encourage more complex exploratory analyses of epidemiologic data and contribute to the development of new hypotheses for study. ^