545 results for Smoothing


Relevance:

10.00%

Publisher:

Abstract:

This paper proposes a numerically simple routine for locally adaptive smoothing. The locally heterogeneous regression function is modelled as a penalized spline with a smoothly varying smoothing parameter, which is itself modelled as another penalized spline. This is formulated as a hierarchical mixed model, with spline coefficients following a normal distribution whose variances in turn have a smooth structure. The modelling exercise is in line with Baladandayuthapani, Mallick & Carroll (2005) or Crainiceanu, Ruppert & Carroll (2006). In contrast to these papers, however, Laplace's method is used for estimation based on the marginal likelihood. This is numerically simple and fast and provides satisfactory results. We also extend the idea to spatial smoothing and smoothing in the presence of non-normal responses.
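
A minimal sketch of such a hierarchical formulation, with notation assumed here rather than taken from the paper, could read:

```latex
% Sketch of an adaptive penalized spline as a hierarchical mixed model.
% y_i observations, B_j spline basis functions, u_j spline coefficients;
% the coefficient variances are themselves smooth in the knot index j.
\begin{align*}
y_i &= \sum_j u_j B_j(x_i) + \varepsilon_i,
  & \varepsilon_i &\sim N(0, \sigma_\varepsilon^2),\\
u_j &\sim N(0, \sigma_j^2),
  & \log \sigma_j^2 &= \sum_k v_k \tilde{B}_k(j), \quad v_k \sim N(0, \tau^2).
\end{align*}
% The variance parameters are estimated by maximizing the marginal
% likelihood, with the integral over (u, v) approximated by Laplace's method.
```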

Relevance:

10.00%

Publisher:

Abstract:

Submicroscopic changes in chromosomal DNA copy number dosage are common and have been implicated in many heritable diseases and cancers. Recent high-throughput technologies have a resolution that permits the detection of segmental changes in DNA copy number that span thousands of base pairs across the genome. Genome-wide association studies (GWAS) may simultaneously screen for copy number-phenotype and SNP-phenotype associations as part of the analytic strategy. However, genome-wide array analyses are particularly susceptible to batch effects, as the logistics of preparing DNA and processing thousands of arrays often involve multiple laboratories and technicians, or changes over calendar time to the reagents and laboratory equipment. Failure to adjust for batch effects can lead to incorrect inference and requires inefficient post-hoc quality control procedures that exclude regions associated with batch. Our work extends previous model-based approaches for copy number estimation by explicitly modeling batch effects and using shrinkage to improve locus-specific estimates of copy number uncertainty. Key features of this approach include the use of diallelic genotype calls from experimental data to estimate batch- and locus-specific parameters of background and signal without the requirement of training data. We illustrate these ideas using a study of bipolar disorder and a study of chromosome 21 trisomy. The former has batch effects that dominate much of the observed variation in quantile-normalized intensities, while the latter illustrates the robustness of our approach to datasets where as many as 25% of the samples have altered copy number. Locus-specific estimates of copy number can be plotted on the copy-number scale to investigate mosaicism and guide the choice of appropriate downstream approaches for smoothing the copy number as a function of physical position. The software is open source and implemented in the R package CRLMM available at Bioconductor (http://www.bioconductor.org).

Relevance:

10.00%

Publisher:

Abstract:

Numerous time series studies have provided strong evidence of an association between increased levels of ambient air pollution and increased levels of hospital admissions, typically at 0, 1, or 2 days after an air pollution episode. An important research aim is to extend existing statistical models so that a more detailed understanding of the time course of hospitalization after exposure to air pollution can be obtained. Information about this time course, combined with prior knowledge about biological mechanisms, could provide the basis for hypotheses concerning the mechanism by which air pollution causes disease. Previous studies have identified two important methodological questions: (1) How can we estimate the shape of the distributed lag between increased air pollution exposure and increased mortality or morbidity? and (2) How should we estimate the cumulative population health risk from short-term exposure to air pollution? Distributed lag models are appropriate tools for estimating air pollution health effects that may be spread over several days. However, estimation for distributed lag models in air pollution and health applications is hampered by the substantial noise in the data and the inherently weak signal that is the target of investigation. We introduce a hierarchical Bayesian distributed lag model that incorporates prior information about the time course of pollution effects and combines information across multiple locations. The model has a connection to penalized spline smoothing using a special type of penalty matrix. We apply the model to estimating the distributed lag between exposure to particulate matter air pollution and hospitalization for cardiovascular and respiratory disease using data from a large United States air pollution and hospitalization database of Medicare enrollees in 94 counties covering the years 1999-2002.
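
In generic notation (assumed here, not taken from the paper), a distributed lag model for daily counts has the form:

```latex
% Generic Poisson distributed lag model (notation assumed):
% Y_t daily admission count, x_{t-l} pollution level l days earlier,
% z_t confounders (weather, season), L maximum lag considered.
\begin{equation*}
Y_t \sim \mathrm{Poisson}(\mu_t), \qquad
\log \mu_t = \alpha + \sum_{l=0}^{L} \theta_l \, x_{t-l} + \gamma^\top z_t ,
\end{equation*}
% The lag coefficients theta_0, ..., theta_L describe the time course of the
% effect and their sum is the cumulative effect; a prior (or penalty) on the
% theta_l smooths the lag curve, which is where the connection to penalized
% spline smoothing arises.
```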

Relevance:

10.00%

Publisher:

Abstract:

We present a state-of-the-art application of smoothing for dependent bivariate binomial spatial data to Loa loa prevalence mapping in West Africa. This application is special because it starts with the non-spatial calibration of survey instruments, continues with the spatial model building and assessment, and ends with robust, tested software that will be used by the field scientists of the World Health Organization for online prevalence map updating. From a statistical perspective several important methodological issues were addressed: (a) building spatial models that are complex enough to capture the structure of the data but remain computationally usable; (b) reducing the computational burden in the handling of very large covariate data sets; (c) devising methods for comparing spatial prediction methods for a given exceedance policy threshold.
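
In generic model-based geostatistics notation (assumed here; the paper's exact specification may differ), the kind of binomial spatial model involved can be sketched as:

```latex
% Generic binomial geostatistical model: Y_i positives out of n_i people
% sampled at location x_i, d(x) environmental covariates, S a stationary
% Gaussian process capturing residual spatial dependence.
\begin{align*}
Y_i \mid S(x_i) &\sim \mathrm{Binomial}\bigl(n_i,\, p(x_i)\bigr),\\
\operatorname{logit} p(x_i) &= d(x_i)^\top \beta + S(x_i).
\end{align*}
% Prevalence maps then report, for each location x, quantities such as
% Pr{ p(x) > c | data } for a policy exceedance threshold c.
```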

Relevance:

10.00%

Publisher:

Abstract:

Amplifications and deletions of chromosomal DNA, as well as copy-neutral loss of heterozygosity, have been associated with disease processes. High-throughput single nucleotide polymorphism (SNP) arrays are useful for making genome-wide estimates of copy number and genotype calls. Because neighboring SNPs in high-throughput SNP arrays are likely to have dependent copy number and genotype due to the underlying haplotype structure and linkage disequilibrium, hidden Markov models (HMMs) may be useful for improving genotype calls and copy number estimates that do not incorporate information from nearby SNPs. We improve previous approaches that utilize an HMM framework for inference in high-throughput SNP arrays by integrating copy number, genotype calls, and the corresponding confidence scores when available. Using simulated data, we demonstrate how confidence scores control smoothing in a probabilistic framework. Software for fitting HMMs to SNP array data is available in the R package ICE.
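
As a hedged illustration of how a hidden state is smoothed along a chromosome, the following is a generic forward-backward recursion (not the ICE implementation; the toy states, emissions, and transition probabilities are assumptions):

```python
import numpy as np
from scipy.special import logsumexp

def forward_backward(log_emission, log_trans, log_init):
    """Posterior state probabilities for an HMM via forward-backward.

    log_emission : (T, K) array, log P(obs_t | state = k)
    log_trans    : (K, K) array, log P(state_{t+1} = j | state_t = i)
    log_init     : (K,)   array, log P(state_0 = k)
    """
    T, K = log_emission.shape
    log_alpha = np.empty((T, K))
    log_beta = np.zeros((T, K))           # beta_T = 1 on the probability scale

    log_alpha[0] = log_init + log_emission[0]
    for t in range(1, T):                 # forward recursion
        log_alpha[t] = log_emission[t] + logsumexp(
            log_alpha[t - 1][:, None] + log_trans, axis=0)

    for t in range(T - 2, -1, -1):        # backward recursion
        log_beta[t] = logsumexp(
            log_trans + (log_emission[t + 1] + log_beta[t + 1])[None, :],
            axis=1)

    log_post = log_alpha + log_beta
    log_post -= logsumexp(log_post, axis=1, keepdims=True)
    return np.exp(log_post)

# Toy example: two hypothetical copy-number states; noisy observations are
# pulled toward their neighbours by the transition structure (smoothing).
rng = np.random.default_rng(0)
obs = np.r_[rng.normal(0.0, 1.0, 50), rng.normal(1.5, 1.0, 50)]
means = np.array([0.0, 1.5])
log_em = -0.5 * (obs[:, None] - means[None, :]) ** 2   # Gaussian log-likelihood (up to a constant)
log_tr = np.log(np.array([[0.99, 0.01], [0.01, 0.99]]))
post = forward_backward(log_em, log_tr, np.log([0.5, 0.5]))
print(post[:5].round(3))
```

Confidence scores would enter such a scheme by sharpening or flattening the per-locus emission terms, which is the sense in which they control the degree of smoothing.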

Relevance:

10.00%

Publisher:

Abstract:

Recurrent event data are largely characterized by the rate function, but smoothing techniques for estimating the rate function have never been rigorously developed or studied in the statistical literature. This paper considers the moment and least squares methods for estimating the rate function from recurrent event data. Under an independent censoring assumption on the recurrent event process, we study statistical properties of the proposed estimators and propose bootstrap procedures for bandwidth selection and for the approximation of confidence intervals in the estimation of the occurrence rate function. We find that the moment method, without resmoothing via a smaller bandwidth, produces curves with nicks at the censoring times, whereas the least squares method does not have this problem. Furthermore, the asymptotic variance of the least squares estimator is shown to be smaller under regularity conditions. However, in the implementation of the bootstrap procedures, the moment method is computationally more efficient than the least squares method because the former approach uses condensed bootstrap data. The performance of the proposed procedures is studied through Monte Carlo simulations and an epidemiological example on intravenous drug users.
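
One generic kernel-smoothing construction for the occurrence rate function (assumed notation; not necessarily the exact estimators studied in the paper) is:

```latex
% Generic kernel-smoothed rate estimator: N_i(t) counts events for subject i,
% Y(s) = #{i : C_i >= s} is the number of subjects still under observation at
% time s, K is a kernel and h a bandwidth.
\begin{equation*}
\hat{\lambda}(t) \;=\; \frac{1}{h}\sum_{i=1}^{n} \int
K\!\left(\frac{t-s}{h}\right) \frac{\mathrm{d}N_i(s)}{Y(s)} .
\end{equation*}
% A least squares alternative instead chooses the smooth curve minimizing an
% integrated squared distance to the observed counting processes; in either
% case the bandwidth h is selected by bootstrap.
```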

Relevance:

10.00%

Publisher:

Abstract:

A time series is a sequence of observations made over time. Examples in public health include daily ozone concentrations, weekly admissions to an emergency department or annual expenditures on health care in the United States. Time series models are used to describe the dependence of the response at each time on predictor variables including covariates and possibly previous values in the series. Time series methods are necessary to account for the correlation among repeated responses over time. This paper gives an overview of time series ideas and methods used in public health research.

Relevance:

10.00%

Publisher:

Abstract:

Visual fixation is employed by humans and some animals to keep a specific 3D location at the center of the visual gaze. Inspired by this phenomenon in nature, this paper explores the idea of transferring this mechanism to the context of video stabilization for a handheld video camera. A novel approach is presented that stabilizes a video by fixating on automatically extracted 3D target points. This approach is different from existing automatic solutions that stabilize the video by smoothing. To determine the 3D target points, the recorded scene is analyzed with a state-of-the-art structure-from-motion algorithm, which estimates camera motion and reconstructs a 3D point cloud of the static scene objects. Special algorithms are presented that search for either virtual or real 3D target points, which back-project close to the center of the image for as long a period of time as possible. The stabilization algorithm then transforms the original images of the sequence so that these 3D target points are kept exactly in the center of the image, which, in the case of real 3D target points, produces a perfectly stable result at the image center. Furthermore, different methods of additional user interaction are investigated. It is shown that the stabilization process can easily be controlled and that it can be combined with state-of-the-art tracking techniques in order to obtain a powerful image stabilization tool. The approach is evaluated on a variety of videos taken with a hand-held camera in natural scenes.
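
As a hedged sketch of the point-selection step (all names, signatures, and the pixel threshold below are illustrative assumptions, not taken from the paper), a candidate 3D target point can be scored by how many frames it back-projects near the image center:

```python
import numpy as np

def center_score(point_3d, projections, image_size, radius_px=50.0):
    """Count the frames in which a candidate 3D point back-projects within
    `radius_px` of the image center.

    projections : list of 3x4 camera projection matrices (one per frame), as
                  produced by a structure-from-motion pipeline.
    image_size  : (width, height) in pixels.
    """
    cx, cy = image_size[0] / 2.0, image_size[1] / 2.0
    X = np.append(np.asarray(point_3d, dtype=float), 1.0)  # homogeneous coords
    score = 0
    for P in projections:
        x = P @ X
        if x[2] <= 0:                      # point is behind the camera
            continue
        u, v = x[0] / x[2], x[1] / x[2]
        if (u - cx) ** 2 + (v - cy) ** 2 <= radius_px ** 2:
            score += 1
    return score
```

The point maximizing such a score over a run of consecutive frames would then be kept fixed at the image center by warping each frame accordingly.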

Relevance:

10.00%

Publisher:

Abstract:

Stochastic models for three-dimensional particles have many applications in the applied sciences. Lévy-based particle models are a flexible approach to particle modelling. The structure of the random particles is given by a kernel smoothing of a Lévy basis. The models are easy to simulate, but statistical inference procedures have not yet received much attention in the literature. The kernel is not always identifiable and we suggest one approach to remedy this problem. We propose a method to draw inference about the kernel from data often used in local stereology and study the performance of our approach in a simulation study.
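
The kernel smoothing of a Lévy basis mentioned above takes the generic form (notation assumed here):

```latex
% Kernel-smoothed Levy basis: L is a Levy basis on R^3 and k a kernel; the
% random field X describes, e.g., the deformation of a reference particle.
\begin{equation*}
X(u) \;=\; \int_{\mathbb{R}^3} k(u, v)\, L(\mathrm{d}v),
\end{equation*}
% so the smoothness and covariance structure of the particle boundary are
% controlled by k, which is why the kernel is the target of inference.
```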

Relevance:

10.00%

Publisher:

Abstract:

One common assumption in interpreting ice-core CO₂ records is that diffusion in the ice does not affect the concentration profile. However, this assumption remains untested because the extremely small CO₂ diffusion coefficient in ice has not been accurately determined in the laboratory. In this study we take advantage of high levels of CO₂ associated with refrozen layers in an ice core from Siple Dome, Antarctica, to study CO₂ diffusion rates. We use noble gases (Xe/Ar and Kr/Ar), electrical conductivity and Ca²⁺ ion concentrations to show that substantial CO₂ diffusion may occur in ice on timescales of thousands of years. We estimate the permeation coefficient for CO₂ in ice to be ~4 × 10⁻²¹ mol m⁻¹ s⁻¹ Pa⁻¹ at −23 °C in the top 287 m (corresponding to 2.74 kyr). Smoothing of the CO₂ record by diffusion at this depth/age is one or two orders of magnitude smaller than the smoothing in the firn. However, simulations for depths of ~930–950 m (~60–70 kyr) indicate that smoothing of the CO₂ record by diffusion in deep ice is comparable to smoothing in the firn. Other types of diffusion (e.g. via liquid in ice grain boundaries or veins) may also be important, but their influence has not been quantified.

Relevance:

10.00%

Publisher:

Abstract:

Recently a new method to set the scale in lattice gauge theories, based on the gradient flow generated by the Wilson action, has been proposed, and the systematic errors of the new scales t0 and w0 have been investigated by various groups. The Wilson flow also provides an interesting alternative smoothing procedure, particularly useful for the measurement of the topological charge as a pure gluonic observable. We show the viability of this method for N=1 supersymmetric Yang-Mills theory by analysing the configurations produced by the DESY-Muenster Collaboration. The relation between the scale and the topological charge has been investigated, showing a strong correlation. We find that the scale depends linearly on the topological charge, with a slope that increases with decreasing volume and decreasing gluino mass. Moreover, we have investigated this dependence as a function of the reference parameter used to define the scale: the tuning of this parameter turns out to be crucial for a more reliable scale setting. Similar conclusions hold for the Sommer parameter r0.
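
For reference, the gradient flow and scale definitions commonly used in the literature (conventions may differ in detail from the paper) are:

```latex
% The gradient (Wilson) flow evolves the gauge field B_mu in flow time t,
% starting from the original field A_mu; E(t) is the flowed action density.
\begin{gather*}
\partial_t B_\mu = D_\nu G_{\nu\mu}, \qquad B_\mu\big|_{t=0} = A_\mu,\\
t^2 \langle E(t) \rangle \big|_{t=t_0} = c, \qquad
t \frac{\mathrm{d}}{\mathrm{d}t}\, t^2 \langle E(t) \rangle \Big|_{t=w_0^2} = c,
\end{gather*}
% with a reference value c (a common choice is 0.3); the "reference
% parameter" whose tuning the abstract discusses is, presumably, this c.
```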

Relevance:

10.00%

Publisher:

Abstract:

The concentrations of chironomid remains in lake sediments are very variable and, therefore, chironomid stratigraphies often include samples with a low number of counts. Thus, the effect of low count sums on reconstructed temperatures is an important issue when applying chironomid‐temperature inference models. Using an existing data set, we simulated low count sums by randomly picking subsets of head capsules from surface‐sediment samples with a high number of specimens. Subsequently, a chironomid‐temperature inference model was used to assess how the inferred temperatures are affected by low counts. The simulations indicate that the variability of inferred temperatures increases progressively with decreasing count sums. At counts below 50 specimens, a further reduction in count sum can cause a disproportionate increase in the variation of inferred temperatures, whereas at higher count sums the inferences are more stable. Furthermore, low count samples may consistently infer too low or too high temperatures and, therefore, produce a systematic error in a reconstruction. Smoothing reconstructed temperatures downcore is proposed as a possible way to compensate for the high variability due to low count sums. By combining adjacent samples in a stratigraphy, to produce samples of a more reliable size, it is possible to assess if low counts cause a systematic error in inferred temperatures.
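
A minimal sketch of the kind of resampling used in such a simulation (illustrative taxon names and counts; not the authors' code):

```python
import numpy as np

def subsample_counts(counts, n_picked, n_rep=1000, rng=None):
    """Simulate low count sums by randomly picking `n_picked` head capsules
    (without replacement) from a sample with known per-taxon counts.

    counts : dict mapping taxon name -> number of head capsules in the full
             sample.  Returns a list of `n_rep` dicts of subsampled counts.
    """
    rng = rng or np.random.default_rng()
    taxa = list(counts)
    pool = np.repeat(np.arange(len(taxa)), [counts[t] for t in taxa])
    out = []
    for _ in range(n_rep):
        picked = rng.choice(pool, size=n_picked, replace=False)
        sub = np.bincount(picked, minlength=len(taxa))
        out.append(dict(zip(taxa, sub.tolist())))
    return out

# Each subsample would then be passed to the chironomid-temperature transfer
# function; the spread of inferred temperatures across replicates quantifies
# the extra variability caused by the reduced count sum.
sample = {"Tanytarsus": 40, "Chironomus": 25, "Procladius": 15}
reps = subsample_counts(sample, n_picked=30, n_rep=5,
                        rng=np.random.default_rng(1))
print(reps[0])
```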

Relevance:

10.00%

Publisher:

Abstract:

We present a comparison of different definitions of the topological charge on the lattice, using a small-volume ensemble with 2 flavours of dynamical twisted mass fermions. The investigated definitions are: the index of the overlap Dirac operator, spectral projectors, the spectral flow of the Hermitian Wilson-Dirac operator, and the field-theoretic definition with different kinds of smoothing of the gauge fields (HYP and APE smearings, gradient flow, cooling). We also show some results on the topological susceptibility.
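
The field-theoretic definition referred to above is, in continuum notation (its lattice discretization depends on the chosen smoothing):

```latex
% Field-theoretic topological charge Q and topological susceptibility chi,
% evaluated on smoothed gauge fields to suppress ultraviolet fluctuations.
\begin{equation*}
Q \;=\; \frac{1}{32\pi^2} \int \mathrm{d}^4x \,
\epsilon_{\mu\nu\rho\sigma}\,
\operatorname{tr}\bigl[ F_{\mu\nu}(x)\, F_{\rho\sigma}(x) \bigr],
\qquad
\chi_{\mathrm{top}} = \frac{\langle Q^2 \rangle}{V}.
\end{equation*}
```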

Relevance:

10.00%

Publisher:

Abstract:

The Data Envelopment Analysis (DEA) efficiency score obtained for an individual firm is a point estimate without any confidence interval around it. In recent years, researchers have resorted to bootstrapping in order to generate empirical distributions of efficiency scores. This procedure assumes that all firms have the same probability of getting an efficiency score from any specified interval within the [0,1] range. We propose a bootstrap procedure that empirically generates the conditional distribution of efficiency for each individual firm given systematic factors that influence its efficiency. Instead of resampling directly from the pooled DEA scores, we first regress these scores on a set of explanatory variables not included at the DEA stage and bootstrap the residuals from this regression. These pseudo-efficiency scores incorporate the systematic effects of unit-specific factors along with the contribution of the randomly drawn residual. Data from the U.S. airline industry are utilized in an empirical application.
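
A hedged sketch of such a two-stage conditional bootstrap (illustrative OLS version with made-up data; the paper's procedure may differ in detail, e.g. in how scores are transformed or truncated):

```python
import numpy as np

def conditional_bootstrap(dea_scores, Z, n_boot=2000, rng=None):
    """Firm-specific efficiency distributions: regress DEA scores on
    environmental variables Z, resample the regression residuals, and add
    them back to each firm's fitted value to form pseudo-efficiency scores.
    """
    rng = rng or np.random.default_rng()
    X = np.column_stack([np.ones(len(dea_scores)), Z])
    beta, *_ = np.linalg.lstsq(X, dea_scores, rcond=None)
    fitted = X @ beta
    resid = dea_scores - fitted
    draws = rng.choice(resid, size=(n_boot, len(dea_scores)), replace=True)
    pseudo = fitted[None, :] + draws          # shape (n_boot, n_firms)
    return np.clip(pseudo, 0.0, 1.0)          # keep scores in [0, 1]

# Example: 5 firms, one environmental variable (all numbers invented).
rng = np.random.default_rng(2)
scores = np.array([0.71, 0.84, 0.90, 0.66, 1.00])
Z = np.array([[1.2], [0.4], [0.1], [2.0], [0.3]])
dist = conditional_bootstrap(scores, Z, n_boot=1000, rng=rng)
print(dist.mean(axis=0).round(3))             # firm-specific bootstrap means
```

The resulting distribution is conditional on each firm's own environment rather than pooled across all firms, which is the point of the proposed procedure.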

Relevance:

10.00%

Publisher:

Abstract:

Many studies have shown relationships between air pollution and the rate of hospital admissions for asthma. A few studies have controlled for age-specific effects by adding separate smoothing functions for each age group. However, it has not yet been reported whether air pollution effects differ significantly between age groups. This lack of information is the motivation for this study, which tests the hypothesis that air pollution effects on asthma hospital admissions differ significantly by age group. Each air pollutant's effect on asthma hospital admissions was estimated separately by age group. In this study, daily time-series data for hospital admission rates from seven cities in Korea from June 1999 through 2003 were analyzed. The outcome variable, daily hospital admission rates for asthma, was related to five air pollutants used as independent variables: particulate matter <10 micrometers (μm) in aerodynamic diameter (PM10), carbon monoxide (CO), ozone (O3), nitrogen dioxide (NO2), and sulfur dioxide (SO2). Meteorological variables were considered as confounders. Admission data were divided into three age groups: children (<15 years of age), adults (ages 15-64), and the elderly (≥65 years of age). The adult age group was taken as the reference group for each city. To estimate age-specific air pollution effects, the analysis was separated into two stages. In the first stage, generalized additive models (GAMs) with cubic splines for smoothing were applied to estimate the age- and city-specific air pollution effects on asthma hospital admission rates. In the second stage, a Bayesian hierarchical model with a non-informative (large-variance) prior was used to combine city-specific effects by age group. The hypothesis test showed that the effects of PM10, CO and NO2 differed significantly by age group. Taking the air pollution effect for adults as the zero reference, age-specific air pollution effects were: -0.00154 (95% confidence interval (CI) = (-0.0030, -0.0001)) for children and 0.00126 (95% CI = (0.0006, 0.0019)) for the elderly for PM10; -0.0195 (95% CI = (-0.0386, -0.0004)) for children for CO; and 0.00494 (95% CI = (0.0028, 0.0071)) for the elderly for NO2. Relative rates (RRs) were 1.008 (95% CI = (1.000-1.017)) in adults and 1.021 (95% CI = (1.012-1.030)) in the elderly for every 10 μg/m3 increase of PM10; 1.019 (95% CI = (1.005-1.033)) in adults and 1.022 (95% CI = (1.012-1.033)) in the elderly for every 0.1 part per million (ppm) increase of CO; and 1.006 (95% CI = (1.002-1.009)) and 1.019 (95% CI = (1.007-1.032)) in the elderly for every 1 part per billion (ppb) increase of NO2 and SO2, respectively. Asthma hospital admissions were significantly increased for PM10 and CO in adults, and for PM10, CO, NO2 and SO2 in the elderly.
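
A generic sketch of such a two-stage analysis, in notation assumed here rather than taken from the study:

```latex
% Stage 1: city- and age-specific GAM for daily counts; Stage 2: hierarchical
% combination of the estimated log relative rates across cities.
\begin{align*}
\text{Stage 1:}\quad & Y_{ct} \sim \mathrm{Poisson}(\mu_{ct}), \quad
\log \mu_{ct} = \beta_{ca}\, x_{ct} + s(\text{time}) + s(\text{weather}),\\
\text{Stage 2:}\quad & \hat{\beta}_{ca} \sim N\!\bigl(\beta_{a},\, \hat{v}_{ca}\bigr),
\qquad \beta_{a} \sim N\!\bigl(\mu_{a}, \tau_{a}^2\bigr),
\end{align*}
% where c indexes city, a the age group, x_{ct} the pollutant level, s(.) are
% cubic-spline smooth terms, and v_ca is the stage-1 sampling variance; age
% differences are assessed through contrasts of the pooled beta_a relative to
% the adult reference group.
```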