950 results for kernel density method


Relevance:

90.00%

Publisher:

Abstract:

The identification of disease clusters in space or space-time is of vital importance for public health policy and action. In the case of methicillin-resistant Staphylococcus aureus (MRSA), it is particularly important to distinguish between community and health care-associated infections, and to identify reservoirs of infection. A total of 832 cases of MRSA in the West Midlands (UK) were tested for clustering and evidence of community transmission, after being geo-located to the centroids of UK unit postcodes (postal areas roughly equivalent to Zip+4 zip code areas). An age-stratified analysis was also carried out at the coarser spatial resolution of UK Census Output Areas. Stochastic simulation and kernel density estimation were combined to identify significant local clusters of MRSA (p<0.025), which were supported by SaTScan spatial and spatio-temporal scan statistics. In order to investigate local sampling effort, a spatial 'random labelling' approach was used, with MRSA as cases and MSSA (methicillin-sensitive S. aureus) as controls. Heavy sampling in general was a response to MRSA outbreaks, which in turn appeared to be associated with medical care environments. The significance of clusters identified by kernel estimation was independently supported by information on the locations and client groups of nursing homes, and by preliminary molecular typing of isolates. In the absence of occupational/lifestyle data on patients, the assumption was made that an individual's location and consequent risk is adequately represented by their residential postcode. The problems of this assumption are discussed, with recommendations for future data collection.
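
As a rough illustration of the case-control idea described above, the sketch below estimates case (MRSA) and control (MSSA) kernel density surfaces, forms a log risk ratio, and assesses pointwise significance by Monte Carlo relabelling. All coordinates and counts are fabricated; the original analysis used postcode centroids and SaTScan, neither of which is reproduced here.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Fabricated geocoded residences (km); the real data were postcode centroids.
cases = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(80, 2))      # MRSA
controls = rng.normal(loc=[0.0, 0.0], scale=2.0, size=(400, 2))  # MSSA

grid = np.mgrid[-5:6:40j, -5:6:40j].reshape(2, -1)

def log_risk(case_pts, ctrl_pts):
    """Log ratio of case to control kernel density surfaces."""
    eps = 1e-300  # guard against log(0) where a density underflows
    return (np.log(gaussian_kde(case_pts.T)(grid) + eps)
            - np.log(gaussian_kde(ctrl_pts.T)(grid) + eps))

observed = log_risk(cases, controls)

# Random labelling: permute case/control labels over the pooled points
# to build the null distribution of the risk surface.
pooled = np.vstack([cases, controls])
n_case, n_sim = len(cases), 99
null = np.empty((n_sim, grid.shape[1]))
for i in range(n_sim):
    idx = rng.permutation(len(pooled))
    null[i] = log_risk(pooled[idx[:n_case]], pooled[idx[n_case:]])

# Pointwise Monte Carlo p-values; small values flag local case clusters.
p = (1 + (null >= observed).sum(axis=0)) / (n_sim + 1)
print("grid cells with p < 0.025:", int((p < 0.025).sum()))
```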

Relevance:

90.00%

Publisher:

Abstract:

Quantile regression (QR) was first introduced by Roger Koenker and Gilbert Bassett in 1978. It is robust to outliers, which strongly affect the least squares estimator in linear regression. Instead of modelling the mean of the response, QR provides an alternative way to model the relationship between quantiles of the response and covariates. QR can therefore be widely used to solve problems in econometrics, environmental sciences and health sciences. Sample size is an important factor in the planning stage of experimental designs and observational studies. In ordinary linear regression, sample size may be determined from either precision analysis or power analysis with closed-form formulas. There are also methods that calculate sample size for QR based on precision analysis, such as C. Jennen-Steinmetz and S. Wellek (2005). A method to estimate sample size for QR based on power analysis was proposed by Shao and Wang (2009). In this paper, a new method is proposed to calculate sample size based on power analysis under a hypothesis test of covariate effects. Even though an error distribution assumption is not necessary for QR analysis itself, researchers have to make assumptions about the error distribution and covariate structure in the planning stage of a study to obtain a reasonable estimate of sample size. In this project, both parametric and nonparametric methods are provided to estimate the error distribution. Since the proposed method is implemented in R, the user is able to choose either a parametric distribution or nonparametric kernel density estimation for the error distribution. The user also needs to specify the covariate structure and effect size to carry out the sample size and power calculation. The performance of the proposed method is further evaluated using numerical simulation. The results suggest that the sample sizes obtained from our method provide empirical powers close to the nominal power level, for example, 80%.
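
The abstract does not give the algorithm in detail; the following sketch shows one plausible form of a simulation-based power calculation for QR, using statsmodels' QuantReg rather than the authors' R implementation. The covariate structure, effect size and error distribution here are placeholder assumptions.

```python
import numpy as np
from statsmodels.regression.quantile_regression import QuantReg

rng = np.random.default_rng(1)

def power_at_n(n, beta=0.5, tau=0.5, n_sim=200, alpha=0.05):
    """Empirical power of the Wald test for a single covariate effect
    in quantile regression, under assumed error and covariate models."""
    rejections = 0
    for _ in range(n_sim):
        x = rng.normal(size=n)                   # assumed covariate structure
        y = 1.0 + beta * x + rng.normal(size=n)  # assumed error distribution
        X = np.column_stack([np.ones(n), x])
        res = QuantReg(y, X).fit(q=tau)
        rejections += res.pvalues[1] < alpha
    return rejections / n_sim

# Increase n until the empirical power reaches the nominal 80% level.
for n in range(40, 201, 20):
    pw = power_at_n(n)
    print(n, round(pw, 3))
    if pw >= 0.80:
        break
```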

Relevance:

80.00%

Publisher:

Abstract:

Izenman and Sommer (1988) used a non-parametric kernel density estimation technique to fit a seven-component model to the paper thickness of the 1872 Hidalgo stamp issue of Mexico. They observed an apparent conflict when fitting a three-component normal mixture model with unequal variances. This conflict is examined further by investigating the most appropriate number of components when fitting a normal mixture with equal component variances.
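
A minimal sketch of the equal-variance mixture comparison, using scikit-learn's GaussianMixture with a tied covariance (which in one dimension forces equal component variances) and BIC to pick the number of components. The data below are simulated stand-ins, not the Hidalgo stamp measurements.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)

# Stand-in for the stamp thickness data (the 1872 Hidalgo measurements
# are not reproduced here): a three-component equal-variance mixture.
thickness = np.concatenate([
    rng.normal(0.07, 0.004, 200),
    rng.normal(0.09, 0.004, 150),
    rng.normal(0.11, 0.004, 100),
]).reshape(-1, 1)

# 'tied' covariance forces a common variance across components,
# matching the equal-variance normal mixture considered in the paper.
for k in range(1, 8):
    gm = GaussianMixture(n_components=k, covariance_type='tied',
                         n_init=5, random_state=0).fit(thickness)
    print(k, round(gm.bic(thickness), 1))
# The k minimizing BIC suggests the most appropriate number of components.
```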

Relevance:

80.00%

Publisher:

Abstract:

This work discusses the importance of renewable production forecasting in an island environment. A probabilistic forecast based on kernel density estimators is proposed. The aggregation of these forecasts allows determination of the amount of thermal generation needed to schedule and operate the power grid of an island with high penetration of renewable generation. A case study based on the electric system of S. Miguel Island is presented. The results show that the forecasting techniques are an indispensable tool to help grid management.
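
One plausible reading of such a KDE-based probabilistic forecast is sketched below: fit a kernel density to historical forecast errors, turn it into predictive quantiles, and commit thermal generation against the lower quantile. All figures are invented and the paper's actual method may differ.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)

# Hypothetical history: errors incurred by past point forecasts of
# renewable output (MW).
errors = rng.normal(0, 2.0, 500)
point_forecast = 12.0                     # next-hour renewable forecast, MW
demand = 30.0                             # expected island load, MW

# Kernel density estimate of the forecast-error distribution.
kde = gaussian_kde(errors)

# Turn the density into predictive quantiles by sampling from it.
samples = point_forecast + kde.resample(10_000, seed=3)[0]
q05, q50, q95 = np.percentile(samples, [5, 50, 95])
print(f"renewable 90% band: [{q05:.1f}, {q95:.1f}] MW")

# Schedule enough thermal generation to cover demand even if renewable
# output falls to its lower quantile.
print(f"thermal to commit: {max(demand - q05, 0):.1f} MW")
```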

Relevance:

80.00%

Publisher:

Abstract:

This paper presents an analysis of motor vehicle insurance claims relating to vehicle damage and to associated medical expenses. We use univariate severity distributions estimated with parametric and non-parametric methods. The methods are implemented using the statistical package R. The parametric analysis is limited to the estimation of normal and lognormal distributions for each of the two claim types. The nonparametric analysis presented involves kernel density estimation. We illustrate the benefits of applying transformations to data prior to employing kernel-based methods. We use a log-transformation and an optimal transformation amongst a class of transformations that produces symmetry in the data. The central aim of this paper is to provide educators with material that can be used in the classroom to teach statistical estimation methods, goodness-of-fit analysis and, importantly, statistical computing in the context of insurance and risk management. To this end, we have included in the Appendix of this paper all the R code used in the analysis, so that readers, both students and educators, can fully explore the techniques described.
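
The transform-then-estimate trick the paper illustrates can be sketched as follows: estimate the density on the log scale and map it back with the change-of-variables factor 1/x. The claim data here are simulated; the paper's own R code (in its Appendix) is the authoritative version.

```python
import numpy as np
from scipy.stats import gaussian_kde, lognorm

rng = np.random.default_rng(4)

# Hypothetical claim severities (the paper's motor insurance data
# are not reproduced here).
claims = lognorm(s=1.1, scale=2000).rvs(500, random_state=4)

# Parametric benchmark: fit a lognormal by maximum likelihood.
shape, loc, scale = lognorm.fit(claims, floc=0)

# Kernel estimate on the log scale, then back-transform the density:
# if Y = log X, then f_X(x) = f_Y(log x) / x.
kde_log = gaussian_kde(np.log(claims))
x = np.linspace(claims.min(), claims.max(), 5)
f_x = kde_log(np.log(x)) / x

print("lognormal density:   ", lognorm(shape, loc, scale).pdf(x).round(6))
print("back-transformed KDE:", f_x.round(6))
```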

Relevance:

80.00%

Publisher:

Abstract:

Reports of triatomine infestation in urban areas have increased. We analysed the spatial distribution of infestation by triatomines in the urban area of Diamantina, in the state of Minas Gerais, Brazil. Triatomines were obtained by community-based entomological surveillance. Spatial patterns of infestation were analysed by Ripley's K function and the kernel density estimator. The normalised difference vegetation index (NDVI) and land cover derived from satellite imagery were compared between infested and uninfested areas. A total of 140 adults of four species were captured (100 Triatoma vitticeps, 25 Panstrongylus geniculatus, 8 Panstrongylus megistus, and 7 Triatoma arthurneivai specimens). In total, 87.9% were captured within domiciles. Infection by trypanosomes was observed in 19.6% of 107 examined insects. The spatial distributions of T. vitticeps, P. geniculatus, T. arthurneivai, and trypanosome-positive triatomines were clustered, occurring mainly in peripheral areas. NDVI values were statistically higher in areas infested by T. vitticeps and P. geniculatus. Buildings infested by these species were located closer to open fields, whereas infestations of P. megistus and T. arthurneivai were closer to bare soil. Human occupation and modification of natural areas may be involved in triatomine invasion, exposing the population to these vectors.
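
For readers unfamiliar with the first of the two spatial tools named above, here is a minimal sketch of a naive Ripley's K analysis with a Monte Carlo envelope under complete spatial randomness; it omits the edge corrections a real analysis would use, and the capture coordinates are fabricated.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(5)

def ripley_k(points, r, area):
    """Naive Ripley's K (no edge correction) at distances r."""
    n = len(points)
    d = pdist(points)
    return np.array([(d < ri).sum() * 2 * area / (n * (n - 1)) for ri in r])

# Hypothetical capture locations in a 10 x 10 km study window.
captures = rng.normal(5, 1.0, size=(60, 2))      # clustered pattern
r = np.linspace(0.1, 2.0, 10)
k_obs = ripley_k(captures, r, area=100.0)

# CSR envelope: simulate complete spatial randomness in the same window.
sims = np.array([ripley_k(rng.uniform(0, 10, size=(60, 2)), r, 100.0)
                 for _ in range(99)])
upper = sims.max(axis=0)

# K above the CSR envelope indicates clustering at that scale.
print("clustered scales (km):", r[k_obs > upper].round(2))
```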

Relevance:

80.00%

Publisher:

Abstract:

A problem in the archaeometric classification of Catalan Renaissance pottery is the fact that the clay supply of the pottery workshops was centrally organized by guilds, and therefore usually all potters of a single production centre produced chemically similar ceramics. However, when analysing the glazes of the ware, a large number of inclusions is usually found in the glaze, which reveal technological differences between single workshops. These inclusions were used by the potters to opacify the transparent glaze and to achieve a white background for further decoration. In order to distinguish the different technological preparation procedures of the single workshops, the chemical composition of those inclusions, as well as their size in the two-dimensional cut, is recorded at a scanning electron microscope. Based on the latter, a frequency distribution of the apparent diameters is estimated for each sample and type of inclusion. Following an approach by S.D. Wicksell (1925), it is in principle possible to transform the distributions of the apparent 2D diameters back to those of the true three-dimensional bodies. The applicability of this approach and its practical problems are examined using different ways of kernel density estimation and Monte Carlo tests of the methodology. Finally, it is tested to what extent the obtained frequency distributions can be used to classify the pottery.
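
The inverse Wicksell unfolding is beyond a short sketch, but the forward direction used in the Monte Carlo tests can be illustrated: simulate random planar sections of spheres (with the size bias that larger spheres are hit more often), estimate the apparent-diameter density by KDE, and compare simulated with observed sections. The 3D diameter model below is an arbitrary assumption.

```python
import numpy as np
from scipy.stats import gaussian_kde, ks_2samp

rng = np.random.default_rng(6)

def apparent_diameters(true_d, n):
    """Forward Wicksell simulation: sample spheres with probability
    proportional to diameter (larger spheres are hit more often),
    then cut each at a uniform random height."""
    p = true_d / true_d.sum()
    d3 = rng.choice(true_d, size=n, p=p)
    z = rng.uniform(-d3 / 2, d3 / 2)
    return np.sqrt(d3**2 - 4 * z**2)

# Hypothetical model for the true 3D inclusion diameters (microns).
true_d = rng.gamma(shape=9.0, scale=0.5, size=5000)

# Observed stand-in: apparent 2D diameters measured on the cut.
observed = apparent_diameters(true_d, 300)

# KDE of the apparent-diameter distribution, as estimated per sample.
kde = gaussian_kde(observed)
print("apparent-diameter density at 3 microns:", kde(3.0)[0])

# Monte Carlo check: do freshly simulated sections match the observations?
simulated = apparent_diameters(true_d, 300)
print("two-sample KS p-value:", ks_2samp(observed, simulated).pvalue)
```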

Relevance:

80.00%

Publisher:

Abstract:

We use aggregate GDP data and within-country income shares for the period 1970-1998 to assign a level of income to each person in the world. We then estimate the Gaussian kernel density function for the worldwide distribution of income. We compute world poverty rates by integrating the density function below the poverty lines. The $1/day poverty rate has fallen from 20% to 5% over the last twenty-five years. The $2/day rate has fallen from 44% to 18%. There are between 300 and 500 million fewer poor people in 1998 than there were in the 1970s. We estimate global income inequality using seven different popular indexes: the Gini coefficient, the variance of log-income, two of Atkinson's indexes, the mean logarithmic deviation, the Theil index and the coefficient of variation. All indexes show a reduction in global income inequality between 1980 and 1998. We also find that most global disparities can be accounted for by across-country, not within-country, inequalities. Within-country disparities have increased slightly during the sample period, but not nearly enough to offset the substantial reduction in across-country disparities. The across-country reductions in inequality are driven mainly, but not fully, by the large growth rate of the incomes of the 1.2 billion Chinese citizens. Unless Africa starts growing in the near future, we project that income inequalities will start rising again. If Africa does not start growing, then China, India, the OECD and the rest of the middle-income and rich countries will diverge away from it, and global inequality will rise. Thus, the aggregate GDP growth of the African continent should be the priority of anyone concerned about rising global income inequality.
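
A compact sketch of the estimation step: a population-weighted Gaussian kernel density of log income, with poverty rates obtained by integrating the density below each poverty line. The income and population figures are fabricated stand-ins for the paper's GDP and income-share data.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(7)

# Fabricated data points standing in for per-person income levels
# derived from country GDP and within-country income shares.
log_income = rng.normal(8.0, 1.3, 300)   # log annual income, USD
population = rng.uniform(1, 100, 300)    # millions represented per point

# Population-weighted Gaussian kernel density of log income.
kde = gaussian_kde(log_income, weights=population)

# Poverty rate = mass of the density below the poverty line
# ($1/day is about $365/year, $2/day about $730/year).
for line in (365.0, 730.0):
    rate = kde.integrate_box_1d(-np.inf, np.log(line))
    print(f"poverty rate below ${line / 365:.0f}/day: {rate:.1%}")
```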

Relevance:

80.00%

Publisher:

Abstract:

We develop a general error analysis framework for the Monte Carlo simulation of densities for functionals in Wiener space. We also study variance reduction methods with the help of Malliavin derivatives. For this, we give some general heuristic principles, which are applied to diffusion processes. A comparison with kernel density estimates is made.
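
As a toy version of the comparison mentioned in the last sentence: simulate a diffusion by Euler-Maruyama, estimate the terminal density with a kernel estimator, and check it against a case where the density is known in closed form (geometric Brownian motion). The Malliavin-based variance reduction itself is not reproduced.

```python
import numpy as np
from scipy.stats import gaussian_kde, lognorm

rng = np.random.default_rng(8)

# Geometric Brownian motion dX = mu*X dt + sigma*X dW, X0 = 1:
# its time-T density is known (lognormal), so it makes a clean test case.
mu, sigma, T, x0 = 0.05, 0.3, 1.0, 1.0
n_paths, n_steps = 20_000, 200
dt = T / n_steps

# Euler-Maruyama simulation of the terminal values.
x = np.full(n_paths, x0)
for _ in range(n_steps):
    x += mu * x * dt + sigma * x * np.sqrt(dt) * rng.normal(size=n_paths)

# Kernel density estimate of the simulated terminal distribution.
kde = gaussian_kde(x)

# Exact lognormal density of X_T for comparison.
s = sigma * np.sqrt(T)
exact = lognorm(s, scale=x0 * np.exp((mu - sigma**2 / 2) * T))

grid = np.linspace(0.5, 2.0, 4)
print("KDE:  ", kde(grid).round(4))
print("exact:", exact.pdf(grid).round(4))
```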

Relevance:

80.00%

Publisher:

Abstract:

The research considers the problem of spatial data classification using machine learning algorithms: probabilistic neural networks (PNN) and support vector machines (SVM). As a benchmark model, the simple k-nearest neighbor algorithm is considered. PNN is a neural network reformulation of well-known nonparametric principles of probability density modelling using a kernel density estimator and Bayesian optimal or maximum a posteriori decision rules. PNN are well suited to problems where not only predictions but also quantification of accuracy and integration of prior information are necessary. An important property of PNN is that they can easily be used in decision support systems dealing with problems of automatic classification. The support vector machine is an implementation of the principles of statistical learning theory for classification tasks. Recently, SVM have been successfully applied to various environmental topics: classification of soil types and hydro-geological units, optimization of monitoring networks, and susceptibility mapping of natural hazards. In the present paper, both simulated and real data case studies (low- and high-dimensional) are considered. The main attention is paid to the detection and learning of spatial patterns by the applied algorithms.
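
The essence of a PNN, as described above, can be sketched in a few lines: one kernel density estimate per class combined through the maximum a posteriori rule, with the class posteriors providing the quantification of accuracy mentioned. The two-class 2-D data are simulated.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(9)

# Hypothetical 2-D spatial data with two classes.
X0 = rng.normal([0, 0], 1.0, size=(150, 2))
X1 = rng.normal([2, 2], 1.0, size=(100, 2))

# A PNN in essence: one kernel density estimate per class plus the
# Bayes (maximum a posteriori) decision rule with empirical priors.
kdes = [gaussian_kde(X0.T), gaussian_kde(X1.T)]
priors = np.array([len(X0), len(X1)], dtype=float)
priors /= priors.sum()

def classify(points):
    """MAP classification; the posteriors also quantify confidence."""
    scores = np.stack([p * k(points.T) for p, k in zip(priors, kdes)])
    posterior = scores / scores.sum(axis=0)
    return posterior.argmax(axis=0), posterior.max(axis=0)

labels, confidence = classify(np.array([[0.0, 0.0], [1.0, 1.0], [2.5, 2.0]]))
print(labels, confidence.round(3))
```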

Relevance:

80.00%

Publisher:

Abstract:

In this paper we propose an innovative methodology for the automated profiling of illicit tablets by their surface granularity, a feature previously unexamined for this purpose. We make use of the tiny inconsistencies at the tablet surface, referred to as speckles, to generate a quantitative granularity profile of tablets. Euclidean distance is used as a measurement of (dis)similarity between granularity profiles. The frequency of observed distances is then modelled by kernel density estimation in order to generalize the observations and to calculate likelihood ratios (LRs). The resulting LRs are used to evaluate the potential of granularity profiles to differentiate between same-batch and different-batch tablets. Furthermore, we use the LRs as a similarity metric to refine database queries. We are able to derive reliable LRs within a scope that represents the true evidential value of the granularity feature. These metrics are used to refine candidate hit-lists from a database containing physical features of illicit tablets. We observe improved or identical ranking of candidate tablets in 87.5% of cases when granularity is considered.
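
A hedged sketch of the LR computation described above: kernel densities fitted to same-batch and different-batch distance distributions, with the likelihood ratio given by their quotient. The gamma-distributed distances are invented placeholders for measured speckle-profile distances.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(10)

# Hypothetical Euclidean distances between granularity profiles,
# collected from known same-batch and different-batch tablet pairs.
same_batch = rng.gamma(2.0, 0.5, 400)    # small distances
diff_batch = rng.gamma(6.0, 0.8, 400)    # larger distances

# Model each distance distribution with a kernel density estimate.
f_same = gaussian_kde(same_batch)
f_diff = gaussian_kde(diff_batch)

def likelihood_ratio(d):
    """LR > 1 supports same-batch origin; LR < 1, different batches."""
    return f_same(d) / f_diff(d)

# Score a new pair of tablets by the distance between their profiles,
# then use the LR to rank candidate hits from the database.
for d in (0.8, 2.5, 6.0):
    print(f"distance {d}: LR = {likelihood_ratio(d)[0]:.2f}")
```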

Relevance:

80.00%

Publisher:

Abstract:

We propose robust estimators of the generalized log-gamma distribution and, more generally, of location-shape-scale families of distributions. A (weighted) Qτ estimator minimizes a τ-scale of the differences between empirical and theoretical quantiles. It is √n-consistent; unfortunately, it is not asymptotically normal and is therefore inconvenient for inference. However, it is a convenient starting point for a one-step weighted likelihood estimator, where the weights are based on a disparity measure between the model density and a kernel density estimate. The one-step weighted likelihood estimator is asymptotically normal and fully efficient under the model. It is also highly robust under outlier contamination. Supplementary materials are available online.
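
The τ-scale itself is not reproduced here; the sketch below conveys the Q-estimator idea with the median absolute gap standing in as the robust scale, applied to a Gumbel location-scale family (itself only a stand-in for the generalized log-gamma).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import gumbel_r

rng = np.random.default_rng(11)

# Contaminated sample from a location-scale Gumbel model.
x = gumbel_r(loc=5.0, scale=2.0).rvs(200, random_state=11)
x[:20] = 50.0                          # 10% gross outliers

probs = (np.arange(1, 201) - 0.5) / 200
emp_q = np.sort(x)                     # empirical quantiles

def q_objective(theta):
    """Robust scale of empirical-minus-theoretical quantile gaps;
    the median absolute gap stands in for the tau-scale here."""
    loc, log_scale = theta
    theo_q = gumbel_r(loc=loc, scale=np.exp(log_scale)).ppf(probs)
    return np.median(np.abs(emp_q - theo_q))

fit = minimize(q_objective, x0=[np.median(x), 0.0], method='Nelder-Mead')
loc_hat, scale_hat = fit.x[0], np.exp(fit.x[1])
# The robust fit should land near the uncontaminated (5, 2).
print(f"robust fit: loc={loc_hat:.2f}, scale={scale_hat:.2f}")
```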

Relevance:

80.00%

Publisher:

Abstract:

Tropical forests are sources of many ecosystem services, but these forests are vanishing rapidly. The situation is severe in Sub-Saharan Africa and especially in Tanzania. The causes of change are multidimensional and strongly interdependent, and only understanding them comprehensively helps to change the ongoing unsustainable trends of forest decline. Ongoing forest changes, their spatiality and their connection to humans and the environment can be studied with the methods of Land Change Science. The knowledge produced with these methods helps to make arguments about the actors, actions and causes behind the forest decline. In this study of Unguja Island in Zanzibar, the focus is on the current forest cover and its changes between 1996 and 2009. The cover and changes are measured with the often-used remote sensing methods of automated land cover classification and post-classification comparison from medium-resolution satellite images. Kernel density estimation is used to determine the clusters of change, sub-area analysis provides information about the differences between regions, while distance and regression analyses connect changes to environmental factors. These analyses not only explain the changes that have happened, but also allow building quantitative and spatial future scenarios. No similar study has been made for Unguja, so this work provides new information that benefits the whole society. The results show that 572 km2 of Unguja is still forested, but 0.82-1.19% of these forests are disappearing annually. Besides deforestation, vertical degradation and spatial changes are also significant problems. Deforestation is most severe in the communal indigenous forests, but agroforests are also decreasing. Spatially, deforestation concentrates in areas close to the coastline, the population and Zanzibar Town. Biophysical factors, on the other hand, do not seem to influence the ongoing deforestation process. If the current trend continues, approximately 485 km2 of forest should remain in 2025. Solutions to these deforestation problems should be sought in sustainable land use management, surveying and protection of the forests in risk areas, and spatially targeted self-sustainable tree-planting schemes.

Relevance:

80.00%

Publisher:

Abstract:

Background: Microarray-based comparative genomic hybridisation (CGH) experiments have been used to study numerous biological problems, including understanding genome plasticity in pathogenic bacteria. Typically such experiments produce large data sets that are difficult for biologists to handle. Although there are some programmes available for interpreting bacterial transcriptomics data and CGH microarray data for examining genetic stability in oncogenes, there are none specifically for understanding the mosaic nature of bacterial genomes. Consequently, a bottleneck persists in the accurate processing and mathematical analysis of these data. To address this shortfall we have produced a simple and robust CGH microarray data analysis process, which may be automated in the future, to understand bacterial genomic diversity. Results: The process involves five steps: cleaning, normalisation, estimating gene presence and absence or divergence, validation, and analysis of data from a test strain against three reference strains simultaneously. Each stage of the process is described, and we have compared a number of methods available for characterising bacterial genomic diversity and for calculating the cut-off between gene presence and absence or divergence, showing that a simple dynamic approach using a kernel density estimator performed better than both established methods and a more sophisticated mixture-modelling technique. We have also shown that current methods commonly used for CGH microarray analysis in tumour and cancer cell lines are not appropriate for analysing our data. Conclusion: After carrying out the analysis and validation for three sequenced Escherichia coli strains, CGH microarray data from 19 E. coli O157 pathogenic test strains were used to demonstrate the benefits of applying this simple and robust process to CGH microarray studies using bacterial genomes.
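
The "simple dynamic approach using a kernel density estimator" plausibly amounts to locating the valley of the log-ratio density between the absent/divergent and present modes; the sketch below implements that reading on simulated log-ratios, not the paper's microarray data.

```python
import numpy as np
from scipy.stats import gaussian_kde
from scipy.signal import argrelmin

rng = np.random.default_rng(12)

# Hypothetical normalised log-ratios for one test strain: a low mode
# (absent/divergent genes) and a high mode (present genes).
log_ratios = np.concatenate([rng.normal(-1.5, 0.4, 600),
                             rng.normal(0.0, 0.3, 3400)])

# Dynamic cut-off: the valley of the kernel density estimate between
# the two modes separates absence/divergence from presence.
kde = gaussian_kde(log_ratios)
grid = np.linspace(log_ratios.min(), log_ratios.max(), 512)
density = kde(grid)

minima = argrelmin(density)[0]
cutoff = grid[minima[np.argmin(density[minima])]]  # deepest valley
print(f"presence cut-off at log-ratio {cutoff:.2f}")

present = log_ratios >= cutoff
print(f"genes called present: {present.mean():.1%}")
```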