950 results for kernel density method
Abstract:
We generalize the popular ensemble Kalman filter to an ensemble transform filter, in which the prior distribution can take the form of a Gaussian mixture or a Gaussian kernel density estimator. The design of the filter is based on a continuous formulation of the Bayesian filter analysis step. We call the new filter algorithm the ensemble Gaussian-mixture filter (EGMF). The EGMF is implemented for three simple test problems (Brownian dynamics in one dimension, Langevin dynamics in two dimensions and the three-dimensional Lorenz-63 model). It is demonstrated that the EGMF is capable of tracking systems with non-Gaussian uni- and multimodal ensemble distributions. Copyright © 2011 Royal Meteorological Society
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
We report a morphology-based approach for the automatic identification of outlier neurons, as well as its application to the NeuroMorpho.org database, with more than 5,000 neurons. Each neuron in a given analysis is represented by a feature vector composed of 20 measurements, which are then projected into a two-dimensional space by applying principal component analysis. Bivariate kernel density estimation is then used to obtain the probability distribution for the group of cells, so that the cells with highest probabilities are understood as archetypes while those with the smallest probabilities are classified as outliers. The potential of the methodology is illustrated in several cases involving uniform cell types as well as cell types for specific animal species. The results provide insights regarding the distribution of cells, yielding single and multi-variate clusters, and they suggest that outlier cells tend to be more planar and tortuous. The proposed methodology can be used in several situations involving one or more categories of cells, as well as for detection of new categories and possible artifacts.
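The density-ranking step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the cells have already been projected to two dimensions (the PCA step is omitted), uses an isotropic Gaussian kernel with a hand-picked bandwidth, and runs on hypothetical coordinates. The names `gaussian_kde_2d` and `rank_by_density` are invented for this example.

```python
import math

def gaussian_kde_2d(points, bandwidth):
    """Return a function giving the Gaussian kernel density estimate at (x, y)."""
    n = len(points)
    # Normalising constant of an isotropic 2D Gaussian kernel, averaged over n points.
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            d2 = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        return norm * total

    return density

def rank_by_density(points, bandwidth):
    """Sort point indices from lowest density (outlier-like) to highest (archetype-like)."""
    density = gaussian_kde_2d(points, bandwidth)
    scores = [density(x, y) for x, y in points]
    return sorted(range(len(points)), key=lambda i: scores[i])

# Hypothetical 2D projections: a tight group of cells plus one distant cell.
projected = [(0.0, 0.1), (0.1, 0.0), (-0.1, 0.0), (0.0, -0.1), (0.05, 0.05), (3.0, 3.0)]
order = rank_by_density(projected, bandwidth=0.5)
outlier_index = order[0]      # lowest estimated density
archetype_index = order[-1]   # highest estimated density
```

In practice the bandwidth would be chosen by a data-driven rule rather than fixed by hand; the value 0.5 is only an assumption that suits the toy coordinates.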
Abstract:
China is a large country characterized by remarkable growth and distinct regional diversity. Spatial disparity has long been a pressing issue, since China has been struggling to follow a balanced growth path yet still faces unprecedented pressures and challenges. To better understand the level of inequality, benchmark the spatial distributions of Chinese provinces and municipalities, and estimate the dynamic trajectory of sustainable development in China, I constructed the Composite Index of Regional Development (CIRD) with five sub-pillars/dimensions: the Macroeconomic Index (MEI), Science and Innovation Index (SCI), Environmental Sustainability Index (ESI), Human Capital Index (HCI) and Public Facilities Index (PFI), which together aim to cover the main fields of regional socioeconomic development. Ranking reports on the five sub-dimensions and the aggregated CIRD were provided to better measure the degree of development of 31 or 30 Chinese provinces and municipalities over the 13 years from 1998 to 2010, a period spanning three "Five-Year Plans". Further empirical applications of the CIRD focused on clustering and convergence estimation, attempting to fill the gap in quantifying the level of comprehensive regional socioeconomic development and in estimating the dynamic convergence trajectory of regional sustainable development in the long run. Cluster analysis identified four geographically oriented clusters on the map, and club convergence was observed among the Chinese provinces and municipalities based on stochastic kernel density estimation.
Abstract:
Background: Clear examples of ecological speciation exist, often involving divergence in trophic morphology. However, substantial variation also exists in how far the ecological speciation process proceeds, potentially linked to the number of ecological axes, traits, or genes subject to divergent selection. In addition, recent studies highlight how differentiation might occur between the sexes, rather than between populations. We examine variation in trophic morphology in two host-plant ecotypes of walking-stick insects (Timema cristinae), known to have diverged in morphological traits related to crypsis and predator avoidance, and to have reached an intermediate point in the ecological speciation process. Here we test how host plant use, sex, and rearing environment affect variation in trophic morphology in this species using traditional multivariate, novel kernel density based and Bayesian morphometric analyses. Results: Contrary to expectations, we find limited host-associated divergence in mandible shape. Instead, the main predictor of shape variation is sex, with secondary roles of population of origin and rearing environment. Conclusion: Our results show that trophic morphology does not strongly contribute to host-adapted ecotype divergence in T. cristinae and that traits can respond to complex selection regimes by diverging along different intraspecific lines, thereby impeding progress toward speciation.
Abstract:
Nuclear morphometry (NM) uses image analysis to measure features of the cell nucleus, which are classified as bulk properties, shape or form, and DNA distribution. Studies have used these measurements as diagnostic and prognostic indicators of disease with inconclusive results. The distributional properties of these variables have not been systematically investigated, although much of the medical data exhibit nonnormal distributions. Measurements are done on several hundred cells per patient, so summary measurements reflecting the underlying distribution are needed. Distributional characteristics of 34 NM variables from prostate cancer cells were investigated using graphical and analytical techniques. Cells per sample ranged from 52 to 458. A small sample of patients with benign prostatic hyperplasia (BPH), representing non-cancer cells, was used for general comparison with the cancer cells. Data transformations such as log, square root and 1/x did not yield normality as measured by the Shapiro-Wilk test for normality. A modulus transformation, used for distributions having abnormal kurtosis values, also did not produce normality. Kernel density histograms of the 34 variables exhibited non-normality, and 18 variables also exhibited bimodality. A bimodality coefficient was calculated, and 3 variables (DNA concentration, shape and elongation) showed the strongest evidence of bimodality and were studied further. Two analytical approaches were used to obtain a summary measure for each variable for each patient: cluster analysis to determine significant clusters, and a mixture model analysis using a two-component model having a Gaussian distribution with equal variances. The mixture component parameters were used to bootstrap the log likelihood ratio to determine the significant number of components, 1 or 2. These summary measures were used as predictors of disease severity in several proportional odds logistic regression models. The disease severity scale had 5 levels and was constructed of 3 components: extracapsular penetration (ECP), lymph node involvement (LN+) and seminal vesicle involvement (SV+), which represent surrogate measures of prognosis. The summary measures were not strong predictors of disease severity. There was some indication from the mixture model results that there were changes in mean levels and proportions of the components in the lower severity levels.
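The abstract does not say which bimodality coefficient was used. As a rough illustration only, the sketch below assumes one common moment-based definition, (skewness^2 + 1) / kurtosis, in its simplified large-sample form without small-sample corrections. Under that definition a uniform distribution scores 5/9, and larger values are usually read as a hint of bimodality. The function name and the sample values are hypothetical.

```python
def bimodality_coefficient(values):
    """Moment-based bimodality coefficient: (skewness^2 + 1) / kurtosis.

    Uses plain (biased) central moments; kurtosis here is the non-excess
    form, so it equals 3 for a normal distribution.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return (skewness ** 2 + 1.0) / kurtosis

# Hypothetical measurements: two well-separated groups versus a single peak.
two_groups = [0.0] * 50 + [10.0] * 50
single_peak = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 4.0, 5.0]

bc_two_groups = bimodality_coefficient(two_groups)
bc_single_peak = bimodality_coefficient(single_peak)
```

The coefficient is only a screening statistic: heavy skew can also push it above 5/9, which is one reason to follow it with a mixture model analysis as the abstract describes.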
Abstract:
Using data from March Current Population Surveys we find gains from economic growth over the 1990s business cycle (1989-2000) were more equitably distributed than over the 1980s business cycle (1979-1989) using summary inequality measures as well as kernel density estimations. The entire distribution of household size-adjusted income moved upwards in the 1990s with profound improvements for African Americans, single mothers and those living in households receiving welfare. Most gains occurred over the growth period 1993-2000. Improvements in average income and income inequity over the latter period are reminiscent of gains seen in the first three decades after World War II.
Abstract:
The paper proposes a new application of non-parametric statistical processing of signals recorded from vibration tests for damage detection and evaluation on I-section steel segments. The steel segments investigated constitute the energy dissipating part of a new type of hysteretic damper that is used for passive control of buildings and civil engineering structures subjected to earthquake-type dynamic loadings. Two I-section steel segments with different levels of damage were instrumented with piezoceramic sensors and subjected to controlled white noise random vibrations. The signals recorded during the tests were processed using two non-parametric methods (the power spectral density method and the frequency response function method) that had never previously been applied to hysteretic dampers. The appropriateness of these methods for quantifying the level of damage on the I-shape steel segments is validated experimentally. Based on the results of the random vibrations, the paper proposes a new index that predicts the level of damage and the proximity of failure of the hysteretic damper.
Abstract:
A wide variety of polymeric macrofibers for concrete reinforcement is available today. By nature, these fibers vary widely in characteristics and properties, and these variations affect their performance as reinforcement in concrete. However, there are no Brazilian standards on the subject, and the characterization methodologies in foreign standards diverge. Some standards specify that mechanical behavior should be characterized on the original filaments, while others call for methods defined for characterizing metallic materials. The standard EN14889-2:2006 is broader in scope, but it leaves doubts about the adequacy of its criteria for the geometric characterization of fibers and does not define a specific test method for their mechanical characterization. There is therefore a need to establish a methodology that allows a fiber quality-control program to be carried out under conditions of use. Such a methodology would also provide a way to characterize the material for experimental studies, giving a stronger scientific basis to work that often relies only on manufacturers' data. Accordingly, an experimental study was developed focusing on the characterization of two polymeric macrofibers available on the Brazilian market. The study focused on determining the geometric parameters and on mechanical characterization through determination of tensile strength and evaluation of the modulus of elasticity. The European standard EN14889-2:2006 was adopted as the reference for geometric characterization. Length was measured by two methods: the caliper method and the digital image analysis method, using software to process the images. For measuring diameter, the density method was used in addition to the methods mentioned.
It is concluded that the caliper method, provided the macrofibers are stretched beforehand, and the digital image method can be used interchangeably to measure length. To determine diameter, the density method is recommended. For mechanical characterization, a dedicated methodology was developed from information obtained from other tests. Direct tensile tests were thus carried out on macrofibers glued to textile fabric frames. In addition, the effect on the material's mechanical behavior of abrasive contact between the macrofibers and the aggregates during mixing in a concrete mixer was evaluated. The effect of the method used to determine the cross-sectional area on the results measured in the fiber tensile test was also evaluated. It is concluded that the proposed method for the direct tensile test of the fiber is feasible, especially for determining tensile strength. The modulus of elasticity, in turn, ends up being underestimated. Determining the fiber cross-sectional area by the density method also gave the best results. Furthermore, it was shown that friction between the fibers and the aggregate during mixing compromises mechanical behavior, reducing both strength and modulus of elasticity. It can therefore be stated that the proposed methodology for the geometric and mechanical control of polymeric macrofibers is adequate for characterizing the material.
Abstract:
This paper examines the determinants of foreign direct investment (FDI) under free trade agreements (FTAs) from a new institutional perspective. First, the determinants of FDI are theoretically discussed from a new institutional perspective. Then, FDI is statistically analyzed at the aggregate level. Kernel density estimation of firm-size reveals some evidence of "structural changes" after FTAs, as characterized by the investing firms' paid-up capital stock. Statistical tests of the average and variance of the size distribution confirm this in the case of FTAs with Asian partner countries. For FTAs with South American partner countries, the presence of FTAs seems to promote larger-scale FDIs. These results remain correlational instead of causal, and more statistical analyses would be needed to infer causality. Policy implications suggest that participants should consider "institutional" aspects of FTAs, that is, the size matters as a determinant of FDI. Future work along this line is needed to study "firm heterogeneity."
Abstract:
A set of techniques referred to as circular statistics has been developed for the analysis of directional and orientational data. The unit of measure for such data is angular (usually in either degrees or radians), and the statistical distributions underlying the techniques are characterised by their cyclic nature: for example, angles of 359.9 degrees are considered close to angles of 0 degrees. In this paper, we assert that such approaches can be easily adapted to analyse time-of-day and time-of-week data, and in particular daily cycles in the numbers of incidents reported to the police. We begin the paper by describing circular statistics. We then discuss how these may be modified, and demonstrate the approach with some examples for reported incidents in the Cardiff area of Wales. (c) 2005 Elsevier Ltd. All rights reserved.
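To make the time-of-day adaptation concrete, here is a minimal sketch of a circular mean, assuming each time is mapped onto a 24-hour circle and averaged as a unit vector. It is not taken from the paper; the function name and the sample times are hypothetical.

```python
import math

def circular_mean_hour(hours):
    """Mean time of day in hours, treating the 24-hour clock as a circle."""
    # Map each time to an angle on the unit circle.
    angles = [2.0 * math.pi * h / 24.0 for h in hours]
    # Average the corresponding unit vectors, then convert back to an angle.
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    mean_angle = math.atan2(s, c) % (2.0 * math.pi)
    return mean_angle * 24.0 / (2.0 * math.pi)

# Hypothetical incident times at 23:00 and 01:00.
late_night = [23.0, 1.0]
mean_hour = circular_mean_hour(late_night)

# A case with no wrap-around, for comparison: 06:00 and 08:00.
morning_mean = circular_mean_hour([6.0, 8.0])
```

The ordinary arithmetic mean of 23 and 1 is 12, i.e. midday, whereas the circular mean lands at midnight (up to floating-point rounding, which may report it as a value just below 24 rather than 0).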
Abstract:
This paper investigates the performance of the EASI algorithm and the proposed EKENS algorithm for linear and nonlinear mixtures. The proposed EKENS algorithm is based on the modified equivariant algorithm and kernel density estimation. The theory and characteristics of both algorithms are discussed for the blind source separation model. The separation structure for nonlinear mixtures consists of a nonlinear stage followed by a linear stage. Simulations with artificial and natural data demonstrate the feasibility and good performance of the proposed EKENS algorithm.
Abstract:
Motivation: In molecular biology, molecular events describe observable alterations of biomolecules, such as binding of proteins or RNA production. These events might be responsible for drug reactions or the development of certain diseases. As such, biomedical event extraction, the process of automatically detecting descriptions of molecular interactions in research articles, has recently attracted substantial research interest. Event trigger identification, detecting the words describing the event types, is a crucial and prerequisite step in the pipeline process of biomedical event extraction. Taking the event types as classes, event trigger identification can be viewed as a classification task. For each word in a sentence, a trained classifier predicts, based on the context features, whether the word corresponds to an event type and, if so, which one. Therefore, a well-designed feature set with a good level of discrimination and generalization is crucial for the performance of event trigger identification. Results: In this article, we propose a novel framework for event trigger identification. In particular, we learn biomedical domain knowledge from a large text corpus built from Medline and embed it into word features using neural language modeling. The embedded features are then combined with the syntactic and semantic context features using the multiple kernel learning method. The combined feature set is used for training the event trigger classifier. Experimental results on the gold-standard corpus show an improvement of more than 2.5% in F-score for the proposed framework compared with the state-of-the-art approach, demonstrating the effectiveness of the proposed framework. © 2014 The Author 2014. The source code for the proposed framework is freely available and can be downloaded at http://cse.seu.edu.cn/people/zhoudeyu/ETI_Sourcecode.zip.
Abstract:
This thesis stems from a project with the real-time environmental monitoring company EMSAT Corporation. They were looking for methods to automatically flag spikes and other anomalies in their environmental sensor data streams. The problem presents several challenges: near real-time anomaly detection, absence of labeled data and time-changing data streams. Here, we address this problem using both a statistical parametric approach and a non-parametric approach, Kernel Density Estimation (KDE). The main contribution of this thesis is extending KDE to work more effectively for evolving data streams, particularly in the presence of concept drift. To address that, we have developed a framework for integrating the Adaptive Windowing (ADWIN) change detection algorithm with KDE. We have tested this approach on several real-world data sets and received positive feedback from our industry collaborator. Some results appearing in this thesis have been presented at the ECML PKDD 2015 Doctoral Consortium.
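The general idea of flagging low-density readings in a stream can be sketched as below. This is not the thesis framework: it swaps ADWIN for a plain fixed-size sliding window, uses a one-dimensional Gaussian kernel, and relies on a hand-picked bandwidth and density threshold. All names, parameter values and readings are hypothetical.

```python
import math
from collections import deque

def kde_density(window, x, bandwidth):
    """Gaussian kernel density estimate at x from the values in the window."""
    norm = 1.0 / (len(window) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - w) / bandwidth) ** 2) for w in window)

def flag_anomalies(stream, window_size=20, bandwidth=1.0, threshold=0.01):
    """Flag readings whose estimated density under recent data falls below threshold."""
    window = deque(maxlen=window_size)  # fixed-size window; ADWIN would resize it adaptively
    flags = []
    for x in stream:
        if len(window) < window_size:
            flags.append(False)  # warm-up: not enough history to judge
        else:
            flags.append(kde_density(window, x, bandwidth) < threshold)
        window.append(x)  # oldest reading drops out automatically
    return flags

# Hypothetical sensor readings near 10 with a single spike to 25.
normal = [10.0 + 0.1 * (i % 5) for i in range(30)]
readings = normal + [25.0] + normal[:5]
flags = flag_anomalies(readings)
spike_positions = [i for i, f in enumerate(flags) if f]
```

A fixed window adapts to drift only at a constant rate, which is the limitation an adaptive scheme such as ADWIN is meant to address. Note also that this sketch lets a flagged reading enter the window; a production detector might exclude it.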