996 results for statistical distance
Abstract:
A new approach to pattern recognition using invariant parameters based on higher order spectra is presented. In particular, invariant parameters derived from the bispectrum are used to classify one-dimensional shapes. The bispectrum, which is translation invariant, is integrated along straight lines passing through the origin in bifrequency space. The phase of the integrated bispectrum is shown to be scale and amplification invariant, as well. A minimal set of these invariants is selected as the feature vector for pattern classification, and a minimum distance classifier using a statistical distance measure is used to classify test patterns. The classification technique is shown to distinguish two similar, but different bolts given their one-dimensional profiles. Pattern recognition using higher order spectral invariants is fast, suited for parallel implementation, and has high immunity to additive Gaussian noise. Simulation results show very high classification accuracy, even for low signal-to-noise ratios.
Abstract:
Basing signature schemes on strong lattice problems has been a long-standing open issue. Today, two families of lattice-based signature schemes are known: those based on the hash-and-sign construction of Gentry et al., and Lyubashevsky’s schemes, which are based on the Fiat-Shamir framework. In this paper we show for the first time how to adapt the schemes of Lyubashevsky to the ring signature setting. In particular, we transform the scheme of ASIACRYPT 2009 into a ring signature scheme that provides strong security properties in the random oracle model. Anonymity is ensured in the sense that signatures of different users are within negligible statistical distance even under full key exposure. In fact, the scheme satisfies a notion that is stronger than the classical full key exposure setting: even if the keypair of the signing user is adversarially chosen, the statistical distance between signatures of different users remains negligible. Regarding unforgeability, the best lattice-based ring signature schemes provide either unforgeability against arbitrary chosen subring attacks or insider corruption in log-sized rings. In this paper we present two variants of our scheme. In the basic one, unforgeability is ensured in those two settings. By increasing signature and key sizes by a factor k (typically 80–100), we provide a variant in which unforgeability is ensured against insider corruption attacks for arbitrary rings. The technique used is quite general and can be adapted to other existing schemes.
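The anonymity claim above is stated in terms of statistical distance. As a small illustration (not taken from the paper), the sketch below computes the usual total variation form, half the sum of absolute differences in probability, for two finite distributions. The function name `statistical_distance` and the toy distributions are assumptions made for this example only; they are unrelated to the signature scheme itself.

```python
from fractions import Fraction


def statistical_distance(p, q):
    """Total variation distance between two finite distributions.

    p and q map outcomes to probabilities; outcomes missing from a
    dict are treated as having probability 0.
    """
    support = set(p) | set(q)
    return sum(abs(p.get(x, 0) - q.get(x, 0)) for x in support) / 2


# Toy distributions (hypothetical, for illustration only).
uniform = {x: Fraction(1, 4) for x in range(4)}
skewed = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}

print(statistical_distance(uniform, skewed))  # 1/4
```

Exact fractions are used so that the result is not blurred by floating-point rounding. In a cryptographic setting "negligible" refers to how this quantity shrinks as a security parameter grows, which a fixed toy example cannot show.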
Abstract:
Design of speaker identification schemes for a small number of speakers (around 10) with a high degree of accuracy in a controlled environment is a practical proposition today. When the number of speakers is large (say 50–100), many of these schemes cannot be directly extended, as both recognition error and computation time increase monotonically with population size. The feature selection problem is also complex for such schemes. Though there were earlier attempts to rank-order features based on statistical distance measures, it has been observed only recently that the two individually best measurements are not necessarily the best pair of measurements for pattern classification. We propose here a systematic approach to the problem using the decision tree, or hierarchical classifier, with the following objectives: (1) Design of the optimal policy at each node of the tree given the tree structure, i.e., the tree skeleton and the features to be used at each node. (2) Determination of the optimal feature measurement and decision policy given only the tree skeleton. The applicability of optimization procedures such as dynamic programming to the design of such trees is studied. The experimental results deal with the design of a 50-speaker identification scheme based on this approach.
Abstract:
The controversy over whether the introduction and trading of derivative assets affects the stability of the corresponding spot markets has persisted for more than two decades. In this paper we address this issue from a new perspective. Specifically, we analyze the impact that the introduction of derivative markets on the IBEX-35 may have had on the structure of the stock market. To do so, we define and identify the structure of the stock market over the study period and then analyze the effect that the emergence of the new derivative markets has had on it. Our results are consistent with those of other authors: although there has been no widespread, substantial change in the structure of the stock market, the introduction of the new markets does appear to have affected a small number of companies included in the IBEX-35.
Abstract:
In this work, a sample of planetary nebulae located in the inner disk and bulge of the Galaxy is used to find the galactocentric distance that best separates these two populations from the point of view of abundances. Statistical distance scales are used to study the distribution of abundances across the disk-bulge interface. A Kolmogorov-Smirnov test is used to find the distance at which the chemical properties of these regions best separate. The results of the statistical analysis indicate that, on average, the inner population has lower abundances than the outer. Additionally, for the α-element abundances, the inner population does not follow the disk radial gradient towards the galactic center. Based on our results, we suggest a bulge-disk interface at 1.5 kpc, marking the transition between the bulge and inner disk of the Galaxy as defined by the intermediate-mass population.
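The abstract does not spell out how the Kolmogorov-Smirnov test is applied, so the following is only one plausible reading: scan candidate cut distances, split the sample into inner and outer groups at each cut, and keep the cut with the largest two-sample KS statistic. The helper names `ks_statistic` and `best_split`, the `min_size` guard, and all numbers are hypothetical; the data are synthetic, in arbitrary units, and built to have an obvious break, so they say nothing about the actual nebulae.

```python
def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    d = 0.0
    for x in list(a) + list(b):
        cdf_a = sum(1 for v in a if v <= x) / len(a)
        cdf_b = sum(1 for v in b if v <= x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def best_split(distances, abundances, candidates, min_size=3):
    """Return (cut, D) for the candidate cut with the largest KS statistic.

    Cuts that leave fewer than min_size objects on either side are
    skipped (an assumption of this sketch, to avoid tiny samples).
    """
    best = None
    for cut in candidates:
        inner = [ab for r, ab in zip(distances, abundances) if r < cut]
        outer = [ab for r, ab in zip(distances, abundances) if r >= cut]
        if len(inner) < min_size or len(outer) < min_size:
            continue
        d = ks_statistic(inner, outer)
        if best is None or d > best[1]:
            best = (cut, d)
    return best


# Synthetic data in arbitrary units (hypothetical, for illustration only).
distances = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
abundances = [8.1, 8.2, 8.0, 8.15, 8.05, 8.7, 8.8, 8.65, 8.75, 8.9]

cut, d = best_split(distances, abundances, candidates=[3.5, 5.5, 7.5])
print(cut, d)
```

Comparing raw KS statistics across cuts ignores the fact that the two group sizes change with the cut; a fuller treatment would compare p-values or size-corrected statistics instead.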
Abstract:
This dissertation focuses on spatial stochastic processes defined on a lattice, the so-called Cliff & Ord-type models. My contribution in this thesis is to use Edgeworth and saddlepoint approximations to investigate the finite-sample properties of the test for detecting spatial dependence in SAR (spatial autoregressive) models, and to propose a new class of spatial econometric models in which the parameters affecting the mean structure are distinct from the parameters in the variance structure of the process. This allows a clearer interpretation of the model parameters and generalizes a taxonomy proposed by Anselin (2003). I propose an estimator for the model parameters and derive its asymptotic distribution. The model suggested in the dissertation provides an interesting interpretation of the SARAR model, which is quite common in the literature. The investigation of the finite-sample properties of the tests extends the literature by allowing the neighborhood matrix of the spatial process to be a nonlinear function of the spatial dependence parameter. Using approximations instead of simulations (the more common approach in the literature) provides an easy way to compare the properties of the tests under different neighborhood matrices and to correct the size when comparing the power of the tests. I obtain an optimal invariant test that is also locally uniformly most powerful (LUMPI). I construct the power envelope for the LUMPI test and show that it is virtually UMP, since the power of the test is very close to the envelope (for the spatial structures defined in the dissertation). I suggest a practical procedure for constructing a test that has good power in a range of situations in which the LUMPI test may not have good properties.
I conclude that the power of the test increases with the sample size and with the spatial dependence parameter (which is in line with the literature). However, I dispute the consensus view that the power of the test decreases as the neighborhood matrix becomes denser. This view reflects a measurement error common in the literature, since the statistical distance between the null and alternative hypotheses varies greatly with the structure of the matrix. After making the correction, I conclude that the power of the test increases with the distance of the alternative from the null, as expected.
Abstract:
In this paper, a sample of planetary nebulae in the Galaxy's inner-disk and bulge is used to find the galactocentric distance that optimally separates these two populations in terms of their abundances. Statistical distance scales were used to investigate the distribution of abundances across the disk–bulge interface, while a Kolmogorov–Smirnov test was used to find the distance at which the chemical properties of these regions separate optimally. The statistical analysis indicates that, on average, the inner population is characterized by lower abundances than the outer component. Additionally, for the α-element abundances, the inner population does not follow the disk's radial gradient toward the Galactic Center. Based on our results, we suggest a bulge–disk interface at 1.5 kpc, marking the transition between the bulge and the inner disk of the Galaxy as defined by the intermediate-mass population.
Abstract:
2000 Mathematics Subject Classification: 62P10, 62H30
Abstract:
The use of Mahalanobis squared distance–based novelty detection in statistical damage identification has become increasingly popular in recent years. The merit of the Mahalanobis squared distance–based method is that it is simple and requires low computational effort, which enables the use of a higher-dimensional damage-sensitive feature that is generally more sensitive to structural changes. Mahalanobis squared distance–based damage identification is also believed to be one of the most suitable methods for modern sensing systems such as wireless sensors. Despite these advantages, the method is rather strict in its input requirements, as it assumes the training data to be multivariate normal, a condition that is not always met, particularly at an early monitoring stage. As a consequence, it may result in an ill-conditioned training model with erroneous novelty detection and damage identification outcomes. To date, there appears to be no study on how to cope systematically with such practical issues, especially in the context of a statistical damage identification problem. To address this need, this article proposes a controlled data generation scheme, which is based upon the Monte Carlo simulation methodology with the addition of several controlling and evaluation tools to assess the condition of output data. By evaluating the convergence of the data condition indices, the proposed scheme is able to determine the optimal setups for the data generation process and subsequently avoid unnecessarily excessive data. The efficacy of this scheme is demonstrated via applications to benchmark structure data in the field.
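To make the Mahalanobis squared distance concrete, here is a minimal sketch of novelty detection with two features, assuming the common recipe of flagging an observation when its distance to the training data exceeds a threshold. It is not the article's controlled Monte Carlo scheme. The helper names, the choice of the largest training distance as threshold, and all numbers are hypothetical.

```python
def mean_and_cov(data):
    """Sample mean and covariance (n - 1 denominator) of 2-feature rows."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    (a, b), (_, d) = cov
    det = a * d - b * b
    if det <= 0:
        # Too few or degenerate training rows give a singular covariance.
        raise ValueError("covariance matrix is singular or ill-conditioned")
    return (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det


# Hypothetical baseline ("undamaged") feature vectors.
training = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.1), (1.1, 2.2), (0.8, 1.8)]
mean, cov = mean_and_cov(training)

# Assumed threshold for this sketch: the largest training distance.
threshold = max(mahalanobis_sq(p, mean, cov) for p in training)

new_observation = (2.0, 1.0)
is_novel = mahalanobis_sq(new_observation, mean, cov) > threshold
print(is_novel)
```

The singular-covariance guard hints at the data-shortage problem the abstract describes: with very few training rows the covariance estimate becomes unreliable or non-invertible, and the distances lose their meaning.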
Abstract:
This article presents the field applications and validations for the controlled Monte Carlo data generation scheme. This scheme was previously derived to assist the Mahalanobis squared distance–based damage identification method to cope with data-shortage problems, which often cause inadequate data multinormality and unreliable identification outcomes. To do so, real-vibration datasets from two actual civil engineering structures with such data (and identification) problems are selected as the test objects, which are then shown to be in need of enhancement to consolidate their conditions. By utilizing the robust probability measures of the data condition indices in controlled Monte Carlo data generation and statistical sensitivity analysis of the Mahalanobis squared distance computational system, well-conditioned synthetic data generated by an optimal controlled Monte Carlo data generation configuration can be evaluated without bias against those generated by other setups and against the original data. The analysis results reconfirm that controlled Monte Carlo data generation is able to overcome the shortage of observations, improve the data multinormality and enhance the reliability of the Mahalanobis squared distance–based damage identification method, particularly with respect to false-positive errors. The results also highlight the dynamic structure of controlled Monte Carlo data generation, which makes this scheme adaptable to any type of input data with any (original) distributional condition.
Abstract:
We develop spatial statistical models for stream networks that can estimate relationships between a response variable and other covariates, make predictions at unsampled locations, and predict an average or total for a stream or a stream segment. There have been very few attempts to develop valid spatial covariance models that incorporate flow, stream distance, or both. Typical spatial autocovariance functions based on Euclidean distance, such as the spherical covariance model, are not valid when used with stream distance. In this paper we develop a large class of valid models that incorporate flow and stream distance by using spatial moving averages. These methods integrate a moving average function, or kernel, against a white noise process. By running the moving average function upstream from a location, we develop models that use flow, and by construction they are valid models based on stream distance. We show that with proper weighting, many of the usual spatial models based on Euclidean distance have a counterpart for stream networks. Using sulfate concentrations from an example data set, the Maryland Biological Stream Survey (MBSS), we show that models using flow may be more appropriate than models that only use stream distance. For the MBSS data set, we use restricted maximum likelihood to fit a valid covariance matrix that uses flow and stream distance, and then we use this covariance matrix to estimate fixed effects and make kriging and block kriging predictions.
Abstract:
In this thesis, the relationship between air pollution and human health has been investigated using a Geographic Information System (GIS) as an analysis tool. The research focused on how vehicular air pollution affects human health. The main objective of this study was to analyse the spatial variability of pollutants, taking Brisbane City in Australia as a case study, by identifying the areas of high concentration of air pollutants and their relationship with the number of deaths caused by air pollutants. A correlation test was performed to establish the relationship between air pollution, the number of deaths from respiratory disease, and the total distance travelled by road vehicles in Brisbane. GIS was used to investigate the spatial distribution of the air pollutants. The main finding of this research is the comparison between spatial and non-spatial analysis approaches, which indicated that correlation analysis and simple GIS buffer analysis using the average levels of air pollutants from a single monitoring station or from a small group of monitoring stations is a relatively simple method for assessing the health effects of air pollution. There was a significant positive correlation between the variables under consideration, and the research shows a decreasing trend in the concentration of nitrogen dioxide at the Eagle Farm and Springwood sites and an increasing trend at the CBD site. Statistical analysis shows that there is a positive relationship between the level of emission and the number of deaths, though the impact is not uniform, as certain sections of the population are more vulnerable to exposure. Further statistical tests found that people over 75 years of age and children between 0 and 15 years of age are the most vulnerable to air pollution. A non-spatial approach alone may be insufficient for an appropriate evaluation of the impact of air pollutant variables and their inter-relationships.
It is important to evaluate the spatial features of air pollutants before modeling the air pollution-health relationships.
Abstract:
Numerous research studies have evaluated whether distance learning is a viable alternative to traditional learning methods. These studies have generally made use of cross-sectional surveys for collecting data, comparing distance to traditional learners with intent to validate the former as a viable educational tool. Inherent fundamental differences between traditional and distance learning pedagogies, however, reduce the reliability of these comparative studies and constrain the validity of analyses resulting from this analytical approach. This article presents the results of a research project undertaken to analyze expectations and experiences of distance learners with their degree programs. Students were given surveys designed to examine factors expected to affect their overall value assessment of their distance learning program. Multivariate statistical analyses were used to analyze the correlations among variables of interest to support hypothesized relationships among them. Focusing on distance learners overcomes some of the limitations with assessments that compare off- and on-campus student experiences. Evaluation and modeling of distance learner responses on perceived value for money of the distance education they received indicate that the two most important influences are course communication requirements, which had a negative effect, and course logistical simplicity, which revealed a positive effect. Combined, these two factors accounted for approximately 47% of the variability in perceived value for money of the educational program of sampled students. A detailed focus on comparing expectations with outcomes of distance learners complements the existing literature dominated by comparative studies of distance and nondistance learners.
Abstract:
The Galilee and Eromanga basins are sub-basins of the Great Artesian Basin (GAB). In this study, a multivariate statistical approach (hierarchical cluster analysis, principal component analysis and factor analysis) is carried out to identify hydrochemical patterns and assess the processes that control hydrochemical evolution within key aquifers of the GAB in these basins. The results of the hydrochemical assessment are integrated into a previously developed 3D geological model to support the analysis of spatial patterns of hydrochemistry, and to identify the hydrochemical and hydrological processes that control hydrochemical variability. In this area of the GAB, the hydrochemical evolution of groundwater is dominated by evapotranspiration near the recharge area, resulting in a dominance of the Na–Cl water types. This is shown conceptually using two selected cross-sections which represent discrete groundwater flow paths from the recharge areas to the deeper parts of the basins. With increasing distance from the recharge area, a shift towards a dominance of carbonate (e.g. Na–HCO3 water type) has been observed. The assessment of hydrochemical changes along groundwater flow paths highlights how aquifers are separated in some areas, and how mixing between groundwater from different aquifers, controlled by geological structures, occurs elsewhere, including between GAB aquifers and coal-bearing strata of the Galilee Basin. The results of this study suggest that distinct hydrochemical differences can be observed within the previously defined Early Cretaceous–Jurassic aquifer sequence of the GAB. A revision of the two previously recognised hydrochemical sequences is proposed, resulting in three hydrochemical sequences based on systematic differences in hydrochemistry, salinity and dominant hydrochemical processes.
The integrated approach presented in this study, which combines different complementary multivariate statistical techniques with a detailed assessment of the geological framework of these sedimentary basins, can be adopted in other complex multi-aquifer systems to assess hydrochemical evolution and its geological controls.