21 results for generalized entropy
in Aston University Research Archive
Abstract:
In the Bayesian framework, predictions for a regression problem are expressed in terms of a distribution of output values. The mode of this distribution corresponds to the most probable output, while the uncertainty associated with the predictions can conveniently be expressed in terms of error bars. In this paper we consider the evaluation of error bars in the context of the class of generalized linear regression models. We provide insights into the dependence of the error bars on the location of the data points and we derive an upper bound on the true error bars in terms of the contributions from individual data points which are themselves easily evaluated.
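A minimal sketch of how such error bars are computed for a Gaussian-prior, linear-in-the-parameters model: the predictive variance is the noise variance plus a term measuring weight uncertainty at the query point. The basis functions, alpha, beta and the toy data below are illustrative assumptions; the paper's upper bound in terms of per-point contributions is not reproduced here.

```python
import numpy as np

def error_bars(Phi, phi_star, alpha=1.0, beta=25.0):
    """Predictive error bar for a Bayesian generalized linear regression model.

    Phi      : (N, M) design matrix, rows phi(x_n) of basis-function activations
    phi_star : (M,) basis-function vector phi(x*) at the query point
    alpha    : prior precision on the weights (isotropic Gaussian prior assumed)
    beta     : noise precision of the Gaussian likelihood
    """
    N, M = Phi.shape
    # Posterior weight precision: A = alpha*I + beta * sum_n phi(x_n) phi(x_n)^T
    A = alpha * np.eye(M) + beta * Phi.T @ Phi
    A_inv = np.linalg.inv(A)
    # Predictive variance = noise term + weight-uncertainty term
    var = 1.0 / beta + phi_star @ A_inv @ phi_star
    return np.sqrt(var)   # one-standard-deviation error bar

# Example: cubic polynomial basis on a toy 1-D data set
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=30)
phi = lambda x: np.array([1.0, x, x**2, x**3])
Phi = np.array([phi(x) for x in X])
print(error_bars(Phi, phi(0.0)), error_bars(Phi, phi(3.0)))  # bars widen away from the data
```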
Abstract:
A formalism for modelling the dynamics of Genetic Algorithms (GAs) using methods from statistical mechanics, originally due to Prugel-Bennett and Shapiro, is reviewed, generalized and improved upon. This formalism can be used to predict the averaged trajectory of macroscopic statistics describing the GA's population. These macroscopics are chosen to average well between runs, so that fluctuations from mean behaviour can often be neglected. Where necessary, non-trivial terms are determined by assuming maximum entropy with constraints on known macroscopics. Problems of realistic size are described in compact form and finite population effects are included, often proving to be of fundamental importance. The macroscopics used here are cumulants of an appropriate quantity within the population and the mean correlation (Hamming distance) within the population. Including the correlation as an explicit macroscopic provides a significant improvement over the original formulation. The formalism is applied to a number of simple optimization problems in order to determine its predictive power and to gain insight into GA dynamics. Problems which are most amenable to analysis come from the class where alleles within the genotype contribute additively to the phenotype. This class can be treated with some generality, including problems with inhomogeneous contributions from each site, non-linear or noisy fitness measures, simple diploid representations and temporally varying fitness. The results can also be applied to a simple learning problem, generalization in a binary perceptron, and a limit is identified for which the optimal training batch size can be determined for this problem. The theory is compared to averaged results from a real GA in each case, showing excellent agreement if the maximum entropy principle holds. Some situations where this approximation breaks down are identified. In order to fully test the formalism, an attempt is made on the NP-hard problem of storing random patterns in a binary perceptron. Here, the relationship between the genotype and phenotype (training error) is strongly non-linear. Mutation is modelled under the assumption that perceptron configurations are typical of perceptrons with a given training error. Unfortunately, this assumption does not provide a good approximation in general. It is conjectured that perceptron configurations would have to be constrained by other statistics in order to accurately model mutation for this problem. Issues arising from this study are discussed in conclusion and some possible areas of further research are outlined.
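The dynamical equations themselves are beyond a short sketch, but the macroscopics named above (fitness cumulants within the population and the mean correlation, equivalently Hamming distance) are easy to measure from a real GA population. The sketch below shows one plausible way to do so; the +/-1 allele encoding and the additive fitness are assumptions for illustration, not the thesis's own code.

```python
import numpy as np
from itertools import combinations

def macroscopics(population, fitness):
    """Measure the macroscopic statistics tracked by the statistical-mechanics
    formalism: low-order cumulants of the fitness distribution within the
    population, and the mean pairwise correlation (related to Hamming distance).

    population : (P, L) array of +/-1 alleles (binary genotypes)
    fitness    : callable mapping a genotype to a scalar fitness
    """
    f = np.array([fitness(g) for g in population])
    k1 = f.mean()                   # first cumulant: mean fitness
    k2 = f.var()                    # second cumulant: variance
    k3 = ((f - k1) ** 3).mean()     # third cumulant (unnormalized skew)

    # Mean correlation q over distinct pairs; the Hamming distance between two
    # +/-1 strings of length L is L * (1 - q_ab) / 2.
    P, L = population.shape
    q = np.mean([population[a] @ population[b] / L
                 for a, b in combinations(range(P), 2)])
    return k1, k2, k3, q

# Toy example: onemax-style additive fitness on a random population
rng = np.random.default_rng(1)
pop = rng.choice([-1, 1], size=(20, 64))
print(macroscopics(pop, lambda g: g.sum()))
```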
Abstract:
Using techniques from Statistical Physics, the annealed VC entropy for hyperplanes in high dimensional spaces is calculated as a function of the margin for a spherical Gaussian distribution of inputs.
Abstract:
The concept of entropy rate is well defined in dynamical systems theory, but it cannot be applied directly to finite real-world data sets. With this in mind, Pincus developed Approximate Entropy (ApEn), which uses ideas from Eckmann and Ruelle to create a regularity measure based on entropy rate that can be used to determine the influence of chaotic behaviour in a real world signal. However, this measure was found not to be robust and so an improved formulation known as the Sample Entropy (SampEn) was created by Richman and Moorman to address these issues. We have developed a new, related, regularity measure which is not based on the theory provided by Eckmann and Ruelle and proves to be a better-behaved measure of complexity than the previous measures while still retaining a low computational cost.
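For reference, a straightforward implementation of the established Sample Entropy that this abstract contrasts against (the authors' new measure is not specified in the abstract, so it is not attempted here); the default tolerance r = 0.2 times the standard deviation is a common convention, not a claim about the paper.

```python
import numpy as np

def sample_entropy(u, m=2, r=None):
    """Sample Entropy (Richman & Moorman): -ln(A/B), where B counts pairs of
    length-m templates within tolerance r (Chebyshev distance), A counts the
    same pairs extended to length m+1; self-matches are excluded."""
    u = np.asarray(u, dtype=float)
    if r is None:
        r = 0.2 * u.std()          # common default tolerance
    N = len(u)

    def count_matches(length):
        # Same number of templates (N - m) for both lengths, as in SampEn
        templates = np.array([u[i:i + length] for i in range(N - m)])
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if np.max(np.abs(templates[i] - templates[j])) <= r:
                    count += 1
        return count

    B = count_matches(m)
    A = count_matches(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf

t = np.linspace(0, 10 * np.pi, 500)
print(sample_entropy(np.sin(t)))                                       # low: regular signal
print(sample_entropy(np.random.default_rng(0).standard_normal(500)))   # high: irregular signal
```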
Abstract:
Several indices of plant capacity utilization based on the concept of best practice frontier have been proposed in the literature (Färe et al., 1992; De Borger and Kerstens, 1998). This paper suggests an alternative measure of capacity utilization change based on the Generalized Malmquist index proposed by Grifell-Tatjé and Lovell in 1998. The advantage of this specification is that it allows productivity growth to be measured regardless of the nature of scale economies. This index is then used to measure the capacity change of a panel of Italian firms over the period 1989-94 using Data Envelopment Analysis, and its ability to explain the short-run movements of output is assessed.
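The Malmquist-type index described here is assembled from distance functions estimated by DEA across periods. As a hedged sketch of the building block only, the code below computes a single-period input-oriented CRS (CCR) efficiency score with a linear program; the firm data are hypothetical and the Generalized Malmquist construction itself is not reproduced.

```python
import numpy as np
from scipy.optimize import linprog

def dea_efficiency(X, Y, j0):
    """Input-oriented CRS (CCR) efficiency of unit j0: the distance-function
    building block from which Malmquist-type indices are assembled.

    X : (m, n) inputs, Y : (s, n) outputs; columns are decision-making units.
    Solves  min theta  s.t.  X @ lam <= theta * X[:, j0],  Y @ lam >= Y[:, j0],  lam >= 0.
    """
    m, n = X.shape
    s = Y.shape[0]
    c = np.r_[1.0, np.zeros(n)]                  # minimise theta; variables are [theta, lam]
    A_in = np.c_[-X[:, [j0]], X]                 # X lam - theta * x0 <= 0
    A_out = np.c_[np.zeros((s, 1)), -Y]          # -Y lam <= -y0
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.r_[np.zeros(m), -Y[:, j0]]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.fun                               # efficiency score in (0, 1]

# Hypothetical panel slice: 2 inputs, 1 output, 4 firms
X = np.array([[2.0, 3.0, 4.0, 5.0],
              [4.0, 2.0, 6.0, 3.0]])
Y = np.array([[1.0, 1.0, 1.5, 1.2]])
print([round(dea_efficiency(X, Y, j), 3) for j in range(X.shape[1])])
```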
Abstract:
There has been much recent research into extracting useful diagnostic features from the electrocardiogram, with numerous studies claiming impressive results. However, the robustness and consistency of the methods employed in these studies is rarely, if ever, mentioned. Hence, we propose two new methods: a biologically motivated time series derived from consecutive P-wave durations, and a mathematically motivated regularity measure. We investigate the robustness of these two methods when compared with current corresponding methods. We find that the new time series performs admirably as a complement to the current method and the new regularity measure consistently outperforms the current measure in numerous tests on real and synthetic data.
Abstract:
In this study, a new entropy measure known as kernel entropy (KerEnt), which quantifies the irregularity in a series, was applied to nocturnal oxygen saturation (SaO2) recordings. A total of 96 subjects suspected of suffering from sleep apnea-hypopnea syndrome (SAHS) took part in the study: 32 SAHS-negative and 64 SAHS-positive subjects. Their SaO2 signals were separately processed by means of KerEnt. Our results show that a higher degree of irregularity is associated with SAHS-positive subjects. Statistical analysis revealed significant differences between the KerEnt values of the SAHS-negative and SAHS-positive groups. The diagnostic utility of this parameter was studied by means of receiver operating characteristic (ROC) analysis. A classification accuracy of 81.25% (81.25% sensitivity and 81.25% specificity) was achieved. Repeated apneas during sleep increase irregularity in SaO2 data. This effect can be measured by KerEnt in order to detect SAHS. This non-linear measure can provide useful information for the development of alternative diagnostic techniques in order to reduce the demand for conventional polysomnography (PSG). © 2011 IEEE.
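The KerEnt definition is not given in the abstract, so the sketch below assumes precomputed per-subject index values and only illustrates the ROC-style assessment step (threshold choice, sensitivity, specificity, accuracy); the synthetic values and the Youden-J threshold rule are assumptions for illustration.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

def assess_threshold(index_values, labels):
    """ROC-style assessment of a per-subject irregularity index against
    SAHS-positive (1) / SAHS-negative (0) labels. The index values are assumed
    to be precomputed; this does not implement KerEnt itself."""
    index_values = np.asarray(index_values)
    labels = np.asarray(labels)
    fpr, tpr, thresholds = roc_curve(labels, index_values)
    # Pick the threshold maximising Youden's J = sensitivity + specificity - 1
    best = np.argmax(tpr - fpr)
    thr = thresholds[best]
    pred = (index_values >= thr).astype(int)
    sens = (pred[labels == 1] == 1).mean()
    spec = (pred[labels == 0] == 0).mean()
    acc = (pred == labels).mean()
    return dict(auc=roc_auc_score(labels, index_values),
                threshold=thr, sensitivity=sens, specificity=spec, accuracy=acc)

# Synthetic illustration: SAHS-positive subjects drawn with higher irregularity
rng = np.random.default_rng(0)
values = np.r_[rng.normal(0.5, 0.15, 32), rng.normal(0.9, 0.2, 64)]
labels = np.r_[np.zeros(32, int), np.ones(64, int)]
print(assess_threshold(values, labels))
```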
Abstract:
Removing noise from piecewise constant (PWC) signals is a challenging signal processing problem arising in many practical contexts. For example, in exploration geosciences, noisy drill hole records need to be separated into stratigraphic zones, and in biophysics, jumps between molecular dwell states have to be extracted from noisy fluorescence microscopy signals. Many PWC denoising methods exist, including total variation regularization, mean shift clustering, stepwise jump placement, running medians, convex clustering shrinkage and bilateral filtering; conventional linear signal processing methods are fundamentally unsuited. This paper (part I, the first of two) shows that most of these methods are associated with a special case of a generalized functional that is minimized to achieve PWC denoising. The minimizer can be obtained by diverse solver algorithms, including stepwise jump placement, convex programming, finite differences, iterated running medians, least angle regression, regularization path following and coordinate descent. In the second paper, part II, we introduce novel PWC denoising methods and present comparisons between these methods on synthetic and real signals, showing that the new understanding of the problem gained in part I leads to new methods that have a useful role to play.
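As a hedged illustration of one special case of such a generalized functional, the sketch below minimizes a quadratic data-fidelity term plus a total-variation penalty, using a smooth surrogate for the absolute value so a generic quasi-Newton solver can be applied; dedicated TV solvers are exact and faster, and the penalty weight is an illustrative choice.

```python
import numpy as np
from scipy.optimize import minimize

def pwc_denoise_tv(y, lam=2.0, eps=1e-6):
    """Denoise a piecewise-constant signal by minimising one special case of
    the generalized functional:

        sum_i (x_i - y_i)^2  +  lam * sum_i |x_{i+1} - x_i|

    The absolute value is smoothed as sqrt(d^2 + eps) so that a generic
    quasi-Newton solver can be used."""
    y = np.asarray(y, dtype=float)

    def objective(x):
        d = np.diff(x)
        return np.sum((x - y) ** 2) + lam * np.sum(np.sqrt(d ** 2 + eps))

    res = minimize(objective, y.copy(), method="L-BFGS-B")
    return res.x

# Noisy two-level step signal
rng = np.random.default_rng(0)
clean = np.r_[np.zeros(100), 3.0 * np.ones(100)]
noisy = clean + rng.normal(0, 0.4, clean.size)
denoised = pwc_denoise_tv(noisy, lam=5.0)
print(np.abs(denoised - clean).mean())   # should be well below the noise level
```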
Abstract:
Removing noise from signals which are piecewise constant (PWC) is a challenging signal processing problem that arises in many practical scientific and engineering contexts. In the first paper (part I) of this series of two, we presented background theory building on results from the image processing community to show that the majority of these algorithms, and more proposed in the wider literature, are each associated with a special case of a generalized functional that, when minimized, solves the PWC denoising problem. It also showed how the minimizer can be obtained by a range of computational solver algorithms. In this second paper (part II), using the understanding developed in part I, we introduce several novel PWC denoising methods which, for example, combine the global behaviour of mean shift clustering with the local smoothing of total variation diffusion, and show example solver algorithms for these new methods. Comparisons between these methods are performed on synthetic and real signals, revealing that our new methods have a useful role to play. Finally, overlaps between the generalized methods of these two papers and others such as wavelet shrinkage, hidden Markov models, and piecewise smooth filtering are touched on.
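One ingredient named here, mean shift clustering applied in value space, is simple enough to sketch; the code below is a plain (blurring) mean-shift iteration over sample values, not the hybrid mean-shift/total-variation methods of the paper, and the kernel width is an assumed tuning parameter.

```python
import numpy as np

def mean_shift_values(y, width=0.5, iters=50):
    """Plain mean-shift iteration over sample *values*: each sample moves towards
    the weighted mean of all samples within a Gaussian kernel of the given width
    in value space, so samples collapse onto a few constant levels."""
    x = np.asarray(y, dtype=float).copy()
    for _ in range(iters):
        # Pairwise value differences and Gaussian kernel weights
        diff = x[None, :] - x[:, None]
        w = np.exp(-0.5 * (diff / width) ** 2)
        x = (w @ x) / w.sum(axis=1)
    return x

rng = np.random.default_rng(0)
clean = np.r_[np.zeros(80), 2.0 * np.ones(80)]
noisy = clean + rng.normal(0, 0.3, clean.size)
levels = mean_shift_values(noisy)
print(np.unique(np.round(levels, 2)))   # roughly two clustered levels
```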
Abstract:
The dynamics of the non-equilibrium Ising model with parallel updates is investigated using a generalized mean field approximation that incorporates multiple two-site correlations at any two time steps, which can be obtained recursively. The proposed method shows significant improvement in predicting local system properties compared to other mean field approximation techniques, particularly in systems with symmetric interactions. Results are also evaluated against those obtained from Monte Carlo simulations. The method is also employed to obtain parameter values for the kinetic inverse Ising modeling problem, where couplings and local field values of a fully connected spin system are inferred from data. © 2014 IOP Publishing Ltd and SISSA Medialab srl.
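For comparison, the simplest baseline that such a generalized scheme improves upon is the naive mean-field update for parallel dynamics, m_i(t+1) = tanh(beta (sum_j J_ij m_j(t) + h_i)). The sketch below iterates this baseline on a small symmetric system; the coupling statistics are illustrative, and the two-site, two-time correlation corrections of the paper are not included.

```python
import numpy as np

def naive_mean_field_trajectory(J, h, m0, steps=20, beta=1.0):
    """Naive mean-field dynamics for the parallel-update (kinetic) Ising model:

        m_i(t+1) = tanh( beta * ( sum_j J_ij m_j(t) + h_i ) )

    This is the simplest baseline; the generalized scheme augments it with
    recursively computed two-site, two-time correlations."""
    m = np.asarray(m0, dtype=float)
    traj = [m.copy()]
    for _ in range(steps):
        m = np.tanh(beta * (J @ m + h))
        traj.append(m.copy())
    return np.array(traj)

# Small fully connected system with symmetric random couplings
rng = np.random.default_rng(0)
N = 10
J = rng.normal(0, 1 / np.sqrt(N), (N, N))
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)
h = rng.normal(0, 0.1, N)
print(naive_mean_field_trajectory(J, h, m0=0.1 * np.ones(N))[-1])
```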
Abstract:
A new generalized sphere decoding algorithm is proposed for underdetermined MIMO systems with fewer receive antennas N than transmit antennas M. The proposed algorithm is significantly faster than the existing generalized sphere decoding algorithms. The basic idea is to partition the transmitted signal vector into two subvectors with N - 1 and M - N + 1 elements, respectively. After some simple transformations, an outer-layer Sphere Decoder (SD) is used to choose a proper value for one subvector and an inner-layer SD then decides the other, so that the whole transmitted signal vector is obtained. Simulation results show that Double Layer Sphere Decoding (DLSD) has far less complexity than the existing generalized sphere decoders (GSDs).
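A hedged illustration of the partition idea only: the sketch splits the transmitted vector into an (M - N + 1)-element part and an (N - 1)-element part and searches both exhaustively over a BPSK alphabet, whereas the actual DLSD replaces these loops with sphere-pruned searches after its pre-transformations; the channel, noise level and ordering of the two parts are assumptions.

```python
import numpy as np
from itertools import product

def two_layer_ml_detect(H, y, symbols=(-1.0, 1.0)):
    """Illustration of the partition idea behind double-layer detection for an
    underdetermined system y = H s + n with M transmit and N < M receive
    antennas: the signal vector is split into an (M - N + 1)-element outer part
    and an (N - 1)-element inner part. Both layers are searched exhaustively
    here; a real sphere decoder prunes these searches."""
    N, M = H.shape
    k_outer = M - N + 1
    H_outer, H_inner = H[:, :k_outer], H[:, k_outer:]
    best, best_s = np.inf, None
    for s_out in product(symbols, repeat=k_outer):           # outer layer
        r = y - H_outer @ np.array(s_out)
        for s_in in product(symbols, repeat=M - k_outer):     # inner layer
            cost = np.linalg.norm(r - H_inner @ np.array(s_in)) ** 2
            if cost < best:
                best, best_s = cost, np.r_[s_out, s_in]
    return best_s

# Toy underdetermined channel: M = 4 transmit, N = 3 receive, BPSK symbols
rng = np.random.default_rng(0)
M, N = 4, 3
H = rng.normal(size=(N, M))
s = rng.choice([-1.0, 1.0], M)
y = H @ s + 0.05 * rng.normal(size=N)
print(s, two_layer_ml_detect(H, y))
```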
Abstract:
A version of the thermodynamic perturbation theory based on a scaling transformation of the partition function has been applied to the statistical derivation of the equation of state in a high-pressure region. Two modifications of the equation of state have been obtained on the basis of the free energy functional perturbation series. A comparative analysis of the experimental PVT data on the isothermal compression of supercritical fluids of inert gases has been carried out. © 2012.
Abstract:
A generalized Drucker–Prager (GD–P) viscoplastic yield surface model was developed and validated for asphalt concrete. The GD–P model was formulated based on fabric tensor modified stresses to consider the material inherent anisotropy. A smooth and convex octahedral yield surface function was developed in the GD–P model to characterize the full range of the internal friction angles from 0° to 90°. In contrast, the existing Extended Drucker–Prager (ED–P) was demonstrated to be applicable only for a material that has an internal friction angle less than 22°. Laboratory tests were performed to evaluate the anisotropic effect and to validate the GD–P model. Results indicated that (1) the yield stresses of an isotropic yield surface model are greater in compression and less in extension than that of an anisotropic model, which can result in an under-prediction of the viscoplastic deformation; and (2) the yield stresses predicted by the GD–P model matched well with the experimental results of the octahedral shear strength tests at different normal and confining stresses. By contrast, the ED–P model over-predicted the octahedral yield stresses, which can lead to an under-prediction of the permanent deformation. In summary, the rutting depth of an asphalt pavement would be underestimated without considering anisotropy and convexity of the yield surface for asphalt concrete. The proposed GD–P model was demonstrated to be capable of overcoming these limitations of the existing yield surface models for the asphalt concrete.
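For orientation, the classic isotropic Drucker-Prager yield function underlying both models can be evaluated in a few lines; the GD-P extensions (fabric-tensor modified stresses and the smooth convex octahedral shape valid up to 90 degrees of internal friction) are not reproduced, and the parameter values and sign convention below are illustrative assumptions.

```python
import numpy as np

def drucker_prager_yield(sigma, alpha=0.15, k=1.0):
    """Classic (isotropic) Drucker-Prager yield function f = sqrt(J2) + alpha*I1 - k;
    yielding is predicted when f >= 0. Sign conventions for I1 vary between texts.
    The GD-P model goes further: it applies a fabric-tensor modification to the
    stress and replaces the circular octahedral cross-section with a smooth
    convex shape; neither extension is shown here."""
    sigma = np.asarray(sigma, dtype=float)
    I1 = np.trace(sigma)                      # first stress invariant
    s = sigma - I1 / 3.0 * np.eye(3)          # deviatoric stress
    J2 = 0.5 * np.sum(s * s)                  # second deviatoric invariant
    return np.sqrt(J2) + alpha * I1 - k

# Triaxial-style stress state (compression negative), units arbitrary
sigma = np.diag([-3.0, -1.0, -1.0])
print(drucker_prager_yield(sigma))
```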
Abstract:
Fuzzy data envelopment analysis (DEA) models emerge as another class of DEA models to account for imprecise inputs and outputs for decision making units (DMUs). Although several approaches for solving fuzzy DEA models have been developed, there are some drawbacks, ranging from the inability to provide satisfactory discrimination power to simplistic numerical examples that handle only triangular or symmetrical fuzzy numbers. To address these drawbacks, this paper proposes using the concept of expected value in a generalized DEA (GDEA) model. This allows the unification of three models - fuzzy expected CCR, fuzzy expected BCC, and fuzzy expected FDH models - and enables these models to handle both symmetrical and asymmetrical fuzzy numbers. We also explore the role of the fuzzy GDEA model as a ranking method and compare it to existing super-efficiency evaluation models. Our proposed model is always feasible, while infeasibility problems remain in certain cases under existing super-efficiency models. In order to illustrate the performance of the proposed method, it is first tested using two established numerical examples and compared with the results obtained from alternative methods. A third example on energy dependency among 23 European Union (EU) member countries is further used to validate and describe the efficacy of our approach under asymmetric fuzzy numbers.
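The expected-value idea can be illustrated with one widely used credibility-based operator for a (possibly asymmetric) triangular fuzzy number, E(a, b, c) = (a + 2b + c) / 4, which yields crisp inputs and outputs for a standard DEA model; the paper's exact operator and unified models may differ in detail.

```python
def expected_value_triangular(a, b, c):
    """Expected value of a (possibly asymmetric) triangular fuzzy number
    (a, b, c) under the credibility-based definition E = (a + 2b + c) / 4.
    This is one widely used defuzzification operator for DEA inputs/outputs;
    it is used here purely for illustration."""
    return (a + 2.0 * b + c) / 4.0

# Symmetric vs asymmetric fuzzy inputs defuzzify to different crisp values
print(expected_value_triangular(2.0, 3.0, 4.0))   # symmetric  -> 3.0
print(expected_value_triangular(2.0, 3.0, 7.0))   # asymmetric -> 3.75
```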