141 resultados para High-Dimensional Space Geometrical Informatics (HDSGI)
Resumo:
This paper introduces a new neurofuzzy model construction and parameter estimation algorithm from observed finite data sets, based on a Takagi and Sugeno (T-S) inference mechanism and a new extended Gram-Schmidt orthogonal decomposition algorithm, for the modeling of a priori unknown dynamical systems in the form of a set of fuzzy rules. The first contribution of the paper is the introduction of a one to one mapping between a fuzzy rule-base and a model matrix feature subspace using the T-S inference mechanism. This link enables the numerical properties associated with a rule-based matrix subspace, the relationships amongst these matrix subspaces, and the correlation between the output vector and a rule-base matrix subspace, to be investigated and extracted as rule-based knowledge to enhance model transparency. The matrix subspace spanned by a fuzzy rule is initially derived as the input regression matrix multiplied by a weighting matrix that consists of the corresponding fuzzy membership functions over the training data set. Model transparency is explored by the derivation of an equivalence between an A-optimality experimental design criterion of the weighting matrix and the average model output sensitivity to the fuzzy rule, so that rule-bases can be effectively measured by their identifiability via the A-optimality experimental design criterion. The A-optimality experimental design criterion of the weighting matrices of fuzzy rules is used to construct an initial model rule-base. An extended Gram-Schmidt algorithm is then developed to estimate the parameter vector for each rule. This new algorithm decomposes the model rule-bases via an orthogonal subspace decomposition approach, so as to enhance model transparency with the capability of interpreting the derived rule-base energy level. This new approach is computationally simpler than the conventional Gram-Schmidt algorithm for resolving high dimensional regression problems, whereby it is computationally desirable to decompose complex models into a few submodels rather than a single model with large number of input variables and the associated curse of dimensionality problem. Numerical examples are included to demonstrate the effectiveness of the proposed new algorithm.
Resumo:
Although extensively studied within the lidar community, the multiple scattering phenomenon has always been considered a rare curiosity by radar meteorologists. Up to few years ago its appearance has only been associated with two- or three-body-scattering features (e.g. hail flares and mirror images) involving highly reflective surfaces. Recent atmospheric research aimed at better understanding of the water cycle and the role played by clouds and precipitation in affecting the Earth's climate has driven the deployment of high frequency radars in space. Examples are the TRMM 13.5 GHz, the CloudSat 94 GHz, the upcoming EarthCARE 94 GHz, and the GPM dual 13-35 GHz radars. These systems are able to detect the vertical distribution of hydrometeors and thus provide crucial feedbacks for radiation and climate studies. The shift towards higher frequencies increases the sensitivity to hydrometeors, improves the spatial resolution and reduces the size and weight of the radar systems. On the other hand, higher frequency radars are affected by stronger extinction, especially in the presence of large precipitating particles (e.g. raindrops or hail particles), which may eventually drive the signal below the minimum detection threshold. In such circumstances the interpretation of the radar equation via the single scattering approximation may be problematic. Errors will be large when the radiation emitted from the radar after interacting more than once with the medium still contributes substantially to the received power. This is the case if the transport mean-free-path becomes comparable with the instrument footprint (determined by the antenna beam-width and the platform altitude). This situation resembles to what has already been experienced in lidar observations, but with a predominance of wide- versus small-angle scattering events. At millimeter wavelengths, hydrometeors diffuse radiation rather isotropically compared to the visible or near infrared region where scattering is predominantly in the forward direction. A complete understanding of radiation transport modeling and data analysis methods under wide-angle multiple scattering conditions is mandatory for a correct interpretation of echoes observed by space-borne millimeter radars. This paper reviews the status of research in this field. Different numerical techniques currently implemented to account for higher order scattering are reviewed and their weaknesses and strengths highlighted. Examples of simulated radar backscattering profiles are provided with particular emphasis given to situations in which the multiple scattering contributions become comparable or overwhelm the single scattering signal. We show evidences of multiple scattering effects from air-borne and from CloudSat observations, i.e. unique signatures which cannot be explained by single scattering theory. Ideas how to identify and tackle the multiple scattering effects are discussed. Finally perspectives and suggestions for future work are outlined. This work represents a reference-guide for studies focused at modeling the radiation transport and at interpreting data from high frequency space-borne radar systems that probe highly opaque scattering media such as thick ice clouds or precipitating clouds.
Resumo:
A generalized or tunable-kernel model is proposed for probability density function estimation based on an orthogonal forward regression procedure. Each stage of the density estimation process determines a tunable kernel, namely, its center vector and diagonal covariance matrix, by minimizing a leave-one-out test criterion. The kernel mixing weights of the constructed sparse density estimate are finally updated using the multiplicative nonnegative quadratic programming algorithm to ensure the nonnegative and unity constraints, and this weight-updating process additionally has the desired ability to further reduce the model size. The proposed tunable-kernel model has advantages, in terms of model generalization capability and model sparsity, over the standard fixed-kernel model that restricts kernel centers to the training data points and employs a single common kernel variance for every kernel. On the other hand, it does not optimize all the model parameters together and thus avoids the problems of high-dimensional ill-conditioned nonlinear optimization associated with the conventional finite mixture model. Several examples are included to demonstrate the ability of the proposed novel tunable-kernel model to effectively construct a very compact density estimate accurately.
Resumo:
A common problem in many data based modelling algorithms such as associative memory networks is the problem of the curse of dimensionality. In this paper, a new two-stage neurofuzzy system design and construction algorithm (NeuDeC) for nonlinear dynamical processes is introduced to effectively tackle this problem. A new simple preprocessing method is initially derived and applied to reduce the rule base, followed by a fine model detection process based on the reduced rule set by using forward orthogonal least squares model structure detection. In both stages, new A-optimality experimental design-based criteria we used. In the preprocessing stage, a lower bound of the A-optimality design criterion is derived and applied as a subset selection metric, but in the later stage, the A-optimality design criterion is incorporated into a new composite cost function that minimises model prediction error as well as penalises the model parameter variance. The utilisation of NeuDeC leads to unbiased model parameters with low parameter variance and the additional benefit of a parsimonious model structure. Numerical examples are included to demonstrate the effectiveness of this new modelling approach for high dimensional inputs.
Resumo:
Despite continuing developments in information technology and the growing economic significance of the emerging Eastern European, South American and Asian economies, international financial activity remains strongly concentrated in a relatively small number of international financial centres. That concentration of financial activity requires a critical mass of office occupation and creates demand for high specification, high cost space. The demand for that space is increasingly linked to the fortunes of global capital markets. That linkage has been emphasised by developments in real estate markets, notably the development of global real estate investment, innovation in property investment vehicles and the growth of debt securitisation. The resultant interlinking of occupier, asset, debt and development markets within and across global financial centres is a source of potential volatility and risk. The paper sets out a broad conceptual model of the linkages and their implications for systemic market risk and presents preliminary empirical results that provide support for the model proposed.
Resumo:
This paper proposes a nonlinear regression structure comprising a wavelet network and a linear term. The introduction of the linear term is aimed at providing a more parsimonious interpolation in high-dimensional spaces when the modelling samples are sparse. A constructive procedure for building such structures, termed linear-wavelet networks, is described. For illustration, the proposed procedure is employed in the framework of dynamic system identification. In an example involving a simulated fermentation process, it is shown that a linear-wavelet network yields a smaller approximation error when compared with a wavelet network with the same number of regressors. The proposed technique is also applied to the identification of a pressure plant from experimental data. In this case, the results show that the introduction of wavelets considerably improves the prediction ability of a linear model. Standard errors on the estimated model coefficients are also calculated to assess the numerical conditioning of the identification process.
Resumo:
A set of random variables is exchangeable if its joint distribution function is invariant under permutation of the arguments. The concept of exchangeability is discussed, with a view towards potential application in evaluating ensemble forecasts. It is argued that the paradigm of ensembles being an independent draw from an underlying distribution function is probably too narrow; allowing ensemble members to be merely exchangeable might be a more versatile model. The question is discussed whether established methods of ensemble evaluation need alteration under this model, with reliability being given particular attention. It turns out that the standard methodology of rank histograms can still be applied. As a first application of the exchangeability concept, it is shown that the method of minimum spanning trees to evaluate the reliability of high dimensional ensembles is mathematically sound.
Resumo:
Particle filters are fully non-linear data assimilation techniques that aim to represent the probability distribution of the model state given the observations (the posterior) by a number of particles. In high-dimensional geophysical applications the number of particles required by the sequential importance resampling (SIR) particle filter in order to capture the high probability region of the posterior, is too large to make them usable. However particle filters can be formulated using proposal densities, which gives greater freedom in how particles are sampled and allows for a much smaller number of particles. Here a particle filter is presented which uses the proposal density to ensure that all particles end up in the high probability region of the posterior probability density function. This gives rise to the possibility of non-linear data assimilation in large dimensional systems. The particle filter formulation is compared to the optimal proposal density particle filter and the implicit particle filter, both of which also utilise a proposal density. We show that when observations are available every time step, both schemes will be degenerate when the number of independent observations is large, unlike the new scheme. The sensitivity of the new scheme to its parameter values is explored theoretically and demonstrated using the Lorenz (1963) model.
Resumo:
Motivated by the motion planning problem for oriented vehicles travelling in a 3-Dimensional space; Euclidean space E3, the sphere S3 and Hyperboloid H3. For such problems the orientation of the vehicle is naturally represented by an orthonormal frame over a point in the underlying manifold. The orthonormal frame bundles of the space forms R3,S3 and H3 correspond with their isometry groups and are the Euclidean group of motion SE(3), the rotation group SO(4) and the Lorentzian group SO(1; 3) respectively. Orthonormal frame bundles of space forms coincide with their isometry groups and therefore the focus shifts to left-invariant control systems defined on Lie groups. In this paper a method for integrating these systems is given where the controls are time-independent. For constant twist motions or helical motions, the corresponding curves g(t) 2 SE(3) are given in closed form by using the well known Rodrigues’ formula. However, this formula is only applicable to the Euclidean case. This paper gives a method for computing the non-Euclidean screw/helical motions in closed form. This involves decoupling the system into two lower dimensional systems using the double cover properties of Lie groups, then the lower dimensional systems are solved explicitly in closed form.
Resumo:
The nuclear time-dependent Hartree-Fock model formulated in three-dimensional space, based on the full standard Skyrme energy density functional complemented with the tensor force, is presented. Full self-consistency is achieved by the model. The application to the isovector giant dipole resonance is discussed in the linear limit, ranging from spherical nuclei (16O and 120Sn) to systems displaying axial or triaxial deformation (24Mg, 28Si, 178Os, 190W and 238U). Particular attention is paid to the spin-dependent terms from the central sector of the functional, recently included together with the tensor. They turn out to be capable of producing a qualitative change on the strength distribution in this channel. The effect on the deformation properties is also discussed. The quantitative effects on the linear response are small and, overall, the giant dipole energy remains unaffected. Calculations are compared to predictions from the (quasi)-particle random-phase approximation and experimental data where available, finding good agreement
Resumo:
Traditional dictionary learning algorithms are used for finding a sparse representation on high dimensional data by transforming samples into a one-dimensional (1D) vector. This 1D model loses the inherent spatial structure property of data. An alternative solution is to employ Tensor Decomposition for dictionary learning on their original structural form —a tensor— by learning multiple dictionaries along each mode and the corresponding sparse representation in respect to the Kronecker product of these dictionaries. To learn tensor dictionaries along each mode, all the existing methods update each dictionary iteratively in an alternating manner. Because atoms from each mode dictionary jointly make contributions to the sparsity of tensor, existing works ignore atoms correlations between different mode dictionaries by treating each mode dictionary independently. In this paper, we propose a joint multiple dictionary learning method for tensor sparse coding, which explores atom correlations for sparse representation and updates multiple atoms from each mode dictionary simultaneously. In this algorithm, the Frequent-Pattern Tree (FP-tree) mining algorithm is employed to exploit frequent atom patterns in the sparse representation. Inspired by the idea of K-SVD, we develop a new dictionary update method that jointly updates elements in each pattern. Experimental results demonstrate our method outperforms other tensor based dictionary learning algorithms.
Resumo:
Background: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system that is used to design and execute scientific workflows and aid in silico experimentation. The availability of fast unsupervised methods for clustering and visualization in the Taverna platform is important to support a data-driven scientific discovery in complex and explorative bioinformatics applications. Results: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that performs clustering of high-dimensional biological data and provides a nonlinear, topology preserving projection for the visualization of the input data and their similarities. The core algorithm in the BioDICE plugin is Fast Learning Self Organizing Map (FLSOM), which is an improved variant of the Self Organizing Map (SOM) algorithm. The plugin generates an interactive 2D map that allows the visual exploration of multidimensional data and the identification of groups of similar objects. The effectiveness of the plugin is demonstrated on a case study related to chemical compounds. Conclusions: The number and variety of available tools and its extensibility have made Taverna a popular choice for the development of scientific data workflows. This work presents a novel plugin, BioDICE, which adds a data-driven knowledge discovery component to Taverna. BioDICE provides an effective and powerful clustering tool, which can be adopted for the explorative analysis of biological datasets.
Resumo:
The disadvantage of the majority of data assimilation schemes is the assumption that the conditional probability density function of the state of the system given the observations [posterior probability density function (PDF)] is distributed either locally or globally as a Gaussian. The advantage, however, is that through various different mechanisms they ensure initial conditions that are predominantly in linear balance and therefore spurious gravity wave generation is suppressed. The equivalent-weights particle filter is a data assimilation scheme that allows for a representation of a potentially multimodal posterior PDF. It does this via proposal densities that lead to extra terms being added to the model equations and means the advantage of the traditional data assimilation schemes, in generating predominantly balanced initial conditions, is no longer guaranteed. This paper looks in detail at the impact the equivalent-weights particle filter has on dynamical balance and gravity wave generation in a primitive equation model. The primary conclusions are that (i) provided the model error covariance matrix imposes geostrophic balance, then each additional term required by the equivalent-weights particle filter is also geostrophically balanced; (ii) the relaxation term required to ensure the particles are in the locality of the observations has little effect on gravity waves and actually induces a reduction in gravity wave energy if sufficiently large; and (iii) the equivalent-weights term, which leads to the particles having equivalent significance in the posterior PDF, produces a change in gravity wave energy comparable to the stochastic model error. Thus, the scheme does not produce significant spurious gravity wave energy and so has potential for application in real high-dimensional geophysical applications.
Resumo:
This paper investigates the use of a particle filter for data assimilation with a full scale coupled ocean–atmosphere general circulation model. Synthetic twin experiments are performed to assess the performance of the equivalent weights filter in such a high-dimensional system. Artificial 2-dimensional sea surface temperature fields are used as observational data every day. Results are presented for different values of the free parameters in the method. Measures of the performance of the filter are root mean square errors, trajectories of individual variables in the model and rank histograms. Filter degeneracy is not observed and the performance of the filter is shown to depend on the ability to keep maximum spread in the ensemble.
Resumo:
Filter degeneracy is the main obstacle for the implementation of particle filter in non-linear high-dimensional models. A new scheme, the implicit equal-weights particle filter (IEWPF), is introduced. In this scheme samples are drawn implicitly from proposal densities with a different covariance for each particle, such that all particle weights are equal by construction. We test and explore the properties of the new scheme using a 1,000-dimensional simple linear model, and the 1,000-dimensional non-linear Lorenz96 model, and compare the performance of the scheme to a Local Ensemble Kalman Filter. The experiments show that the new scheme can easily be implemented in high-dimensional systems and is never degenerate, with good convergence properties in both systems.