101 resultados para vector quantization based Gaussian modeling
Resumo:
Advances in hardware and software technologies allow to capture streaming data. The area of Data Stream Mining (DSM) is concerned with the analysis of these vast amounts of data as it is generated in real-time. Data stream classification is one of the most important DSM techniques allowing to classify previously unseen data instances. Different to traditional classifiers for static data, data stream classifiers need to adapt to concept changes (concept drift) in the stream in real-time in order to reflect the most recent concept in the data as accurately as possible. A recent addition to the data stream classifier toolbox is eRules which induces and updates a set of expressive rules that can easily be interpreted by humans. However, like most rule-based data stream classifiers, eRules exhibits a poor computational performance when confronted with continuous attributes. In this work, we propose an approach to deal with continuous data effectively and accurately in rule-based classifiers by using the Gaussian distribution as heuristic for building rule terms on continuous attributes. We show on the example of eRules that incorporating our method for continuous attributes indeed speeds up the real-time rule induction process while maintaining a similar level of accuracy compared with the original eRules classifier. We termed this new version of eRules with our approach G-eRules.
Resumo:
Accurate estimates of how soil water stress affects plant transpiration are crucial for reliable land surface model (LSM) predictions. Current LSMs generally use a water stress factor, β, dependent on soil moisture content, θ, that ranges linearly between β = 1 for unstressed vegetation and β = 0 when wilting point is reached. This paper explores the feasibility of replacing the current approach with equations that use soil water potential as their independent variable, or with a set of equations that involve hydraulic and chemical signaling, thereby ensuring feedbacks between the entire soil–root–xylem–leaf system. A comparison with the original linear θ-based water stress parameterization, and with its improved curvi-linear version, was conducted. Assessment of model suitability was focused on their ability to simulate the correct (as derived from experimental data) curve shape of relative transpiration versus fraction of transpirable soil water. We used model sensitivity analyses under progressive soil drying conditions, employing two commonly used approaches to calculate water retention and hydraulic conductivity curves. Furthermore, for each of these hydraulic parameterizations we used two different parameter sets, for 3 soil texture types; a total of 12 soil hydraulic permutations. Results showed that the resulting transpiration reduction functions (TRFs) varied considerably among the models. The fact that soil hydraulic conductivity played a major role in the model that involved hydraulic and chemical signaling led to unrealistic values of β, and hence TRF, for many soil hydraulic parameter sets. However, this model is much better equipped to simulate the behavior of different plant species. Based on these findings, we only recommend implementation of this approach into LSMs if great care with choice of soil hydraulic parameters is taken
Resumo:
An efficient data based-modeling algorithm for nonlinear system identification is introduced for radial basis function (RBF) neural networks with the aim of maximizing generalization capability based on the concept of leave-one-out (LOO) cross validation. Each of the RBF kernels has its own kernel width parameter and the basic idea is to optimize the multiple pairs of regularization parameters and kernel widths, each of which is associated with a kernel, one at a time within the orthogonal forward regression (OFR) procedure. Thus, each OFR step consists of one model term selection based on the LOO mean square error (LOOMSE), followed by the optimization of the associated kernel width and regularization parameter, also based on the LOOMSE. Since like our previous state-of-the-art local regularization assisted orthogonal least squares (LROLS) algorithm, the same LOOMSE is adopted for model selection, our proposed new OFR algorithm is also capable of producing a very sparse RBF model with excellent generalization performance. Unlike our previous LROLS algorithm which requires an additional iterative loop to optimize the regularization parameters as well as an additional procedure to optimize the kernel width, the proposed new OFR algorithm optimizes both the kernel widths and regularization parameters within the single OFR procedure, and consequently the required computational complexity is dramatically reduced. Nonlinear system identification examples are included to demonstrate the effectiveness of this new approach in comparison to the well-known approaches of support vector machine and least absolute shrinkage and selection operator as well as the LROLS algorithm.
Resumo:
Learning low dimensional manifold from highly nonlinear data of high dimensionality has become increasingly important for discovering intrinsic representation that can be utilized for data visualization and preprocessing. The autoencoder is a powerful dimensionality reduction technique based on minimizing reconstruction error, and it has regained popularity because it has been efficiently used for greedy pretraining of deep neural networks. Compared to Neural Network (NN), the superiority of Gaussian Process (GP) has been shown in model inference, optimization and performance. GP has been successfully applied in nonlinear Dimensionality Reduction (DR) algorithms, such as Gaussian Process Latent Variable Model (GPLVM). In this paper we propose the Gaussian Processes Autoencoder Model (GPAM) for dimensionality reduction by extending the classic NN based autoencoder to GP based autoencoder. More interestingly, the novel model can also be viewed as back constrained GPLVM (BC-GPLVM) where the back constraint smooth function is represented by a GP. Experiments verify the performance of the newly proposed model.
On-line Gaussian mixture density estimator for adaptive minimum bit-error-rate beamforming receivers
Resumo:
We develop an on-line Gaussian mixture density estimator (OGMDE) in the complex-valued domain to facilitate adaptive minimum bit-error-rate (MBER) beamforming receiver for multiple antenna based space-division multiple access systems. Specifically, the novel OGMDE is proposed to adaptively model the probability density function of the beamformer’s output by tracking the incoming data sample by sample. With the aid of the proposed OGMDE, our adaptive beamformer is capable of updating the beamformer’s weights sample by sample to directly minimize the achievable bit error rate (BER). We show that this OGMDE based MBER beamformer outperforms the existing on-line MBER beamformer, known as the least BER beamformer, in terms of both the convergence speed and the achievable BER.
Resumo:
Classical regression methods take vectors as covariates and estimate the corresponding vectors of regression parameters. When addressing regression problems on covariates of more complex form such as multi-dimensional arrays (i.e. tensors), traditional computational models can be severely compromised by ultrahigh dimensionality as well as complex structure. By exploiting the special structure of tensor covariates, the tensor regression model provides a promising solution to reduce the model’s dimensionality to a manageable level, thus leading to efficient estimation. Most of the existing tensor-based methods independently estimate each individual regression problem based on tensor decomposition which allows the simultaneous projections of an input tensor to more than one direction along each mode. As a matter of fact, multi-dimensional data are collected under the same or very similar conditions, so that data share some common latent components but can also have their own independent parameters for each regression task. Therefore, it is beneficial to analyse regression parameters among all the regressions in a linked way. In this paper, we propose a tensor regression model based on Tucker Decomposition, which identifies not only the common components of parameters across all the regression tasks, but also independent factors contributing to each particular regression task simultaneously. Under this paradigm, the number of independent parameters along each mode is constrained by a sparsity-preserving regulariser. Linked multiway parameter analysis and sparsity modeling further reduce the total number of parameters, with lower memory cost than their tensor-based counterparts. The effectiveness of the new method is demonstrated on real data sets.
An LDA and probability-based classifier for the diagnosis of Alzheimer's Disease from structural MRI
Resumo:
In this paper a custom classification algorithm based on linear discriminant analysis and probability-based weights is implemented and applied to the hippocampus measurements of structural magnetic resonance images from healthy subjects and Alzheimer’s Disease sufferers; and then attempts to diagnose them as accurately as possible. The classifier works by classifying each measurement of a hippocampal volume as healthy controlsized or Alzheimer’s Disease-sized, these new features are then weighted and used to classify the subject as a healthy control or suffering from Alzheimer’s Disease. The preliminary results obtained reach an accuracy of 85.8% and this is a similar accuracy to state-of-the-art methods such as a Naive Bayes classifier and a Support Vector Machine. An advantage of the method proposed in this paper over the aforementioned state of the art classifiers is the descriptive ability of the classifications it produces. The descriptive model can be of great help to aid a doctor in the diagnosis of Alzheimer’s Disease, or even further the understand of how Alzheimer’s Disease affects the hippocampus.
Resumo:
Currently, multi-attribute auctions are becoming widespread awarding mechanisms for contracts in construction, and in these auctions, criteria other than price are taken into account for ranking bidder proposals. Therefore, being the lowest-price bidder is no longer a guarantee of being awarded, thus increasing the importance of measuring any bidder’s performance when not only the first position (lowest price) matters. Modeling position performance allows a tender manager to calculate the probability curves related to the more likely positions to be occupied by any bidder who enters a competitive auction irrespective of the actual number of future participating bidders. This paper details a practical methodology based on simple statistical calculations for modeling the performance of a single bidder or a group of bidders, constituting a useful resource for analyzing one’s own success while benchmarking potential bidding competitors.
Resumo:
The personalised conditioning system (PCS) is widely studied. Potentially, it is able to reduce energy consumption while securing occupants’ thermal comfort requirements. It has been suggested that automatic optimised operation schemes for PCS should be introduced to avoid energy wastage and discomfort caused by inappropriate operation. In certain automatic operation schemes, personalised thermal sensation models are applied as key components to help in setting targets for PCS operation. In this research, a novel personal thermal sensation modelling method based on the C-Support Vector Classification (C-SVC) algorithm has been developed for PCS control. The personal thermal sensation modelling has been regarded as a classification problem. During the modelling process, the method ‘learns’ an occupant’s thermal preferences from his/her feedback, environmental parameters and personal physiological and behavioural factors. The modelling method has been verified by comparing the actual thermal sensation vote (TSV) with the modelled one based on 20 individual cases. Furthermore, the accuracy of each individual thermal sensation model has been compared with the outcomes of the PMV model. The results indicate that the modelling method presented in this paper is an effective tool to model personal thermal sensations and could be integrated within the PCS for optimised system operation and control.
Resumo:
Subspace clustering groups a set of samples from a union of several linear subspaces into clusters, so that the samples in the same cluster are drawn from the same linear subspace. In the majority of the existing work on subspace clustering, clusters are built based on feature information, while sample correlations in their original spatial structure are simply ignored. Besides, original high-dimensional feature vector contains noisy/redundant information, and the time complexity grows exponentially with the number of dimensions. To address these issues, we propose a tensor low-rank representation (TLRR) and sparse coding-based (TLRRSC) subspace clustering method by simultaneously considering feature information and spatial structures. TLRR seeks the lowest rank representation over original spatial structures along all spatial directions. Sparse coding learns a dictionary along feature spaces, so that each sample can be represented by a few atoms of the learned dictionary. The affinity matrix used for spectral clustering is built from the joint similarities in both spatial and feature spaces. TLRRSC can well capture the global structure and inherent feature information of data, and provide a robust subspace segmentation from corrupted data. Experimental results on both synthetic and real-world data sets show that TLRRSC outperforms several established state-of-the-art methods.
Resumo:
This paper describes a novel on-line learning approach for radial basis function (RBF) neural network. Based on an RBF network with individually tunable nodes and a fixed small model size, the weight vector is adjusted using the multi-innovation recursive least square algorithm on-line. When the residual error of the RBF network becomes large despite of the weight adaptation, an insignificant node with little contribution to the overall system is replaced by a new node. Structural parameters of the new node are optimized by proposed fast algorithms in order to significantly improve the modeling performance. The proposed scheme describes a novel, flexible, and fast way for on-line system identification problems. Simulation results show that the proposed approach can significantly outperform existing ones for nonstationary systems in particular.