216 resultados para Sparse Incremental Em Algorithm
Resumo:
Advances in hardware and software in the past decade allow to capture, record and process fast data streams at a large scale. The research area of data stream mining has emerged as a consequence from these advances in order to cope with the real time analysis of potentially large and changing data streams. Examples of data streams include Google searches, credit card transactions, telemetric data and data of continuous chemical production processes. In some cases the data can be processed in batches by traditional data mining approaches. However, in some applications it is required to analyse the data in real time as soon as it is being captured. Such cases are for example if the data stream is infinite, fast changing, or simply too large in size to be stored. One of the most important data mining techniques on data streams is classification. This involves training the classifier on the data stream in real time and adapting it to concept drifts. Most data stream classifiers are based on decision trees. However, it is well known in the data mining community that there is no single optimal algorithm. An algorithm may work well on one or several datasets but badly on others. This paper introduces eRules, a new rule based adaptive classifier for data streams, based on an evolving set of Rules. eRules induces a set of rules that is constantly evaluated and adapted to changes in the data stream by adding new and removing old rules. It is different from the more popular decision tree based classifiers as it tends to leave data instances rather unclassified than forcing a classification that could be wrong. The ongoing development of eRules aims to improve its accuracy further through dynamic parameter setting which will also address the problem of changing feature domain values.
Resumo:
In this paper, we propose a novel online modeling algorithm for nonlinear and nonstationary systems using a radial basis function (RBF) neural network with a fixed number of hidden nodes. Each of the RBF basis functions has a tunable center vector and an adjustable diagonal covariance matrix. A multi-innovation recursive least square (MRLS) algorithm is applied to update the weights of RBF online, while the modeling performance is monitored. When the modeling residual of the RBF network becomes large in spite of the weight adaptation, a node identified as insignificant is replaced with a new node, for which the tunable center vector and diagonal covariance matrix are optimized using the quantum particle swarm optimization (QPSO) algorithm. The major contribution is to combine the MRLS weight adaptation and QPSO node structure optimization in an innovative way so that it can track well the local characteristic in the nonstationary system with a very sparse model. Simulation results show that the proposed algorithm has significantly better performance than existing approaches.
Resumo:
This contribution introduces a new digital predistorter to compensate serious distortions caused by memory high power amplifiers (HPAs) which exhibit output saturation characteristics. The proposed design is based on direct learning using a data-driven B-spline Wiener system modeling approach. The nonlinear HPA with memory is first identified based on the B-spline neural network model using the Gauss-Newton algorithm, which incorporates the efficient De Boor algorithm with both B-spline curve and first derivative recursions. The estimated Wiener HPA model is then used to design the Hammerstein predistorter. In particular, the inverse of the amplitude distortion of the HPA's static nonlinearity can be calculated effectively using the Newton-Raphson formula based on the inverse of De Boor algorithm. A major advantage of this approach is that both the Wiener HPA identification and the Hammerstein predistorter inverse can be achieved very efficiently and accurately. Simulation results obtained are presented to demonstrate the effectiveness of this novel digital predistorter design.
Resumo:
We present a new sparse shape modeling framework on the Laplace-Beltrami (LB) eigenfunctions. Traditionally, the LB-eigenfunctions are used as a basis for intrinsically representing surface shapes by forming a Fourier series expansion. To reduce high frequency noise, only the first few terms are used in the expansion and higher frequency terms are simply thrown away. However, some lower frequency terms may not necessarily contribute significantly in reconstructing the surfaces. Motivated by this idea, we propose to filter out only the significant eigenfunctions by imposing l1-penalty. The new sparse framework can further avoid additional surface-based smoothing often used in the field. The proposed approach is applied in investigating the influence of age (38-79 years) and gender on amygdala and hippocampus shapes in the normal population. In addition, we show how the emotional response is related to the anatomy of the subcortical structures.
Resumo:
In most Western countries, saturated fatty acid (SFA) intake exceeds recommended levels, which is considered a risk factor for cardiovascular disease (CVD). As milk and dairy products are major contributors to SFA intake in many countries, recent research has focused on sustainable methods of producing milk with a lower saturated fat concentration by altering dairy cow diets. Human intervention studies have shown that CVD risk can be reduced by consuming dairy products with reduced SFA and increased cis-monounsaturated fatty acid (MUFA) concentrations. This milk fatty acid profile can be achieved by supplementing dairy cow diets with cis-MUFA-rich unsaturated oils. However, rumen exposure of unsaturated oils also leads to enhanced milk trans fatty acid (TFA) concentrations. Because of concerns about the effects of TFA consumption on CVD, feeding strategies that increase MUFA concentrations in milk without concomitant increases in TFA concentration are preferred by milk processors. In an attempt to limit TFA production and increase the replacement of SFA by cis-MUFA, a preparation of rumen-protected unsaturated oils was developed using saponification with calcium salts. Four multiparous Holstein-Friesian cows in mid-late lactation were used in a 4 × 4 Latin square design with 21-d periods to investigate the effect of incremental dietary inclusion of a calcium salt of cis-MUFA product (Ca-MUFA; 20, 40, and 60 g/kg of dry matter of a maize silage-based diet), on milk production, composition, and fatty acid concentration. Increasing Ca-MUFA inclusion reduced dry matter intake linearly, but no change was observed in estimated ME intake. No change in milk yield was noted, but milk fat and protein concentrations were linearly reduced. Supplementation with Ca-MUFA resulted in a linear reduction in total SFA (from 71 to 52 g/100 g of fatty acids for control and 60 g/kg of dry matter diets, respectively). In addition, concentrations of both cis- and trans-MUFA were increased with Ca-MUFA inclusion, and increases in other biohydrogenation intermediates in milk fat were also observed. The Ca-MUFA supplement was very effective at reducing milk SFA concentration and increasing cis-MUFA concentrations without incurring any negative effects on milk and milk component yields. However, reduced milk fat and protein concentrations, together with increases in milk TFA concentrations, suggest partial dissociation of the calcium salts in the rumen
Resumo:
Evolutionary meta-algorithms for pulse shaping of broadband femtosecond duration laser pulses are proposed. The genetic algorithm searching the evolutionary landscape for desired pulse shapes consists of a population of waveforms (genes), each made from two concatenated vectors, specifying phases and magnitudes, respectively, over a range of frequencies. Frequency domain operators such as mutation, two-point crossover average crossover, polynomial phase mutation, creep and three-point smoothing as well as a time-domain crossover are combined to produce fitter offsprings at each iteration step. The algorithm applies roulette wheel selection; elitists and linear fitness scaling to the gene population. A differential evolution (DE) operator that provides a source of directed mutation and new wavelet operators are proposed. Using properly tuned parameters for DE, the meta-algorithm is used to solve a waveform matching problem. Tuning allows either a greedy directed search near the best known solution or a robust search across the entire parameter space.
Resumo:
A two-stage linear-in-the-parameter model construction algorithm is proposed aimed at noisy two-class classification problems. The purpose of the first stage is to produce a prefiltered signal that is used as the desired output for the second stage which constructs a sparse linear-in-the-parameter classifier. The prefiltering stage is a two-level process aimed at maximizing a model's generalization capability, in which a new elastic-net model identification algorithm using singular value decomposition is employed at the lower level, and then, two regularization parameters are optimized using a particle-swarm-optimization algorithm at the upper level by minimizing the leave-one-out (LOO) misclassification rate. It is shown that the LOO misclassification rate based on the resultant prefiltered signal can be analytically computed without splitting the data set, and the associated computational cost is minimal due to orthogonality. The second stage of sparse classifier construction is based on orthogonal forward regression with the D-optimality algorithm. Extensive simulations of this approach for noisy data sets illustrate the competitiveness of this approach to classification of noisy data problems.
Resumo:
This paper analyze and study a pervasive computing system in a mining environment to track people based on RFID (radio frequency identification) technology. In first instance, we explain the RFID fundamentals and the LANDMARC (location identification based on dynamic active RFID calibration) algorithm, then we present the proposed algorithm combining LANDMARC and trilateration technique to collect the coordinates of the people inside the mine, next we generalize a pervasive computing system that can be implemented in mining, and finally we show the results and conclusions.
Resumo:
Remote sensing observations often have correlated errors, but the correlations are typically ignored in data assimilation for numerical weather prediction. The assumption of zero correlations is often used with data thinning methods, resulting in a loss of information. As operational centres move towards higher-resolution forecasting, there is a requirement to retain data providing detail on appropriate scales. Thus an alternative approach to dealing with observation error correlations is needed. In this article, we consider several approaches to approximating observation error correlation matrices: diagonal approximations, eigendecomposition approximations and Markov matrices. These approximations are applied in incremental variational assimilation experiments with a 1-D shallow water model using synthetic observations. Our experiments quantify analysis accuracy in comparison with a reference or ‘truth’ trajectory, as well as with analyses using the ‘true’ observation error covariance matrix. We show that it is often better to include an approximate correlation structure in the observation error covariance matrix than to incorrectly assume error independence. Furthermore, by choosing a suitable matrix approximation, it is feasible and computationally cheap to include error correlation structure in a variational data assimilation algorithm.
Resumo:
For Northern Hemisphere extra-tropical cyclone activity, the dependency of a potential anthropogenic climate change signal on the identification method applied is analysed. This study investigates the impact of the used algorithm on the changing signal, not the robustness of the climate change signal itself. Using one single transient AOGCM simulation as standard input for eleven state-of-the-art identification methods, the patterns of model simulated present day climatologies are found to be close to those computed from re-analysis, independent of the method applied. Although differences in the total number of cyclones identified exist, the climate change signals (IPCC SRES A1B) in the model run considered are largely similar between methods for all cyclones. Taking into account all tracks, decreasing numbers are found in the Mediterranean, the Arctic in the Barents and Greenland Seas, the mid-latitude Pacific and North America. Changing patterns are even more similar, if only the most severe systems are considered: the methods reveal a coherent statistically significant increase in frequency over the eastern North Atlantic and North Pacific. We found that the differences between the methods considered are largely due to the different role of weaker systems in the specific methods.
Resumo:
Northern Hemisphere cyclone activity is assessed by applying an algorithm for the detection and tracking of synoptic scale cyclones to mean sea level pressure data. The method, originally developed for the Southern Hemisphere, is adapted for application in the Northern Hemisphere winter season. NCEP-Reanalysis data from 1958/59 to 1997/98 are used as input. The sensitivities of the results to particular parameters of the algorithm are discussed for both case studies and from a climatological point of view. Results show that the choice of settings is of major relevance especially for the tracking of smaller scale and fast moving systems. With an appropriate setting the algorithm is capable of automatically tracking different types of cyclones at the same time: Both fast moving and developing systems over the large ocean basins and smaller scale cyclones over the Mediterranean basin can be assessed. The climatology of cyclone variables, e.g., cyclone track density, cyclone counts, intensification rates, propagation speeds and areas of cyclogenesis and -lysis gives detailed information on typical cyclone life cycles for different regions. The lowering of the spatial and temporal resolution of the input data from full resolution T62/06h to T42/12h decreases the cyclone track density and cyclone counts. Reducing the temporal resolution alone contributes to a decline in the number of fast moving systems, which is relevant for the cyclone track density. Lowering spatial resolution alone mainly reduces the number of weak cyclones.
Resumo:
For an increasing number of applications, mesoscale modelling systems now aim to better represent urban areas. The complexity of processes resolved by urban parametrization schemes varies with the application. The concept of fitness-for-purpose is therefore critical for both the choice of parametrizations and the way in which the scheme should be evaluated. A systematic and objective model response analysis procedure (Multiobjective Shuffled Complex Evolution Metropolis (MOSCEM) algorithm) is used to assess the fitness of the single-layer urban canopy parametrization implemented in the Weather Research and Forecasting (WRF) model. The scheme is evaluated regarding its ability to simulate observed surface energy fluxes and the sensitivity to input parameters. Recent amendments are described, focussing on features which improve its applicability to numerical weather prediction, such as a reduced and physically more meaningful list of input parameters. The study shows a high sensitivity of the scheme to parameters characterizing roof properties in contrast to a low response to road-related ones. Problems in partitioning of energy between turbulent sensible and latent heat fluxes are also emphasized. Some initial guidelines to prioritize efforts to obtain urban land-cover class characteristics in WRF are provided. Copyright © 2010 Royal Meteorological Society and Crown Copyright.
Resumo:
In this paper a modified algorithm is suggested for developing polynomial neural network (PNN) models. Optimal partial description (PD) modeling is introduced at each layer of the PNN expansion, a task accomplished using the orthogonal least squares (OLS) method. Based on the initial PD models determined by the polynomial order and the number of PD inputs, OLS selects the most significant regressor terms reducing the output error variance. The method produces PNN models exhibiting a high level of accuracy and superior generalization capabilities. Additionally, parsimonious models are obtained comprising a considerably smaller number of parameters compared to the ones generated by means of the conventional PNN algorithm. Three benchmark examples are elaborated, including modeling of the gas furnace process as well as the iris and wine classification problems. Extensive simulation results and comparison with other methods in the literature, demonstrate the effectiveness of the suggested modeling approach.