910 resultados para Biclustering algorithms
Resumo:
Frequent episode discovery framework is a popular framework in temporal data mining with many applications. Over the years, many different notions of frequencies of episodes have been proposed along with different algorithms for episode discovery. In this paper, we present a unified view of all the apriori-based discovery methods for serial episodes under these different notions of frequencies. Specifically, we present a unified view of the various frequency counting algorithms. We propose a generic counting algorithm such that all current algorithms are special cases of it. This unified view allows one to gain insights into different frequencies, and we present quantitative relationships among different frequencies. Our unified view also helps in obtaining correctness proofs for various counting algorithms as we show here. It also aids in understanding and obtaining the anti-monotonicity properties satisfied by the various frequencies, the properties exploited by the candidate generation step of any apriori-based method. We also point out how our unified view of counting helps to consider generalization of the algorithm to count episodes with general partial orders.
Resumo:
In this paper, we propose power management algorithms for maximizing the utility of energy harvesting sensors (EHS) that operate purely on the basis of energy harvested from the environment. In particular, we consider communication (i.e., transmission and reception) power management issues for EHS under an energy neutrality constraint. We also consider the fixed power loss effects of the circuitry, the battery inefficiency and its storage capacity, in the design of the algorithms. We propose a two-stage structure that exploits the inherent difference in the timescales at which the energy harvesting and channel fading processes evolve, without loss of optimality of the resulting solution. The outer stage schedules the power that can be used by an inner stage algorithm, so as to maximize the long term average utility and at the same time maintain energy neutrality. The inner stage optimizes the communication parameters to achieve maximum utility in the short-term, subject to the power constraint imposed by the outer stage. We optimize the algorithms for different transmission schemes such as the truncated channel inversion and retransmission strategies. The performance of the algorithms is illustrated via simulations using solar irradiance data, and for the case of Rayleigh fading channels. The results demonstrate the significant performance benefits that can be obtained using the proposed power management algorithms compared to the energy efficient (optimum when there is no storage) and the uniform power consumption (optimum when the battery has infinite capacity and is perfectly efficient) approaches.
Resumo:
We have developed an efficient fully three-dimensional (3D) reconstruction algorithm for diffuse optical tomography (DOT). The 3D DOT, a severely ill-posed problem, is tackled through a pseudodynamic (PD) approach wherein an ordinary differential equation representing the evolution of the solution on pseudotime is integrated that bypasses an explicit inversion of the associated, ill-conditioned system matrix. One of the most computationally expensive parts of the iterative DOT algorithm, the reevaluation of the Jacobian in each of the iterations, is avoided by using the adjoint-Broyden update formula to provide low rank updates to the Jacobian. In addition, wherever feasible, we have also made the algorithm efficient by integrating along the quadratic path provided by the perturbation equation containing the Hessian. These algorithms are then proven by reconstruction, using simulated and experimental data and verifying the PD results with those from the popular Gauss-Newton scheme. The major findings of this work are as follows: (i) the PD reconstructions are comparatively artifact free, providing superior absorption coefficient maps in terms of quantitative accuracy and contrast recovery; (ii) the scaling of computation time with the dimension of the measurement set is much less steep with the Jacobian update formula in place than without it; and (iii) an increase in the data dimension, even though it renders the reconstruction problem less ill conditioned and thus provides relatively artifact-free reconstructions, does not necessarily provide better contrast property recovery. For the latter, one should also take care to uniformly distribute the measurement points, avoiding regions close to the source so that the relative strength of the derivatives for measurements away from the source does not become insignificant. (c) 2012 Optical Society of America
Resumo:
We propose a novel technique for reducing the power consumed by the on-chip cache in SNUCA chip multicore platform. This is achieved by what we call a "remap table", which maps accesses to the cache banks that are as close as possible to the cores, on which the processes are scheduled. With this technique, instead of using all the available cache, we use a portion of the cache and allocate lesser cache to the application. We formulate the problem as an energy-delay (ED) minimization problem and solve it offline using a scalable genetic algorithm approach. Our experiments show up to 40% of savings in the memory sub-system power consumption and 47% savings in energy-delay product (ED).
Resumo:
We propose a novel technique for reducing the power consumed by the on-chip cache in SNUCA chip multicore platform. This is achieved by what we call a "remap table", which maps accesses to the cache banks that are as close as possible to the cores, on which the processes are scheduled. With this technique, instead of using all the available cache, we use a portion of the cache and allocate lesser cache to the application. We formulate the problem as an energy-delay (ED) minimization problem and solve it offline using a scalable genetic algorithm approach. Our experiments show up to 40% of savings in the memory sub-system power consumption and 47% savings in energy-delay product (ED).
Resumo:
In recent times computational algorithms inspired by biological processes and evolution are gaining much popularity for solving science and engineering problems. These algorithms are broadly classified into evolutionary computation and swarm intelligence algorithms, which are derived based on the analogy of natural evolution and biological activities. These include genetic algorithms, genetic programming, differential evolution, particle swarm optimization, ant colony optimization, artificial neural networks, etc. The algorithms being random-search techniques, use some heuristics to guide the search towards optimal solution and speed-up the convergence to obtain the global optimal solutions. The bio-inspired methods have several attractive features and advantages compared to conventional optimization solvers. They also facilitate the advantage of simulation and optimization environment simultaneously to solve hard-to-define (in simple expressions), real-world problems. These biologically inspired methods have provided novel ways of problem-solving for practical problems in traffic routing, networking, games, industry, robotics, economics, mechanical, chemical, electrical, civil, water resources and others fields. This article discusses the key features and development of bio-inspired computational algorithms, and their scope for application in science and engineering fields.
Resumo:
Wireless sensor networks can often be viewed in terms of a uniform deployment of a large number of nodes in a region of Euclidean space. Following deployment, the nodes self-organize into a mesh topology with a key aspect being self-localization. Having obtained a mesh topology in a dense, homogeneous deployment, a frequently used approximation is to take the hop distance between nodes to be proportional to the Euclidean distance between them. In this work, we analyze this approximation through two complementary analyses. We assume that the mesh topology is a random geometric graph on the nodes; and that some nodes are designated as anchors with known locations. First, we obtain high probability bounds on the Euclidean distances of all nodes that are h hops away from a fixed anchor node. In the second analysis, we provide a heuristic argument that leads to a direct approximation for the density function of the Euclidean distance between two nodes that are separated by a hop distance h. This approximation is shown, through simulation, to very closely match the true density function. Localization algorithms that draw upon the preceding analyses are then proposed and shown to perform better than some of the well-known algorithms present in the literature. Belief-propagation-based message-passing is then used to further enhance the performance of the proposed localization algorithms. To our knowledge, this is the first usage of message-passing for hop-count-based self-localization.
Resumo:
Structural Support Vector Machines (SSVMs) have become a popular tool in machine learning for predicting structured objects like parse trees, Part-of-Speech (POS) label sequences and image segments. Various efficient algorithmic techniques have been proposed for training SSVMs for large datasets. The typical SSVM formulation contains a regularizer term and a composite loss term. The loss term is usually composed of the Linear Maximum Error (LME) associated with the training examples. Other alternatives for the loss term are yet to be explored for SSVMs. We formulate a new SSVM with Linear Summed Error (LSE) loss term and propose efficient algorithms to train the new SSVM formulation using primal cutting-plane method and sequential dual coordinate descent method. Numerical experiments on benchmark datasets demonstrate that the sequential dual coordinate descent method is faster than the cutting-plane method and reaches the steady-state generalization performance faster. It is thus a useful alternative for training SSVMs when linear summed error is used.
Resumo:
The q-Gaussian distribution results from maximizing certain generalizations of Shannon entropy under some constraints. The importance of q-Gaussian distributions stems from the fact that they exhibit power-law behavior, and also generalize Gaussian distributions. In this paper, we propose a Smoothed Functional (SF) scheme for gradient estimation using q-Gaussian distribution, and also propose an algorithm for optimization based on the above scheme. Convergence results of the algorithm are presented. Performance of the proposed algorithm is shown by simulation results on a queuing model.
Resumo:
We consider the problem of optimal routing in a multi-stage network of queues with constraints on queue lengths. We develop three algorithms for probabilistic routing for this problem using only the total end-to-end delays. These algorithms use the smoothed functional (SF) approach to optimize the routing probabilities. In our model all the queues are assumed to have constraints on the average queue length. We also propose a novel quasi-Newton based SF algorithm. Policies like Join Shortest Queue or Least Work Left work only for unconstrained routing. Besides assuming knowledge of the queue length at all the queues. If the only information available is the expected end-to-end delay as with our case such policies cannot be used. We also give simulation results showing the performance of the SF algorithms for this problem.
Resumo:
Time series classification deals with the problem of classification of data that is multivariate in nature. This means that one or more of the attributes is in the form of a sequence. The notion of similarity or distance, used in time series data, is significant and affects the accuracy, time, and space complexity of the classification algorithm. There exist numerous similarity measures for time series data, but each of them has its own disadvantages. Instead of relying upon a single similarity measure, our aim is to find the near optimal solution to the classification problem by combining different similarity measures. In this work, we use genetic algorithms to combine the similarity measures so as to get the best performance. The weightage given to different similarity measures evolves over a number of generations so as to get the best combination. We test our approach on a number of benchmark time series datasets and present promising results.
Operator-splitting finite element algorithms for computations of high-dimensional parabolic problems
Resumo:
An operator-splitting finite element method for solving high-dimensional parabolic equations is presented. The stability and the error estimates are derived for the proposed numerical scheme. Furthermore, two variants of fully-practical operator-splitting finite element algorithms based on the quadrature points and the nodal points, respectively, are presented. Both the quadrature and the nodal point based operator-splitting algorithms are validated using a three-dimensional (3D) test problem. The numerical results obtained with the full 3D computations and the operator-split 2D + 1D computations are found to be in a good agreement with the analytical solution. Further, the optimal order of convergence is obtained in both variants of the operator-splitting algorithms. (C) 2012 Elsevier Inc. All rights reserved.