75 resultados para computer algorithm


Relevância:

30.00% 30.00%

Publicador:

Resumo:

The rapid growth in the field of data mining has lead to the development of various methods for outlier detection. Though detection of outliers has been well explored in the context of numerical data, dealing with categorical data is still evolving. In this paper, we propose a two-phase algorithm for detecting outliers in categorical data based on a novel definition of outliers. In the first phase, this algorithm explores a clustering of the given data, followed by the ranking phase for determining the set of most likely outliers. The proposed algorithm is expected to perform better as it can identify different types of outliers, employing two independent ranking schemes based on the attribute value frequencies and the inherent clustering structure in the given data. Unlike some existing methods, the computational complexity of this algorithm is not affected by the number of outliers to be detected. The efficacy of this algorithm is demonstrated through experiments on various public domain categorical data sets.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present a novel multi-timescale Q-learning algorithm for average cost control in a Markov decision process subject to multiple inequality constraints. We formulate a relaxed version of this problem through the Lagrange multiplier method. Our algorithm is different from Q-learning in that it updates two parameters - a Q-value parameter and a policy parameter. The Q-value parameter is updated on a slower time scale as compared to the policy parameter. Whereas Q-learning with function approximation can diverge in some cases, our algorithm is seen to be convergent as a result of the aforementioned timescale separation. We show the results of experiments on a problem of constrained routing in a multistage queueing network. Our algorithm is seen to exhibit good performance and the various inequality constraints are seen to be satisfied upon convergence of the algorithm.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The contour tree is a topological abstraction of a scalar field that captures evolution in level set connectivity. It is an effective representation for visual exploration and analysis of scientific data. We describe a work-efficient, output sensitive, and scalable parallel algorithm for computing the contour tree of a scalar field defined on a domain that is represented using either an unstructured mesh or a structured grid. A hybrid implementation of the algorithm using the GPU and multi-core CPU can compute the contour tree of an input containing 16 million vertices in less than ten seconds with a speedup factor of upto 13. Experiments based on an implementation in a multi-core CPU environment show near-linear speedup for large data sets.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We consider the problem of developing privacy-preserving machine learning algorithms in a dis-tributed multiparty setting. Here different parties own different parts of a data set, and the goal is to learn a classifier from the entire data set with-out any party revealing any information about the individual data points it owns. Pathak et al [7]recently proposed a solution to this problem in which each party learns a local classifier from its own data, and a third party then aggregates these classifiers in a privacy-preserving manner using a cryptographic scheme. The generaliza-tion performance of their algorithm is sensitive to the number of parties and the relative frac-tions of data owned by the different parties. In this paper, we describe a new differentially pri-vate algorithm for the multiparty setting that uses a stochastic gradient descent based procedure to directly optimize the overall multiparty ob-jective rather than combining classifiers learned from optimizing local objectives. The algorithm achieves a slightly weaker form of differential privacy than that of [7], but provides improved generalization guarantees that do not depend on the number of parties or the relative sizes of the individual data sets. Experimental results corrob-orate our theoretical findings.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In wireless sensor networks (WSNs) the communication traffic is often time and space correlated, where multiple nodes in a proximity start transmitting at the same time. Such a situation is known as spatially correlated contention. The random access methods to resolve such contention suffers from high collision rate, whereas the traditional distributed TDMA scheduling techniques primarily try to improve the network capacity by reducing the schedule length. Usually, the situation of spatially correlated contention persists only for a short duration and therefore generating an optimal or sub-optimal schedule is not very useful. On the other hand, if the algorithm takes very large time to schedule, it will not only introduce additional delay in the data transfer but also consume more energy. To efficiently handle the spatially correlated contention in WSNs, we present a distributed TDMA slot scheduling algorithm, called DTSS algorithm. The DTSS algorithm is designed with the primary objective of reducing the time required to perform scheduling, while restricting the schedule length to maximum degree of interference graph. The algorithm uses randomized TDMA channel access as the mechanism to transmit protocol messages, which bounds the message delay and therefore reduces the time required to get a feasible schedule. The DTSS algorithm supports unicast, multicast and broadcast scheduling, simultaneously without any modification in the protocol. The protocol has been simulated using Castalia simulator to evaluate the run time performance. Simulation results show that our protocol is able to considerably reduce the time required to schedule.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Electrical Impedance Tomography (EIT) is a computerized medical imaging technique which reconstructs the electrical impedance images of a domain under test from the boundary voltage-current data measured by an EIT electronic instrumentation using an image reconstruction algorithm. Being a computed tomography technique, EIT injects a constant current to the patient's body through the surface electrodes surrounding the domain to be imaged (Omega) and tries to calculate the spatial distribution of electrical conductivity or resistivity of the closed conducting domain using the potentials developed at the domain boundary (partial derivative Omega). Practical phantoms are essentially required to study, test and calibrate a medical EIT system for certifying the system before applying it on patients for diagnostic imaging. Therefore, the EIT phantoms are essentially required to generate boundary data for studying and assessing the instrumentation and inverse solvers a in EIT. For proper assessment of an inverse solver of a 2D EIT system, a perfect 2D practical phantom is required. As the practical phantoms are the assemblies of the objects with 3D geometries, the developing of a practical 2D-phantom is a great challenge and therefore, the boundary data generated from the practical phantoms with 3D geometry are found inappropriate for assessing a 2D inverse solver. Furthermore, the boundary data errors contributed by the instrumentation are also difficult to separate from the errors developed by the 3D phantoms. Hence, the errorless boundary data are found essential to assess the inverse solver in 2D EIT. In this direction, a MatLAB-based Virtual Phantom for 2D EIT (MatVP2DEIT) is developed to generate accurate boundary data for assessing the 2D-EIT inverse solvers and the image reconstruction accuracy. MatVP2DEIT is a MatLAB-based computer program which simulates a phantom in computer and generates the boundary potential data as the outputs by using the combinations of different phantom parameters as the inputs to the program. Phantom diameter, inhomogeneity geometry (shape, size and position), number of inhomogeneities, applied current magnitude, background resistivity, inhomogeneity resistivity all are set as the phantom variables which are provided as the input parameters to the MatVP2DEIT for simulating different phantom configurations. A constant current injection is simulated at the phantom boundary with different current injection protocols and boundary potential data are calculated. Boundary data sets are generated with different phantom configurations obtained with the different combinations of the phantom variables and the resistivity images are reconstructed using EIDORS. Boundary data of the virtual phantoms, containing inhomogeneities with complex geometries, are also generated for different current injection patterns using MatVP2DEIT and the resistivity imaging is studied. The effect of regularization method on the image reconstruction is also studied with the data generated by MatVP2DEIT. Resistivity images are evaluated by studying the resistivity parameters and contrast parameters estimated from the elemental resistivity profiles of the reconstructed phantom domain. Results show that the MatVP2DEIT generates accurate boundary data for different types of single or multiple objects which are efficient and accurate enough to reconstruct the resistivity images in EIDORS. The spatial resolution studies show that, the resistivity imaging conducted with the boundary data generated by MatVP2DEIT with 2048 elements, can reconstruct two circular inhomogeneities placed with a minimum distance (boundary to boundary) of 2 mm. It is also observed that, in MatVP2DEIT with 2048 elements, the boundary data generated for a phantom with a circular inhomogeneity of a diameter less than 7% of that of the phantom domain can produce resistivity images in EIDORS with a 1968 element mesh. Results also show that the MatVP2DEIT accurately generates the boundary data for neighbouring, opposite reference and trigonometric current patterns which are very suitable for resistivity reconstruction studies. MatVP2DEIT generated data are also found suitable for studying the effect of the different regularization methods on reconstruction process. Comparing the reconstructed image with an original geometry made in MatVP2DEIT, it would be easier to study the resistivity imaging procedures as well as the inverse solver performance. Using the proposed MatVP2DEIT software with modified domains, the cross sectional anatomy of a number of body parts can be simulated in PC and the impedance image reconstruction of human anatomy can be studied.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We propose a simulation-based algorithm for computing the optimal pricing policy for a product under uncertain demand dynamics. We consider a parameterized stochastic differential equation (SDE) model for the uncertain demand dynamics of the product over the planning horizon. In particular, we consider a dynamic model that is an extension of the Bass model. The performance of our algorithm is compared to that of a myopic pricing policy and is shown to give better results. Two significant advantages with our algorithm are as follows: (a) it does not require information on the system model parameters if the SDE system state is known via either a simulation device or real data, and (b) as it works efficiently even for high-dimensional parameters, it uses the efficient smoothed functional gradient estimator.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we study a problem of designing a multi-hop wireless network for interconnecting sensors (hereafter called source nodes) to a Base Station (BS), by deploying a minimum number of relay nodes at a subset of given potential locations, while meeting a quality of service (QoS) objective specified as a hop count bound for paths from the sources to the BS. The hop count bound suffices to ensure a certain probability of the data being delivered to the BS within a given maximum delay under a light traffic model. We observe that the problem is NP-Hard. For this problem, we propose a polynomial time approximation algorithm based on iteratively constructing shortest path trees and heuristically pruning away the relay nodes used until the hop count bound is violated. Results show that the algorithm performs efficiently in various randomly generated network scenarios; in over 90% of the tested scenarios, it gave solutions that were either optimal or were worse than optimal by just one relay. We then use random graph techniques to obtain, under a certain stochastic setting, an upper bound on the average case approximation ratio of a class of algorithms (including the proposed algorithm) for this problem as a function of the number of source nodes, and the hop count bound. To the best of our knowledge, the average case analysis is the first of its kind in the relay placement literature. Since the design is based on a light traffic model, we also provide simulation results (using models for the IEEE 802.15.4 physical layer and medium access control) to assess the traffic levels up to which the QoS objectives continue to be met. (C) 2014 Elsevier B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The correlation clustering problem is a fundamental problem in both theory and practice, and it involves identifying clusters of objects in a data set based on their similarity. A traditional modeling of this question as a graph theoretic problem involves associating vertices with data points and indicating similarity by adjacency. Clusters then correspond to cliques in the graph. The resulting optimization problem, Cluster Editing (and several variants) are very well-studied algorithmically. In many situations, however, translating clusters to cliques can be somewhat restrictive. A more flexible notion would be that of a structure where the vertices are mutually ``not too far apart'', without necessarily being adjacent. One such generalization is realized by structures called s-clubs, which are graphs of diameter at most s. In this work, we study the question of finding a set of at most k edges whose removal leaves us with a graph whose components are s-clubs. Recently, it has been shown that unless Exponential Time Hypothesis fail (ETH) fails Cluster Editing (whose components are 1-clubs) does not admit sub-exponential time algorithm STACS, 2013]. That is, there is no algorithm solving the problem in time 2 degrees((k))n(O(1)). However, surprisingly they show that when the number of cliques in the output graph is restricted to d, then the problem can be solved in time O(2(O(root dk)) + m + n). We show that this sub-exponential time algorithm for the fixed number of cliques is rather an exception than a rule. Our first result shows that assuming the ETH, there is no algorithm solving the s-Club Cluster Edge Deletion problem in time 2 degrees((k))n(O(1)). We show, further, that even the problem of deleting edges to obtain a graph with d s-clubs cannot be solved in time 2 degrees((k))n(O)(1) for any fixed s, d >= 2. This is a radical contrast from the situation established for cliques, where sub-exponential algorithms are known.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The boxicity (resp. cubicity) of a graph G(V, E) is the minimum integer k such that G can be represented as the intersection graph of axis parallel boxes (resp. cubes) in R-k. Equivalently, it is the minimum number of interval graphs (resp. unit interval graphs) on the vertex set V, such that the intersection of their edge sets is E. The problem of computing boxicity (resp. cubicity) is known to be inapproximable, even for restricted graph classes like bipartite, co-bipartite and split graphs, within an O(n(1-epsilon))-factor for any epsilon > 0 in polynomial time, unless NP = ZPP. For any well known graph class of unbounded boxicity, there is no known approximation algorithm that gives n(1-epsilon)-factor approximation algorithm for computing boxicity in polynomial time, for any epsilon > 0. In this paper, we consider the problem of approximating the boxicity (cubicity) of circular arc graphs intersection graphs of arcs of a circle. Circular arc graphs are known to have unbounded boxicity, which could be as large as Omega(n). We give a (2 + 1/k) -factor (resp. (2 + log n]/k)-factor) polynomial time approximation algorithm for computing the boxicity (resp. cubicity) of any circular arc graph, where k >= 1 is the value of the optimum solution. For normal circular arc (NCA) graphs, with an NCA model given, this can be improved to an additive two approximation algorithm. The time complexity of the algorithms to approximately compute the boxicity (resp. cubicity) is O(mn + n(2)) in both these cases, and in O(mn + kn(2)) = O(n(3)) time we also get their corresponding box (resp. cube) representations, where n is the number of vertices of the graph and m is its number of edges. Our additive two approximation algorithm directly works for any proper circular arc graph, since their NCA models can be computed in polynomial time. (C) 2014 Elsevier B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In WSNs the communication traffic is often time and space correlated, where multiple nodes in a proximity start transmitting simultaneously. Such a situation is known as spatially correlated contention. The random access method to resolve such contention suffers from high collision rate, whereas the traditional distributed TDMA scheduling techniques primarily try to improve the network capacity by reducing the schedule length. Usually, the situation of spatially correlated contention persists only for a short duration, and therefore generating an optimal or suboptimal schedule is not very useful. Additionally, if an algorithm takes very long time to schedule, it will not only introduce additional delay in the data transfer but also consume more energy. In this paper, we present a distributed TDMA slot scheduling (DTSS) algorithm, which considerably reduces the time required to perform scheduling, while restricting the schedule length to the maximum degree of interference graph. The DTSS algorithm supports unicast, multicast, and broadcast scheduling, simultaneously without any modification in the protocol. We have analyzed the protocol for average case performance and also simulated it using Castalia simulator to evaluate its runtime performance. Both analytical and simulation results show that our protocol is able to considerably reduce the time required for scheduling.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Given a function from Z(n) to itself one can determine its polynomial representability by using Kempner function. In this paper we present an alternative characterization of polynomial functions over Z(n) by constructing a generating set for the Z(n)-module of polynomial functions. This characterization results in an algorithm that is faster on average in deciding polynomial representability. We also extend the characterization to functions in several variables. (C) 2015 Elsevier B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The time division multiple access (TDMA) based channel access mechanisms perform better than the contention based channel access mechanisms, in terms of channel utilization, reliability and power consumption, specially for high data rate applications in wireless sensor networks (WSNs). Most of the existing distributed TDMA scheduling techniques can be classified as either static or dynamic. The primary purpose of static TDMA scheduling algorithms is to improve the channel utilization by generating a schedule of smaller length. But, they usually take longer time to schedule, and hence, are not suitable for WSNs, in which the network topology changes dynamically. On the other hand, dynamic TDMA scheduling algorithms generate a schedule quickly, but they are not efficient in terms of generated schedule length. In this paper, we propose a novel scheme for TDMA scheduling in WSNs, which can generate a compact schedule similar to static scheduling algorithms, while its runtime performance can be matched with those of dynamic scheduling algorithms. Furthermore, the proposed distributed TDMA scheduling algorithm has the capability to trade-off schedule length with the time required to generate the schedule. This would allow the developers of WSNs, to tune the performance, as per the requirement of prevalent WSN applications, and the requirement to perform re-scheduling. Finally, the proposed TDMA scheduling is fault-tolerant to packet loss due to erroneous wireless channel. The algorithm has been simulated using the Castalia simulator to compare its performance with those of others in terms of generated schedule length and the time required to generate the TDMA schedule. Simulation results show that the proposed algorithm generates a compact schedule in a very less time.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In structured output learning, obtaining labeled data for real-world applications is usually costly, while unlabeled examples are available in abundance. Semisupervised structured classification deals with a small number of labeled examples and a large number of unlabeled structured data. In this work, we consider semisupervised structural support vector machines with domain constraints. The optimization problem, which in general is not convex, contains the loss terms associated with the labeled and unlabeled examples, along with the domain constraints. We propose a simple optimization approach that alternates between solving a supervised learning problem and a constraint matching problem. Solving the constraint matching problem is difficult for structured prediction, and we propose an efficient and effective label switching method to solve it. The alternating optimization is carried out within a deterministic annealing framework, which helps in effective constraint matching and avoiding poor local minima, which are not very useful. The algorithm is simple and easy to implement. Further, it is suitable for any structured output learning problem where exact inference is available. Experiments on benchmark sequence labeling data sets and a natural language parsing data set show that the proposed approach, though simple, achieves comparable generalization performance.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we present two new stochastic approximation algorithms for the problem of quantile estimation. The algorithms uses the characterization of the quantile provided in terms of an optimization problem in 1]. The algorithms take the shape of a stochastic gradient descent which minimizes the optimization problem. Asymptotic convergence of the algorithms to the true quantile is proven using the ODE method. The theoretical results are also supplemented through empirical evidence. The algorithms are shown to provide significant improvement in terms of memory requirement and accuracy.