144 resultados para Synchronization algorithms
Resumo:
We present the first q-Gaussian smoothed functional (SF) estimator of the Hessian and the first Newton-based stochastic optimization algorithm that estimates both the Hessian and the gradient of the objective function using q-Gaussian perturbations. Our algorithm requires only two system simulations (regardless of the parameter dimension) and estimates both the gradient and the Hessian at each update epoch using these. We also present a proof of convergence of the proposed algorithm. In a related recent work (Ghoshdastidar, Dukkipati, & Bhatnagar, 2014), we presented gradient SF algorithms based on the q-Gaussian perturbations. Our work extends prior work on SF algorithms by generalizing the class of perturbation distributions as most distributions reported in the literature for which SF algorithms are known to work turn out to be special cases of the q-Gaussian distribution. Besides studying the convergence properties of our algorithm analytically, we also show the results of numerical simulations on a model of a queuing network, that illustrate the significance of the proposed method. In particular, we observe that our algorithm performs better in most cases, over a wide range of q-values, in comparison to Newton SF algorithms with the Gaussian and Cauchy perturbations, as well as the gradient q-Gaussian SF algorithms. (C) 2014 Elsevier Ltd. All rights reserved.
Resumo:
It has been shown that iterative re-weighted strategies will often improve the performance of many sparse reconstruction algorithms. However, these strategies are algorithm dependent and cannot be easily extended for an arbitrary sparse reconstruction algorithm. In this paper, we propose a general iterative framework and a novel algorithm which iteratively enhance the performance of any given arbitrary sparse reconstruction algorithm. We theoretically analyze the proposed method using restricted isometry property and derive sufficient conditions for convergence and performance improvement. We also evaluate the performance of the proposed method using numerical experiments with both synthetic and real-world data. (C) 2014 Elsevier B.V. All rights reserved.
Resumo:
We present a new Hessian estimator based on the simultaneous perturbation procedure, that requires three system simulations regardless of the parameter dimension. We then present two Newton-based simulation optimization algorithms that incorporate this Hessian estimator. The two algorithms differ primarily in the manner in which the Hessian estimate is used. Both our algorithms do not compute the inverse Hessian explicitly, thereby saving on computational effort. While our first algorithm directly obtains the product of the inverse Hessian with the gradient of the objective, our second algorithm makes use of the Sherman-Morrison matrix inversion lemma to recursively estimate the inverse Hessian. We provide proofs of convergence for both our algorithms. Next, we consider an interesting application of our algorithms on a problem of road traffic control. Our algorithms are seen to exhibit better performance than two Newton algorithms from a recent prior work.
Resumo:
Task-parallel languages are increasingly popular. Many of them provide expressive mechanisms for intertask synchronization. For example, OpenMP 4.0 will integrate data-driven execution semantics derived from the StarSs research language. Compared to the more restrictive data-parallel and fork-join concurrency models, the advanced features being introduced into task-parallelmodels in turn enable improved scalability through load balancing, memory latency hiding, mitigation of the pressure on memory bandwidth, and, as a side effect, reduced power consumption. In this article, we develop a systematic approach to compile loop nests into concurrent, dynamically constructed graphs of dependent tasks. We propose a simple and effective heuristic that selects the most profitable parallelization idiom for every dependence type and communication pattern. This heuristic enables the extraction of interband parallelism (cross-barrier parallelism) in a number of numerical computations that range from linear algebra to structured grids and image processing. The proposed static analysis and code generation alleviates the burden of a full-blown dependence resolver to track the readiness of tasks at runtime. We evaluate our approach and algorithms in the PPCG compiler, targeting OpenStream, a representative dataflow task-parallel language with explicit intertask dependences and a lightweight runtime. Experimental results demonstrate the effectiveness of the approach.
Resumo:
Clock synchronization in wireless sensor networks (WSNs) assures that sensor nodes have the same reference clock time. This is necessary not only for various WSN applications but also for many system level protocols for WSNs such as MAC protocols, and protocols for sleep scheduling of sensor nodes. Clock value of a node at a particular instant of time depends on its initial value and the frequency of the crystal oscillator used in the sensor node. The frequency of the crystal oscillator varies from node to node, and may also change over time depending upon many factors like temperature, humidity, etc. As a result, clock values of different sensor nodes diverge from each other and also from the real time clock, and hence, there is a requirement for clock synchronization in WSNs. Consequently, many clock synchronization protocols for WSNs have been proposed in the recent past. These protocols differ from each other considerably, and so, there is a need to understand them using a common platform. Towards this goal, this survey paper categorizes the features of clock synchronization protocols for WSNs into three types, viz, structural features, technical features, and global objective features. Each of these categories has different options to further segregate the features for better understanding. The features of clock synchronization protocols that have been used in this survey include all the features which have been used in existing surveys as well as new features such as how the clock value is propagated, when the clock value is propagated, and when the physical clock is updated, which are required for better understanding of the clock synchronization protocols in WSNs in a systematic way. This paper also gives a brief description of a few basic clock synchronization protocols for WSNs, and shows how these protocols fit into the above classification criteria. In addition, the recent clock synchronization protocols for WSNs, which are based on the above basic clock synchronization protocols, are also given alongside the corresponding basic clock synchronization protocols. Indeed, the proposed model for characterizing the clock synchronization protocols in WSNs can be used not only for analyzing the existing protocols but also for designing new clock synchronization protocols. (C) 2014 Elsevier B.V. All rights reserved.
Resumo:
We investigate the problem of timing recovery for 2-D magnetic recording (TDMR) channels. We develop a timing error model for TDMR channel considering the phase and frequency offsets with noise. We propose a 2-D data-aided phase-locked loop (PLL) architecture for tracking variations in the position and movement of the read head in the down-track and cross-track directions and analyze the convergence of the algorithm under non-separable timing errors. We further develop a 2-D interpolation-based timing recovery scheme that works in conjunction with the 2-D PLL. We quantify the efficiency of our proposed algorithms by simulations over a 2-D magnetic recording channel with timing errors.
Resumo:
The 3-Hitting Set problem involves a family of subsets F of size at most three over an universe U. The goal is to find a subset of U of the smallest possible size that intersects every set in F. The version of the problem with parity constraints asks for a subset S of size at most k that, in addition to being a hitting set, also satisfies certain parity constraints on the sizes of the intersections of S with each set in the family F. In particular, an odd (even) set is a hitting set that hits every set at either one or three (two) elements, and a perfect code is a hitting set that intersects every set at exactly one element. These questions are of fundamental interest in many contexts for general set systems. Just as for Hitting Set, we find these questions to be interesting for the case of families consisting of sets of size at most three. In this work, we initiate an algorithmic study of these problems in this special case, focusing on a parameterized analysis. We show, for each problem, efficient fixed-parameter tractable algorithms using search trees that are tailor-made to the constraints in question, and also polynomial kernels using sunflower-like arguments in a manner that accounts for equivalence under the additional parity constraints.
Resumo:
In this work, we study the well-known r-DIMENSIONAL k-MATCHING ((r, k)-DM), and r-SET k-PACKING ((r, k)-SP) problems. Given a universe U := U-1 ... U-r and an r-uniform family F subset of U-1 x ... x U-r, the (r, k)-DM problem asks if F admits a collection of k mutually disjoint sets. Given a universe U and an r-uniform family F subset of 2(U), the (r, k)-SP problem asks if F admits a collection of k mutually disjoint sets. We employ techniques based on dynamic programming and representative families. This leads to a deterministic algorithm with running time O(2.851((r-1)k) .vertical bar F vertical bar. n log(2)n . logW) for the weighted version of (r, k)-DM, where W is the maximum weight in the input, and a deterministic algorithm with running time O(2.851((r-0.5501)k).vertical bar F vertical bar.n log(2) n . logW) for the weighted version of (r, k)-SP. Thus, we significantly improve the previous best known deterministic running times for (r, k)-DM and (r, k)-SP and the previous best known running times for their weighted versions. We rely on structural properties of (r, k)-DM and (r, k)-SP to develop algorithms that are faster than those that can be obtained by a standard use of representative sets. Incorporating the principles of iterative expansion, we obtain a better algorithm for (3, k)-DM, running in time O(2.004(3k).vertical bar F vertical bar . n log(2)n). We believe that this algorithm demonstrates an interesting application of representative families in conjunction with more traditional techniques. Furthermore, we present kernels of size O(e(r)r(k-1)(r) logW) for the weighted versions of (r, k)-DM and (r, k)-SP, improving the previous best known kernels of size O(r!r(k-1)(r) logW) for these problems.
Resumo:
Clock synchronization is highly desirable in distributed systems, including many applications in the Internet of Things and Humans. It improves the efficiency, modularity, and scalability of the system, and optimizes use of event triggers. For IoTH, BLE - a subset of the recent Bluetooth v4.0 stack - provides a low-power and loosely coupled mechanism for sensor data collection with ubiquitous units (e.g., smartphones and tablets) carried by humans. This fundamental design paradigm of BLE is enabled by a range of broadcast advertising modes. While its operational benefits are numerous, the lack of a common time reference in the broadcast mode of BLE has been a fundamental limitation. This article presents and describes CheepSync, a time synchronization service for BLE advertisers, especially tailored for applications requiring high time precision on resource constrained BLE platforms. Designed on top of the existing Bluetooth v4.0 standard, the CheepSync framework utilizes low-level time-stamping and comprehensive error compensation mechanisms for overcoming uncertainties in message transmission, clock drift, and other system-specific constraints. CheepSync was implemented on custom designed nRF24Cheep beacon platforms (as broadcasters) and commercial off-the-shelf Android ported smartphones (as passive listeners). We demonstrate the efficacy of CheepSync by numerous empirical evaluations in a variety of experimental setups, and show that its average (single-hop) time synchronization accuracy is in the 10 mu s range.