Biblioteca Digital

15 resultados para iterative algorithm

em Boston University Digital Common

Mermera: Non-Coherent Distributed Shared Memory for Parallel Computing

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The proliferation of inexpensive workstations and networks has prompted several researchers to use such distributed systems for parallel computing. Attempts have been made to offer a shared-memory programming model on such distributed memory computers. Most systems provide a shared-memory that is coherent in that all processes that use it agree on the order of all memory events. This dissertation explores the possibility of a significant improvement in the performance of some applications when they use non-coherent memory. First, a new formal model to describe existing non-coherent memories is developed. I use this model to prove that certain problems can be solved using asynchronous iterative algorithms on shared-memory in which the coherence constraints are substantially relaxed. In the course of the development of the model I discovered a new type of non-coherent behavior called Local Consistency. Second, a programming model, Mermera, is proposed. It provides programmers with a choice of hierarchically related non-coherent behaviors along with one coherent behavior. Thus, one can trade-off the ease of programming with coherent memory for improved performance with non-coherent memory. As an example, I present a program to solve a linear system of equations using an asynchronous iterative algorithm. This program uses all the behaviors offered by Mermera. Third, I describe the implementation of Mermera on a BBN Butterfly TC2000 and on a network of workstations. The performance of a version of the equation solving program that uses all the behaviors of Mermera is compared with that of a version that uses coherent behavior only. For a system of 1000 equations the former exhibits at least a 5-fold improvement in convergence time over the latter. The version using coherent behavior only does not benefit from employing more than one workstation to solve the problem while the program using non-coherent behavior continues to achieve improved performance as the number of workstations is increased from 1 to 6. This measurement corroborates our belief that non-coherent shared memory can be a performance boon for some applications.

On the Convergence of Statistical Techniques for Inferring Network Traffic Demands

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Accurate knowledge of traffic demands in a communication network enables or enhances a variety of traffic engineering and network management tasks of paramount importance for operational networks. Directly measuring a complete set of these demands is prohibitively expensive because of the huge amounts of data that must be collected and the performance impact that such measurements would impose on the regular behavior of the network. As a consequence, we must rely on statistical techniques to produce estimates of actual traffic demands from partial information. The performance of such techniques is however limited due to their reliance on limited information and the high amount of computations they incur, which limits their convergence behavior. In this paper we study strategies to improve the convergence of a powerful statistical technique based on an Expectation-Maximization iterative algorithm. First we analyze modeling approaches to generating starting points. We call these starting points informed priors since they are obtained using actual network information such as packet traces and SNMP link counts. Second we provide a very fast variant of the EM algorithm which extends its computation range, increasing its accuracy and decreasing its dependence on the quality of the starting point. Finally, we study the convergence characteristics of our EM algorithm and compare it against a recently proposed Weighted Least Squares approach.

Iterative Linearized Density Matrix Propagation for Modeling Coherent Energy Transfer in Photosynthetic Light Harvesting

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present results of calculations [1] that employ a new mixed quantum classical iterative density matrix propagation approach (ILDM , or so called Is‐Landmap) [2] to explore the survival of coherence in different photo synthetic models. Our model studies confirm the long lived quantum coherence , while conventional theoretical tools (such as Redfield equation) fail to describe these phenomenon [3,4]. Our ILDM method is a numerical exactly propagation scheme and can be served as a bench mark calculation tools[2]. Result get from ILDM and from other recent methods have been compared and show agreement with each other[4,5]. Long lived coherence plateau has been attribute to the shift of harmonic potential due to the system bath interaction, and the harvesting efficiency is a balance between the coherence and dissipation[1]. We use this approach to investigate the excitation energy transfer dynamics in various light harvesting complex include Fenna‐Matthews‐Olsen light harvesting complex[1] and Cryptophyte Phycocyanin 645 [6]. [1] P.Huo and D.F.Coker ,J. Chem. Phys. 133, 184108 (2010) . [2] E.R. Dunkel, S. Bonella, and D.F. Coker, J. Chem. Phys. 129, 114106 (2008). [3] A. Ishizaki and G.R. Fleming, J. Chem. Phys. 130, 234111 (2009). [4] A. Ishizaki and G.R. Fleming, Proc. Natl. Acad. Sci. 106, 17255 (2009). [5] G. Tao and W.H. Miller, J. Phys. Chem. Lett. 1, 891 (2010). [6] P.Huo and D.F.Coker in preparation

Mapping Parallel Iterative Algorithms onto Workstation Networks

Relevância:

20.00% 20.00%

Publicador:

Resumo:

For communication-intensive parallel applications, the maximum degree of concurrency achievable is limited by the communication throughput made available by the network. In previous work [HPS94], we showed experimentally that the performance of certain parallel applications running on a workstation network can be improved significantly if a congestion control protocol is used to enhance network performance. In this paper, we characterize and analyze the communication requirements of a large class of supercomputing applications that fall under the category of fixed-point problems, amenable to solution by parallel iterative methods. This results in a set of interface and architectural features sufficient for the efficient implementation of the applications over a large-scale distributed system. In particular, we propose a direct link between the application and network layer, supporting congestion control actions at both ends. This in turn enhances the system's responsiveness to network congestion, improving performance. Measurements are given showing the efficacy of our scheme to support large-scale parallel computations.

A Hybrid GLR Algorithm for Parsing with Epsilon Grammars

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We give a hybrid algorithm for parsing epsilon grammars based on Tomita's non-ϵ-grammar parsing algorithm ([Tom86]) and Nozohoor-Farshi's ϵ-grammar recognition algorithm ([NF91]). The hybrid parser handles the same set of grammars handled by Nozohoor-Farshi's recognizer. The algorithm's details and an example of its use are given. We also discuss the deployment of the hybrid algorithm within a GB parser, and the reason an ϵ grammar parser is needed in our GB parser.

A Direct Algorithm for the Type Interference in the Rank 2 Fragment of the Second--Order λ-Calculus

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We study the problem of type inference for a family of polymorphic type disciplines containing the power of Core-ML. This family comprises all levels of the stratification of the second-order lambda-calculus by "rank" of types. We show that typability is an undecidable problem at every rank k ≥ 3 of this stratification. While it was already known that typability is decidable at rank ≤ 2, no direct and easy-to-implement algorithm was available. To design such an algorithm, we develop a new notion of reduction and show how to use it to reduce the problem of typability at rank 2 to the problem of acyclic semi-unification. A by-product of our analysis is the publication of a simple solution procedure for acyclic semi-unification.

A Lower-Bound Result on the Power of a Genetic Algorithm

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a lower-bound result on the computational power of a genetic algorithm in the context of combinatorial optimization. We describe a new genetic algorithm, the merged genetic algorithm, and prove that for the class of monotonic functions, the algorithm finds the optimal solution, and does so with an exponential convergence rate. The analysis pertains to the ideal behavior of the algorithm where the main task reduces to showing convergence of probability distributions over the search space of combinatorial structures to the optimal one. We take exponential convergence to be indicative of efficient solvability for the sample-bounded algorithm, although a sampling theory is needed to better relate the limit behavior to actual behavior. The paper concludes with a discussion of some immediate problems that lie ahead.

An Algorithm for Inferring Quasi-Static Types

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This report presents an algorithm, and its implementation, for doing type inference in the context of Quasi-Static Typing (QST) ["Quasy-static Typing." Satish Thatte Proc. ACM Symp. on Principles of Programming Languages, 1988]. The package infers types a la "QST" for the simply typed λ-calculus.

A Two-step Statistical Approach for Inferring Network Traffic Demands (Revises Technical Report BUCS-2003-003)

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Accurate knowledge of traffic demands in a communication network enables or enhances a variety of traffic engineering and network management tasks of paramount importance for operational networks. Directly measuring a complete set of these demands is prohibitively expensive because of the huge amounts of data that must be collected and the performance impact that such measurements would impose on the regular behavior of the network. As a consequence, we must rely on statistical techniques to produce estimates of actual traffic demands from partial information. The performance of such techniques is however limited due to their reliance on limited information and the high amount of computations they incur, which limits their convergence behavior. In this paper we study a two-step approach for inferring network traffic demands. First we elaborate and evaluate a modeling approach for generating good starting points to be fed to iterative statistical inference techniques. We call these starting points informed priors since they are obtained using actual network information such as packet traces and SNMP link counts. Second we provide a very fast variant of the EM algorithm which extends its computation range, increasing its accuracy and decreasing its dependence on the quality of the starting point. Finally, we evaluate and compare alternative mechanisms for generating starting points and the convergence characteristics of our EM algorithm against a recently proposed Weighted Least Squares approach.

An Online Distributed Algorithm for Inferring Policy Routing Configurations

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present an online distributed algorithm, the Causation Logging Algorithm (CLA), in which Autonomous Systems (ASes) in the Internet individually report route oscillations/flaps they experience to a central Internet Routing Registry (IRR). The IRR aggregates these reports and may observe what we call causation chains where each node on the chain caused a route flap at the next node along the chain. A chain may also have a causation cycle. The type of an observed causation chain/cycle allows the IRR to infer the underlying policy routing configuration (i.e., the system of economic relationships and constraints on route/path preferences). Our algorithm is based on a formal policy routing model that captures the propagation dynamics of route flaps under arbitrary changes in topology or path preferences. We derive invariant properties of causation chains/cycles for ASes which conform to economic relationships based on the popular Gao-Rexford model. The Gao-Rexford model is known to be safe in the sense that the system always converges to a stable set of paths under static conditions. Our CLA algorithm recovers the type/property of an observed causation chain of an underlying system and determines whether it conforms to the safe economic Gao-Rexford model. Causes for nonconformity can be diagnosed by comparing the properties of the causation chains with those predicted from different variants of the Gao-Rexford model.

GreedyDual* Web Caching Algorithm: Exploiting the Two Sources of Temporal Locality in Web Request Streams

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The relative importance of long-term popularity and short-term temporal correlation of references for Web cache replacement policies has not been studied thoroughly. This is partially due to the lack of accurate characterization of temporal locality that enables the identification of the relative strengths of these two sources of temporal locality in a reference stream. In [21], we have proposed such a metric and have shown that Web reference streams differ significantly in the prevalence of these two sources of temporal locality. These finding underscore the importance of a Web caching strategy that can adapt in a dynamic fashion to the prevalence of these two sources of temporal locality. In this paper, we propose a novel cache replacement algorithm, GreedyDual*, which is a generalization of GreedyDual-Size. GreedyDual* uses the metrics proposed in [21] to adjust the relative worth of long-term popularity versus short-term temporal correlation of references. Our trace-driven simulation experiments show the superior performance of GreedyDual* when compared to other Web cache replacement policies proposed in the literature.

Isoperimetric Partitioning: A New Algorithm for Graph Partitioning

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Temporal structure is skilled, fluent action exists at several nested levels. At the largest scale considered here, short sequences of actions that are planned collectively in prefronatal cortex appear to be queued for performance by a cyclic competitive process that operates in concert with a parallel analog representation that implicitly specifies the relative priority of elements of the sequence. At an intermediate scale, single acts, like reaching to grasp, depend on coordinated scaling of the rates at which many muscles shorten or lengthen in parallel. To ensure success of acts such as catching an approaching ball, such parallel rate scaling, which appears to be one function of the basal ganglia, must be coupled to perceptual variables such as time-to-contact. At a finer scale, within each act, desired rate scaling can be realized only if precisely timed muscle activations first accelerate and then decelerate the limbs, to ensure that muscle length changes do not under- or over- shoot the amounts needed for precise acts. Each context of action may require a different timed muscle activation pattern than similar contexts. Because context differences that require different treatment cannot be known in advance, a formidable adaptive engine-the cerebellum-is needed to amplify differences within, and continuosly search, a vast parallel signal flow, in order to discover contextual "leading indicators" of when to generate distinctive patterns of analog signals. From some parts of the cerebellum, such signals control muscles. But a recent model shows how the lateral cerebellum may serve the competitive queuing system (frontal cortex) as a repository of quickly accessed long-term sequence memories. Thus different parts of the cerebellum may use the same adaptive engine design to serve the lowest and highest of the three levels of temporal structure treated. If so, no one-to-one mapping exists between leveels of temporal structure and major parts of the brain. Finally, recent data cast doubt on network-delay models of cerebellar adaptive timing.

A Fast BCS/FCS Algorithm for Image Segmentation

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A fast and efficient segmentation algorithm based on the Boundary Contour System/Feature Contour System (BCS/FCS) of Grossberg and Mingolla [3] is presented. This implementation is based on the FFT algorithm and the parallelism of the system.

Fuzzy ART: An Adaptive Resonance Algorithm for Rapid, Stable Classification of Analog Patterns

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The Fuzzy ART system introduced herein incorporates computations from fuzzy set theory into ART 1. For example, the intersection (n) operator used in ART 1 learning is replaced by the MIN operator (A) of fuzzy set theory. Fuzzy ART reduces to ART 1 in response to binary input vectors, but can also learn stable categories in response to analog input vectors. In particular, the MIN operator reduces to the intersection operator in the binary case. Learning is stable because all adaptive weights can only decrease in time. A preprocessing step, called complement coding, uses on-cell and off-cell responses to prevent category proliferation. Complement coding normalizes input vectors while preserving the amplitudes of individual feature activations.

ART 2-A: An Adaptive Resonance Algorithm for Rapid Category Learning and Recognition

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This article introduces ART 2-A, an efficient algorithm that emulates the self-organizing pattern recognition and hypothesis testing properties of the ART 2 neural network architecture, but at a speed two to three orders of magnitude faster. Analysis and simulations show how the ART 2-A systems correspond to ART 2 dynamics at both the fast-learn limit and at intermediate learning rates. Intermediate learning rates permit fast commitment of category nodes but slow recoding, analogous to properties of word frequency effects, encoding specificity effects, and episodic memory. Better noise tolerance is hereby achieved without a loss of learning stability. The ART 2 and ART 2-A systems are contrasted with the leader algorithm. The speed of ART 2-A makes practical the use of ART 2 modules in large-scale neural computation.