15 resultados para iterative multitier ensembles

em Boston University Digital Common


Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present results of calculations [1] that employ a new mixed quantum classical iterative density matrix propagation approach (ILDM , or so called Is‐Landmap) [2] to explore the survival of coherence in different photo synthetic models. Our model studies confirm the long lived quantum coherence , while conventional theoretical tools (such as Redfield equation) fail to describe these phenomenon [3,4]. Our ILDM method is a numerical exactly propagation scheme and can be served as a bench mark calculation tools[2]. Result get from ILDM and from other recent methods have been compared and show agreement with each other[4,5]. Long lived coherence plateau has been attribute to the shift of harmonic potential due to the system bath interaction, and the harvesting efficiency is a balance between the coherence and dissipation[1]. We use this approach to investigate the excitation energy transfer dynamics in various light harvesting complex include Fenna‐Matthews‐Olsen light harvesting complex[1] and Cryptophyte Phycocyanin 645 [6]. [1] P.Huo and D.F.Coker ,J. Chem. Phys. 133, 184108 (2010) . [2] E.R. Dunkel, S. Bonella, and D.F. Coker, J. Chem. Phys. 129, 114106 (2008). [3] A. Ishizaki and G.R. Fleming, J. Chem. Phys. 130, 234111 (2009). [4] A. Ishizaki and G.R. Fleming, Proc. Natl. Acad. Sci. 106, 17255 (2009). [5] G. Tao and W.H. Miller, J. Phys. Chem. Lett. 1, 891 (2010). [6] P.Huo and D.F.Coker in preparation

Relevância:

20.00% 20.00%

Publicador:

Resumo:

For communication-intensive parallel applications, the maximum degree of concurrency achievable is limited by the communication throughput made available by the network. In previous work [HPS94], we showed experimentally that the performance of certain parallel applications running on a workstation network can be improved significantly if a congestion control protocol is used to enhance network performance. In this paper, we characterize and analyze the communication requirements of a large class of supercomputing applications that fall under the category of fixed-point problems, amenable to solution by parallel iterative methods. This results in a set of interface and architectural features sufficient for the efficient implementation of the applications over a large-scale distributed system. In particular, we propose a direct link between the application and network layer, supporting congestion control actions at both ends. This in turn enhances the system's responsiveness to network congestion, improving performance. Measurements are given showing the efficacy of our scheme to support large-scale parallel computations.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The proliferation of inexpensive workstations and networks has prompted several researchers to use such distributed systems for parallel computing. Attempts have been made to offer a shared-memory programming model on such distributed memory computers. Most systems provide a shared-memory that is coherent in that all processes that use it agree on the order of all memory events. This dissertation explores the possibility of a significant improvement in the performance of some applications when they use non-coherent memory. First, a new formal model to describe existing non-coherent memories is developed. I use this model to prove that certain problems can be solved using asynchronous iterative algorithms on shared-memory in which the coherence constraints are substantially relaxed. In the course of the development of the model I discovered a new type of non-coherent behavior called Local Consistency. Second, a programming model, Mermera, is proposed. It provides programmers with a choice of hierarchically related non-coherent behaviors along with one coherent behavior. Thus, one can trade-off the ease of programming with coherent memory for improved performance with non-coherent memory. As an example, I present a program to solve a linear system of equations using an asynchronous iterative algorithm. This program uses all the behaviors offered by Mermera. Third, I describe the implementation of Mermera on a BBN Butterfly TC2000 and on a network of workstations. The performance of a version of the equation solving program that uses all the behaviors of Mermera is compared with that of a version that uses coherent behavior only. For a system of 1000 equations the former exhibits at least a 5-fold improvement in convergence time over the latter. The version using coherent behavior only does not benefit from employing more than one workstation to solve the problem while the program using non-coherent behavior continues to achieve improved performance as the number of workstations is increased from 1 to 6. This measurement corroborates our belief that non-coherent shared memory can be a performance boon for some applications.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Coherent shared memory is a convenient, but inefficient, method of inter-process communication for parallel programs. By contrast, message passing can be less convenient, but more efficient. To get the benefits of both models, several non-coherent memory behaviors have recently been proposed in the literature. We present an implementation of Mermera, a shared memory system that supports both coherent and non-coherent behaviors in a manner that enables programmers to mix multiple behaviors in the same program[HS93]. A programmer can debug a Mermera program using coherent memory, and then improve its performance by selectively reducing the level of coherence in the parts that are critical to performance. Mermera permits a trade-off of coherence for performance. We analyze this trade-off through measurements of our implementation, and by an example that illustrates the style of programming needed to exploit non-coherence. We find that, even on a small network of workstations, the performance advantage of non-coherence is compelling. Raw non-coherent memory operations perform 20-40~times better than non-coherent memory operations. An example application program is shown to run 5-11~times faster when permitted to exploit non-coherence. We conclude by commenting on our use of the Isis Toolkit of multicast protocols in implementing Mermera.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Parallel computing on a network of workstations can saturate the communication network, leading to excessive message delays and consequently poor application performance. We examine empirically the consequences of integrating a flow control protocol, called Warp control [Par93], into Mermera, a software shared memory system that supports parallel computing on distributed systems [HS93]. For an asynchronous iterative program that solves a system of linear equations, our measurements show that Warp succeeds in stabilizing the network's behavior even under high levels of contention. As a result, the application achieves a higher effective communication throughput, and a reduced completion time. In some cases, however, Warp control does not achieve the performance attainable by fixed size buffering when using a statically optimal buffer size. Our use of Warp to regulate the allocation of network bandwidth emphasizes the possibility for integrating it with the allocation of other resources, such as CPU cycles and disk bandwidth, so as to optimize overall system throughput, and enable fully-shared execution of parallel programs.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Accurate knowledge of traffic demands in a communication network enables or enhances a variety of traffic engineering and network management tasks of paramount importance for operational networks. Directly measuring a complete set of these demands is prohibitively expensive because of the huge amounts of data that must be collected and the performance impact that such measurements would impose on the regular behavior of the network. As a consequence, we must rely on statistical techniques to produce estimates of actual traffic demands from partial information. The performance of such techniques is however limited due to their reliance on limited information and the high amount of computations they incur, which limits their convergence behavior. In this paper we study strategies to improve the convergence of a powerful statistical technique based on an Expectation-Maximization iterative algorithm. First we analyze modeling approaches to generating starting points. We call these starting points informed priors since they are obtained using actual network information such as packet traces and SNMP link counts. Second we provide a very fast variant of the EM algorithm which extends its computation range, increasing its accuracy and decreasing its dependence on the quality of the starting point. Finally, we study the convergence characteristics of our EM algorithm and compare it against a recently proposed Weighted Least Squares approach.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

An iterative method for reconstructing a 3D polygonal mesh and color texture map from multiple views of an object is presented. In each iteration, the method first estimates a texture map given the current shape estimate. The texture map and its associated residual error image are obtained via maximum a posteriori estimation and reprojection of the multiple views into texture space. Next, the surface shape is adjusted to minimize residual error in texture space. The surface is deformed towards a photometrically-consistent solution via a series of 1D epipolar searches at randomly selected surface points. The texture space formulation has improved computational complexity over standard image-based error approaches, and allows computation of the reprojection error and uncertainty for any point on the surface. Moreover, shape adjustments can be constrained such that the recovered model's silhouette matches those of the input images. Experiments with real world imagery demonstrate the validity of the approach.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Accurate knowledge of traffic demands in a communication network enables or enhances a variety of traffic engineering and network management tasks of paramount importance for operational networks. Directly measuring a complete set of these demands is prohibitively expensive because of the huge amounts of data that must be collected and the performance impact that such measurements would impose on the regular behavior of the network. As a consequence, we must rely on statistical techniques to produce estimates of actual traffic demands from partial information. The performance of such techniques is however limited due to their reliance on limited information and the high amount of computations they incur, which limits their convergence behavior. In this paper we study a two-step approach for inferring network traffic demands. First we elaborate and evaluate a modeling approach for generating good starting points to be fed to iterative statistical inference techniques. We call these starting points informed priors since they are obtained using actual network information such as packet traces and SNMP link counts. Second we provide a very fast variant of the EM algorithm which extends its computation range, increasing its accuracy and decreasing its dependence on the quality of the starting point. Finally, we evaluate and compare alternative mechanisms for generating starting points and the convergence characteristics of our EM algorithm against a recently proposed Weighted Least Squares approach.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Programmers of parallel processes that communicate through shared globally distributed data structures (DDS) face a difficult choice. Either they must explicitly program DDS management, by partitioning or replicating it over multiple distributed memory modules, or be content with a high latency coherent (sequentially consistent) memory abstraction that hides the DDS' distribution. We present Mermera, a new formalism and system that enable a smooth spectrum of noncoherent shared memory behaviors to coexist between the above two extremes. Our approach allows us to define known noncoherent memories in a new simple way, to identify new memory behaviors, and to characterize generic mixed-behavior computations. The latter are useful for programming using multiple behaviors that complement each others' advantages. On the practical side, we show that the large class of programs that use asynchronous iterative methods (AIM) can run correctly on slow memory, one of the weakest, and hence most efficient and fault-tolerant, noncoherence conditions. An example AIM program to solve linear equations, is developed to illustrate: (1) the need for concurrently mixing memory behaviors, and, (2) the performance gains attainable via noncoherence. Other program classes tolerate weak memory consistency by synchronizing in such a way as to yield executions indistinguishable from coherent ones. AIM computations on noncoherent memory yield noncoherent, yet correct, computations. We report performance data that exemplifies the potential benefits of noncoherence, in terms of raw memory performance, as well as application speed.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We developed an automated system that registers chest CT scans temporally. Our registration method matches corresponding anatomical landmarks to obtain initial registration parameters. The initial point-to-point registration is then generalized to an iterative surface-to-surface registration method. Our "goodness-of-fit" measure is evaluated at each step in the iterative scheme until the registration performance is sufficient. We applied our method to register the 3D lung surfaces of 11 pairs of chest CT scans and report promising registration performance.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A method for reconstructing 3D rational B-spline surfaces from multiple views is proposed. The method takes advantage of the projective invariance properties of rational B-splines. Given feature correspondences in multiple views, the 3D surface is reconstructed via a four step framework. First, corresponding features in each view are given an initial surface parameter value (s; t), and a 2D B-spline is fitted in each view. After this initialization, an iterative minimization procedure alternates between updating the 2D B-spline control points and re-estimating each feature's (s; t). Next, a non-linear minimization method is used to upgrade the 2D B-splines to 2D rational B-splines, and obtain a better fit. Finally, a factorization method is used to reconstruct the 3D B-spline surface given 2D B-splines in each view. This surface recovery method can be applied in both the perspective and orthographic case. The orthographic case allows the use of additional constraints in the recovery. Experiments with real and synthetic imagery demonstrate the efficacy of the approach for the orthographic case.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A method for reconstruction of 3D rational B-spline surfaces from multiple views is proposed. Given corresponding features in multiple views, though not necessarily visible in all views, the surface is reconstructed. First 2D B-spline patches are fitted to each view. The 3D B-splines and projection matricies can then be extracted from the 2D B-splines using factorization methods. The surface fit is then further refined via an iterative procedure. Finally, a hierarchal fitting scheme is proposed to allow modeling of complex surfaces by means of knot insertion. Experiments with real imagery demonstrate the efficacy of the approach.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The effectiveness of service provisioning in largescale networks is highly dependent on the number and location of service facilities deployed at various hosts. The classical, centralized approach to determining the latter would amount to formulating and solving the uncapacitated k-median (UKM) problem (if the requested number of facilities is fixed), or the uncapacitated facility location (UFL) problem (if the number of facilities is also to be optimized). Clearly, such centralized approaches require knowledge of global topological and demand information, and thus do not scale and are not practical for large networks. The key question posed and answered in this paper is the following: "How can we determine in a distributed and scalable manner the number and location of service facilities?" We propose an innovative approach in which topology and demand information is limited to neighborhoods, or balls of small radius around selected facilities, whereas demand information is captured implicitly for the remaining (remote) clients outside these neighborhoods, by mapping them to clients on the edge of the neighborhood; the ball radius regulates the trade-off between scalability and performance. We develop a scalable, distributed approach that answers our key question through an iterative reoptimization of the location and the number of facilities within such balls. We show that even for small values of the radius (1 or 2), our distributed approach achieves performance under various synthetic and real Internet topologies that is comparable to that of optimal, centralized approaches requiring full topology and demand information.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We present a thorough characterization of the access patterns in blogspace -- a fast-growing constituent of the content available through the Internet -- which comprises a rich interconnected web of blog postings and comments by an increasingly prominent user community that collectively define what has become known as the blogosphere. Our characterization of over 35 million read, write, and administrative requests spanning a 28-day period is done from three different blogosphere perspectives. The server view characterizes the aggregate access patterns of all users to all blogs; the user view characterizes how individual users interact with blogosphere objects (blogs); the object view characterizes how individual blogs are accessed. Our findings support two important conclusions. First, we show that the nature of interactions between users and objects is fundamentally different in blogspace than that observed in traditional web content. Access to objects in blogspace could be conceived as part of an interaction between an author and its readership. As we show in our work, such interactions range from one-to-many "broadcast-type" and many-to-one "registration-type" communication between an author and its readers, to multi-way, iterative "parlor-type" dialogues among members of an interest group. This more-interactive nature of the blogosphere leads to interesting traffic and communication patterns, which are different from those observed in traditional web content. Second, we identify and characterize novel features of the blogosphere workload, and we investigate the similarities and differences between typical web server workloads and blogosphere server workloads. Given the increasing share of blogspace traffic, understanding such differences is important for capacity planning and traffic engineering purposes, for example.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A growing wave of behavioral studies, using a wide variety of paradigms that were introduced or greatly refined in recent years, has generated a new wealth of parametric observations about serial order behavior. What was a mere trickle of neurophysiological studies has grown to a more steady stream of probes of neural sites and mechanisms underlying sequential behavior. Moreover, simulation models of serial behavior generation have begun to open a channel to link cellular dynamics with cognitive and behavioral dynamics. Here we summarize the major results from prominent sequence learning and performance tasks, namely immediate serial recall, typing, 2XN, discrete sequence production, and serial reaction time. These populate a continuum from higher to lower degrees of internal control of sequential organization. The main movement classes covered are speech and keypressing, both involving small amplitude movements that are very amenable to parametric study. A brief synopsis of classes of serial order models, vis-à-vis the detailing of major effects found in the behavioral data, leads to a focus on competitive queuing (CQ) models. Recently, the many behavioral predictive successes of CQ models have been joined by successful prediction of distinctively patterend electrophysiological recordings in prefrontal cortex, wherein parallel activation dynamics of multiple neural ensembles strikingly matches the parallel dynamics predicted by CQ theory. An extended CQ simulation model-the N-STREAMS neural network model-is then examined to highlight issues in ongoing attemptes to accomodate a broader range of behavioral and neurophysiological data within a CQ-consistent theory. Important contemporary issues such as the nature of working memory representations for sequential behavior, and the development and role of chunks in hierarchial control are prominent throughout.