204 results for scalable parallel programming


Relevance:

20.00% 20.00%

Publisher:

Abstract:

The unsteady flow of an incompressible viscous fluid between two parallel infinite disks separated by a distance $h(t^*)$ at time $t^*$ has been studied. The upper disk moves towards the lower disk with velocity $h'(t^*)$. The lower disk is porous and rotates with angular velocity $\Omega(t^*)$. A magnetic field $B(t^*)$ is applied perpendicular to the two disks. The governing Navier-Stokes equations reduce to a set of ordinary differential equations if $h(t^*)$, $\Omega(t^*)$ and $B(t^*)$ vary with time $t^*$ in a particular manner, i.e. $h(t^*) = H(1 - \alpha t^*)^{1/2}$, $\Omega(t^*) = \Omega_0(1 - \alpha t^*)^{-1}$, $B(t^*) = B_0(1 - \alpha t^*)^{-1/2}$. These ordinary differential equations have been solved numerically using a shooting method. For small Reynolds numbers, analytical solutions have been obtained using a regular perturbation technique. The effects of the squeeze Reynolds number, the Hartmann number and the rotation of the disk on the flow pattern, the normal force (load) and the torque have been studied in detail.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The elastodynamic response of a pair of parallel rigid strips embedded in an infinite orthotropic medium due to elastic waves incident normally on the strips has been investigated. The mixed boundary value problem has been solved by the Integral Equation method. The normal stress and the vertical displacement have been derived in closed form. Numerical values of stress intensity factors at inner and outer edges of the strips and vertical displacement at points in the plane of the strips for several orthotropic materials have been calculated and plotted graphically to show the effect of material orthotropy.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Combining the philosophies of nonlinear model predictive control and approximate dynamic programming, this paper presents a new suboptimal control design technique, named model predictive static programming (MPSP), which is applicable to finite-horizon nonlinear problems with terminal constraints. The technique is computationally efficient and hence can possibly be implemented online. The effectiveness of the proposed method is demonstrated by designing an ascent phase guidance scheme for a ballistic missile propelled by solid motors. A comparison study with a conventional gradient method shows that the MPSP solution is quite close to the optimal solution.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We consider the problem of deciding whether the output of a boolean circuit is determined by a partial assignment to its inputs. This problem is easily shown to be hard, i.e., co-NP-complete. However, many of the consequences of a partial input assignment may be determined in linear time, by iterating the following step: if we know the values of some inputs to a gate, we can deduce the values of some outputs of that gate. This process of iteratively deducing some of the consequences of a partial assignment is called propagation. This paper explores the parallel complexity of propagation, i.e., the complexity of determining whether the output of a given boolean circuit is determined by propagating a given partial input assignment. We give a complete classification of the problem into those cases that are P-complete and those that are unlikely to be P-complete.
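
The propagation step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's formulation: the gate kinds (`AND`, `OR`, `NOT`), the tuple layout and the wire names are assumptions, and the simple fixed-point loop is quadratic in the worst case (a worklist would be needed to reach linear time).

```python
def propagate(gates, assignment):
    """Return every wire value forced by forward propagation.

    gates: list of (kind, input_wires, output_wire); this layout is hypothetical.
    assignment: dict mapping some input wires to True/False (a partial assignment).
    """
    values = dict(assignment)
    changed = True
    while changed:  # repeat until no gate yields a new value
        changed = False
        for kind, inputs, out in gates:
            if out in values:
                continue
            known = [values[w] for w in inputs if w in values]
            result = None
            if kind == "AND":
                if False in known:
                    result = False  # one false input forces the output
                elif len(known) == len(inputs):
                    result = True
            elif kind == "OR":
                if True in known:
                    result = True  # one true input forces the output
                elif len(known) == len(inputs):
                    result = False
            elif kind == "NOT":
                if known:
                    result = not known[0]
            if result is not None:
                values[out] = result
                changed = True
    return values


# Example circuits (illustrative only).
circuit = [("AND", ["a", "b"], "c"), ("OR", ["c", "d"], "e")]
tautology = [("NOT", ["x"], "nx"), ("OR", ["x", "nx"], "y")]
```

In the `tautology` circuit the output `y` is always true, yet propagating the empty assignment deduces nothing. This is the gap between the hard "is the output determined" question and the cheaper propagation question.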

Relevance:

20.00% 20.00%

Publisher:

Abstract:

High-end network security applications demand high-speed operation and support for large rule sets. Packet classification is the core functionality that demands high throughput in such applications. This paper proposes a packet classification architecture to meet such high throughput. We have implemented a Firewall with this architecture in reconfigurable hardware. We propose an extension to the Distributed Crossproducting of Field Labels (DCFL) technique to achieve a scalable and high-performance architecture. The implemented Firewall takes advantage of the inherent structure and redundancy of the rule set by using our DCFL Extended (DCFLE) algorithm. The use of the DCFLE algorithm results in both speed and area improvement when it is implemented in hardware. Although we restrict ourselves to standard 5-tuple matching, the architecture supports additional fields. High-throughput classification invariably uses Ternary Content Addressable Memory (TCAM) for prefix matching, though TCAM fares poorly in terms of area and power efficiency. Use of TCAM for port range matching is expensive, as the range-to-prefix conversion results in a large number of prefixes, leading to storage inefficiency. Extended TCAM (ETCAM) is fast and the most storage-efficient solution for range matching. We present for the first time a reconfigurable hardware implementation of ETCAM. We have implemented our Firewall as an embedded system on a Virtex-II Pro FPGA based platform, running Linux with the packet classification in hardware. The Firewall was tested in real time with a 1 Gbps Ethernet link and 128 sample rules. The packet classification hardware uses a quarter of the logic resources and slightly over one third of the memory resources of the XC2VP30 FPGA. It achieves a maximum classification throughput of 50 million packets/s, corresponding to a 16 Gbps link rate for the worst-case packet size. The Firewall rule update involves only memory re-initialization in software without any hardware change.
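
To see why range-to-prefix conversion is costly in a plain TCAM, the sketch below splits an inclusive port range into aligned power-of-two blocks, each of which corresponds to one prefix entry. The function name and the `(value, prefix_length)` output format are assumptions made for illustration; this is not the paper's DCFLE or ETCAM logic.

```python
def range_to_prefixes(lo, hi, width=16):
    """Split the inclusive range [lo, hi] into (value, prefix_length) blocks.

    Each block is a power-of-two sized, aligned interval, i.e. one TCAM prefix.
    width is the field width in bits (16 for TCP/UDP ports).
    """
    prefixes = []
    while lo <= hi:
        # Largest block that is aligned at lo (the whole space when lo == 0).
        size = (lo & -lo) if lo else (1 << width)
        # Shrink the block until it fits inside the remaining range.
        while size > hi - lo + 1:
            size >>= 1
        prefix_length = width - (size.bit_length() - 1)
        prefixes.append((lo, prefix_length))
        lo += size
    return prefixes


# The range 1..65534 needs 30 prefixes for a 16-bit field,
# while "ports above 1023" (1024..65535) needs 6.
worst_case = range_to_prefixes(1, 65534)
above_1023 = range_to_prefixes(1024, 65535)
```

A rule with range constraints on both source and destination ports multiplies these counts, which is the storage blow-up the abstract refers to.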

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents the programming of an FPGA (Field Programmable Gate Array) to emulate the dynamics of DC machines. An FPGA allows high-speed real-time simulation with high precision. The described design includes a block diagram representation of the DC machine, which contains all arithmetic and logical operations. The real-time simulation of the machine in the FPGA is controlled through user interfaces: a keypad interface, an on-line LCD display and a digital-to-analog converter. This approach allows different electrical machines to be emulated by changing the parameters. A separately excited DC machine is implemented and experimental results are presented.
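
As a rough software analogue of such an emulator, the sketch below integrates the textbook two-state model of a separately excited DC machine with constant field, using fixed-step forward Euler. The model, the function name and all parameter values are assumptions chosen for illustration; the abstract does not give the equations or the numeric format used on the FPGA.

```python
def simulate_dc_machine(voltage, load_torque=0.0, t_end=5.0, dt=1e-3,
                        R=1.0, L=0.5, K=0.01, J=0.01, B=0.1):
    """Forward-Euler simulation of a DC machine with constant field.

    Assumed model (illustrative parameter values, SI units):
        L di/dt = voltage - R*i - K*w      (armature circuit)
        J dw/dt = K*i - B*w - load_torque  (mechanical side)
    Returns the armature current i and angular speed w at t_end.
    """
    current, speed = 0.0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        di = (voltage - R * current - K * speed) / L
        dw = (K * current - B * speed - load_torque) / J
        current += dt * di
        speed += dt * dw
    return current, speed


final_current, final_speed = simulate_dc_machine(voltage=1.0)
```

Changing `R`, `L`, `K`, `J` and `B` corresponds to emulating a different machine, which mirrors the parameter-driven approach the abstract describes.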

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graph representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data.
Results: The structure learning task is cast as a sparse linear regression problem, which is then posed as a LASSO (l1-constrained fitting) problem and finally solved by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs are assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisiae cell cycle transcript profiling data sets capture known regulatory associations. In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law, suggesting that its degree distribution is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification.
Conclusion: A statistically robust and computationally efficient LP-based method for estimating the topology of a large sparse undirected graph from high-dimensional data yields representations of genetic networks that are biologically plausible and useful abstractions of the structures of real genetic networks. Analysis of the statistical and topological properties of learned LP-SLGNs may have practical value; for example, genes with high random walk betweenness, a measure of the centrality of a node in a graph, are good candidates for intervention studies and hence integrated computational – experimental investigations designed to infer more realistic and sophisticated probabilistic directed graphical model representations of genetic networks. The LP-based solutions of the sparse linear regression problem described here may provide a method for learning the structure of transcription factor networks from transcript profiling and transcription factor binding motif data.
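
One way an l1-constrained fit can be written as a linear program is sketched below. It assumes an absolute-error loss, which is an assumption made here so that the problem is linear (with a squared-error loss the LASSO is a quadratic program); the abstract does not state the exact loss or constraints used. The symbols are hypothetical: $y \in \mathbb{R}^m$ is the profile of one target gene, $x_i \in \mathbb{R}^p$ holds the other genes' abundances in sample $i$, $t$ is the l1 budget, and the weight vector is split as $w = u - v$.

```latex
\begin{aligned}
\min_{u,\,v,\,\xi} \quad & \sum_{i=1}^{m} \xi_i \\
\text{s.t.} \quad & -\xi_i \;\le\; y_i - x_i^{\top}(u - v) \;\le\; \xi_i, \qquad i = 1, \dots, m, \\
& \sum_{j=1}^{p} (u_j + v_j) \;\le\; t, \\
& u \ge 0, \quad v \ge 0 .
\end{aligned}
```

Under this sketch, nonzero entries of $w = u - v$ would be read as edges between the target gene and the corresponding predictor genes, and repeating the fit for every gene gives the full graph.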

Relevance:

20.00% 20.00%

Publisher:

Abstract:

High-end network security applications demand high-speed operation and support for large rule sets. Packet classification is the core functionality that demands high throughput in such applications. This paper proposes a packet classification architecture to meet such high throughput. We have implemented a Firewall with this architecture in reconfigurable hardware. We propose an extension to the Distributed Crossproducting of Field Labels (DCFL) technique to achieve a scalable and high-performance architecture. The implemented Firewall takes advantage of the inherent structure and redundancy of the rule set by using our DCFL Extended (DCFLE) algorithm. The use of the DCFLE algorithm results in both speed and area improvement when it is implemented in hardware. Although we restrict ourselves to standard 5-tuple matching, the architecture supports additional fields. High-throughput classification invariably uses Ternary Content Addressable Memory (TCAM) for prefix matching, though TCAM fares poorly in terms of area and power efficiency. Use of TCAM for port range matching is expensive, as the range-to-prefix conversion results in a large number of prefixes, leading to storage inefficiency. Extended TCAM (ETCAM) is fast and the most storage-efficient solution for range matching. We present for the first time a reconfigurable hardware implementation of ETCAM. We have implemented our Firewall as an embedded system on a Virtex-II Pro FPGA based platform, running Linux with the packet classification in hardware. The Firewall was tested in real time with a 1 Gbps Ethernet link and 128 sample rules. The packet classification hardware uses a quarter of the logic resources and slightly over one third of the memory resources of the XC2VP30 FPGA. It achieves a maximum classification throughput of 50 million packets/s, corresponding to a 16 Gbps link rate for the worst-case packet size. The Firewall rule update involves only memory re-initialization in software without any hardware change.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The paper presents two new algorithms for the direct parallel solution of systems of linear equations. The algorithms employ a novel recursive doubling technique to obtain solutions to an $n$th-order system in $n$ steps with no more than $2n(n-1)$ processors. Comparing their performance with the Gaussian elimination algorithm (GE), we show that they are almost 100% faster than the latter. This speedup is achieved by dispensing with all the computation involved in the back-substitution phase of GE. It is also shown that the new algorithms exhibit error characteristics which are superior to GE. An $n(n+1)$ systolic array structure is proposed for the implementation of the new algorithms. We show that complete solutions can be obtained, through these single-phase solution methods, in $5n - \log_2 n - 4$ computational steps, without the need for intermediate I/O operations.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper, the design and implementation of a single shared bus, shared memory multiprocessing system using Intel's single board computers is presented. The hardware configuration and the operating system developed to execute the parallel algorithms are discussed. The performance evaluation studies carried out on Image are outlined.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A new parallel algorithm for transforming an arithmetic infix expression into a parse tree is presented. The technique is based on a result due to Fischer (1980) which enables the construction of the parse tree by appropriately scanning the vector of precedence values associated with the elements of the expression. The algorithm presented here is suitable for execution on a shared memory model of an SIMD machine with no read/write conflicts permitted. It uses O(n) processors and has a time complexity of O(log2n) where n is the expression length. Parallel algorithms for generating code for an SIMD machine are also presented.
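
The idea of building a parse tree from a vector of precedence values can be illustrated sequentially. The sketch below is not the paper's parallel algorithm, and whether it matches Fischer's construction in detail is an assumption. It handles only binary, left-associative operators, and the operator table, `DEPTH_WEIGHT` and function names are hypothetical.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}  # assumed operator set
DEPTH_WEIGHT = 10  # assumed; must exceed the largest base precedence


def precedence_vector(tokens):
    """Operators get base precedence plus a bonus per enclosing parenthesis;
    operands and parentheses get None."""
    depth, vector = 0, []
    for tok in tokens:
        if tok == "(":
            depth += 1
            vector.append(None)
        elif tok == ")":
            depth -= 1
            vector.append(None)
        elif tok in PRECEDENCE:
            vector.append(PRECEDENCE[tok] + DEPTH_WEIGHT * depth)
        else:
            vector.append(None)
    return vector


def build(tokens, vector, lo, hi):
    """Tree for tokens[lo:hi]; the root is the rightmost operator with the
    lowest precedence value in that span."""
    best = None
    for i in range(lo, hi):
        if vector[i] is not None and (best is None or vector[i] <= vector[best]):
            best = i
    if best is None:
        # No operator left: the span is a single operand, possibly parenthesized.
        return next(t for t in tokens[lo:hi] if t not in ("(", ")"))
    left = build(tokens, vector, lo, best)
    right = build(tokens, vector, best + 1, hi)
    return (tokens[best], left, right)


def parse(tokens):
    """Return a nested tuple (operator, left, right) for a token list."""
    return build(tokens, precedence_vector(tokens), 0, len(tokens))


tree = parse(["(", "a", "+", "b", ")", "*", "c"])
```

Each subtree root depends only on a minimum over a contiguous span of the precedence vector, which is the kind of operation that lends itself to parallel evaluation.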

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Abstract is not available.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Mesh topologies are important for large-scale peer-to-peer systems that use low-power transceivers. The Quality of Service (QoS) in such systems is known to decrease as the scale increases. We present a scalable approach for dissemination that exploits all the shortest paths between a pair of nodes and improves the QoS. Despite the presence of multiple shortest paths in a system, we show that these paths cannot be exploited by spreading the messages over the paths in a simple round-robin manner; nodes along one of these paths will always handle more messages than the nodes along the other paths. We characterize the set of shortest paths between a pair of nodes in regular mesh topologies and derive rules, using this characterization, to effectively spread the messages over all the available paths. These rules ensure that all the nodes that are at the same distance from the source handle roughly the same number of messages. By modeling the multihop propagation in the mesh topology as a multistage queuing network, we present simulation results from a variety of scenarios that include link failures and propagation irregularities to reflect real-world characteristics. Our method achieves improved QoS in all these scenarios.
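
The load imbalance mentioned above can be illustrated on a small example. The sketch assumes a rectangular 2-D grid with links between horizontal and vertical neighbours, a source at `(0, 0)` and a destination at `(m, n)`; the function name is hypothetical and the paper's actual spreading rules are not reproduced here. It computes the share of messages each node handles when every shortest path is used equally often.

```python
from math import comb


def uniform_path_load(m, n):
    """Share of messages per node when all shortest paths from (0, 0) to
    (m, n) on a 2-D grid are used equally often."""
    total = comb(m + n, m)  # number of shortest (monotone) paths
    return {
        # paths from the source to (i, j) times paths from (i, j) to the destination
        (i, j): comb(i + j, i) * comb(m - i + n - j, m - i) / total
        for i in range(m + 1)
        for j in range(n + 1)
    }


load = uniform_path_load(2, 2)
# Nodes at distance 2 from the source: (2, 0), (1, 1) and (0, 2).
distance_two = {node: share for node, share in load.items() if sum(node) == 2}
```

On this 2-by-2 example the central node `(1, 1)` lies on four of the six shortest paths while each corner node at the same distance lies on one, so equal use of the paths does not give equal load per node.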

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In the modern business environment, meeting due dates and avoiding delay penalties are very important goals that can be accomplished by minimizing total weighted tardiness. We consider a scheduling problem in a system of parallel processors with the objective of minimizing total weighted tardiness. Our aim in the present work is to develop an algorithm for the parallel processor problem that is more efficient than the heuristics available in the literature, and we propose an ant colony optimization (ACO) approach for this problem. Extensive experiments are conducted to evaluate the performance of the ACO approach on different problem sizes with varied tardiness factors. Our experiments show that the proposed ant colony optimization algorithm gives promising results compared to the best of the available heuristics.
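
The objective named in this abstract can be made concrete with a small evaluator. The sketch assumes identical processors that all start at time zero and run jobs back to back; the data layout and function name are hypothetical, and the ACO search itself is not shown.

```python
def total_weighted_tardiness(schedule, jobs):
    """Sum of weight * max(0, completion_time - due_date) over all jobs.

    schedule: one list of job ids per processor, in processing order.
    jobs: job id -> (processing_time, due_date, weight).
    """
    total = 0
    for sequence in schedule:
        clock = 0  # each processor starts at time zero
        for job in sequence:
            processing_time, due_date, weight = jobs[job]
            clock += processing_time
            total += weight * max(0, clock - due_date)
    return total


# Illustrative instance with three jobs.
jobs = {"A": (3, 4, 2), "B": (2, 2, 1), "C": (4, 5, 3)}
two_processors = total_weighted_tardiness([["B", "A"], ["C"]], jobs)
one_processor = total_weighted_tardiness([["A", "B", "C"]], jobs)
```

A search heuristic such as ACO would call an evaluator like this on each candidate assignment and sequence, keeping the schedules with the lowest value.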

Relevance:

20.00% 20.00%

Publisher:

Abstract:

It is shown that the conclusions arrived at regarding the instability of an incompressible fluid cylinder in the presence of the magnetic field and the streaming velocity in a recent communication easily follow from the study of propagation characteristics of Alfvén surface waves along cylindrical plasma columns made earlier.