20 resultados para Parallel processing (Electronic computers) - Research


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Large-grain synchronous dataflow graphs or multi-rate graphs have the distinct feature that the nodes of the dataflow graph fire at different rates. Such multi-rate large-grain dataflow graphs have been widely regarded as a powerful programming model for DSP applications. In this paper we propose a method to minimize buffer storage requirement in constructing rate-optimal compile-time (MBRO) schedules for multi-rate dataflow graphs. We demonstrate that the constraints to minimize buffer storage while executing at the optimal computation rate (i.e. the maximum possible computation rate without storage constraints) can be formulated as a unified linear programming problem in our framework. A novel feature of our method is that in constructing the rate-optimal schedule, it directly minimizes the memory requirement by choosing the schedule time of nodes appropriately. Lastly, a new circular-arc interval graph coloring algorithm has been proposed to further reduce the memory requirement by allowing buffer sharing among the arcs of the multi-rate dataflow graph. We have constructed an experimental testbed which implements our MBRO scheduling algorithm as well as (i) the widely used periodic admissible parallel schedules (also known as block schedules) proposed by Lee and Messerschmitt (IEEE Transactions on Computers, vol. 36, no. 1, 1987, pp. 24-35), (ii) the optimal scheduling buffer allocation (OSBA) algorithm of Ning and Gao (Conference Record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Charleston, SC, Jan. 10-13, 1993, pp. 29-42), and (iii) the multi-rate software pipelining (MRSP) algorithm (Govindarajan and Gao, in Proceedings of the 1993 International Conference on Application Specific Array Processors, Venice, Italy, Oct. 25-27, 1993, pp. 77-88). Schedules generated for a number of random dataflow graphs and for a set of DSP application programs using the different scheduling methods are compared. The experimental results have demonstrated a significant improvement (10-20%) in buffer requirements for the MBRO schedules compared to the schedules generated by the other three methods, without sacrificing the computation rate. The MBRO method also gives a 20% average improvement in computation rate compared to Lee's Block scheduling method.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Workstation clusters equipped with high performance interconnect having programmable network processors facilitate interesting opportunities to enhance the performance of parallel application run on them. In this paper, we propose schemes where certain application level processing in parallel database query execution is performed on the network processor. We evaluate the performance of TPC-H queries executing on a high end cluster where all tuple processing is done on the host processor, using a timed Petri net model, and find that tuple processing costs on the host processor dominate the execution time. These results are validated using a small cluster. We therefore propose 4 schemes where certain tuple processing activity is offloaded to the network processor. The first 2 schemes offload the tuple splitting activity - computation to identify the node on which to process the tuples, resulting in an execution time speedup of 1.09 relative to the base scheme, but with I/O bus becoming the bottleneck resource. In the 3rd scheme in addition to offloading tuple processing activity, the disk and network interface are combined to avoid the I/O bus bottleneck, which results in speedups up to 1.16, but with high host processor utilization. Our 4th scheme where the network processor also performs apart of join operation along with the host processor, gives a speedup of 1.47 along with balanced system resource utilizations. Further we observe that the proposed schemes perform equally well even in a scaled architecture i.e., when the number of processors is increased from 2 to 64

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Many common activities, like reading, scanning scenes, or searching for an inconspicuous item in a cluttered environment, entail serial movements of the eyes that shift the gaze from one object to another. Previous studies have shown that the primate brain is capable of programming sequential saccadic eye movements in parallel. Given that the onset of saccades directed to a target are unpredictable in individual trials, what prevents a saccade during parallel programming from being executed in the direction of the second target before execution of another saccade in the direction of the first target remains unclear. Using a computational model, here we demonstrate that sequential saccades inhibit each other and share the brain's limited processing resources (capacity) so that the planning of a saccade in the direction of the first target always finishes first. In this framework, the latency of a saccade increases linearly with the fraction of capacity allocated to the other saccade in the sequence, and exponentially with the duration of capacity sharing. Our study establishes a link between the dual-task paradigm and the ramp-to-threshold model of response time to identify a physiologically viable mechanism that preserves the serial order of saccades without compromising the speed of performance.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Exploring future cathode materials for sodium-ion batteries, alluaudite class of Na2Fe2II(SO4)(3) has been recently unveiled as a 3.8 V positive insertion candidate (Barpanda et al. Nat. Commun. 2014, 5, 4358). It forms an Fe-based polyanionic compound delivering the highest Fe-redox potential along with excellent rate kinetics and reversibility. However, like all known SO4-based insertion materials, its synthesis is cumbersome that warrants careful processing avoiding any aqueous exposure. Here, an alternate low temperature ionothermal synthesis has been described to produce the alluaudite Na2+2xFe2-xII(SO4)(3). It marks the first demonstration of solvothermal synthesis of alluaudite Na2+2xM2-xII(SO4)(3) (M = 3d metals) family of cathodes. Unlike classical solid-state route, this solvothermal route favors sustainable synthesis of homogeneous nanostructured alluaudite products at only 300 degrees C, the lowest temperature value until date. The current work reports the synthetic aspects of pristine and modified ionothermal synthesis of Na2+2xFe2-xII(SO4)(3) having tunable size (300 nm similar to 5 mu m) and morphology. It shows antiferromagnetic ordering below 12 K. A reversible capacity in excess of 80 mAh/g was obtained with good rate kinetics and cycling stability over 50 cycles. Using a synergistic approach combining experimental and ab initio DFT analysis, the structural, magnetic, electronic, and electrochemical properties and the structural limitation to extract full capacity have been described.