130 results for parallel architecture

in CentAUR: Central Archive University of Reading - UK


Relevance:

100.00%

Publisher:

Abstract:

The design space of emerging heterogeneous multi-core architectures with a reconfigurability element makes it feasible to design mixed fine-grained and coarse-grained parallel architectures. This paper presents a hierarchical composite array design which extends the current design space of regular array design by combining a sequence of transformations. This technique is applied to derive a new design of a pipelined parallel regular array with different dataflow between phases of computation.

Relevance:

100.00%

Publisher:

Abstract:

We propose a bridge between two important parallel programming paradigms: data parallelism and communicating sequential processes (CSP). Data-parallel pipelined architectures obtained with the Alpha language can be embedded in a control-intensive application expressed in the CSP-based Handel formalism. The interface is formally defined from the semantics of the languages Alpha and Handel. This work will ease the design of compute-intensive applications on FPGAs.

Relevance:

40.00%

Publisher:

Abstract:

A parallel pipelined array of cells suitable for real-time computation of histograms is proposed. The cell architecture builds on previous work and now operates on a stream of data at 1 pixel per clock cycle. This new cell is more suitable for interfacing to camera sensors or to microprocessors with 8-bit data buses, which are common in consumer digital cameras. Arrays using the newly proposed cells are obtained via C-slow retiming techniques and can be clocked at a 65% faster frequency than previous arrays. This achieves over 80% of the performance of two-pixel-per-clock-cycle parallel pipelined arrays.

Relevance:

40.00%

Publisher:

Abstract:

A parallel formulation of an algorithm for the histogram computation of n data items using an on-the-fly data decomposition and a novel quantum-like representation (QR) is developed. The QR transformation separates multiple data read operations from multiple bin update operations, thereby making it easier to bind data items into their corresponding histogram bins. Under this model the number of steps required to compute the histogram is n/s + t, where s is a speedup factor and t is associated with pipeline latency. Here, we show that an overall speedup factor, s, is available for up to an eightfold acceleration. Our evaluation also shows that each one of these cells requires less area/time complexity compared to similar proposals found in the literature.
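
The step-count model n/s + t in the abstract above can be made concrete with a small sketch. The helper name `histogram_steps`, the sample values of n and t, and the rounding up to whole steps are all assumptions made for illustration; none of them are taken from the paper.

```python
import math


def histogram_steps(n: int, s: int, t: int) -> int:
    """Estimated step count n/s + t (hypothetical helper).

    n: number of data items, s: speedup factor, t: pipeline latency.
    Rounding n/s up to a whole step is an assumption of this sketch.
    """
    if n < 0 or s < 1 or t < 0:
        raise ValueError("expected n >= 0, s >= 1, t >= 0")
    return math.ceil(n / s) + t


if __name__ == "__main__":
    n, t = 1_000_000, 4  # illustrative values only
    for s in (1, 2, 4, 8):  # the abstract reports s up to eightfold
        print(f"s={s}: {histogram_steps(n, s, t)} steps")
```

For large n the latency term t becomes negligible, so the step count shrinks almost exactly in proportion to s.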

Relevance:

40.00%

Publisher:

Abstract:

The time to process each of W/B processing blocks of a median calculation method on a set of N W-bit integers is improved here by a factor of three compared to the literature. Parallelism uncovered in blocks containing B-bit slices is exploited by independent accumulative parallel counters so that the median is calculated faster than any known previous method for any N, W values. The improvements to the method are discussed in the context of calculating the median for a moving set of N integers for which a pipelined architecture is developed. An extra benefit of smaller area for the architecture is also reported.
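
To illustrate the general idea of finding a median by walking through W/B blocks of B-bit slices, here is a minimal software sketch of a generic radix-selection scheme. It is not the architecture from the paper: the function name `median_by_slices`, the choice of the lower median, and the sequential counting loop (standing in for the parallel counters) are assumptions of this sketch.

```python
def median_by_slices(values, width, slice_bits):
    """Lower median of non-negative `width`-bit integers.

    Processes width/slice_bits blocks, most significant slice first.
    In each block, one counter per possible slice value records how many
    remaining candidates carry that value; the counters locate the slice
    value that contains the median's rank.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if width % slice_bits != 0:
        raise ValueError("width must be a multiple of slice_bits")

    rank = (len(values) - 1) // 2  # 0-based rank of the lower median
    candidates = list(values)
    mask = (1 << slice_bits) - 1
    result = 0

    for shift in range(width - slice_bits, -1, -slice_bits):
        # Count candidates per slice value (parallel counters in hardware).
        counts = [0] * (1 << slice_bits)
        for v in candidates:
            counts[(v >> shift) & mask] += 1

        # Find the slice value whose bucket holds the wanted rank.
        digit = 0
        for digit, count in enumerate(counts):
            if rank < count:
                break
            rank -= count

        result = (result << slice_bits) | digit
        candidates = [v for v in candidates if (v >> shift) & mask == digit]

    return result
```

For example, `median_by_slices([5, 3, 9, 1, 7], width=4, slice_bits=2)` needs two blocks and agrees with sorting the list and taking the middle element.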

Relevance:

30.00%

Publisher:

Abstract:

As consumers demand more functionality from their electronic devices and manufacturers meet that demand, electrical power and clock requirements tend to increase; however, reassessing system architecture can lead to suitable counter-reductions. To maintain low clock rates and therefore reduce electrical power, this paper presents a parallel convolutional coder for the transmit side in many wireless consumer devices. The coder accepts a parallel data input and directly computes punctured convolutional codes without the need for a separate puncturing operation, while the coded bits are available at the output of the coder in a parallel fashion. Also, because the computation is in parallel, the coder can be clocked 7 times slower than the conventional shift-register-based convolutional coder (using DVB 7/8 rate). The presented coder is directly relevant to the design of modern low-power consumer devices.
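
The contrast between a bit-serial coder followed by puncturing and a block coder that emits only the kept bits can be sketched in software. The following is an illustrative model, not the coder from the paper. It assumes the constraint-length-7 generators 171 and 133 (octal) and a rate-7/8 puncturing pattern as commonly quoted for DVB; both should be checked against the standard before reuse. All function names are hypothetical.

```python
# Assumed taps, current input bit first: 171 and 133 octal (K = 7).
TAPS_X = (1, 1, 1, 1, 0, 0, 1)
TAPS_Y = (1, 0, 1, 1, 0, 1, 1)
# Assumed rate-7/8 puncturing pattern (1 = keep, 0 = drop).
PUNCT_X = (1, 0, 0, 0, 1, 0, 1)
PUNCT_Y = (1, 1, 1, 1, 0, 1, 0)
MEMORY = 6  # number of previous bits the coder remembers


def _coded_bit(window, end, taps):
    """XOR of the tapped bits; window[end] is the current input bit."""
    acc = 0
    for j, tap in enumerate(taps):
        if tap:
            acc ^= window[end - j]
    return acc


def encode_serial(bits):
    """Conventional view: code one bit per step, then puncture."""
    window = [0] * MEMORY + list(bits)
    out = []
    for i in range(len(bits)):
        x = _coded_bit(window, i + MEMORY, TAPS_X)
        y = _coded_bit(window, i + MEMORY, TAPS_Y)
        if PUNCT_X[i % 7]:
            out.append(x)
        if PUNCT_Y[i % 7]:
            out.append(y)
    return out


def encode_block(block, history):
    """Block view: 7 bits in, 8 coded bits out in one step.

    Only the bits that survive puncturing are computed, so no separate
    puncturing stage is needed. Returns (coded_bits, new_history).
    """
    if len(block) != 7 or len(history) != MEMORY:
        raise ValueError("expected a 7-bit block and 6 bits of history")
    window = list(history) + list(block)
    out = []
    for i in range(7):
        if PUNCT_X[i]:
            out.append(_coded_bit(window, i + MEMORY, TAPS_X))
        if PUNCT_Y[i]:
            out.append(_coded_bit(window, i + MEMORY, TAPS_Y))
    return out, window[-MEMORY:]
```

In this model each call to `encode_block` stands for one clock cycle of a parallel coder, which is why it can run at one seventh of the bit-serial clock rate for the same throughput.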

Relevance:

30.00%

Publisher:

Abstract:

Nonstructural protein 3 of the severe acute respiratory syndrome (SARS) coronavirus includes a "SARS-unique domain" (SUD) consisting of three globular domains separated by short linker peptide segments. This work reports NMR structure determinations of the C-terminal domain (SUD-C) and a two-domain construct (SUD-MC) containing the middle domain (SUD-M) and the C-terminal domain, and NMR data on the conformational states of the N-terminal domain (SUD-N) and the SUD-NM two-domain construct. Both SUD-N and SUD-NM are monomeric and globular in solution; in SUD-NM, there is high mobility in the two-residue interdomain linking sequence, with no preferred relative orientation of the two domains. SUD-C adopts a frataxin-like fold and has structural similarity to DNA-binding domains of DNA-modifying enzymes. The structures of both SUD-M (previously determined) and SUD-C (from the present study) are maintained in SUD-MC, where the two domains are flexibly linked. Gel-shift experiments showed that both SUD-C and SUD-MC bind to single-stranded RNA and recognize purine bases more strongly than pyrimidine bases, whereby SUD-MC binds to a more restricted set of purine-containing RNA sequences than SUD-M. NMR chemical shift perturbation experiments with observations of ¹⁵N-labeled proteins further resulted in delineation of RNA binding sites (i.e., in SUD-M, a positively charged surface area with a pronounced cavity, and in SUD-C, several residues of an anti-parallel beta-sheet). Overall, the present data provide evidence for molecular mechanisms involving the concerted actions of SUD-M and SUD-C, which result in specific RNA binding that might be unique to the SUD and, thus, to the SARS coronavirus.

Relevance:

30.00%

Publisher:

Abstract:

Proposed is a unique cell histogram architecture which will process k data items in parallel to compute 2q histogram bins per time step. An array of m/2q cells computes an m-bin histogram with a speed-up factor of k; k ⩾ 2 makes it faster than current dual-ported memory implementations. Furthermore, simple mechanisms for conflict-free storing of the histogram bins into an external memory array are discussed.

Relevance:

30.00%

Publisher:

Abstract:

The development of architecture and the settlement is central to discussions concerning the Neolithic transformation as the very visible evidence for the changes in society that run parallel to the domestication of plants and animals. Architecture has been used as an important aspect of models of how the transformation occurred, and as evidence for the sharp difference between hunter-gatherer and farming societies. We suggest that the emerging evidence for considerable architectural complexity from the early Neolithic indicates that some of our interpretations depend too much on a very basic understanding of structures which are normally seen as being primarily for residential purposes and containing households, which become the organising principle for the new communities which are often seen as fully sedentary and described as villages. Recent work in southern Jordan suggests that in this region at least there is little evidence for a standard house, and that structures are constructed for a range of diverse primary purposes other than simple domestic shelters.

Relevance:

30.00%

Publisher:

Abstract:

The Distributed Rule Induction (DRI) project at the University of Portsmouth is concerned with distributed data mining algorithms for automatically generating rules of all kinds. In this paper we present a system architecture and its implementation for inducing modular classification rules in parallel in a local area network using a distributed blackboard system. We present initial results of a prototype implementation based on the Prism algorithm.

Relevance:

30.00%

Publisher:

Abstract:

Java is becoming an increasingly popular language for developing distributed and parallel scientific and engineering applications. Jini is a Java-based infrastructure developed by Sun that can allegedly provide all the services necessary to support distributed applications. It is the aim of this paper to explore and investigate the services and properties that Jini actually provides and match these against the needs of high performance distributed and parallel applications written in Java. The motivation for this work is the need to develop a distributed infrastructure to support an MPI-like interface to Java known as MPJ. In the first part of the paper we discuss the needs of MPJ, the parallel environment that we wish to support. In particular we look at aspects such as reliability and ease of use. We then move on to sketch out the Jini architecture and review the components and services that Jini provides. In the third part of the paper we critically explore a Jini infrastructure that could be used to support MPJ. Here we are particularly concerned with Jini's ability to support reliably a cocoon of MPJ processes executing in a heterogeneous environment. In the final part of the paper we summarise our findings and report on future work being undertaken on Jini and MPJ.

Relevance:

30.00%

Publisher:

Abstract:

A parallel pipelined array of cells suitable for real-time computation of histograms is proposed. The cell architecture builds on previous work obtained via C-slow retiming techniques and can be clocked at a 65 percent faster frequency than previous arrays. The new arrays can be exploited for higher throughput, particularly when dual data rate sampling techniques are used to operate on single streams of data from image sensors. In this way, the new cell operates on a p-bit data bus, which is more convenient for interfacing to camera sensors or to microprocessors in consumer digital cameras.

Relevance:

30.00%

Publisher:

Abstract:

The complexity of current and emerging high performance architectures provides users with options about how best to use the available resources, but makes predicting performance challenging. In this work a benchmark-driven performance modelling approach is outlined that is appropriate for modern multicore architectures. The approach is demonstrated by constructing a model of a simple shallow water code on a Cray XE6 system, from application-specific benchmarks that illustrate precisely how architectural characteristics impact performance. The model is found to recreate observed scaling behaviour up to 16K cores, and used to predict optimal rank-core affinity strategies, exemplifying the type of problem such a model can be used for.

Relevance:

20.00%

Publisher:

Abstract:

At its most fundamental, cognition as displayed by biological agents (such as humans) may be said to consist of the manipulation and utilisation of memory. Recent discussions in the field of cognitive robotics have emphasised the role of embodiment and the necessity of a value or motivation for autonomous behaviour. This work proposes a computational architecture – the Memory-Based Cognitive (MBC) architecture – based upon these considerations for the autonomous development of control of a simple mobile robot. This novel architecture will permit the exploration of theoretical issues in cognitive robotics and animal cognition. Furthermore, the biological inspiration of the architecture is anticipated to result in a mobile robot controller which displays adaptive behaviour in unknown environments.

Relevance:

20.00%

Publisher:

Abstract:

The intelligent controlling mechanism of a typical mobile robot is usually a computer system. Research is however now ongoing in which biological neural networks are being cultured and trained to act as the brain of an interactive real world robot – thereby either completely replacing or operating in a cooperative fashion with a computer system. Studying such neural systems can give a distinct insight into biological neural structures and therefore such research has immediate medical implications. The principal aims of the present research are to assess the computational and learning capacity of dissociated cultured neuronal networks with a view to advancing network level processing of artificial neural networks. This will be approached by the creation of an artificial hybrid system (animat) involving closed loop control of a mobile robot by a dissociated culture of rat neurons. This paper details the components of the overall animat closed loop system architecture and reports on the evaluation of the results from preliminary real-life and simulated robot experiments.