999 resultados para hierarchical memory


Relevância:

70.00% 70.00%

Publicador:

Resumo:

Realising high performance image and signal processing
applications on modern FPGA presents a challenging implementation problem due to the large data frames streaming through these systems. Specifically, to meet the high bandwidth and data storage demands of these applications, complex hierarchical memory architectures must be manually specified
at the Register Transfer Level (RTL). Automated approaches which convert high-level operation descriptions, for instance in the form of C programs, to an FPGA architecture, are unable to automatically realise such architectures. This paper
presents a solution to this problem. It presents a compiler to automatically derive such memory architectures from a C program. By transforming the input C program to a unique dataflow modelling dialect, known as Valved Dataflow (VDF), a mapping and synthesis approach developed for this dialect can
be exploited to automatically create high performance image and video processing architectures. Memory intensive C kernels for Motion Estimation (CIF Frames at 30 fps), Matrix Multiplication (128x128 @ 500 iter/sec) and Sobel Edge Detection (720p @ 30 fps), which are unrealisable by current state-of-the-art C-based synthesis tools, are automatically derived from a C description of the algorithm.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The miniaturization race in the hardware industry aiming at continuous increasing of transistor density on a die does not bring respective application performance improvements any more. One of the most promising alternatives is to exploit a heterogeneous nature of common applications in hardware. Supported by reconfigurable computation, which has already proved its efficiency in accelerating data intensive applications, this concept promises a breakthrough in contemporary technology development. Memory organization in such heterogeneous reconfigurable architectures becomes very critical. Two primary aspects introduce a sophisticated trade-off. On the one hand, a memory subsystem should provide well organized distributed data structure and guarantee the required data bandwidth. On the other hand, it should hide the heterogeneous hardware structure from the end-user, in order to support feasible high-level programmability of the system. This thesis work explores the heterogeneous reconfigurable hardware architectures and presents possible solutions to cope the problem of memory organization and data structure. By the example of the MORPHEUS heterogeneous platform, the discussion follows the complete design cycle, starting from decision making and justification, until hardware realization. Particular emphasis is made on the methods to support high system performance, meet application requirements, and provide a user-friendly programmer interface. As a result, the research introduces a complete heterogeneous platform enhanced with a hierarchical memory organization, which copes with its task by means of separating computation from communication, providing reconfigurable engines with computation and configuration data, and unification of heterogeneous computational devices using local storage buffers. It is distinguished from the related solutions by distributed data-flow organization, specifically engineered mechanisms to operate with data on local domains, particular communication infrastructure based on Network-on-Chip, and thorough methods to prevent computation and communication stalls. In addition, a novel advanced technique to accelerate memory access was developed and implemented.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

对GOTOBLAS库(GOTO)的实现机制,尤其是其中的一般矩阵乘法部分的实现进行了分析.结合近年来的一些研究成果,讨论了如何高效地实现矩阵相乘操作,把存储层次对程序性能的影响提高到计算模型的高度.对比实验表明,GOTO库的性能远远高于没有考虑存储层次的一般BLAS库.证明了GOTO库性能上的优越性和将存储层次引入计算模型的必要性.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We describe the design of a directory-based shared memory architecture on a hierarchical network of hypercubes. The distributed directory scheme comprises two separate hierarchical networks for handling cache requests and transfers. Further, the scheme assumes a single address space and each processing element views the entire network as contiguous memory space. The size of individual directories stored at each node of the network remains constant throughout the network. Although the size of the directory increases with the network size, the architecture is scalable. The results of the analytical studies demonstrate superior performance characteristics of our scheme compared with those of other schemes.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We describe the design of a directory-based shared memory architecture on a hierarchical network of hypercubes. The distributed directory scheme comprises two separate hierarchical networks for handling cache requests and transfers. Further, the scheme assumes a single address space and each processing element views the entire network as contiguous memory space. The size of individual directories stored at each node of the network remains constant throughout the network. Although the size of the directory increases with the network size, the architecture is scalable. The results of the analytical studies demonstrate superior performance characteristics of our scheme compared with those of other schemes.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The performance of memory-guided saccades with two different delays (3 and 30 s of memorization) was studied in seven healthy subjects. Double-pulse transcranial magnetic stimulation (dTMS) with an interstimulus interval of 100 ms was applied over the right dorsolateral prefrontal cortex (DLPFC) early (1 s after target presentation) and late (28 s after target presentation). Early stimulation significantly increased in both delays the percentage of error in amplitude (PEA) of contralateral memory-guided saccades compared to the control experiment without stimulation. dTMS applied late in the delay had no significant effect on PEA. Furthermore, we found a significantly smaller effect of early stimulation in the long-delay paradigm. These results suggest a time-dependent hierarchical organization of the spatial working memory with a functional dominance of DLPFC during the early memorization, independent from the memorization delay. For a long memorization delay, however, working memory seems to have an additional, DLPFC-independent component.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Results of numerical experiments are introduced. Experiments were carried out by means of computer simulation on olfactory bulb for the purpose of checking of thinking mechanisms conceptual model, introduced in [2]. Key role of quasisymbol neurons in processes of pattern identification, existence of mental view, functions of cyclic connections between symbol and quasisymbol neurons as short-term memory, important role of synaptic plasticity in learning processes are confirmed numerically. Correctness of fundamental ideas put in base of conceptual model is confirmed on olfactory bulb at quantitative level.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We have developed a Hierarchical Look-Ahead Trajectory Model (HiLAM) that incorporates the firing pattern of medial entorhinal grid cells in a planning circuit that includes interactions with hippocampus and prefrontal cortex. We show the model’s flexibility in representing large real world environments using odometry information obtained from challenging video sequences. We acquire the visual data from a camera mounted on a small tele-operated vehicle. The camera has a panoramic field of view with its focal point approximately 5 cm above the ground level, similar to what would be expected from a rat’s point of view. Using established algorithms for calculating perceptual speed from the apparent rate of visual change over time, we generate raw dead reckoning information which loses spatial fidelity over time due to error accumulation. We rectify the loss of fidelity by exploiting the loop-closure detection ability of a biologically inspired, robot navigation model termed RatSLAM. The rectified motion information serves as a velocity input to the HiLAM to encode the environment in the form of grid cell and place cell maps. Finally, we show goal directed path planning results of HiLAM in two different environments, an indoor square maze used in rodent experiments and an outdoor arena more than two orders of magnitude larger than the indoor maze. Together these results bridge for the first time the gap between higher fidelity bio-inspired navigation models (HiLAM) and more abstracted but highly functional bio-inspired robotic mapping systems (RatSLAM), and move from simulated environments into real-world studies in rodent-sized arenas and beyond.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Distraction in the workplace is increasingly more common in the information age. Several tasks and sources of information compete for a worker's limited cognitive capacities in human-computer interaction (HCI). In some situations even very brief interruptions can have detrimental effects on memory. Nevertheless, in other situations where persons are continuously interrupted, virtually no interruption costs emerge. This dissertation attempts to reveal the mental conditions and causalities differentiating the two outcomes. The explanation, building on the theory of long-term working memory (LTWM; Ericsson and Kintsch, 1995), focuses on the active, skillful aspects of human cognition that enable the storage of task information beyond the temporary and unstable storage provided by short-term working memory (STWM). Its key postulate is called a retrieval structure an abstract, hierarchical knowledge representation built into long-term memory that can be utilized to encode, update, and retrieve products of cognitive processes carried out during skilled task performance. If certain criteria of practice and task processing are met, LTWM allows for the storage of large representations for long time periods, yet these representations can be accessed with the accuracy, reliability, and speed typical of STWM. The main thesis of the dissertation is that the ability to endure interruptions depends on the efficiency in which LTWM can be recruited for maintaing information. An observational study and a field experiment provide ecological evidence for this thesis. Mobile users were found to be able to carry out heavy interleaving and sequencing of tasks while interacting, and they exhibited several intricate time-sharing strategies to orchestrate interruptions in a way sensitive to both external and internal demands. Interruptions are inevitable, because they arise as natural consequences of the top-down and bottom-up control of multitasking. In this process the function of LTWM is to keep some representations ready for reactivation and others in a more passive state to prevent interference. The psychological reality of the main thesis received confirmatory evidence in a series of laboratory experiments. They indicate that after encoding into LTWM, task representations are safeguarded from interruptions, regardless of their intensity, complexity, or pacing. However, when LTWM cannot be deployed, the problems posed by interference in long-term memory and the limited capacity of the STWM surface. A major contribution of the dissertation is the analysis of when users must resort to poorer maintenance strategies, like temporal cues and STWM-based rehearsal. First, one experiment showed that task orientations can be associated with radically different patterns of retrieval cue encodings. Thus the nature of the processing of the interface determines which features will be available as retrieval cues and which must be maintained by other means. In another study it was demonstrated that if the speed of encoding into LTWM, a skill-dependent parameter, is slower than the processing speed allowed for by the task, interruption costs emerge. Contrary to the predictions of competing theories, these costs turned out to involve intrusions in addition to omissions. Finally, it was learned that in rapid visually oriented interaction, perceptual-procedural expectations guide task resumption, and neither STWM nor LTWM are utilized due to the fact that access is too slow. These findings imply a change in thinking about the design of interfaces. Several novel principles of design are presented, basing on the idea of supporting the deployment of LTWM in the main task.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In prediction phase, the hierarchical tree structure obtained from the test image is used to predict every central pixel of an image by its four neighboring pixels. The prediction scheme generates the predicted error image, to which the wavelet/sub-band coding algorithm can be applied to obtain efficient compression. In quantization phase, we used a modified SPIHT algorithm to achieve efficiency in memory requirements. The memory constraint plays a vital role in wireless and bandwidth-limited applications. A single reusable list is used instead of three continuously growing linked lists as in case of SPIHT. This method is error resilient. The performance is measured in terms of PSNR and memory requirements. The algorithm shows good compression performance and significant savings in memory. (C) 2006 Elsevier B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A number of neural network models, in which fixed-point and limit-cycle attractors of the underlying dynamics are used to store and associatively recall information, are described. In the first class of models, a hierarchical structure is used to store an exponentially large number of strongly correlated memories. The second class of models uses limit cycles to store and retrieve individual memories. A neurobiologically plausible network that generates low-amplitude periodic variations of activity, similar to the oscillations observed in electroencephalographic recordings, is also described. Results obtained from analytic and numerical studies of the properties of these networks are discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Zincblende Mn-rich Mn(Ga)As nanoclusters embedded in GaAs matrices are fabricated by in situ postgrowth annealing diluted magnetic semiconductor (Ga,Mn)As films with Mn concentration ranging from 2.6% to 8% at 650 degrees C. Magnetization measurements show that memory effect and slow magnetic relaxation, the typical characteristics of the spin-glass-like phase, occur below the blocking temperature of 45 K in samples with high Mn concentration, while for samples with low Mn concentration, ferromagnetic order remains up to 360 K. The behavior of low-temperature spin dynamics can be explained by the hierarchical model. (c) 2007 American Institute of Physics.