64 resultados para MEMORY PERFORMANCE
Resumo:
In-Memory Databases (IMDBs), such as SAP HANA, enable new levels of database performance by removing the disk bottleneck and by compressing data in memory. The consequence of this improved performance means that reports and analytic queries can now be processed on demand. Therefore, the goal is now to provide near real-time responses to compute and data intensive analytic queries. To facilitate this, much work has investigated the use of acceleration technologies within the database context. While current research into the application of these technologies has yielded positive results, they have tended to focus on single database tasks or on isolated single user requests. This paper uses SHEPARD, a framework for managing accelerated tasks across shared heterogeneous resources, to introduce acceleration into an IMDB. Results show how, using SHEPARD, multiple simultaneous user queries all receive speed-up by using a shared pool of accelerators. Results also show that offloading analytic tasks onto accelerators can have indirect benefits for other database workloads by reducing contention for CPU resources.
Resumo:
Field-programmable gate arrays are ideal hosts to custom accelerators for signal, image, and data processing but de- mand manual register transfer level design if high performance and low cost are desired. High-level synthesis reduces this design burden but requires manual design of complex on-chip and off-chip memory architectures, a major limitation in applications such as video processing. This paper presents an approach to resolve this shortcoming. A constructive process is described that can derive such accelerators, including on- and off-chip memory storage from a C description such that a user-defined throughput constraint is met. By employing a novel statement-oriented approach, dataflow intermediate models are derived and used to support simple ap- proaches for on-/off-chip buffer partitioning, derivation of custom on-chip memory hierarchies and architecture transformation to ensure user-defined throughput constraints are met with minimum cost. When applied to accelerators for full search motion estima- tion, matrix multiplication, Sobel edge detection, and fast Fourier transform, it is shown how real-time performance up to an order of magnitude in advance of existing commercial HLS tools is enabled whilst including all requisite memory infrastructure. Further, op- timizations are presented that reduce the on-chip buffer capacity and physical resource cost by up to 96% and 75%, respectively, whilst maintaining real-time performance.
Resumo:
Two experiments investigated the consequences of action at encoding and recall on the ability to follow sequences of instructions. Children aged 7–9 years recalled sequences of spoken action commands under presentation and recall conditions that either did or did not involve their physical performance. In both experiments, recall was enhanced by carrying out the instructions as they were being initially presented and also by performing them at recall. In contrast, the accuracy of instruction-following did not improve above spoken presentation alone, either when the instructions were silently read or heard by the child (Experiment 1), or when the child repeated the spoken instructions as they were presented (Experiment 2). These findings suggest that the enactment advantage at presentation does not simply reflect a general benefit of a dual exposure to instructions, and that it is not a result of their self-production at presentation. The benefits of action-based recall were reduced following enactment during presentation, suggesting that the positive effects of action at encoding and recall may have a common origin. It is proposed that the benefits of physical movement arise from the existence of a short-term motor store that maintains the temporal, spatial, and motoric features of either planned or already executed actions.
Resumo:
How can applications be deployed on the cloud to achieve maximum performance? This question is challenging to address with the availability of a wide variety of cloud Virtual Machines (VMs) with different performance capabilities. The research reported in this paper addresses the above question by proposing a six step benchmarking methodology in which a user provides a set of weights that indicate how important memory, local communication, computation and storage related operations are to an application. The user can either provide a set of four abstract weights or eight fine grain weights based on the knowledge of the application. The weights along with benchmarking data collected from the cloud are used to generate a set of two rankings - one based only on the performance of the VMs and the other takes both performance and costs into account. The rankings are validated on three case study applications using two validation techniques. The case studies on a set of experimental VMs highlight that maximum performance can be achieved by the three top ranked VMs and maximum performance in a cost-effective manner is achieved by at least one of the top three ranked VMs produced by the methodology.