985 results for 291602 Memory Structures
Abstract:
We present external memory data structures for efficiently answering range-aggregate queries. The range-aggregate problem is defined as follows: given a set of weighted points in R^d, compute the aggregate of the weights of the points that lie inside a d-dimensional orthogonal query rectangle. The aggregates we consider in this paper include COUNT, SUM, and MAX. First, we develop a structure for answering two-dimensional range-COUNT queries that uses O(N/B) disk blocks and answers a query in O(log_B N) I/Os, where N is the number of input points and B is the disk block size. The structure can be extended to obtain a near-linear-size structure for answering range-SUM queries using O(log_B N) I/Os, and a linear-size structure for answering range-MAX queries in O(log_B^2 N) I/Os. Our structures can be made dynamic and extended to higher dimensions. (C) 2012 Elsevier B.V. All rights reserved.
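To make the problem definition above concrete, here is a minimal in-memory sketch in Python. It is a brute-force O(N) scan used only as a reference for what COUNT, SUM, and MAX mean over an orthogonal query rectangle; it is not the external memory structure the abstract describes. The function names, the closed-rectangle convention, and the sample points are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# A weighted point: (coordinates in R^d, weight). Illustrative type, not from the paper.
Point = Tuple[Sequence[float], float]


def in_rect(coords: Sequence[float], lo: Sequence[float], hi: Sequence[float]) -> bool:
    """True if coords lies in the closed rectangle [lo, hi] (assumed closed on all sides)."""
    return all(l <= c <= h for c, l, h in zip(coords, lo, hi))


def range_aggregates(
    points: List[Point], lo: Sequence[float], hi: Sequence[float]
) -> Tuple[int, float, Optional[float]]:
    """Return (COUNT, SUM, MAX) of the weights of the points inside the rectangle.

    MAX is None when no point falls inside the rectangle.
    """
    count = 0
    total = 0.0
    best: Optional[float] = None
    for coords, weight in points:
        if in_rect(coords, lo, hi):
            count += 1
            total += weight
            best = weight if best is None else max(best, weight)
    return count, total, best


# Hypothetical two-dimensional input; the point (5, 1) lies outside the query rectangle.
points = [((1.0, 1.0), 5.0), ((2.0, 3.0), 2.0), ((4.0, 4.0), 7.0), ((5.0, 1.0), 1.0)]
result = range_aggregates(points, lo=(1.0, 1.0), hi=(4.0, 4.0))
empty = range_aggregates(points, lo=(10.0, 10.0), hi=(11.0, 11.0))
```

A scan like this touches every point, which is exactly the cost the block-based structures in the abstract are designed to avoid; it is useful mainly as a correctness oracle when testing a faster index.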
Abstract:
This paper explores the potential for the RAMpage memory hierarchy to use a microkernel with a small memory footprint, in a specialized cache-speed static RAM (tightly-coupled memory, TCM). Dreamy memory is DRAM kept in low-power mode unless referenced. Simulations show that a small microkernel suits RAMpage well, in that it achieves significantly better speed and energy gains than a standard hierarchy from adding TCM. RAMpage, in its best 128KB L2 case, gained 11% speed using TCM and reduced energy by 14%. Equivalent conventional hierarchy gains were under 1%. While a 1MB L2 was significantly faster than the lower-energy cases with the smaller L2, the larger SRAM's energy does not justify the speed gain. Using a 128KB L2 cache in a conventional architecture resulted in a best-case overall run time of 2.58s, compared with the best dreamy mode run time (RAMpage without context switches on misses) of 3.34s, a speed penalty of 29%. Energy in the fastest 128KB L2 case was 2.18J vs. 1.50J, a reduction of 31%. The same RAMpage configuration without dreamy mode took 2.83s as simulated and used 2.39J, an acceptable trade-off (penalty under 10%) for being able to switch easily to a lower-energy mode.
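The percentages quoted above can be checked with a few lines of Python. This is only an arithmetic sketch: the helper name `pct_change` is hypothetical, and it assumes each percentage is taken relative to the conventional 128KB L2 figures (2.58s, 2.18J), which the abstract implies but does not state outright.

```python
def pct_change(baseline: float, value: float) -> float:
    """Percentage change of value relative to baseline (positive means an increase)."""
    return (value - baseline) / baseline * 100.0


# Figures quoted in the abstract; the conventional 128KB L2 case is the assumed baseline.
conventional_time_s, conventional_energy_j = 2.58, 2.18
dreamy_time_s, dreamy_energy_j = 3.34, 1.50
plain_rampage_time_s, plain_rampage_energy_j = 2.83, 2.39

dreamy_time_penalty = pct_change(conventional_time_s, dreamy_time_s)
dreamy_energy_change = pct_change(conventional_energy_j, dreamy_energy_j)
plain_time_penalty = pct_change(conventional_time_s, plain_rampage_time_s)
plain_energy_penalty = pct_change(conventional_energy_j, plain_rampage_energy_j)

print(f"dreamy mode: time {dreamy_time_penalty:+.1f}%, energy {dreamy_energy_change:+.1f}%")
print(f"RAMpage without dreamy mode: time {plain_time_penalty:+.1f}%, energy {plain_energy_penalty:+.1f}%")
```

Under that baseline assumption, both penalties for RAMpage without dreamy mode stay below 10%, in line with the trade-off the abstract calls acceptable.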
Abstract:
A specialised reconfigurable architecture for telecommunication base-band processing is augmented with testing resources. The routing network is linked via virtual wire hardware modules to reduce the area occupied by connecting buses. The number of switches within the routing matrices is also minimised, which increases throughput without sacrificing flexibility. The testing algorithm was developed to systematically search for faults in the processing modules and the flexible high-speed routing network within the architecture. The testing algorithm starts by scanning the externally addressable memory space and testing the master controller. The controller then tests every switch in the route-through switch matrix by making loops from the shared memory to each of the switches. The local switch matrix is also tested in the same way. Next, the local memory is scanned. Finally, pre-defined test vectors are loaded into local memory to check the processing modules. This algorithm scans all possible paths within the interconnection network exhaustively and reports all faults. Strategies can be inserted to bypass minor faults.
Abstract:
Realising memory-intensive applications such as image and video processing on FPGA requires the creation of complex, multi-level memory hierarchies to achieve real-time performance; however, commercial High Level Synthesis tools are unable to automatically derive such structures and hence are unable to meet the demanding bandwidth and capacity constraints of these applications. Current approaches to solving this problem can only derive either single-level memory structures or very deep, highly inefficient hierarchies, leading in either case to high implementation cost, low performance, or both. This paper presents an enhancement to an existing MC-HLS synthesis approach which solves this problem; it exploits and eliminates data duplication at multiple levels of the generated hierarchy, leading to a reduction in the number of levels and ultimately to higher-performance, lower-cost implementations. When applied to synthesis of C-based Motion Estimation, Matrix Multiplication and Sobel Edge Detection applications, this enables reductions in Block RAM and Look Up Table (LUT) cost of up to 25%, whilst simultaneously increasing throughput.
Abstract:
Storage devices based on flash memory, which have become common in recent years, are in many respects better than hard disks. Flash memory, however, has several special characteristics that complicate its adoption in a database system. In flash memory, writing is slower than reading. Updating scattered pages is particularly slow. Random reads from flash memory are considerably faster than from a hard disk. Because of these characteristics, a database management system must be optimized specifically for flash memory. In this optimization, almost all components of the database management system must be reimplemented from the perspective of flash memory. Thanks to the fast random reads of flash memory, relation data can be placed in flash memory more freely than on a hard disk. The B+-tree, the index structure most commonly used in databases, does not work efficiently on flash memory because of the large number of random updates. Several variants of the B+-tree have been developed for flash memory in which the number of random updates has been reduced. Buffer management can be optimized for flash memory by reducing the number of slow writes at the expense of the number of fast reads, and by converting slow random writes into faster writes of consecutive pages. B.3 (hardware, memory structures) H.2.2 (database management, physical design)
Abstract:
In this paper we investigate various algorithms for performing the Fast Fourier Transform (FFT)/Inverse Fast Fourier Transform (IFFT), and suitable techniques for maximizing FFT/IFFT execution speed, such as pipelining or parallel processing, and the use of memory structures with pre-computed values (look-up tables, LUTs) or other dedicated hardware components (usually multipliers). Furthermore, we discuss the optimal hardware architectures that best apply to various FFT/IFFT algorithms, along with their ability to exploit parallel processing with minimal data dependences in the FFT/IFFT calculations. An interesting approach that is also considered in this paper is the application of the integrated processing-in-memory Intelligent RAM (IRAM) chip to high-speed FFT/IFFT computing. The results of the assessment study emphasize that the execution speed of the FFT/IFFT algorithms is tightly connected to the capability of the FFT/IFFT hardware to support the parallelism offered by the given algorithm. Therefore, we suggest that the basic Discrete Fourier Transform (DFT)/Inverse Discrete Fourier Transform (IDFT) can also provide high performance, by utilizing a specialized FFT/IFFT hardware architecture that can exploit the parallelism offered by the DFT/IDFT operations. The proposed improvements include simplified multiplications over symbols given in a polar coordinate system, using sine and cosine look-up tables, and an approach for performing parallel addition of N input symbols.
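The idea of multiplying symbols in polar form with sine and cosine look-up tables can be sketched in software as follows. This is an illustrative Python model, not the hardware architecture proposed in the paper. It assumes that each input symbol is given as a magnitude plus a phase index quantized to multiples of 2π/N, so that one pair of N-entry tables suffices; that quantization, the function names, and the sample data are all assumptions.

```python
import cmath
import math
from typing import List, Tuple


def build_luts(n: int) -> Tuple[List[float], List[float]]:
    """Pre-compute cosine and sine of 2*pi*i/n for i in 0..n-1."""
    cos_lut = [math.cos(2.0 * math.pi * i / n) for i in range(n)]
    sin_lut = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
    return cos_lut, sin_lut


def dft_polar_lut(symbols: List[Tuple[float, int]]) -> List[complex]:
    """Direct DFT over polar symbols (magnitude, phase_index).

    Multiplying a symbol by the twiddle factor exp(-2j*pi*k*m/N) leaves the
    magnitude unchanged and subtracts k*m from the phase index (mod N), so no
    complex multiplier is needed: only an index computation, two table reads,
    and two real scalings per term.
    """
    n = len(symbols)
    cos_lut, sin_lut = build_luts(n)
    spectrum = []
    for k in range(n):
        re = 0.0
        im = 0.0
        # These N terms are independent of each other, so a hardware
        # implementation could add them in parallel; here they are summed in a loop.
        for m, (mag, phase_idx) in enumerate(symbols):
            idx = (phase_idx - k * m) % n
            re += mag * cos_lut[idx]
            im += mag * sin_lut[idx]
        spectrum.append(complex(re, im))
    return spectrum


# Hypothetical input: four symbols with phases quantized to multiples of 2*pi/4.
symbols = [(1.0, 0), (2.0, 1), (0.5, 3), (1.5, 2)]
spectrum = dft_polar_lut(symbols)

# Reference: the textbook DFT on the equivalent rectangular-form samples.
n = len(symbols)
samples = [mag * cmath.exp(2j * math.pi * p / n) for mag, p in symbols]
reference = [
    sum(x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(samples))
    for k in range(n)
]
max_error = max(abs(a - b) for a, b in zip(spectrum, reference))
```

The sketch still performs O(N^2) work, like any direct DFT; the point is only that each term reduces to an index subtraction and table reads, which is the kind of regular, dependence-free work that a parallel architecture can exploit.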
Abstract:
This work explores, in three parts, aspects of the attentional processing of visual targets and distractors and their electrophysiological measures. The first chapter addresses attentional processing specific to the target and to distractors during visual search. Dividing the N2pc into an NT and a PD calls into question the theory that attentional activity is systematically tied to a salient distractor, because a green distractor elicits no lateralized activity of its own. The second chapter addresses the lateralization of the structures responsible for maintaining and retrieving information in visual short-term memory. Using a paradigm that lateralizes the target and the distractor, we are able to verify that there is a negative lateralized component in the temporal region, the TCN, specific to the target during recall from memory. In addition, a lateralized component is also observed for the distractor over the posterior part of the scalp. These two findings converge to indicate that the structures activated during retrieval of information from visual short-term memory are lateralized according to the hemifield in which the target or distractor is located. Finally, the third chapter concerns the effect on attentional deployment of adding low-salience gray distractors around potential targets. Adding these distractors increases the difficulty of identifying the target. This difficulty shifts N2pc activity toward the time window associated with the Ptc component. A larger number of gray distractors causes a larger proportion of the activity to be delayed. Likewise, gray distractors placed between the potential targets cause a greater delay than distractors placed outside this region.
Throughout this thesis, the question of the attentional salience of different colors during visual search recurs. We observe greater salience for red than for green when they are distractors, and green is harder to distinguish from gray than yellow is.
Abstract:
INTRODUCTION: The orthographic depth hypothesis (Katz and Feldman, 1983) posits that different reading routes are engaged depending on the type of grapheme/phoneme correspondence of the language being read. Shallow orthographies with consistent grapheme/phoneme correspondences favor encoding via non-lexical pathways, where each grapheme is sequentially mapped to its corresponding phoneme. In contrast, deep orthographies with inconsistent grapheme/phoneme correspondences favor lexical pathways, where phonemes are retrieved from specialized memory structures. This hypothesis, however, lacks compelling empirical support. The aim of the present study was to investigate the impact of orthographic depth on reading route selection using a within-subject design. METHOD: We presented the same pseudowords (PWs) to highly proficient bilinguals and manipulated the orthographic depth of PW reading by embedding them in two separate language contexts, German or French, implying shallow or deep orthography, respectively. High-density electroencephalography was recorded during the task. RESULTS: The topography of the ERPs to identical PWs differed 300-360 ms post-stimulus onset when the PWs were read in different orthographic depth contexts, indicating that distinct brain networks were engaged in reading during this time window. The brain sources underlying these topographic effects were located within left inferior frontal (German > French), parietal (French > German) and cingular areas (German > French). CONCLUSION: Reading in a shallow context favors non-lexical pathways, reflected in a stronger engagement of frontal phonological areas in the shallow versus the deep orthographic context. In contrast, reading PWs in a deep orthographic context recruits less routine non-lexical pathways, reflected in a stronger engagement of visuo-attentional parietal areas in the deep versus the shallow orthographic context. These collective results support a modulation of reading route by orthographic depth.
Abstract:
Memory analysis techniques have become sophisticated enough to model, with a high degree of accuracy, the manipulation of simple memory structures (finite structures, singly/doubly linked lists and trees). However, modern programming languages provide extensive library support, including a wide range of generic collection objects that make use of complex internal data structures. While these data structures ensure that the collections are efficient, these representations often cannot be effectively modeled by existing methods (either due to excessive analysis runtime or due to the inability to represent the required information). This paper presents a method to represent collections using an abstraction of their semantics. The construction of the abstract semantics for the collection objects is done in a manner that allows individual elements in the collections to be identified. Our construction also supports iterators over the collections and is able to model the position of the iterators with respect to the elements in the collection. By ordering the contents of the collection based on the iterator position, the model can represent a notion of progress when iteratively manipulating the contents of a collection. These features allow strong updates to the individual elements in the collection as well as strong updates over the collections themselves.
Abstract:
To store, update and retrieve data from database management systems (DBMS), software architects use tools such as call-level interfaces (CLI), which provide standard functionalities for interacting with DBMS. However, the emergence of the NoSQL paradigm, and particularly of new NoSQL DBMS providers, leads to situations where some of the standard functionalities provided by CLI are not supported, very often because of their distance from the relational model or because of design constraints. As such, when a system architect needs to evolve a system, namely from a relational DBMS to a NoSQL DBMS, he must overcome the difficulties posed by the features not provided by the NoSQL DBMS. Choosing the wrong NoSQL DBMS risks major issues with components requesting non-supported features. This paper focuses on how to deploy features that are not so commonly supported by NoSQL DBMS (like Stored Procedures, Transactions, Save Points and interactions with local memory structures) by implementing them in standard CLI.
Abstract:
This paper addresses the problem of joint identification of infinite-frequency added mass and fluid memory models of marine structures from finite frequency data. This problem is relevant for cases where the code used to compute the hydrodynamic coefficients of the marine structure does not give the infinite-frequency added mass. This case is typical of codes based on 2D-potential theory, since most 3D-potential-theory codes solve the boundary-value problem associated with the infinite frequency. The method proposed in this paper is a simpler alternative to other methods previously presented in the literature. The advantage of the proposed method is that the same identification procedure can be used to identify the fluid-memory models with or without access to the infinite-frequency added mass coefficient. Therefore, it provides an extension that puts the two identification problems into the same framework. The method also exploits the constraints related to relative degree and low-frequency asymptotic values of the hydrodynamic coefficients derived from the physics of the problem, which are used as prior information to refine the obtained models.
Abstract:
Combining the electronic properties of graphene [1,2] and molybdenum disulphide (MoS2) [3-6] in hybrid heterostructures offers the possibility to create devices with various functionalities. Electronic logic and memory devices have already been constructed from graphene-MoS2 hybrids [7,8], but they do not make use of the photosensitivity of MoS2, which arises from its optical-range bandgap [9]. Here, we demonstrate that graphene-on-MoS2 binary heterostructures display remarkable dual optoelectronic functionality, including highly sensitive photodetection and gate-tunable persistent photoconductivity. The responsivity of the hybrids was found to be nearly 1 × 10^10 A W^-1 at 130 K and 5 × 10^8 A W^-1 at room temperature, making them the most sensitive graphene-based photodetectors. When subjected to time-dependent photoillumination, the hybrids could also function as a rewritable optoelectronic switch or memory, where the persistent state shows almost no relaxation or decay within experimental timescales, indicating near-perfect charge retention. These effects can be quantitatively explained by gate-tunable charge exchange between the graphene and MoS2 layers, and may lead to new graphene-based optoelectronic devices that are naturally scalable for large-area applications at room temperature.