992 results for Software Transactional Memory (STM)


Relevance:

40.00%

Abstract:

The memory hierarchy is the main bottleneck in modern computer systems, as the gap between processor speed and memory speed continues to grow. The situation in embedded systems is even worse: the memory hierarchy consumes a large amount of chip area and energy, which are precious resources in embedded systems. Moreover, embedded systems have multiple design objectives, such as performance, energy consumption, and area. Customizing the memory hierarchy for specific applications is an important way to take full advantage of limited resources and maximize performance. However, traditional custom memory hierarchy design methodologies are phase-ordered. They separate application optimization from memory hierarchy architecture design, which tends to result in locally optimal solutions. In traditional hardware-software co-design methodologies, much of the work has focused on using reconfigurable logic to partition the computation, whereas using reconfigurable logic to perform memory hierarchy design is seldom addressed. In this paper, we propose a new framework for designing the memory hierarchy of embedded systems. The framework takes advantage of flexible reconfigurable logic to customize the memory hierarchy for specific applications. It combines application optimization and memory hierarchy design to obtain a globally optimal solution. Using the framework, we performed a case study to design a new software-controlled instruction memory, which showed promising potential.

Relevance:

40.00%

Abstract:

Considerable research and development has been invested in software Distributed Shared Memory (DSM). The primary focus of this work has traditionally been on high performance and consistency protocols. Unfortunately, clusters present a number of challenges for any DSM system that cannot be solved through consistency protocols alone. These challenges relate to the ability of DSM systems to adjust to load fluctuations, to handle computers being added to or removed from the cluster, to deal with faults, and to use DSM objects larger than the available physical memory. This paper introduces the Synergy DSM System and its integration with the virtual memory, group communication and process migration services of the Genesis Cluster Operating System.

Relevance:

30.00%

Abstract:

In this paper, we define and present a comprehensive classification of user intent for Web searching. The classification consists of three hierarchical levels of informational, navigational, and transactional intent. After deriving attributes of each, we then developed a software application that automatically classified queries using a Web search engine log of over a million and a half queries submitted by several hundred thousand users. Our findings show that more than 80% of Web queries are informational in nature, with about 10% each being navigational and transactional. In order to validate the accuracy of our algorithm, we manually coded 400 queries and compared the results from this manual classification to the results determined by the automated method. This comparison showed that the automatic classification has an accuracy of 74%. Of the remaining 25% of the queries, the user intent is vague or multi-faceted, pointing to the need for probabilistic classification. We discuss how search engines can use knowledge of user intent to provide more targeted and relevant results in Web searching.

Relevance:

30.00%

Abstract:

The Dynamic Data eXchange (DDX) is our third-generation platform for building distributed robot controllers. DDX allows a coalition of programs to share data at run-time through an efficient shared memory mechanism managed by a store. Further, stores on multiple machines can be linked by means of a global catalog, and data is moved between the stores on an as-needed basis by multicasting. Heterogeneous computer systems are handled. We describe the architecture of DDX and the standard clients we have developed that let us rapidly build complex control systems with minimal coding.

Relevance:

30.00%

Abstract:

The generation of a correlation matrix from a large set of long gene sequences is a common requirement in many bioinformatics problems such as phylogenetic analysis. The generation is not only computationally intensive but also requires significant memory resources as, typically, few gene sequences can be simultaneously stored in primary memory. The standard practice in such computation is to use frequent input/output (I/O) operations. Therefore, minimizing the number of these operations will yield much faster run-times. This paper develops an approach for the faster and scalable computing of large-size correlation matrices through the full use of available memory and a reduced number of I/O operations. The approach is scalable in the sense that the same algorithms can be executed on different computing platforms with different amounts of memory and can be applied to different problems with different correlation matrix sizes. The significant performance improvement of the approach over the existing approaches is demonstrated through benchmark examples.
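
The trade-off between available memory and I/O operations described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes that each sequence has already been encoded as a numeric vector, that Pearson correlation is the measure of interest, and that a hypothetical `load(i)` callback stands for one I/O read of sequence `i`. The names `pearson`, `blocked_correlation` and `block_size` are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def blocked_correlation(load, n, block_size):
    """Fill an n x n correlation matrix while holding at most
    2 * block_size vectors in memory at once.

    load(i) is a hypothetical callback that reads vector i from storage;
    every call is counted as one I/O read.
    """
    corr = [[0.0] * n for _ in range(n)]
    reads = 0
    starts = range(0, n, block_size)
    for bi in starts:
        # Load one block of "row" vectors and keep it resident.
        rows = {i: load(i) for i in range(bi, min(bi + block_size, n))}
        reads += len(rows)
        for bj in starts:
            if bj < bi:
                continue  # lower triangle is filled by symmetry
            if bj == bi:
                cols = rows  # diagonal block: reuse what is already loaded
            else:
                cols = {j: load(j) for j in range(bj, min(bj + block_size, n))}
                reads += len(cols)
            for i, x in rows.items():
                for j, y in cols.items():
                    if j >= i:
                        corr[i][j] = corr[j][i] = pearson(x, y)
    return corr, reads
```

In this sketch a larger `block_size` (more memory in use) gives the same matrix with fewer reads: for four vectors, a block size of 1 needs 10 reads and a block size of 2 needs 6.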

Relevance:

30.00%

Abstract:

Pavlovian fear conditioning is a robust technique for examining behavioral and cellular components of fear learning and memory. In fear conditioning, the subject learns to associate a previously neutral stimulus with an inherently noxious co-stimulus. The learned association is reflected in the subjects' behavior upon subsequent re-exposure to the previously neutral stimulus or the training environment. Using fear conditioning, investigators can obtain a large amount of data that describe multiple aspects of learning and memory. In a single test, researchers can evaluate functional integrity in fear circuitry, which is both well characterized and highly conserved across species. Additionally, the availability of sensitive and reliable automated scoring software makes fear conditioning amenable to high-throughput experimentation in the rodent model; thus, this model of learning and memory is particularly useful for pharmacological and toxicological screening. Due to the conserved nature of fear circuitry across species, data from Pavlovian fear conditioning are highly translatable to human models. We describe equipment and techniques needed to perform and analyze conditioned fear data. We provide two examples of fear conditioning experiments, one in rats and one in mice, and the types of data that can be collected in a single experiment. © 2012 Springer Science+Business Media, LLC.

Relevance:

30.00%

Abstract:

In-memory databases have become a mainstay of enterprise computing, offering significant performance and scalability boosts for online analytical and (to a lesser extent) transactional processing, as well as improved prospects for integration across different applications through an efficient shared database layer. Significant research and development has been undertaken over several years concerning data management considerations of in-memory databases. However, limited insights are available on the impact on applications and their supporting middleware platforms, and on how these need to evolve to fully function through, and leverage, in-memory database capabilities. This paper provides a first comprehensive exposition of how in-memory databases impact Business Process Management, as a mission-critical and exemplary model-driven integration and orchestration middleware. Through it, we argue that in-memory databases will render some prevalent uses of legacy BPM middleware obsolete, but also open up exciting possibilities for tighter application integration, better process automation performance and some entirely new BPM capabilities such as process-based application customization. To validate the feasibility of an in-memory BPM, we develop a surprisingly simple BPM runtime embedded into SAP HANA that provides BPMN-based process automation capabilities.

Relevance:

30.00%

Abstract:

This research explored how small and medium enterprises can achieve success with software as a service (SaaS) applications from the cloud. Based upon an empirical investigation of six growth-oriented, early technology-adopting small and medium enterprises, this study proposes a SaaS success model for small and medium enterprises with two approaches: one for basic and one for advanced benefits. The basic model explains the effective use of SaaS for achieving informational and transactional benefits. The advanced model explains the enhanced use of SaaS for achieving strategic and transformational benefits. Both models explicate the information systems capabilities and organizational complementarities needed for achieving success with SaaS.

Relevance:

30.00%

Abstract:

The literature contains many examples of digital procedures for the analytical treatment of electroencephalograms, but there is as yet no standard by which those techniques may be judged or compared. This paper proposes one method of generating an EEG, based on a computer program for Zetterberg's simulation. It is assumed that the statistical properties of an EEG may be represented by stationary processes having rational transfer functions and achieved by a system of software filters and random number generators. The model represents neither the neurological mechanism responsible for generating the EEG, nor any particular type of EEG record; transient phenomena such as spikes, sharp waves and alpha bursts are also excluded. The basis of the program is a valid ‘partial’ statistical description of the EEG; that description is then used to produce a digital representation of a signal which, if plotted sequentially, might or might not by chance resemble an EEG; that is unimportant. What is important is that the statistical properties of the series remain those of a real EEG; it is in this sense that the output is a simulation of the EEG. There is considerable flexibility in the form of the output, i.e. its alpha, beta and delta content, which may be selected by the user, the same selected parameters always producing the same statistical output. The filtered outputs from the random number sequences may be scaled to provide realistic power distributions in the accepted EEG frequency bands and then summed to create a digital output signal, the ‘stationary EEG’. It is suggested that the simulator might act as a test input to digital analytical techniques for the EEG, a simulator which would enable at least a substantial part of those techniques to be compared and assessed in an objective manner. The equations necessary to implement the model are given.
The program has been run on a DEC1090 computer but is suitable for any microcomputer having more than 32 kBytes of memory; the execution time required to generate a 25 s simulated EEG is in the region of 15 s.
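
The pipeline described in this abstract (random number sequences passed through software filters, scaled per frequency band, then summed) can be sketched as follows. This is a minimal illustration, not Zetterberg's model and not the program described above. The two-pole resonator, the pole radius `r`, the band centre frequencies and the amplitudes in the usage note are all assumptions chosen for illustration, and amplitudes are in arbitrary units.

```python
import math
import random


def resonator(noise, freq_hz, fs, r=0.95):
    """Two-pole (all-pole) filter with a rational transfer function:
    y[n] = x[n] + 2 r cos(2 pi f / fs) y[n-1] - r^2 y[n-2].
    The poles lie at radius r < 1, so the filter is stable."""
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / fs)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in noise:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def scale_to_rms(signal, target_rms):
    """Rescale a signal so that its root-mean-square value is target_rms."""
    rms = math.sqrt(sum(v * v for v in signal) / len(signal))
    return [v * target_rms / rms for v in signal]


def simulate_eeg(seconds, fs, bands, seed):
    """Sum of band-shaped noise components.

    bands is a list of (centre_frequency_hz, target_rms) pairs.
    A fixed seed makes the output reproducible.
    """
    rng = random.Random(seed)
    n = int(seconds * fs)
    total = [0.0] * n
    for freq_hz, target_rms in bands:
        noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
        component = scale_to_rms(resonator(noise, freq_hz, fs), target_rms)
        total = [t + c for t, c in zip(total, component)]
    return total
```

For example, `simulate_eeg(25, 100, [(2.0, 30.0), (10.0, 20.0), (20.0, 5.0)], seed=1)` yields 2500 samples with illustrative delta, alpha and beta components; calling it again with the same arguments returns the same sequence.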

Relevance:

30.00%

Abstract:

The design, implementation and evaluation of a dual-microcomputer system based on the concept of shared memory are described. Shared memory is useful for passing large blocks of data, and it also provides a means to hold and work with shared data. In addition to the shared memory, a separate bus between the I/O ports of the microcomputers is provided. This bus is used for interprocessor synchronization. Software routines helpful in applying the dual-microcomputer system to realistic problems are presented. Performance evaluation of the system is carried out using benchmarks.

Relevance:

30.00%

Abstract:

The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general-purpose multi-core architectures. This model allows programmers to specify the structure of a program as a set of filters that act upon data, and a set of communication channels between them. The StreamIt graphs describe task, data and pipeline parallelism, which can be exploited on modern Graphics Processing Units (GPUs), as they support abundant parallelism in hardware. In this paper, we describe the challenges in mapping StreamIt to GPUs and propose an efficient technique to software pipeline the execution of stream programs on GPUs. We formulate this problem (both scheduling and assignment of filters to processors) as an efficient Integer Linear Program (ILP), which is then solved using ILP solvers. We also describe a novel buffer layout technique for GPUs which facilitates exploiting the high memory bandwidth available in GPUs. The proposed scheduling utilizes both the scalar units in the GPU, to exploit data parallelism, and the multiprocessors, to exploit task and pipeline parallelism. Further, it takes into consideration the synchronization and bandwidth limitations of GPUs, and yields speedups between 1.87X and 36.83X over a single-threaded CPU.

Relevance:

30.00%

Abstract:

What happened to the IT sector only a few years ago, when the IT bubble burst, is still fresh in our memory. The upswing of productivity in this sector slowed down, investors lost large investments, many found themselves looking for a new job, and countless dreams fell apart. Product developers in the IT sector have experienced a large number of organizational restructurings since the IT boom, including rapid growth, downsizing processes, and structural reforms. Organizational restructurings seem to be a complex and continuous phenomenon that people in this sector have to deal with. How do software product developers retrospectively construct their work in relation to organizational restructurings? How do organizational restructurings bring about specific social processes in product development? This working paper focuses on these questions. The overall aim is to develop an understanding of how software product developers construct their work during organizational restructurings. The theoretical frame of reference is based on a social constructionist approach and discourse analysis. This approach offers more or less radical and critical alternatives to mainstream organizational theory. Writings from this perspective attempt to investigate and understand sociocultural processes by which various realities are created. Therefore these studies aim at showing how people participate in constituting the social world (Gergen & Thatchenkery, 1996); knowledge of the world is seen to be constructed between people in daily interaction, in which language plays a central role. This means that interaction, especially the ways of talking and writing about product development during organizational restructurings, becomes the target of concern. This study consists of 25 in-depth interviews following a pilot study based on 57 semi-structured interviews. In this working paper I analyze 9 in-depth interviews. The interviews were conducted in eight IT firms.
The analysis explores how discourses are constructed and function, as well as the consequences that follow from different discourses. The analysis shows that even though the product developers have experienced many organizational restructurings, some of which have been far-reaching, their accounts build strongly on a stability discourse. According to this discourse, product development is, perhaps surprisingly, not influenced to a great extent by organizational restructurings. This does not mean that product development is static. According to the social constructionist approach, product development is constantly being reproduced and maintained in ongoing processes. In other words, stable effects are also ongoing achievements, and these are of particular interest in this study. The product developers maintain rather than change the product development through ongoing processes of construction, even when they experience continuous extensive organizational restructurings. The discourse of stability exists alongside other discourses, some of which contradict each other. Together they direct product development and generate meanings. The product developers consequently take an active role in the construction of their work during organizational restructurings. When doing this they also negotiate credible positions for themselves.

Relevance:

30.00%

Abstract:

Today's SoCs are complex designs with multiple embedded processors, memory subsystems, and application-specific peripherals. The memory architecture of embedded SoCs strongly influences the power and performance of the entire system. Further, the memory subsystem constitutes a major part (typically up to 70%) of the silicon area of current-day SoCs. In this article, we address on-chip memory architecture exploration for DSP processors in which memory is organized as multiple banks, where banks can be single- or dual-ported with non-uniform bank sizes. We propose two different methods for physical memory architecture exploration and identify the strengths and applicability of these methods in a systematic way. Both methods address memory architecture exploration for a given target application by considering the application's data access characteristics, and generate a set of Pareto-optimal design points that are interesting from a power, performance and VLSI area perspective. To the best of our knowledge, this is the first comprehensive work on memory space exploration at the physical memory level that integrates data layout and memory exploration to address the system objectives from both the hardware design and the application software development perspectives. Further, we propose an automatic framework that explores the design space, identifying hundreds of Pareto-optimal design points within a few hours of running on a standard desktop configuration.

Relevance:

30.00%

Abstract:

Fast content-addressable data access mechanisms have compelling applications in today's systems. Many of these exploit the powerful wildcard matching capabilities provided by ternary content-addressable memories (TCAMs). For example, TCAM-based implementations of important algorithms in data mining have been developed in recent years; these achieve an order of magnitude speedup over prevalent techniques. However, large hardware TCAMs are still prohibitively expensive in terms of power consumption and cost per bit. This has been a barrier to extending their exploitation beyond niche and special-purpose systems. We propose an approach to overcome this barrier by extending the traditional virtual memory hierarchy to scale up the user-visible capacity of TCAMs while mitigating the power consumption overhead. By exploiting the notion of content locality (as opposed to spatial locality), we devise a novel combination of software and hardware techniques to provide an abstraction of a large virtual ternary content-addressable space. In the long run, such abstractions enable applications to disassociate considerations of spatial locality and contiguity from the way data is referenced. If successful, ideas for making content addressability a first-class abstraction in computing systems can open up a radical shift in the way applications are optimized for memory locality, just as storage class memories are soon expected to shift away from the way in which applications are typically optimized for disk access locality.
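
The wildcard matching that this abstract refers to can be illustrated with a small software emulation of ternary matching. This is a sketch of the matching semantics only, not the virtual-memory technique the authors propose. Each stored pattern is a string over `0`, `1` and `x` (don't care). The class name `SoftTCAM`, its methods, and the first-match priority rule are assumptions made for illustration. A hardware TCAM compares all entries in parallel, whereas this sketch scans them one by one.

```python
def parse_pattern(pattern):
    """Convert a ternary pattern such as '10xx' into (value, mask) integers.

    A mask bit of 1 means "this bit must match"; 0 means "don't care".
    Value bits are only ever set where the mask bit is set.
    """
    value = mask = 0
    for ch in pattern:
        value <<= 1
        mask <<= 1
        if ch == "1":
            value |= 1
            mask |= 1
        elif ch == "0":
            mask |= 1
        elif ch != "x":
            raise ValueError(f"invalid ternary symbol: {ch!r}")
    return value, mask


class SoftTCAM:
    """Linear-scan emulation of a ternary content-addressable memory."""

    def __init__(self):
        self.entries = []  # list of (value, mask, data), in priority order

    def add(self, pattern, data):
        value, mask = parse_pattern(pattern)
        self.entries.append((value, mask, data))

    def lookup(self, key):
        """Return the data of the first entry whose cared-for bits equal key's."""
        for value, mask, data in self.entries:
            if key & mask == value:
                return data
        return None
```

With entries `10xx -> "A"` followed by `1xxx -> "B"`, the key `0b1011` matches both patterns and returns `"A"` because that entry was added first, `0b1100` returns `"B"`, and `0b0100` matches nothing.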

Relevance:

30.00%

Abstract:

Most Java programmers would agree that Java is a language that promotes a philosophy of “create and go forth”. By design, temporary objects are meant to be created on the heap, possibly used and then abandoned to be collected by the garbage collector. Excessive generation of temporary objects is termed “object churn” and is a form of software bloat that often leads to performance and memory problems. To mitigate this problem, many compiler optimizations aim at identifying objects that may be allocated on the stack. However, most such optimizations miss large opportunities for memory reuse when dealing with objects inside loops or when dealing with container objects. In this paper, we describe a novel algorithm that detects bloat caused by the creation of temporary container and String objects within a loop. Our analysis determines which objects created within a loop can be reused. Then we describe a source-to-source transformation that efficiently reuses such objects. Empirical evaluation indicates that our solution can reduce temporary object allocations by up to 40% in large programs, resulting in a performance improvement that can be as high as a 20% reduction in run time, specifically when a program has a high churn rate or when the program is memory intensive and needs to run the GC often.