Biblioteca Digital

985 resultados para Software architectures

A FIFO-based multicast network and its use in multicomputers

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present an implementation of a multicast network of processors. The processors are connected in a fully connected network and it is possible to broadcast data in a single instruction. The network works at the processor-memory speed and therefore provides a fast communication link among processors. A number of interesting architectures are possible using such a network. We show some of these architectures which have been implemented and are functional. We also show the system software calls which allow programming of these machines in parallel mode.

Lean Thinking in Software Development : Impacts of Kanban on Projects

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The history of software development in a somewhat systematical way has been performed for half a century. Despite this time period, serious failures in software development projects still occur. The pertinent mission of software project management is to continuously achieve more and more successful projects. The application of agile software methods and more recently the integration of Lean practices contribute to this trend of continuous improvement in the software industry. One such area warranting proper empirical evidence is the operational efficiency of projects. In the field of software development, Kanban as a process management method has gained momentum recently, mostly due to its linkages to Lean thinking. However, only a few empirical studies investigate the impacts of Kanban on projects in that particular area. The aim of this doctoral thesis is to improve the understanding of how Kanban impacts on software projects. The research is carried out in the area of Lean thinking, which contains a variety of concepts including Kanban. This article-type thesis conducts a set of case studies expanded with the research strategy of quasi-controlled experiment. The data-gathering techniques of interviews, questionnaires, and different types of observations are used to study the case projects, and thereby to understand the impacts of Kanban on software development projects. The research papers of the thesis are refereed, international journal and conference publications. The results highlight new findings regarding the application of Kanban in the software context. The key findings of the thesis suggest that Kanban is applicable to software development. Despite its several benefits reported in this thesis, the empirical evidence implies that Kanban is not all-encompassing but requires additional practices to keep development projects performing appropriately. Implications for research are given, as well. In addition to these findings, the thesis contributes in the area of plan-driven software development by suggesting implications both for research and practitioners. As a conclusion, Kanban can benefit software development projects but additional practices would increase its potential for the projects.

Compiler-assisted power optimization for clustered VLIW architectures

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Clustered VLIW architectures solve the scalability problem associated with flat VLIW architectures by partitioning the register file and connecting only a subset of the functional units to a register file. However, inter-cluster communication in clustered architectures leads to increased leakage in functional components and a high number of register accesses. In this paper, we propose compiler scheduling algorithms targeting two previously ignored power-hungry components in clustered VLIW architectures, viz., instruction decoder and register file. We consider a split decoder design and propose a new energy-aware instruction scheduling algorithm that provides 14.5% and 17.3% benefit in the decoder power consumption on an average over a purely hardware based scheme in the context of 2-clustered and 4-clustered VLIW machines. In the case of register files, we propose two new scheduling algorithms that exploit limited register snooping capability to reduce extra register file accesses. The proposed algorithms reduce register file power consumption on an average by 6.85% and 11.90% (10.39% and 17.78%), respectively, along with performance improvement of 4.81% and 5.34% (9.39% and 11.16%) over a traditional greedy algorithm for 2-clustered (4-clustered) VLIW machine. (C) 2010 Elsevier B.V. All rights reserved.

Analyzing Cache Performance Bottlenecks of STM Applications and addressing them with Compiler's help

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Software transactional memory (STM) is a promising programming paradigm for shared memory multithreaded programs as an alternative to traditional lock based synchronization. However adoption of STM in mainstream software has been quite low due to its considerable overheads and its poor cache/memory performance. In this paper, we perform a detailed study of the cache behavior of STM applications and quantify the impact of different STM factors on the cache misses experienced by the applications. Based on our analysis, we propose a compiler driven Lock-Data Colocation (LDC), targeted at reducing the cache overheads on STM. We show that LDC is effective in improving the cache behavior of STM applications by reducing the dcache miss latency and improving execution time performance.

A Case Study in Matching Service Descriptions to Implementations in an Existing System

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A number of companies are trying to migrate large monolithic software systems to Service Oriented Architectures. A common approach to do this is to first identify and describe desired services (i.e., create a model), and then to locate portions of code within the existing system that implement the described services. In this paper we describe a detailed case study we undertook to match a model to an open-source business application. We describe the systematic methodology we used, the results of the exercise, as well as several observations that throw light on the nature of this problem. We also suggest and validate heuristics that are likely to be useful in partially automating the process of matching service descriptions to implementations.

Comparative-Study of Photonic Switching architectures

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The basic photonic switching elements of practical importance are outlined. A detailed comparative study of photonic switching architectures is presented both for guided wave fabrics and free-space fabrics. The required equations for comparative study are obtained, after considering the parameters like bend losses, effects of waveguide crossings, etc. The potential areas of application of photonic switching are pointed out.

Scheduling expression trees with reusable registers on delayed-load architectures

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we look at the problem of scheduling expression trees with reusable registers on delayed load architectures. Reusable registers come into the picture when the compiler has a data-flow analyzer which is able to estimate the extent of use of the registers. Earlier work considered the same problem without allowing for register variables. Subsequently, Venugopal considered non-reusable registers in the tree. We further extend these efforts to consider a much more general form of the tree. We describe an approximate algorithm for the problem. We formally prove that the code schedule produced by this algorithm will, in the worst case, generate one interlock and use just one more register than that used by the optimal schedule. Spilling is minimized. The approximate algorithm is simple and has linear complexity.

Low-power pipelined LMS adaptive filter architectures with minimal adaptation delay

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The use of delayed coefficient adaptation in the least mean square (LMS) algorithm has enabled the design of pipelined architectures for real-time transversal adaptive filtering. However, the convergence speed of this delayed LMS (DLMS) algorithm, when compared with that of the standard LMS algorithm, is degraded and worsens with increase in the adaptation delay. Existing pipelined DLMS architectures have large adaptation delay and hence degraded convergence speed. We in this paper, first present a pipelined DLMS architecture with minimal adaptation delay for any given sampling rate. The architecture is synthesized by using a number of function preserving transformations on the signal flow graph representation of the DLMS algorithm. With the use of carry-save arithmetic, the pipelined architecture can support high sampling rates, limited only by the delay of a full adder and a 2-to-1 multiplexer. In the second part of this paper, we extend the synthesis methodology described in the first part, to synthesize pipelined DLMS architectures whose power dissipation meets a specified budget. This low-power architecture exploits the parallelism in the DLMS algorithm to meet the required computational throughput. The architecture exhibits a novel tradeoff between algorithmic performance (convergence speed) and power dissipation. (C) 1999 Elsevier Science B.V. All rights resented.

Mapping adaptive resonance theory onto ring and mesh architectures

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In recent years, parallel computers have been attracting attention for simulating artificial neural networks (ANN). This is due to the inherent parallelism in ANN. This work is aimed at studying ways of parallelizing adaptive resonance theory (ART), a popular neural network algorithm. The core computations of ART are separated and different strategies of parallelizing ART are discussed. We present mapping strategies for ART 2-A neural network onto ring and mesh architectures. The required parallel architecture is simulated using a parallel architectural simulator, PROTEUS and parallel programs are written using a superset of C for the algorithms presented. A simulation-based scalability study of the algorithm-architecture match is carried out. The various overheads are identified in order to suggest ways of improving the performance. Our main objective is to find out the performance of the ART2-A network on different parallel architectures. (C) 1999 Elsevier Science B.V. All rights reserved.

Integrating CDS/ISIS Databases with Greenstone Digital Library Software (GSDL)

Relevância:

20.00% 20.00%

Publicador:

Resumo:

CDS/ISIS is an advanced non-numerical information storage and retrieval software developed by UNESCO since 1985 to satisfy the need expressed by many institutions, especially in developing countries, to be able to streamline their information processing activities by using modern (and relatively inexpensive) technologies [1]. CDS/ISIS is available for MS-DOS, Windows and Unix operating system platforms. The formatting language of CDS/ISIS is one of its several strengths. It is not only used for formatting records for display but is also used for creating customized indexes. CDS/ISIS by itself does not facilitate in publishing its databases on the Internet nor does it facilitate in publishing on CD-ROMs. However, numbers of open source tools are now available, which enables in publishing CDS/ISIS databases on the Internet and also on CD-ROMs. In this paper, we have discussed the ways and means of integrating CDS/ISIS databases with GSDL, an open source digital library (DL) software.

Scheduling expression trees for delayed-load architectures

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we consider the problem of scheduling expression trees on delayed-load architectures. The problem tackled here takes root from the one considered in [Proceedings of the ACM SIGPLAN '91 Conf. on Programming Language Design and Implementation, 1991. p. 256] in which the leaves of the expression trees all refer to memory locations. A generalization of this involves the situation in which the trees may contain register variables, with the registers being used only at the leaves. Solutions to this generalization are given in [ACM Trans. Prog. Lang. Syst. 17 (1995) 740, Microproc. Microprog. 40 (1994) 577]. This paper considers the most general case in which the registers are reusable. This problem is tackled in [Comput. Lang, 21 (1995) 49] which gives an approximate solution to the problem under certain assumptions about the contiguity of the evaluation order: Here we propose an optimal solution (which may involve even a non-contiguous evaluation of the tree). The schedule generated by the algorithm given in this paper is optimal in the sense that it is an interlock-free schedule which uses the minimum number of registers required. An extension to the algorithm incorporates spilling. The problem as stated in this paper is an instruction scheduling problem. However, the problem could also be rephrased as an operations research problem with a difference in terminology. (C) 2002 Elsevier Science B.V. All rights reserved.

Simultaneous MultiStreaming for complexity-effective VLIW architectures

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Very Long Instruction Word (VLIW) architectures exploit instruction level parallelism (ILP) with the help of the compiler to achieve higher instruction throughput with minimal hardware. However, control and data dependencies between operations limit the available ILP, which not only hinders the scalability of VLIW architectures, but also result in code size expansion. Although speculation and predicated execution mitigate ILP limitations due to control dependencies to a certain extent, they increase hardware cost and exacerbate code size expansion. Simultaneous multistreaming (SMS) can significantly improve operation throughput by allowing interleaved execution of operations from multiple instruction streams. In this paper we study SMS for VLIW architectures and quantify the benefits associated with it using a case study of the MPEG-2 video decoder. We also propose the notion of virtual resources for VLIW architectures, which decouple architectural resources (resources exposed to the compiler) from the microarchitectural resources, to limit code size expansion. Our results for a VLIW architecture demonstrate that: (1) SMS delivers much higher throughput than that achieved by speculation and predicated execution, (2) the increase in performance due to the addition of speculation and predicated execution support over SMS averages around 12%. The minor increase in performance might not warrant the additional hardware complexity involved, and (3) the notion of virtual resources is very effective in reducing no-operations (NOPs) and consequently reduce code size with little or no impact on performance.

Realizing a flexible constraint length Viterbi decoder for software radio on a de Bruijn interconnection network

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Building flexible constraint length Viterbi decoders requires us to be able to realize de Bruijn networks of various sizes on the physically provided interconnection network. This paper considers the case when the physical network is itself a de Bruijn network and presents a scalable technique for realizing any n-node de Bruijn network on an N-node de Bruijn network, where n < N. The technique ensures that the length of the longest path realized on the network is minimized and that each physical connection is utilized to send only one data item, both of which are desirable in order to reduce the hardware complexity of the network and to obtain the best possible performance.

A conceptual vision of interoperability of CAD tools and software to track evolution of design process

Relevância:

20.00% 20.00%

Publicador:

Java Memory Model aware Software validation

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The Java Memory Model (JMM) provides a semantics of Java multithreading for any implementation platform. The JMM is defined in a declarative fashion with an allowed program execution being defined in terms of existence of "commit sequences" (roughly, the order in which actions in the execution are committed). In this work, we develop OpMM, an operational under-approximation of the JMM. The immediate motivation of this work lies in integrating a formal specification of the JMM with software model checkers. We show how our operational memory model description can be integrated into a Java Path Finder (JPF) style model checker for Java programs.

«
1
2
...
16
17
18
19
20
21
22
...
65
66
»