102 resultados para Software Architecture

em Indian Institute of Science - Bangalore - Índia


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Precision, sophistication and economic factors in many areas of scientific research that demand very high magnitude of compute power is the order of the day. Thus advance research in the area of high performance computing is getting inevitable. The basic principle of sharing and collaborative work by geographically separated computers is known by several names such as metacomputing, scalable computing, cluster computing, internet computing and this has today metamorphosed into a new term known as grid computing. This paper gives an overview of grid computing and compares various grid architectures. We show the role that patterns can play in architecting complex systems, and provide a very pragmatic reference to a set of well-engineered patterns that the practicing developer can apply to crafting his or her own specific applications. We are not aware of pattern-oriented approach being applied to develop and deploy a grid. There are many grid frameworks that are built or are in the process of being functional. All these grids differ in some functionality or the other, though the basic principle over which the grids are built is the same. Despite this there are no standard requirements listed for building a grid. The grid being a very complex system, it is mandatory to have a standard Software Architecture Specification (SAS). We attempt to develop the same for use by any grid user or developer. Specifically, we analyze the grid using an object oriented approach and presenting the architecture using UML. This paper will propose the usage of patterns at all levels (analysis. design and architectural) of the grid development.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We share our experience in planning, designing and deploying a wireless sensor network of one square kilometre area. Environmental data such as soil moisture, temperature, barometric pressure, and relative humidity are collected in this area situated in the semi-arid region of Karnataka, India. It is a hope that information derived from this data will benefit the marginal farmer towards improving his farming practices. Soon after establishing the need for such a project, we begin by showing the big picture of such a data gathering network, the software architecture we have used, the range measurements needed for determining the sensor density, and the packaging issues that seem to play a crucial role in field deployments. Our field deployment experiences include designing with intermittent grid power, enhancing software tools to aid quicker and effective deployment, and flash memory corruption. The first results on data gathering look encouraging.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Video decoders used in emerging applications need to be flexible to handle a large variety of video formats and deliver scalable performance to handle wide variations in workloads. In this paper we propose a unified software and hardware architecture for video decoding to achieve scalable performance with flexibility. The light weight processor tiles and the reconfigurable hardware tiles in our architecture enable software and hardware implementations to co-exist, while a programmable interconnect enables dynamic interconnection of the tiles. Our process network oriented compilation flow achieves realization agnostic application partitioning and enables seamless migration across uniprocessor, multi-processor, semi hardware and full hardware implementations of a video decoder. An application quality of service aware scheduler monitors and controls the operation of the entire system. We prove the concept through a prototype of the architecture on an off-the-shelf FPGA. The FPGA prototype shows a scaling in performance from QCIF to 1080p resolutions in four discrete steps. We also demonstrate that the reconfiguration time is short enough to allow migration from one configuration to the other without any frame loss.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The prevalent virtualization technologies provide QoS support within the software layers of the virtual machine monitor(VMM) or the operating system of the virtual machine(VM). The QoS features are mostly provided as extensions to the existing software used for accessing the I/O device because of which the applications sharing the I/O device experience loss of performance due to crosstalk effects or usable bandwidth. In this paper we examine the NIC sharing effects across VMs on a Xen virtualized server and present an alternate paradigm that improves the shared bandwidth and reduces the crosstalk effect on the VMs. We implement the proposed hardwaresoftware changes in a layered queuing network (LQN) model and use simulation techniques to evaluate the architecture. We find that simple changes in the device architecture and associated system software lead to application throughput improvement of up to 60%. The architecture also enables finer QoS controls at device level and increases the scalability of device sharing across multiple virtual machines. We find that the performance improvement derived using LQN model is comparable to that reported by similar but real implementations.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we propose the architecture of a SoC fabric onto which applications described in a HLL are synthesized. The fabric is a homogeneous layout of computation, storage and communication resources on silicon. Through a process of composition of resources (as opposed to decomposition of applications), application specific computational structures are defined on the fabric at runtime to realize different modules of the applications in hardware. Applications synthesized on this fabric offers performance comparable to ASICs while retaining the programmability of processing cores. We outline the application synthesis methodology through examples, and compare our results with software implementations on traditional platforms with unbounded resources.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Today's SoCs are complex designs with multiple embedded processors, memory subsystems, and application specific peripherals. The memory architecture of embedded SoCs strongly influences the power and performance of the entire system. Further, the memory subsystem constitutes a major part (typically up to 70%) of the silicon area for the current day SoC. In this article, we address the on-chip memory architecture exploration for DSP processors which are organized as multiple memory banks, where banks can be single/dual ported with non-uniform bank sizes. In this paper we propose two different methods for physical memory architecture exploration and identify the strengths and applicability of these methods in a systematic way. Both methods address the memory architecture exploration for a given target application by considering the application's data access characteristics and generates a set of Pareto-optimal design points that are interesting from a power, performance and VLSI area perspective. To the best of our knowledge, this is the first comprehensive work on memory space exploration at physical memory level that integrates data layout and memory exploration to address the system objectives from both hardware design and application software development perspective. Further we propose an automatic framework that explores the design space identifying 100's of Pareto-optimal design points within a few hours of running on a standard desktop configuration.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we present the design of ``e-SURAKSHAK,'' a novel cyber-physical health care management system of Wireless Embedded Internet Devices (WEIDs) that sense vital health parameters. The system is capable of sensing body temperature, heart rate, oxygen saturation level and also allows noninvasive blood pressure (NIBP) measurement. End to end internet connectivity is provided by using 6LoWPAN based wireless network that uses the 802.15.4 radio. A service oriented architecture (SOA) 1] is implemented to extract meaningful information and present it in an easy-to-understand form to the end-user instead of raw data made available by sensors. A central electronic database and health care management software are developed. Vital health parameters are measured and stored periodically in the database. Further, support for real-time measurement of health parameters is provided through a web based GUI. The system has been implemented completely and demonstrated with multiple users and multiple WEIDs.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

High end network security applications demand high speed operation and large rule set support. Packet classification is the core functionality that demands high throughput in such applications. This paper proposes a packet classification architecture to meet such high throughput. We have implemented a Firewall with this architecture in reconflgurable hardware. We propose an extension to Distributed Crossproducting of Field Labels (DCFL) technique to achieve scalable and high performance architecture. The implemented Firewall takes advantage of inherent structure and redundancy of rule set by using our DCFL Extended (DCFLE) algorithm. The use of DCFLE algorithm results in both speed and area improvement when it is implemented in hardware. Although we restrict ourselves to standard 5-tuple matching, the architecture supports additional fields. High throughput classification invariably uses Ternary Content Addressable Memory (TCAM) for prefix matching, though TCAM fares poorly in terms of area and power efficiency. Use of TCAM for port range matching is expensive, as the range to prefix conversion results in large number of prefixes leading to storage inefficiency. Extended TCAM (ETCAM) is fast and the most storage efficient solution for range matching. We present for the first time a reconfigurable hardware implementation of ETCAM. We have implemented our Firewall as an embedded system on Virtex-II Pro FPGA based platform, running Linux with the packet classification in hardware. The Firewall was tested in real time with 1 Gbps Ethernet link and 128 sample rules. The packet classification hardware uses a quarter of logic resources and slightly over one third of memory resources of XC2VP30 FPGA. It achieves a maximum classification throughput of 50 million packet/s corresponding to 16 Gbps link rate for the worst case packet size. The Firewall rule update involves only memory re-initialization in software without any hardware change.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

High end network security applications demand high speed operation and large rule set support. Packet classification is the core functionality that demands high throughput in such applications. This paper proposes a packet classification architecture to meet such high throughput. We have Implemented a Firewall with this architecture in reconfigurable hardware. We propose an extension to Distributed Crossproducting of Field Labels (DCFL) technique to achieve scalable and high performance architecture. The implemented Firewall takes advantage of inherent structure and redundancy of rule set by using, our DCFL Extended (DCFLE) algorithm. The use of DCFLE algorithm results In both speed and area Improvement when It is Implemented in hardware. Although we restrict ourselves to standard 5-tuple matching, the architecture supports additional fields.High throughput classification Invariably uses Ternary Content Addressable Memory (TCAM) for prefix matching, though TCAM fares poorly In terms of area and power efficiency. Use of TCAM for port range matching is expensive, as the range to prefix conversion results in large number of prefixes leading to storage inefficiency. Extended TCAM (ETCAM) is fast and the most storage efficient solution for range matching. We present for the first time a reconfigurable hardware Implementation of ETCAM. We have implemented our Firewall as an embedded system on Virtex-II Pro FPGA based platform, running Linux with the packet classification in hardware. The Firewall was tested in real time with 1 Gbps Ethernet link and 128 sample rules. The packet classification hardware uses a quarter of logic resources and slightly over one third of memory resources of XC2VP30 FPGA. It achieves a maximum classification throughput of 50 million packet/s corresponding to 16 Gbps link rate for file worst case packet size. The Firewall rule update Involves only memory re-initialiization in software without any hardware change.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Simultaneous consideration of both performance and reliability issues is important in the choice of computer architectures for real-time aerospace applications. One of the requirements for such a fault-tolerant computer system is the characteristic of graceful degradation. A shared and replicated resources computing system represents such an architecture. In this paper, a combinatorial model is used for the evaluation of the instruction execution rate of a degradable, replicated resources computing system such as a modular multiprocessor system. Next, a method is presented to evaluate the computation reliability of such a system utilizing a reliability graph model and the instruction execution rate. Finally, this computation reliability measure, which simultaneously describes both performance and reliability, is applied as a constraint in an architecture optimization model for such computing systems. Index Terms-Architecture optimization, computation

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents on overview of the issues in precisely defining, specifying and evaluating the dependability of software, particularly in the context of computer controlled process systems. Dependability is intended to be a generic term embodying various quality factors and is useful for both software and hardware. While the developments in quality assurance and reliability theories have proceeded mostly in independent directions for hardware and software systems, we present here the case for developing a unified framework of dependability—a facet of operational effectiveness of modern technological systems, and develop a hierarchical systems model helpful in clarifying this view. In the second half of the paper, we survey the models and methods available for measuring and improving software reliability. The nature of software “bugs”, the failure history of the software system in the various phases of its lifecycle, the reliability growth in the development phase, estimation of the number of errors remaining in the operational phase, and the complexity of the debugging process have all been considered to varying degrees of detail. We also discuss the notion of software fault-tolerance, methods of achieving the same, and the status of other measures of software dependability such as maintainability, availability and safety.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The literature contains many examples of digital procedures for the analytical treatment of electroencephalograms, but there is as yet no standard by which those techniques may be judged or compared. This paper proposes one method of generating an EEG, based on a computer program for Zetterberg's simulation. It is assumed that the statistical properties of an EEG may be represented by stationary processes having rational transfer functions and achieved by a system of software fillers and random number generators.The model represents neither the neurological mechanism response for generating the EEG, nor any particular type of EEG record; transient phenomena such as spikes, sharp waves and alpha bursts also are excluded. The basis of the program is a valid ‘partial’ statistical description of the EEG; that description is then used to produce a digital representation of a signal which if plotted sequentially, might or might not by chance resemble an EEG, that is unimportant. What is important is that the statistical properties of the series remain those of a real EEG; it is in this sense that the output is a simulation of the EEG. There is considerable flexibility in the form of the output, i.e. its alpha, beta and delta content, which may be selected by the user, the same selected parameters always producing the same statistical output. The filtered outputs from the random number sequences may be scaled to provide realistic power distributions in the accepted EEG frequency bands and then summed to create a digital output signal, the ‘stationary EEG’. It is suggested that the simulator might act as a test input to digital analytical techniques for the EEG, a simulator which would enable at least a substantial part of those techniques to be compared and assessed in an objective manner. The equations necessary to implement the model are given. The program has been run on a DEC1090 computer but is suitable for any microcomputer having more than 32 kBytes of memory; the execution time required to generate a 25 s simulated EEG is in the region of 15 s.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we develop compilation techniques for the realization of applications described in a High Level Language (HLL) onto a Runtime Reconfigurable Architecture. The compiler determines Hyper Operations (HyperOps) that are subgraphs of a data flow graph (of an application) and comprise elementary operations that have strong producer-consumer relationship. These HyperOps are hosted on computation structures that are provisioned on demand at runtime. We also report compiler optimizations that collectively reduce the overheads of data-driven computations in runtime reconfigurable architectures. On an average, HyperOps offer a 44% reduction in total execution time and a 18% reduction in management overheads as compared to using basic blocks as coarse grained operations. We show that HyperOps formed using our compiler are suitable to support data flow software pipelining.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents the architecture and the VHDL design of an integer 2-D DCT used in the H.264/AVC. The 2-D DCT computation is performed by exploiting it’s orthogonality and separability property. The symmetry of the forward and inverse transform is used in this implementation. To reduce the computation overhead for the addition, subtraction and multiplication operations, we analyze the suitability of carry-free position independent residue number system (RNS) for the implementation of 2-D DCT. The implementation has been carried out in VHDL for Altera FPGA. We used the negative number representation in RNS, bit width analysis of the transforms and dedicated registers present in the Logic element of the FPGA to optimize the area. The complexity and efficiency analysis show that the proposed architecture could provide higher through-put.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents the architecture of a fault-tolerant, special-purpose multi-microprocessor system for solving Partial Differential Equations (PDEs). The modular nature of the architecture allows the use of hundreds of Processing Elements (PEs) for high throughput. Its performance is evaluated by both analytical and simulation methods. The results indicate that the system can achieve high operation rates and is not sensitive to inter-processor communication delay.