974 results for Computing cost
Abstract:
Technology scaling has enabled drastic growth in the computational and storage capacity of integrated circuits (ICs). This constant growth drives an increasing demand for high-bandwidth communication between and within ICs. In this dissertation we focus on low-power solutions that address this demand. We divide communication links into three categories depending on the communication distance; each category has a different set of challenges and requirements and is affected by CMOS technology scaling in a different manner. We start with short-range chip-to-chip links for board-level communication, then discuss board-to-board links, which demand a longer communication range, and finally on-chip links with communication ranges of a few millimeters.
Electrical signaling is a natural choice for chip-to-chip communication due to efficient integration and low cost. IO data rates have increased to the point where electrical signaling is now limited by the channel bandwidth. In order to achieve multi-Gb/s data rates, complex designs that equalize the channel are necessary. In addition, a high level of parallelism is central to sustaining bandwidth growth. Decision feedback equalization (DFE) is one of the most commonly employed techniques to overcome the limited bandwidth of electrical channels. A linear, low-power summer is the central block of a DFE. Conventional approaches implement the summer with current-mode techniques, which consume considerable power. To achieve low-power operation, we propose performing the summation in the charge domain instead. This approach enables a low-power and compact realization of the DFE as well as crosstalk cancellation. A prototype receiver was fabricated in 45nm SOI CMOS to validate the proposed technique and was tested over channels with different levels of loss and coupling. Measurement results show that the receiver can equalize channels with up to 21dB of loss while consuming about 7.5mW from a 1.2V supply. We also introduce a compact, low-power transmitter employing passive equalization. The efficacy of the proposed technique is demonstrated through a prototype in 65nm CMOS. The design achieves up to 20Gb/s data rate while consuming less than 10mW.
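The abstract gives no circuit detail, but the DFE operation it builds on is standard. The following behavioral sketch (Python, with hypothetical names and tap values) shows the core loop: a weighted sum of past decisions is subtracted from each incoming sample before slicing, which is exactly the summation the proposed design moves into the charge domain.

    def dfe_receive(rx_samples, taps):
        # Behavioral decision-feedback equalizer: reconstruct the estimated
        # post-cursor ISI from past decisions, subtract it from each
        # symbol-rate sample, then slice to a binary decision.
        decisions = []
        history = [0.0] * len(taps)  # most recent decision first
        for x in rx_samples:
            y = x - sum(w * d for w, d in zip(taps, history))
            d = 1.0 if y >= 0 else -1.0  # binary slicer
            decisions.append(d)
            history = [d] + history[:-1]
        return decisions

In the proposed receiver this weighted subtraction is carried out on sampled charge rather than in a current-mode summer, which is where the power saving comes from.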
An alternative to electrical signaling is to employ optical signaling for chip-to-chip interconnections, which offers low channel loss and crosstalk while providing high communication bandwidth. In this work we demonstrate the possibility of building compact and low-power optical receivers. A novel RC front-end is proposed that combines dynamic offset modulation and double-sampling techniques to eliminate the need for a short time constant at the input of the receiver. Unlike conventional designs, this receiver does not require a high-gain stage that runs at the data rate, making it suitable for low-power implementations. In addition, it allows time-division multiplexing to support very high data rates. A prototype was implemented in 65nm CMOS and achieved up to 24Gb/s with less than 0.4pJ/b power efficiency per channel. As the proposed design mainly employs digital blocks, it benefits greatly from technology scaling in terms of power and area savings.
As technology scales, the number of transistors on a chip grows, which necessitates a corresponding increase in the bandwidth of the on-chip wires. In this dissertation, we take a close look at wire scaling and investigate its effect on wire performance metrics. We explore a novel on-chip communication link based on a double-sampling architecture and a dynamic offset modulation technique that enables low power consumption and high data rates while achieving high bandwidth density in 28nm CMOS technology. The functionality of the link is demonstrated using minimum-pitch on-chip wires of different lengths. Measurement results show that the link achieves up to 20Gb/s of data rate (12.5Gb/s/μm) with better than 136fJ/b of power efficiency.
Abstract:
This thesis describes a compositional framework for developing situation awareness applications: applications that provide ongoing information about a user's changing environment. The thesis describes how the framework is used to develop a situation awareness application for earthquakes. The applications are implemented as Cloud computing services connected to sensors and actuators. The architecture and design of the Cloud services are described and measurements of performance metrics are provided. The thesis includes results of experiments on earthquake monitoring conducted over a year. The applications developed with the framework are (1) the Community Seismic Network (CSN), which uses relatively low-cost sensors deployed by members of the community, and (2) the Situation Awareness Framework (SAF), which integrates data from multiple sources, including the CSN; the California Integrated Seismic Network (CISN), a network of high-quality seismometers deployed carefully by professionals in the CISN organization and spread across Southern California; and prototypes of multi-sensor platforms that include carbon monoxide, methane, dust and radiation sensors.
Abstract:
The classical Rayleigh quotient iteration (RQI) allows one to compute a one-dimensional invariant subspace of a symmetric matrix A. Here we propose a generalization of the RQI which computes a p-dimensional invariant subspace of A. Cubic convergence is preserved and the cost per iteration is low compared to other methods proposed in the literature.
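For reference, here is a minimal NumPy sketch of the classical (p = 1) iteration that the paper generalizes; the generalization replaces the vector below with an orthonormal basis of a p-dimensional subspace. This illustrates the classical method only, not the paper's block algorithm.

    import numpy as np

    def rayleigh_quotient_iteration(A, v0, tol=1e-12, max_iter=50):
        # Classical RQI for symmetric A: shift by the Rayleigh quotient,
        # solve the shifted system, normalize; locally cubically convergent.
        v = v0 / np.linalg.norm(v0)
        rho = v @ A @ v
        for _ in range(max_iter):
            try:
                w = np.linalg.solve(A - rho * np.eye(A.shape[0]), v)
            except np.linalg.LinAlgError:
                break  # shift hit an eigenvalue exactly: converged
            v = w / np.linalg.norm(w)
            rho = v @ A @ v  # Rayleigh quotient of the new iterate
            if np.linalg.norm(A @ v - rho * v) < tol:
                break
        return rho, v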
Abstract:
We report on practical experience using the Oxford BSP Library to parallelize a large electromagnetic code, the British Aerospace finite-difference time-domain code EMMA T:FD3D. The Oxford BSP Library is one of the first realizations of the Bulk Synchronous Parallel computational model to be targeted at numerically intensive scientific (typically Fortran) computing. The BAe EMMA code is one of the first large-scale applications to be parallelized using this library, and it is an important demonstration of the cost-effectiveness of the BSP approach. We illustrate how BSP cost-modelling techniques can be used to predict and optimize performance for single-source programs across different parallel platforms. We provide predicted and observed performance figures for an industrial-strength, single-source parallel code on a variety of real parallel architectures: shared-memory multiprocessors, workstation clusters and massively parallel platforms.
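The cost-modelling techniques referred to here rest on the standard BSP cost formula: a superstep in which each processor performs at most w local operations and sends or receives at most h words costs

    T_{\text{superstep}} = w + h\,g + l, \qquad
    T_{\text{program}} = \sum_{i=1}^{S} \left( w_i + h_i\,g + l \right)

where g (per-word communication cost) and l (barrier synchronization cost) are machine parameters measured once per platform. Because the w_i and h_i depend only on the program, plugging a platform's (g, l) pair into the sum predicts its performance, which is what allows a single-source code to be optimized across the architectures listed above.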
Abstract:
This paper examines the relative efficiency of UK credit unions. Radial and non-radial measures of input cost efficiency, plus associated scale efficiency measures, are computed for a selection of input-output specifications. Both measures highlighted that UK credit unions have considerable scope for efficiency gains. It was mooted that the documented high levels of inefficiency may be indicative of the fact that credit unions, based on clearly defined and non-overlapping common bonds, are not in competition with each other for market share. Credit unions were also found to suffer from a considerable degree of scale inefficiency, with the majority of scale-inefficient credit unions subject to decreasing returns to scale. The latter aspect highlights that the UK Government's goal of larger credit unions must be accompanied by greater regulatory freedom if inefficiency is to be avoided. One advantage of computing non-radial measures is that an insight into potential over- or under-expenditure on specific inputs can be obtained by comparing the non-radial measure of efficiency with the associated radial measure. Two interesting findings emerged: first, that UK credit unions over-spend on dividend payments; and second, that they under-spend on labour costs.
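In the standard data envelopment analysis formulation (a generic statement; the paper's exact model specification may differ), with L(y_0) denoting the set of input vectors capable of producing output y_0, the radial (Farrell) measure contracts all m inputs of a unit by a common factor, while the non-radial (Russell) measure contracts each input individually:

    \theta^{*} = \min \{\, \theta : \theta x_{0} \in L(y_{0}) \,\}, \qquad
    R^{*} = \min \Big\{ \tfrac{1}{m} \sum_{i=1}^{m} \theta_{i} :
            (\theta_{1} x_{10}, \ldots, \theta_{m} x_{m0}) \in L(y_{0}) \Big\}

An input whose individual contraction factor \theta_i falls well below the common factor \theta^{*} is one on which the unit over-spends; that comparison is what underlies the dividend and labour-cost findings above.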
Abstract:
A methodology to estimate the cost implications of design decisions by integrating cost as a design parameter at an early design stage is presented. The model is developed on a hierarchical basis, the manufacturing cost of aircraft fuselage panels being analysed in this paper. The manufacturing cost modelling is original and relies on a genetic-causal method where the drivers of each element of cost are identified relative to the process capability. The cost model is then extended to life cycle costing by computing the Direct Operating Cost as a function of acquisition cost and fuel burn, and coupled with a semi-empirical numerical analysis using Engineering Sciences Data Unit reference data to model the structural integrity of the fuselage shell with regard to material failure and various modes of buckling. The main finding of the paper is that the traditional minimum weight condition is a dated and sub-optimal approach to airframe structural design.
Abstract:
The process of making replicas of heritage objects has traditionally been carried out by public agencies, corporations and museums, and is not common in schools. Technologies now exist that make it possible to create cheap replicas: new photograph-based 3D reconstruction software and low-cost 3D printers allow replicas to be made at a much lower cost than traditional methods. This article describes the process of creating replicas of the sculpture Goslar Warrior by the artist Henry Moore, located in Santa Cruz de Tenerife. First, a digital model was created using the Autodesk ReCap 360, Autodesk 123D Catch and Autodesk Meshmixer applications, together with MakerBot MakerWare. The physical replica was then produced in polylactic acid (PLA) on a MakerBot Replicator 2 3D printer. In addition, a cost analysis is included, using, on the one hand, the printer mentioned and, on the other, both online and local 3D printing services. Finally, a specific activity was carried out with 141 students and 12 high-school teachers, who completed a questionnaire about the use of sculptural replicas in education.
Abstract:
Continuing achievements in hardware technology are bringing ubiquitous computing closer to reality. The notion of a connected, interactive and autonomous environment is common to all sensor networks, biosystems and radio frequency identification (RFID) devices, and the emergence of significant deployments and sophisticated applications can be expected. However, as more information is collected and transmitted, security issues will become vital for such a fully connected environment. In this study the authors consider adding security features to low-cost devices such as RFID tags. In particular, the authors consider the implementation of a digital signature architecture that can be used for device authentication, to prevent tag cloning, and for data authentication, to prevent transmission forgery. The scheme is built around the signature variant of the cryptoGPS identification scheme and the SHA-1 hash function. When implemented in 130 nm CMOS, the full design uses 7494 gates and consumes 4.72 μW of power, making it smaller and more power-efficient than previous low-cost digital signature designs. The study also presents a low-cost SHA-1 hardware architecture, which is the smallest standardised hash function design to date.
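For context, the GPS scheme in its generic form (parameter sizes and the exact coupon handling vary between variants) keeps the on-tag workload to a single integer multiply-and-add. With secret key s and public key V = g^{-s}, a signature on message m is computed as

    x = g^{r} \quad \text{(commitment, precomputable off-tag as a coupon)}
    c = H(x, m) \quad \text{(challenge; SHA-1 in this design)}
    y = r + s \cdot c \quad \text{(response; plain integer arithmetic, no modular reduction)}

and verified by checking g^{y} V^{c} = x. Since x can be precomputed and c comes from the hash unit, the tag itself only evaluates y = r + s·c, which is what makes the scheme attractive for RFID-class devices.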
Abstract:
A novel cost-effective and low-latency wormhole router for packet-switched NoC designs, tailored for FPGA, is presented. It has been designed to be scalable at the system level to fully exploit the characteristics and constraints of FPGA-based systems, rather than custom ASIC technology. A key feature is that it achieves a low packet propagation latency of only two cycles per hop, including both router pipeline delay and link traversal delay - a significant enhancement over existing FPGA designs - whilst being very competitive in terms of performance and hardware complexity. It can also be configured in various network topologies, including 1-D, 2-D and 3-D. A detailed design-space exploration has been carried out for a range of scaling parameters, with the results of various design trade-offs presented and discussed. By taking advantage of abundant built-in reconfigurable logic and routing resources, we have been able to create a new scalable on-chip FPGA-based router that exhibits high dimensionality and connectivity. The architecture proposed can be easily migrated across many FPGA families to provide flexible, robust and cost-effective NoC solutions suitable for the implementation of high-performance FPGA computing systems.
Abstract:
Quantum-dot cellular automata (QCA) is potentially a very attractive alternative to CMOS for future digital designs. Circuit designs in QCA have been extensively studied. However, how to properly evaluate QCA circuits has not been carefully considered. To date, metrics and area-delay cost functions mapped directly from CMOS technology have been used to compare QCA designs, which is inappropriate given the differences between the two technologies. In this paper, several cost metrics specifically aimed at QCA circuits are studied. It is found that delay, the number of QCA logic gates, and the number and type of crossovers are important metrics that should be considered when comparing QCA designs. A family of new cost functions for QCA circuits is proposed. As fundamental components of QCA arithmetic, QCA adders are reviewed and evaluated with the proposed cost functions. When the new cost metrics are taken into account, adders previously regarded as best become less attractive, and it is shown that different optimization goals lead to different “best” adders.
Abstract:
Bag of Distributed Tasks (BoDT) applications can benefit from decentralised execution on the Cloud. However, there is a trade-off between the performance that can be achieved by employing a large number of Cloud VMs for the tasks and the monetary constraints that are often placed by a user. The research reported in this paper investigates this trade-off so that an optimal plan for deploying BoDT applications on the cloud can be generated. A heuristic algorithm, which considers the user's preferences for performance and cost, is proposed and implemented. The feasibility of the algorithm is demonstrated by generating execution plans for a sample application. The key result is that the algorithm generates optimal execution plans for the application over 91% of the time.
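The abstract does not specify the heuristic itself; the sketch below (Python, with entirely hypothetical names, cost model and parameters) only illustrates the shape of the trade-off being optimized: adding VMs shortens execution but raises the bill, and a user weight alpha steers the plan between the two.

    import math

    def choose_plan(total_work, vm_counts, alpha, startup=0.05, price=1.0):
        # Hypothetical single-objective heuristic: score each candidate
        # VM count by a weighted sum of estimated runtime and cost.
        # alpha = 1 favours performance, alpha = 0 favours low cost.
        best_n, best_score = None, float("inf")
        for n in vm_counts:
            time = total_work / n + startup * n   # speedup vs. coordination overhead
            cost = price * n * math.ceil(time)    # per-VM, per-hour billing
            score = alpha * time + (1 - alpha) * cost
            if score < best_score:
                best_n, best_score = n, score
        return best_n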
Abstract:
Doctoral thesis, Electronic Engineering and Computing - Signal Processing, Universidade do Algarve, 2008
Abstract:
In this paper we present the concept of an agent-based strategy for allocating services on a Cloud system without overloading nodes, while maintaining system stability at minimum cost. As a basis for our research, we specify an abstract model of cloud resource utilization that includes multiple types of resources as well as the costs of service migration. We also present an early version of the simulation environment and a prototype of an agent-based load balancer implemented in the functional language Scala with the Akka framework.
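The authors' prototype is written in Scala with Akka actors; the fragment below is a simplified Python sketch (the names, utilization score and migration penalty are illustrative assumptions, not the paper's algorithm) of an allocation rule that respects the two constraints stated above: never overload a node, and charge a price for moving a running service.

    def allocate(service_demand, nodes, current_host=None, migration_cost=0.1):
        # nodes maps node_id -> (used_capacity, total_capacity).
        # Pick the feasible node with the lowest resulting utilization,
        # penalizing any move away from the service's current host.
        best_node, best_score = None, float("inf")
        for node_id, (used, total) in nodes.items():
            if used + service_demand > total:      # would overload this node
                continue
            score = (used + service_demand) / total
            if current_host is not None and node_id != current_host:
                score += migration_cost            # migration has a price
            if score < best_score:
                best_node, best_score = node_id, score
        return best_node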