978 resultados para Compute Unified Device Architecture (CUDA)
Resumo:
Modern wireline and wireless communication devices are multimode and multifunctional communication devices. In order to support multiple standards on a single platform, it is necessary to develop a reconfigurable architecture that can provide the required flexibility and performance. The Channel decoder is one of the most compute intensive and essential elements of any communication system. Most of the standards require a reconfigurable Channel decoder that is capable of performing Viterbi decoding and Turbo decoding. Furthermore, the Channel decoder needs to support different configurations of Viterbi and Turbo decoders. In this paper, we propose a reconfigurable Channel decoder that can be reconfigured for standards such as WCDMA, CDMA2000, IEEE802.11, DAB, DVB and GSM. Different parameters like code rate, constraint length, polynomials and truncation length can be configured to map any of the above mentioned standards. A multiprocessor approach has been followed to provide higher throughput and scalable power consumption in various configurations of the reconfigurable Viterbi decoder and Turbo decoder. We have proposed A Hybrid register exchange approach for multiprocessor architecture to minimize power consumption.
Resumo:
We propose a unified model for large signal and small signal non-quasi-static analysis of long channel symmetric double gate MOSFET. The model is physics based and relies only on the very basic approximation needed for a charge-based model. It is based on the EKV formalism Enz C, Vittoz EA. Charge based MOS transistor modeling. Wiley; 2006] and is valid in all regions of operation and thus suitable for RF circuit design. Proposed model is verified with professional numerical device simulator and excellent agreement is found. (C) 2010 Elsevier Ltd. All rights reserved.
Resumo:
FACTS controllers are emerging as viable and economic solutions to the problems of large interconnected ne networks, which can endanger the system security. These devices are characterized by their fast response, absence of inertia, and minimum maintenance requirements. Thyristor controlled equipment like Thyristor Controlled Series Capacitor (TCSC), Static Var Compensator (SVC), Thyristor Controlled Phase angle Regulator (TCPR) etc. which involve passive elements result in devices of large sizes with substantial cost and significant labour for installation. An all solid-state device using GTOs leads to reduction in equipment size and has improved performance. The Unified Power Flow Controller (UPFC) is a versatile controller which can be used to control the active and reactive power in the Line independently. The concept of UPFC makes it possible to handle practically all power flow control and transmission line compensation problems, using solid-state controllers, which provide functional flexibility, generally not attainable by conventional thyristor controlled systems. In this paper, we present the development of a control scheme for the series injected voltage of the UPFC to damp the power oscillations and improve transient stability in a power system. (C) 1998 Elsevier Science Ltd. All rights reserved.
Resumo:
Soft error has become one of the major areas of attention with the device scaling and large scale integration. Lot of variants for superscalar architecture were proposed with focus on program re-execution, thread re-execution and instruction re-execution. In this paper we proposed a fault tolerant micro-architecture of pipelined RISC. The proposed architecture, Floating Resources Extended pipeline (FREP), re-executes the instructions using extended pipeline stages. The instructions are re-executed by hybrid architecture with a suitable combination of space and time redundancy.
Resumo:
In this paper we explore an implementation of a high-throughput, streaming application on REDEFINE-v2, which is an enhancement of REDEFINE. REDEFINE is a polymorphic ASIC combining the flexibility of a programmable solution with the execution speed of an ASIC. In REDEFINE Compute Elements are arranged in an 8x8 grid connected via a Network on Chip (NoC) called RECONNECT, to realize the various macrofunctional blocks of an equivalent ASIC. For a 1024-FFT we carry out an application-architecture design space exploration by examining the various characterizations of Compute Elements in terms of the size of the instruction store. We further study the impact by using application specific, vectorized FUs. By setting up different partitions of the FFT algorithm for persistent execution on REDEFINE-v2, we derive the benefits of setting up pipelined execution for higher performance. The impact of the REDEFINE-v2 micro-architecture for any arbitrary N-point FFT (N > 4096) FFT is also analyzed. We report the various algorithm-architecture tradeoffs in terms of area and execution speed with that of an ASIC implementation. In addition we compare the performance gain with respect to a GPP.
Resumo:
Precision, sophistication and economic factors in many areas of scientific research that demand very high magnitude of compute power is the order of the day. Thus advance research in the area of high performance computing is getting inevitable. The basic principle of sharing and collaborative work by geographically separated computers is known by several names such as metacomputing, scalable computing, cluster computing, internet computing and this has today metamorphosed into a new term known as grid computing. This paper gives an overview of grid computing and compares various grid architectures. We show the role that patterns can play in architecting complex systems, and provide a very pragmatic reference to a set of well-engineered patterns that the practicing developer can apply to crafting his or her own specific applications. We are not aware of pattern-oriented approach being applied to develop and deploy a grid. There are many grid frameworks that are built or are in the process of being functional. All these grids differ in some functionality or the other, though the basic principle over which the grids are built is the same. Despite this there are no standard requirements listed for building a grid. The grid being a very complex system, it is mandatory to have a standard Software Architecture Specification (SAS). We attempt to develop the same for use by any grid user or developer. Specifically, we analyze the grid using an object oriented approach and presenting the architecture using UML. This paper will propose the usage of patterns at all levels (analysis. design and architectural) of the grid development.
Resumo:
Over the last century, the silicon revolution has enabled us to build faster, smaller and more sophisticated computers. Today, these computers control phones, cars, satellites, assembly lines, and other electromechanical devices. Just as electrical wiring controls electromechanical devices, living organisms employ "chemical wiring" to make decisions about their environment and control physical processes. Currently, the big difference between these two substrates is that while we have the abstractions, design principles, verification and fabrication techniques in place for programming with silicon, we have no comparable understanding or expertise for programming chemistry.
In this thesis we take a small step towards the goal of learning how to systematically engineer prescribed non-equilibrium dynamical behaviors in chemical systems. We use the formalism of chemical reaction networks (CRNs), combined with mass-action kinetics, as our programming language for specifying dynamical behaviors. Leveraging the tools of nucleic acid nanotechnology (introduced in Chapter 1), we employ synthetic DNA molecules as our molecular architecture and toehold-mediated DNA strand displacement as our reaction primitive.
Abstraction, modular design and systematic fabrication can work only with well-understood and quantitatively characterized tools. Therefore, we embark on a detailed study of the "device physics" of DNA strand displacement (Chapter 2). We present a unified view of strand displacement biophysics and kinetics by studying the process at multiple levels of detail, using an intuitive model of a random walk on a 1-dimensional energy landscape, a secondary structure kinetics model with single base-pair steps, and a coarse-grained molecular model that incorporates three-dimensional geometric and steric effects. Further, we experimentally investigate the thermodynamics of three-way branch migration. Our findings are consistent with previously measured or inferred rates for hybridization, fraying, and branch migration, and provide a biophysical explanation of strand displacement kinetics. Our work paves the way for accurate modeling of strand displacement cascades, which would facilitate the simulation and construction of more complex molecular systems.
In Chapters 3 and 4, we identify and overcome the crucial experimental challenges involved in using our general DNA-based technology for engineering dynamical behaviors in the test tube. In this process, we identify important design rules that inform our choice of molecular motifs and our algorithms for designing and verifying DNA sequences for our molecular implementation. We also develop flexible molecular strategies for "tuning" our reaction rates and stoichiometries in order to compensate for unavoidable non-idealities in the molecular implementation, such as imperfectly synthesized molecules and spurious "leak" pathways that compete with desired pathways.
We successfully implement three distinct autocatalytic reactions, which we then combine into a de novo chemical oscillator. Unlike biological networks, which use sophisticated evolved molecules (like proteins) to realize such behavior, our test tube realization is the first to demonstrate that Watson-Crick base pairing interactions alone suffice for oscillatory dynamics. Since our design pipeline is general and applicable to any CRN, our experimental demonstration of a de novo chemical oscillator could enable the systematic construction of CRNs with other dynamic behaviors.
Resumo:
We present, for the first time to our knowledge, a generalized lookahead logic algorithm for number conversion from signed-digit to complement representation. By properly encoding the signed-digits, all the operations are performed by binary logic, and unified logical expressions can be obtained for conversion from modified-signed-digit (MSD) to 2's complement, trinary signed-digit (TSD) to 3's complement, and quarternary signed-digit (QSD) to 4's complement. For optical implementation, a parallel logical array module using an electron-trapping device is employed and experimental results are shown. This optical module is suitable for implementing complex logic functions in the form of the sum of the product. The algorithm and architecture are compatible with a general-purpose optoelectronic computing system. (C) 2001 Society of Photo-Optical Instrumentation Engineers.
Resumo:
Surface-architecture-controlled ZnO nanowires were grown using a vapor transport method on various ZnO buffer film coated c-plane sapphire substrates with or without Au catalysts. The ZnO nanowires that were grown showed two different types of geometric properties: corrugated ZnO nanowires having a relatively smaller diameter and a strong deep-level emission photoluminescence (PL) peak and smooth ZnO nanowires having a relatively larger diameter and a weak deep-level emission PL peak. The surface morphology and size-dependent tunable electronic transport properties of the ZnO nanowires were characterized using a nanowire field effect transistor (FET) device structure. The FETs made from smooth ZnO nanowires with a larger diameter exhibited negative threshold voltages, indicating n-channel depletion-mode behavior, whereas those made from corrugated ZnO nanowires with a smaller diameter had positive threshold voltages, indicating n-channel enhancement-mode behavior.
Resumo:
This paper describes a unified approach to modelling the polysilicon thin film transistor (TFT) for the purposes of circuit design. The approach uses accurate methods of predicting the channel conductance and then fitting the resulting data with a polynomial. Two methods are proposed to find the channel conductance: a device model and measurement. The approach is suitable because the TFT does not have a well defined threshold voltage. The polynomial conductance is then integrated generally to find the drain current and channel charge, necessary for a complete circuit model. © 1991 The Japan Society of Applied Physics.
Resumo:
The aim of this research is to provide a unified modelling-based method to help with the evaluation of organization design and change decisions. Relevant literature regarding model-driven organization design and change is described. This helps identify the requirements for a new modelling methodology. Such a methodology is developed and described. The three phases of the developed method include the following. First, the use of CIMOSA-based multi-perspective enterprise modelling to understand and capture the most enduring characteristics of process-oriented organizations and externalize various types of requirement knowledge about any target organization. Second, the use of causal loop diagrams to identify dynamic causal impacts and effects related to the issues and constraints on the organization under study. Third, the use of simulation modelling to quantify the effects of each issue in terms of organizational performance. The design and case study application of a unified modelling method based on CIMOSA (computer integrated manufacturing open systems architecture) enterprise modelling, causal loop diagrams, and simulation modelling, is explored to illustrate its potential to support systematic organization design and change. Further application of the proposed methodology in various company and industry sectors, especially in manufacturing sectors, would be helpful to illustrate complementary uses and relative benefits and drawbacks of the methodology in different types of organization. The proposed unified modelling-based method provides a systematic way of enabling key aspects of organization design and change. The case company, its relevant data, and developed models help to explore and validate the proposed method. The application of CIMOSA-based unified modelling method and integrated application of these three modelling techniques within a single solution space constitutes an advance on previous best practice. Also, the purpose and application domain of the proposed method offers an addition to knowledge. © IMechE 2009.
Resumo:
We present a simple and semi-physical analytical description of the current-voltage characteristics of amorphous oxide semiconductor thin-film transistors in the above-threshold and sub-threshold regions. Both regions are described by single unified expression that employs the same set of model parameter values directly extracted from measured terminal characteristics. The model accurately reproduces measured characteristics of amorphous semiconductor thin film transistors in general, yielding a scatter of < 4%. © 1980-2012 IEEE.
Resumo:
We demonstrate the production of copper phthalocyanine (CuPc) based p-type hybrid permeable-base transistors, which operate at low voltages having high common-base current gains. These transistors are prepared by evaporating a thin metal layer (Ag or Al) that acts as base on top of a Si substrate that acts as collector. In the sequence CuPc and Au are thermally sublimated to produce the emitter, constituting a quite simple device production procedure with the additional advantage of allowing higher integration due to its vertical architecture.
Resumo:
Current low-level networking abstractions on modern operating systems are commonly implemented in the kernel to provide sufficient performance for general purpose applications. However, it is desirable for high performance applications to have more control over the networking subsystem to support optimizations for their specific needs. One approach is to allow networking services to be implemented at user-level. Unfortunately, this typically incurs costs due to scheduling overheads and unnecessary data copying via the kernel. In this paper, we describe a method to implement efficient application-specific network service extensions at user-level, that removes the cost of scheduling and provides protected access to lower-level system abstractions. We present a networking implementation that, with minor modifications to the Linux kernel, passes data between "sandboxed" extensions and the Ethernet device without copying or processing in the kernel. Using this mechanism, we put a customizable networking stack into a user-level sandbox and show how it can be used to efficiently process and forward data via proxies, or intermediate hosts, in the communication path of high performance data streams. Unlike other user-level networking implementations, our method makes no special hardware requirements to avoid unnecessary data copies. Results show that we achieve a substantial increase in throughput over comparable user-space methods using our networking stack implementation.
Resumo:
A key goal of computational neuroscience is to link brain mechanisms to behavioral functions. The present article describes recent progress towards explaining how laminar neocortical circuits give rise to biological intelligence. These circuits embody two new and revolutionary computational paradigms: Complementary Computing and Laminar Computing. Circuit properties include a novel synthesis of feedforward and feedback processing, of digital and analog processing, and of pre-attentive and attentive processing. This synthesis clarifies the appeal of Bayesian approaches but has a far greater predictive range that naturally extends to self-organizing processes. Examples from vision and cognition are summarized. A LAMINART architecture unifies properties of visual development, learning, perceptual grouping, attention, and 3D vision. A key modeling theme is that the mechanisms which enable development and learning to occur in a stable way imply properties of adult behavior. It is noted how higher-order attentional constraints can influence multiple cortical regions, and how spatial and object attention work together to learn view-invariant object categories. In particular, a form-fitting spatial attentional shroud can allow an emerging view-invariant object category to remain active while multiple view categories are associated with it during sequences of saccadic eye movements. Finally, the chapter summarizes recent work on the LIST PARSE model of cognitive information processing by the laminar circuits of prefrontal cortex. LIST PARSE models the short-term storage of event sequences in working memory, their unitization through learning into sequence, or list, chunks, and their read-out in planned sequential performance that is under volitional control. LIST PARSE provides a laminar embodiment of Item and Order working memories, also called Competitive Queuing models, that have been supported by both psychophysical and neurobiological data. These examples show how variations of a common laminar cortical design can embody properties of visual and cognitive intelligence that seem, at least on the surface, to be mechanistically unrelated.