694 results for CMOS process
Abstract:
Enterprise Ireland (Project CFTD07325). European Commission (EU Framework 7 project Nanofunction: Beyond CMOS Nanodevices for Adding Functionalities to CMOS, www.Nanofunction.eu, EU ICT Network of Excellence, Grant No. 257375).
Abstract:
In the last decade, we have witnessed the emergence of large, warehouse-scale data centres which have enabled new internet-based software applications such as cloud computing, search engines, social media and e-government. Such data centres consist of large collections of servers interconnected using short-reach (up to a few hundred metres) optical interconnects. Today, transceivers for these applications achieve up to 100Gb/s by multiplexing 10x 10Gb/s or 4x 25Gb/s channels. In the near future, however, data centre operators have expressed a need for optical links which can support 400Gb/s up to 1Tb/s. The crucial challenge is to achieve this in the same footprint (same transceiver module) and with similar power consumption as today’s technology. Straightforward scaling of the currently used space or wavelength division multiplexing may be difficult to achieve: indeed, a 1Tb/s transceiver would require integration of 40 VCSELs (vertical cavity surface emitting laser diodes, widely used for short-reach optical interconnect), 40 photodiodes and the electronics operating at 25Gb/s in the same module as today’s 100Gb/s transceiver. Pushing the bit rate on such links beyond today’s commercially available 100Gb/s/fibre will require new generations of VCSELs and their driver and receiver electronics. This work looks into a number of state-of-the-art technologies, investigates their performance constraints and recommends a different set of designs, specifically targeting multilevel modulation formats. Several methods to extend the bandwidth using deep submicron (65nm and 28nm) CMOS technology are explored in this work, while also maintaining a focus on reducing power consumption and chip area. The techniques used were pre-emphasis on the rising and falling edges of the signal and bandwidth extension by inductive peaking and different local feedback techniques.
These techniques have been applied to a transmitter and receiver developed for advanced modulation formats such as PAM-4 (4-level pulse amplitude modulation). Such a modulation format can increase the throughput per individual channel, which helps to overcome the challenges mentioned above in realizing 400Gb/s to 1Tb/s transceivers.
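The channel-count argument in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the assumption that a PAM-4 lane runs at the same 25 GBd symbol rate as a 25Gb/s binary lane is made for the example and is not a figure from the abstract.

```python
import math

def lanes_needed(target_gbps, lane_gbps):
    """Number of parallel lanes needed to reach an aggregate bit rate."""
    return math.ceil(target_gbps / lane_gbps)

def pam4_encode(bits):
    """Map pairs of bits to four amplitude levels using Gray coding."""
    gray_levels = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}
    if len(bits) % 2:
        raise ValueError("PAM-4 needs an even number of bits")
    return [gray_levels[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

# Binary signalling: one bit per symbol, 25 Gb/s per lane.
nrz_lanes = lanes_needed(1000, 25)
# PAM-4: two bits per symbol at an assumed 25 GBd, so 50 Gb/s per lane.
pam4_lanes = lanes_needed(1000, 2 * 25)
# Eight bits become four PAM-4 symbols.
symbols = pam4_encode([0, 0, 0, 1, 1, 1, 1, 0])
```

Under these assumptions, a 1Tb/s link needs 40 binary lanes (matching the 40 VCSELs quoted above) but only 20 PAM-4 lanes, which is the sense in which a multilevel format raises per-channel throughput.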
Abstract:
In order to widely use Ge and III-V materials instead of Si in advanced CMOS technology, the processing and integration of these materials have to be well established so that their high-mobility benefit is not swamped by imperfect manufacturing procedures. In this dissertation, a number of key bottlenecks in the realization of Ge devices are investigated. We address the challenge of forming low-resistivity contacts on n-type Ge, comparing conventional rapid thermal annealing (RTA) and advanced laser thermal annealing (LTA) techniques. LTA appears to be a feasible approach for the realization of low-resistivity contacts, with a very sharp germanide-substrate interface and a contact resistivity on the order of 10⁻⁷ Ω·cm². Furthermore, the influence of RTA and LTA on dopant activation and leakage current suppression in n+/p Ge junctions was compared. While providing a very high active carrier concentration (> 10²⁰ cm⁻³), LTA resulted in higher leakage current than RTA, which provided a lower carrier concentration (~10¹⁹ cm⁻³). This indicates a trade-off between high activation level and junction leakage current. A high ION/IOFF ratio of ~10⁷ was obtained, which to the best of our knowledge is the best reported value for n-type Ge so far. Simulations were carried out to investigate how target sputtering, dose retention, and damage formation arise in thin-body semiconductors under energetic ion impacts, and how they depend on the physical properties of the target material. Solid phase epitaxy studies in wide and thin Ge fins confirmed the formation of twin boundary defects and random nucleation growth, as in Si, but here an annealing temperature of 600 °C was found to be effective in reducing these defects. Finally, a non-destructive doping technique was successfully implemented to dope Ge nanowires, where nanowire resistivity was reduced by 5 orders of magnitude using a PH₃-based in-diffusion process.
Abstract:
Localized molecular orbitals (LMOs) are much more compact representations of electronic degrees of freedom than canonical molecular orbitals (CMOs). The most compact representation is provided by nonorthogonal localized molecular orbitals (NOLMOs), which are linearly independent but are not orthogonal. Both LMOs and NOLMOs are thus useful for linear-scaling calculations of electronic structures for large systems. Recently, NOLMOs have been successfully applied to linear-scaling calculations with density functional theory (DFT) and to reformulating time-dependent density functional theory (TDDFT) for calculations of excited states and spectroscopy. However, a challenge remains as NOLMO construction from CMOs is still inefficient for large systems. In this work, we develop an efficient method to accelerate the NOLMO construction by using predefined centroids of the NOLMOs, thereby removing the nonlinear equality constraints in the original method (J. Chem. Phys. 2004, 120, 9458 and J. Chem. Phys. 2000, 112, 4). Thus, NOLMO construction becomes an unconstrained optimization. Its efficiency is demonstrated for selected saturated and conjugated molecules. Our method for fast NOLMO construction should lead to efficient DFT and NOLMO-TDDFT applications to large systems.
Abstract:
Double gate fully depleted silicon-on-insulator (DGSOI) is recognized as a possible solution when the physical gate length LG reduces to 25nm for the 65nm node on the ITRS CMOS roadmap. In this paper, scaling guidelines are introduced to optimally design a nanoscale DGSOI. To this end, the sensitivity of gain, fT and fmax to each of the key geometric and technological parameters of the DGSOI is assessed and quantified using MixedMode simulation. The impact of the parasitic resistance and capacitance on analog device performance is systematically analysed. Comparing analog performance with that of a single gate SOI (SGSOI) device shows that the intrinsic gain in DGSOI is 4 times higher, while its fT is comparable to that of SGSOI in different regions of transistor operation. However, the extracted fmax in SGSOI was higher (by approximately 40%) than in DGSOI due to its lower capacitance.
Abstract:
This paper details the implementation and operational performance of a minimum-power 2.45-GHz pulse receiver and a companion on-off keyed transmitter for use in a semi-active duplex RF biomedical transponder. A 50-Ohm microstrip stub-matched zero-bias diode detector forms the heart of a body-worn receiver that has a CMOS baseband amplifier consuming 20 microamps from +3 V and achieves a tangential sensitivity of -53 dBm. The base transmitter generates 0.5 W of peak RF output power into 50 Ohms. Both linear and right-hand circularly polarized Tx-Rx antenna sets were employed in system reliability trials carried out in a hospital Coronary Care Unit. For transmitting antenna heights between 0.3 and 2.2 m above floor level, transponder interrogations were 95% reliable within the 67 m² area of the ward, falling to an average of 46% in the surrounding rooms and corridors. Overall, the circular antenna set gave the higher reliability and the lower propagation power decay index.
Abstract:
A new reconfigurable subpixel interpolation architecture for multistandard (e.g., MPEG-2, MPEG-4, H.264, and AVS) video motion estimation (ME) is presented. This exploits the mixed use of parallel and serial-input FIR filters to achieve a high throughput rate and efficient silicon utilization. Silicon design studies show that this can be implemented using 34.8 × 10³ gates, with area and performance that compare very favorably with specific fixed solutions, e.g., for the H.264 standard alone. This can support SDTV and HDTV applications when implemented in 0.18 µm CMOS technology, with further performance enhancements achievable at 0.13 µm and below. © 2009 IEEE.
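As a point of reference for the kind of FIR filtering involved, the sketch below applies the 6-tap half-sample luma filter defined in H.264, with taps (1, -5, 20, 20, -5, 1), rounding and a divide by 32, to one row of 8-bit pixels. It is a plain software illustration and not the reconfigurable architecture described in the abstract; the function name and the edge-replication padding are assumptions made for the example.

```python
H264_HALF_PEL_TAPS = (1, -5, 20, 20, -5, 1)

def half_pel_row(pixels):
    """Interpolate the half-sample positions between neighbouring pixels of one row."""
    # Replicate the edge pixels so every half-sample position has six
    # neighbours (assumed padding, chosen only for this example).
    padded = [pixels[0]] * 2 + list(pixels) + [pixels[-1]] * 2
    out = []
    for i in range(len(pixels) - 1):
        window = padded[i:i + 6]  # pixels i-2 .. i+3 of the original row
        acc = sum(t * p for t, p in zip(H264_HALF_PEL_TAPS, window))
        # Round, divide by 32 and clip to the 8-bit range.
        out.append(min(255, max(0, (acc + 16) >> 5)))
    return out

flat = half_pel_row([100, 100, 100, 100])
ramp = half_pel_row([0, 32, 64, 96])
edge = half_pel_row([0, 0, 255, 255])
```

On a flat row the filter returns the input value, on a linear ramp the interior half-sample lands midway between its neighbours, and across a step edge the negative taps overshoot so the clip to 0..255 takes effect.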
Abstract:
A novel most-significant-digit-first CORDIC architecture is presented that is suitable for the VLSI design of systolic array processor cells for performing QR decomposition. This is based on an on-line CORDIC algorithm with a constant scale factor and a latency independent of the wordlength. This has been derived through the extension of previously published CORDIC algorithms. It is shown that simplifying the calculation of convergence bounds also greatly simplifies the derivation of suitable VLSI architectures. Design studies, based on a 0.35-µm CMOS standard cell process, indicate that 20 such QR processor cells operating at rates suitable for radar beamforming can be readily accommodated on a single chip.
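For readers unfamiliar with CORDIC, the sketch below shows the conventional vectoring mode in floating point: a sequence of shift-and-add micro-rotations drives y to zero, yielding the magnitude and angle of the input vector. This is the kind of operation used to compute a Givens rotation in QR decomposition. It is the textbook algorithm only, not the most-significant-digit-first on-line variant the abstract describes, and the function name and iteration count are assumptions.

```python
import math

def cordic_vectoring(x, y, iterations=32):
    """Rotate (x, y) onto the positive x-axis; return (magnitude, angle in radians).

    Assumes x > 0 so that the angle lies within CORDIC's convergence range.
    """
    angle = 0.0
    gain = 1.0
    for i in range(iterations):
        # Choose the rotation direction that drives y toward zero.
        d = -1.0 if y > 0 else 1.0
        step = 2.0 ** -i
        x, y = x - d * y * step, y + d * x * step
        # The input angle is the negative of the total rotation applied.
        angle -= d * math.atan(step)
        # Every micro-rotation stretches the vector by sqrt(1 + 2^-2i).
        gain *= math.sqrt(1.0 + step * step)
    return x / gain, angle

magnitude, theta = cordic_vectoring(3.0, 4.0)
```

Because every iteration applies a rotation of fixed size in one direction or the other, the accumulated gain does not depend on the data, which is why the scale factor can be treated as a constant.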
Abstract:
Continuing achievements in hardware technology are bringing ubiquitous computing closer to reality. The notion of a connected, interactive and autonomous environment is common to all sensor networks, biosystems and radio frequency identification (RFID) devices, and the emergence of significant deployments and sophisticated applications can be expected. However, as more information is collected and transmitted, security issues will become vital for such a fully connected environment. In this study the authors consider adding security features to low-cost devices such as RFID tags. In particular, the authors consider the implementation of a digital signature architecture that can be used for device authentication, to prevent tag cloning, and for data authentication, to prevent transmission forgery. The scheme is built around the signature variant of the cryptoGPS identification scheme and the SHA-1 hash function. When implemented on 130 nm CMOS, the full design uses 7494 gates and consumes 4.72 µW of power, making it smaller and more power efficient than previous low-cost digital signature designs. The study also presents a low-cost SHA-1 hardware architecture which is the smallest standardised hash function design to date.
Abstract:
As a potential alternative to CMOS technology, QCA provides an interesting paradigm in both communication and computation. However, QCA's unique four-phase clocking scheme and timing constraints present serious timing issues for interconnection and feedback. In this work, a cut-set retiming design procedure is proposed to resolve these QCA timing issues. The proposed design procedure can accommodate QCA's unique characteristics by performing delay-transfer and time-scaling to reallocate the existing delays so as to achieve efficient clocking zone assignment. Cut-set retiming makes it possible to effectively design relatively complex QCA circuits that include feedback. It utilizes the similar characteristics of synchronization, deep pipelines and local interconnections common to both QCA and systolic architectures. As a case study, a systolic Montgomery modular multiplier is designed to illustrate the procedure. Furthermore, a nonsystolic architecture, an S27 benchmark circuit, is designed and compared with previous designs. The comparison shows that the cut-set retiming method achieves a more efficient design, with a reduction of 22%, 44%, and 46% in terms of cell count, area, and latency, respectively.
Abstract:
Quantum-dot Cellular Automata (QCA) technology is a promising potential alternative to CMOS technology. To explore the characteristics of QCA and suitable design methodologies, digital circuit design approaches have been investigated. Due to the inherent wire delay in QCA, pipelined architectures appear to be a particularly suitable design technique. Also, because of the pipelined nature of QCA technology, it is not suitable for complicated control system design. Systolic arrays take advantage of pipelining, parallelism and simple local control. Therefore, an investigation into these architectures in QCA technology is provided in this paper. Two case studies (a matrix multiplier and a Galois Field multiplier) are designed and analyzed based on both multilayer and coplanar crossings. The performance of these two types of interconnection is compared, and it is found that even though coplanar crossings are currently more practical, they tend to occupy a larger design area and incur slightly more delay. A general semiconductor QCA systolic array design methodology is also proposed. It is found that by applying a systolic array structure in QCA design, significant benefits can be achieved, particularly with large systolic arrays, even more so than when applied in CMOS-based technology.
Abstract:
A new front-end image processing chip is presented for real-time small object detection. It has been implemented using a 0.6 µm, 3.3 V CMOS technology and operates on 10-bit input data at 54 megasamples per second. It occupies an area of 12.9 mm × 13.6 mm (including pads), dissipates 1.5 W, has 92 I/O pins and is to be housed in a 160-pin ceramic quarter flat-pack. It performs both one- and two-dimensional FIR filtering and a multilayer perceptron (MLP) neural network function using a reconfigurable array of 21 multiplication-accumulation cells, which corresponds to a window size of 7×3. The chip can cope with images of 2047 pixels per line and can be cascaded to cope with larger window sizes. The chip performs two billion fixed-point multiplications and additions per second.
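The quoted throughput follows from the cell count and the sample rate. The sketch below is a rough check, assuming each of the 21 cells completes one multiply and one add per input sample (the counting convention is an assumption); the helper `fir_window` is hypothetical and only illustrates what one windowed FIR output involves.

```python
MAC_CELLS = 7 * 3              # window size quoted in the abstract
SAMPLE_RATE_HZ = 54_000_000    # 54 megasamples per second

macs_per_second = MAC_CELLS * SAMPLE_RATE_HZ
# Assumed counting: one multiplication plus one addition per MAC.
ops_per_second = 2 * macs_per_second

def fir_window(image, coeffs, row, col):
    """One output sample of a 2-D FIR filter over a window given as a list of rows."""
    return sum(coeffs[r][c] * image[row + r][col + c]
               for r in range(len(coeffs))
               for c in range(len(coeffs[0])))

# A 3-row by 7-column window of ones over an image of ones uses 21 MACs.
ones = [[1] * 7 for _ in range(3)]
window_sum = fir_window(ones, ones, 0, 0)
```

This gives about 1.13 billion multiply-accumulates, or roughly 2.27 billion individual operations per second, consistent with the "two billion" figure in the abstract.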
Abstract:
A 64-point Fourier transform chip is described that performs a forward or inverse 64-point Fourier transform on complex two's complement data supplied at a rate of 13.5 MHz and can operate at clock rates of up to 40 MHz under worst-case conditions. It uses a 0.6 µm double-level metal CMOS technology, contains 535k transistors and uses an internal 3.3 V power supply. It has an area of 7.8 × 8 mm, dissipates 0.9 W, has 48 pins and is housed in an 84-pin PLCC plastic package. The chip is based on an FFT architecture developed from first principles through a detailed investigation of the structure of the relevant DFT matrix and through mapping repetitive blocks within this matrix onto a regular silicon structure.
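The idea that repeated structure in the DFT matrix can be exploited is easiest to see in software. The sketch below compares a direct 64-point DFT with a textbook recursive radix-2 FFT, which reuses the even-indexed and odd-indexed sub-transforms. It is a generic illustration, not the architecture of the chip, and the test signal is arbitrary.

```python
import cmath

def dft(x):
    """Direct evaluation of the DFT matrix-vector product, O(n^2)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    # Twiddle factors applied to the odd-indexed half-size transform.
    tw = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return ([even[k] + tw[k] for k in range(n // 2)] +
            [even[k] - tw[k] for k in range(n // 2)])

# Arbitrary 64-point complex test signal.
samples = [complex(k % 5, -(k % 3)) for k in range(64)]
max_error = max(abs(a - b) for a, b in zip(dft(samples), fft(samples)))
```

Both functions compute the same transform up to floating-point rounding; the FFT simply avoids recomputing the blocks that the two halves of the DFT matrix share.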
Abstract:
A high-performance VLSI architecture to perform combined multiply-accumulate, divide, and square root operations is proposed. The circuit is highly regular, requires only minimal control, and can be reconfigured on every cycle. The execution time for each operation is the same. The combination of redundancy and pipelining results in a throughput that is independent of the wordsize of the array. With current CMOS technology, throughput rates in excess of 80 million operations per second are achievable.