942 results for 291605 Processor Architectures
Abstract:
Master's thesis digitized by the Direction des bibliothèques of the Université de Montréal.
Abstract:
Advanced doping technologies are key for the continued scaling of semiconductor devices and the maintenance of device performance beyond the 14 nm technology node. Due to limitations of conventional ion-beam implantation with thin body and 3D device geometries, techniques which allow precise control over dopant diffusion and concentration, in addition to excellent conformality on 3D device surfaces, are required. Spin-on doping has shown promise as a conventional technique for doping new materials, particularly through application with other dopant methods, but may not be suitable for conformal doping of nanostructures. Additionally, residues remain after most spin-on-doping processes which are often difficult to remove. In-situ doping of nanostructures is especially common for bottom-up grown nanostructures but problems associated with concentration gradients and morphology changes are commonly experienced. Monolayer doping (MLD) has been shown to satisfy the requirements for extended defect-free, conformal and controllable doping on many materials ranging from traditional silicon and germanium devices to emerging replacement materials such as III-V compounds but challenges still remain, especially with regard to metrology and surface chemistry at such small feature sizes. This article summarises and critically assesses developments over the last number of years regarding the application of gas and solution phase techniques to dope silicon-, germanium- and III-V-based materials and nanostructures to obtain shallow diffusion depths coupled with high carrier concentrations and abrupt junctions.
Abstract:
Master's thesis digitized by the Direction des bibliothèques of the Université de Montréal.
Abstract:
MEDEIROS, Adelardo A. D. A survey of control architectures for autonomous mobile robots. J. Braz. Comp. Soc., Campinas, v. 4, n. 3, Apr. 1998. Available at:
Abstract:
MEDEIROS, Adelardo A. D. A survey of control architectures for autonomous mobile robots. J. Braz. Comp. Soc., Campinas, v. 4, n. 3, Apr. 1998. Available at:
Abstract:
Scientific applications rely heavily on floating point data types. Floating point operations are complex and require complicated hardware that is both area and power intensive. The emergence of massively parallel architectures like Rigel creates new challenges and poses new questions with respect to floating point support. The massively parallel aspect of Rigel places great emphasis on area efficient, low power designs. At the same time, Rigel is a general purpose accelerator and must provide high performance for a wide class of applications. This thesis presents an analysis of various floating point unit (FPU) components with respect to Rigel, and attempts to present a candidate design of an FPU that balances performance, area, and power and is suitable for massively parallel architectures like Rigel.
Abstract:
The explosion in mobile data traffic is a driver for future network operator technologies, given its large potential to affect both network performance and generated revenue. The concept of distributed mobility management (DMM) has emerged to overcome the efficiency limitations of centralized mobility approaches, proposing not only the distribution of anchoring functions but also dynamic mobility activation sensitive to applications' needs. Nevertheless, there is no acceptable solution for IP multicast in DMM environments, as the first proposals based on MLD Proxy are prone to the tunnel replication problem or to service disruption. We propose the application of PIM-SM in mobility entities as an alternative solution for multicast support in DMM, and introduce an architecture enabling mobile multicast listener support over distributed anchoring frameworks in a network-efficient way. The architecture aims at providing operators with flexible options to provide multicast mobility, supporting three modes: the first introduces basic IP multicast support in DMM; the second improves subscription time through extensions to the mobility protocol, removing the dependence on the MLD protocol; and the third enables fast listener mobility by using mobility tunnels to avoid potentially slow multicast tree convergence in larger infrastructures. The different modes were evaluated by mathematical analysis of disruption time and packet loss during handoff against several parameters, of total and tunneling packet delivery cost, and of packet and signaling overhead.
Abstract:
The philosophy of minimalism in robotics promotes gaining an understanding of sensing and computational requirements for solving a task. This minimalist approach lies in contrast to the common practice of first taking an existing sensory motor system, and only afterwards determining how to apply the robotic system to the task. While it may seem convenient to simply apply existing hardware systems to the task at hand, this design philosophy often proves to be wasteful in terms of energy consumption and cost, along with unnecessary complexity and decreased reliability. While impressive in terms of their versatility, complex robots such as the PR2 (which cost hundreds of thousands of dollars) are impractical for many common applications. Instead, if a specific task is required, sensing and computational requirements can be determined specific to that task, and a clever hardware implementation can be built to accomplish the task. Since this minimalist hardware would be designed around accomplishing the specified task, significant reductions in hardware complexity can be obtained. This can lead to huge advantages in battery life, cost, and reliability. Even if cost is of no concern, battery life is often a limiting factor in many applications. Thus, a minimalist hardware system is critical in achieving the system requirements. In this thesis, we will discuss an implementation of a counting, tracking, and actuation system as it relates to ergodic bodies to illustrate a minimalist design methodology.
Abstract:
Carbon-rich, conjugated organic scaffolding is a popular basis for functional materials, especially for electronic and photonic applications. However, synthetic methods for generating these types of materials lack diversity and, in many cases, efficiency; investigators' insistence on focusing on the properties of the end product, rather than the process by which it was created, has led to the current state of the relatively homogeneous synthetic chemistry of functional organic materials. Because of this, there is plenty of room for improvement at the most basic level. Problems endemic to the preparation of carbon-rich scaffolding can, in many cases, be solved with modern advances in synthetic methodology. We seek to apply this synthesis-focused paradigm to solve problems in the preparation of carbon-rich scaffolds. Herein, the development and utilization of three methodologies (iridium-catalyzed arene C-H borylation, zinc-mediated alkynylations, and Lewis acid promoted Mo nitride-alkyne metathesis) are presented as improvements for the preparation of carbon-rich architectures. In addition, X-ray crystallographic analysis of two classes of compounds is presented. First, an analysis of carbazole-containing arylene ethynylene macrocycles showcases the significance of alkyl chain identity on solid-state morphology. Second, a class of rigid zwitterionic metal-organic compounds displays an unusual propensity to crystallize in the absence of inversion symmetry. Hirshfeld surface analysis of these crystalline materials demonstrates that subtle intermolecular interactions are responsible for the overall packing motifs in this class of compounds.
Abstract:
Cache-coherent non-uniform memory access (ccNUMA) architecture is a standard design pattern for contemporary multicore processors, and future generations of architectures are likely to be NUMA. NUMA architectures create new challenges for managed runtime systems. Memory-intensive applications use the system's distributed memory banks to allocate data, and the automatic memory manager collects garbage left in these memory banks. The garbage collector may need to access remote memory banks, which entails access latency overhead and potential bandwidth saturation for the interconnection between memory banks. This dissertation makes five significant contributions to garbage collection on NUMA systems, with a case study implementation using the HotSpot Java Virtual Machine. It empirically studies data locality for a Stop-The-World garbage collector when tracing connected objects in NUMA heaps. First, it identifies a richness of locality that exists naturally in connected objects consisting of a root object and its reachable set ('rooted sub-graphs'). Second, this dissertation leverages the locality characteristic of rooted sub-graphs to develop a new NUMA-aware garbage collection mechanism. A garbage collector thread processes a local root and its reachable set, which is likely to have a large number of objects in the same NUMA node. Third, a garbage collector thread steals references from sibling threads that run on the same NUMA node to improve data locality. This research evaluates the new NUMA-aware garbage collector using seven benchmarks from the established real-world DaCapo benchmark suite. In addition, the evaluation involves the widely used SPECjbb benchmark and a Neo4J graph database Java benchmark, as well as an artificial benchmark. The results of the NUMA-aware garbage collector on a multi-hop NUMA architecture show an average performance improvement of 15%.
Furthermore, this performance gain is shown to result from improved NUMA memory access in a ccNUMA system. Fourth, the existing HotSpot JVM adaptive policy for configuring the number of garbage collection threads is shown to be suboptimal for current NUMA machines. The policy uses outdated assumptions and generates a constant thread count. In fact, the HotSpot JVM still uses this policy in the production version. This research shows that the optimal number of garbage collection threads is application-specific and that configuring the optimal number of garbage collection threads yields better collection throughput than the default policy. Fifth, this dissertation designs and implements a runtime technique that uses heuristics from dynamic collection behavior to calculate an optimal number of garbage collector threads for each collection cycle. The results show an average improvement of 21% in garbage collection performance for DaCapo benchmarks.
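The node-local stealing idea in the third contribution can be illustrated with a minimal sketch. This is not the dissertation's HotSpot implementation: the `GcThread` class, the `steal_reference` function, and the choice of a double-ended queue are all hypothetical names and assumptions made for illustration only.

```python
from collections import deque


class GcThread:
    """Hypothetical GC worker: an id, the NUMA node it runs on, and a work queue."""

    def __init__(self, thread_id, numa_node):
        self.thread_id = thread_id
        self.numa_node = numa_node
        self.queue = deque()  # pending object references still to be traced


def steal_reference(thief, all_threads):
    """Steal one reference, considering only siblings on the thief's NUMA node.

    Returns None when no same-node sibling has work; threads on other
    nodes are deliberately skipped so stolen work stays node-local.
    """
    for victim in all_threads:
        if victim is thief or victim.numa_node != thief.numa_node:
            continue
        if victim.queue:
            # Assumption: owners pop from the right, thieves take from the left.
            return victim.queue.popleft()
    return None
```

In this sketch an idle thread on node 0 never takes work queued on node 1, which trades some load balance for locality. Whether the real collector ever falls back to remote nodes is not stated in the abstract.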
Abstract:
Recent efforts to develop large-scale neural architectures have paid relatively little attention to the use of self-organizing maps (SOMs). Part of the reason is that most conventional SOMs use a static encoding representation: Each input is typically represented by the fixed activation of a single node in the map layer. This not only carries information in an inefficient and unreliable way that impedes building robust multi-SOM neural architectures, but it is also inconsistent with rhythmic oscillations in biological neural networks. Here I develop and study an alternative encoding scheme that instead uses limit cycle attractors of multi-focal activity patterns to represent input patterns/sequences. Such a fundamental change in representation raises several questions: Can this be done effectively and reliably? If so, will map formation still occur? What properties would limit cycle SOMs exhibit? Could multiple such SOMs interact effectively? Could robust architectures based on such SOMs be built for practical applications? The principal results of examining these questions are as follows. First, conditions are established for limit cycle attractors to emerge in a SOM through self-organization when encoding both static and temporal sequence inputs. It is found that under appropriate conditions a set of learned limit cycles are stable, unique, and preserve input relationships. In spite of the continually changing activity in a limit cycle SOM, map formation continues to occur reliably. Next, associations between limit cycles in different SOMs are learned. It is shown that limit cycles in one SOM can be successfully retrieved by another SOM’s limit cycle activity. Control timings can be set quite arbitrarily during both training and activation. Importantly, the learned associations generalize to new inputs that have never been seen during training. Finally, a complete neural architecture based on multiple limit cycle SOMs is presented for robotic arm control. 
This architecture combines open-loop and closed-loop methods to achieve high accuracy and fast movements through smooth trajectories. The architecture is robust in that disrupting or damaging the system in a variety of ways does not cause it to fail completely. I conclude that limit cycle SOMs have great potential for use in constructing robust neural architectures.
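For contrast, the conventional static encoding that the abstract argues against, where each input is represented by a single winning node, can be sketched as follows. This is a generic one-dimensional SOM update written for illustration, not the limit cycle scheme developed in the dissertation; the function names, the rectangular neighborhood, and the default learning rate are assumptions.

```python
def best_matching_unit(weights, x):
    """Index of the map node whose weight vector is closest to input x."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def som_update(weights, x, lr=0.5, radius=1):
    """One training step on a 1-D map: move the winner and its neighbors toward x.

    The returned index is the static code for x: a single active node.
    """
    bmu = best_matching_unit(weights, x)
    for i, w in enumerate(weights):
        if abs(i - bmu) <= radius:  # rectangular neighborhood (assumption)
            weights[i] = [wi + lr * (xi - wi) for wi, xi in zip(w, x)]
    return bmu
```

Here the whole representation of an input is the single integer returned by `som_update`, which is the fixed, single-node activation the abstract describes.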
Abstract:
Nanostructures are highly attractive for future electrical energy storage devices because they enable large surface area and short ion transport time through thin electrode layers for high power devices. Significant enhancement in the power density of batteries has been achieved by nano-engineered structures, particularly anode and cathode nanostructures spatially separated by a porous membrane and/or a defined electrolyte region. A self-aligned nanostructured battery fully confined within a single nanopore presents a powerful platform to determine the rate performance and cyclability limits of nanostructured storage devices. Atomic layer deposition (ALD) has enabled us to create and evaluate such structures, comprised of nanotubular electrodes and electrolyte confined within anodic aluminum oxide (AAO) nanopores. The V2O5-V2O5 symmetric nanopore battery displays exceptional power-energy performance and cyclability when tested as a massively parallel device (~2 billion/cm2), each cell with ~1 μm3 volume (~1 fL). Cycled between 0.2 V and 1.8 V, this full cell has capacity retention of 95% at 5C rate and 46% at 150C, with more than 1000 charge/discharge cycles. These results demonstrate the promise of ultrasmall, self-aligned/regular, densely packed nanobattery structures as a testbed to study ionics and electrodics at the nanoscale with various geometrical modifications and as a building block for high performance energy storage systems [1, 2]. Further increase of full cell output potential is also demonstrated in asymmetric full cell configurations with various low voltage anode materials. The asymmetric full cell nanopore batteries, comprised of V2O5 as cathode and prelithiated SnO2 or anatase phase TiO2 as anode, with integrated nanotubular metal current collectors underneath each nanotubular storage electrode, are also enabled by ALD. By controlling the amount of lithium prelithiated into the SnO2 anode, we can tune the full cell output voltage between 0.3 V and 3 V.
This asymmetric nanopore battery array displays exceptional rate performance and cyclability. When cycled between 1 V and 3 V, it has capacity retention of approximately 73% at 200C rate compared to 1C, with only 2% capacity loss after more than 500 charge/discharge cycles. With increased full cell output potential, the asymmetric V2O5-SnO2 nanopore battery shows significantly improved energy and power density. This configuration presents a more realistic test (through its asymmetric rather than symmetric configuration) of performance and cyclability in a nanoconfined environment. This dissertation covers (1) ultrasmall electrochemical storage platform design and fabrication, (2) electron and ion transport in nanostructured electrodes inside a half cell configuration, (3) ion transport between anode and cathode in confined nanochannels in symmetric full cells, and (4) scale-up of energy and power density with geometry optimization and low voltage anode materials in asymmetric full cell configurations. As a supplement, selective growth of ALD to improve graphene conductance will also be discussed [3]. References: 1. Liu, C., et al., (Invited) A Rational Design for Batteries at Nanoscale by Atomic Layer Deposition. ECS Transactions, 2015. 69(7): p. 23-30. 2. Liu, C.Y., et al., An all-in-one nanopore battery array. Nature Nanotechnology, 2014. 9(12): p. 1031-1039. 3. Liu, C., et al., Improving Graphene Conductivity through Selective Atomic Layer Deposition. ECS Transactions, 2015. 69(7): p. 133-138.
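To put the quoted C-rates in perspective, the small calculation below converts a C-rate into the nominal time for one full charge or discharge, using the standard convention that a rate of nC corresponds to 1/n hours. The function name is hypothetical, and the result is a nominal figure that ignores the reduced capacity actually delivered at high rates.

```python
def nominal_seconds(c_rate):
    """Nominal time in seconds for one full charge or discharge at the given C-rate."""
    if c_rate <= 0:
        raise ValueError("C-rate must be positive")
    return 3600.0 / c_rate


# Rates quoted in the abstract: 5C, 150C and 200C.
for rate in (5, 150, 200):
    print(f"{rate}C -> {nominal_seconds(rate):.0f} s")
```

Under this convention 150C corresponds to 24 s and 200C to 18 s per nominal charge or discharge.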
Abstract:
In the multi-core CPU world, transactional memory (TM) has emerged as an alternative to lock-based programming for thread synchronization. Recent research proposes the use of TM in GPU architectures, where a high number of computing threads, organized in SIMT fashion, requires an effective synchronization method. In contrast to CPUs, GPUs offer two memory spaces: global memory and local memory. The local memory space serves as a shared scratch-pad for a subset of the computing threads, and programmers use it to speed up their applications thanks to its low latency. Prior work from the authors proposed lightweight hardware TM (HTM) support based on the local memory, modifying the SIMT execution model and adding a conflict detection mechanism. An efficient implementation of these features is key to providing an effective synchronization mechanism at the local memory level. After a brief description of the main features of our HTM design for GPU local memory, this work gathers a number of proposals aimed at improving the mechanisms with the highest impact on performance. Firstly, the SIMT execution model is modified to increase the parallelism of the application when transactions must be serialized in order to make forward progress. Secondly, the conflict detection mechanism is optimized depending on application characteristics, such as the read/write sets, the probability of conflict between transactions and the existence of read-only transactions. As these features can be present in hardware simultaneously, it is the task of the compiler and runtime to determine which ones are more important for a given application. This work includes a discussion of the analysis needed to choose the best configuration.
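The role of read/write sets and read-only transactions in conflict detection can be illustrated with a minimal software sketch. This is the generic textbook definition of a conflict, not the authors' hardware mechanism; the `Transaction` class and `conflicts` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """Hypothetical transaction record: addresses read and addresses written."""
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)

    @property
    def read_only(self):
        return not self.writes


def conflicts(a, b):
    """Two transactions conflict if one writes an address the other reads or writes."""
    if a.read_only and b.read_only:
        return False  # fast path: two read-only transactions can never conflict
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & a.reads)
```

The fast path shows why the presence of read-only transactions matters: when both sides only read, no set intersection needs to be computed at all.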