5 results for Intel 8086 (Microprocessor)
in CentAUR: Central Archive University of Reading - UK
Abstract:
The aim of this book is to provide an introduction to microprocessor systems, their operation and design. It covers those topics needed by engineers and computer scientists who are interested in applying microprocessors in practical situations, namely computer hardware including logic and interfacing, software, in particular high level and assembly language programming, and the design and testing of such systems. The fundamental principles of microprocessor systems are described and these are illustrated with reference to two microprocessors, the 32-bit MC68020 from Motorola and a single chip microcomputer, the 8051 from Intel; and in addition, interfacing to the general purpose STE bus is described. The details of the processors and the bus are concentrated in three chapters, thus allowing the presentation of the material to be independent of the microprocessors if that is desired, and permitting the specific details to be found easily.
Abstract:
The 3rd World Chess Software Championship took place in Yokohama, Japan during August 2013. It pits chess engines against each other on a common hardware platform - in this instance, the Intel i7 2740 Ivy Bridge with 16GB RAM supporting a potential eight processing threads. It was narrowly won by HIARCS from JUNIOR and PANDIX with JONNY, SHREDDER and MERLIN taking the remaining places. Games, occasionally annotated, are available here.
Abstract:
We have optimised the atmospheric radiation algorithm of the FAMOUS climate model on several hardware platforms. The optimisation involved translating the Fortran code to C and restructuring the algorithm around the computation of a single air column. Instead of the existing MPI-based domain decomposition, we used a task queue and a thread pool to schedule the computation of individual columns on the available processors. Finally, four air columns are packed together in a single data structure and computed simultaneously using Single Instruction Multiple Data operations. The modified algorithm runs more than 50 times faster on the CELL’s Synergistic Processing Elements than on its main PowerPC processing element. On Intel-compatible processors, the new radiation code runs 4 times faster. On the tested graphics processor, using OpenCL, we find a speed-up of more than 2.5 times as compared to the original code on the main CPU. Because the radiation code takes more than 60% of the total CPU time, FAMOUS executes more than twice as fast. Our version of the algorithm returns bit-wise identical results, which demonstrates the robustness of our approach. We estimate that this project required around two and a half man-years of work.
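The scheduling idea in this abstract, a pool of worker threads pulling independent air columns from a queue, can be sketched as follows. This is only an illustration in Python, not the authors' C implementation: `compute_column` is a hypothetical stand-in for the per-column radiation calculation, and the SIMD packing of four columns is not shown.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_column(column):
    """Hypothetical stand-in for the radiation calculation on one air column.

    The real algorithm is far more involved; this just combines the layer
    values in a fixed order so the result does not depend on scheduling.
    """
    total = 0.0
    for layer_value in column:
        total += layer_value * 0.5
    return total


def compute_all_columns(columns, workers=4):
    """Schedule each column as an independent task on a thread pool.

    pool.map queues one task per column and returns the results in input
    order, whichever worker happened to run each task.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compute_column, columns))
```

Because each column is computed by the same sequence of operations regardless of which worker runs it, the parallel result matches a serial loop exactly, which is the property the abstract refers to as bit-wise identical results.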
A benchmark-driven modelling approach for evaluating deployment choices on a multi-core architecture
Abstract:
The complexity of current and emerging architectures provides users with options about how best to use the available resources, but makes predicting performance challenging. In this work a benchmark-driven model is developed for a simple shallow water code on a Cray XE6 system, to explore how deployment choices such as domain decomposition and core affinity affect performance. The resource sharing present in modern multi-core architectures adds various levels of heterogeneity to the system. Shared resources often include cache, memory, network controllers and in some cases floating point units (as in the AMD Bulldozer), which means that the access time depends on the mapping of application tasks and on the core's location within the system. Heterogeneity further increases with the use of hardware accelerators such as GPUs and the Intel Xeon Phi, where many specialist cores are attached to general-purpose cores. This trend for shared resources and non-uniform cores is expected to continue into the exascale era. The complexity of these systems means that various runtime scenarios are possible, and it has been found that under-populating nodes, altering the domain decomposition and non-standard task-to-core mappings can dramatically alter performance. Finding this out, however, is often a process of trial and error. To better inform this process, a performance model was developed for a simple regular grid-based kernel code, shallow. The code comprises two distinct types of work, loop-based array updates and nearest-neighbour halo exchanges. Separate performance models were developed for each part, both based on a similar methodology. Application-specific benchmarks were run to measure performance for different problem sizes under different execution scenarios. These results were then fed into a performance model that derives resource usage for a given deployment scenario, with interpolation between results as necessary.
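A minimal sketch of the benchmark-driven approach described above, assuming the simplest possible form: each part of the code (array updates and halo exchanges) has a table of measured (problem size, time) pairs, and the model interpolates linearly between them. The function names, the use of linear interpolation, the clamping at the table ends and the additive combination of the two parts are all assumptions for illustration; the abstract does not specify them.

```python
from bisect import bisect_left


def interpolate(points, x):
    """Linearly interpolate y at x from (x, y) pairs sorted by ascending x.

    Values outside the measured range are clamped to the nearest
    benchmark result (an assumption; the source does not say how
    out-of-range sizes are handled).
    """
    xs = [p[0] for p in points]
    if x <= xs[0]:
        return points[0][1]
    if x >= xs[-1]:
        return points[-1][1]
    i = bisect_left(xs, x)
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def predict_step_time(loop_bench, halo_bench, local_points, halo_bytes):
    """Estimate time per step as loop-update time plus halo-exchange time.

    loop_bench: measured (grid points per task, seconds) pairs.
    halo_bench: measured (halo message bytes, seconds) pairs.
    Both tables would come from application-specific benchmarks run
    under the deployment scenario being evaluated.
    """
    return interpolate(loop_bench, local_points) + interpolate(halo_bench, halo_bytes)
```

In use, one would hold a separate pair of tables for each execution scenario (for example, fully populated versus under-populated nodes) and compare the predicted times across candidate domain decompositions.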