364 results for Cpu


Relevance:

20.00%

Publisher:

Abstract:

Symmetric multi-processor (SMP) systems, or multiple-CPU servers, are suitable for implementing parallel algorithms because they employ dedicated communication devices to enhance the inter-processor communication bandwidth, so that better performance can be obtained. However, the cost of a multiple-CPU server is high, and the server is therefore usually shared among many users. The workload due to other users will certainly affect the performance of parallel programs, so it is desirable to derive a method to optimize parallel programs under different loading conditions. In this paper, we present a simple method, applicable to SPMD-type parallel programs, that improves speedup by controlling the number of threads within the programs.
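
As a rough illustration of the idea, the sketch below adjusts the number of OpenMP threads for an SPMD region according to the machine's current load average. It assumes a Linux system with OpenMP; the load policy, names, and loop body are illustrative and are not taken from the paper.

```cpp
// Sketch: adapt the number of OpenMP threads in an SPMD loop to the current
// system load (Linux getloadavg). Names and the policy are illustrative.
#include <omp.h>
#include <cstdlib>   // getloadavg
#include <cstdio>
#include <algorithm>
#include <vector>

int threads_for_current_load(int hw_threads) {
    double load[1] = {0.0};
    if (getloadavg(load, 1) < 1) return hw_threads;        // fallback
    // Use only the cores not already busy with other users' work.
    int free_cores = hw_threads - static_cast<int>(load[0] + 0.5);
    return std::max(1, std::min(hw_threads, free_cores));
}

int main() {
    const int hw = omp_get_num_procs();
    std::vector<double> a(1 << 20, 1.0);

    int nthreads = threads_for_current_load(hw);
    omp_set_num_threads(nthreads);
    std::printf("running SPMD region with %d of %d threads\n", nthreads, hw);

    double sum = 0.0;
    #pragma omp parallel for reduction(+ : sum)
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += a[i] * a[i];

    std::printf("sum = %f\n", sum);
    return 0;
}
```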

Relevance:

20.00%

Publisher:

Abstract:

Managing the heat produced by computer processors is an important issue today, especially as processor sizes shrink rapidly while transistor counts grow rapidly. This poster describes a preliminary study of adding carbon nanotubes (CNTs) to a standard silicon paste covering a CPU. Measurements were made in two rounds of tests to compare the rate of cool-down with and without CNTs present. The silicon paste acts as an interface between the CPU and the heat sink, increasing the rate of heat transfer away from the CPU. CNTs were added to the silicon paste at 0.05% by weight; they were not aligned. A series of K-type thermocouples was used to measure the temperature as a function of time in the vicinity of the CPU, following its shut-off. An Omega data acquisition system was attached to the thermocouples. The CPU temperature was not measured directly because attaching a thermocouple would have prevented its automatic shut-off. A thermocouple in the paste containing the CNTs actually reached a higher temperature than in the standard paste, an effect easily explained. However, the rate of cooling with the CNTs was about 4.55% better.
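
A cooling-rate comparison like the 4.55% figure can be obtained by fitting Newton's law of cooling to each thermocouple series and comparing the fitted rate constants. The sketch below shows one way to do that; the sample times and temperatures are illustrative, not the measurements from this study.

```cpp
// Sketch: estimate the cooling rate constant k from thermocouple samples by
// fitting Newton's law of cooling, T(t) = T_amb + (T0 - T_amb) * exp(-k t),
// as a straight line in log space. Sample values are illustrative only.
#include <cmath>
#include <cstdio>
#include <vector>

// Least-squares slope of ln(T - T_amb) versus t gives -k.
double cooling_rate(const std::vector<double>& t,
                    const std::vector<double>& temp, double t_amb) {
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    const double n = static_cast<double>(t.size());
    for (std::size_t i = 0; i < t.size(); ++i) {
        const double y = std::log(temp[i] - t_amb);
        sx += t[i]; sy += y; sxx += t[i] * t[i]; sxy += t[i] * y;
    }
    const double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    return -slope;   // k, in 1/s
}

int main() {
    // Illustrative samples: time (s) and temperature (deg C) near the CPU.
    std::vector<double> t    = {0, 30, 60, 90, 120, 150};
    std::vector<double> temp = {62.0, 54.1, 47.9, 43.0, 39.2, 36.1};
    const double t_amb = 24.0;

    double k_plain = cooling_rate(t, temp, t_amb);   // paste without CNTs
    // A second series measured with the CNT-doped paste would be fitted the
    // same way, and the two rate constants compared as a percentage change.
    std::printf("k = %.4f 1/s\n", k_plain);
    return 0;
}
```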

Relevance:

20.00%

Publisher:

Abstract:

We propose a methodology for optimizing the execution of data parallel (sub-)tasks on CPU and GPU cores of the same heterogeneous architecture. The methodology is based on two main components: i) an analytical performance model for scheduling tasks among CPU and GPU cores, such that the global execution time of the overall data parallel pattern is optimized; and ii) an autonomic module which uses the analytical performance model to implement the data parallel computations in a completely autonomic way, requiring no programmer intervention to optimize the computation across CPU and GPU cores. The analytical performance model uses a small set of simple parameters to devise a partitioning, between CPU and GPU cores, of the tasks derived from structured data parallel patterns/algorithmic skeletons. The model takes into account both hardware-related and application-dependent parameters. It computes the percentage of tasks to be executed on CPU and GPU cores such that both kinds of cores are exploited and performance figures are optimized. The autonomic module, implemented in FastFlow, executes a generic map (reduce) data parallel pattern, scheduling part of the tasks to the GPU and part to the CPU cores so as to achieve optimal execution time. Experimental results on state-of-the-art CPU/GPU architectures are presented that assess both the properties of the performance model and the effectiveness of the autonomic module. © 2013 IEEE.
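
The kind of split such a model produces can be illustrated with a simple proportional partitioning by measured service rate. This is only a sketch under assumed per-task times; the paper's actual model and its FastFlow integration are not reproduced here.

```cpp
// Sketch: split a batch of data-parallel tasks between CPU and GPU cores in
// proportion to their measured service rates, so both sides finish at about
// the same time. The formula is an illustrative stand-in for the paper's model.
#include <cstdio>

struct Split { int cpu_tasks; int gpu_tasks; };

// t_cpu / t_gpu: measured time per task on one CPU core / on the GPU (ms).
// cpu_cores: number of CPU cores dedicated to the map pattern.
Split partition(int total_tasks, double t_cpu, double t_gpu, int cpu_cores) {
    const double cpu_rate = cpu_cores / t_cpu;   // tasks per ms, all CPU cores
    const double gpu_rate = 1.0 / t_gpu;         // tasks per ms, GPU
    const double gpu_fraction = gpu_rate / (cpu_rate + gpu_rate);
    const int gpu_tasks = static_cast<int>(gpu_fraction * total_tasks + 0.5);
    return {total_tasks - gpu_tasks, gpu_tasks};
}

int main() {
    // Example: 10,000 tasks, 2 ms/task per CPU core (8 cores), 0.1 ms/task on GPU.
    Split s = partition(10000, 2.0, 0.1, 8);
    std::printf("CPU: %d tasks, GPU: %d tasks\n", s.cpu_tasks, s.gpu_tasks);
    return 0;
}
```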

Relevance:

20.00%

Publisher:

Abstract:

The design cycle for complex special-purpose computing systems is extremely costly and time-consuming. It involves a multiparametric design space exploration for optimization, followed by design verification. Designers of special-purpose VLSI implementations often need to explore parameters, such as optimal bitwidth and data representation, through time-consuming Monte Carlo simulations. A prominent example of this simulation-based exploration process is the design of decoders for error correcting systems, such as the Low-Density Parity-Check (LDPC) codes adopted by modern communication standards, which involves thousands of Monte Carlo runs for each design point. Currently, high-performance computing offers a wide set of acceleration options that range from multicore CPUs to Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). The exploitation of diverse target architectures is typically associated with developing multiple code versions, often using distinct programming paradigms. In this context, we evaluate the concept of retargeting a single OpenCL program to multiple platforms, thereby significantly reducing design time. A single OpenCL-based parallel kernel is used without modifications or code tuning on multicore CPUs, GPUs, and FPGAs. We use SOpenCL (Silicon to OpenCL), a tool that automatically converts OpenCL kernels to RTL, in order to introduce FPGAs as a potential platform for efficiently executing simulations coded in OpenCL. We use LDPC decoding simulations as a case study. Experimental results were obtained by testing a variety of regular and irregular LDPC codes that range from short/medium length (e.g., 8,000 bit) to long length (e.g., 64,800 bit) DVB-S2 codes. We observe that, depending on the design parameters to be simulated and on the dimension and phase of the design, the GPU or the FPGA may suit different purposes more conveniently, thus providing different acceleration factors over conventional multicore CPUs.
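
The host-side mechanics of retargeting one OpenCL kernel are sketched below: the same kernel source is built for every device the runtime exposes (CPU, GPU, or other accelerators). The trivial kernel is a stand-in for the LDPC decoding kernels, error handling is minimal, and the FPGA path through SOpenCL, which generates RTL, is not shown.

```cpp
// Sketch: build the *same* OpenCL kernel source for every device found.
#include <CL/cl.h>
#include <cstdio>
#include <vector>

static const char* kSource =
    "__kernel void saxpy(__global float* y, __global const float* x, float a) {"
    "    size_t i = get_global_id(0);"
    "    y[i] = a * x[i] + y[i];"
    "}";

int main() {
    cl_uint np = 0;
    clGetPlatformIDs(0, nullptr, &np);
    std::vector<cl_platform_id> platforms(np);
    clGetPlatformIDs(np, platforms.data(), nullptr);

    for (cl_platform_id p : platforms) {
        cl_uint nd = 0;
        if (clGetDeviceIDs(p, CL_DEVICE_TYPE_ALL, 0, nullptr, &nd) != CL_SUCCESS)
            continue;
        std::vector<cl_device_id> devices(nd);
        clGetDeviceIDs(p, CL_DEVICE_TYPE_ALL, nd, devices.data(), nullptr);

        for (cl_device_id d : devices) {
            char name[256] = {0};
            clGetDeviceInfo(d, CL_DEVICE_NAME, sizeof(name), name, nullptr);

            cl_int err = CL_SUCCESS;
            cl_context ctx = clCreateContext(nullptr, 1, &d, nullptr, nullptr, &err);
            cl_program prog =
                clCreateProgramWithSource(ctx, 1, &kSource, nullptr, &err);
            err = clBuildProgram(prog, 1, &d, "", nullptr, nullptr);
            std::printf("device %-40s build %s\n", name,
                        err == CL_SUCCESS ? "ok" : "failed");

            clReleaseProgram(prog);
            clReleaseContext(ctx);
        }
    }
    return 0;
}
```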

Relevance:

20.00%

Publisher:

Abstract:

Data streams are usually generated in an online fashion characterized by huge volume, rapid and unpredictable rates, and fast-changing data characteristics. It has hence been recognized that mining over streaming data requires the problem of limited computational resources to be adequately addressed. Since the arrival rate of data streams can significantly increase and exceed the CPU capacity, the machinery must adapt to this change to guarantee the timeliness of the results. We present an online algorithm to approximate a set of frequent patterns from a sliding window over the underlying data stream, given an a priori CPU capacity. The algorithm automatically detects overload situations and can adaptively shed unprocessed data to guarantee timely results. We theoretically prove, using probabilistic and deterministic techniques, that the error on the output results is bounded within a pre-specified threshold. The empirical results on various datasets also confirm the feasibility of our proposal.
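
A minimal sketch of the load-shedding idea follows: when the measured arrival rate exceeds the CPU capacity, incoming transactions are admitted with probability capacity/rate. The sampling policy and names are illustrative, not the paper's algorithm, which additionally bounds the resulting approximation error.

```cpp
// Sketch: adaptive load shedding for a streaming miner. When the arrival
// rate exceeds the CPU's processing capacity, incoming transactions are
// sampled with probability capacity/rate so processing stays timely.
#include <cstdio>
#include <random>

class LoadShedder {
public:
    explicit LoadShedder(double capacity_tps)
        : capacity_(capacity_tps), gen_(std::random_device{}()) {}

    // Call once per incoming transaction with the current measured rate.
    bool admit(double arrival_rate_tps) {
        const double keep =
            arrival_rate_tps <= capacity_ ? 1.0 : capacity_ / arrival_rate_tps;
        return std::bernoulli_distribution(keep)(gen_);
    }

private:
    double capacity_;     // transactions/second the miner can process
    std::mt19937 gen_;
};

int main() {
    LoadShedder shedder(5000.0);          // capacity: 5,000 tx/s
    int admitted = 0;
    const int n = 100000;
    for (int i = 0; i < n; ++i)
        if (shedder.admit(12000.0))       // overload: 12,000 tx/s arriving
            ++admitted;
    std::printf("admitted %d of %d (expected ~%.0f%%)\n",
                admitted, n, 100.0 * 5000.0 / 12000.0);
    return 0;
}
```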

Relevance:

20.00%

Publisher:

Abstract:

“Play on your CPU” is a mobile application for Android and iOS. The aim of the project is to provide a tool for teaching the operating principles of processors to anyone interested in learning about or deepening their knowledge of these concepts.

Relevance:

20.00%

Publisher:

Abstract:

A parallel algorithm for image noise removal is proposed. The algorithm is based on the peer group concept and uses a fuzzy metric. An optimization study on the use of the CUDA platform to remove impulsive noise with this algorithm is presented. Moreover, an implementation of the algorithm on multi-core platforms using OpenMP is presented. Performance is evaluated in terms of execution time, and the multi-core and GPU implementations are compared against each other and against a combination of both. A performance analysis with large images is conducted in order to identify the number of pixels to allocate to the CPU and the GPU. The observed times show that both devices should be given work, with most of it assigned to the GPU. Results show that parallel implementations of denoising filters on GPUs and multi-cores are highly advisable, and they open the door to using such algorithms for real-time processing.
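
A compact sketch of the peer-group test with a fuzzy similarity measure, parallelised over pixels with OpenMP, is given below. It works on a grayscale image with 3x3 windows; the similarity function, thresholds, and constants are illustrative simplifications of the published colour filters.

```cpp
// Sketch: peer-group detection of impulsive noise with a fuzzy similarity
// measure, parallelised over pixels with OpenMP. Grayscale, 3x3 windows.
#include <omp.h>
#include <algorithm>
#include <cstdio>
#include <vector>

// Fuzzy similarity in (0,1]: close values give results near 1.
inline double fuzzy(double a, double b, double K = 1024.0) {
    return (std::min(a, b) + K) / (std::max(a, b) + K);
}

// Mark a pixel as noisy if fewer than `m` of its 8 neighbours are similar
// enough (fuzzy similarity >= d). Borders are left untouched.
std::vector<char> detect_noise(const std::vector<double>& img, int w, int h,
                               int m = 3, double d = 0.98) {
    std::vector<char> noisy(img.size(), 0);
    #pragma omp parallel for
    for (int y = 1; y < h - 1; ++y) {
        for (int x = 1; x < w - 1; ++x) {
            int peers = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    if ((dx || dy) &&
                        fuzzy(img[y * w + x], img[(y + dy) * w + (x + dx)]) >= d)
                        ++peers;
            noisy[y * w + x] = peers < m;   // too few peers -> impulse
        }
    }
    return noisy;
}

int main() {
    const int w = 512, h = 512;
    std::vector<double> img(static_cast<std::size_t>(w) * h, 128.0);
    img[100 * w + 100] = 255.0;                      // one impulse
    auto mask = detect_noise(img, w, h);
    std::printf("noisy pixels: %ld\n",
                static_cast<long>(std::count(mask.begin(), mask.end(), char(1))));
    return 0;
}
```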

Relevance:

20.00%

Publisher:

Abstract:

A parallel algorithm to remove impulsive noise in digital images using heterogeneous CPU/GPU computing is proposed. The parallel denoising algorithm is based on the peer group concept and uses a Euclidean metric. In order to identify the number of pixels to be allocated to the multi-core CPU and the GPU, a performance analysis using large images is presented. A comparison of the parallel implementation on multi-core, on GPUs, and on a combination of both is performed. Performance has been evaluated in terms of execution time and megapixels per second. We present several optimization strategies that are especially effective for the multi-core environment and demonstrate significant performance improvements. The main advantage of the proposed noise removal methodology is its computational speed, which enables efficient filtering of color images in real-time applications.
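
Reporting performance as execution time and megapixels per second amounts to a timing harness of the following shape; the filter here is a stand-in that only copies pixels, and the frame size is illustrative.

```cpp
// Sketch: timing harness that reports execution time and Mpix/s for a
// denoising filter. The "filter" is a placeholder; real measurements would
// wrap the CPU, GPU, or combined peer-group filter.
#include <chrono>
#include <cstdio>
#include <vector>

void filter_stub(const std::vector<unsigned char>& in,
                 std::vector<unsigned char>& out) {
    out = in;   // placeholder for the actual peer-group filter
}

int main() {
    const int w = 3840, h = 2160;                    // one 4K RGB frame
    std::vector<unsigned char> in(static_cast<std::size_t>(w) * h * 3, 127), out;

    const auto t0 = std::chrono::steady_clock::now();
    filter_stub(in, out);
    const auto t1 = std::chrono::steady_clock::now();

    const double secs = std::chrono::duration<double>(t1 - t0).count();
    const double mpix = (static_cast<double>(w) * h) / 1e6;
    std::printf("time: %.4f s, throughput: %.1f Mpix/s\n", secs, mpix / secs);
    return 0;
}
```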

Relevance:

20.00%

Publisher:

Abstract:

Mode of access: Internet.

Relevance:

20.00%

Publisher:

Abstract:

Due to the increasing energy consumption of cloud data centers, energy saving has become a vital objective in designing the underlying cloud infrastructures. A precise energy consumption model is the foundation of many energy-saving strategies. This paper focuses on exploring the energy consumption of virtual machines running various CPU-intensive activities on a cloud server, using two types of models: traditional time-series models, such as ARMA and ES, and time-series segmentation models, such as the sliding-window model and the bottom-up model. We have built a cloud environment using OpenStack and conducted extensive experiments to analyze and compare the prediction accuracy of these models. The results indicate that the ES model performs better than the ARMA model in predicting the energy consumption of known activities. When predicting the energy consumption of unknown activities, both the sliding-window segmentation model and the bottom-up segmentation model achieve satisfactory performance, with the former slightly better than the latter.
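
For reference, simple exponential smoothing (ES) forecasts the next value of a series as a weighted blend of the latest observation and the previous smoothed value. The sketch below applies it to a per-interval energy series; the smoothing factor and the sample values are illustrative, not the paper's configuration.

```cpp
// Sketch: simple exponential smoothing over a per-interval energy series.
// s_t = alpha * x_t + (1 - alpha) * s_{t-1}; the last smoothed value is used
// as the one-step-ahead prediction.
#include <cstdio>
#include <vector>

double es_forecast(const std::vector<double>& x, double alpha) {
    double s = x.front();                 // initialise with the first sample
    for (std::size_t t = 1; t < x.size(); ++t)
        s = alpha * x[t] + (1.0 - alpha) * s;
    return s;                             // forecast for the next interval
}

int main() {
    // Energy per 30 s interval (joules) for a CPU-intensive VM, illustrative.
    std::vector<double> energy = {510, 525, 518, 540, 533, 529, 545, 538};
    std::printf("next-interval forecast: %.1f J\n", es_forecast(energy, 0.4));
    return 0;
}
```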

Relevance:

10.00%

Publisher:

Abstract:

Games and related virtual environments have been a much-hyped area of the entertainment industry. The classic quote is that games are now approaching the size of Hollywood box office sales [1]. Books are now appearing that talk up the influence of games on business [2], and games are one of the key drivers of present hardware development. Some of this 3D technology is now embedded right down at the operating system level via the Windows Presentation Foundation; hit Windows/Tab on your Vista box to find out... In addition to this continued growth in the area of games, a number of factors affect its development in the business community. Firstly, the average age of gamers is approaching the mid-thirties, so a number of people in management positions in large enterprises are experienced in using 3D entertainment environments. Secondly, driven by the demand for more computational power in both CPUs and Graphics Processing Units (GPUs), the average desktop, and any decent laptop, can run a game or virtual environment. In fact, the demonstrations at the end of this paper were developed at the Queensland University of Technology (QUT) on a standard Software Operating Environment with an Intel Dual Core CPU and the basic Intel graphics option. This means the potential exists for easy uptake of such technology because 1. a broad range of workers is regularly exposed to 3D virtual environment software via games; and 2. present desktop computing power is now strong enough to roll out a virtual environment solution across an entire enterprise. We believe such visual simulation environments can have a great impact in the area of business process modeling. Accordingly, in this article we outline the communication capabilities of such environments, which offer fantastic possibilities for business process modeling applications, where enterprises need to create, manage, and improve their business processes, and then communicate those processes to stakeholders, both process-cognizant and non-process-cognizant. The article concludes with a demonstration of the work we are doing in this area at QUT.

Relevance:

10.00%

Publisher:

Abstract:

The loosely coupled and dynamic nature of web services architectures has many benefits, but it also leads to an increased vulnerability to denial-of-service attacks. While many papers have surveyed and described these vulnerabilities, they are often theoretical, lack experimental data to validate them, and assume an obsolete state of web services technologies. This paper describes experiments involving several denial-of-service vulnerabilities in well-known web services platforms, including Java Metro, Apache Axis, and Microsoft .NET. The results both confirm and refute the presence of some of the best-known vulnerabilities in web services technologies. Specifically, major web services platforms appear to cope well with attacks that target memory exhaustion. However, attacks targeting CPU-time exhaustion are still effective, regardless of the victim's platform.

Relevance:

10.00%

Publisher:

Abstract:

Computer simulation is a versatile and commonly used tool for the design and evaluation of systems with different degrees of complexity. Power distribution systems and electric railway networks are areas to which computer simulations are being heavily applied. A dominant factor in evaluating the performance of a software simulator is its processing time, especially in the case of real-time simulation. Parallel processing provides a viable means to reduce the computing time and is therefore suitable for building real-time simulators. In this paper, we present different issues related to solving the power distribution system with parallel computing based on a multiple-CPU server, and we concentrate, in particular, on the speedup performance of such an approach.
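
Speedup performance is typically reported as the ratio of single-CPU to multi-CPU solve time, together with the resulting parallel efficiency, and compared against the Amdahl bound for the solver's serial fraction. The sketch below computes these figures; all numbers are illustrative, not the paper's measurements.

```cpp
// Sketch: standard speedup and efficiency figures for a parallel simulator,
// plus the Amdahl bound for an assumed serial fraction of the solver.
#include <cstdio>

int main() {
    const double t_serial = 120.0;   // single-CPU solve time (s), illustrative
    const double t_parallel = 22.0;  // time on p CPUs (s), illustrative
    const int p = 8;

    const double speedup = t_serial / t_parallel;
    const double efficiency = speedup / p;

    const double f = 0.05;           // assumed serial fraction of the solver
    const double amdahl = 1.0 / (f + (1.0 - f) / p);

    std::printf("speedup %.2fx, efficiency %.0f%%, Amdahl bound %.2fx\n",
                speedup, 100.0 * efficiency, amdahl);
    return 0;
}
```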