985 results for intel processor


Relevance:

60.00%

Publisher:

Relevance:

30.00%

Publisher:

Abstract:

The Cell Broadband Engine (BE) Architecture is a new heterogeneous multi-core architecture targeted at compute-intensive workloads. The architecture of the Cell BE has several features that are unique in high-performance general-purpose processors, most notably the extensive support for vectorization, scratch pad memories and explicit programming of direct memory accesses (DMAs) and mailbox communication. While these features strongly increase programming complexity, it is generally claimed that significant speedups can be obtained by using Cell BE processors. This paper presents our experiences with using the Cell BE architecture to accelerate Clustal W, a bio-informatics program for multiple sequence alignment. We report on how we apply the unique features of the Cell BE to Clustal W and how important each is in obtaining high performance. By making extensive use of vectorization and by parallelizing the application across all cores, we demonstrate a speedup of 24.4 times when using 16 synergistic processor units on a QS21 Cell Blade compared to single-thread execution on the power processing unit. As the Cell BE exploits a large number of slim cores, our highly optimized implementation is just 3.8 times faster than a 3-thread version running on an Intel Core2 Duo, as the latter processor exploits a small number of fat cores.

Relevance:

30.00%

Publisher:

Abstract:

Fueled by an increasing human appetite for high computing performance, semiconductor technology has now marched into the deep sub-micron era. As transistor size keeps shrinking, more and more transistors are integrated into a single chip. This has tremendously increased the power consumption and heat generation of IC chips. The rapidly growing heat dissipation greatly increases the packaging and cooling costs, and adversely affects the performance and reliability of a computing system. In addition, it reduces the processor's life span and may even crash the entire computing system. Therefore, dynamic thermal management (DTM) is becoming a critical problem in modern computer system design. Extensive theoretical research has been conducted to study the DTM problem. However, most of this work is based on theoretically idealized assumptions or simplified models. While these models and assumptions help to greatly simplify a complex problem and make it theoretically manageable, practical computer systems and applications must deal with many practical factors and details beyond these models or assumptions. The goal of our research was to develop a test platform that can be used to validate theoretical results on DTM under well-controlled conditions, to identify the limitations of existing theoretical results, and also to develop new and practical DTM techniques. This dissertation details the background and our research efforts in this endeavor. Specifically, in our research, we first developed a customized test platform based on an Intel desktop. We then tested a number of related theoretical works and examined their limitations under the practical hardware environment. With these limitations in mind, we developed a new reactive thermal management algorithm for single-core computing systems to optimize the throughput under a peak temperature constraint. We further extended our research to a multicore platform and developed an effective proactive DTM technique for throughput maximization on multicore processors based on task migration and dynamic voltage and frequency scaling. The significance of our research lies in the fact that it complements the current extensive theoretical research in dealing with increasingly critical thermal problems and enabling the continuous evolution of high-performance computing systems.

Relevance:

20.00%

Publisher:

Abstract:

A small group of companies including Intel, Microsoft, and Cisco have used "platform leadership" with great effect as a means for driving innovation and accelerating market growth within their respective industries. Prior research in this area emphasizes that trust plays a critical role in the success of this strategy. However, many of the categorizations of trust discussed in the literature tend to ignore or undervalue the fact that trust and power are often functionally equivalent, and that the coercion of weaker partners is sometimes misdiagnosed as collaboration. In this paper, I use case study data focusing on Intel's shift from ceramic/wire-bonded packaging to organic/C4 packaging to characterize the relationships between Intel and its suppliers, and to determine if these links are based on power in addition to trust. The case study shows that Intel's platform leadership strategy is built on a balance of both trust and a relatively benevolent form of power that is exemplified by the company's "open kimono" principle, through which Intel insists that suppliers share detailed financial data and highly proprietary technical information to achieve mutually advantageous objectives. By explaining more completely the nature of these inter-firm linkages, this paper usefully extends our understanding of how platform leadership is maintained by Intel, and contributes to the literature by showing how trust and power can be used simultaneously within an inter-firm relationship in a way that benefits all of the stakeholders.

Relevance:

20.00%

Publisher:

Abstract:

An Application Specific Instruction-set Processor (ASIP) is a specialized processor tailored to run one or more particular applications efficiently. However, when there are multiple candidate applications in the application domain, it is difficult and time-consuming to find the optimum set of applications to be implemented. Existing ASIP design approaches perform this selection manually, based on a designer's knowledge. We help cut down the number of candidate applications by devising a classification method to cluster similar applications based on the special-purpose operations they share. This provides a significant reduction in the comparison overhead while resulting in customized ASIP instruction sets that can benefit a whole family of related applications. Our method gives users the ability to quantify the degree of similarity between the sets of shared operations to control the size of clusters. A case study involving twelve algorithms confirms that our approach can successfully cluster similar algorithms together based on the similarity of their component operations.
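The clustering idea in this abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure or clustering rule the authors use, so everything here is an assumption for illustration: the Jaccard index as the similarity measure, a user-chosen `threshold` to control cluster size, single-linkage grouping, and the hypothetical application and operation names.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Similarity of two operation sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_by_shared_ops(apps: Dict[str, Set[str]], threshold: float) -> List[List[str]]:
    """Group applications whose operation sets are at least `threshold` similar.

    Single-linkage: two applications land in the same cluster if a chain of
    pairwise-similar applications connects them. (Assumed rule, not the paper's.)
    """
    names = sorted(apps)
    parent = {n: n for n in names}

    def find(n: str) -> str:
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard(apps[a], apps[b]) >= threshold:
                parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Hypothetical applications and the special-purpose operations each one needs.
apps = {
    "fft": {"mac", "butterfly", "bitrev"},
    "dct": {"mac", "butterfly"},
    "fir": {"mac", "shift"},
    "aes": {"sbox", "xor"},
}

strict = cluster_by_shared_ops(apps, threshold=0.5)
loose = cluster_by_shared_ops(apps, threshold=0.3)
print(strict)
print(loose)
```

Lowering the threshold merges more applications into one cluster, which is one way a user could trade cluster size against how closely the members resemble each other.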

Relevance:

20.00%

Publisher:

Abstract:

A novel method, designated the holographic spectrum reconstruction (HSR) method, is proposed for achieving simultaneous display of the spectrum and image of an object in a single plane. A study of the scaling behaviour of both the spectrum and the image has been carried out and based on this study, it is demonstrated that a lensless coherent optical processor can be realized.

Relevance:

20.00%

Publisher:

Abstract:

In the modern business environment, meeting due dates and avoiding delay penalties are very important goals that can be accomplished by minimizing total weighted tardiness. We consider a scheduling problem in a system of parallel processors with the objective of minimizing total weighted tardiness. Our aim in the present work is to develop an algorithm for the parallel processor problem that is more efficient than the heuristics available in the literature, and we propose an ant colony optimization (ACO) approach for this problem. Extensive experimentation is conducted to evaluate the performance of the ACO approach on different problem sizes with varied tardiness factors. Our experimentation shows that the proposed ant colony optimization algorithm gives promising results compared to the best of the available heuristics.
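For readers unfamiliar with the objective named in this abstract, total weighted tardiness is conventionally the sum over jobs of w_j * max(0, C_j - d_j), where C_j is the completion time, d_j the due date, and w_j the weight. The sketch below only evaluates that objective for a given assignment of jobs to processors; it is not the ACO algorithm from the paper, and the job data and the `Job` tuple layout are hypothetical.

```python
from typing import List, Tuple

# Hypothetical job layout: (processing_time, due_date, weight)
Job = Tuple[float, float, float]


def total_weighted_tardiness(schedule: List[List[Job]]) -> float:
    """Evaluate sum of weight * max(0, completion - due_date) over all jobs.

    `schedule[i]` is the ordered list of jobs run on processor i; each
    processor starts at time 0 and runs its jobs back to back.
    """
    total = 0.0
    for machine_jobs in schedule:
        clock = 0.0
        for proc_time, due, weight in machine_jobs:
            clock += proc_time  # completion time of this job
            total += weight * max(0.0, clock - due)
    return total


schedule = [
    [(3, 4, 2), (2, 4, 1)],  # processor 0: completions 3 and 5, second job is 1 late
    [(4, 3, 3)],             # processor 1: completion 4, job is 1 late with weight 3
]
print(total_weighted_tardiness(schedule))  # 4.0
```

A search heuristic such as ACO would call an evaluation like this many times while exploring different job-to-processor assignments and orderings.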

Relevance:

20.00%

Publisher:

Abstract:

Previous studies have shown that buffering packets in DRAM is a performance bottleneck. To understand the impediments in accessing the DRAM, we developed a detailed Petri net model of an IP forwarding application on the IXP2400 that models the different levels of the memory hierarchy. The cell-based interface used to receive and transmit packets in a network processor leads to some small DRAM accesses. Such narrow accesses to the DRAM expose the bank access latency, reducing the bandwidth that can be realized. With real traces, up to 30% of the accesses are smaller than the cell size, resulting in a 7.7% reduction in DRAM bandwidth. To overcome this problem, we propose buffering these small chunks of data in the on-chip scratchpad memory. This scheme also exploits a greater degree of parallelism between different levels of the memory hierarchy. Using real traces from the Internet, we show that the transmit rate can be improved by an average of 21% over the base scheme without the use of additional hardware. Further, the impact of different traffic patterns on the network processor resources is studied. Under real traffic conditions, we show that the data bus that connects the off-chip packet buffer to the micro-engines is the obstacle to achieving higher throughput.

Relevance:

20.00%

Publisher:

Abstract:

In a three-player quantum 'Dilemma' game, each player takes independent decisions to maximize his/her individual gain. The optimal strategy in the quantum version of this game has a higher payoff compared to its classical counterpart. However, this advantage is lost if the initial qubits provided to the players are from a noisy source. We have experimentally implemented the three-player quantum version of the 'Dilemma' game as described by Johnson [N.F. Johnson, Phys. Rev. A 63 (2001) 020302(R)] using a nuclear magnetic resonance quantum information processor, and have experimentally verified that the payoff of the quantum game for various levels of corruption matches the theoretical payoff. (c) 2007 Elsevier Inc. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

The problem of scheduling divisible loads in distributed computing systems in the presence of processor release times is considered. The objective is to find the optimal sequence of load distribution and the optimal load fractions assigned to each processor in the system such that the processing time of the entire processing load is a minimum. This is a difficult combinatorial optimization problem, and hence a genetic algorithm approach is presented for its solution.

Relevance:

20.00%

Publisher:

Abstract:

Hall thrusters, such as the Stationary Plasma Thruster (SPT), have been widely used on board modern satellites placed in geosynchronous orbits for purposes such as orbit maintenance, repositioning and attitude control. In order to study the performance of the stationary plasma thruster, the thrust it produces has been measured by activating the thruster under vacuum conditions and using a thrust balance with strain gauge sensors. The thruster is activated by switching the necessary power supplies ON and OFF and controlling other parts of the feed system, such as the propellant flow, in a particular sequence. Hitherto, these operations were done manually in the required sequence. This paper reports the attempt made to automate the sequential operation of the power supplies and the necessary control valves of the feed system using an Intel 8051 microcontroller. This automation has made thrust measurements easier and more sophisticated.

Relevance:

20.00%

Publisher:

Abstract:

In this paper we propose a nonlinear preprocessor for enhancing the performance of processors used for direction-of-arrival (DOA) estimation in heavy-tailed non-Gaussian noise. The preprocessor, based on the phenomenon of suprathreshold stochastic resonance (SSR), provides SNR gain. The preprocessed data are used for DOA estimation by the MUSIC algorithm. Simulation results are presented to show that the SSR preprocessor provides a significant improvement in the performance of MUSIC in heavy-tailed noise at low SNR.