966 results for 100606 Processor Architectures


Abstract:

In a three-player quantum 'Dilemma' game, each player takes independent decisions to maximize his/her individual gain. The optimal strategy in the quantum version of this game has a higher payoff than its classical counterpart. However, this advantage is lost if the initial qubits provided to the players are from a noisy source. We have experimentally implemented the three-player quantum version of the 'Dilemma' game as described by Johnson [N.F. Johnson, Phys. Rev. A 63 (2001) 020302(R)] using a nuclear magnetic resonance quantum information processor, and have experimentally verified that the payoff of the quantum game for various levels of corruption matches the theoretical payoff. (c) 2007 Elsevier Inc. All rights reserved.

Abstract:

The use of social networking has exploded, with millions of people using various web- and mobile-based services around the world. This increase in social networking use has led to user anxiety related to privacy and the unauthorised exposure of personal information. Large-scale sharing in virtual spaces means that researchers, designers and developers now need to reconsider the issues and challenges of maintaining privacy when using social networking services. This paper provides a comprehensive survey of the current state of the art in privacy in social networks, covering both desktop and mobile uses and devices, from various architectural vantage points. The survey will assist researchers and analysts in academia and industry to move towards mitigating many of the privacy issues in social networks.

Abstract:

Flexible constraint length channel decoders are required for software defined radios. This paper presents a novel scalable scheme for realizing flexible constraint length Viterbi decoders on a de Bruijn interconnection network. Architectures for flexible decoders using the flattened butterfly and shuffle-exchange networks are also described. It is shown that these networks provide favourable substrates for realizing flexible convolutional decoders. Synthesis results for the three networks are provided and a comparison is performed. An architecture based on a 2D-mesh, which is a topology having a nominally lesser silicon area requirement, is also considered as a fourth point for comparison. It is found that of all the networks considered, the de Bruijn network offers the best tradeoff in terms of area versus throughput.
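
To illustrate why a de Bruijn network is a natural substrate for the decoder described in the abstract above, the following minimal sketch lists the connectivity of a binary de Bruijn graph. It assumes the common convention in which a trellis state is a shift register of `k_bits` bits and the newest input bit is shifted into the least significant position. The function names are hypothetical and the sketch does not reproduce the paper's architecture.

```python
def de_bruijn_successors(state: int, k_bits: int) -> tuple[int, int]:
    """Return the two successors of `state` in a binary de Bruijn graph.

    The graph has 2**k_bits nodes. Shifting the state left by one bit and
    appending a 0 or a 1 gives the two outgoing edges, which is the same
    rule that generates the state transitions of a binary shift register.
    """
    mask = (1 << k_bits) - 1          # keep only the lowest k_bits bits
    shifted = (state << 1) & mask     # drop the oldest bit
    return shifted, shifted | 1       # append input bit 0 or 1


def de_bruijn_edges(k_bits: int) -> list[tuple[int, int]]:
    """List every directed edge (source, target) of the graph."""
    return [
        (source, target)
        for source in range(1 << k_bits)
        for target in de_bruijn_successors(source, k_bits)
    ]
```

Under this assumption every node has exactly two outgoing edges, so a graph with 8 nodes has 16 directed edges, counting the self-loops at the all-zeros and all-ones states.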

Abstract:

The problem of scheduling divisible loads in distributed computing systems in the presence of processor release times is considered. The objective is to find the optimal sequence of load distribution and the optimal load fractions assigned to each processor in the system such that the processing time of the entire processing load is a minimum. This is a difficult combinatorial optimization problem, and hence a genetic algorithm approach is presented for its solution.

Abstract:

In this paper we propose a nonlinear preprocessor for enhancing the performance of processors used for direction-of-arrival (DOA) estimation in heavy-tailed non-Gaussian noise. The preprocessor, based on the phenomenon of suprathreshold stochastic resonance (SSR), provides SNR gain. The preprocessed data is used for DOA estimation by the MUSIC algorithm. Simulation results are presented to show that the SSR preprocessor provides a significant improvement in the performance of MUSIC in heavy-tailed noise at low SNR.

Abstract:

We provide a comparative performance analysis of network architectures for beacon-enabled Zigbee sensor clusters using the CSMA/CA MAC defined in the IEEE 802.15.4 standard, and organised as (i) a star topology, and (ii) a two-hop topology. We provide analytical models for obtaining performance measures such as mean network delay and mean node lifetime. We find that the star topology is substantially superior to the two-hop topology in both delay performance and lifetime performance.

Abstract:

Processor architects have the challenging task of evaluating a large design space consisting of several interacting parameters and optimizations. In order to assist architects in making crucial design decisions, we build linear regression models that relate processor performance to micro-architecture parameters, using simulation-based experiments. We obtain good approximate models using an iterative process in which Akaike's information criterion is used to extract a good linear model from a small set of simulations, and limited further simulation is guided by the model using D-optimal experimental designs. The iterative process is repeated until the desired error bounds are achieved. We used this procedure to establish the relationship of the CPI performance response to 26 key micro-architectural parameters using a detailed cycle-by-cycle superscalar processor simulator. The resulting models provide a significance ordering on all micro-architectural parameters and their interactions, and explain the performance variations of micro-architectural techniques.

Abstract:

The author presents adaptive control techniques for controlling the flow of real-time jobs from the peripheral processors (PPs) to the central processor (CP) of a distributed system with a star topology. He considers two classes of flow control mechanisms: (1) proportional control, where a certain proportion of the load offered to each PP is sent to the CP, and (2) threshold control, where there is a maximum rate at which each PP can send jobs to the CP. The problem is to obtain good algorithms for dynamically adjusting the control level at each PP in order to prevent overload of the CP, when the load offered by the PPs is unknown and varying. The author formulates the problem approximately as a standard system control problem in which the system has unknown parameters that are subject to change. Using well-known techniques (e.g., naive-feedback-controller and stochastic approximation techniques), he derives adaptive controls for the system control problem. He demonstrates the efficacy of these controls in the original problem by using the control algorithms in simulations of a queuing model of the CP and the load controls.
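
The two classes of flow control named in the abstract above can be sketched as follows. This is only an illustration of the definitions given there, under the assumption that load is expressed as a job rate per peripheral processor. The function and parameter names are hypothetical, and the adaptive adjustment of the control level, which is the paper's actual contribution, is not shown.

```python
def proportional_control(offered_rate: float, fraction: float) -> float:
    """Forward a fixed proportion of the offered load to the central processor.

    `fraction` is the control level and must lie in [0, 1].
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return offered_rate * fraction


def threshold_control(offered_rate: float, max_rate: float) -> float:
    """Forward the offered load, capped at a maximum sending rate.

    `max_rate` is the control level and must be non-negative.
    """
    if max_rate < 0.0:
        raise ValueError("max_rate must be non-negative")
    return min(offered_rate, max_rate)
```

With an offered rate of 10 jobs per unit time, a proportional control level of 0.25 forwards 2.5 jobs per unit time, while a threshold of 4 forwards 4. Below the threshold, the threshold mechanism forwards the offered load unchanged.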

Abstract:

We present an implementation of a multicast network of processors. The processors are connected in a fully connected network and it is possible to broadcast data in a single instruction. The network works at the processor-memory speed and therefore provides a fast communication link among processors. A number of interesting architectures are possible using such a network. We show some of these architectures which have been implemented and are functional. We also show the system software calls which allow programming of these machines in parallel mode.

Abstract:

The Extended Hypercube is a new approach in multiprocessor architectures that reduces the communication burden on the processor elements. We propose a scheme for implementing such an architecture using INMOS transputers as the processor and controller elements to achieve a very high computation-to-communication ratio.

Abstract:

A simple and efficient algorithm for the bandwidth reduction of sparse symmetric matrices is proposed. It involves column-row permutations and is well-suited to map onto the linear array topology of the SIMD architectures. The efficiency of the algorithm is compared with the other existing algorithms. The interconnectivity and the memory requirement of the linear array are discussed and the complexity of its layout area is derived. The parallel version of the algorithm mapped onto the linear array is then introduced and is explained with the help of an example. The optimality of the parallel algorithm is proved by deriving the time complexities of the algorithm on a single processor and the linear array.

Abstract:

In this paper we develop a multithreaded VLSI processor linear array architecture to render complex environments based on the radiosity approach. The processing elements are identical and multithreaded. They work in Single Program Multiple Data (SPMD) mode. A new algorithm to do the radiosity computations based on the progressive refinement approach[2] is proposed. Simulation results indicate that the architecture is latency tolerant and scalable. It is shown that a linear array of 128 uni-threaded processing elements sustains a throughput close to 0.4 million patches/sec.

Abstract:

We consider the problem of optimally scheduling a processor executing a multilayer protocol in an intelligent Network Interface Controller (NIC). In particular, we assume a typical LAN environment with class 4 transport service, a connectionless network service, and a class 1 link level protocol. We develop a queuing model for the problem. In the most general case this becomes a cyclic queuing network in which some queues have dedicated servers, and the others have a common schedulable server. We use sample path arguments and Markov decision theory to determine optimal service schedules. The optimal throughputs are compared with those obtained with simple policies. The optimal policy yields up to 25% improvement in some cases. In other cases, the optimal policy does only slightly better than much simpler policies.

Abstract:

Clustered VLIW architectures solve the scalability problem associated with flat VLIW architectures by partitioning the register file and connecting only a subset of the functional units to a register file. However, inter-cluster communication in clustered architectures leads to increased leakage in functional components and a high number of register accesses. In this paper, we propose compiler scheduling algorithms targeting two previously ignored power-hungry components in clustered VLIW architectures, viz., the instruction decoder and the register file. We consider a split decoder design and propose a new energy-aware instruction scheduling algorithm that provides 14.5% and 17.3% benefit in decoder power consumption on average over a purely hardware-based scheme in the context of 2-clustered and 4-clustered VLIW machines. In the case of register files, we propose two new scheduling algorithms that exploit limited register snooping capability to reduce extra register file accesses. The proposed algorithms reduce register file power consumption on average by 6.85% and 11.90% (10.39% and 17.78%), respectively, along with performance improvements of 4.81% and 5.34% (9.39% and 11.16%) over a traditional greedy algorithm for 2-clustered (4-clustered) VLIW machines. (C) 2010 Elsevier B.V. All rights reserved.

Abstract:

In this paper we address a scheduling problem for minimising total weighted tardiness. The motivation for the paper comes from the automobile gear manufacturing process. We consider the bottleneck operation of the heat treatment stage of gear manufacturing. Real-life scenarios like unequal release times, incompatible job families, non-identical job sizes and allowance for job splitting have been considered. A mathematical model taking into account dynamic starting conditions has been developed. Due to the NP-hard nature of the problem, a few heuristic algorithms have been proposed. The performance of the proposed heuristic algorithms is evaluated: (a) in comparison with the optimal solution for small problem instances, and (b) in comparison with the 'estimated optimal solution' for large problem instances. Extensive computational analyses reveal that the proposed heuristic algorithms are capable of consistently obtaining near-optimal solutions (that is, close to the statistically estimated optimum) in very reasonable computational time.
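
For readers unfamiliar with the objective named in the abstract above, the following minimal sketch evaluates total weighted tardiness for a schedule that has already been fixed. It uses the standard definition, in which a job's tardiness is the amount by which its completion time exceeds its due date, or zero if it finishes on time. The tuple layout and function name are assumptions made for illustration, and the sketch does not model release times, job families, job sizes or job splitting.

```python
def total_weighted_tardiness(jobs: list[tuple[float, float, float]]) -> float:
    """Sum weight * max(0, completion - due_date) over all jobs.

    Each job is given as (weight, completion_time, due_date).
    """
    return sum(
        weight * max(0.0, completion - due_date)
        for weight, completion, due_date in jobs
    )


# Three jobs: the first is 2 units late, the second is early,
# and the third is 3 units late.
example_jobs = [(2.0, 10.0, 8.0), (1.0, 5.0, 7.0), (3.0, 12.0, 9.0)]
example_value = total_weighted_tardiness(example_jobs)  # 2*2 + 0 + 3*3 = 13.0
```

Early completion earns no credit under this objective, which is why the second job contributes zero rather than a negative term.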