217 resultados para Automatic mesh generation


Relevância:

20.00% 20.00%

Publicador:

Resumo:

We describe a System-C based framework we are developing, to explore the impact of various architectural and microarchitectural level parameters of the on-chip interconnection network elements on its power and performance. The framework enables one to choose from a variety of architectural options like topology, routing policy, etc., as well as allows experimentation with various microarchitectural options for the individual links like length, wire width, pitch, pipelining, supply voltage and frequency. The framework also supports a flexible traffic generation and communication model. We provide preliminary results of using this framework to study the power, latency and throughput of a 4x4 multi-core processing array using mesh, torus and folded torus, for two different communication patterns of dense and sparse linear algebra. The traffic consists of both Request-Response messages (mimicing cache accesses)and One-Way messages. We find that the average latency can be reduced by increasing the pipeline depth, as it enables higher link frequencies. We also find that there exists an optimum degree of pipelining which minimizes energy-delay product.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The IEEE 802.16/WiMAX standard has fully embraced multi-antenna technology and can, thus, deliver robust and high transmission rates and higher system capacity. Nevertheless,due to its inherent form-factor constraints and cost concerns, a WiMAX mobile station (MS) should preferably contain fewer radio frequency (RF) chains than antenna elements.This is because RF chains are often substantially more expensive than antenna elements. Thus, antenna selection, wherein a subset of antennas is dynamically selected to connect to the limited RF chains for transceiving, is a highly appealing performance enhancement technique for multi-antenna WiMAX terminals.In this paper, a novel antenna selection protocol tailored for next-generation IEEE 802.16 mobile stations is proposed. As demonstrated by the extensive OPNET simulations, the proposed protocol delivers a significant performance improvement over conventional 802.16 terminals that lack the antenna selection capability. Moreover, the new protocol leverages the existing signaling methods defined in 802.16, thereby incurring a negligible signaling overhead and requiring only diminutive modifications of the standard. To the best of our knowledge, this paper represents the first effort to support antenna selection capability in IEEE 802.16 mobile stations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this article we study the problem of joint congestion control, routing and MAC layer scheduling in multi-hop wireless mesh network, where the nodes in the network are subjected to maximum energy expenditure rates. We model link contention in the wireless network using the contention graph and we model energy expenditure rate constraint of nodes using the energy expenditure rate matrix. We formulate the problem as an aggregate utility maximization problem and apply duality theory in order to decompose the problem into two sub-problems namely, network layer routing and congestion control problem and MAC layer scheduling problem. The source adjusts its rate based on the cost of the least cost path to the destination where the cost of the path includes not only the prices of the links in it but also the prices associated with the nodes on the path. The MAC layer scheduling of the links is carried out based on the prices of the links. We study the e�ects of energy expenditure rate constraints of the nodes on the optimal throughput of the network.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We derive bounds on leptonic double mass insertions of the type delta(l)(i4)delta(l)(4j) in four generational MSSM, using the present limits on l(i) -> l(j) + gamma. Two main features distinguish the rates of these processes in MSSM4 from MSSM3: (a) tan beta is restricted to be very small less than or similar to 3 and (b) the large masses for the fourth generation leptons. In spite of small tan beta, there is an enhancement in amplitudes with LLRR (4 delta(ll)(i4)delta(rr)(4j)) type insertions which pick up the mass of the fourth generation lepton, m(tau'). We find these bounds to be at least two orders of magnitude more stringent than those in MSSM3. (C) 2011 Elsevier B.V. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

MATLAB is an array language, initially popular for rapid prototyping, but is now being increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control flow dominated scalar regions that have an impact on the program's execution time. Today's computer systems have tremendous computing power in the form of traditional CPU cores and throughput oriented accelerators such as graphics processing units(GPUs). Thus, an approach that maps the control flow dominated regions to the CPU and the data parallel regions to the GPU can significantly improve program performance. In this paper, we present the design and implementation of MEGHA, a compiler that automatically compiles MATLAB programs to enable synergistic execution on heterogeneous processors. Our solution is fully automated and does not require programmer input for identifying data parallel regions. We propose a set of compiler optimizations tailored for MATLAB. Our compiler identifies data parallel regions of the program and composes them into kernels. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or GPU so that kernel execution on the CPU and the GPU happens synergistically and the amount of data transfer needed is minimized. In order to ensure required data movement for dependencies across basic blocks, we propose a data flow analysis and edge splitting strategy. Thus our compiler automatically handles composition of kernels, mapping of kernels to CPU and GPU, scheduling and insertion of required data transfer. The proposed compiler was implemented and experimental evaluation using a set of MATLAB benchmarks shows that our approach achieves a geometric mean speedup of 19.8X for data parallel benchmarks over native execution of MATLAB.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Wetlands are the most productive and biologically diverse but very fragile ecosystems. They are vulnerable to even small changes in their biotic and abiotic factors. In recent years, there has been concern over the continuous degradation of wetlands due to unplanned developmental activities. This necessitates inventorying, mapping, and monitoring of wetlands to implement sustainable management approaches. The principal objective of this work is to evolve a strategy to identify and monitor wetlands using temporal remote sensing (RS) data. Pattern classifiers were used to extract wetlands automatically from NIR bands of MODIS, Landsat MSS and Landsat TM remote sensing data. MODIS provided data for 2002 to 2007, while for 1973 and 1992 IR Bands of Landsat MSS and TM (79m and 30m spatial resolution) data were used. Principal components of IR bands of MODIS (250 m) were fused with IRS LISS-3 NIR (23.5 m). To extract wetlands, statistical unsupervised learning of IR bands for the respective temporal data was performed using Bayesian approach based on prior probability, mean and covariance. Temporal analysis of wetlands indicates a sharp decline of 58% in Greater Bangalore attributing to intense urbanization processes, evident from a 466% increase in built-up area from 1973 to 2007.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Technology scaling has caused Negative Bias Temperature Instability (NBTI) to emerge as a major circuit reliability concern. Simultaneously leakage power is becoming a greater fraction of the total power dissipated by logic circuits. As both NBTI and leakage power are highly dependent on vectors applied at the circuit’s inputs, they can be minimized by applying carefully chosen input vectors during periods when the circuit is in standby or idle mode. Unfortunately input vectors that minimize leakage power are not the ones that minimize NBTI degradation, so there is a need for a methodology to generate input vectors that minimize both of these variables.This paper proposes such a systematic methodology for the generation of input vectors which minimize leakage power under the constraint that NBTI degradation does not exceed a specified limit. These input vectors can be applied at the primary inputs of a circuit when it is in standby/idle mode and are such that the gates dissipate only a small amount of leakage power and also allow a large majority of the transistors on critical paths to be in the “recovery” phase of NBTI degradation. The advantage of this methodology is that allowing circuit designers to constrain NBTI degradation to below a specified limit enables tighter guardbanding, increasing performance. Our methodology guarantees that the generated input vector dissipates the least leakage power among all the input vectors that satisfy the degradation constraint. We formulate the problem as a zero-one integer linear program and show that this formulation produces input vectors whose leakage power is within 1% of a minimum leakage vector selected by a search algorithm and simultaneously reduces NBTI by about 5.75% of maximum circuit delay as compared to the worst case NBTI degradation. Our paper also proposes two new algorithms for the identification of circuit paths that are affected the most by NBTI degradation. The number of such paths identified by our algorithms are an order of magnitude fewer than previously proposed heuristics.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Scalable Networks on Chips (NoCs) are needed to match the ever-increasing communication demands of large-scale Multi-Processor Systems-on-chip (MPSoCs) for multi media communication applications. The heterogeneous nature of application specific on-chip cores along with the specific communication requirements among the cores calls for the design of application-specific NoCs for improved performance in terms of communication energy, latency, and throughput. In this work, we propose a methodology for the design of customized irregular networks-on-chip. The proposed method exploits a priori knowledge of the applications communication characteristic to generate an optimized network topology and corresponding routing tables.