909 results for Legacy object oriented code
Abstract:
This paper presents a low-ML-decoding-complexity, full-rate, full-diversity space-time block code (STBC) for a 2 transmit antenna, 2 receive antenna multiple-input multiple-output (MIMO) system, with coding gain equal to that of the best and well-known Golden code for any QAM constellation. Recently, two codes have been proposed (by Paredes, Gershman and Alkhansari and by Sezginer and Sari), which enjoy a lower decoding complexity relative to the Golden code, but have lower coding gain. The 2 × 2 STBC presented in this paper has lower decoding complexity than the Golden code for non-square QAM constellations, while having the same decoding complexity for square QAM constellations. Compared with the Paredes-Gershman-Alkhansari and Sezginer-Sari codes, the proposed code has the same decoding complexity for nonrectangular QAM constellations. Simulation results, which compare the codeword error rate (CER) performance, are presented.
Abstract:
We consider single-source single-sink (ss-ss) multi-hop relay networks, with slow-fading links and single-antenna half-duplex relay nodes. While two-hop cooperative relay networks have been studied in great detail in terms of the diversity-multiplexing tradeoff (DMT), few results are available for more general networks. In this paper, we identify two families of networks that are multi-hop generalizations of the two-hop network: K-Parallel-Path (KPP) networks and layered networks. KPP networks can be viewed as the union of K node-disjoint parallel relaying paths, each of length greater than one. KPP networks are then generalized to KPP(I) networks, which permit interference between paths, and to KPP(D) networks, which possess a direct link from source to sink. We characterize the DMT of these families of networks completely for K > 3. Layered networks are networks comprising layers of relays with edges existing only between adjacent layers, with more than one relay in each layer. We prove that a linear DMT between the maximum diversity dmax and the maximum multiplexing gain of 1 is achievable for single-antenna fully-connected layered networks. This is shown to be equal to the optimal DMT if the number of relaying layers is less than 4. For multiple-antenna KPP and layered networks, we provide an achievable DMT, which is significantly better than known lower bounds for half-duplex networks. For arbitrary multi-terminal wireless networks with multiple source-sink pairs, the maximum achievable diversity is shown to be equal to the min-cut between the corresponding source and the sink, irrespective of whether the network has half-duplex or full-duplex relays.
For arbitrary ss-ss single-antenna directed acyclic networks with full-duplex relays, we prove that a linear tradeoff between maximum diversity and maximum multiplexing gain is achievable. Along the way, we derive the optimal DMT of a generalized parallel channel and derive lower bounds for the DMT of triangular channel matrices, which are useful in DMT computation of various protocols. We also give alternative and often simpler proofs of several existing results and show that codes achieving full diversity on a MIMO Rayleigh fading channel achieve full diversity on arbitrary fading channels. All protocols in this paper are explicit and use only amplify-and-forward (AF) relaying. We also construct codes with short block-lengths based on cyclic division algebras that achieve the optimal DMT for all the proposed schemes. Two key implications of the results in the paper are that the half-duplex constraint does not entail any rate loss for a large class of cooperative networks and that simple AF protocols are often sufficient to attain the optimal DMT.
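The linear tradeoff mentioned in the abstract above can be written out explicitly. This is a sketch in standard DMT notation: the symbols r (multiplexing gain) and d(r) (diversity gain) are assumptions of this note and do not appear in the abstract itself. A straight line from the maximum diversity dmax at r = 0 down to zero diversity at the maximum multiplexing gain r = 1 reads:

```latex
d(r) = d_{\max}\,(1 - r), \qquad 0 \le r \le 1
```

For instance, under this line a scheme operating at r = 1/2 would retain half of the maximum diversity.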
Abstract:
MATLAB is an array language that was initially popular for rapid prototyping but is now increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control flow dominated scalar regions that have an impact on the program's execution time. Today's computer systems have tremendous computing power in the form of traditional CPU cores and throughput oriented accelerators such as graphics processing units (GPUs). Thus, an approach that maps the control flow dominated regions to the CPU and the data parallel regions to the GPU can significantly improve program performance. In this paper, we present the design and implementation of MEGHA, a compiler that automatically compiles MATLAB programs to enable synergistic execution on heterogeneous processors. Our solution is fully automated and does not require programmer input for identifying data parallel regions. We propose a set of compiler optimizations tailored for MATLAB. Our compiler identifies data parallel regions of the program and composes them into kernels. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or GPU so that kernel execution on the CPU and the GPU happens synergistically and the amount of data transfer needed is minimized. In order to ensure required data movement for dependencies across basic blocks, we propose a data flow analysis and edge splitting strategy. Thus our compiler automatically handles composition of kernels, mapping of kernels to CPU and GPU, scheduling and insertion of required data transfer. The proposed compiler was implemented, and experimental evaluation using a set of MATLAB benchmarks shows that our approach achieves a geometric mean speedup of 19.8X for data parallel benchmarks over native execution of MATLAB.
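The headline figure in the abstract above is a geometric mean speedup. As a minimal sketch of how such a summary number is usually computed, the Python below averages per-benchmark speedups in log space. The function name `geometric_mean` and the sample values are hypothetical illustrations and are not data from the paper.

```python
import math


def geometric_mean(values):
    """Return the geometric mean of a non-empty sequence of positive numbers."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    # Average the logarithms, then exponentiate: this avoids overflow
    # from multiplying many large ratios together.
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-benchmark speedups (NOT figures from the paper).
speedups = [2.0, 8.0, 32.0]
print(geometric_mean(speedups))  # approximately 8
```

The geometric mean is the usual choice for ratios such as speedups because one very large ratio does not dominate it the way it would dominate an arithmetic mean.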
Abstract:
NMR spectra of molecules oriented in a liquid-crystalline matrix provide information on the structure and orientation of the molecules. Thermotropic liquid crystals used as orienting media generally yield spectra of strongly coupled spins. The number of allowed transitions increases rapidly with the number of interacting spins. Furthermore, the set of single quantum transitions available for analysis is highly redundant. In the present study, we have demonstrated that it is possible to separate the subspectra of a homonuclear dipolar coupled spin system on the basis of the spin states of the coupled heteronuclei by multiple quantum (MQ)-single quantum (SQ) correlation experiments. This significantly reduces the number of redundant transitions, thereby simplifying the analysis of the complex spectrum. The methodology has been demonstrated on doubly 13C-labeled acetonitrile aligned in a liquid-crystal matrix and has been applied to analyze the complex spectrum of an oriented six-spin system.
Abstract:
In achieving higher instruction level parallelism, software pipelining increases the register pressure in the loop. The usefulness of the generated schedule may be restricted to cases where the register pressure is less than the available number of registers. Spill instructions need to be introduced otherwise. But scheduling these spill instructions in the compact schedule is a difficult task. Several heuristics have been proposed to schedule spill code. These heuristics may generate more spill code than necessary, and scheduling them may necessitate increasing the initiation interval. We model the problem of register allocation with spill code generation and scheduling in software pipelined loops as a 0-1 integer linear program. The formulation minimizes the increase in initiation interval (II) by optimally placing spill code and simultaneously minimizes the amount of spill code produced. To the best of our knowledge, this is the first integrated formulation for register allocation, optimal spill code generation and scheduling for software pipelined loops. The proposed formulation performs better than the existing heuristics by preventing an increase in II in 11.11% of the loops and generating 18.48% less spill code on average among the loops extracted from Perfect Club and SPEC benchmarks with a moderate increase in compilation time.
Abstract:
Due to the importance of collective communications in scientific parallel applications, many strategies have been devised for optimizing collective communications for different kinds of parallel environments. There has been increasing interest in developing efficient broadcast algorithms for computational grids. In this paper, we present application-oriented adaptive techniques that take into account resource characteristics as well as the application's usage of broadcasts for deriving efficient broadcast trees. In particular, we consider two broadcast parameters used in the application, namely, the broadcast message sizes and the time interval between the broadcasts. The results indicate that our adaptive strategies can provide a 20% average improvement in performance over the popular MPICH-G2's MPI_Bcast implementation under loaded network conditions.
Abstract:
In this second part of a two-part series of papers, we construct a new class of Space-Time Block Codes (STBCs) for the point-to-point MIMO channel and Distributed STBCs (DSTBCs) for the amplify-and-forward relay channel that give full diversity with Partial Interference Cancellation (PIC) and PIC with Successive Interference Cancellation (PIC-SIC) decoders. The proposed class of STBCs includes most of the known full-diversity low complexity PIC/PIC-SIC decodable STBCs as special cases. We also show that a number of known full-diversity PIC/PIC-SIC decodable STBCs that were constructed for the point-to-point MIMO channel can be used as full-diversity PIC/PIC-SIC decodable DSTBCs in relay networks. For the same decoding complexity, the proposed STBCs and DSTBCs achieve higher rates than the known low decoding complexity codes. Simulation results show that the new codes have a better bit error rate performance than the low ML decoding complexity codes available in the literature.
Abstract:
Superscalar processors currently have the potential to fetch multiple basic blocks per cycle by employing one of several recently proposed instruction fetch mechanisms. However, this increased fetch bandwidth cannot be exploited unless pipeline stages further downstream correspondingly improve. In particular, renaming registers for a large number of instructions per cycle is difficult. A large instruction window, needed to receive multiple basic blocks per cycle, will slow down dependence resolution and instruction issue. This paper addresses these and related issues by proposing (i) partitioning of the instruction window into multiple blocks, each holding a dynamic code sequence; (ii) logical partitioning of the register file into a global file and several local files, the latter holding registers local to a dynamic code sequence; (iii) the dynamic recording and reuse of register renaming information for registers local to a dynamic code sequence. Performance studies show these mechanisms improve performance over traditional superscalar processors by factors ranging from 1.5 to a little over 3 for the SPEC Integer programs. Next, it is observed that several of the loops in the benchmarks display vector-like behavior during execution, even if the static loop bodies are likely too complex for compile-time vectorization. A dynamic loop vectorization mechanism that builds on top of the above mechanisms is briefly outlined. The mechanism vectorizes up to 60% of the dynamic instructions for some programs, although the average number of iterations per loop is quite small.
Abstract:
Zn1−xMgxO (x = 0.3) thin films have been fabricated on Pt/TiO2/SiO2/Si substrates using a multimagnetron sputtering technique. The films with wurtzite structure showed a (002) preferred orientation. Ferroelectricity in Zn1−xMgxO films was established from the temperature dependent dielectric constant and the polarization hysteresis loop. The temperature dependent study of the dielectric constant at different frequencies exhibited a dielectric anomaly at 110 °C. The resistivity versus temperature characteristics showed an anomalous increase in the vicinity of the dielectric transition temperature. The Zn1−xMgxO thin films exhibit a well-defined polarization hysteresis loop, with a remanent polarization of 0.2 μC/cm² and a coercive field of 8 kV/cm at room temperature.
Abstract:
Highly (110)-oriented antiferroelectric PbZrO3 (PZ) and La-modified PZ thin films have been fabricated on Pt/Ti/SiO2/Si substrates using a sol-gel process. Dielectric properties, electric field induced ferroelectric polarization, and the temperature dependence of the dielectric response have been explored as a function of composition. The Tc has been observed to decrease by ∼17 °C per 1 mol % of La doping. Double hysteresis loops were seen with zero remanent polarization and with coercive fields between 176 and 193 kV/cm at 80 °C for the antiferroelectric to ferroelectric phase transformation. These slim loops have been explained by the high orientation of the films along the polar direction of the antiparallel dipoles of a tetragonal primitive cell and by the strong electrostatic interaction between La ions and oxygen ions in an ABO3 perovskite unit cell. High quality films exhibited a very low loss factor of less than 0.015 at room temperature, and pure PZ and 1 and 2 mol % La-doped PZ films showed room temperature dielectric constants of 135, 219, and 142, respectively, at a frequency of 10 kHz. The passive layer effects in these films have been explained by Curie constants and Curie temperatures. The ac conductivity and the corresponding Arrhenius plots have been shown and explained in terms of doping effect and electrode resistance.
Abstract:
This paper describes the simulation of a control scheme using the principle of field orientation for the control of a voltage source inverter-fed induction motor. The control principle is explained, followed by an algorithm to simulate various components of the system on a digital computer. The dynamic response of the system to load disturbances and set-point variations has been studied. Also, the results of the simulation showing the behavior of field coordinates for such disturbances are given.
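The abstract above gives no equations, so the following is only a sketch of the textbook coordinate change that field-oriented control schemes commonly rely on, not the paper's simulation algorithm. It assumes the amplitude-invariant Clarke transform followed by a Park rotation. The function names `clarke` and `park`, and the use of an angle `theta` for the field position, are assumptions of this note.

```python
import math


def clarke(ia, ib, ic):
    """Amplitude-invariant Clarke transform: three-phase -> stationary alpha/beta."""
    alpha = (2.0 / 3.0) * (ia - 0.5 * ib - 0.5 * ic)
    beta = (ib - ic) / math.sqrt(3.0)
    return alpha, beta


def park(alpha, beta, theta):
    """Rotate alpha/beta components into a d/q frame at angle theta (radians)."""
    d = alpha * math.cos(theta) + beta * math.sin(theta)
    q = -alpha * math.sin(theta) + beta * math.cos(theta)
    return d, q


# Balanced three-phase currents of unit amplitude at an arbitrary angle.
theta = 0.7
ia = math.cos(theta)
ib = math.cos(theta - 2.0 * math.pi / 3.0)
ic = math.cos(theta + 2.0 * math.pi / 3.0)

d, q = park(*clarke(ia, ib, ic), theta)
print(d, q)  # d is close to 1 and q is close to 0
```

In a frame aligned this way, balanced sinusoidal currents appear as near-constant d and q values, which is what lets a controller treat the flux-producing and torque-producing components separately.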