179 results for Transformations


Relevance: 10.00%

Abstract:

Each new generation of GPUs vastly increases the resources available to GPGPU programs. GPU programming models (like CUDA) were designed to scale to use these resources. However, we find that CUDA programs actually do not scale to utilize all available resources, with over 30% of resources going unused on average for programs of the Parboil2 suite that we used in our work. Current GPUs therefore allow concurrent execution of kernels to improve utilization. In this work, we study concurrent execution of GPU kernels using multiprogram workloads on current NVIDIA Fermi GPUs. On two-program workloads from the Parboil2 benchmark suite we find concurrent execution is often no better than serialized execution. We identify that the lack of control over resource allocation to kernels is a major serialization bottleneck. We propose transformations that convert CUDA kernels into elastic kernels which permit fine-grained control over their resource usage. We then propose several elastic-kernel aware concurrency policies that offer significantly better performance and concurrency compared to the current CUDA policy. We evaluate our proposals on real hardware using multiprogrammed workloads constructed from benchmarks in the Parboil 2 suite. On average, our proposals increase system throughput (STP) by 1.21x and improve the average normalized turnaround time (ANTT) by 3.73x for two-program workloads when compared to the current CUDA concurrency implementation.
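The two metrics quoted in this abstract can be made concrete with a short sketch. It assumes the definitions commonly used for multiprogram workloads, which the abstract itself does not spell out: STP as the sum of each program's normalized progress (runtime alone divided by runtime when co-run), and ANTT as the mean per-program slowdown (runtime when co-run divided by runtime alone). The function names and the timing values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def stp(alone: Sequence[float], shared: Sequence[float]) -> float:
    """System throughput: sum of per-program normalized progress (higher is better)."""
    return sum(a / s for a, s in zip(alone, shared))


def antt(alone: Sequence[float], shared: Sequence[float]) -> float:
    """Average normalized turnaround time: mean per-program slowdown (lower is better)."""
    return sum(s / a for a, s in zip(alone, shared)) / len(alone)


# Hypothetical two-program workload: runtimes in seconds when each program
# runs alone and when it runs concurrently with the other program.
alone_times = [2.0, 4.0]
shared_times = [4.0, 5.0]

print(stp(alone_times, shared_times))   # 2/4 + 4/5
print(antt(alone_times, shared_times))  # (4/2 + 5/4) / 2
```

Under these assumed definitions, a higher STP and a lower ANTT both indicate better concurrent execution, so an "improvement" in ANTT corresponds to a reduction in its value.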

Relevance: 10.00%

Abstract:

Porous fungus-like ZnO nanostructures have been synthesized by simple thermal annealing of the hydrothermally synthesized sheet-like ZnS(en)0.5 complex precursor in air at 600 °C. Structural and morphological changes occurring during the ZnS(en)0.5 → ZnS → ZnO transformations have been observed closely by annealing the as-synthesized precursor at 100-600 °C. Wurtzite ZnS nanosheets and ZnS-ZnO composites are obtained at temperatures of 400 °C and 500 °C, respectively. Thermal decomposition and oxidation of the ZnS(en)0.5 nanosheets have been confirmed by differential scanning calorimetry and thermogravimetric analysis. The visible-light-driven photocatalytic degradation of methylene blue dye has been demonstrated in the synthesized samples. The ZnS-ZnO composite shows the highest dye degradation efficiency, 74%, due to the formation of a surface complex as well as higher visible light absorption resulting from band-gap narrowing. The porous ZnO nanostructures show efficient visible photoluminescence (PL) emission with a colour coordinate of (0.29, 0.35), which is close to that of white light (0.33, 0.33). The efficient visible PL emission and the visible-light-driven photocatalytic activity of the materials synthesized in the present work might make them attractive for future optoelectronic devices, including white-light-emitting devices.

Relevance: 10.00%

Abstract:

Cu2SnS3 films have been processed by the sol-gel route. A Differential Scanning Calorimetry (DSC) study was done to observe the phase transformations and to ascertain the deposition temperature. X-ray diffraction (XRD) confirms the phase formation of Cu2SnS3. The texture coefficient analysis shows the preferential orientation of the (112) facet. Scanning electron microscopy reveals the morphology of the film. Energy Dispersive Spectroscopy (EDS) was used for compositional studies. The Raman spectrum shows the peaks corresponding to the tetragonal phase of Cu2SnS3.

Relevance: 10.00%

Abstract:

Experimental and simulation studies have uncovered at least two anomalous concentration regimes in the water-dimethyl sulfoxide (DMSO) binary mixture whose precise origin has remained a subject of debate. In order to facilitate time-domain experimental investigation of the dynamics of such binary mixtures, we explore the strength and extent of the influence of these anomalies on dipolar solvation dynamics by carrying out long molecular dynamics simulations over a wide range of DMSO concentrations. The solvation time correlation function so calculated indeed displays strong composition-dependent anomalies, reflected in pronounced non-exponential kinetics and a non-monotonic composition dependence of the average solvation time constant. In particular, we find a remarkable slow-down in the solvation dynamics around 10%-20% and 35%-50% mole percentage. We investigate the microscopic origin of these two anomalies. The population distribution analyses of different structural morphologies show that these two slow-downs are reflections of intriguing structural transformations in the water-DMSO mixture. The structural transformations themselves can be explained in terms of a change in the relative coordination number of DMSO and water molecules, from 1DMSO:2H2O to 1H2O:1DMSO and 1H2O:2DMSO complex formation. Thus, while the emergence of the first slow-down (at 15% DMSO mole percentage) is due to the percolation among DMSO molecules supported by the water molecules (whose percolating network remains largely unaffected), the second anomaly (centered on 40%-50%) is due to the formation of the network structure in which the 1DMSO:1H2O and 2DMSO:1H2O units dominate, giving rise to rich dynamical features. Through an analysis of partial solvation dynamics, an interesting negative cross-correlation between water and DMSO is observed that makes an important contribution to relaxation at intermediate to longer times.

Relevance: 10.00%

Abstract:

Fröhlich, Morchio and Strocchi long ago proved that Lorentz invariance is spontaneously broken in QED because of infrared effects. We develop a simple model where the consequences of this breakdown can be explicitly and easily calculated. For this purpose, the superselected U(1) charge group of QED is extended to a superselected "Sky" group containing direction-dependent gauge transformations at infinity. It is the analog of the Spi group of gravity. As Lorentz transformations do not commute with Sky, they are spontaneously broken. These Abelian considerations and model are extended to non-Abelian gauge symmetries. Basic issues regarding the observability of twisted non-Abelian gauge symmetries and of the asymptotic ADM symmetries of quantum gravity are raised.

Relevance: 10.00%

Abstract:

Software transactional memory (STM) is a promising programming paradigm for shared-memory multithreaded programs. While STM offers the promise of being less error-prone and more programmer-friendly than traditional lock-based synchronization, it also needs to be competitive in performance in order to be adopted in mainstream software. A major source of performance overhead in STM is transactional aborts. Conflict resolution and aborting a transaction typically happen at the transaction level, which has the advantage of being automatic and application-agnostic. However, this has a substantial disadvantage: STM declares the entire transaction as conflicting and hence aborts it and re-executes it fully, instead of partially re-executing only those parts of the transaction that have been affected by the conflict. This "Re-execute Everything" approach has a significant adverse impact on STM performance. In order to mitigate the abort overheads, we propose a compiler-aided Selective Reconciliation STM (SR-STM) scheme, wherein certain transactional conflicts can be reconciled by performing partial re-execution of the transaction. Ours is a selective hybrid approach which uses compiler analysis to identify those data accesses which are legal and profitable candidates for reconciliation and applies partial re-execution only to these candidates, while other conflicting data accesses are handled by the default STM approach of abort and full re-execution. We describe the compiler analysis and code transformations required for supporting selective reconciliation. We find that SR-STM is effective in reducing the transactional abort overheads, improving the performance for a set of five STAMP benchmarks by 12.58% on average and by up to 22.34%.

Relevance: 10.00%

Abstract:

Rapid advancements in multi-core processor architectures coupled with low-cost, low-latency, high-bandwidth interconnects have made clusters of multi-core machines a common computing resource. Unfortunately, writing good parallel programs that efficiently utilize all the resources in such a cluster is still a major challenge. Various programming languages have been proposed as a solution to this problem, but are yet to be adopted widely to run performance-critical code mainly due to the relatively immature software framework and the effort involved in re-writing existing code in the new language. In this paper, we motivate and describe our initial study in exploring CUDA as a programming language for a cluster of multi-cores. We develop CUDA-For-Clusters (CFC), a framework that transparently orchestrates execution of CUDA kernels on a cluster of multi-core machines. The well-structured nature of a CUDA kernel, the growing popularity, support and stability of the CUDA software stack collectively make CUDA a good candidate to be considered as a programming language for a cluster. CFC uses a mixture of source-to-source compiler transformations, a work distribution runtime and a light-weight software distributed shared memory to manage parallel executions. Initial results on running several standard CUDA benchmark programs achieve impressive speedups of up to 7.5X on a cluster with 8 nodes, thereby opening up an interesting direction of research for further investigation.

Relevance: 10.00%

Abstract:

Electrophilic halogen-induced reactions of unactivated olefins are an important class of transformations, whose catalytic enantioselective variants have surfaced during the past few years as effective means of olefin heterodifunctionalization. This article covers important developments in the area of enantioselective halocyclizations, specifically in the context of the synthesis of nitrogenous heterocycles.

Relevance: 10.00%

Abstract:

Multi-GPU machines are being increasingly used in high-performance computing. Each GPU in such a machine has its own memory and does not share the address space either with the host CPU or other GPUs. Hence, applications utilizing multiple GPUs have to manually allocate and manage data on each GPU. Existing works that propose to automate data allocations for GPUs have limitations and inefficiencies in terms of allocation sizes, exploiting reuse, transfer costs, and scalability. We propose a scalable and fully automatic data allocation and buffer management scheme for affine loop nests on multi-GPU machines. We call it the Bounding-Box-based Memory Manager (BBMM). At runtime, BBMM can perform standard set operations such as union, intersection, and difference, and can find subset and superset relations, on hyperrectangular regions of array data (bounding boxes). It uses these operations along with some compiler assistance to identify, allocate, and manage data required by applications in terms of disjoint bounding boxes. This allows it to (1) allocate exactly or nearly as much data as is required by computations running on each GPU, (2) efficiently track buffer allocations and hence maximize data reuse across tiles and minimize data transfer overhead, and (3) as a result, maximize utilization of the combined memory on multi-GPU machines. BBMM can work with any choice of parallelizing transformations, computation placement, and scheduling schemes, whether static or dynamic. Experiments run on a four-GPU machine with various scientific programs showed that BBMM reduces data allocations on each GPU by up to 75% compared to current allocation schemes, yields performance of at least 88% of manually written code, and allows excellent weak scaling.
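To illustrate the kind of set operations on bounding boxes that this abstract mentions, here is a minimal sketch. It is not BBMM's implementation: the `Box` class, the function names, and the choice of inclusive integer bounds are all assumptions made for illustration. It shows intersection, a subset test, and a difference that returns disjoint boxes, which is one way data could be tracked "in terms of disjoint bounding boxes".

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Hyperrectangle with inclusive integer bounds in each dimension (assumed layout)."""
    lo: Tuple[int, ...]
    hi: Tuple[int, ...]

    def size(self) -> int:
        """Number of integer points (array elements) covered by the box."""
        n = 1
        for l, h in zip(self.lo, self.hi):
            n *= h - l + 1
        return n


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlap of a and b, or None if they do not overlap."""
    lo = tuple(max(x, y) for x, y in zip(a.lo, b.lo))
    hi = tuple(min(x, y) for x, y in zip(a.hi, b.hi))
    if any(l > h for l, h in zip(lo, hi)):
        return None
    return Box(lo, hi)


def is_subset(a: Box, b: Box) -> bool:
    """True if every point of a lies inside b."""
    return all(bl <= al and ah <= bh
               for al, ah, bl, bh in zip(a.lo, a.hi, b.lo, b.hi))


def difference(a: Box, b: Box) -> List[Box]:
    """Return disjoint boxes that together cover the points of a not in b."""
    common = intersect(a, b)
    if common is None:
        return [a]
    pieces: List[Box] = []
    lo, hi = list(a.lo), list(a.hi)
    # Peel off one slab per side per dimension, then shrink the remainder
    # so that later slabs cannot overlap earlier ones.
    for d in range(len(lo)):
        if lo[d] < common.lo[d]:
            slab_hi = hi.copy()
            slab_hi[d] = common.lo[d] - 1
            pieces.append(Box(tuple(lo), tuple(slab_hi)))
            lo[d] = common.lo[d]
        if hi[d] > common.hi[d]:
            slab_lo = lo.copy()
            slab_lo[d] = common.hi[d] + 1
            pieces.append(Box(tuple(slab_lo), tuple(hi)))
            hi[d] = common.hi[d]
    return pieces


def union_disjoint(a: Box, b: Box) -> List[Box]:
    """Represent the union of a and b as a list of disjoint boxes."""
    return [a] + difference(b, a)
```

For example, with two hypothetical 10x10 tiles `Box((0, 0), (9, 9))` and `Box((5, 5), (14, 14))`, `union_disjoint` covers 175 elements rather than the 200 that two independent allocations would take, since the 25-element overlap is stored once.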

Relevance: 10.00%

Abstract:

Pressure-induced phase transformations (PIPTs) occur in a wide range of materials. In general, the bonding characteristics, before and after the PIPT, remain invariant in most materials, and the bond rearrangement is usually irreversible due to the strain induced under pressure. A reversible PIPT associated with a substantial bond rearrangement has been found in a metal-organic framework material, namely [tmenH2][Er(HCOO)4]2 (tmenH2(2+) = N,N,N',N'-tetramethylethylenediammonium). The transition is first-order and is accompanied by a unit cell volume change of about 10%. High-pressure single-crystal X-ray diffraction studies reveal the complex bond rearrangement through the transition. The reversible nature of the transition is confirmed by means of independent nanoindentation measurements on single crystals.

Relevance: 10.00%

Abstract:

Programming for parallel architectures that do not have a shared address space is extremely difficult due to the need for explicit communication between memories of different compute devices. A heterogeneous system with CPUs and multiple GPUs and a distributed-memory cluster are examples of such systems. Past works that try to automate data movement for distributed-memory architectures can lead to excessive redundant communication. In this paper, we propose an automatic data movement scheme that minimizes the volume of communication between compute devices in heterogeneous and distributed-memory systems. We show that by partitioning data dependences in a particular non-trivial way, one can generate data movement code that results in the minimum volume for a vast majority of cases. The techniques are applicable to any sequence of affine loop nests and work on top of any choice of loop transformations, parallelization, and computation placement. The data movement code generated minimizes the volume of communication for a particular configuration of these. We use a combination of powerful static analyses relying on the polyhedral compiler framework and the lightweight runtime routines they generate to build a source-to-source transformation tool that automatically generates communication code. We demonstrate that the tool is scalable and leads to substantial gains in efficiency. On a heterogeneous system, the communication volume is reduced by a factor of 11X to 83X over the state of the art, translating into a mean execution time speedup of 1.53X. On a distributed-memory cluster, our scheme reduces the communication volume by a factor of 1.4X to 63.5X over the state of the art, resulting in a mean speedup of 1.55X. In addition, our scheme yields a mean speedup of 2.19X over hand-optimized UPC codes.

Relevance: 10.00%

Abstract:

Sugganahalli, a rural vernacular community in a warm-humid region in South India, is in transition towards adopting modern construction practices. Vernacular local building elements like rubble walls and mud roofs are giving way to burnt brick walls and reinforced cement concrete (RCC)/tin roofs. Over 60% of the Indian population is rural, and the implications of such transitions for thermal comfort and energy in buildings are crucial to understand. Vernacular architecture evolves by adopting local resources and passive solar designs in response to the local climate. This paper investigates the effectiveness of passive solar elements on indoor thermal comfort by adopting modern climate-responsive design strategies. Dynamic simulation models validated by measured data have also been adopted to determine the impact of the transition from vernacular to modern material configurations. Age-old traditional design considerations were found to concur with the modern understanding of bio-climatic response and climate-responsiveness. Modern transitions were found to increase the average indoor temperatures by more than 7 °C. Such transformations tend to shift the indoor conditions to a psychrometric zone that is likely to require active air-conditioning. Also, the surveyed thermal sensation votes were found to lie outside the extended thermal comfort boundary for hot developing countries provided by Givoni in the bio-climatic chart.

Relevance: 10.00%

Abstract:

Antisite disorder is observed to have a significant impact on the magnetic properties of the double perovskite Y2CoMnO6, which has recently been identified as a multiferroic. A paramagnetic-ferromagnetic phase transition occurs in this material at Tc ≈ 75 K. At 2 K, it displays a strong ferromagnetic hysteresis with a significant coercive field of Hc ≈ 15 kOe. Sharp steps are observed in the hysteresis curves recorded below 8 K. In the temperature range 2 K ≤ T ≤ 5 K, the hysteresis loops are anomalous, as the virgin curve lies outside the main loop. The field-cooling conditions as well as the rate of field sweep are found to influence the steps. Quantitative analysis of the neutron diffraction data shows that at room temperature, Y2CoMnO6 consists of 62% monoclinic P2₁/n with nearly 70% antisite disorder and 38% Pnma. The bond valence sums indicate the presence of other valence states for Co and Mn which arise from disorder. We explain the origin of the steps by using a model for pinning of magnetization at the antiphase boundaries created by antisite disorder. The steps in magnetization closely resemble the martensitic transformations found in intermetallics and display first-order characteristics, as revealed in the Arrott plots. © 2014 AIP Publishing LLC.

Relevance: 10.00%

Abstract:

Motivated by the recent proposal for the S-matrix in AdS3 × S3 with mixed three-form fluxes, we study classical folded strings spinning in AdS3 with both Ramond and Neveu-Schwarz three-form fluxes. We solve the equations of motion of these strings and obtain their dispersion relation to leading order in the Neveu-Schwarz flux b. We show that the dispersion relation for spinning strings with large spin S acquires a term given by −(√λ/2π) b² log² S in addition to the usual (√λ/π) log S term, where √λ is proportional to the square of the radius of AdS3. Using SO(2, 2) transformations and reparametrizations, we show that these spinning strings can be related to light-like Wilson loops in AdS3 with Neveu-Schwarz flux b. We observe that the logarithmic divergence in the area of the light-like Wilson loop is also deformed by precisely the same coefficient as that of the b² log² S term in the dispersion relation of the spinning string. This result indicates that the coefficient of b² log² S has a property similar to the coefficient of the log S term, known as the cusp anomalous dimension, and can possibly be determined to all orders in the coupling λ using the recent proposal for the S-matrix.

Relevance: 10.00%

Abstract:

Retaining the morphology of gallium oxide nanostructures during structural transformations or after doping with lanthanide ions is not facile. Here we report on the sonochemical synthesis of nearly monodisperse ~550 nm long nano-spindles of undoped and La-doped α-GaOOH. The transformation of as-prepared undoped and La-doped α-GaOOH powders into the corresponding undoped and La-doped Ga2O3 phases (α and β) was achieved by carrying out controlled annealing at elevated temperatures under optimized conditions. The formation of gallium oxide nano-spindles is explained by invoking the phenomenon of oriented attachment, as amply supported by electron microscopy. Interestingly, the morphology of the gallium oxide nano-spindles remained conserved even after doping them with more than 1.4 at% of La3+ ions. Such robust structural stability could be attributed to the oriented attachment-type growth observed in the nano-spindles. The as-prepared samples and the corresponding annealed ones were thoroughly characterized by powder X-ray diffraction (PXRD), electron microscopy (SEM, TEM, and STEM-EDS) and X-ray photoelectron spectroscopy (XPS). Finally, photoluminescence from the single-crystalline undoped and La-doped β-Ga2O3 was explored.