274 results for parallel processing
Abstract:
Software transactional memory (STM) has been proposed as a promising programming paradigm for shared-memory multi-threaded programs, as an alternative to conventional lock-based synchronization primitives. Typical STM implementations employ a conflict detection scheme that works with a uniform access granularity, tracking shared data accesses at word, cache-line, or object level. It is well known that a single fixed access tracking granularity cannot meet the conflicting goals of reducing false conflicts without impacting concurrency adversely. A fine-grained granularity, while improving concurrency, can have an adverse impact on performance due to lock aliasing, lock validation overheads, and additional cache pressure. On the other hand, a coarse-grained granularity can impact performance due to reduced concurrency. Thus, in general, a fixed or uniform granularity access tracking (UGAT) scheme is application-unaware and rarely matches the access patterns of individual applications or parts of an application, leading to sub-optimal performance for different parts of the application(s). To mitigate the disadvantages associated with the UGAT scheme, we propose a Variable Granularity Access Tracking (VGAT) scheme in this paper. We propose a compiler-based approach wherein the compiler uses inter-procedural whole-program static analysis to select the access tracking granularity for different shared data structures of the application based on the application's data access pattern. We describe our prototype VGAT scheme, using TL2 as our STM implementation. Our experimental results reveal that the VGAT-STM scheme can improve the performance of the STAMP benchmarks by between 1.87% and 21.2%.
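The effect of tracking granularity on false conflicts can be illustrated with a small sketch. It is not the VGAT implementation or TL2 code: the `Region` table, the shift values, and the lock-table size are hypothetical, and the per-region granularity that the abstract says is chosen by compiler analysis is simply supplied by hand here.

```python
from dataclasses import dataclass

WORD_SHIFT = 3          # assumed 8-byte words
CACHE_LINE_SHIFT = 6    # assumed 64-byte cache lines
LOCK_TABLE_SIZE = 1 << 10  # hypothetical size of the global lock table


@dataclass(frozen=True)
class Region:
    """A shared data structure with its own tracking granularity (hypothetical)."""
    base: int
    size: int
    shift: int  # log2 of the tracking granularity in bytes


def lock_index(addr, regions, default_shift=WORD_SHIFT):
    """Map an address to a lock-table slot using the granularity of its region."""
    for r in regions:
        if r.base <= addr < r.base + r.size:
            return (addr >> r.shift) % LOCK_TABLE_SIZE
    return (addr >> default_shift) % LOCK_TABLE_SIZE


def conflicts(write_set, read_set, regions):
    """True if any read maps to a lock slot that is also covered by a write."""
    written = {lock_index(a, regions) for a in write_set}
    return any(lock_index(a, regions) in written for a in read_set)


# Two transactions touch different words of the same cache line.
fine = [Region(base=0x1000, size=64, shift=WORD_SHIFT)]
coarse = [Region(base=0x1000, size=64, shift=CACHE_LINE_SHIFT)]
print(conflicts({0x1000}, {0x1008}, fine))    # word granularity
print(conflicts({0x1000}, {0x1008}, coarse))  # cache-line granularity
```

With word-level tracking the two accesses map to different slots, while cache-line tracking reports a conflict even though the addresses differ. This is the kind of false conflict a per-structure granularity choice is meant to avoid.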
Abstract:
The design and operation of the minimum cost classifier, where the total cost is the sum of the measurement cost and the classification cost, is computationally complex. Noting the difficulties associated with this approach, decision tree design directly from a set of labelled samples is proposed in this paper. The feature space is first partitioned to transform the problem to one of discrete features. The resulting problem is solved by a dynamic programming algorithm over an explicitly ordered state space of all outcomes of all feature subsets. The solution procedure is very general and is applicable to any minimum cost pattern classification problem in which each feature has a finite number of outcomes. These techniques are applied to (i) voiced, unvoiced, and silence classification of speech, and (ii) spoken vowel recognition. The resulting decision trees are operationally very efficient and yield attractive classification accuracies.
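A minimal sketch of the idea follows, assuming discrete features and a uniform misclassification cost. It uses memoized recursion over partial feature assignments rather than the explicitly ordered state space described in the abstract, and the function name, cost model, and data layout are illustrative assumptions, not the authors' formulation.

```python
from collections import Counter
from functools import lru_cache


def min_cost_tree(samples, meas_cost, miscls_cost=1.0):
    """Return (average cost per sample, best action at the root).

    samples:   list of (tuple_of_discrete_feature_values, label)
    meas_cost: cost of measuring each feature, one entry per feature
    A state is a tuple with one slot per feature; None means "not measured".
    """
    n_features = len(meas_cost)

    def matching_samples(state):
        # Samples consistent with the feature outcomes observed so far.
        return [(x, y) for x, y in samples
                if all(s is None or s == v for s, v in zip(state, x))]

    @lru_cache(maxsize=None)
    def solve(state):
        match = matching_samples(state)
        counts = Counter(y for _, y in match)
        label, hits = counts.most_common(1)[0]
        # Option 1: stop and classify as the majority label.
        best_cost = miscls_cost * (len(match) - hits)
        best_action = ("classify", label)
        # Option 2: pay for one more feature, then solve each outcome.
        for f in range(n_features):
            if state[f] is not None:
                continue
            cost = meas_cost[f] * len(match)
            for v in sorted({x[f] for x, _ in match}):
                child = state[:f] + (v,) + state[f + 1:]
                cost += solve(child)[0]
            if cost < best_cost:
                best_cost, best_action = cost, ("measure", f)
        return best_cost, best_action

    total, action = solve((None,) * n_features)
    return total / len(samples), action


# The label depends only on feature 0, so measuring feature 1 is wasted cost.
data = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "b"), ((1, 1), "b")]
avg_cost, root_action = min_cost_tree(data, meas_cost=[0.1, 0.1])
print(avg_cost, root_action)
```

Costs are accumulated as totals over the matching samples, which is equivalent to weighting each outcome by its empirical probability. Following the stored actions from the root downward recovers the decision tree.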
Abstract:
Editors' note: Flexible, large-area display and sensor arrays are finding growing applications in multimedia and future smart homes. This article first analyzes and compares current flexible devices, then discusses the implementation, requirements, and testing of flexible sensor arrays. - Jiun-Lang Huang (National Taiwan University) and Kwang-Ting (Tim) Cheng (University of California, Santa Barbara)
Abstract:
A finite element method for solving multidimensional population balance systems is proposed, where the balance of fluid velocity, temperature and solute partial density is considered as a two-dimensional system and the balance of particle size distribution as a three-dimensional one. The method is based on a dimensional splitting into physical space and internal property variables. In addition, the operator splitting makes it possible to decouple the equations for temperature, solute partial density and particle size distribution. Further, a nodal-point-based parallel finite element algorithm for multidimensional population balance systems is presented. The method is applied to study a crystallization process assuming, for simplicity, a size-independent growth rate and neglecting agglomeration and breakage of particles. Simulations for different wall temperatures are performed to show the effect of cooling on the crystal growth. Although the method is described in detail only for the case of d=2 space and s=1 internal property variables, it has the potential to be extended to d+s variables, with d=2, 3 and s >= 1. (C) 2011 Elsevier Ltd. All rights reserved.
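The dimensional splitting idea can be sketched on a toy problem with one space coordinate and one internal (size) coordinate. Note the substitutions: this uses a first-order explicit upwind finite-difference step instead of the finite element discretization in the abstract, and the grid, Courant numbers, and function names are assumptions made for illustration only.

```python
def upwind_step(values, courant):
    """One explicit upwind step for a positive speed; inflow value is 0."""
    out = []
    for i, v in enumerate(values):
        left = values[i - 1] if i > 0 else 0.0
        out.append(v - courant * (v - left))
    return out


def split_step(f, courant_x, courant_l):
    """Advance f[i][j] (space cell i, size class j) by one split time step."""
    nx, nl = len(f), len(f[0])
    # Sub-step 1: transport in physical space, independently per size class.
    cols = [upwind_step([f[i][j] for i in range(nx)], courant_x)
            for j in range(nl)]
    f = [[cols[j][i] for j in range(nl)] for i in range(nx)]
    # Sub-step 2: size-independent growth along the internal coordinate,
    # independently per space cell.
    return [upwind_step(row, courant_l) for row in f]


# A single unit of particles in the first space cell and smallest size class.
f0 = [[1.0, 0.0, 0.0],
      [0.0, 0.0, 0.0],
      [0.0, 0.0, 0.0]]
f1 = split_step(f0, courant_x=1.0, courant_l=1.0)
print(f1)
```

Each sub-step only ever solves one-dimensional problems, and the independent columns and rows are natural units of parallel work. With both Courant numbers equal to 1 the upwind step is an exact shift, so the unit of particles moves one cell in space and one class in size.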
Abstract:
Current scientific research is characterized by increasing specialization, accumulating knowledge at a high speed due to parallel advances in a multitude of sub-disciplines. Recent estimates suggest that human knowledge doubles every two to three years – and with the advances in information and communication technologies, this wide body of scientific knowledge is available to anyone, anywhere, anytime. This may also be referred to as ambient intelligence – an environment characterized by plentiful and available knowledge. The bottleneck in utilizing this knowledge for specific applications is not accessing but assimilating the information and transforming it to suit the needs for a specific application. The increasingly specialized areas of scientific research often have the common goal of converting data into insight allowing the identification of solutions to scientific problems. Due to this common goal, there are strong parallels between different areas of applications that can be exploited and used to cross-fertilize different disciplines. For example, the same fundamental statistical methods are used extensively in speech and language processing, in materials science applications, in visual processing and in biomedicine. Each sub-discipline has found its own specialized methodologies making these statistical methods successful to the given application. The unification of specialized areas is possible because many different problems can share strong analogies, making the theories developed for one problem applicable to other areas of research. It is the goal of this paper to demonstrate the utility of merging two disparate areas of applications to advance scientific research. The merging process requires cross-disciplinary collaboration to allow maximal exploitation of advances in one sub-discipline for that of another. We will demonstrate this general concept with the specific example of merging language technologies and computational biology.
Abstract:
Instruction reuse is a microarchitectural technique that improves the execution time of a program by removing redundant computations at run-time. Although removing redundant computation is nominally the job of an optimizing compiler, compilers often fail to do so because they have limited knowledge of run-time data. In this paper we examine instruction reuse of integer ALU and load instructions in network processing applications. Specifically, this paper attempts to answer the following questions: (1) How much instruction reuse is inherent in network processing applications? (2) Can reuse be improved by reducing interference in the reuse buffer? (3) What characteristics of network applications can be exploited to improve reuse? (4) What is the effect of reuse on resource contention and memory accesses? We propose an aggregation scheme that combines the high-level concept of network traffic, i.e. "flows", with a low-level microarchitectural feature of programs, i.e. repetition of instructions and data, along with an architecture that exploits temporal locality in incoming packet data to improve reuse. We find that, for the benchmarks considered, 1% to 50% of instructions are reused, while the speedup achieved varies between 1% and 24%. As a side effect, instruction reuse reduces memory traffic and can therefore be considered as a scheme for low power.
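A reuse buffer can be pictured as a small table, indexed by instruction address, that remembers the operands and result of the last execution. The sketch below is a simplified software model under assumed parameters (a direct-mapped table, four ALU operations, hypothetical class and field names). It does not model the flow-based aggregation scheme proposed in the abstract.

```python
import operator

# Integer ALU operations handled by this toy model (an assumption).
OPS = {"add": operator.add, "sub": operator.sub,
       "and": operator.and_, "xor": operator.xor}


class ReuseBuffer:
    """Direct-mapped reuse buffer: slot -> (pc, a, b, result)."""

    def __init__(self, entries=64):
        self.entries = entries
        self.table = {}
        self.hits = 0
        self.lookups = 0

    def execute(self, pc, op, a, b):
        """Return the result, skipping the ALU when the entry matches."""
        self.lookups += 1
        slot = pc % self.entries
        entry = self.table.get(slot)
        if entry is not None and entry[:3] == (pc, a, b):
            self.hits += 1          # same instruction, same operands: reuse
            return entry[3]
        result = OPS[op](a, b)      # miss: compute and overwrite the slot
        self.table[slot] = (pc, a, b, result)
        return result


rb = ReuseBuffer(entries=4)
trace = [(0, "add", 1, 2),   # miss
         (0, "add", 1, 2),   # hit: identical operands
         (0, "add", 2, 2),   # miss: operands changed
         (4, "xor", 5, 1),   # miss: maps to the same slot and evicts pc 0
         (0, "add", 2, 2)]   # miss caused by that interference
results = [rb.execute(*instr) for instr in trace]
print(results, rb.hits, rb.lookups)
```

The last miss in the trace shows interference: two instructions share a slot and evict each other, which is the effect the second question in the abstract asks about.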
Abstract:
Rapid solidification, mechanical alloying and devitrification of precursor metallic glasses are all possible routes for the synthesis of nanocrystals and nanocomposites, though their efficacy is system dependent. In a comprehensive study of alloys across the Ti-Ni phase diagram, nanocrystals of Ti and Ni and nanocomposites of alpha-Ti and Ti2Ni, of Ti2Ni and TiNi, and of beta-Ti and glass have been produced. By the addition of Al, devitrification of metallic glasses created by mechanical alloying led to nanocrystalline intermetallic compounds. The evolution of these nanocrystalline microstructures has been rationalized on the basis of thermodynamic and kinetic considerations involving the metastable phase diagram for this system.
Abstract:
A new scheme for minimizing handover failure probability in mobile cellular communication systems is presented. The scheme involves a reassignment of priorities for handover requests enqueued in adjacent cells to release a channel for a handover request which is about to fail. Performance evaluation of the new scheme, carried out by computer simulation of a four-cell highway cellular system, has shown a considerable reduction in handover failure probability.
Abstract:
Laminar separation bubbles are thought to be highly non-parallel, and hence global stability studies start from this premise. However, experimentalists have always realized that the flow is more parallel than is commonly believed, for pressure-gradient-induced bubbles, and this is why linear parallel stability theory has been successful in describing their early stages of transition. The present experimental/numerical study re-examines this important issue and finds that the base flow in such a separation bubble becomes nearly parallel due to a strong-interaction process between the separated boundary layer and the outer potential flow. The so-called dead-air region or the region of constant pressure is a simple consequence of this strong interaction. We use triple-deck theory to qualitatively explain these features. Next, the implications of global analysis for the linear stability of separation bubbles are considered. In particular we show that in the initial portion of the bubble, where the flow is nearly parallel, local stability analysis is sufficient to capture the essential physics. It appears that the real utility of the global analysis is perhaps in the rear portion of the bubble, where the flow is highly non-parallel, and where the secondary/nonlinear instability stages are likely to dominate the dynamics.
Abstract:
Boron addition to conventional titanium alloys below the eutectic limit refines the cast microstructure and improves mechanical properties. The present work explores the influence of hypoeutectic boron addition on the microstructure and texture evolution in Ti-6Al-4V alloy under beta extrusion. The beta-extruded microstructure of Ti-6Al-4V is characterized by shear bands parallel to the extrusion direction. In contrast, the extruded Ti-6Al-4V-0.1B alloy shows a regular beta-worked microstructure consisting of fine prior beta grains and acicular alpha lamellae with no signs of microstructural instability. Crystallographic texture after extrusion was almost identical for the two alloys, indicating the similarity in their transformation behavior, which is attributed to complete dynamic recrystallization during beta processing. Microstructural features as well as crystallographic texture indicate dominant grain-boundary-related deformation processes for the boron-modified alloy that lead to homogeneous deformation without instability formation. The absence of shear bands has significant technological importance as far as the secondary processing of boron-added alloys in the (alpha + beta)-phase field is concerned. (C) 2012 Elsevier B.V. All rights reserved.