934 results for Parallel programming (computer science)
Abstract:
Annotation of programs using embedded Domain-Specific Languages (embedded DSLs), such as the annotation facility of the Java programming language, is a well-known practice in computer science. In this paper we argue for and propose a specialized approach to the use of embedded Domain-Specific Modelling Languages (embedded DSMLs) in Model-Driven Engineering (MDE) processes, one that in particular supports automated many-step model transformation chains. Information defined at some point using an embedded DSML may not be required in the next immediate transformation step, but only in a later one. We therefore propose a new approach to model annotation that enables flexible many-step transformation chains, combining embedded DSMLs, trace models and a megamodel. We demonstrate the approach on an example MDE process and an industrial case study.
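As a rough illustration of the idea, the sketch below (plain Python, with hypothetical names such as TraceModel and propagate) shows how an annotation attached to a model element can be carried across a two-step transformation chain through trace links, so that a step which does not consume the annotation still hands it on to a later one.

    # Hypothetical sketch: annotations survive a transformation chain by being
    # re-targeted through trace links, so a step that does not consume an
    # annotation still hands it on to a later step.
    class TraceModel:
        """Maps source element names to target element names for one step."""
        def __init__(self, links):
            self.links = links

    def propagate(annotations, trace):
        # Re-target each annotation to the element its carrier was mapped to;
        # annotations whose carrier has no trace link are kept unchanged.
        return {trace.links.get(elem, elem): notes
                for elem, notes in annotations.items()}

    # Two-step chain: 'deploy_on' is irrelevant to step 1 and only consumed
    # in step 2, yet it still reaches the right target element.
    annotations = {"Order": ["deploy_on: node7"]}
    for trace in (TraceModel({"Order": "OrderTable"}),
                  TraceModel({"OrderTable": "order_ddl"})):
        annotations = propagate(annotations, trace)
    print(annotations)  # {'order_ddl': ['deploy_on: node7']}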
Abstract:
In this paper, a novel video-based multimodal biometric verification scheme using subspace-based low-level feature fusion of face and speech is developed for specific speaker recognition in perceptual human-computer interaction (HCI). In the proposed scheme, the human face is tracked and the face pose is estimated to weight the detected face-like regions in successive frames, where ill-posed faces and false-positive detections are assigned lower credit to enhance accuracy. In the audio modality, mel-frequency cepstral coefficients are extracted for voice-based biometric verification. In the fusion step, features from both modalities are projected into a nonlinear Laplacian Eigenmap subspace for multimodal speaker recognition and combined at the low level. The proposed approach is tested on a video database of ten human subjects, and the results show that it attains better accuracy than both conventional multimodal fusion using latent semantic analysis and the single-modality verifications. Experiments in MATLAB show the potential of the proposed scheme to attain real-time performance for perceptual HCI applications.
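A minimal sketch of the fusion step is given below, assuming per-frame face descriptors and MFCC vectors have already been extracted; the random stand-in data, the dimensions and the 1-nearest-neighbour rule are illustrative, not the paper's exact pipeline.

    # Sketch of subspace-based low-level fusion; data are random stand-ins.
    import numpy as np
    from sklearn.manifold import SpectralEmbedding
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)
    n = 200
    face_feats = rng.normal(size=(n, 64))   # stand-in face descriptors
    mfcc_feats = rng.normal(size=(n, 13))   # stand-in MFCC vectors
    labels = rng.integers(0, 10, size=n)    # ten subjects

    # Low-level fusion: concatenate the modalities before any decision is made.
    fused = np.hstack([face_feats, mfcc_feats])

    # Nonlinear Laplacian Eigenmap subspace (spectral embedding of a kNN graph).
    embedded = SpectralEmbedding(n_components=8, n_neighbors=10).fit_transform(fused)

    # Recognize the speaker in the learned subspace with a nearest-neighbour rule.
    clf = KNeighborsClassifier(n_neighbors=1).fit(embedded[:150], labels[:150])
    print("accuracy on held-out frames:", clf.score(embedded[150:], labels[150:]))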
Abstract:
Nano- and meso-scale simulation of chemical ordering kinetics in nano-layered L1₀-AB binary intermetallics was performed. At the nano (atomistic) scale, a Monte Carlo (MC) technique with a vacancy mechanism of atomic migration, implemented with diverse models for the system energetics, was used. The meso-scale microstructure evolution was, in turn, simulated by means of an MC procedure applied to a system built of meso-scale voxels ordered in particular L1₀ variants. The voxels were free to change their L1₀ variant and interacted via antiphase-boundary energies evaluated within the nano-scale simulations. The study addressed FePt thin layers, considered as a material for ultra-high-density magnetic storage media, and revealed metastability of the L1₀ c-variant superstructure with monoatomic planes parallel to the (001)-oriented layer surface and out-of-plane easy magnetization. The layers, originally perfectly ordered in the c-variant, showed discontinuous precipitation of a- and b-L1₀-variant domains running in parallel with homogeneous disordering (i.e. generation of antisite defects). The domains nucleated heterogeneously on the free monoatomic Fe surface of the layer, grew inward into its volume and relaxed towards an equilibrium microstructure of the system. Two
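A minimal sketch of the meso-scale procedure follows, assuming a 2-D grid of voxels, three L1₀ variants and a single antiphase-boundary energy J per unlike neighbour pair; the actual study uses a 3-D layer geometry and boundary energies evaluated in the atomistic simulations.

    # Metropolis MC over voxels that carry an L1_0 variant (0 = a, 1 = b, 2 = c);
    # energies and geometry are illustrative assumptions, not the paper's values.
    import math, random

    L, J, kT, steps = 32, 1.0, 0.8, 200_000
    grid = [[2] * L for _ in range(L)]      # start perfectly ordered in the c-variant

    def boundary_energy(x, y, v):
        # Energy of voxel (x, y) holding variant v: J per unlike nearest neighbour.
        nbrs = [((x+1) % L, y), ((x-1) % L, y), (x, (y+1) % L), (x, (y-1) % L)]
        return sum(J for (i, j) in nbrs if grid[i][j] != v)

    random.seed(1)
    for _ in range(steps):
        x, y = random.randrange(L), random.randrange(L)
        new = random.choice([v for v in (0, 1, 2) if v != grid[x][y]])
        dE = boundary_energy(x, y, new) - boundary_energy(x, y, grid[x][y])
        if dE <= 0 or random.random() < math.exp(-dE / kT):
            grid[x][y] = new                # Metropolis acceptance

    counts = [sum(row.count(v) for row in grid) for v in (0, 1, 2)]
    print("a/b/c variant fractions:", [c / (L * L) for c in counts])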
Abstract:
Workspace analysis and optimization are important in manipulator design. As the complete workspace of a 6-DOF manipulator is embedded in a 6-dimensional space, it is difficult to quantify and qualify. Most of the literature has considered only the 3-D sub-workspaces of the complete 6-D workspace. In this paper, a finite-partition approach to the Special Euclidean group SE(3) is proposed based on the topological properties of SE(3), which is the product of the Special Orthogonal group SO(3) and R^3. It is known that SO(3) is homeomorphic to a solid ball D^3 with antipodal points identified, while the geometry of R^3 can be regarded as a cuboid. The complete 6-D workspace SE(3) is, for the first time, parametrically and proportionally partitioned into a number of elements with uniform convergence based on its geometry. As a result, a basis volume element of SE(3) is formed by the product of a basis volume element of R^3 and a basis volume element of SO(3), the latter being the product of a basis volume element of D^3 and its associated integration measure. In this way, the integration of the complete 6-D workspace volume becomes a simple summation of the basis volume elements of SE(3). Two new global performance indices, i.e., the workspace volume ratio Wr and the global condition index GCI, are defined over the complete 6-D workspace. A newly proposed 3-RPPS parallel manipulator is optimized based on this finite-partition approach. As a result, the optimal dimensions for maximal workspace are obtained, and the optimal performance points in the workspace are identified.
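The sketch below illustrates the finite-partition computation of Wr under stated assumptions: SO(3) is parametrized as the solid ball D^3 of radius pi in axis-angle coordinates and weighted by its bi-invariant measure density 2(1 - cos t)/t^2, R^3 is a cuboid, and 'reachable' is a hypothetical stand-in for the manipulator's inverse-kinematics test.

    # Sum basis volume elements of SE(3) = R^3 x SO(3); uniform cell volumes
    # cancel in the ratio, so each element is weighted only by the SO(3) measure.
    import math, itertools

    def so3_weight(wx, wy, wz):
        # Haar measure density relative to Lebesgue measure on the ball D^3.
        t = math.sqrt(wx*wx + wy*wy + wz*wz)
        return 1.0 if t == 0 else 2.0 * (1.0 - math.cos(t)) / (t * t)

    def reachable(x, y, z, wx, wy, wz):     # hypothetical IK test
        return x*x + y*y + z*z <= 0.5 and math.sqrt(wx*wx + wy*wy + wz*wz) <= 1.0

    n, span = 8, 1.0                        # n^6 basis volume elements
    pts = [(-span + (i + 0.5) * 2 * span / n) for i in range(n)]
    rot = [(-math.pi + (i + 0.5) * 2 * math.pi / n) for i in range(n)]

    vol = tot = 0.0
    for x, y, z in itertools.product(pts, repeat=3):
        for wx, wy, wz in itertools.product(rot, repeat=3):
            if wx*wx + wy*wy + wz*wz > math.pi**2:
                continue                    # outside the ball D^3
            w = so3_weight(wx, wy, wz)      # element weight = integration measure
            tot += w
            if reachable(x, y, z, wx, wy, wz):
                vol += w

    print("workspace volume ratio Wr ~", vol / tot)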
Abstract:
Transcript of a Panel Discussion at the Dartmouth Symposium, chaired by Eric Lyon.
Abstract:
Computing has recently reached an inflection point with the introduction of multicore processors. On-chip thread-level parallelism is doubling approximately every other year. Concurrency lends itself naturally to allowing a program to trade performance for power savings by regulating the number of active cores; however, in several domains, users are unwilling to sacrifice performance to save power. We present a prediction model for identifying energy-efficient operating points of concurrency in well-tuned multithreaded scientific applications and a runtime system that uses live program analysis to optimize applications dynamically. We describe a dynamic phase-aware performance prediction model that combines multivariate regression techniques with runtime analysis of data collected from hardware event counters to locate optimal operating points of concurrency. Using our model, we develop a prediction-driven phase-aware runtime optimization scheme that throttles concurrency so that power consumption can be reduced and performance can be set at the knee of the scalability curve of each program phase. The use of prediction reduces the overhead of searching the optimization space while achieving near-optimal performance and power savings. A thorough evaluation of our approach shows a reduction in power consumption of 10.8 percent, simultaneous with an improvement in performance of 17.9 percent, resulting in energy savings of 26.7 percent.
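A minimal sketch of the prediction step follows, assuming per-phase performance and power samples at a few thread counts; the feature set, the synthetic data and the energy model are illustrative stand-ins for the paper's counter-based multivariate regression.

    # Fit a small multivariate regression from sampled operating points, then
    # predict the energy-optimal concurrency level without searching every one.
    import numpy as np

    rng = np.random.default_rng(0)
    threads = np.array([1, 2, 4, 8, 16])

    # Synthetic stand-ins: performance saturates at a knee, power grows linearly.
    ipc   = 1.0 - np.exp(-threads / 4.0) + rng.normal(0, 0.01, threads.size)
    power = 20.0 + 5.0 * threads + rng.normal(0, 0.5, threads.size)

    # Regression on [1, t, log2(t+1)] features, fitted from the sampled points.
    X = np.column_stack([np.ones_like(threads, float), threads, np.log2(threads + 1)])
    w_perf, *_ = np.linalg.lstsq(X, ipc, rcond=None)
    w_pow,  *_ = np.linalg.lstsq(X, power, rcond=None)

    # Predict performance and power at unsampled concurrency levels.
    cand = np.arange(1, 17)
    Xc = np.column_stack([np.ones_like(cand, float), cand, np.log2(cand + 1)])
    perf, pwr = Xc @ w_perf, Xc @ w_pow
    energy = pwr / perf                     # energy per unit of work

    print("predicted energy-optimal thread count:", cand[np.argmin(energy)])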
Abstract:
Traditional static analysis fails to auto-parallelize programs with complex control and data flow. Furthermore, thread-level parallelism in such programs is often restricted to pipeline parallelism, which can be hard for a programmer to discover. In this paper we propose a tool that, based on profiling information, helps the programmer to discover parallelism. The programmer hand-picks code transformations from among the proposed candidates, which are then applied by automatic code transformation techniques.

This paper contributes to the literature by presenting a profiling tool for discovering thread-level parallelism. We track dependencies at the whole-data-structure level rather than at the element or byte level in order to limit the profiling overhead. We perform a thorough analysis of the needs and costs of this technique. Furthermore, we present and validate the belief that programs with complex control and data flow contain significant amounts of exploitable coarse-grain pipeline parallelism in their outer loops. This observation validates our approach of tracking whole-data-structure dependencies. As state-of-the-art compilers focus on loops iterating over data structure members, it also explains why our approach finds coarse-grain pipeline parallelism in cases that have remained out of reach for such compilers. In cases where traditional compilation techniques do find parallelism, our approach discovers higher degrees of parallelism, allowing a 40% speedup over traditional compilation techniques. Moreover, we demonstrate real speedups on multiple hardware platforms.
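The sketch below illustrates dependence profiling at whole-data-structure granularity under simplifying assumptions: each loop iteration logs only the names of the structures it reads and writes, which keeps the log far smaller than element- or byte-level tracking.

    # Hypothetical profiler core: detect cross-iteration dependencies between
    # whole data structures, the pattern that exposes pipeline parallelism.
    from collections import defaultdict

    def profile(iterations):
        # iterations: list of (reads, writes) sets of data-structure names.
        last_writer = {}
        deps = defaultdict(set)     # deps[i] = earlier iterations that i depends on
        for i, (reads, writes) in enumerate(iterations):
            for ds in reads:
                if ds in last_writer:
                    deps[i].add(last_writer[ds])
            for ds in writes:
                last_writer[ds] = i
        return deps

    # A loop whose first stage writes 'queue' and whose second stage reads it:
    # a cross-iteration dependence chain, i.e. coarse-grain pipeline parallelism.
    trace = [({"input"}, {"queue"}), ({"queue"}, {"output"}),
             ({"input"}, {"queue"}), ({"queue"}, {"output"})]
    print(dict(profile(trace)))     # {1: {0}, 3: {2}}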
Abstract:
In a human-computer dialogue system, the dialogue strategy can range from very restrictive to highly flexible. Each specific dialogue style has its pros and cons, and a dialogue system needs to select the most appropriate style for a given user. During the course of interaction, the dialogue style can change based on a user's responses and the system's observation of the user. This allows a dialogue system to understand a user better and provide a more suitable way of communicating. Since measures of the quality of the user's interaction with the system can be incomplete and uncertain, frameworks for reasoning with uncertain and incomplete information can help the system make better decisions when choosing a dialogue strategy. In this paper, we investigate how to select a dialogue strategy by aggregating the factors detected during the interaction with the user. For this purpose, we use probabilistic logic programming (PLP) to model probabilistic knowledge about how these factors affect the degree of freedom of a dialogue. When a dialogue system needs to know which strategy is more suitable, an appropriate query can be executed against the PLP and a probabilistic solution with a degree of satisfaction is returned. The degree of satisfaction reveals how much the system can trust the probability attached to the solution.
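The toy sketch below mimics in plain Python what a query against such a PLP program might return: each observed factor contributes probabilistic evidence for a flexible dialogue, aggregated here with a noisy-or rule, alongside a degree of satisfaction; the factor names, probabilities and aggregation rule are all hypothetical.

    # Each factor: (probability it supports a flexible dialogue, confidence in
    # the observation). The minimum confidence stands in for the degree of
    # satisfaction attached to the returned probability.
    factors = {
        "fluent_responses":   (0.80, 0.9),
        "few_clarifications": (0.70, 0.6),
        "long_utterances":    (0.60, 0.8),
    }

    # Noisy-or aggregation over independently contributing factors.
    p_flexible, satisfaction = 1.0, 1.0
    for prob, conf in factors.values():
        p_flexible *= 1.0 - prob * conf
        satisfaction = min(satisfaction, conf)
    p_flexible = 1.0 - p_flexible

    strategy = "flexible" if p_flexible > 0.5 else "restrictive"
    print(f"strategy={strategy}, p={p_flexible:.2f}, "
          f"degree of satisfaction={satisfaction:.2f}")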
Abstract:
Autonomic management can be used to improve the QoS provided by parallel/distributed applications. We discuss behavioural skeletons introduced in earlier work: rather than relying on the programmer's ability to design efficient autonomic policies from scratch, we encapsulate general autonomic controller features into algorithmic skeletons. We then leave to the programmer the duty of specifying the parameters needed to specialise the skeletons to the needs of the particular application at hand. As a result, the programmer can rapidly prototype and tune distributed/parallel applications with non-trivial autonomic management capabilities. We discuss how behavioural skeletons have been implemented in the framework of GCM (the Grid Component Model developed within the CoreGRID NoE and currently being implemented within the GridCOMP STREP project). We present results evaluating the overhead introduced by autonomic management activities as well as the overall behaviour of the skeletons. We also present results achieved with a long-running application subject to autonomic management and dynamically adapting to changing features of the target architecture. Overall, the results demonstrate both the feasibility of implementing autonomic control via behavioural skeletons and the effectiveness of our sample behavioural skeletons in managing the functional replication pattern(s).
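As a rough illustration, the sketch below models a behavioural skeleton for a task farm whose autonomic controller grows or shrinks the number of workers to keep a user-specified service-time contract; the class name, policy thresholds and simulated service times are illustrative, not the GCM implementation.

    # Hypothetical behavioural skeleton: the autonomic policy is built in, the
    # programmer only supplies the QoS contract that specialises it.
    class FarmSkeleton:
        def __init__(self, contract_ms, workers=2):
            self.contract_ms = contract_ms   # user-specialised QoS parameter
            self.workers = workers

        def observe(self, task_ms):
            # Service time of the farm ~ time per task / number of workers.
            return task_ms / self.workers

        def adapt(self, measured_ms):
            # Grow the functional replication degree when the contract is
            # broken; shrink it when there is ample slack.
            if measured_ms > self.contract_ms:
                self.workers += 1
            elif measured_ms < 0.5 * self.contract_ms and self.workers > 1:
                self.workers -= 1

    farm = FarmSkeleton(contract_ms=10.0)
    for task_ms in [40, 40, 40, 40, 15, 15]:  # load changes over time
        m = farm.observe(task_ms)
        farm.adapt(m)
        print(f"service time {m:5.1f} ms -> {farm.workers} workers")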