111 resultados para Parallel machines
Resumo:
Many evolutionary algorithm applications involve either fitness functions with high time complexity or large dimensionality (hence very many fitness evaluations will typically be needed) or both. In such circumstances, there is a dire need to tune various features of the algorithm well so that performance and time savings are optimized. However, these are precisely the circumstances in which prior tuning is very costly in time and resources. There is hence a need for methods which enable fast prior tuning in such cases. We describe a candidate technique for this purpose, in which we model a landscape as a finite state machine, inferred from preliminary sampling runs. In prior algorithm-tuning trials, we can replace the 'real' landscape with the model, enabling extremely fast tuning, saving far more time than was required to infer the model. Preliminary results indicate much promise, though much work needs to be done to establish various aspects of the conditions under which it can be most beneficially used. A main limitation of the method as described here is a restriction to mutation-only algorithms, but there are various ways to address this and other limitations.
Resumo:
In models of complicated physical-chemical processes operator splitting is very often applied in order to achieve sufficient accuracy as well as efficiency of the numerical solution. The recently rediscovered weighted splitting schemes have the great advantage of being parallelizable on operator level, which allows us to reduce the computational time if parallel computers are used. In this paper, the computational times needed for the weighted splitting methods are studied in comparison with the sequential (S) splitting and the Marchuk-Strang (MSt) splitting and are illustrated by numerical experiments performed by use of simplified versions of the Danish Eulerian model (DEM).
Resumo:
Objective: This paper presents a detailed study of fractal-based methods for texture characterization of mammographic mass lesions and architectural distortion. The purpose of this study is to explore the use of fractal and lacunarity analysis for the characterization and classification of both tumor lesions and normal breast parenchyma in mammography. Materials and methods: We conducted comparative evaluations of five popular fractal dimension estimation methods for the characterization of the texture of mass lesions and architectural distortion. We applied the concept of lacunarity to the description of the spatial distribution of the pixel intensities in mammographic images. These methods were tested with a set of 57 breast masses and 60 normal breast parenchyma (dataset1), and with another set of 19 architectural distortions and 41 normal breast parenchyma (dataset2). Support vector machines (SVM) were used as a pattern classification method for tumor classification. Results: Experimental results showed that the fractal dimension of region of interest (ROIs) depicting mass lesions and architectural distortion was statistically significantly lower than that of normal breast parenchyma for all five methods. Receiver operating characteristic (ROC) analysis showed that fractional Brownian motion (FBM) method generated the highest area under ROC curve (A z = 0.839 for dataset1, 0.828 for dataset2, respectively) among five methods for both datasets. Lacunarity analysis showed that the ROIs depicting mass lesions and architectural distortion had higher lacunarities than those of ROIs depicting normal breast parenchyma. The combination of FBM fractal dimension and lacunarity yielded the highest A z value (0.903 and 0.875, respectively) than those based on single feature alone for both given datasets. The application of the SVM improved the performance of the fractal-based features in differentiating tumor lesions from normal breast parenchyma by generating higher A z value. Conclusion: FBM texture model is the most appropriate model for characterizing mammographic images due to self-affinity assumption of the method being a better approximation. Lacunarity is an effective counterpart measure of the fractal dimension in texture feature extraction in mammographic images. The classification results obtained in this work suggest that the SVM is an effective method with great potential for classification in mammographic image analysis.
Resumo:
Large scale air pollution models are powerful tools, designed to meet the increasing demand in different environmental studies. The atmosphere is the most dynamic component of the environment, where the pollutants can be moved quickly on far distnce. Therefore the air pollution modeling must be done in a large computational domain. Moreover, all relevant physical, chemical and photochemical processes must be taken into account. In such complex models operator splitting is very often applied in order to achieve sufficient accuracy as well as efficiency of the numerical solution. The Danish Eulerian Model (DEM) is one of the most advanced such models. Its space domain (4800 × 4800 km) covers Europe, most of the Mediterian and neighboring parts of Asia and the Atlantic Ocean. Efficient parallelization is crucial for the performance and practical capabilities of this huge computational model. Different splitting schemes, based on the main processes mentioned above, have been implemented and tested with respect to accuracy and performance in the new version of DEM. Some numerical results of these experiments are presented in this paper.
Resumo:
In the 1990s the Message Passing Interface Forum defined MPI bindings for Fortran, C, and C++. With the success of MPI these relatively conservative languages have continued to dominate in the parallel computing community. There are compelling arguments in favour of more modern languages like Java. These include portability, better runtime error checking, modularity, and multi-threading. But these arguments have not converted many HPC programmers, perhaps due to the scarcity of full-scale scientific Java codes, and the lack of evidence for performance competitive with C or Fortran. This paper tries to redress this situation by porting two scientific applications to Java. Both of these applications are parallelized using our thread-safe Java messaging system—MPJ Express. The first application is the Gadget-2 code, which is a massively parallel structure formation code for cosmological simulations. The second application uses the finite-domain time-difference method for simulations in the area of computational electromagnetics. We evaluate and compare the performance of the Java and C versions of these two scientific applications, and demonstrate that the Java codes can achieve performance comparable with legacy applications written in conventional HPC languages. Copyright © 2009 John Wiley & Sons, Ltd.
Resumo:
As consumers demand more functionality) from their electronic devices and manufacturers supply the demand then electrical power and clock requirements tend to increase, however reassessing system architecture can fortunately lead to suitable counter reductions. To maintain low clock rates and therefore reduce electrical power, this paper presents a parallel convolutional coder for the transmit side in many wireless consumer devices. The coder accepts a parallel data input and directly computes punctured convolutional codes without the need for a separate puncturing operation while the coded bits are available at the output of the coder in a parallel fashion. Also as the computation is in parallel then the coder can be clocked at 7 times slower than the conventional shift-register based convolutional coder (using DVB 7/8 rate). The presented coder is directly relevant to the design of modern low-power consumer devices
Resumo:
The work reported in this paper is motivated by the fact that there is a need to apply autonomic computing concepts to parallel computing systems. Advancing on prior work based on intelligent cores [36], a swarm-array computing approach, this paper focuses on ‘Intelligent agents’ another swarm-array computing approach in which the task to be executed on a parallel computing core is considered as a swarm of autonomous agents. A task is carried to a computing core by carrier agents and is seamlessly transferred between cores in the event of a predicted failure, thereby achieving self-ware objectives of autonomic computing. The feasibility of the proposed swarm-array computing approach is validated on a multi-agent simulator.
Resumo:
Rodney Brooks has been called the “Self Styled Bad Boy of Robotics”. In the 1990s he gained this dubious honour by orchestrating a string of highly evocative robots from his artificial interligence Labs at the Massachusettes Institute of Technology (MIT), Boston, USA.
Resumo:
We propose a bridge between two important parallel programming paradigms: data parallelism and communicating sequential processes (CSP). Data parallel pipelined architectures obtained with the Alpha language can be embedded in a control intensive application expressed in CSP-based Handel formalism. The interface is formally defined from the semantics of the languages Alpha and Handel. This work will ease the design of compute intensive applications on FPGAs.
Resumo:
This paper is concerned with the uniformization of a system of afine recurrence equations. This transformation is used in the design (or compilation) of highly parallel embedded systems (VLSI systolic arrays, signal processing filters, etc.). In this paper, we present and implement an automatic system to achieve uniformization of systems of afine recurrence equations. We unify the results from many earlier papers, develop some theoretical extensions, and then propose effective uniformization algorithms. Our results can be used in any high level synthesis tool based on polyhedral representation of nested loop computations.
Resumo:
A novel rotor velocity estimation scheme applicable to vector controlled induction motors has been described. The proposed method will evaluate rotor velocity, ωr, on-line, does not require any extra transducers or injection of any signals, nor does it employ complicated algorithms such as MRAS or Kalman filters. Furthermore, the new scheme will operate at all velocities including zero with very little error. The procedure employs motor model equations, however all differential and integral terms have been eliminated giving a very fast, low-cost, effective and practical alternative to the current available methods. Simulation results verify the operation of the scheme under ideal and PWM conditions.
Resumo:
Purpose – The purpose of this paper is to consider Turing's two tests for machine intelligence: the parallel-paired, three-participants game presented in his 1950 paper, and the “jury-service” one-to-one measure described two years later in a radio broadcast. Both versions were instantiated in practical Turing tests during the 18th Loebner Prize for artificial intelligence hosted at the University of Reading, UK, in October 2008. This involved jury-service tests in the preliminary phase and parallel-paired in the final phase. Design/methodology/approach – Almost 100 test results from the final have been evaluated and this paper reports some intriguing nuances which arose as a result of the unique contest. Findings – In the 2008 competition, Turing's 30 per cent pass rate is not achieved by any machine in the parallel-paired tests but Turing's modified prediction: “at least in a hundred years time” is remembered. Originality/value – The paper presents actual responses from “modern Elizas” to human interrogators during contest dialogues that show considerable improvement in artificial conversational entities (ACE). Unlike their ancestor – Weizenbaum's natural language understanding system – ACE are now able to recall, share information and disclose personal interests.
Resumo:
Recent research in multi-agent systems incorporate fault tolerance concepts. However, the research does not explore the extension and implementation of such ideas for large scale parallel computing systems. The work reported in this paper investigates a swarm array computing approach, namely ‘Intelligent Agents’. In the approach considered a task to be executed on a parallel computing system is decomposed to sub-tasks and mapped onto agents that traverse an abstracted hardware layer. The agents intercommunicate across processors to share information during the event of a predicted core/processor failure and for successfully completing the task. The agents hence contribute towards fault tolerance and towards building reliable systems. The feasibility of the approach is validated by simulations on an FPGA using a multi-agent simulator and implementation of a parallel reduction algorithm on a computer cluster using the Message Passing Interface.