22 resultados para parallel computation
em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Resumo:
The past few decades have seen a considerable increase in the number of parallel and distributed systems. With the development of more complex applications, the need for more powerful systems has emerged and various parallel and distributed environments have been designed and implemented. Each of the environments, including hardware and software, has unique strengths and weaknesses. There is no single parallel environment that can be identified as the best environment for all applications with respect to hardware and software properties. The main goal of this thesis is to provide a novel way of performing data-parallel computation in parallel and distributed environments by utilizing the best characteristics of difference aspects of parallel computing. For the purpose of this thesis, three aspects of parallel computing were identified and studied. First, three parallel environments (shared memory, distributed memory, and a network of workstations) are evaluated to quantify theirsuitability for different parallel applications. Due to the parallel and distributed nature of the environments, networks connecting the processors in these environments were investigated with respect to their performance characteristics. Second, scheduling algorithms are studied in order to make them more efficient and effective. A concept of application-specific information scheduling is introduced. The application- specific information is data about the workload extractedfrom an application, which is provided to a scheduling algorithm. Three scheduling algorithms are enhanced to utilize the application-specific information to further refine their scheduling properties. A more accurate description of the workload is especially important in cases where the workunits are heterogeneous and the parallel environment is heterogeneous and/or non-dedicated. The results obtained show that the additional information regarding the workload has a positive impact on the performance of applications. Third, a programming paradigm for networks of symmetric multiprocessor (SMP) workstations is introduced. The MPIT programming paradigm incorporates the Message Passing Interface (MPI) with threads to provide a methodology to write parallel applications that efficiently utilize the available resources and minimize the overhead. The MPIT allows for communication and computation to overlap by deploying a dedicated thread for communication. Furthermore, the programming paradigm implements an application-specific scheduling algorithm. The scheduling algorithm is executed by the communication thread. Thus, the scheduling does not affect the execution of the parallel application. Performance results achieved from the MPIT show that considerable improvements over conventional MPI applications are achieved.
Resumo:
Diplomityön tarkoituksena on optimoida asiakkaiden sähkölaskun laskeminen hajautetun laskennan avulla. Älykkäiden etäluettavien energiamittareiden tullessa jokaiseen kotitalouteen, energiayhtiöt velvoitetaan laskemaan asiakkaiden sähkölaskut tuntiperusteiseen mittaustietoon perustuen. Kasvava tiedonmäärä lisää myös tarvittavien laskutehtävien määrää. Työssä arvioidaan vaihtoehtoja hajautetun laskennan toteuttamiseksi ja luodaan tarkempi katsaus pilvilaskennan mahdollisuuksiin. Lisäksi ajettiin simulaatioita, joiden avulla arvioitiin rinnakkaislaskennan ja peräkkäislaskennan eroja. Sähkölaskujen oikeinlaskemisen tueksi kehitettiin mittauspuu-algoritmi.
Resumo:
Cellular automata are models for massively parallel computation. A cellular automaton consists of cells which are arranged in some kind of regular lattice and a local update rule which updates the state of each cell according to the states of the cell's neighbors on each step of the computation. This work focuses on reversible one-dimensional cellular automata in which the cells are arranged in a two-way in_nite line and the computation is reversible, that is, the previous states of the cells can be derived from the current ones. In this work it is shown that several properties of reversible one-dimensional cellular automata are algorithmically undecidable, that is, there exists no algorithm that would tell whether a given cellular automaton has the property or not. It is shown that the tiling problem of Wang tiles remains undecidable even in some very restricted special cases. It follows that it is undecidable whether some given states will always appear in computations by the given cellular automaton. It also follows that a weaker form of expansivity, which is a concept of dynamical systems, is an undecidable property for reversible one-dimensional cellular automata. It is shown that several properties of dynamical systems are undecidable for reversible one-dimensional cellular automata. It shown that sensitivity to initial conditions and topological mixing are undecidable properties. Furthermore, non-sensitive and mixing cellular automata are recursively inseparable. It follows that also chaotic behavior is an undecidable property for reversible one-dimensional cellular automata.
Resumo:
Numerical weather prediction and climate simulation have been among the computationally most demanding applications of high performance computing eversince they were started in the 1950's. Since the 1980's, the most powerful computers have featured an ever larger number of processors. By the early 2000's, this number is often several thousand. An operational weather model must use all these processors in a highly coordinated fashion. The critical resource in running such models is not computation, but the amount of necessary communication between the processors. The communication capacity of parallel computers often fallsfar short of their computational power. The articles in this thesis cover fourteen years of research into how to harness thousands of processors on a single weather forecast or climate simulation, so that the application can benefit as much as possible from the power of parallel high performance computers. The resultsattained in these articles have already been widely applied, so that currently most of the organizations that carry out global weather forecasting or climate simulation anywhere in the world use methods introduced in them. Some further studies extend parallelization opportunities into other parts of the weather forecasting environment, in particular to data assimilation of satellite observations.
Resumo:
This thesis presents a novel design paradigm, called Virtual Runtime Application Partitions (VRAP), to judiciously utilize the on-chip resources. As the dark silicon era approaches, where the power considerations will allow only a fraction chip to be powered on, judicious resource management will become a key consideration in future designs. Most of the works on resource management treat only the physical components (i.e. computation, communication, and memory blocks) as resources and manipulate the component to application mapping to optimize various parameters (e.g. energy efficiency). To further enhance the optimization potential, in addition to the physical resources we propose to manipulate abstract resources (i.e. voltage/frequency operating point, the fault-tolerance strength, the degree of parallelism, and the configuration architecture). The proposed framework (i.e. VRAP) encapsulates methods, algorithms, and hardware blocks to provide each application with the abstract resources tailored to its needs. To test the efficacy of this concept, we have developed three distinct self adaptive environments: (i) Private Operating Environment (POE), (ii) Private Reliability Environment (PRE), and (iii) Private Configuration Environment (PCE) that collectively ensure that each application meets its deadlines using minimal platform resources. In this work several novel architectural enhancements, algorithms and policies are presented to realize the virtual runtime application partitions efficiently. Considering the future design trends, we have chosen Coarse Grained Reconfigurable Architectures (CGRAs) and Network on Chips (NoCs) to test the feasibility of our approach. Specifically, we have chosen Dynamically Reconfigurable Resource Array (DRRA) and McNoC as the representative CGRA and NoC platforms. The proposed techniques are compared and evaluated using a variety of quantitative experiments. Synthesis and simulation results demonstrate VRAP significantly enhances the energy and power efficiency compared to state of the art.
Resumo:
Tässä diplomityössä tutkitaan dispariteettikartan laskennan tehostamista interpoloimalla. Kolmiomittausta käyttämällä stereokuvasta muodostetaan ensin harva dispariteettikartta, jonka jälkeen koko kuvan kattava dispariteettikartta muodostetaan interpoloimalla. Kolmiomittausta varten täytyy tietää samaa reaalimaailman pistettä vastaavat kuvapisteet molemmissa kameroissa. Huolimatta siitä, että vastaavien pisteiden hakualue voidaan pienentää kahdesta ulottuvuudesta yhteen ulottuvuuteen käyttämällä esimerkiksi epipolaarista geometriaa, on laskennallisesti tehokkaampaa määrittää osa dispariteetikartasta interpoloimalla, kuin etsiä vastaavia kuvapisteitä stereokuvista. Myöskin johtuen stereonäköjärjestelmän kameroiden välisestä etäisyydestä, kaikki kuvien pisteet eivät löydy toisesta kuvasta. Näin ollen on mahdotonta määrittää koko kuvan kattavaa dispariteettikartaa pelkästään vastaavista pisteistä. Vastaavien pisteiden etsimiseen tässä työssä käytetään dynaamista ohjelmointia sekä korrelaatiomenetelmää. Reaalimaailman pinnat ovat yleisesti ottaen jatkuvia, joten geometrisessä mielessä on perusteltua approksimoida kuvien esittämiä pintoja interpoloimalla. On myöskin olemassa tieteellistä näyttöä, jonkamukaan ihmisen stereonäkö interpoloi objektien pintoja.
Resumo:
This thesis presents briefly the basic operation and use of centrifugal pumps and parallel pumping applications. The characteristics of parallel pumping applications are compared to circuitry, in order to search analogy between these technical fields. The purpose of studying circuitry is to find out if common software tools for solving circuit performance could be used to observe parallel pumping applications. The empirical part of the thesis introduces a simulation environment for parallel pumping systems, which is based on circuit components of Matlab Simulink —software. The created simulation environment ensures the observation of variable speed controlled parallel pumping systems in case of different controlling methods. The introduced simulation environment was evaluated by building a simulation model for actual parallel pumping system at Lappeenranta University of Technology. The simulated performance of the parallel pumps was compared to measured values of the actual system. The gathered information shows, that if the initial data of the system and pump perfonnance is adequate, the circuitry based simulation environment can be exploited to observe parallel pumping systems. The introduced simulation environment can represent the actual operation of parallel pumps in reasonably accuracy. There by the circuitry based simulation can be used as a researching tool to develop new controlling ways for parallel pumps.
Resumo:
Over the last decades, calibration techniques have been widely used to improve the accuracy of robots and machine tools since they only involve software modification instead of changing the design and manufacture of the hardware. Traditionally, there are four steps are required for a calibration, i.e. error modeling, measurement, parameter identification and compensation. The objective of this thesis is to propose a method for the kinematics analysis and error modeling of a newly developed hybrid redundant robot IWR (Intersector Welding Robot), which possesses ten degrees of freedom (DOF) where 6-DOF in parallel and additional 4-DOF in serial. In this article, the problem of kinematics modeling and error modeling of the proposed IWR robot are discussed. Based on the vector arithmetic method, the kinematics model and the sensitivity model of the end-effector subject to the structure parameters is derived and analyzed. The relations between the pose (position and orientation) accuracy and manufacturing tolerances, actuation errors, and connection errors are formulated. Computer simulation is performed to examine the validity and effectiveness of the proposed method.
Resumo:
The maximum realizable power throughput of power electronic converters may be limited or constrained by technical or economical considerations. One solution to this problemis to connect several power converter units in parallel. The parallel connection can be used to increase the current carrying capacity of the overall system beyond the ratings of individual power converter units. Thus, it is possible to use several lower-power converter units, produced in large quantities, as building blocks to construct high-power converters in a modular manner. High-power converters realized by using parallel connection are needed for example in multimegawatt wind power generation systems. Parallel connection of power converter units is also required in emerging applications such as photovoltaic and fuel cell power conversion. The parallel operation of power converter units is not, however, problem free. This is because parallel-operating units are subject to overcurrent stresses, which are caused by unequal load current sharing or currents that flow between the units. Commonly, the term ’circulatingcurrent’ is used to describe both the unequal load current sharing and the currents flowing between the units. Circulating currents, again, are caused by component tolerances and asynchronous operation of the parallel units. Parallel-operating units are also subject to stresses caused by unequal thermal stress distribution. Both of these problemscan, nevertheless, be handled with a proper circulating current control. To design an effective circulating current control system, we need information about circulating current dynamics. The dynamics of the circulating currents can be investigated by developing appropriate mathematical models. In this dissertation, circulating current models aredeveloped for two different types of parallel two-level three-phase inverter configurations. Themodels, which are developed for an arbitrary number of parallel units, provide a framework for analyzing circulating current generation mechanisms and developing circulating current control systems. In addition to developing circulating current models, modulation of parallel inverters is considered. It is illustrated that depending on the parallel inverter configuration and the modulation method applied, common-mode circulating currents may be excited as a consequence of the differential-mode circulating current control. To prevent the common-mode circulating currents that are caused by the modulation, a dual modulator method is introduced. The dual modulator basically consists of two independently operating modulators, the outputs of which eventually constitute the switching commands of the inverter. The two independently operating modulators are referred to as primary and secondary modulators. In its intended usage, the same voltage vector is fed to the primary modulators of each parallel unit, and the inputs of the secondary modulators are obtained from the circulating current controllers. To ensure that voltage commands obtained from the circulating current controllers are realizable, it must be guaranteed that the inverter is not driven into saturation by the primary modulator. The inverter saturation can be prevented by limiting the inputs of the primary and secondary modulators. Because of this, also a limitation algorithm is proposed. The operation of both the proposed dual modulator and the limitation algorithm is verified experimentally.
Resumo:
The aim of this thesis is to describe hybrid drive design problems, the advantages and difficulties related to the drive. A review of possible hybrid constructions, benefits of parallel, series and series-parallel hybrids is done. In the thesis analytical and finite element calculations of permanent magnet synchronous machines with embedded magnets were done. The finite element calculations were done using Cedrat’s Flux 2D software. This machine is planned to be used as a motor-generator in a low power parallel hybrid vehicle. The boundary conditions for the design were found from Lucas-TVS Ltd., India. Design Requirements, briefly: • The system DC voltage level is 120 V, which implies Uphase = 49 V (RMS) in a three phase system. • The power output of 10 kW at base speed 1500 rpm (Torque of 65 Nm) is desired. • The maximum outer diameter should not be more than 250 mm, and the maximum core length should not exceed 40 mm. The main difficulties which the author met were the dimensional restrictions. After having designed and analyzed several possible constructions they were compared and the final design selected. Dimensioned and detailed design is performed. Effects of different parameters, such as the number of poles, number of turns and magnetic geometry are discussed. The best modification offers considerable reduction of volume.
Resumo:
Memristive computing refers to the utilization of the memristor, the fourth fundamental passive circuit element, in computational tasks. The existence of the memristor was theoretically predicted in 1971 by Leon O. Chua, but experimentally validated only in 2008 by HP Labs. A memristor is essentially a nonvolatile nanoscale programmable resistor — indeed, memory resistor — whose resistance, or memristance to be precise, is changed by applying a voltage across, or current through, the device. Memristive computing is a new area of research, and many of its fundamental questions still remain open. For example, it is yet unclear which applications would benefit the most from the inherent nonlinear dynamics of memristors. In any case, these dynamics should be exploited to allow memristors to perform computation in a natural way instead of attempting to emulate existing technologies such as CMOS logic. Examples of such methods of computation presented in this thesis are memristive stateful logic operations, memristive multiplication based on the translinear principle, and the exploitation of nonlinear dynamics to construct chaotic memristive circuits. This thesis considers memristive computing at various levels of abstraction. The first part of the thesis analyses the physical properties and the current-voltage behaviour of a single device. The middle part presents memristor programming methods, and describes microcircuits for logic and analog operations. The final chapters discuss memristive computing in largescale applications. In particular, cellular neural networks, and associative memory architectures are proposed as applications that significantly benefit from memristive implementation. The work presents several new results on memristor modeling and programming, memristive logic, analog arithmetic operations on memristors, and applications of memristors. The main conclusion of this thesis is that memristive computing will be advantageous in large-scale, highly parallel mixed-mode processing architectures. This can be justified by the following two arguments. First, since processing can be performed directly within memristive memory architectures, the required circuitry, processing time, and possibly also power consumption can be reduced compared to a conventional CMOS implementation. Second, intrachip communication can be naturally implemented by a memristive crossbar structure.
Resumo:
To obtain the desirable accuracy of a robot, there are two techniques available. The first option would be to make the robot match the nominal mathematic model. In other words, the manufacturing and assembling tolerances of every part would be extremely tight so that all of the various parameters would match the “design” or “nominal” values as closely as possible. This method can satisfy most of the accuracy requirements, but the cost would increase dramatically as the accuracy requirement increases. Alternatively, a more cost-effective solution is to build a manipulator with relaxed manufacturing and assembling tolerances. By modifying the mathematical model in the controller, the actual errors of the robot can be compensated. This is the essence of robot calibration. Simply put, robot calibration is the process of defining an appropriate error model and then identifying the various parameter errors that make the error model match the robot as closely as possible. This work focuses on kinematic calibration of a 10 degree-of-freedom (DOF) redundant serial-parallel hybrid robot. The robot consists of a 4-DOF serial mechanism and a 6-DOF hexapod parallel manipulator. The redundant 4-DOF serial structure is used to enlarge workspace and the 6-DOF hexapod manipulator is used to provide high load capabilities and stiffness for the whole structure. The main objective of the study is to develop a suitable calibration method to improve the accuracy of the redundant serial-parallel hybrid robot. To this end, a Denavit–Hartenberg (DH) hybrid error model and a Product-of-Exponential (POE) error model are developed for error modeling of the proposed robot. Furthermore, two kinds of global optimization methods, i.e. the differential-evolution (DE) algorithm and the Markov Chain Monte Carlo (MCMC) algorithm, are employed to identify the parameter errors of the derived error model. A measurement method based on a 3-2-1 wire-based pose estimation system is proposed and implemented in a Solidworks environment to simulate the real experimental validations. Numerical simulations and Solidworks prototype-model validations are carried out on the hybrid robot to verify the effectiveness, accuracy and robustness of the calibration algorithms.
Resumo:
Video transcoding refers to the process of converting a digital video from one format into another format. It is a compute-intensive operation. Therefore, transcoding of a large number of simultaneous video streams requires a large amount of computing resources. Moreover, to handle di erent load conditions in a cost-e cient manner, the video transcoding service should be dynamically scalable. Infrastructure as a Service Clouds currently offer computing resources, such as virtual machines, under the pay-per-use business model. Thus the IaaS Clouds can be leveraged to provide a coste cient, dynamically scalable video transcoding service. To use computing resources e ciently in a cloud computing environment, cost-e cient virtual machine provisioning is required to avoid overutilization and under-utilization of virtual machines. This thesis presents proactive virtual machine resource allocation and de-allocation algorithms for video transcoding in cloud computing. Since users' requests for videos may change at di erent times, a check is required to see if the current computing resources are adequate for the video requests. Therefore, the work on admission control is also provided. In addition to admission control, temporal resolution reduction is used to avoid jitters in a video. Furthermore, in a cloud computing environment such as Amazon EC2, the computing resources are more expensive as compared with the storage resources. Therefore, to avoid repetition of transcoding operations, a transcoded video needs to be stored for a certain time. To store all videos for the same amount of time is also not cost-e cient because popular transcoded videos have high access rate while unpopular transcoded videos are rarely accessed. This thesis provides a cost-e cient computation and storage trade-o strategy, which stores videos in the video repository as long as it is cost-e cient to store them. This thesis also proposes video segmentation strategies for bit rate reduction and spatial resolution reduction video transcoding. The evaluation of proposed strategies is performed using a message passing interface based video transcoder, which uses a coarse-grain parallel processing approach where video is segmented at group of pictures level.
Resumo:
The pumping processes requiring wide range of flow are often equipped with parallelconnected centrifugal pumps. In parallel pumping systems, the use of variable speed control allows that the required output for the process can be delivered with a varying number of operated pump units and selected rotational speed references. However, the optimization of the parallel-connected rotational speed controlled pump units often requires adaptive modelling of both parallel pump characteristics and the surrounding system in varying operation conditions. The available information required for the system modelling in typical parallel pumping applications such as waste water treatment and various cooling and water delivery pumping tasks can be limited, and the lack of real-time operation point monitoring often sets limits for accurate energy efficiency optimization. Hence, alternatives for easily implementable control strategies which can be adopted with minimum system data are necessary. This doctoral thesis concentrates on the methods that allow the energy efficient use of variable speed controlled parallel pumps in system scenarios in which the parallel pump units consist of a centrifugal pump, an electric motor, and a frequency converter. Firstly, the suitable operation conditions for variable speed controlled parallel pumps are studied. Secondly, methods for determining the output of each parallel pump unit using characteristic curve-based operation point estimation with frequency converter are discussed. Thirdly, the implementation of the control strategy based on real-time pump operation point estimation and sub-optimization of each parallel pump unit is studied. The findings of the thesis support the idea that the energy efficiency of the pumping can be increased without the installation of new, more efficient components in the systems by simply adopting suitable control strategies. An easily implementable and adaptive control strategy for variable speed controlled parallel pumping systems can be created by utilizing the pump operation point estimation available in modern frequency converters. Hence, additional real-time flow metering, start-up measurements, and detailed system model are unnecessary, and the pumping task can be fulfilled by determining a speed reference for each parallel-pump unit which suggests the energy efficient operation of the pumping system.