44 resultados para scalable parallel programming


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Global illumination algorithms are at the center of realistic image synthesis and account for non-trivial light transport and occlusion within scenes, such as indirect illumination, ambient occlusion, and environment lighting. Their computationally most difficult part is determining light source visibility at each visible scene point. Height fields, on the other hand, constitute an important special case of geometry and are mainly used to describe certain types of objects such as terrains and to map detailed geometry onto object surfaces. The geometry of an entire scene can also be approximated by treating the distance values of its camera projection as a screen-space height field. In order to shadow height fields from environment lights a horizon map is usually used to occlude incident light. We reduce the per-receiver time complexity of generating the horizon map on N N height fields from O(N) of the previous work to O(1) by using an algorithm that incrementally traverses the height field and reuses the information already gathered along the path of traversal. We also propose an accurate method to integrate the incident light within the limits given by the horizon map. Indirect illumination in height fields requires information about which other points are visible to each height field point. We present an algorithm to determine this intervisibility in a time complexity that matches the space complexity of the produced visibility information, which is in contrast to previous methods which scale in the height field size. As a result the amount of computation is reduced by two orders of magnitude in common use cases. Screen-space ambient obscurance methods approximate ambient obscurance from the depth bu er geometry and have been widely adopted by contemporary real-time applications. They work by sampling the screen-space geometry around each receiver point but have been previously limited to near- field effects because sampling a large radius quickly exceeds the render time budget. We present an algorithm that reduces the quadratic per-pixel complexity of previous methods to a linear complexity by line sweeping over the depth bu er and maintaining an internal representation of the processed geometry from which occluders can be efficiently queried. Another algorithm is presented to determine ambient obscurance from the entire depth bu er at each screen pixel. The algorithm scans the depth bu er in a quick pre-pass and locates important features in it, which are then used to evaluate the ambient obscurance integral accurately. We also propose an evaluation of the integral such that results within a few percent of the ray traced screen-space reference are obtained at real-time render times.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With the shift towards many-core computer architectures, dataflow programming has been proposed as one potential solution for producing software that scales to a varying number of processor cores. Programming for parallel architectures is considered difficult as the current popular programming languages are inherently sequential and introducing parallelism is typically up to the programmer. Dataflow, however, is inherently parallel, describing an application as a directed graph, where nodes represent calculations and edges represent a data dependency in form of a queue. These queues are the only allowed communication between the nodes, making the dependencies between the nodes explicit and thereby also the parallelism. Once a node have the su cient inputs available, the node can, independently of any other node, perform calculations, consume inputs, and produce outputs. Data ow models have existed for several decades and have become popular for describing signal processing applications as the graph representation is a very natural representation within this eld. Digital lters are typically described with boxes and arrows also in textbooks. Data ow is also becoming more interesting in other domains, and in principle, any application working on an information stream ts the dataflow paradigm. Such applications are, among others, network protocols, cryptography, and multimedia applications. As an example, the MPEG group standardized a dataflow language called RVC-CAL to be use within reconfigurable video coding. Describing a video coder as a data ow network instead of with conventional programming languages, makes the coder more readable as it describes how the video dataflows through the different coding tools. While dataflow provides an intuitive representation for many applications, it also introduces some new problems that need to be solved in order for data ow to be more widely used. The explicit parallelism of a dataflow program is descriptive and enables an improved utilization of available processing units, however, the independent nodes also implies that some kind of scheduling is required. The need for efficient scheduling becomes even more evident when the number of nodes is larger than the number of processing units and several nodes are running concurrently on one processor core. There exist several data ow models of computation, with different trade-offs between expressiveness and analyzability. These vary from rather restricted but statically schedulable, with minimal scheduling overhead, to dynamic where each ring requires a ring rule to evaluated. The model used in this work, namely RVC-CAL, is a very expressive language, and in the general case it requires dynamic scheduling, however, the strong encapsulation of dataflow nodes enables analysis and the scheduling overhead can be reduced by using quasi-static, or piecewise static, scheduling techniques. The scheduling problem is concerned with nding the few scheduling decisions that must be run-time, while most decisions are pre-calculated. The result is then an, as small as possible, set of static schedules that are dynamically scheduled. To identify these dynamic decisions and to find the concrete schedules, this thesis shows how quasi-static scheduling can be represented as a model checking problem. This involves identifying the relevant information to generate a minimal but complete model to be used for model checking. The model must describe everything that may affect scheduling of the application while omitting everything else in order to avoid state space explosion. This kind of simplification is necessary to make the state space analysis feasible. For the model checker to nd the actual schedules, a set of scheduling strategies are de ned which are able to produce quasi-static schedulers for a wide range of applications. The results of this work show that actor composition with quasi-static scheduling can be used to transform data ow programs to t many different computer architecture with different type and number of cores. This in turn, enables dataflow to provide a more platform independent representation as one application can be fitted to a specific processor architecture without changing the actual program representation. Instead, the program representation is in the context of design space exploration optimized by the development tools to fit the target platform. This work focuses on representing the dataflow scheduling problem as a model checking problem and is implemented as part of a compiler infrastructure. The thesis also presents experimental results as evidence of the usefulness of the approach.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Parallel-connected photovoltaic inverters are required in large solar plants where it is not economically or technically reasonable to use a single inverter. Currently, parallel inverters require individual isolating transformers to cut the path for the circulating current. In this doctoral dissertation, the problem is approached by attempting to minimize the generated circulating current. The circulating current is a function of the generated common-mode voltages of the parallel inverters and can be minimized by synchronizing the inverters. The synchronization has previously been achieved by a communication link. However, in photovoltaic systems the inverters may be located far apart from each other. Thus, a control free of communication is desired. It is shown in this doctoral dissertation that the circulating current can also be obtained by a common-mode voltage measurement. A control method based on a short-time switching frequency transition is developed and tested with an actual photovoltaic environment of two parallel inverters connected to two 5 kW solar arrays. Controls based on the measurement of the circulating current and the common-mode voltage are generated and tested. A communication-free method of controlling the circulating current between parallelconnected inverters is developed and verified.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The dissertation proposes two control strategies, which include the trajectory planning and vibration suppression, for a kinematic redundant serial-parallel robot machine, with the aim of attaining the satisfactory machining performance. For a given prescribed trajectory of the robot's end-effector in the Cartesian space, a set of trajectories in the robot's joint space are generated based on the best stiffness performance of the robot along the prescribed trajectory. To construct the required system-wide analytical stiffness model for the serial-parallel robot machine, a variant of the virtual joint method (VJM) is proposed in the dissertation. The modified method is an evolution of Gosselin's lumped model that can account for the deformations of a flexible link in more directions. The effectiveness of this VJM variant is validated by comparing the computed stiffness results of a flexible link with the those of a matrix structural analysis (MSA) method. The comparison shows that the numerical results from both methods on an individual flexible beam are almost identical, which, in some sense, provides mutual validation. The most prominent advantage of the presented VJM variant compared with the MSA method is that it can be applied in a flexible structure system with complicated kinematics formed in terms of flexible serial links and joints. Moreover, by combining the VJM variant and the virtual work principle, a systemwide analytical stiffness model can be easily obtained for mechanisms with both serial kinematics and parallel kinematics. In the dissertation, a system-wide stiffness model of a kinematic redundant serial-parallel robot machine is constructed based on integration of the VJM variant and the virtual work principle. Numerical results of its stiffness performance are reported. For a kinematic redundant robot, to generate a set of feasible joints' trajectories for a prescribed trajectory of its end-effector, its system-wide stiffness performance is taken as the constraint in the joints trajectory planning in the dissertation. For a prescribed location of the end-effector, the robot permits an infinite number of inverse solutions, which consequently yields infinite kinds of stiffness performance. Therefore, a differential evolution (DE) algorithm in which the positions of redundant joints in the kinematics are taken as input variables was employed to search for the best stiffness performance of the robot. Numerical results of the generated joint trajectories are given for a kinematic redundant serial-parallel robot machine, IWR (Intersector Welding/Cutting Robot), when a particular trajectory of its end-effector has been prescribed. The numerical results show that the joint trajectories generated based on the stiffness optimization are feasible for realization in the control system since they are acceptably smooth. The results imply that the stiffness performance of the robot machine deviates smoothly with respect to the kinematic configuration in the adjacent domain of its best stiffness performance. To suppress the vibration of the robot machine due to varying cutting force during the machining process, this dissertation proposed a feedforward control strategy, which is constructed based on the derived inverse dynamics model of target system. The effectiveness of applying such a feedforward control in the vibration suppression has been validated in a parallel manipulator in the software environment. The experimental study of such a feedforward control has also been included in the dissertation. The difficulties of modelling the actual system due to the unknown components in its dynamics is noticed. As a solution, a back propagation (BP) neural network is proposed for identification of the unknown components of the dynamics model of the target system. To train such a BP neural network, a modified Levenberg-Marquardt algorithm that can utilize an experimental input-output data set of the entire dynamic system is introduced in the dissertation. Validation of the BP neural network and the modified Levenberg- Marquardt algorithm is done, respectively, by a sinusoidal output approximation, a second order system parameters estimation, and a friction model estimation of a parallel manipulator, which represent three different application aspects of this method.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis presents a novel design paradigm, called Virtual Runtime Application Partitions (VRAP), to judiciously utilize the on-chip resources. As the dark silicon era approaches, where the power considerations will allow only a fraction chip to be powered on, judicious resource management will become a key consideration in future designs. Most of the works on resource management treat only the physical components (i.e. computation, communication, and memory blocks) as resources and manipulate the component to application mapping to optimize various parameters (e.g. energy efficiency). To further enhance the optimization potential, in addition to the physical resources we propose to manipulate abstract resources (i.e. voltage/frequency operating point, the fault-tolerance strength, the degree of parallelism, and the configuration architecture). The proposed framework (i.e. VRAP) encapsulates methods, algorithms, and hardware blocks to provide each application with the abstract resources tailored to its needs. To test the efficacy of this concept, we have developed three distinct self adaptive environments: (i) Private Operating Environment (POE), (ii) Private Reliability Environment (PRE), and (iii) Private Configuration Environment (PCE) that collectively ensure that each application meets its deadlines using minimal platform resources. In this work several novel architectural enhancements, algorithms and policies are presented to realize the virtual runtime application partitions efficiently. Considering the future design trends, we have chosen Coarse Grained Reconfigurable Architectures (CGRAs) and Network on Chips (NoCs) to test the feasibility of our approach. Specifically, we have chosen Dynamically Reconfigurable Resource Array (DRRA) and McNoC as the representative CGRA and NoC platforms. The proposed techniques are compared and evaluated using a variety of quantitative experiments. Synthesis and simulation results demonstrate VRAP significantly enhances the energy and power efficiency compared to state of the art.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The state of the object-oriented programming course in Lappeenranta University of Technology had reached the point, where it required changes to provide better learning opportunities and thus the learning outcomes. Based on the student feedback the course was partially dated and ineffective. The components of the course were analysed and the ineffective elements were removed and new methods were introduced to improve the course. The major changes included the change from traditional teaching methods to reverse classroom method and the use of Java as the programming language. The changes were measured by the student feedback, lecturer’s observations and comparison to previous years. The feedback suggested that the changes were successful; the course received higher overall grade than before.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The whole research of the current Master Thesis project is related to Big Data transfer over Parallel Data Link and my main objective is to assist the Saint-Petersburg National Research University ITMO research team to accomplish this project and apply Green IT methods for the data transfer system. The goal of the team is to transfer Big Data by using parallel data links with SDN Openflow approach. My task as a team member was to compare existing data transfer applications in case to verify which results the highest data transfer speed in which occasions and explain the reasons. In the context of this thesis work a comparison between 5 different utilities was done, which including Fast Data Transfer (FDT), BBCP, BBFTP, GridFTP, and FTS3. A number of scripts where developed which consist of creating random binary data to be incompressible to have fair comparison between utilities, execute the Utilities with specified parameters, create log files, results, system parameters, and plot graphs to compare the results. Transferring such an enormous variety of data can take a long time, and hence, the necessity appears to reduce the energy consumption to make them greener. In the context of Green IT approach, our team used Cloud Computing infrastructure called OpenStack. It’s more efficient to allocated specific amount of hardware resources to test different scenarios rather than using the whole resources from our testbed. Testing our implementation with OpenStack infrastructure results that the virtual channel does not consist of any traffic and we can achieve the highest possible throughput. After receiving the final results we are in place to identify which utilities produce faster data transfer in different scenarios with specific TCP parameters and we can use them in real network data links.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

New emerging technologies in the recent decade have brought new options to cross platform computer graphics development. This master thesis took a look for cross platform 3D graphics development possibilities. All platform dependent and non real time solutions were excluded. WebGL and two different OpenGL based solutions were assessed via demo application by using most recent development tools. In the results pros and cons of the each solutions were noted.