13 resultados para Parallel Programming

em Universidade Federal do Rio Grande do Norte(UFRN)


Relevância:

60.00% 60.00%

Publicador:

Resumo:

The telecommunications play a fundamental role in the contemporary society, having as one of its main roles to give people the possibility to connect them and integrate them into society in which they operate and, therewith, accelerate development through knowledge. But as new technologies are introduced on the market, increases the demand for new products and services that depend on the infrastructure offered, making the problems of planning of telecommunication networks become increasingly large and complex. Many of these problems, however, can be formulated as combinatorial optimization models, and the use of heuristic algorithms can help solve these issues in the planning phase. This paper proposes the development of a Parallel Evolutionary Algorithm to be applied to telecommunications problem known in the literature as SONET Ring Assignment Problem SRAP. This problem is the class NP-hard and arises during the physical planning of a telecommunication network and consists of determining the connections between locations (customers), satisfying a series of constrains of the lowest possible cost. Experimental results illustrate the effectiveness of the Evolutionary Algorithm parallel, over other methods, to obtain solutions that are either optimal or very close to it

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The seismic method is of extreme importance in geophysics. Mainly associated with oil exploration, this line of research focuses most of all investment in this area. The acquisition, processing and interpretation of seismic data are the parts that instantiate a seismic study. Seismic processing in particular is focused on the imaging that represents the geological structures in subsurface. Seismic processing has evolved significantly in recent decades due to the demands of the oil industry, and also due to the technological advances of hardware that achieved higher storage and digital information processing capabilities, which enabled the development of more sophisticated processing algorithms such as the ones that use of parallel architectures. One of the most important steps in seismic processing is imaging. Migration of seismic data is one of the techniques used for imaging, with the goal of obtaining a seismic section image that represents the geological structures the most accurately and faithfully as possible. The result of migration is a 2D or 3D image which it is possible to identify faults and salt domes among other structures of interest, such as potential hydrocarbon reservoirs. However, a migration fulfilled with quality and accuracy may be a long time consuming process, due to the mathematical algorithm heuristics and the extensive amount of data inputs and outputs involved in this process, which may take days, weeks and even months of uninterrupted execution on the supercomputers, representing large computational and financial costs, that could derail the implementation of these methods. Aiming at performance improvement, this work conducted the core parallelization of a Reverse Time Migration (RTM) algorithm, using the parallel programming model Open Multi-Processing (OpenMP), due to the large computational effort required by this migration technique. Furthermore, analyzes such as speedup, efficiency were performed, and ultimately, the identification of the algorithmic scalability degree with respect to the technological advancement expected by future processors

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This work presents the concept, design and implementation of a MP-SoC platform, named STORM (MP-SoC DirecTory-Based PlatfORM). Currently the platform is composed of the following modules: SPARC V8 processor, GPOP processor, Cache module, Memory module, Directory module and two different modles of Network-on-Chip, NoCX4 and Obese Tree. All modules were implemented using SystemC, simulated and validated, individually or in group. The modules description is presented in details. For programming the platform in C it was implemented a SPARC assembler, fully compatible with gcc s generated assembly code. For the parallel programming it was implemented a library for mutex managing, using the due assembler s support. A total of 10 simulations of increasing complexity are presented for the validation of the presented concepts. The simulations include real parallel applications, such as matrix multiplication, Mergesort, KMP, Motion Estimation and DCT 2D

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The evolution of wireless communication systems leads to Dynamic Spectrum Allocation for Cognitive Radio, which requires reliable spectrum sensing techniques. Among the spectrum sensing methods proposed in the literature, those that exploit cyclostationary characteristics of radio signals are particularly suitable for communication environments with low signal-to-noise ratios, or with non-stationary noise. However, such methods have high computational complexity that directly raises the power consumption of devices which often have very stringent low-power requirements. We propose a strategy for cyclostationary spectrum sensing with reduced energy consumption. This strategy is based on the principle that p processors working at slower frequencies consume less power than a single processor for the same execution time. We devise a strict relation between the energy savings and common parallel system metrics. The results of simulations show that our strategy promises very significant savings in actual devices.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work describes the study and the implementation of the vector speed control for a three-phase Bearingless induction machine with divided winding of 4 poles and 1,1 kW using the neural rotor flux estimation. The vector speed control operates together with the radial positioning controllers and with the winding currents controllers of the stator phases. For the radial positioning, the forces controlled by the internal machine magnetic fields are used. For the radial forces optimization , a special rotor winding with independent circuits which allows a low rotational torque influence was used. The neural flux estimation applied to the vector speed controls has the objective of compensating the parameter dependences of the conventional estimators in relation to the parameter machine s variations due to the temperature increases or due to the rotor magnetic saturation. The implemented control system allows a direct comparison between the respective responses of the speed and radial positioning controllers to the machine oriented by the neural rotor flux estimator in relation to the conventional flux estimator. All the system control is executed by a program developed in the ANSI C language. The DSP resources used by the system are: the Analog/Digital channels converters, the PWM outputs and the parallel and RS-232 serial interfaces, which are responsible, respectively, by the DSP programming and the data capture through the supervisory system

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Due of industrial informatics several attempts have been done to develop notations and semantics, which are used for classifying and describing different kind of system behavior, particularly in the modeling phase. Such attempts provide the infrastructure to resolve some real problems of engineering and construct practical systems that aim at, mainly, to increase the productivity, quality, and security of the process. Despite the many studies that have attempted to develop friendly methods for industrial controller programming, they are still programmed by conventional trial-and-error methods and, in practice, there is little written documentation on these systems. The ideal solution would be to use a computational environment that allows industrial engineers to implement the system using high-level language and that follows international standards. Accordingly, this work proposes a methodology for plant and control modelling of the discrete event systems that include sequential, parallel and timed operations, using a formalism based on Statecharts, denominated Basic Statechart (BSC). The methodology also permits automatic procedures to validate and implement these systems. To validate our methodology, we presented two case studies with typical examples of the manufacturing sector. The first example shows a sequential control for a tagged machine, which is used to illustrated dependences between the devices of the plant. In the second example, we discuss more than one strategy for controlling a manufacturing cell. The model with no control has 72 states (distinct configurations) and, the model with sequential control generated 20 different states, but they only act in 8 distinct configurations. The model with parallel control generated 210 different states, but these 210 configurations act only in 26 distinct configurations, therefore, one strategy control less restrictive than previous. Lastly, we presented one example for highlight the modular characteristic of our methodology, which it is very important to maintenance of applications. In this example, the sensors for identifying pieces in the plant were removed. So, changes in the control model are needed to transmit the information of the input buffer sensor to the others positions of the cell

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work describes the study and the implementation of the speed control for a three-phase induction motor of 1,1 kW and 4 poles using the neural rotor flux estimation. The vector speed control operates together with the winding currents controller of the stator phasis. The neural flux estimation applied to the vector speed controls has the objective of compensating the parameter dependences of the conventional estimators in relation to the parameter machine s variations due to the temperature increases or due to the rotor magnetic saturation. The implemented control system allows a direct comparison between the respective responses of the speed controls to the machine oriented by the neural rotor flux estimator in relation to the conventional flux estimator. All the system control is executed by a program developed in the ANSI C language. The main DSP recources used by the system are, respectively, the Analog/Digital channels converters, the PWM outputs and the parallel and RS-232 serial interfaces, which are responsible, respectively, by the DSP programming and the data capture through the supervisory system

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This study shows the implementation and the embedding of an Artificial Neural Network (ANN) in hardware, or in a programmable device, as a field programmable gate array (FPGA). This work allowed the exploration of different implementations, described in VHDL, of multilayer perceptrons ANN. Due to the parallelism inherent to ANNs, there are disadvantages in software implementations due to the sequential nature of the Von Neumann architectures. As an alternative to this problem, there is a hardware implementation that allows to exploit all the parallelism implicit in this model. Currently, there is an increase in use of FPGAs as a platform to implement neural networks in hardware, exploiting the high processing power, low cost, ease of programming and ability to reconfigure the circuit, allowing the network to adapt to different applications. Given this context, the aim is to develop arrays of neural networks in hardware, a flexible architecture, in which it is possible to add or remove neurons, and mainly, modify the network topology, in order to enable a modular network of fixed-point arithmetic in a FPGA. Five synthesis of VHDL descriptions were produced: two for the neuron with one or two entrances, and three different architectures of ANN. The descriptions of the used architectures became very modular, easily allowing the increase or decrease of the number of neurons. As a result, some complete neural networks were implemented in FPGA, in fixed-point arithmetic, with a high-capacity parallel processing

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The last years have presented an increase in the acceptance and adoption of the parallel processing, as much for scientific computation of high performance as for applications of general intention. This acceptance has been favored mainly for the development of environments with massive parallel processing (MPP - Massively Parallel Processing) and of the distributed computation. A common point between distributed systems and MPPs architectures is the notion of message exchange, that allows the communication between processes. An environment of message exchange consists basically of a communication library that, acting as an extension of the programming languages that allow to the elaboration of applications parallel, such as C, C++ and Fortran. In the development of applications parallel, a basic aspect is on to the analysis of performance of the same ones. Several can be the metric ones used in this analysis: time of execution, efficiency in the use of the processing elements, scalability of the application with respect to the increase in the number of processors or to the increase of the instance of the treat problem. The establishment of models or mechanisms that allow this analysis can be a task sufficiently complicated considering parameters and involved degrees of freedom in the implementation of the parallel application. An joined alternative has been the use of collection tools and visualization of performance data, that allow the user to identify to points of strangulation and sources of inefficiency in an application. For an efficient visualization one becomes necessary to identify and to collect given relative to the execution of the application, stage this called instrumentation. In this work it is presented, initially, a study of the main techniques used in the collection of the performance data, and after that a detailed analysis of the main available tools is made that can be used in architectures parallel of the type to cluster Beowulf with Linux on X86 platform being used libraries of communication based in applications MPI - Message Passing Interface, such as LAM and MPICH. This analysis is validated on applications parallel bars that deal with the problems of the training of neural nets of the type perceptrons using retro-propagation. The gotten conclusions show to the potentiality and easinesses of the analyzed tools.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work presents a scalable and efficient parallel implementation of the Standard Simplex algorithm in the multicore architecture to solve large scale linear programming problems. We present a general scheme explaining how each step of the standard Simplex algorithm was parallelized, indicating some important points of the parallel implementation. Performance analysis were conducted by comparing the sequential time using the Simplex tableau and the Simplex of the CPLEXR IBM. The experiments were executed on a shared memory machine with 24 cores. The scalability analysis was performed with problems of different dimensions, finding evidence that our parallel standard Simplex algorithm has a better parallel efficiency for problems with more variables than constraints. In comparison with CPLEXR , the proposed parallel algorithm achieved a efficiency of up to 16 times better

Relevância:

20.00% 20.00%

Publicador:

Resumo:

It bet on the next generation of computers as architecture with multiple processors and/or multicore processors. In this sense there are challenges related to features interconnection, operating frequency, the area on chip, power dissipation, performance and programmability. The mechanism of interconnection and communication it was considered ideal for this type of architecture are the networks-on-chip, due its scalability, reusability and intrinsic parallelism. The networks-on-chip communication is accomplished by transmitting packets that carry data and instructions that represent requests and responses between the processing elements interconnected by the network. The transmission of packets is accomplished as in a pipeline between the routers in the network, from source to destination of the communication, even allowing simultaneous communications between pairs of different sources and destinations. From this fact, it is proposed to transform the entire infrastructure communication of network-on-chip, using the routing mechanisms, arbitration and storage, in a parallel processing system for high performance. In this proposal, the packages are formed by instructions and data that represent the applications, which are executed on routers as well as they are transmitted, using the pipeline and parallel communication transmissions. In contrast, traditional processors are not used, but only single cores that control the access to memory. An implementation of this idea is called IPNoSys (Integrated Processing NoC System), which has an own programming model and a routing algorithm that guarantees the execution of all instructions in the packets, preventing situations of deadlock, livelock and starvation. This architecture provides mechanisms for input and output, interruption and operating system support. As proof of concept was developed a programming environment and a simulator for this architecture in SystemC, which allows configuration of various parameters and to obtain several results to evaluate it

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Removing inconsistencies in a project is a less expensive activity when done in the early steps of design. The use of formal methods improves the understanding of systems. They have various techniques such as formal specification and verification to identify these problems in the initial stages of a project. However, the transformation from a formal specification into a programming language is a non-trivial task and error prone, specially when done manually. The aid of tools at this stage can bring great benefits to the final product to be developed. This paper proposes the extension of a tool whose focus is the automatic translation of specifications written in CSPM into Handel-C. CSP is a formal description language suitable for concurrent systems, and CSPM is the notation used in tools support. Handel-C is a programming language whose result can be compiled directly into FPGA s. Our extension increases the number of CSPM operators accepted by the tool, allowing the user to define local processes, to rename channels in a process and to use Boolean guards on external choices. In addition, we also propose the implementation of a communication protocol that eliminates some restrictions on parallel composition of processes in the translation into Handel-C, allowing communication in a same channel between multiple processes to be mapped in a consistent manner and that improper communication in a channel does not ocurr in the generated code, ie, communications that are not allowed in the system specification

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The increasing complexity of integrated circuits has boosted the development of communications architectures like Networks-on-Chip (NoCs), as an architecture; alternative for interconnection of Systems-on-Chip (SoC). Networks-on-Chip complain for component reuse, parallelism and scalability, enhancing reusability in projects of dedicated applications. In the literature, lots of proposals have been made, suggesting different configurations for networks-on-chip architectures. Among all networks-on-chip considered, the architecture of IPNoSys is a non conventional one, since it allows the execution of operations, while the communication process is performed. This study aims to evaluate the execution of data-flow based applications on IPNoSys, focusing on their adaptation against the design constraints. Data-flow based applications are characterized by the flowing of continuous stream of data, on which operations are executed. We expect that these type of applications can be improved when running on IPNoSys, because they have a programming model similar to the execution model of this network. By observing the behavior of these applications when running on IPNoSys, were performed changes in the execution model of the network IPNoSys, allowing the implementation of an instruction level parallelism. For these purposes, analysis of the implementations of dataflow applications were performed and compared