65 results for Arquiteturas Paralelas
at Universidade Federal do Rio Grande do Norte (UFRN)
Abstract:
Recent years have seen increasing acceptance and adoption of parallel processing, both for high-performance scientific computing and for general-purpose applications. This acceptance has been favored mainly by the development of environments with massively parallel processing (MPP - Massively Parallel Processing) and of distributed computing. A point common to distributed systems and MPP architectures is the notion of message passing, which enables communication between processes. A message-passing environment consists basically of a communication library that, acting as an extension of programming languages such as C, C++ and Fortran, allows the development of parallel applications. In the development of parallel applications, a fundamental aspect is their performance analysis. Several metrics can be used in this analysis: execution time, efficiency in the use of the processing elements, and scalability of the application with respect to the increase in the number of processors or in the size of the problem instance. Establishing models or mechanisms that enable this analysis can be a rather complicated task, given the parameters and degrees of freedom involved in the implementation of the parallel application. One alternative has been the use of tools for the collection and visualization of performance data, which allow the user to identify bottlenecks and sources of inefficiency in an application. Efficient visualization requires identifying and collecting data related to the execution of the application, a stage called instrumentation. This work initially presents a study of the main techniques used in the collection of performance data, followed by a detailed analysis of the main available tools that can be used on parallel architectures of the Beowulf cluster type, running Linux on the x86 platform and using communication libraries based on MPI - Message Passing Interface, such as LAM and MPICH. This analysis is validated on parallel applications that deal with the training of perceptron neural networks using backpropagation. The conclusions show the potential and ease of use of the analyzed tools.
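As a minimal illustration of the instrumentation stage described above, the sketch below manually times a message exchange with MPI_Wtime, the kind of raw performance data that the surveyed tools collect and visualize automatically. It is a generic example, not code from the thesis.

```c
/* Minimal sketch: manual instrumentation of an MPI message exchange.
 * Compile with mpicc and run with at least two processes. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, msg = 42;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) { MPI_Finalize(); return 1; }  /* needs two processes */

    double t0 = MPI_Wtime();                     /* start measured region */
    if (rank == 0)
        MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    double t1 = MPI_Wtime();                     /* end measured region   */

    printf("rank %d: message exchange took %g s\n", rank, t1 - t0);
    MPI_Finalize();
    return 0;
}
```

Run, for example, with mpirun -np 2 ./a.out; profiling tools automate exactly this kind of timing across all communication calls.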
Abstract:
The seismic method is of extreme importance in geophysics. Mainly associated with oil exploration, this line of research concentrates most of the investment in the area. The acquisition, processing and interpretation of seismic data are the parts that make up a seismic study. Seismic processing in particular is focused on producing the image that represents the geological structures in the subsurface. Seismic processing has evolved significantly in recent decades due to the demands of the oil industry, and also due to technological advances in hardware that achieved higher storage and digital information processing capabilities, enabling the development of more sophisticated processing algorithms such as those that make use of parallel architectures. One of the most important steps in seismic processing is imaging. Migration of seismic data is one of the techniques used for imaging, with the goal of obtaining a seismic section image that represents the geological structures as accurately and faithfully as possible. The result of migration is a 2D or 3D image in which it is possible to identify faults and salt domes, among other structures of interest, such as potential hydrocarbon reservoirs. However, performing a migration with quality and accuracy may be very time consuming, due to the heuristics of the mathematical algorithm and the extensive amount of input and output data involved in the process, which may take days, weeks and even months of uninterrupted execution on supercomputers, representing large computational and financial costs that could make the application of these methods unfeasible. Aiming at performance improvement, this work carried out the parallelization of the core of a Reverse Time Migration (RTM) algorithm, using the Open Multi-Processing (OpenMP) parallel programming model, due to the large computational effort required by this migration technique. Furthermore, analyses of speedup and efficiency were performed, and ultimately the degree of algorithmic scalability was identified with respect to the technological advancement expected in future processors.
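As a hedged illustration of the kind of loop such a parallelization targets, the sketch below applies a second-order finite-difference time step of the acoustic wave equation, the computational core of RTM, and distributes the grid points among threads with an OpenMP directive. Grid sizes, array names and the folded-in constants are placeholders, not the thesis implementation.

```c
/* Illustrative sketch: a 2D acoustic finite-difference time step,
 * parallelized over grid points with OpenMP. NX, NY and the arrays
 * are hypothetical; dt^2 and grid spacing are folded into vel. */
#include <omp.h>

#define NX 1000
#define NY 1000

void time_step(float p_next[NX][NY], const float p[NX][NY],
               const float p_prev[NX][NY], const float vel[NX][NY])
{
    #pragma omp parallel for collapse(2) schedule(static)
    for (int i = 1; i < NX - 1; i++) {
        for (int j = 1; j < NY - 1; j++) {
            /* second-order Laplacian, unit grid spacing assumed */
            float lap = p[i-1][j] + p[i+1][j] + p[i][j-1] + p[i][j+1]
                      - 4.0f * p[i][j];
            p_next[i][j] = 2.0f * p[i][j] - p_prev[i][j]
                         + vel[i][j] * vel[i][j] * lap;
        }
    }
}
```

Each grid point depends only on the previous two time levels, so the iterations of one time step are independent and parallelize cleanly; speedup and efficiency are then measured by varying the thread count.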
Abstract:
This paper analyzes the performance of a parallel implementation of Coupled Simulated Annealing (CSA) for the unconstrained optimization of continuous-variable problems. Parallel processing is an efficient form of information processing with emphasis on the exploration of simultaneous events in the execution of software. It arises primarily due to high computational performance demands and the difficulty of increasing the speed of a single processing core. Although multicore processors are easily found nowadays, several algorithms are not yet suitable for running on parallel architectures. The algorithm is characterized by a group of Simulated Annealing (SA) optimizers working together on refining the solution. Each SA optimizer runs on a single thread executed by a different processor. In the analysis of parallel performance and scalability, the following metrics were investigated: the execution time; the speedup of the algorithm with respect to increasing the number of processors; and the efficient use of processing elements with respect to the increasing size of the treated problem. Furthermore, the quality of the final solution was verified. For the study, this paper proposes a parallel version of CSA and its equivalent serial version. Both algorithms were analyzed on 14 benchmark functions. For each of these functions, the CSA is evaluated using 2 to 24 optimizers. The results obtained are shown and discussed in light of the analysis of the metrics. The conclusions of the paper characterize the CSA as a good parallel algorithm, both in the quality of the solutions and in its parallel scalability and parallel efficiency.
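A minimal sketch of this organization is given below, assuming the coupled acceptance rule commonly used in the CSA literature, in which each optimizer's acceptance probability is normalized by a coupling term computed over all current solutions. The cost function, the generation step, the fixed temperatures and the always-accept-improvements rule are simplifying placeholders; the paper's actual implementation may differ.

```c
/* Sketch of coupled SA: M optimizers on M OpenMP threads sharing
 *   gamma_c = sum_j exp(E(x_j) / T_ac),
 * which enters each thread's acceptance probability. */
#include <math.h>
#include <omp.h>
#include <stdlib.h>

#define M 8          /* number of coupled optimizers (threads) */
#define ITERS 10000

static double f(double x) { return x * x; }   /* placeholder cost */

int main(void)
{
    double x[M], E[M], gamma_c;
    double T_gen = 1.0, T_ac = 1.0;           /* simplified schedules */

    for (int i = 0; i < M; i++) { x[i] = drand48() * 10 - 5; E[i] = f(x[i]); }

    #pragma omp parallel num_threads(M)
    {
        int i = omp_get_thread_num();
        unsigned short st[3] = { (unsigned short)i, 1, 2 }; /* per-thread RNG */
        for (int k = 0; k < ITERS; k++) {
            #pragma omp single
            {   /* coupling term from all current solutions */
                gamma_c = 0.0;
                for (int j = 0; j < M; j++) gamma_c += exp(E[j] / T_ac);
            }   /* implicit barrier keeps gamma_c consistent */

            double y  = x[i] + T_gen * (erand48(st) - 0.5); /* probe   */
            double Ey = f(y);
            double A  = exp(E[i] / T_ac) / gamma_c;         /* coupled */
            if (Ey < E[i] || erand48(st) < A) { x[i] = y; E[i] = Ey; }
            #pragma omp barrier  /* all E[] updated before next gamma  */
        }
    }
    return 0;
}
```

The coupling is the only point of synchronization: the worse an optimizer's current solution is relative to the ensemble, the more freely it accepts uphill moves, which is what lets the group cooperate rather than run as independent SA instances.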
Abstract:
With the growth of energy consumption worldwide, conventional reservoirs, those of so-called easy exploration and production, are no longer meeting the global energy demand. This has led many researchers to develop projects that address these needs, and companies in the oil sector have invested in techniques that help in locating and drilling wells. One of the techniques employed in the oil exploration process is Reverse Time Migration (RTM), a seismic imaging method that produces excellent images of the subsurface. The algorithm is based on the computation of the wave equation. RTM is considered one of the most advanced seismic imaging techniques. The economic value of the oil reserves whose localization requires RTM is very high, which means that the development of these algorithms becomes a competitive differentiator for seismic processing companies. However, RTM requires great computational power, which still somewhat hinders its practical adoption. The objective of this work is to explore the implementation of this algorithm on unconventional architectures, specifically GPUs using CUDA, analyzing the difficulties in its development as well as the performance of the algorithm in its sequential and parallel versions.
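As a hedged sketch of what such a GPU port involves, the CUDA kernel below computes one finite-difference time step of the acoustic wave equation with one GPU thread per grid point; the host launches it once per time step. Array names, sizes and the flattened indexing are illustrative placeholders, not the thesis code.

```cuda
/* Illustrative CUDA sketch: one wave-equation time step, one thread
 * per grid point. Arrays are flattened row-major, nx rows of ny. */
__global__ void time_step(float *p_next, const float *p, const float *p_prev,
                          const float *vel, int nx, int ny)
{
    int i = blockIdx.y * blockDim.y + threadIdx.y;   /* row    */
    int j = blockIdx.x * blockDim.x + threadIdx.x;   /* column */
    if (i < 1 || i >= nx - 1 || j < 1 || j >= ny - 1) return;

    int id = i * ny + j;
    float lap = p[id - ny] + p[id + ny] + p[id - 1] + p[id + 1]
              - 4.0f * p[id];                        /* 2D Laplacian */
    p_next[id] = 2.0f * p[id] - p_prev[id] + vel[id] * vel[id] * lap;
}

/* Host side, one launch per time step, e.g.:
 *   dim3 block(16, 16), grid((ny + 15) / 16, (nx + 15) / 16);
 *   time_step<<<grid, block>>>(d_pn, d_p, d_pp, d_vel, nx, ny);   */
```

The practical difficulties the abstract alludes to typically lie outside this kernel: keeping the wavefields resident in GPU memory and managing the large volume of input/output snapshots that RTM requires.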
Abstract:
Particle Swarm Optimization is a metaheuristic that arose in order to simulate the behavior of a flock of birds in flight, whose movement is locally random but globally determined. This technique has been widely used to address non-linear continuous problems and is still little explored in discrete problems. This paper presents the operation of this metaheuristic and proposes strategies for applying it to discrete optimization problems, in both parallel and sequential forms of execution. The computational experiments were performed on instances of the TSP selected from the TSPLIB library, with up to 3038 nodes, showing the performance improvement of the parallel methods over their sequential versions, in execution time and in the quality of the results.
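For reference, the sketch below shows the classic continuous PSO velocity and position update, with the per-particle loop parallelized via OpenMP; discrete TSP variants replace this arithmetic with operators on permutations, such as swap sequences. The constants and dimensions are placeholders, not values from the paper.

```c
/* Minimal sketch: one iteration of continuous PSO, parallel over
 * particles. W, C1, C2, N_PART and DIM are hypothetical. */
#include <omp.h>
#include <stdlib.h>

#define N_PART 64
#define DIM    2
#define W  0.7
#define C1 1.5
#define C2 1.5

void pso_step(double x[N_PART][DIM], double v[N_PART][DIM],
              double pbest[N_PART][DIM], const double gbest[DIM])
{
    #pragma omp parallel for
    for (int i = 0; i < N_PART; i++) {
        for (int d = 0; d < DIM; d++) {
            /* rand() is not thread-safe; real code would use
             * per-thread generators such as rand_r or erand48. */
            double r1 = rand() / (double)RAND_MAX;
            double r2 = rand() / (double)RAND_MAX;
            v[i][d] = W * v[i][d]
                    + C1 * r1 * (pbest[i][d] - x[i][d])   /* cognitive */
                    + C2 * r2 * (gbest[d]    - x[i][d]);  /* social    */
            x[i][d] += v[i][d];
        }
    }
}
```

Since each particle's update reads only its own state plus the shared bests, the particle loop is the natural unit of parallel work in both the continuous and the discrete variants.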
Abstract:
The increasing demand for high-performance wireless communication systems has shown the inefficiency of the current model of fixed allocation of the radio spectrum. In this context, cognitive radio appears as a more efficient alternative, by providing opportunistic spectrum access with the maximum possible bandwidth. To meet these requirements, it is necessary that the transmitter identify opportunities for transmission and that the receiver recognize the parameters defined for the communication signal. Techniques based on cyclostationary analysis can be applied to both spectrum sensing and modulation classification problems, even in low signal-to-noise ratio (SNR) environments. However, despite this robustness, one of the main disadvantages of cyclostationarity is the high computational cost of calculating its functions. This work proposes efficient architectures for obtaining cyclostationary features to be employed in both spectrum sensing and automatic modulation classification (AMC). In the context of spectrum sensing, a parallelized algorithm for extracting cyclostationary features of communication signals is presented. The performance of this feature-extractor parallelization is evaluated through speedup and parallel efficiency metrics. The architecture for spectrum sensing is analyzed for several configurations of false alarm probability, SNR levels and observation times for BPSK and QPSK modulations. In the context of AMC, the reduced alpha-profile is proposed as a cyclostationary signature calculated over a reduced set of cyclic frequencies. This signature is validated by a modulation classification architecture based on pattern matching. The architecture for AMC is investigated in terms of correct classification rates for AM, BPSK, QPSK, MSK and FSK modulations, considering several scenarios of observation length and SNR levels. The numerical performance results obtained in this work show the efficiency of the proposed architectures.
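As background for the features mentioned above, the sketch below estimates the cyclic autocorrelation of a sampled complex signal at a normalized cyclic frequency alpha, using a common asymmetric-lag form. It is a generic textbook estimator, not the parallelized extractor proposed in the thesis.

```c
/* Cyclic autocorrelation estimate at normalized cyclic frequency alpha:
 *   R_x^alpha(tau) ~ (1/N) * sum_n x[n+tau] * conj(x[n]) * e^{-j*2*pi*alpha*n}
 * Peaks over alpha reveal the cyclostationary signature of the signal. */
#include <complex.h>
#include <math.h>

double complex cyclic_autocorr(const double complex *x, int n_samples,
                               int tau, double alpha)
{
    double complex acc = 0.0;
    for (int n = 0; n + tau < n_samples; n++)       /* tau >= 0 assumed */
        acc += x[n + tau] * conj(x[n]) * cexp(-I * 2.0 * M_PI * alpha * n);
    return acc / n_samples;
}
```

The computational cost the abstract points to comes from evaluating this sum over many (tau, alpha) pairs, which is also what makes the extraction embarrassingly parallel.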
Abstract:
The main goal of the present work is the dynamics of the steady-state, incompressible, laminar flow, with heat transfer, of an electrically conducting Newtonian fluid inside a flat parallel-plate channel under the action of an external uniform magnetic field. For the solution of the governing equations, written in the parabolic boundary-layer and stream-function formulation, the hybrid numerical-analytical approach known as the Generalized Integral Transform Technique (GITT) was employed. The flow is sustained by a pressure gradient, and the magnetic field is applied in the direction normal to the flow; it is assumed that this normal magnetic field is kept uniform, remaining larger than any fields generated in other directions. In order to evaluate the influence of the applied magnetic field on both entrance regions, thermal and hydrodynamic, for this forced convection problem, as well as to validate the adopted solution methodology, two kinds of channel entry conditions for the velocity field were used: a uniform and a non-MHD parabolic profile. For the thermal problem, on the other hand, only a uniform temperature profile at the channel inlet was employed as a boundary condition. Along the channel walls, the plates are maintained at constant temperatures, either equal to or different from each other. Results for the velocity and temperature fields, as well as for the main related potentials, are produced and compared, for validation purposes, with results reported in the literature, as functions of the main dimensionless governing parameters, such as the Reynolds and Hartmann numbers, for typical situations. Finally, in order to illustrate the consistency of the integral transform method, convergence analyses are also carried out and presented.
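For orientation, the display below gives the standard boundary-layer momentum balance for such a channel flow under a transverse uniform field B_0, together with the usual definition of the Hartmann number. The notation is assumed here and may differ from the thesis.

```latex
% Standard MHD boundary-layer momentum balance with a transverse
% uniform field B_0, plus the usual Hartmann number definition.
\[
  u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y}
  = -\frac{1}{\rho}\frac{dp}{dx}
  + \nu \frac{\partial^2 u}{\partial y^2}
  - \frac{\sigma B_0^2}{\rho}\, u ,
\qquad
  Ha = B_0\, L \sqrt{\frac{\sigma}{\mu}}
\]
```

The last term is the Lorentz braking force; increasing Ha flattens the velocity profile and shortens the hydrodynamic entrance region, which is the effect the work quantifies.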
Abstract:
Reconfigurable architectures have appeared as an alternative to ASICs and GPPs, keeping a balance between flexibility and performance. This work presents a proposal for modeling reconfigurable architectures with Chu spaces, describing the main subjects of this theme. The proposed solution consists of a modeling approach that uses a generalization of Chu spaces, called Chu nets, to model the configurations of a reconfigurable architecture. To validate the models, three algorithms were developed and implemented: to compose configurable logic blocks, and to detect controllability and observability in applications for reconfigurable architectures modeled by Chu nets.
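As a toy illustration of the underlying notion (not taken from the dissertation), a Chu space over K = {0,1} is a triple (A, r, X): a set of points A, a set of states X and a matrix r : A x X -> {0,1}. A configuration of a reconfigurable block can then be read as one column of r. The sizes and entries below are invented.

```c
/* Toy Chu space over {0,1}: r[a][x] = 1 iff point a holds in state x. */
#include <stdio.h>

#define NA 3   /* |A|: points (e.g., logic-block signals)  */
#define NX 4   /* |X|: states (e.g., configurations)       */

static const int r[NA][NX] = {
    {1, 0, 1, 0},
    {0, 1, 1, 0},
    {1, 1, 0, 1},
};

int main(void)
{
    for (int x = 0; x < NX; x++) {       /* print each configuration */
        printf("state %d:", x);
        for (int a = 0; a < NA; a++) printf(" %d", r[a][x]);
        printf("\n");
    }
    return 0;
}
```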
Abstract:
The use of Multiple Input Multiple Output (MIMO) systems has permitted the recent evolution of wireless communication standards. The Spatial Multiplexing MIMO technique, in particular, provides a linear gain in transmission capacity with the minimum of the numbers of transmit and receive antennas. To obtain near-capacity performance in SM-MIMO systems, a soft-decision Maximum A Posteriori Probability MIMO detector is necessary. However, such a detector is too complex for practical solutions. Hence, the goal of a MIMO detector algorithm aimed at implementation is to achieve a good approximation of the ideal detector while keeping an acceptable complexity. Moreover, the algorithm needs to be mapped to a VLSI architecture with small area and high data rate. Since Spatial Multiplexing is a recent technique, it is argued that there is still much room for the development of related algorithms and architectures. Therefore, this thesis focused on the study of suboptimal algorithms and VLSI architectures for broadband soft-decision MIMO detectors. As a result, novel algorithms were developed, starting from proposed optimizations of already established algorithms. Based on these results, new MIMO detector architectures with configurable modulation and competitive area, performance and data-rate parameters are proposed. The developed algorithms were extensively simulated and the architectures were synthesized, so that the results can serve as a reference for other works in the area.
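For context, soft-decision MIMO detectors output log-likelihood ratios for the transmitted bits; the display below shows the standard max-log approximation that suboptimal detectors try to approach at lower cost, for the linear model y = Hs + n with noise variance sigma^2. The notation is generic, not taken from the thesis.

```latex
% Max-log LLR approximation for bit b_k of the symbol vector s,
% given the received vector y = Hs + n.
\[
  L(b_k) \approx \frac{1}{\sigma^2}
  \left(
    \min_{s \,:\, b_k = 0} \lVert y - Hs \rVert^2
    - \min_{s \,:\, b_k = 1} \lVert y - Hs \rVert^2
  \right)
\]
```

The two minimizations over candidate symbol vectors are what drives detector complexity, and restricting or reordering that search is the usual lever for trading detection quality against VLSI area and data rate.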
Abstract:
This master's dissertation presents the study and implementation of intelligent algorithms to monitor the measurements of sensors involved in natural gas custody transfer processes. To create these algorithms, Artificial Neural Networks are investigated because of particular properties they have, such as learning, adaptation and prediction. A neural predictor is developed to reproduce the dynamic behavior of the sensor output, in such a way that its output is compared to the real sensor output. A recurrent neural network is used for this purpose, because of its ability to deal with dynamic information. The real sensor output and the estimated predictor output serve as the basis for the creation of possible sensor fault detection and diagnosis strategies. Two competitive neural network architectures are investigated and their capabilities are used to classify different kinds of faults. The prediction algorithm and the fault detection and classification strategies, as well as the obtained results, are presented.
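One plausible shape of such a detection strategy is sketched below: the predictor's one-step-ahead estimate is compared with the real sensor reading, and a residual that stays above a threshold for several consecutive samples flags a possible fault. The threshold, persistence count and the trivial placeholder predictor are all assumptions, not the dissertation's design.

```c
/* Sketch of residual-based sensor fault detection. */
#include <math.h>
#include <stdio.h>

#define THRESHOLD 0.05   /* hypothetical residual tolerance        */
#define PERSIST   5      /* consecutive violations before flagging */

/* Placeholder standing in for the recurrent neural predictor. */
static double predict(double prev) { return 0.9 * prev; }

int check_sensor(const double *y, int n)
{
    int count = 0;
    double y_hat = y[0];
    for (int k = 1; k < n; k++) {
        y_hat = predict(y_hat);               /* one-step-ahead estimate */
        double residual = fabs(y[k] - y_hat); /* predictor vs. sensor    */
        count = (residual > THRESHOLD) ? count + 1 : 0;
        if (count >= PERSIST) {
            printf("possible sensor fault at sample %d\n", k);
            return k;                         /* sample where flagged    */
        }
    }
    return -1;                                /* no fault detected       */
}
```

Requiring persistence rather than a single outlier is what separates a genuine drift or bias fault from ordinary measurement noise; classifying which kind of fault occurred is then delegated to the competitive networks.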
Abstract:
Increasing hydrocarbon production is the main goal of the oil-well industry worldwide. Hydraulic fracturing is often applied to achieve this goal due to a combination of attractive aspects, including easiness and low operational costs associated with a fast and highly economical response. Conventional fracturing usually involves high-flow, high-pressure pumping of a viscous fluid responsible for opening the fracture in the hydrocarbon-producing rock. The thickness of the fracture should be enough to assure the penetration of the particles of a solid proppant into the rock. The proppant is driven into the target formation by a carrier fluid. After pumping, all fluids are filtered through the faces of the fracture and penetrate the rock. The proppant remains in the fracture, holding it open and assuring high hydraulic conductivity. The present study proposes a different approach to hydraulic fracturing. Fractures with infinite conductivity are formed and used to further improve the production of highly permeable formations, as well as to produce long fractures in naturally fractured formations. Naturally open fractures with infinite conductivity are commonly encountered. They can be observed in rock outcrops and core plugs, or noticed through the total loss of circulation during drilling (even with low-density fluids), image profiles, pumping tests (Mini-Frac and Mini Fall Off), and injection tests below fracturing pressure whose flow is higher than expected for radial Darcy flow. Naturally occurring fractures are kept open by randomly shaped and placed supporting points, able to hold the faces of the fracture apart even under typical closing pressures. The approach presented herein generates an infinite-conductivity channel held open by artificially created parallel supporting areas positioned both horizontally and vertically. The size of these areas is designed so that the permeable zones are held open, supported by the impermeable areas. The England & Green equation was used to prove theoretically that the fracture can be held open by such an artificially created set of horizontal parallel supporting areas. To assess the benefits of fractures characterized by infinite conductivity, an overall comparison with finite-conductivity fractures was carried out using a series of parameters, including fracture pressure loss and dimensionless conductivity as a function of production flow, folds of increase (FOI), production flow and cumulative production as a function of time, and finally plots of net present value and productivity index.
Abstract:
The food habits and the morpho-histology of the digestive tract of the marbled swamp eel, Synbranchus marmoratus (Bloch, 1795), were investigated. The fish samples were captured from August 2007 to July 2008 in the Marechal Dutra reservoir, Acari, Rio Grande do Norte. The rainfall data were obtained from EMPARN. The fish captured were measured, weighed, dissected and eviscerated, and individual stomach weights were registered. The stomach content analyses were carried out based on the volumetric method, points, frequency of occurrence and the Index of Relative Importance. The degrees of repletion of the stomachs were determined, besides the Index of Repletion, relating feeding activity variations and frequency of ingestion to the rainy and dry seasons. The rainfall varied from 0 mm to 335 mm, with a mean value of 71.62 mm. The highest rainfall of 335.5 mm was registered in March 2008, and August to December was the dry period. During the dry period the study species presented high degrees of stomach repletion, with a peak value in the month of September (mean = 4.54, SD = 0.56). The minimum mean value of 3.99 (SD = 0.25) was registered in the month of May, during the rainy period. The stomach contents of S. marmoratus show that this fish prefers animal prey: 78.22% crustaceans, 2.85% mollusks, 3.25% fish, 1.4% insects and 13.5% semi-digested organic matter, thus characterizing the study species as a carnivore with a preference for crustaceans. The morpho-histological aspects of the digestive tract of S. marmoratus indicate that the mouth is terminal and adapted to open widely, with thin lips bearing taste buds, small villiform teeth forming a single series on the maxillae, and four pairs of branchial arches with short and widely spaced branchial rays. The oesophagus is short and cylindrical, with a small diameter. The oesophagus wall is thick, with a mucous surface and internal parallel folds. The stomach is rectilinear in form, presenting cardiac, caecal and pyloric portions. The caecal portion is long and intermediate in position between the cardiac and pyloric portions. The cardiac portion of the stomach is short and cylindrical, formed of simple cylindrical epithelium with mucous cells. The caecal portion is long, with narrow walls, a large cavity and smaller folds which give rise to gastric glands. The pyloric portion has no glands and no primary or secondary mucous folds. The morpho-histological aspects of the digestive tract of S. marmoratus indicate its adaptation to a carnivorous feeding habit.
Abstract:
In industrial informatics, several attempts have been made to develop notations and semantics for classifying and describing different kinds of system behavior, particularly in the modeling phase. Such attempts provide the infrastructure for resolving real engineering problems and for constructing practical systems that aim mainly at increasing the productivity, quality and safety of the process. Despite the many studies that have attempted to develop friendly methods for industrial controller programming, these controllers are still programmed by conventional trial-and-error methods and, in practice, there is little written documentation on these systems. The ideal solution would be a computational environment that allows industrial engineers to implement the system using a high-level language that follows international standards. Accordingly, this work proposes a methodology for the plant and control modeling of discrete event systems that include sequential, parallel and timed operations, using a formalism based on Statecharts, denominated Basic Statechart (BSC). The methodology also provides automatic procedures to validate and implement these systems. To validate the methodology, we present two case studies with typical examples from the manufacturing sector. The first example shows a sequential control for a tagged machine and is used to illustrate dependences between the devices of the plant. In the second example, we discuss more than one strategy for controlling a manufacturing cell. The model with no control has 72 states (distinct configurations); the model with sequential control generated 20 different states, acting in only 8 distinct configurations; and the model with parallel control generated 210 different states, acting in only 26 distinct configurations, a control strategy therefore less restrictive than the previous one. Lastly, we present an example that highlights the modular characteristic of our methodology, which is very important for the maintenance of applications. In this example, the sensors for identifying pieces in the plant were removed, so changes in the control model were needed to transmit the information from the input buffer sensor to the other positions of the cell.
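To make the kind of sequential control discussed above concrete, the sketch below expresses a hypothetical machine's control as a flat event-driven state machine in C; the BSC formalism adds hierarchy, parallelism and timed operations on top of this basic event-transition idea. The states and events are invented for illustration.

```c
/* Generic illustration (not the BSC formalism itself): sequential
 * control of a hypothetical machine as a flat state machine. */
typedef enum { IDLE, LOADING, PROCESSING, UNLOADING } state_t;
typedef enum { EV_PIECE_IN, EV_LOADED, EV_DONE, EV_PIECE_OUT } event_t;

/* One transition per accepted event; all other events are ignored. */
state_t step(state_t s, event_t ev)
{
    switch (s) {
    case IDLE:       return (ev == EV_PIECE_IN)  ? LOADING    : s;
    case LOADING:    return (ev == EV_LOADED)    ? PROCESSING : s;
    case PROCESSING: return (ev == EV_DONE)      ? UNLOADING  : s;
    case UNLOADING:  return (ev == EV_PIECE_OUT) ? IDLE       : s;
    }
    return s;
}
```

In a parallel control strategy, several such machines run side by side and synchronize on shared events, which is how the model reaches many global states while acting in far fewer distinct configurations.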
Abstract:
The number of applications based on embedded systems grows significantly every year. Even though embedded systems have restrictions and simple processing units, their performance has been improving every day. However, as the complexity of applications also increases, better performance will always be necessary. So, even with such advances, there are cases in which an embedded system with a single processing unit is not sufficient to carry out the information processing in real time. To improve the performance of these systems, an implementation with parallel processing can be used in more complex applications that require high performance. The idea is to move beyond applications that already use embedded systems, exploring the use of a set of processing units working together to implement an intelligent algorithm. The number of existing works in the areas of parallel processing, intelligent systems and embedded systems is large; however, works that link these three areas to solve a problem are few. In this context, this work aimed to use the tools available for FPGA architectures to develop a multiprocessor platform for use in pattern classification with artificial neural networks.
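As a minimal sketch of the workload such a platform targets, the code below computes the forward pass of a single-layer perceptron classifier; since each output neuron is independent, the outer loop is the natural unit to partition across the platform's soft processors. The layer sizes and the activation are invented for illustration, not taken from the work.

```c
/* Forward pass of a single-layer perceptron classifier. */
#include <math.h>

#define N_IN  8
#define N_OUT 3

static double sigmoid(double z) { return 1.0 / (1.0 + exp(-z)); }

void forward(const double w[N_OUT][N_IN], const double b[N_OUT],
             const double x[N_IN], double y[N_OUT])
{
    /* Each output neuron is independent: on a multiprocessor
     * platform, iterations of this loop can run on different cores. */
    for (int o = 0; o < N_OUT; o++) {
        double z = b[o];
        for (int i = 0; i < N_IN; i++)
            z += w[o][i] * x[i];          /* weighted sum of inputs */
        y[o] = sigmoid(z);                /* activation             */
    }
}
```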