765 resultados para Parallel computing
Resumo:
A parallel formulation for the simulation of a branch prediction algorithm is presented. This parallel formulation identifies independent tasks in the algorithm which can be executed concurrently. The parallel implementation is based on the multithreading model and two parallel programming platforms: pthreads and Cilk++. Improvement in execution performance by up to 7 times is observed for a generic 2-bit predictor in a 12-core multiprocessor system.
Resumo:
An important application of Big Data Analytics is the real-time analysis of streaming data. Streaming data imposes unique challenges to data mining algorithms, such as concept drifts, the need to analyse the data on the fly due to unbounded data streams and scalable algorithms due to potentially high throughput of data. Real-time classification algorithms that are adaptive to concept drifts and fast exist, however, most approaches are not naturally parallel and are thus limited in their scalability. This paper presents work on the Micro-Cluster Nearest Neighbour (MC-NN) classifier. MC-NN is based on an adaptive statistical data summary based on Micro-Clusters. MC-NN is very fast and adaptive to concept drift whilst maintaining the parallel properties of the base KNN classifier. Also MC-NN is competitive compared with existing data stream classifiers in terms of accuracy and speed.
Resumo:
A recent study conducted by Blocken et al. (Numerical study on the existence of the Venturi effect in passages between perpendicular buildings. Journal of Engineering Mechanics, 2008,134: 1021-1028) challenged the popular view of the existence of the ‘Venturi effect’ in building passages as the wind is exposed to an open boundary. The present research extends the work of Blocken et al. (2008a) into a more general setup with the building orientation varying from 0° to 180° using CFD simulations. Our results reveal that the passage flow is mainly determined by the combination of corner streams. It is also shown that converging passages have a higher wind-blocking effect compared to diverging passages, explained by a lower wind speed and higher drag coefficient. Fluxes on the top plane of the passage volume reverse from outflow to inflow in the cases of α=135°, 150° and 165°. A simple mathematical expression to explain the relationship between the flux ratio and the geometric parameters has been developed to aid wind design in an urban neighborhood. In addition, a converging passage with α=15° is recommended for urban wind design in cold and temperate climates since the passage flow changes smoothly and a relatively lower wind speed is expected compared with that where there are no buildings. While for the high-density urban area in (sub)tropical climates such as Hong Kong where there is a desire for more wind, a diverging passage with α=150° is a better choice to promote ventilation at the pedestrian level.
Resumo:
We extend the method of Cassels for computing the Cassels-Tate pairing on the 2-Selmer group of an elliptic curve, to the case of 3-Selmer groups. This requires significant modifications to both the local and global parts of the calculation. Our method is practical in sufficiently small examples, and can be used to improve the upper bound for the rank of an elliptic curve obtained by 3-descent.
Resumo:
In this paper we describe the development of a program that aims at the optimal integration of observed data in an oceanographic model describ
MAGNETOHYDRODYNAMIC SIMULATIONS OF RECONNECTION AND PARTICLE ACCELERATION: THREE-DIMENSIONAL EFFECTS
Resumo:
Magnetic fields can change their topology through a process known as magnetic reconnection. This process in not only important for understanding the origin and evolution of the large-scale magnetic field, but is seen as a possibly efficient particle accelerator producing cosmic rays mainly through the first-order Fermi process. In this work we study the properties of particle acceleration inserted in reconnection zones and show that the velocity component parallel to the magnetic field of test particles inserted in magnetohydrodynamic (MHD) domains of reconnection without including kinetic effects, such as pressure anisotropy, the Hall term, or anomalous effects, increases exponentially. Also, the acceleration of the perpendicular component is always possible in such models. We find that within contracting magnetic islands or current sheets the particles accelerate predominantly through the first-order Fermi process, as previously described, while outside the current sheets and islands the particles experience mostly drift acceleration due to magnetic field gradients. Considering two-dimensional MHD models without a guide field, we find that the parallel acceleration stops at some level. This saturation effect is, however, removed in the presence of an out-of-plane guide field or in three-dimensional models. Therefore, we stress the importance of the guide field and fully three-dimensional studies for a complete understanding of the process of particle acceleration in astrophysical reconnection environments.
Resumo:
GPR (Ground Penetrating Radar) results are shown for perpendicular broadside and parallel broadside antenna orientations. Performance in detection and localization of concrete tubes and steel tanks is compared as a function of acquisition configuration. The comparison is done using 100 MHz and 200 MHz center frequency antennas. All tubes and tanks are buried at the geophysical test site of IAG/USP in Sao Paulo city, Brazil. The results show that the long steel pipe with a 38-mm diameter was well detected with the perpendicular broadside configuration. The concrete tubes were better detected with the parallel broadside configuration, clearly showing hyperbolic diffraction events from all targets up to 2-m depth. Steel tanks were detected with the two configurations. However, the parallel broadside configuration was generated to a much lesser extent an apparent hyperbolic reflection corresponding to constructive interference of diffraction hyperbolas of adjacent targets placed at the same depth. Vertical concrete tubes and steel tanks were better contained with parallel broadside antennas, where the apexes of the diffraction hyperbolas better corresponded to the horizontal location of the buried target disposition. The two configurations provide details about buried targets emphasizing how GPR multi-component configurations have the potential to improve the subsurface image quality as well as to discriminate different buried targets. It is judged that they hold some applicability in geotechnical and geoscientific studies. (C) 2009 Elsevier B.V. All rights reserved.
Resumo:
This paper proposes a parallel hardware architecture for image feature detection based on the Scale Invariant Feature Transform algorithm and applied to the Simultaneous Localization And Mapping problem. The work also proposes specific hardware optimizations considered fundamental to embed such a robotic control system on-a-chip. The proposed architecture is completely stand-alone; it reads the input data directly from a CMOS image sensor and provides the results via a field-programmable gate array coupled to an embedded processor. The results may either be used directly in an on-chip application or accessed through an Ethernet connection. The system is able to detect features up to 30 frames per second (320 x 240 pixels) and has accuracy similar to a PC-based implementation. The achieved system performance is at least one order of magnitude better than a PC-based solution, a result achieved by investigating the impact of several hardware-orientated optimizations oil performance, area and accuracy.
Resumo:
In 2006 the Route load balancing algorithm was proposed and compared to other techniques aiming at optimizing the process allocation in grid environments. This algorithm schedules tasks of parallel applications considering computer neighborhoods (where the distance is defined by the network latency). Route presents good results for large environments, although there are cases where neighbors do not have an enough computational capacity nor communication system capable of serving the application. In those situations the Route migrates tasks until they stabilize in a grid area with enough resources. This migration may take long time what reduces the overall performance. In order to improve such stabilization time, this paper proposes RouteGA (Route with Genetic Algorithm support) which considers historical information on parallel application behavior and also the computer capacities and load to optimize the scheduling. This information is extracted by using monitors and summarized in a knowledge base used to quantify the occupation of tasks. Afterwards, such information is used to parameterize a genetic algorithm responsible for optimizing the task allocation. Results confirm that RouteGA outperforms the load balancing carried out by the original Route, which had previously outperformed others scheduling algorithms from literature.
Resumo:
In this paper an analytical solution of the temperature of an opaque material containing two overlapping and parallel subsurface cylinders, illuminated by a modulated light beam, is presented. The method is based on the expansion of plane and cylindrical thermal waves in series of Bessel and Hankel functions. This model is addressed to the study of heat propagation in composite materials with interconnection between inclusions, as is the case of inverse opals and fiber reinforced composites. Measurements on calibrated samples using lock-in infrared thermography confirm the validity of the model.
Resumo:
A novel cryptography method based on the Lorenz`s attractor chaotic system is presented. The proposed algorithm is secure and fast, making it practical for general use. We introduce the chaotic operation mode, which provides an interaction among the password, message and a chaotic system. It ensures that the algorithm yields a secure codification, even if the nature of the chaotic system is known. The algorithm has been implemented in two versions: one sequential and slow and the other, parallel and fast. Our algorithm assures the integrity of the ciphertext (we know if it has been altered, which is not assured by traditional algorithms) and consequently its authenticity. Numerical experiments are presented, discussed and show the behavior of the method in terms of security and performance. The fast version of the algorithm has a performance comparable to AES, a popular cryptography program used commercially nowadays, but it is more secure, which makes it immediately suitable for general purpose cryptography applications. An internet page has been set up, which enables the readers to test the algorithm and also to try to break into the cipher.
Resumo:
We present parallel algorithms on the BSP/CGM model, with p processors, to count and generate all the maximal cliques of a circle graph with n vertices and m edges. To count the number of all the maximal cliques, without actually generating them, our algorithm requires O(log p) communication rounds with O(nm/p) local computation time. We also present an algorithm to generate the first maximal clique in O(log p) communication rounds with O(nm/p) local computation, and to generate each one of the subsequent maximal cliques this algorithm requires O(log p) communication rounds with O(m/p) local computation. The maximal cliques generation algorithm is based on generating all maximal paths in a directed acyclic graph, and we present an algorithm for this problem that uses O(log p) communication rounds with O(m/p) local computation for each maximal path. We also show that the presented algorithms can be extended to the CREW PRAM model.
Resumo:
The InteGrade project is a multi-university effort to build a novel grid computing middleware based on the opportunistic use of resources belonging to user workstations. The InteGrade middleware currently enables the execution of sequential, bag-of-tasks, and parallel applications that follow the BSP or the MPI programming models. This article presents the lessons learned over the last five years of the InteGrade development and describes the solutions achieved concerning the support for robust application execution. The contributions cover the related fields of application scheduling, execution management, and fault tolerance. We present our solutions, describing their implementation principles and evaluation through the analysis of several experimental results. (C) 2010 Elsevier Inc. All rights reserved.