11 resultados para Parallel programming (computer science)
em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Resumo:
This paper presents a new parallel methodology for calculating the determinant of matrices of the order n, with computational complexity O(n), using the Gauss-Jordan Elimination Method and Chio's Rule as references. We intend to present our step-by-step methodology using clear mathematical language, where we will demonstrate how to calculate the determinant of a matrix of the order n in an analytical format. We will also present a computational model with one sequential algorithm and one parallel algorithm using a pseudo-code.
Resumo:
Consider the NP-hard problem of, given a simple graph G, to find a series-parallel subgraph of G with the maximum number of edges. The algorithm that, given a connected graph G, outputs a spanning tree of G, is a 1/2-approximation. Indeed, if n is the number of vertices in G, any spanning tree in G has n-1 edges and any series-parallel graph on n vertices has at most 2n-3 edges. We present a 7/12 -approximation for this problem and results showing the limits of our approach.
Resumo:
Field-Programmable Gate Arrays (FPGAs) are becoming increasingly important in embedded and high-performance computing systems. They allow performance levels close to the ones obtained with Application-Specific Integrated Circuits, while still keeping design and implementation flexibility. However, to efficiently program FPGAs, one needs the expertise of hardware developers in order to master hardware description languages (HDLs) such as VHDL or Verilog. Attempts to furnish a high-level compilation flow (e.g., from C programs) still have to address open issues before broader efficient results can be obtained. Bearing in mind an FPGA available resources, it has been developed LALP (Language for Aggressive Loop Pipelining), a novel language to program FPGA-based accelerators, and its compilation framework, including mapping capabilities. The main ideas behind LALP are to provide a higher abstraction level than HDLs, to exploit the intrinsic parallelism of hardware resources, and to allow the programmer to control execution stages whenever the compiler techniques are unable to generate efficient implementations. Those features are particularly useful to implement loop pipelining, a well regarded technique used to accelerate computations in several application domains. This paper describes LALP, and shows how it can be used to achieve high-performance computing solutions.
Resumo:
In this work we introduce a relaxed version of the constant positive linear dependence constraint qualification (CPLD) that we call RCPLD. This development is inspired by a recent generalization of the constant rank constraint qualification by Minchenko and Stakhovski that was called RCRCQ. We show that RCPLD is enough to ensure the convergence of an augmented Lagrangian algorithm and that it asserts the validity of an error bound. We also provide proofs and counter-examples that show the relations of RCRCQ and RCPLD with other known constraint qualifications. In particular, RCPLD is strictly weaker than CPLD and RCRCQ, while still stronger than Abadie's constraint qualification. We also verify that the second order necessary optimality condition holds under RCRCQ.
Resumo:
The web services (WS) technology provides a comprehensive solution for representing, discovering, and invoking services in a wide variety of environments, including Service Oriented Architectures (SOA) and grid computing systems. At the core of WS technology lie a number of XML-based standards, such as the Simple Object Access Protocol (SOAP), that have successfully ensured WS extensibility, transparency, and interoperability. Nonetheless, there is an increasing demand to enhance WS performance, which is severely impaired by XML's verbosity. SOAP communications produce considerable network traffic, making them unfit for distributed, loosely coupled, and heterogeneous computing environments such as the open Internet. Also, they introduce higher latency and processing delays than other technologies, like Java RMI and CORBA. WS research has recently focused on SOAP performance enhancement. Many approaches build on the observation that SOAP message exchange usually involves highly similar messages (those created by the same implementation usually have the same structure, and those sent from a server to multiple clients tend to show similarities in structure and content). Similarity evaluation and differential encoding have thus emerged as SOAP performance enhancement techniques. The main idea is to identify the common parts of SOAP messages, to be processed only once, avoiding a large amount of overhead. Other approaches investigate nontraditional processor architectures, including micro-and macrolevel parallel processing solutions, so as to further increase the processing rates of SOAP/XML software toolkits. This survey paper provides a concise, yet comprehensive review of the research efforts aimed at SOAP performance enhancement. A unified view of the problem is provided, covering almost every phase of SOAP processing, ranging over message parsing, serialization, deserialization, compression, multicasting, security evaluation, and data/instruction-level processing.
Resumo:
There are some variants of the widely used Fuzzy C-Means (FCM) algorithm that support clustering data distributed across different sites. Those methods have been studied under different names, like collaborative and parallel fuzzy clustering. In this study, we offer some augmentation of the two FCM-based clustering algorithms used to cluster distributed data by arriving at some constructive ways of determining essential parameters of the algorithms (including the number of clusters) and forming a set of systematically structured guidelines such as a selection of the specific algorithm depending on the nature of the data environment and the assumptions being made about the number of clusters. A thorough complexity analysis, including space, time, and communication aspects, is reported. A series of detailed numeric experiments is used to illustrate the main ideas discussed in the study.
Resumo:
This article describes a real-world production planning and scheduling problem occurring at an integrated pulp and paper mill (P&P) which manufactures paper for cardboard out of produced pulp. During the cooking of wood chips in the digester, two by-products are produced: the pulp itself (virgin fibers) and the waste stream known as black liquor. The former is then mixed with recycled fibers and processed in a paper machine. Here, due to significant sequence-dependent setups in paper type changeovers, sizing and sequencing of lots have to be made simultaneously in order to efficiently use capacity. The latter is converted into electrical energy using a set of evaporators, recovery boilers and counter-pressure turbines. The planning challenge is then to synchronize the material flow as it moves through the pulp and paper mills, and energy plant, maximizing customer demand (as backlogging is allowed), and minimizing operation costs. Due to the intensive capital feature of P&P, the output of the digester must be maximized. As the production bottleneck is not fixed, to tackle this problem we propose a new model that integrates the critical production units associated to the pulp and paper mills, and energy plant for the first time. Simple stochastic mixed integer programming based local search heuristics are developed to obtain good feasible solutions for the problem. The benefits of integrating the three stages are discussed. The proposed approaches are tested on real-world data. Our work may help P&P companies to increase their competitiveness and reactiveness in dealing with demand pattern oscillations. (C) 2012 Elsevier Ltd. All rights reserved.
Resumo:
Breakthrough advances in microprocessor technology and efficient power management have altered the course of development of processors with the emergence of multi-core processor technology, in order to bring higher level of processing. The utilization of many-core technology has boosted computing power provided by cluster of workstations or SMPs, providing large computational power at an affordable cost using solely commodity components. Different implementations of message-passing libraries and system softwares (including Operating Systems) are installed in such cluster and multi-cluster computing systems. In order to guarantee correct execution of message-passing parallel applications in a computing environment other than that originally the parallel application was developed, review of the application code is needed. In this paper, a hybrid communication interfacing strategy is proposed, to execute a parallel application in a group of computing nodes belonging to different clusters or multi-clusters (computing systems may be running different operating systems and MPI implementations), interconnected with public or private IP addresses, and responding interchangeably to user execution requests. Experimental results demonstrate the feasibility of this proposed strategy and its effectiveness, through the execution of benchmarking parallel applications.
Resumo:
Current scientific applications have been producing large amounts of data. The processing, handling and analysis of such data require large-scale computing infrastructures such as clusters and grids. In this area, studies aim at improving the performance of data-intensive applications by optimizing data accesses. In order to achieve this goal, distributed storage systems have been considering techniques of data replication, migration, distribution, and access parallelism. However, the main drawback of those studies is that they do not take into account application behavior to perform data access optimization. This limitation motivated this paper which applies strategies to support the online prediction of application behavior in order to optimize data access operations on distributed systems, without requiring any information on past executions. In order to accomplish such a goal, this approach organizes application behaviors as time series and, then, analyzes and classifies those series according to their properties. By knowing properties, the approach selects modeling techniques to represent series and perform predictions, which are, later on, used to optimize data access operations. This new approach was implemented and evaluated using the OptorSim simulator, sponsored by the LHC-CERN project and widely employed by the scientific community. Experiments confirm this new approach reduces application execution time in about 50 percent, specially when handling large amounts of data.
The boundedness of penalty parameters in an augmented Lagrangian method with constrained subproblems
Resumo:
Augmented Lagrangian methods are effective tools for solving large-scale nonlinear programming problems. At each outer iteration, a minimization subproblem with simple constraints, whose objective function depends on updated Lagrange multipliers and penalty parameters, is approximately solved. When the penalty parameter becomes very large, solving the subproblem becomes difficult; therefore, the effectiveness of this approach is associated with the boundedness of the penalty parameters. In this paper, it is proved that under more natural assumptions than the ones employed until now, penalty parameters are bounded. For proving the new boundedness result, the original algorithm has been slightly modified. Numerical consequences of the modifications are discussed and computational experiments are presented.
Resumo:
Abstract Background Several mathematical and statistical methods have been proposed in the last few years to analyze microarray data. Most of those methods involve complicated formulas, and software implementations that require advanced computer programming skills. Researchers from other areas may experience difficulties when they attempting to use those methods in their research. Here we present an user-friendly toolbox which allows large-scale gene expression analysis to be carried out by biomedical researchers with limited programming skills. Results Here, we introduce an user-friendly toolbox called GEDI (Gene Expression Data Interpreter), an extensible, open-source, and freely-available tool that we believe will be useful to a wide range of laboratories, and to researchers with no background in Mathematics and Computer Science, allowing them to analyze their own data by applying both classical and advanced approaches developed and recently published by Fujita et al. Conclusion GEDI is an integrated user-friendly viewer that combines the state of the art SVR, DVAR and SVAR algorithms, previously developed by us. It facilitates the application of SVR, DVAR and SVAR, further than the mathematical formulas present in the corresponding publications, and allows one to better understand the results by means of available visualizations. Both running the statistical methods and visualizing the results are carried out within the graphical user interface, rendering these algorithms accessible to the broad community of researchers in Molecular Biology.