18 results for performance analysis
at Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
Performance analysis is the task of monitoring the behavior of a program's execution. The main goal is to find out the possible adjustments that might be made in order to improve performance. To obtain that improvement, it is necessary to find the different causes of overhead. Nowadays we are already in the multicore era, but there is a gap between the level of development of the two main divisions of multicore technology (hardware and software). When we talk about multicore we are also speaking of shared-memory systems; this master's thesis addresses the issues involved in the performance analysis and tuning of applications running specifically on a shared-memory system. We move one step ahead and take performance analysis to another level by analyzing the applications' structure and patterns. We also present some tools specifically addressed to the performance analysis of OpenMP multithreaded applications. At the end we present the results of some experiments performed with a set of OpenMP scientific applications.
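As a minimal illustration of the kind of shared-memory measurement such an analysis relies on (not taken from the thesis; the loop and variable names are hypothetical), the following C/OpenMP fragment times each thread's share of a parallel loop, exposing load imbalance, one common source of overhead:

```c
#include <omp.h>
#include <stdio.h>

#define N 4000000

int main(void) {
    static double a[N];
    double tmax = 0.0, tmin = 1e30;

    /* min/max reductions require OpenMP 3.1 or later. */
    #pragma omp parallel reduction(max:tmax) reduction(min:tmin)
    {
        double t0 = omp_get_wtime();
        /* nowait: measure each thread's own work, not the barrier wait. */
        #pragma omp for schedule(static) nowait
        for (long i = 0; i < N; i++)
            a[i] = (double)i * 0.5;
        double t = omp_get_wtime() - t0;
        tmax = t;   /* per-thread elapsed time; reduced to max over threads */
        tmin = t;   /* reduced to min over threads */
    }
    /* A large gap between slowest and fastest thread points to imbalance. */
    printf("threads=%d  slowest=%.6fs  fastest=%.6fs\n",
           omp_get_max_threads(), tmax, tmin);
    return 0;
}
```

Built with, e.g., gcc -fopenmp, a persistent gap between the slowest and fastest thread is the kind of symptom that the structural and pattern analysis described above would then try to attribute to a concrete cause.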
Abstract:
Earth System Models (ESMs) have been successfully developed over the past few years, and are currently being used for simulating present-day climate and for seasonal to interannual predictions of climate change. Supercomputer performance plays an important role in climate modeling, since one of the challenging issues for climate modellers is to couple Earth System components efficiently and accurately on present-day computer architectures. At the Barcelona Supercomputing Center (BSC), we work with the EC-Earth System Model. EC-Earth is an ESM which currently consists of an atmosphere (IFS) and an ocean (NEMO) model that communicate with each other through the OASIS coupler. Additional modules (e.g. for chemistry and vegetation) are under development. The EC-Earth ESM has been ported successfully to different high-performance computing platforms (e.g. IBM P6 AIX, CRAY XT-5, Intel-based Linux clusters, SGI Altix) at different sites in Europe (e.g. KNMI, ICHEC, ECMWF). The objective of the first phase of the project was to identify and document the issues related to the portability and performance of EC-Earth on the MareNostrum supercomputer, a system based on IBM PowerPC 970MP processors running a SUSE Linux distribution. EC-Earth was successfully ported to MareNostrum, and a compilation incompatibility was solved by a two-step compilation approach using the XLF version 10.1 and 12.1 compilers. In addition, the performance of EC-Earth was analyzed with respect to scalability, and traces were analyzed with the Paraver software. This analysis showed that running EC-Earth with a larger number of IFS CPUs (>128) is not feasible at the moment, since some issues exist with the IFS-NEMO balance and MPI communications.
Abstract:
The computer simulation of reaction dynamics has nowadays reached a remarkable degree of accuracy. Triatomic elementary reactions are rigorously studied in great detail on a straightforward basis using a considerable variety of quantum dynamics computational tools available to the scientific community. In our contribution we compare the performance of two quantum scattering codes in the computation of reaction cross sections of a triatomic benchmark reaction, the gas-phase reaction Ne + H2+ → NeH+ + H. The computational codes are selected as representative of time-dependent (Real Wave Packet [ ]) and time-independent (ABC [ ]) methodologies. The main conclusion to be drawn from our study is that both strategies are, to a great extent, not competing but rather complementary. While time-dependent calculations offer advantages with respect to the energy range that can be covered in a single simulation, time-independent approaches offer much more detailed information for each single-energy calculation. Further details, such as the calculation of reactivity at very low collision energies or the computational effort related to accounting for the Coriolis couplings, are analyzed in this paper.
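For context (the expression below is the standard partial-wave formula of quantum reactive scattering, not quoted from the paper), both kinds of code ultimately deliver the total reaction cross section by summing rotationally resolved reaction probabilities:

```latex
% Total reaction cross section from J-resolved reaction probabilities,
% with k(E) the initial translational wavenumber at collision energy E.
\sigma_{R}(E) \;=\; \frac{\pi}{k^{2}(E)} \sum_{J=0}^{J_{\max}} (2J+1)\, P^{J}(E)
```

A single wave-packet propagation yields $P^{J}(E)$ over a whole range of energies for one $J$, whereas a time-independent calculation solves the scattering problem at one fixed energy but resolves it in full detail, which is the complementarity the abstract refers to.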
Abstract:
A comparative performance analysis of four geolocation methods in terms of their theoretical root mean square positioning errors is provided. Comparison is established in two different ways: strict and average. In the strict type, methods are examined for a particular geometric configuration of base stations (BSs) with respect to the mobile position, which determines a given noise profile affecting the respective time-of-arrival (TOA) or time-difference-of-arrival (TDOA) estimates. In the average type, methods are evaluated in terms of the expected covariance matrix of the position error over an ensemble of random geometries, so that the comparison is geometry independent. Exact semianalytical equations and associated lower bounds (depending solely on the noise profile) are obtained for the average covariance matrix of the position error in terms of the so-called information matrix specific to each geolocation method. Statistical channel models inferred from field trials are used to define realistic prior probabilities for the random geometries. A final evaluation provides extensive results relating the expected position error to channel model parameters and the number of base stations.
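To fix notation (a generic formulation, not the paper's exact equations), the per-geometry and average figures of merit compared in such a study can be written as:

```latex
% Strict (per-geometry) comparison: error covariance bounded below by the
% inverse information matrix J(x; g) of the method, for mobile position x
% and BS geometry g.
\mathbf{C}(\mathbf{x}; g) \;\succeq\; \mathbf{J}^{-1}(\mathbf{x}; g),
\qquad
\mathrm{RMSE}(\mathbf{x}; g) \;=\; \sqrt{\operatorname{tr}\,\mathbf{C}(\mathbf{x}; g)}

% Average comparison: expectation over an ensemble of random geometries g
% with prior p(g) inferred from field-trial channel models.
\bar{\mathbf{C}}(\mathbf{x}) \;=\; \mathbb{E}_{g}\!\left[\mathbf{C}(\mathbf{x}; g)\right]
\;=\; \int \mathbf{C}(\mathbf{x}; g)\, p(g)\, \mathrm{d}g
```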
Abstract:
Architectural design and deployment of Peer-to-Peer Video-on-Demand (P2P-VoD) systems that support VCR functionalities are attracting the interest of an increasing number of research groups within the scientific community, especially due to the intrinsic characteristics of such systems and the benefits that peers could provide in reducing the server load. This work focuses on the performance analysis of a P2P-VoD system considering user behaviors obtained from real traces together with other synthetic user patterns. The experiments performed show that it is feasible to achieve performance close to the best possible. Future work will consider monitoring the physical characteristics of the network in order to improve the design of different aspects of a VoD system.
Abstract:
In terms of execution time and data usage, parallel/distributed applications can exhibit variable executions, even when the same input data set is used. Certain environment-related performance aspects can dynamically affect the application's behavior, such as memory capacity, network latency, the number of nodes, and node heterogeneity, among others. It is important to consider that the application may run on different hardware configurations, and the application developer cannot guarantee that performance adjustments made for one particular system will remain valid for other configurations. Dynamic analysis of applications has proven to be the best approach to performance analysis for two main reasons. First, it offers a very convenient solution from the developers' point of view while they design and evaluate their parallel applications. Second, it adapts better to the application during execution. This approach does not require the intervention of developers or even access to the application's source code. The application is analyzed at run time, and the search for possible bottlenecks and optimizations is considered and carried out then. To optimize the execution of the bioinformatics application mpiBLAST, we analyzed its behavior in order to identify the parameters involved in its performance, such as: memory usage, network usage, I/O patterns, the file system used, the processor architecture, the size of the biological database, the size of the query sequence, the distribution of the sequences within them, the number of database fragments, and/or the granularity of the jobs assigned to each process. Our goal is to determine which of these parameters have the greatest impact on application performance and how to adjust them dynamically to improve it. Analyzing the performance of mpiBLAST, we found a set of data that reveals a certain level of serialization within the execution. By recognizing the impact of the characterization of the sequences within the different databases, and a relation between the workers' capacity and the granularity of the current workload, these could be tuned dynamically. Other improvements also include optimizations related to the parallel file system and the possibility of execution on multiple multicore nodes. The work grain size is influenced by factors such as the database type, the database size, and the relation between workload size and worker capacity.
Abstract:
Multi-core processors and hardware multithreading make it possible to increase application performance. On the one hand, multi-core processors combine two or more processor cores on a single chip. On the other hand, hardware multithreading is a technique that increases the utilization of processor resources. This work presents a performance analysis of the results obtained for two applications, dense matrix multiplication and the fast Fourier transform. Both applications were executed on multi-core architectures that exploit thread-level parallelism but with different multithreading models. The results obtained show the importance of understanding and knowing how to analyze the effect of multi-core and multithreading on performance.
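As a minimal sketch of the first of the two kernels (dense matrix multiplication; the loop order and problem size here are illustrative, not the code evaluated in the work), a thread-parallel C version looks like this:

```c
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

/* Thread-parallel dense matrix multiply C = A * B (row-major, n x n).
   Each OpenMP thread computes a block of rows; on a multi-core or
   hardware-multithreaded CPU the threads map onto cores/hardware threads. */
static void matmul(int n, const double *A, const double *B, double *C) {
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++) {          /* i-k-j order: unit-stride inner loop */
            double aik = A[i * n + k];
            for (int j = 0; j < n; j++)
                C[i * n + j] += aik * B[k * n + j];
        }
}

int main(void) {
    int n = 512;
    double *A = calloc((size_t)n * n, sizeof *A);
    double *B = calloc((size_t)n * n, sizeof *B);
    double *C = calloc((size_t)n * n, sizeof *C);
    for (int i = 0; i < n * n; i++) { A[i] = 1.0; B[i] = 2.0; }

    double t0 = omp_get_wtime();
    matmul(n, A, B, C);
    double t = omp_get_wtime() - t0;

    printf("%d threads: %.3f s, %.2f GFLOP/s\n",
           omp_get_max_threads(), t, 2.0 * n * n * (double)n / t / 1e9);
    free(A); free(B); free(C);
    return 0;
}
```

Running such a kernel with the thread count swept from one up to the number of hardware threads is the kind of experiment whose results the analysis above interprets.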
Abstract:
The aim of the present work is to investigate innovative processes within a geographical cluster, and thus contribute to the debate on the effects of industrial clusters on innovation capacity. In particular, we would like to ascertain whether the advantages of industrial districts in promoting innovation already revealed by the literature (diffusion of knowledge, social capital and trust, efficient networking) are also keys to success in the Tuscan shipbuilding industry of pleasure and sporting boats. First, we verify the existence of shipbuilding clusters in Tuscany, using a specific methodology. Next, in the identified clusters, we analyse three innovative networks financed under a policy to support innovation, and examine whether the typical features of a cluster for promoting innovation are at work, using a questionnaire administered to 71 actors. Finally, we develop a performance analysis of the cluster firms and ascertain whether their different behaviours also lead to different performances. The results of the analysis show that our case records effects of industrial clustering on innovation capacity, such as the important role given to trust and social capital, the significant value placed on interfirm relations and on each partner's specific competencies, and even the distinctive performance of firms belonging to a cluster.
Abstract:
High-performance computing is currently used in a multitude of scientific fields, where the problems under study are solved by means of parallel/distributed applications. These applications require great computing capacity, whether because of the complexity of the problems or because of the need to handle real-time situations. The resources and high computational capacity of the parallel systems on which these applications run must therefore be exploited in order to obtain good performance. However, achieving this performance for an application running on a given system is a hard task that requires a high degree of expertise, especially when dealing with applications that exhibit dynamic behavior or when heterogeneous systems are used. In these cases, automatic and dynamic performance improvement of applications is currently put forward as the best approach to performance analysis. This research work falls within this field of study, and its main objective is to dynamically tune, by means of MATE (Monitoring, Analysis and Tuning Environment), an MPI application used in high-performance computing that follows a Master/Worker paradigm. The tuning techniques integrated into MATE have been developed from the study of a performance model that reflects the bottlenecks typical of applications under a Master/Worker paradigm: load balancing and the number of workers. Running the chosen application under the dynamic control of MATE and the implemented tuning strategy has made it possible to observe how the application's behavior adapts to the current conditions of the system on which it runs, thus improving its performance.
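A minimal sketch of the Master/Worker pattern whose two bottlenecks (load balancing and number of workers) MATE tunes; this is an illustrative self-scheduling layout with hypothetical task counts, not MATE itself nor the application studied in the thesis. The master hands out task indices on demand, so faster workers automatically receive more work:

```c
#include <mpi.h>
#include <stdio.h>

#define NTASKS   100   /* assumes NTASKS >= number of workers */
#define TAG_WORK 1
#define TAG_STOP 2

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                          /* master */
        int next = 0, result;
        MPI_Status st;
        for (int w = 1; w < size && next < NTASKS; w++) {  /* seed each worker */
            MPI_Send(&next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);
            next++;
        }
        for (int done = 0; done < NTASKS; done++) {
            MPI_Recv(&result, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);
            if (next < NTASKS) {              /* idle worker gets a new task */
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK,
                         MPI_COMM_WORLD);
                next++;
            } else {                          /* no work left: tell it to stop */
                MPI_Send(&next, 0, MPI_INT, st.MPI_SOURCE, TAG_STOP,
                         MPI_COMM_WORLD);
            }
        }
    } else {                                  /* workers */
        int task, res;
        MPI_Status st;
        while (1) {
            MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP) break;
            res = task * task;                /* placeholder for the real work */
            MPI_Send(&res, 1, MPI_INT, 0, TAG_WORK, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}
```

The number of worker ranks (size - 1) and the size of each task are exactly the knobs that a dynamic tuning environment like MATE would adjust at run time against its performance model.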
Abstract:
During the last decade, interest in space-borne Synthetic Aperture Radars (SAR) for remote sensing applications has grown, as testified by the number of recent and forthcoming missions such as TerraSAR-X, RADARSAT-2, COSMO-SkyMed, TanDEM-X and the Spanish SEOSAR/PAZ. In this sense, this thesis proposes to study and analyze the performance of state-of-the-art space-borne SAR systems with modes able to provide Moving Target Indication (MTI) capabilities, i.e. moving object detection and estimation. The research will focus on the MTI processing techniques as well as the architecture and/or configuration of the SAR instrument, setting out the limitations of the current systems with MTI capabilities and proposing efficient solutions for future missions. Two European projects, to which the Universitat Politècnica de Catalunya provides support, are an excellent framework for the research activities suggested in this thesis. The NEWA project proposes a potential European space-borne radar system with MTI capabilities in order to fulfill the upcoming European security policies. This thesis will critically review the state-of-the-art MTI processing techniques as well as the readiness and maturity level of the developed capabilities. For each of the techniques, a performance analysis will be carried out based on the available technologies, deriving a roadmap and identifying the different technological gaps. In line with this study, a simulator tool will be developed in order to validate and evaluate different MTI techniques on the basis of a flexible space-borne radar configuration. The calibration of a SAR system is mandatory for the accurate formation of SAR images and turns out to be critical in advanced operation modes such as MTI. In this sense, the SEOSAR/PAZ project proposes the study and estimation of the radiometric budget. This thesis will also focus on an exhaustive analysis of the radiometric budget, considering the current calibration concepts and their possible limitations. In the framework of this project, a key point will be the study of the Dual Receive Antenna (DRA) mode, which provides MTI capabilities to the mission. An additional aspect under study is the applicability of Digital Beamforming to multichannel and/or multistatic radar platforms, which constitute potential solutions for the NEWA project, with the aim of fully exploiting its capability jointly with MTI techniques.
Abstract:
In this correspondence, we propose applying the hidden Markov model (HMM) theory to the problem of blind channel estimation and data detection. The Baum–Welch (BW) algorithm, which is able to estimate all the parameters of the model, is enriched by introducing some linear constraints emerging from a linear FIR hypothesis on the channel. Additionally, a version of the algorithm that is suitable for time-varying channels is also presented. Performance is analyzed in a GSM environment using standard test channels and is found to be close to that obtained with a nonblind receiver.
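As background (a generic statement of the linear FIR hypothesis, not the paper's own notation), the channel model that induces the HMM structure is:

```latex
% Linear FIR channel of memory L observed in additive noise:
x_{n} \;=\; \sum_{k=0}^{L-1} h_{k}\, s_{n-k} \;+\; w_{n}
% The HMM state at time n collects the last L transmitted symbols,
%   q_{n} = (s_{n}, s_{n-1}, \dots, s_{n-L+1}),
% so the conditional mean of the observation x_n given q_n is linear in the
% channel taps h_k. The Baum--Welch algorithm re-estimates the model
% parameters, and the linear constraints mentioned above tie the HMM emission
% means to this FIR form.
```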
Abstract:
The low levels of unemployment recorded in the UK in recent years are widely cited as evidence of the country's improved economic performance, and the apparent convergence of unemployment rates across the country's regions is used to suggest that the longstanding divide in living standards between the relatively prosperous 'south' and the more depressed 'north' has been substantially narrowed. Dissenters from these conclusions have drawn attention to the greatly increased extent of non-employment (around a quarter of the UK's working-age population are not in employment) and the marked regional dimension in its distribution across the country. Amongst these dissenters it is generally agreed that non-employment is concentrated amongst older males previously employed in the now very much smaller 'heavy' industries (e.g. coal, steel, shipbuilding). This paper uses the tools of compositional data analysis to provide a much richer picture of non-employment, and one which challenges the conventional wisdom about UK labour market performance as well as the dissenters' view of the nature of the problem. It is shown that, associated with the striking 'north/south' divide in non-employment rates, there is a statistically significant relationship between the size of the non-employment rate and the composition of non-employment. Specifically, it is shown that the share of unemployment in non-employment is negatively correlated with the overall non-employment rate: in regions where the non-employment rate is high, the share of unemployment is relatively low. So the unemployment rate is not a very reliable indicator of regional disparities in labour market performance. Even more importantly from a policy viewpoint, a significant positive relationship is found between the size of the non-employment rate and the share of those not employed by reason of sickness or disability, and it seems (contrary to the dissenters) that this connection is just as strong for women as it is for men.
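To make the composition explicit (the definitions below are standard labour-market accounting, not quoted from the paper), the non-employment rate and the share of unemployment within it can be written as:

```latex
% Working-age population = Employed (E) + Unemployed (U) + Inactive (I).
\text{non-employment rate:}\quad n \;=\; \frac{U + I}{E + U + I},
\qquad
\text{unemployment share of non-employment:}\quad s_{U} \;=\; \frac{U}{U + I}
% The paper's finding is that s_U and n are negatively correlated across
% regions: where the non-employment rate n is high, the unemployment share
% s_U is relatively low.
```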
Abstract:
The low quality of education is a persistent problem in many developed countries. In parallel, over the last decades there has been a tendency towards decentralization in many developed and developing countries. Using micro data from the Programme for International Student Assessment (PISA) covering 22 countries, we test whether fiscal and political decentralization have an impact on student performance in the areas of mathematics, reading skills and science. We observe that fiscal decentralization exerts an unequivocal positive effect on students' outcomes in all areas, while the effect of political decentralization is more ambiguous. On the one hand, the capacity of subnational governments to rule on their own region has a positive effect on students' performance in mathematics. On the other hand, the capacity to influence the country as a whole has a negative impact on mathematics achievement. As a general result, we observe that students' performance in mathematics is more sensitive to these exogenous variations than in science and reading skills.
Keywords: school outcomes, PISA, fiscal decentralization, political decentralization. JEL codes: H11, H77, I21.
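A stylized version of the kind of specification such a test estimates (illustrative only; the abstract does not give the paper's exact controls or estimator) is:

```latex
% Test score y of student i in school s of country c, regressed on country-level
% fiscal (FD) and political (PD) decentralization plus student/school controls X:
y_{isc} \;=\; \alpha \;+\; \beta\, FD_{c} \;+\; \gamma\, PD_{c} \;+\; X_{isc}'\delta \;+\; \varepsilon_{isc}
% A positive estimate of beta corresponds to the reported unambiguous positive
% effect of fiscal decentralization; the sign of gamma is ambiguous.
```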
Abstract:
This paper studies the effect of providing relative performance feedback information on individual performance and on individual affective response, when agents are rewarded according to their absolute performance. In a laboratory set-up, agents perform a real-effort task and, when receiving feedback, they are asked to rate their happiness, arousal and feeling of dominance. Control subjects learn only their absolute performance, while the treated subjects additionally learn the average performance in the session. Performance is 17 percent higher when relative performance feedback is provided. Furthermore, although feedback increases performance independent of the content (i.e., performing above or below the average), the content is determinant for the affective response. When subjects are treated, the inequality in happiness and in the feeling of dominance between those subjects performing above and below the average increases by 8 and 6 percentage points, respectively.