77 results for parallel application
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
Performance prediction and application behavior modeling have been the subject of extensive research aimed at estimating application performance with acceptable precision. A novel approach to predicting the performance of parallel applications is based on the concept of Parallel Application Signatures, which consists of extracting an application's most relevant parts (phases) and the number of times they repeat (weights). By executing these phases on a target machine and multiplying each phase's execution time by its weight, an estimate of the application's total execution time can be obtained. One problem is that the performance of an application depends on the program workload: each type of workload affects how an application performs on a given system differently, and therefore affects the signature's execution time. Since the workloads used in most scientific parallel applications have well-known dimensions and data ranges, and the behavior of these applications is mostly deterministic, a model of how a program's workload affects its performance can be obtained. We create a new methodology to model how a program's workload affects the parallel application signature. Using regression analysis, we generalize each phase's execution time and weight as functions of the workload, allowing us to predict an application's performance on a target system for any workload within a predefined range. We validate our methodology using a synthetic program, benchmark applications, and well-known real scientific applications.
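As a rough illustration of the signature idea described above, the sketch below (Python; all phase names and measurements are hypothetical) fits a regression per phase for both execution time and weight, then predicts total runtime as the sum over phases of predicted time times predicted weight:

    import numpy as np

    # Hypothetical per-phase measurements on the target machine for a
    # few workload sizes (e.g. matrix dimension n).
    workloads = np.array([256, 512, 1024, 2048])
    phase_times = {                 # seconds per single execution of the phase
        "compute": np.array([0.02, 0.09, 0.35, 1.40]),
        "exchange": np.array([0.01, 0.02, 0.04, 0.08]),
    }
    phase_weights = {               # times each phase repeats per run
        "compute": np.array([100, 100, 100, 100]),
        "exchange": np.array([99, 99, 99, 99]),
    }

    def fit(x, y, deg=2):
        """Least-squares polynomial regression of y on x."""
        return np.poly1d(np.polyfit(x, y, deg))

    def predict_total_time(n):
        """Estimated total runtime for workload size n: the sum over
        phases of predicted phase time * predicted phase weight."""
        total = 0.0
        for phase in phase_times:
            t = fit(workloads, phase_times[phase])(n)
            k = fit(workloads, phase_weights[phase])(n)
            total += t * k
        return total

    # Predict inside the modeled workload range.
    print(predict_total_time(1536))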
Abstract:
Fault tolerance has become a major issue for computer and software engineers because the occurrence of faults increases the cost of using a parallel computer. RADIC is a fault tolerance architecture for message-passing systems that is transparent, decentralized, flexible, and scalable. This master's thesis presents the methodology used to implement the RADIC architecture over Open MPI, a well-known and widely used message-passing library, while preserving the characteristics of the RADIC architecture. To validate the implementation we executed a synthetic ping program, and to evaluate its performance we used the NAS Parallel Benchmarks. The results show that the performance of the RADIC architecture depends on the communication pattern of the running parallel application. Furthermore, our implementation demonstrates that the RADIC architecture can be implemented over an existing message-passing library.
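The synthetic ping program is not reproduced in the abstract; a minimal sketch of such a ping-pong test, written here with mpi4py for brevity (the thesis itself works at the Open MPI level), could look like this:

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    ROUNDS = 1000

    # Two processes bounce a small message back and forth, so that the
    # overhead added by a fault-tolerance layer beneath MPI can be timed.
    if rank == 0:
        t0 = MPI.Wtime()
        for i in range(ROUNDS):
            comm.send(i, dest=1, tag=0)
            comm.recv(source=1, tag=1)
        print("mean round-trip (s):", (MPI.Wtime() - t0) / ROUNDS)
    elif rank == 1:
        for i in range(ROUNDS):
            msg = comm.recv(source=0, tag=0)
            comm.send(msg, dest=0, tag=1)

Run with, e.g., mpirun -n 2 python ping.py.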
Abstract:
Resource management in multi-core processors has gained importance with the evolution of applications and architectures, but this management is very complex. For example, the same parallel application executed multiple times with the same input data on a single multi-core node can have highly variable execution times. Many hardware and software factors affect performance. The way hardware resources (compute and memory) are assigned to the processes or threads, possibly belonging to several competing applications, is fundamental in determining this performance. The gap between assigning resources without knowing the application's real needs and assigning them with a specific goal keeps growing. The best way to perform this assignment is automatically, with minimal programmer intervention. It is important to note that the way an application runs on an architecture is not necessarily the most suitable one, and this situation can be improved through proper management of the available resources. Appropriate resource management can benefit both the application developer and the computing environment where the application runs, allowing a larger number of applications to execute with the same amount of resources. Moreover, this resource management would not require changes to the application or to its operating strategy. In order to propose resource management policies, we analyzed the behavior of compute-intensive and memory-intensive applications. This analysis was carried out by studying process placement across cores, the need to use shared memory, the input workload size, the distribution of data within the processor, and the granularity of work. Our goal is to identify how these parameters influence execution efficiency, identify bottlenecks, and propose possible improvements. A further proposal is to adapt the strategies already used by the scheduler in order to obtain better results.
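As a minimal illustration of the core-placement parameter discussed above (assuming Linux, where os.sched_setaffinity is available), worker processes can be pinned to chosen cores before running a kernel:

    import os
    import multiprocessing as mp

    def worker(core):
        # Pin this process to a single core (Linux-specific call), so
        # that placement effects on shared caches can be measured.
        os.sched_setaffinity(0, {core})
        # ... a compute- or memory-intensive kernel would run here ...
        return os.sched_getaffinity(0)

    if __name__ == "__main__":
        cores = [0, 1]  # hypothetical placement: two neighbouring cores
        with mp.Pool(len(cores)) as pool:
            print(pool.map(worker, cores))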
Abstract:
There are currently many parallel/distributed applications in which SPMD is the most widely used paradigm. Obtaining good performance in a parallel application of this type is one of the main challenges, given the large number of existing applications. This objective is not easy to achieve, since there is a wide variety of hardware configurations, and both the nature of the problems and the way they are implemented can vary. Consequently, if the "software/hardware" combination is not considered properly, problems inherent to an iterative application without a defined control hierarchy can appear under this paradigm. In SPMD, all processes execute the same code but compute a different section of the input data. One solution to a potential performance problem is to propose a load-balancing strategy that homogenizes the computation among the different processes. In this work we analyze the CG benchmark with heterogeneous workloads in order to detect potential performance problems in a real application. One factor that determines performance in this application is the number of nonzero elements contained in the matrix section assigned to each process. We determine that it is possible to define a load-balancing strategy that can be implemented dynamically, and we show experimentally that the performance of the application can be improved significantly with this strategy.
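A minimal sketch of the kind of nonzero-balancing partitioning the abstract alludes to (an illustrative greedy split over contiguous row blocks; not the paper's actual algorithm):

    import numpy as np

    def balance_rows(nnz_per_row, nprocs):
        """Split contiguous row blocks so that each process receives
        roughly the same number of nonzeros (greedy prefix split)."""
        total = nnz_per_row.sum()
        target = total / nprocs
        splits, acc, start = [], 0, 0
        for i, nnz in enumerate(nnz_per_row):
            acc += nnz
            if acc >= target and len(splits) < nprocs - 1:
                splits.append((start, i + 1))
                start, acc = i + 1, 0
        splits.append((start, len(nnz_per_row)))
        return splits

    # Hypothetical sparse matrix with uneven nonzero counts per row.
    nnz = np.array([1, 8, 2, 9, 1, 1, 7, 3])
    print(balance_rows(nnz, 2))   # e.g. [(0, 4), (4, 8)]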
Abstract:
This paper shows how a high-level matrix programming language may be used to perform Monte Carlo simulation, bootstrapping, estimation by maximum likelihood and GMM, and kernel regression in parallel on symmetric multiprocessor computers or clusters of workstations. Parallelization is implemented in such a way that an investigator may use the programs without any knowledge of parallel programming. A bootable CD that allows rapid creation of a cluster for parallel computing is introduced. Examples show that parallelization can lead to important reductions in computational time. A detailed discussion of how the Monte Carlo problem was parallelized is included as an example for learning to write parallel programs for Octave.
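The paper's examples are written for Octave; as a language-neutral illustration of the same embarrassingly parallel structure, a Monte Carlo run can be split across workers like this (Python sketch with a stand-in workload; not the paper's code):

    import multiprocessing as mp
    import random

    def mc_chunk(args):
        """Run one chunk of Monte Carlo replications. Here each
        replication estimates pi by dart-throwing, standing in for a
        bootstrap or estimator replication."""
        n, seed = args
        rng = random.Random(seed)  # independent stream per worker
        return sum(rng.random()**2 + rng.random()**2 <= 1.0
                   for _ in range(n))

    if __name__ == "__main__":
        total, nworkers = 1_000_000, 4
        chunks = [(total // nworkers, seed) for seed in range(nworkers)]
        with mp.Pool(nworkers) as pool:
            hits = sum(pool.map(mc_chunk, chunks))
        print("pi ~", 4 * hits / total)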
Abstract:
This note describes ParallelKnoppix, a bootable CD that allows creation of a Linux cluster in very little time. An experienced user can create a cluster ready to execute MPI programs in less than 10 minutes. The computers used may be heterogeneous machines of the IA-32 architecture. When the cluster is shut down, all machines except one are in their original state, and that one can be returned to its original state by deleting a directory. The system thus provides a means of using non-dedicated computers to create a cluster. An example session is documented.
Abstract:
We study simply-connected irreducible non-locally symmetric pseudo-Riemannian Spin(q) manifolds admitting parallel quaternionic spinors.
Abstract:
Background: Since barrier protection measures to avoid contact with allergens are being increasingly developed, we assessed the clinical efficacy and tolerability of a topical nasal microemulsion made of glycerol esters in patients with allergic rhinitis. Methods: Randomized, controlled, double-blind, parallel-group, multicentre, multinational clinical trial in which adult patients with allergic rhinitis or rhinoconjunctivitis due to sensitization to birch, grass or olive tree pollens received treatment with the topical microemulsion or placebo during the pollen seasons. Efficacy variables included scores in the mini-RQLQ questionnaire, number and severity of nasal, ocular and lung signs and symptoms, need for symptomatic medications and patients' satisfaction with treatment. Adverse events were also recorded. Results: Demographic characteristics were homogeneous between groups and mini-RQLQ scores did not differ significantly at baseline (visit 1). From symptoms recorded in the diary cards, the microemulsion (ME) group showed statistically significantly better scores for nasal congestion (0.72 vs. 1.01; p = 0.017) and mean total nasal symptoms (0.7 vs. 0.9; p = 0.045). At visit 2 (pollen season), lower values were observed in the mini-RQLQ in the ME group, although there were no statistically significant differences between groups in either the full analysis set (FAS) or the patients-completing-treatment (PPS) population. The results obtained in the nasal symptoms domain of the mini-RQLQ at visit 2 showed the largest difference (−0.43; 95% CI: −0.88 to 0.02) in favour of the ME group in the FAS population. The topical microemulsion was safe and well tolerated, and no major discomforts were observed. Satisfaction ratings with the treatment were similar between the groups. Conclusions: Topical application of the microemulsion is a feasible and safe therapy for the prevention of allergic symptoms, particularly nasal congestion.
Abstract:
In the statistics of stochastic processes and random fields, a moment function or cumulant of an estimator of the correlation function or of the spectral density often contains an integral involving a cyclic product of kernels. In this work we define and investigate this class of integrals and prove a Young-Hölder inequality that makes it possible to study the asymptotic behavior of such integrals in the situation where the kernels depend on a parameter. An application to the problem of estimating the response function in a Volterra system is considered.
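As a rough orientation only (the exact definition and exponent conditions are those given in the work; the form below is stated as an assumption), an integral with a cyclic product of kernels, and the Young-Hölder-type bound controlling it, typically look like:

    % Cyclic product of kernels: each argument chains consecutive
    % variables and the last kernel closes the cycle, so the arguments
    % sum to zero.
    J(K_1,\dots,K_m)
      = \int_{\mathbb{R}^{m-1}}
          K_1(x_1)\, K_2(x_2 - x_1) \cdots
          K_{m-1}(x_{m-1} - x_{m-2})\, K_m(-x_{m-1})
        \, dx_1 \cdots dx_{m-1},
    % with a bound by norms of the individual kernels:
    |J(K_1,\dots,K_m)| \;\le\; \prod_{j=1}^{m} \|K_j\|_{p_j},
      \qquad \sum_{j=1}^{m} \frac{1}{p_j} = m-1, \quad p_j \ge 1.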
Abstract:
This paper presents an application of the Multiple-Scale Integrated Assessment of Societal Metabolism (MSIASM) to the recent economic history of Ecuador and Spain. Understanding the relationship between the Gross Domestic Product (GDP) and the throughput of matter and energy over time in modern societies is crucial for understanding the sustainability predicament, as it is linked to economic growth. When considering the dynamics of economic development, Spain was able to take a different path than Ecuador thanks to the different characteristics of its energy budget and other key variables. This and other changes are described using economic and biophysical variables (both extensive and intensive, referring to different hierarchical levels). The representation of these parallel changes (on different levels, describable only using different variables) can be kept coherent by adopting the frame provided by MSIASM.
Abstract:
The paper is devoted to the study of a type of differential systems which usually appear in the study of some Hamiltonian systems with 2 degrees of freedom. We prove the existence of infinitely many periodic orbits on each negative energy level. All these periodic orbits pass near the total collision. Finally, we apply these results to study the existence of periodic orbits in the charged collinear 3-body problem.
Abstract:
This note describes ParallelKnoppix, a bootable CD that allows econometricians with average knowledge of computers to create and begin using a high-performance computing cluster for parallel computing in very little time. The computers used may be heterogeneous machines, and clusters of up to 200 nodes are supported. When the cluster is shut down, all machines are in their original state, so their temporary use in the cluster does not interfere with their normal uses. An example shows how a Monte Carlo study of a bootstrap test procedure may be done in parallel. Using a cluster of 20 nodes, the example runs approximately 20 times faster than it does on a single computer.
Abstract:
This comment corrects the errors in the estimation process that appear in Martins (2001). The first error is in the parametric probit estimation, as the previously presented results do not maximize the log-likelihood function. In the global maximum more variables become significant. As for the semiparametric estimation method, the kernel function used in Martins (2001) can take on both positive and negative values, which implies that the participation probability estimates may be outside the interval [0,1]. We have solved the problem by applying local smoothing in the kernel estimation, as suggested by Klein and Spady (1993).
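This is not the Klein and Spady estimator itself, but the underlying point can be illustrated: with a nonnegative kernel, a Nadaraya-Watson estimate of a binary outcome is a convex combination of zeros and ones and therefore always stays inside [0,1] (Python sketch on synthetic data):

    import numpy as np

    def nw_probability(x0, x, y, h):
        """Nadaraya-Watson estimate of P(y = 1 | x = x0) with a Gaussian
        (nonnegative) kernel: a weighted average of binary y values,
        hence guaranteed to lie in [0, 1]. A kernel taking negative
        values would break this guarantee."""
        w = np.exp(-0.5 * ((x - x0) / h) ** 2)
        return np.sum(w * y) / np.sum(w)

    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = (x + rng.normal(size=200) > 0).astype(float)  # binary outcome
    print(nw_probability(0.5, x, y, h=0.3))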
Abstract:
This paper provides empirical evidence that continuous-time models with one volatility factor are, under some conditions, able to fit the main characteristics of financial data. It also reports the importance of the feedback factor in capturing the strong volatility clustering of the data, caused by a possible change in the pattern of volatility in the last part of the sample. We use the Efficient Method of Moments (EMM) of Gallant and Tauchen (1996) to estimate logarithmic models with one and two stochastic volatility factors (with and without feedback) and to select among them.