17 results for Distributed non-coherent shared memory

in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Resource management in multi-core processors has gained importance as applications and architectures have evolved, but this management is very complex. For example, the same parallel application run multiple times with the same input data on a single multi-core node can show widely varying execution times. Multiple hardware and software factors affect performance. The way hardware resources (compute and memory) are assigned to processes or threads, possibly belonging to several competing applications, is fundamental in determining this performance. The gap between allocating resources without knowing the application's real needs and allocating them with a specific goal keeps growing. The best way to perform this allocation is automatically, with minimal programmer intervention. It is important to note that the way an application runs on an architecture is not necessarily the most suitable one, and that this situation can be improved through proper management of the available resources. Appropriate resource management can benefit both the application developer and the computing environment in which the application runs, allowing more applications to run with the same amount of resources. Moreover, this resource management would not require changes to the application or to its operating strategy. To propose resource-management policies, the behavior of compute-intensive and memory-intensive applications was analyzed. This analysis was carried out by studying the parameters of placement across cores, the need to use shared memory, the size of the input load, the distribution of data within the processor, and the work granularity.
Our goal is to identify how these parameters influence execution efficiency, identify bottlenecks, and propose possible improvements. Another proposal is to adapt the strategies already used by the scheduler in order to obtain better results.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In the current environment, many branches of science need to rely on high-performance computing to obtain results in a relatively short time. This is mainly due to the large volume of information that must be processed and to the computational cost these calculations demand. Performing this processing in a distributed and parallel fashion shortens the wait for results and thus enables decisions to be made earlier. To support this, there are essentially two widely used programming models: the message-passing model, through libraries based on the MPI standard, and the shared-memory model, using OpenMP. Hybrid applications are those that combine both models in order to exploit, in each case, the specific parallelism strengths of each one. Unfortunately, practice has shown that using this combination of models does not necessarily guarantee an improvement in application behavior. Therefore, an analysis of the factors that influence the performance of these applications would help when implementing them and would also be a first step toward predicting their behavior. In addition, it would provide a way to determine which application parameters to modify in order to improve performance. In this work we set out to define a methodology for identifying performance factors in hybrid applications and, accordingly, to identify some of the factors that influence their performance.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present a method to compute, quickly and efficiently, the mutual information achieved by an IID (independent identically distributed) complex Gaussian signal on a block Rayleigh-faded channel without side information at the receiver. The method accommodates both scalar and MIMO (multiple-input multiple-output) settings. Operationally, this mutual information represents the highest spectral efficiency that can be attained using Gaussian codebooks. Examples are provided that illustrate the loss in spectral efficiency caused by fast fading and how that loss is amplified when multiple transmit antennas are used. These examples are further enriched by comparisons with the channel capacity under perfect channel-state information at the receiver, and with the spectral efficiency attained by pilot-based transmission.
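
The perfect-CSI capacity used as a comparison point in the abstract above can be approximated numerically. The following is a minimal sketch, not the authors' method (which targets the case without side information): it assumes a scalar channel with unit-variance Rayleigh fading, so that the power gain is exponentially distributed with unit mean, and estimates E[log2(1 + SNR·|h|^2)] by Monte Carlo. The function names, sample count and seed are illustrative assumptions.

```python
import math
import random


def ergodic_capacity_rayleigh(snr_linear, num_samples=200_000, seed=0):
    """Monte Carlo estimate of E[log2(1 + SNR * |h|^2)] in bits/s/Hz.

    Assumption (not from the abstract): scalar channel, h ~ CN(0, 1),
    so |h|^2 is exponential with unit mean, and the receiver knows h.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        gain = rng.expovariate(1.0)  # |h|^2 under unit-variance Rayleigh fading
        total += math.log2(1.0 + snr_linear * gain)
    return total / num_samples


def awgn_capacity(snr_linear):
    """Capacity of the unfaded AWGN channel at the same SNR."""
    return math.log2(1.0 + snr_linear)


if __name__ == "__main__":
    snr = 10.0  # linear scale (10 dB), an illustrative value
    print("perfect-CSI Rayleigh:", ergodic_capacity_rayleigh(snr))
    print("AWGN reference:      ", awgn_capacity(snr))
```

By Jensen's inequality the faded estimate stays below the AWGN value at the same SNR, which gives a quick sanity check on any such baseline.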

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This work analyzes the performance of four shared-memory multiprocessor compute nodes in solving the N-body problem. The serial algorithm is parallelized and coded in C extended with OpenMP. The result is two variants that follow two different optimization criteria: minimizing memory requirements and minimizing the volume of computation. Next, the program's performance on the compute nodes is analyzed. The performance of the sequential and parallel variants of the application, and of the compute nodes, is modeled; the programs are instrumented and run to obtain results in the form of several metrics; finally, the results are presented and interpreted, providing insights that explain inefficiencies and performance bottlenecks and suggest possible lines of improvement. The experience gained in this particular study has made it possible to outline an incipient methodology for performance analysis, problem identification, and tuning of algorithms to shared-memory multiprocessor compute nodes.
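
The abstract does not describe how its two variants are built, so the sketch below only illustrates a common trade-off in direct-sum N-body codes and should not be read as the authors' implementation. It is written in Python rather than C with OpenMP to keep it self-contained. One routine visits every ordered pair, so each outer iteration writes only its own accumulator; the other visits each unordered pair once and applies Newton's third law, halving the interactions. In a threaded version the second form would typically need per-thread accumulators or atomic updates, because iteration i also writes to body j. The gravitational constant `g`, the softening term `eps` and all function names are assumptions.

```python
import math


def _pair_term(pos, i, j, eps):
    """Return the displacement from body i to body j and 1 / r^3 (softened)."""
    d = [pos[j][k] - pos[i][k] for k in range(3)]
    r2 = sum(c * c for c in d) + eps
    return d, 1.0 / (r2 * math.sqrt(r2))


def accelerations_all_pairs(pos, mass, g=1.0, eps=1e-9):
    """n * (n - 1) interactions; iteration i writes only acc[i]."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d, inv_r3 = _pair_term(pos, i, j, eps)
            for k in range(3):
                acc[i][k] += g * mass[j] * d[k] * inv_r3
    return acc


def accelerations_half_pairs(pos, mass, g=1.0, eps=1e-9):
    """n * (n - 1) / 2 interactions; iteration i also writes acc[j]."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d, inv_r3 = _pair_term(pos, i, j, eps)
            for k in range(3):
                acc[i][k] += g * mass[j] * d[k] * inv_r3  # pull of j on i
                acc[j][k] -= g * mass[i] * d[k] * inv_r3  # equal and opposite
    return acc
```

Both routines return the same accelerations up to rounding; they differ in how much arithmetic they perform and in how easily the outer loop can be split across threads.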

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Performance analysis is the task of monitoring the behavior of a program execution. The main goal is to find adjustments that could improve performance. To achieve that improvement it is necessary to find the different causes of overhead. We are already in the multicore era, but there is a gap between the level of development of the two main divisions of multicore technology (hardware and software). When we talk about multicore we are also speaking of shared-memory systems; in this master's thesis we discuss the issues involved in the performance analysis and tuning of applications running specifically on a shared-memory system. We move one step ahead and take performance analysis to another level by analyzing application structure and patterns. We also present some tools specifically aimed at the performance analysis of OpenMP multithreaded applications. Finally, we present the results of some experiments performed with a set of OpenMP scientific applications.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Comparative benchmark study of the performance, on two multicore multithreading platforms, of different ways of parallelizing matrix multiplications of integers and of floating-point numbers using the OpenMP shared-memory model, versions 2.5 and 3.0.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Introduction and objectives. It has been suggested that, in hypertrophic cardiomyopathy (HCM), regional fiber disarray results in segments in which deformation is absent or severely reduced, and that these segments are non-uniformly distributed within the left ventricle (LV). This contrasts with other types of hypertrophy, such as athlete's heart or hypertensive left ventricular hypertrophy (HT-LVH), in which cardiac deformation may be abnormal but is never so reduced that no deformation is observed. We therefore propose using the distribution of strain values to study deformation in HCM. Methods. Using tagged magnetic resonance imaging, we reconstructed the systolic deformation of the LV in 12 control subjects, 10 athletes, 12 patients with HCM, and 10 patients with HT-LVH. Deformation was quantified with a non-rigid registration algorithm, determining peak radial and circumferential systolic strain values in 16 LV segments. Results. Patients with HCM had significantly lower mean strain values than the other groups. However, whereas the deformation observed in healthy individuals and in patients with HT-LVH was concentrated around the mean value, in HCM segments with normal contraction coexisted with segments with absent or significantly reduced deformation, resulting in greater heterogeneity of strain values. Some segments without deformation were also observed even in the absence of fibrosis or hypertrophy. Conclusions. The strain distribution characterizes the specific patterns of myocardial deformation in patients with different etiologies of LVH.
Patients with HCM had a significantly lower mean strain value and greater strain heterogeneity (compared with controls, athletes, and patients with HT-LVH), and had regions without deformation.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The spatial, spectral, and temporal resolutions of remote sensing images, acquired over a reasonably sized image extent, yield imagery that can be processed to represent land cover over large areas with an amount of spatial detail that is very attractive for monitoring, management, and scientific activities. With Moore's Law alive and well, more and more parallelism is being introduced into all computing platforms, at all levels of integration and programming, to achieve higher performance and energy efficiency. Because geometric calibration is one of the most time-consuming processes when using remote sensing images, the aim of this work is to accelerate it by taking advantage of new computing architectures and technologies, focusing especially on exploiting computation on shared-memory multi-threading hardware. A parallel version of the most time-consuming process in remote sensing geometric correction has been implemented using OpenMP directives. This work compares the performance of the original serial binary with that of the parallelized implementation on several modern multi-threaded CPU architectures, and discusses how to find the optimum hardware for cost-effective execution.
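
Serial-versus-parallel comparisons of this kind are usually summarized with speedup and parallel efficiency. The sketch below shows those two standard metrics, plus Amdahl's law as an upper bound on speedup. Amdahl's law is an addition here and is not mentioned in the abstract, and the timings in the usage lines are hypothetical placeholders, not results from the study.

```python
def speedup(t_serial, t_parallel):
    """Ratio of serial to parallel wall-clock time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, threads):
    """Speedup per thread; 1.0 would mean perfect linear scaling."""
    return speedup(t_serial, t_parallel) / threads


def amdahl_speedup(parallel_fraction, threads):
    """Upper bound on speedup when only part of the run is parallelizable."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / threads)


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    t_serial, t_parallel, threads = 120.0, 20.0, 8
    print("speedup:   ", speedup(t_serial, t_parallel))
    print("efficiency:", efficiency(t_serial, t_parallel, threads))
    print("Amdahl bound at 90% parallel:", amdahl_speedup(0.9, threads))
```

Dividing efficiency by a hardware cost figure is one possible way to rank platforms for cost-effectiveness, although the abstract does not say which criterion the authors used.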

Relevance:

40.00% 40.00%

Publisher:

Abstract:

A new parametric minimum distance time-domain estimator for ARFIMA processes is introduced in this paper. The proposed estimator minimizes the sum of squared correlations of residuals obtained after filtering a series through ARFIMA parameters. The estimator is easy to compute and is consistent and asymptotically normally distributed for fractionally integrated (FI) processes with an integration order d strictly greater than -0.75. Therefore, it can be applied to both stationary and non-stationary processes. Deterministic components are also allowed in the DGP. Furthermore, as a by-product, the estimation procedure provides an immediate check on the adequacy of the specified model. This is so because the criterion function, when evaluated at the estimated values, coincides with the Box-Pierce goodness of fit statistic. Empirical applications and Monte-Carlo simulations supporting the analytical results and showing the good performance of the estimator in finite samples are also provided.
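
The idea described in the abstract above can be illustrated for the simplest case, a pure FI(d) model with no ARMA terms. This is a minimal sketch under stated assumptions, not the paper's estimator: the filter is the truncated expansion of (1 - L)^d, residuals are demeaned before computing autocorrelations, the number of lags is fixed at 10, and the minimization is a plain grid search. All function names are illustrative.

```python
import random


def frac_diff(series, d):
    """Apply the truncated fractional-difference filter (1 - L)^d."""
    n = len(series)
    weights = [1.0]
    for k in range(1, n):
        # Recursion for the binomial expansion coefficients of (1 - L)^d.
        weights.append(weights[-1] * (k - 1 - d) / k)
    return [sum(weights[k] * series[t - k] for k in range(t + 1)) for t in range(n)]


def autocorrelations(x, max_lag):
    """Sample autocorrelations of x at lags 1..max_lag (demeaned)."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    denom = sum(v * v for v in c)
    return [sum(c[t] * c[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]


def sum_squared_autocorr(series, d, max_lag=10):
    """Criterion: sum of squared residual autocorrelations after filtering."""
    residuals = frac_diff(series, d)
    return sum(r * r for r in autocorrelations(residuals, max_lag))


def box_pierce(series, d, max_lag=10):
    """Box-Pierce statistic: sample size times the criterion above."""
    return len(series) * sum_squared_autocorr(series, d, max_lag)


def estimate_d(series, grid, max_lag=10):
    """Grid-search estimate of d (assumption: no ARMA part in the model)."""
    return min(grid, key=lambda d: sum_squared_autocorr(series, d, max_lag))


if __name__ == "__main__":
    rng = random.Random(1)
    walk, level = [], 0.0
    for _ in range(400):  # simulated random walk, i.e. d = 1
        level += rng.gauss(0.0, 1.0)
        walk.append(level)
    grid = [0.5 + 0.1 * i for i in range(11)]
    d_hat = estimate_d(walk, grid)
    print("estimated d:", d_hat, "Box-Pierce:", box_pierce(walk, d_hat))
```

Because the Box-Pierce statistic is the criterion scaled by the sample size, the value at the minimizing d doubles as the adequacy check the abstract mentions.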

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Behavioral consequences of a brain insult represent an interaction between the injury and the capacity of the rest of the brain to adapt to it. We provide experimental support for the notion that genetic factors play a critical role in such adaptation. We induced a controlled brain disruption using repetitive transcranial magnetic stimulation (rTMS) and show that APOE status determines its impact on distributed brain networks as assessed by functional MRI (fMRI). Twenty non-demented elders exhibiting mild memory dysfunction underwent two fMRI studies during face-name encoding tasks (before and after rTMS). Baseline task performance was associated with activation of a network of brain regions in prefrontal, parietal, medial temporal and visual associative areas. APOE ε4 bearers exhibited this pattern in two separate independent components, whereas ε4 non-carriers presented a single partially overlapping network. Following rTMS all subjects showed slight ameliorations in memory performance, regardless of APOE status. However, after rTMS APOE ε4 carriers showed significant changes in brain network activation, expressing a spatial configuration strikingly similar to the one observed in the non-carrier group prior to stimulation. Similarly, activity in areas of the default-mode network (DMN) was found in a single component among the ε4 non-bearers, whereas among carriers it appeared disaggregated in three distinct spatiotemporal components that changed to an integrated single component after rTMS. Our findings demonstrate that genetic background plays a fundamental role in the brain responses to focal insults, conditioning expression of distinct brain networks to sustain similar cognitive performance.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Neurodevelopmental disruptions caused by obstetric complications play a role in the etiology of several phenotypes associated with neuropsychiatric diseases and cognitive dysfunctions. Importantly, it has been noticed that epigenetic processes occurring early in life may mediate these associations. Here, DNA methylation signatures at IGF2 (insulin-like growth factor 2) and IGF2BP1-3 (IGF2-binding proteins 1-3) were examined in a sample consisting of 34 adult monozygotic (MZ) twins informative for obstetric complications and cognitive performance. Multivariate linear regression analysis of twin data was implemented to test for associations between methylation levels and both birth weight (BW) and adult working memory (WM) performance. Familial and unique environmental factors underlying these potential relationships were evaluated. A link was detected between DNA methylation levels of two CpG sites in the IGF2BP1 gene and both BW and adult WM performance. The BW-IGF2BP1 methylation association seemed due to non-shared environmental factors influencing BW, whereas the WM-IGF2BP1 methylation relationship seemed mediated by both genes and environment. Our data is in agreement with previous evidence indicating that DNA methylation status may be related to prenatal stress and later neurocognitive phenotypes. While former reports independently detected associations between DNA methylation and either BW or WM, current results suggest that these relationships are not confounded by each other.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background: The analysis of the promoter sequence of genes with similar expression patterns is a basic tool to annotate common regulatory elements. Multiple sequence alignments are the basis of most comparative approaches. The characterization of regulatory regions from co-expressed genes at the sequence level, however, does not yield satisfactory results on many occasions, as promoter regions of genes sharing similar expression programs often do not show nucleotide sequence conservation. Results: In a recent approach to circumvent this limitation, we proposed to align the maps of predicted transcription factors (referred to as TF-maps) instead of the nucleotide sequence of two related promoters, taking into account the label of the corresponding factor and the position in the primary sequence. We have now extended the basic algorithm to permit multiple promoter comparisons using the progressive alignment paradigm. In addition, non-collinear conservation blocks might now be identified in the resulting alignments. We have optimized the parameters of the algorithm in a small, but well-characterized collection of human-mouse-chicken-zebrafish orthologous gene promoters. Conclusion: Results in this dataset indicate that TF-map alignments are able to detect high-level regulatory conservation at the promoter and the 3'UTR gene regions, which cannot be detected by the typical sequence alignments. Three particular examples are introduced here to illustrate the power of the multiple TF-map alignments to characterize conserved regulatory elements in the absence of sequence similarity. We consider that this kind of approach can be extremely useful in the future to annotate potential transcription factor binding sites on sets of co-regulated genes from high-throughput expression experiments.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Although it is commonly accepted that most macroeconomic variables are nonstationary, it is often difficult to identify the source of the non-stationarity. In particular, it is well-known that integrated and short memory models containing trending components that may display sudden changes in their parameters share some statistical properties that make their identification a hard task. The goal of this paper is to extend the classical testing framework for I(1) versus I(0)+ breaks by considering a more general class of models under the null hypothesis: non-stationary fractionally integrated (FI) processes. A similar identification problem holds in this broader setting, which is shown to be a relevant issue from both a statistical and an economic perspective. The proposed test is developed in the time domain and is very simple to compute. The asymptotic properties of the new technique are derived and it is shown by simulation that it is very well-behaved in finite samples. To illustrate the usefulness of the proposed technique, an application using inflation data is also provided.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The usual development of the continuous-time random walk (CTRW) assumes that jumps and time intervals are a two-dimensional set of independent and identically distributed random variables. In this paper, we address the theoretical setting of nonindependent CTRWs where consecutive jumps and/or time intervals are correlated. An exact solution to the problem is obtained for the special but relevant case in which the correlation solely depends on the signs of consecutive jumps. Even in this simple case, some interesting features arise, such as transitions from unimodal to bimodal distributions due to correlation. We also develop the necessary analytical techniques and approximations to handle more general situations that can appear in practice.
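
A correlated CTRW of the kind described above can be explored by simulation. The sketch below is an illustrative model, not the paper's exact solution. It assumes exponential waiting times and exponential jump magnitudes, both independent, and a jump sign that repeats the previous sign with probability `persistence`, so that the correlation enters only through the signs of consecutive jumps. These distributional choices and all names are assumptions.

```python
import random


def simulate_ctrw(t_end, persistence, rng, mean_wait=1.0, mean_jump=1.0):
    """Position at time t_end of a walker with sign-correlated jumps.

    The sign of each jump equals the previous sign with probability
    `persistence`; persistence = 0.5 recovers independent signs.
    """
    time, position = 0.0, 0.0
    sign = rng.choice((-1, 1))
    while True:
        time += rng.expovariate(1.0 / mean_wait)  # waiting time before the jump
        if time > t_end:
            return position
        if rng.random() >= persistence:
            sign = -sign  # reverse direction with probability 1 - persistence
        position += sign * rng.expovariate(1.0 / mean_jump)


def position_variance(t_end, persistence, walkers=2000, seed=0):
    """Sample variance of the walker position over many independent runs."""
    rng = random.Random(seed)
    xs = [simulate_ctrw(t_end, persistence, rng) for _ in range(walkers)]
    mean = sum(xs) / walkers
    return sum((x - mean) ** 2 for x in xs) / walkers


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print("persistence", p, "variance", position_variance(50.0, p))
```

Persistent signs widen the position distribution and anti-persistent signs narrow it relative to the independent case. Building a histogram of the simulated positions at short times is one way to look for the unimodal-to-bimodal transition the abstract reports.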