980 resultados para execution


Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we investigate various algorithms for performing Fast Fourier Transformation (FFT)/Inverse Fast Fourier Transformation (IFFT), and proper techniquesfor maximizing the FFT/IFFT execution speed, such as pipelining or parallel processing, and use of memory structures with pre-computed values (look up tables -LUT) or other dedicated hardware components (usually multipliers). Furthermore, we discuss the optimal hardware architectures that best apply to various FFT/IFFT algorithms, along with their abilities to exploit parallel processing with minimal data dependences of the FFT/IFFT calculations. An interesting approach that is also considered in this paper is the application of the integrated processing-in-memory Intelligent RAM (IRAM) chip to high speed FFT/IFFT computing. The results of the assessment study emphasize that the execution speed of the FFT/IFFT algorithms is tightly connected to the capabilities of the FFT/IFFT hardware to support the provided parallelism of the given algorithm. Therefore, we suggest that the basic Discrete Fourier Transform (DFT)/Inverse Discrete Fourier Transform (IDFT) can also provide high performances, by utilizing a specialized FFT/IFFT hardware architecture that can exploit the provided parallelism of the DFT/IDF operations. The proposed improvements include simplified multiplications over symbols given in polar coordinate system, using sinе and cosine look up tables,and an approach for performing parallel addition of N input symbols.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The modern computer systems that are in use nowadays are mostly processor-dominant, which means that their memory is treated as a slave element that has one major task – to serve execution units data requirements. This organization is based on the classical Von Neumann's computer model, proposed seven decades ago in the 1950ties. This model suffers from a substantial processor-memory bottleneck, because of the huge disparity between the processor and memory working speeds. In order to solve this problem, in this paper we propose a novel architecture and organization of processors and computers that attempts to provide stronger match between the processing and memory elements in the system. The proposed model utilizes a memory-centric architecture, wherein the execution hardware is added to the memory code blocks, allowing them to perform instructions scheduling and execution, management of data requests and responses, and direct communication with the data memory blocks without using registers. This organization allows concurrent execution of all threads, processes or program segments that fit in the memory at a given time. Therefore, in this paper we describe several possibilities for organizing the proposed memory-centric system with multiple data and logicmemory merged blocks, by utilizing a high-speed interconnection switching network.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The parameterized expectations algorithm (PEA) involves a long simulation and a nonlinear least squares (NLS) fit, both embedded in a loop. Both steps are natural candidates for parallelization. This note shows that parallelization can lead to important speedups for the PEA. I provide example code for a simple model that can serve as a template for parallelization of more interesting models, as well as a download link for an image of a bootable CD that allows creation of a cluster and execution of the example code in minutes, with no need to install any software.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The objective of this work was to develop an easily applicable technique and a standardized protocol for high-quality post-mortem angiography. This protocol should (1) increase the radiological interpretation by decreasing artifacts due to the perfusion and by reaching a complete filling of the vascular system and (2) ease and standardize the execution of the examination. To this aim, 45 human corpses were investigated by post-mortem computed tomography (CT) angiography using different perfusion protocols, a modified heart-lung machine and a new contrast agent mixture, specifically developed for post-mortem investigations. The quality of the CT angiographies was evaluated radiologically by observing the filling of the vascular system and assessing the interpretability of the resulting images and by comparing radiological diagnoses to conventional autopsy conclusions. Post-mortem angiography yielded satisfactory results provided that the volumes of the injected contrast agent mixture were high enough to completely fill the vascular system. In order to avoid artifacts due to the post-mortem perfusion, a minimum of three angiographic phases and one native scan had to be performed. These findings were taken into account to develop a protocol for quality post-mortem CT angiography that minimizes the risk of radiological misinterpretation. The proposed protocol is easy applicable in a standardized way and yields high-quality radiologically interpretable visualization of the vascular system in post-mortem investigations.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Un reto al ejecutar las aplicaciones en un cluster es lograr mejorar las prestaciones utilizando los recursos de manera eficiente, y este reto es mayor al utilizar un ambiente distribuido. Teniendo en cuenta este reto, se proponen un conjunto de reglas para realizar el cómputo en cada uno de los nodos, basado en el análisis de cómputo y comunicaciones de las aplicaciones, se analiza un esquema de mapping de celdas y un método para planificar el orden de ejecución, tomando en consideración la ejecución por prioridad, donde las celdas de fronteras tienen una mayor prioridad con respecto a las celdas internas. En la experimentación se muestra el solapamiento del computo interno con las comunicaciones de las celdas fronteras, obteniendo resultados donde el Speedup aumenta y los niveles de eficiencia se mantienen por encima de un 85%, finalmente se obtiene ganancias de los tiempos de ejecución, concluyendo que si se puede diseñar un esquemas de solapamiento que permita que la ejecución de las aplicaciones SPMD en un cluster se hagan de forma eficiente.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

El principal objectiu d'aquest treball és proporcionar una metodologia per a reduir el temps de càlcul del mètode d'interpolació kriging sense pèrdua de la qualitat del model resultat. La solució adoptada ha estat la paral·lelització de l'algorisme mitjançant MPI sobre llenguatge C. Prèviament ha estat necessari automatitzar l'ajust del variograma que millor s'adapta a la distribució espacial de la variable d'estudi. Els resultats experimentals demostren la validesa de la solució implementada, en reduir de forma significativa els temps d'execució final de tot el procés.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A mesura que la complexitat de les tasques dels agents mòbils va creixent, és més important que aquestes no perdin el treball realitzat. Hem de saber en tot moment que la execució s’està desenvolupant favorablement. Aquest projecte tracta d’explicar el procés d’elaboració d’un component de tolerància a fallades des de la seva idea inicial fins a la seva implementació. Analitzarem la situació i dissenyarem una solució. Procurarem que el nostre component emmascari la fallada d’un agent, detectant-la i posteriorment recuperant l’execució des d’on s’ha interromput. Tot això procurant seguir la metodologia de disseny d’agents mòbils per a plataformes lleugeres.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

CISNE es un sistema de cómputo en paralelo del Departamento de Arquitectura de Computadores y Sistemas Operativos (DACSO). Para poder implementar políticas de ordenacción de colas y selección de trabajos, este sistema necesita predecir el tiempo de ejecución de las aplicaciones. Con este trabajo se pretende proveer al sistema CISNE de un método para predecir el tiempo de ejecución basado en un histórico donde se almacenarán todos los datos sobre las ejecuciones.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

As computer chips implementation technologies evolve to obtain more performance, those computer chips are using smaller components, with bigger density of transistors and working with lower power voltages. All these factors turn the computer chips less robust and increase the probability of a transient fault. Transient faults may occur once and never more happen the same way in a computer system lifetime. There are distinct consequences when a transient fault occurs: the operating system might abort the execution if the change produced by the fault is detected by bad behavior of the application, but the biggest risk is that the fault produces an undetected data corruption that modifies the application final result without warnings (for example a bit flip in some crucial data). With the objective of researching transient faults in computer system’s processor registers and memory we have developed an extension of HP’s and AMD joint full system simulation environment, named COTSon. This extension allows the injection of faults that change a single bit in processor registers and memory of the simulated computer. The developed fault injection system makes it possible to: evaluate the effects of single bit flip transient faults in an application, analyze an application robustness against single bit flip transient faults and validate fault detection mechanism and strategies.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Performance analysis is the task of monitor the behavior of a program execution. The main goal is to find out the possible adjustments that might be done in order improve the performance. To be able to get that improvement it is necessary to find the different causes of overhead. Nowadays we are already in the multicore era, but there is a gap between the level of development of the two main divisions of multicore technology (hardware and software). When we talk about multicore we are also speaking of shared memory systems, on this master thesis we talk about the issues involved on the performance analysis and tuning of applications running specifically in a shared Memory system. We move one step ahead to take the performance analysis to another level by analyzing the applications structure and patterns. We also present some tools specifically addressed to the performance analysis of OpenMP multithread application. At the end we present the results of some experiments performed with a set of OpenMP scientific application.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

En termes de temps d'execució i ús de dades, les aplicacions paral·leles/distribuïdes poden tenir execucions variables, fins i tot quan s'empra el mateix conjunt de dades d'entrada. Existeixen certs aspectes de rendiment relacionats amb l'entorn que poden afectar dinàmicament el comportament de l'aplicació, tals com: la capacitat de la memòria, latència de la xarxa, el nombre de nodes, l'heterogeneïtat dels nodes, entre d'altres. És important considerar que l'aplicació pot executar-se en diferents configuracions de maquinari i el desenvolupador d'aplicacions no port garantir que els ajustaments de rendiment per a un sistema en particular continuïn essent vàlids per a d'altres configuracions. L'anàlisi dinàmica de les aplicacions ha demostrat ser el millor enfocament per a l'anàlisi del rendiment per dues raons principals. En primer lloc, ofereix una solució molt còmoda des del punt de vista dels desenvolupadors mentre que aquests dissenyen i evaluen les seves aplicacions paral·leles. En segon lloc, perquè s'adapta millor a l'aplicació durant l'execució. Aquest enfocament no requereix la intervenció de desenvolupadors o fins i tot l'accés al codi font de l'aplicació. S'analitza l'aplicació en temps real d'execució i es considra i analitza la recerca dels possibles colls d'ampolla i optimitzacions. Per a optimitzar l'execució de l'aplicació bioinformàtica mpiBLAST, vam analitzar el seu comportament per a identificar els paràmetres que intervenen en el rendiment d'ella, com ara: l'ús de la memòria, l'ús de la xarxa, patrons d'E/S, el sistema de fitxers emprat, l'arquitectura del processador, la grandària de la base de dades biològica, la grandària de la seqüència de consulta, la distribució de les seqüències dintre d'elles, el nombre de fragments de la base de dades i/o la granularitat dels treballs assignats a cada procés. El nostre objectiu és determinar quins d'aquests paràmetres tenen major impacte en el rendiment de les aplicacions i com ajustar-los dinàmicament per a millorar el rendiment de l'aplicació. Analitzant el rendiment de l'aplicació mpiBLAST hem trobat un conjunt de dades que identifiquen cert nivell de serial·lització dintre l'execució. Reconeixent l'impacte de la caracterització de les seqüències dintre de les diferents bases de dades i una relació entre la capacitat dels workers i la granularitat de la càrrega de treball actual, aquestes podrien ser sintonitzades dinàmicament. Altres millores també inclouen optimitzacions relacionades amb el sistema de fitxers paral·lel i la possibilitat d'execució en múltiples multinucli. La grandària de gra de treball està influenciat per factors com el tipus de base de dades, la grandària de la base de dades, i la relació entre grandària de la càrrega de treball i la capacitat dels treballadors.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The Myc proto-oncoproteins are transcription factors that recognize numerous target genes through hexameric DNA sequences called E-boxes. The mechanism by which they then activate the expression of these targets is still under debate. Here, we use an RNAi screen in Drosophila S2 cells to identify Drosophila host cell factor (dHCF) as a novel co-factor for Myc that is functionally required for the activation of a Myc-dependent reporter construct. dHCF is also essential for the full activation of endogenous Myc target genes in S2 cells, and for the ability of Myc to promote growth in vivo. Myc and dHCF physically interact, and they colocalize on common target genes. Furthermore, down-regulation of dHCF-associated histone acetyltransferase and histone methyltransferase complexes in vivo interferes with the Myc biological activities. We therefore propose that dHCF recruits such chromatin-modifying complexes and thereby contributes to the expression of Myc targets and hence to the execution of Myc biological activities.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Approximate Quickselect, a simple modification of the well known Quickselect algorithm for selection, can be used to efficiently find an element with rank k in a given range [i..j], out of n given elements. We study basic cost measures of Approximate Quickselect by computing exact and asymptotic results for the expected number of passes, comparisons and data moves during the execution of this algorithm. The key element appearing in the analysis of Approximate Quickselect is a trivariate recurrence that we solve in full generality. The general solution of the recurrence proves to be very useful, as it allows us to tackle several related problems, besides the analysis that originally motivated us. In particular, we have been able to carry out a precise analysis of the expected number of moves of the ith element when selecting the jth smallest element with standard Quickselect, where we are able to give both exact and asymptotic results. Moreover, we can apply our general results to obtain exact and asymptotic results for several parameters in binary search trees, namely the expected number of common ancestors of the nodes with rank i and j, the expected size of the subtree rooted at the least common ancestor of the nodes with rank i and j, and the expected distance between the nodes of ranks i and j.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

En aquest projecte s’ha implementat un sistema de control per a les bombes microfluídiques LPVX de The Lee Company funcionant a mode de xeringa. El sistema consisteix en un circuit controlador basat en el microxip UDN 296 B de Allegro MicroSystems, que conté dos Ponts en H per a controlar motors pas a pas i dos mòduls de Modulació d’Amplada de Polsos (PWM), governat a partir d’un programa de control com a instrument virtual dissenyat sota l’entorn LabVIEW. El programa de control permet indicar la quantitat de volum a aspirar o dispensar per la bomba i escollir entre una execució simple o una de continuada, podent-ne controlar en aquest segona opció el temps entre execució i execució. El programa també permet visualitzar el procés mitjançant la obtenció de la imatge d’una webcam amb DirectShow. Finalment també permet el control remot de l’Instrument Virtual a través de la xarxa d’Internet.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Experiences with population-based chemotherapy and other methods for the control of schistosomiasis mansoni in two subsaharan foci are described. In the forest area of Maniema (Zaire), intense transmission of Schistosoma mansoni, high prevalences and intensities of infection, and important morbidity have been documental. Taking into account the limited financial means and the poor logistic conditions, the control strategy has been based mainly on targeted chemotherapy of heavily infected people (>600 epg). After ten years of intervention, prevalences and intensities have hardly been affected, but the initial severe hepatosplenic morbidity has almost disappeared. In Burundi, a national research and control programme has been initiated in 1982. Prevalences, intensities and morbidity were moderate, transmission was focal and erratic in time and space. A more structural control strategy was developed, based on screening and selective therapy, health education, sanitation and domestic water supply. Prevalences and intensities have been considerably reduced, though the results show focal and unpredicatable variations. Transmission and reinfection were not signifcantly affected by chemotherapy alone, and eventual outcome of repeated selective treatment appears to be limited by the sensitivity of the screening method. Intestinal morbidity was strongly reduced by community-based selective treatment, but hepatosplenic enlargement was hardly affected; this is possibly due to the confounding impact of increasing malaria morbidity. The experiences show the importance of local structures and conditions for the development of an adapted control strategy. It is further concluded that population-based chemotherapy is a highly valid tool for the rapid control of morbidity, but should in most operational conditions not be considered as a tool for transmission control. Integration of planning, execution and surveillance in regular health services...