844 results for Execution Efficiency


Relevance:

70.00% 70.00%

Publisher:

Abstract:

This work was supported by FCT (Fundação para a Ciência e Tecnologia) within Project Scope (UID/CEC/00319/2013), by LIP (Laboratório de Instrumentação e Física Experimental de Partículas) and by Project Search-ON2 (NORTE-07-0162-FEDER-000086), co-funded by the North Portugal Regional Operational Programme (ON.2 - O Novo Norte), under the National Strategic Reference Framework, through the European Regional Development Fund.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Recent years have seen growing acceptance and adoption of parallel processing, both for high-performance scientific computing and for general-purpose applications. This acceptance has been driven mainly by the development of massively parallel processing (MPP) environments and of distributed computing. A common point between distributed systems and MPP architectures is the notion of message passing, which allows processes to communicate. A message-passing environment consists basically of a communication library that acts as an extension of the programming languages used to write parallel applications, such as C, C++ and Fortran. In the development of parallel applications, a fundamental aspect is the analysis of their performance. Several metrics can be used in this analysis: execution time, efficiency in the use of the processing elements, and scalability of the application with respect to an increase in the number of processors or in the size of the problem instance being treated. Establishing models or mechanisms that allow this analysis can be quite a complicated task, given the parameters and degrees of freedom involved in implementing a parallel application. One alternative that has been adopted is the use of tools for collecting and visualizing performance data, which allow the user to identify bottlenecks and sources of inefficiency in an application. Efficient visualization requires identifying and collecting data about the execution of the application, a stage called instrumentation.
This work first presents a study of the main techniques used to collect performance data, followed by a detailed analysis of the main available tools that can be used on parallel architectures of the Beowulf cluster type, running Linux on the x86 platform, with communication libraries based on MPI (Message Passing Interface), such as LAM and MPICH. This analysis is validated on parallel applications that address the training of perceptron neural networks using backpropagation. The conclusions obtained show the potential and the ease of use of the analyzed tools.
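
The metrics named in this abstract can be illustrated with a minimal sketch. It assumes the conventional definitions of speedup (serial time divided by parallel time) and efficiency (speedup divided by the number of processors); the function names and the timing values are hypothetical and are not taken from the work itself.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Conventional speedup: serial execution time over parallel execution time."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("execution times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, n_procs: int) -> float:
    """Fraction of the ideal linear speedup achieved on n_procs processors."""
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    return speedup(t_serial, t_parallel) / n_procs


# Hypothetical wall-clock timings in seconds, for illustration only.
timings = {1: 120.0, 2: 64.0, 4: 36.0, 8: 24.0}
t1 = timings[1]
for p, tp in timings.items():
    print(f"p={p}: speedup={speedup(t1, tp):.2f}, "
          f"efficiency={efficiency(t1, tp, p):.2f}")
```

Efficiency falling as processors are added is the kind of scalability signal that the performance tools described above are meant to help explain.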

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Equipped with state-of-the-art smartphones and mobile devices, today's highly interconnected urban population is increasingly dependent on these gadgets to organize and plan their daily lives. These applications often rely on current (or preferred) locations of individual users or a group of users to provide the desired service, which jeopardizes their privacy; users do not necessarily want to reveal their current (or preferred) locations to the service provider or to other, possibly untrusted, users. In this paper, we propose privacy-preserving algorithms for determining an optimal meeting location for a group of users. We perform a thorough privacy evaluation by formally quantifying privacy-loss of the proposed approaches. In order to study the performance of our algorithms in a real deployment, we implement and test their execution efficiency on Nokia smartphones. By means of a targeted user-study, we attempt to get an insight into the privacy-awareness of users in location-based services and the usability of the proposed solutions.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

The sustainable management of natural resources in projects that use water resources (among others) often requires modifying the existing relief. This entails reshaping the homogeneous upper soil layer, an operation usually called "systematization", which facilitates a more uniform distribution of rainfall and irrigation water. This modification of the upper soil layer is carried out according to a design whose slope follows either the natural gradients or those set by the designer. When the design is executed on areas larger than one hectare, the earthmoving is done with heavy equipment, which does not ensure a high percentage of efficiency as far as earthmoving is concerned, since part of the material is lost during hauling and, most notably, because of its non-uniform compaction, associated with the complex textures of the soil being worked. This study determined the precision index in the execution of the systematization project using an internationally accepted statistical index, the Root Mean Squared Error (RMSE), comparing the designed elevation values with those actually obtained after execution of the project, in three plots with different sequences of operations and machinery but with the same soil type, in the area of the Pilar - La Plata axis (Argentina). The results, with RMSE values ranging from 4 to 6 cm, lead to the conclusion that, for the sites and conditions studied, precision indices better than 4 cm in the execution of the work cannot be guaranteed in systematization.
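
As a rough illustration of the index used in this abstract, the sketch below computes the RMSE between designed and measured elevations using the standard definition (square root of the mean squared difference). The function name and the elevation values are hypothetical; they are not data from the study.

```python
import math
from typing import Sequence


def rmse(designed: Sequence[float], measured: Sequence[float]) -> float:
    """Root Mean Squared Error between paired elevation values (same units)."""
    if len(designed) != len(measured) or len(designed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(d - m) ** 2 for d, m in zip(designed, measured)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical elevations in metres at four survey points (illustration only).
designed = [10.00, 10.05, 10.10, 10.15]
measured = [10.04, 10.01, 10.14, 10.11]
print(f"RMSE = {rmse(designed, measured) * 100:.1f} cm")
```

Because the differences are squared before averaging, a few large deviations weigh more heavily than many small ones, which suits an index intended to bound grading precision.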

Relevance:

30.00% 30.00%

Publisher:

Abstract:

European Master Human Rights and Democratisation

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this paper we investigate various algorithms for performing Fast Fourier Transformation (FFT)/Inverse Fast Fourier Transformation (IFFT), and suitable techniques for maximizing FFT/IFFT execution speed, such as pipelining or parallel processing, and the use of memory structures with pre-computed values (look-up tables, LUTs) or other dedicated hardware components (usually multipliers). Furthermore, we discuss the optimal hardware architectures that best apply to various FFT/IFFT algorithms, along with their ability to exploit parallel processing with minimal data dependences in the FFT/IFFT calculations. An interesting approach also considered in this paper is the application of the integrated processing-in-memory Intelligent RAM (IRAM) chip to high-speed FFT/IFFT computing. The results of the assessment study emphasize that the execution speed of FFT/IFFT algorithms is tightly connected to the capability of the FFT/IFFT hardware to support the parallelism provided by the given algorithm. Therefore, we suggest that the basic Discrete Fourier Transform (DFT)/Inverse Discrete Fourier Transform (IDFT) can also provide high performance, by utilizing a specialized FFT/IFFT hardware architecture that can exploit the parallelism provided by the DFT/IDFT operations. The proposed improvements include simplified multiplications over symbols given in a polar coordinate system, using sine and cosine look-up tables, and an approach for performing parallel addition of N input symbols.
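
The idea of replacing trigonometric evaluation with pre-computed tables can be sketched in software. The example below is a direct O(N^2) DFT whose twiddle factors are read from sine and cosine tables indexed by (k * t) mod N. It is only an illustration under that assumption: it is not the hardware architecture proposed in the paper, and the function names are hypothetical.

```python
import math
from typing import List, Tuple


def make_luts(n: int) -> Tuple[List[float], List[float]]:
    """Pre-compute cos and sin of 2*pi*i/n for i = 0 .. n-1."""
    cos_lut = [math.cos(2 * math.pi * i / n) for i in range(n)]
    sin_lut = [math.sin(2 * math.pi * i / n) for i in range(n)]
    return cos_lut, sin_lut


def dft_with_lut(x: List[complex]) -> List[complex]:
    """Direct DFT; every twiddle factor is a table lookup, not a trig call."""
    n = len(x)
    cos_lut, sin_lut = make_luts(n)
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            idx = (k * t) % n  # angle index into the tables
            # exp(-2*pi*i*k*t/n) = cos(theta) - i*sin(theta)
            acc += x[t] * complex(cos_lut[idx], -sin_lut[idx])
        out.append(acc)
    return out


# A constant signal concentrates all of its energy in bin 0.
spectrum = dft_with_lut([1, 1, 1, 1])
print([round(abs(v), 6) for v in spectrum])
```

Each output bin is an independent sum over the inputs, which is the parallelism the abstract refers to: the N accumulations can in principle proceed concurrently.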

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this paper we investigate various algorithms for performing Fast Fourier Transformation (FFT)/Inverse Fast Fourier Transformation (IFFT), and suitable techniques for maximizing FFT/IFFT execution speed, such as pipelining or parallel processing, and the use of memory structures with pre-computed values (look-up tables, LUTs) or other dedicated hardware components (usually multipliers). Furthermore, we discuss the optimal hardware architectures that best apply to various FFT/IFFT algorithms, along with their ability to exploit parallel processing with minimal data dependences in the FFT/IFFT calculations. An interesting approach also considered in this paper is the application of the integrated processing-in-memory Intelligent RAM (IRAM) chip to high-speed FFT/IFFT computing. The results of the assessment study emphasize that the execution speed of FFT/IFFT algorithms is tightly connected to the capability of the FFT/IFFT hardware to support the parallelism provided by the given algorithm. Therefore, we suggest that the basic Discrete Fourier Transform (DFT)/Inverse Discrete Fourier Transform (IDFT) can also provide high performance, by utilizing a specialized FFT/IFFT hardware architecture that can exploit the parallelism provided by the DFT/IDFT operations. The proposed improvements include simplified multiplications over symbols given in a polar coordinate system, using sine and cosine look-up tables, and an approach for performing parallel addition of N input symbols.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Specific choices about how to represent complex networks can have a substantial impact on the execution time required for the respective construction and analysis of those structures. In this work we report a comparison of the effects of representing complex networks statically by adjacency matrices or dynamically by adjacency lists. Three theoretical models of complex networks are considered: two types of Erdős-Rényi as well as the Barabási-Albert model. We investigated the effect of the different representations with respect to the construction and measurement of several topological properties (i.e. degree, clustering coefficient, shortest path length, and betweenness centrality). We found that different forms of representation generally have a substantial effect on the execution time, with the sparse representation frequently resulting in remarkably superior performance. (C) 2011 Elsevier B.V. All rights reserved.
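
The two representations compared in this abstract can be sketched as follows, assuming an undirected simple graph with nodes numbered from 0. The function names are hypothetical and the code is not taken from the paper; it only shows why a degree query touches N entries per node with a matrix but only the stored neighbours with a list.

```python
from typing import Dict, List, Tuple


def to_matrix(n: int, edges: List[Tuple[int, int]]) -> List[List[int]]:
    """Static representation: n x n adjacency matrix, O(n^2) memory."""
    m = [[0] * n for _ in range(n)]
    for u, v in edges:
        m[u][v] = m[v][u] = 1
    return m


def to_lists(n: int, edges: List[Tuple[int, int]]) -> Dict[int, List[int]]:
    """Dynamic representation: one neighbour list per node, O(n + |E|) memory."""
    adj: Dict[int, List[int]] = {u: [] for u in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def degrees_from_matrix(m: List[List[int]]) -> List[int]:
    """Scans all n entries of each row, regardless of how sparse it is."""
    return [sum(row) for row in m]


def degrees_from_lists(adj: Dict[int, List[int]]) -> List[int]:
    """Reads only the length of each stored neighbour list."""
    return [len(adj[u]) for u in range(len(adj))]


edges = [(0, 1), (0, 2), (2, 3)]
matrix = to_matrix(4, edges)
lists = to_lists(4, edges)
print(degrees_from_matrix(matrix), degrees_from_lists(lists))
```

For sparse networks the list form stores and visits far fewer entries, which is consistent with the execution-time advantage the abstract reports for the sparse representation.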

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In the present work, the effects of spatial constraints on the efficiency of task execution in systems underlain by geographical complex networks are investigated, where the probability of connection decreases with the distance between the nodes. The investigation considers several configurations of the parameters defining the network connectivity, and the Barabási-Albert network model is also considered for comparison. The results show that the effect of connectivity is significant only for shorter tasks, that the locality of connections implied by the spatial constraints reduces efficiency, and that the addition of edges can improve the efficiency of the execution, although with increasing locality of the connections the improvement is small.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

“Biosim” is simulation software that simulates the harvesting system. It can model any logistics problem by combining several objects, so that the artificial system shows the performance of an individual model. The system also describes the efficiency of that particular model and its suitability for real-life application. Thus, anyone who wishes to set up a logistics model such as a harvesting system in real life can be informed in advance about a suitable position for the plants and factories, the minimum number of objects, the total time to complete the task, the total investment required, and the total amount of noise produced by the establishment. It provides an advance overview of the model. However, “Biosim” is quite slow: as an object-based system, it takes a long time to make its decisions. The main task here is to modify the system so that it works faster than before. The main objective of this thesis is therefore to reduce the load of “Biosim” by modifying the original system and to increase its efficiency, so that the whole system is faster than the previous one and performs more efficiently when applied in real life. The concept is to separate the execution part of “Biosim” from its graphical engine and run this separated portion on a third-generation language platform. C++ is chosen here as this external platform. After completing the proposed system, results with different models were observed. The results show that, for any type of plants or fields and for any number of trucks, the proposed system is faster than the original system. The proposed system takes at least 15% less time than the original “Biosim”. The efficiency increases with the complexity of the model.
The more complex the model, the more efficient the proposed system is compared with the original “Biosim”. Depending on the complexity of a model, the proposed system can be 56.53% faster than the original “Biosim”.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In many hymenopteran insect societies, selfish workers are policed, as selfishness can negatively affect the average inclusive fitness of one or both castes by reducing either the degree of average relatedness to the colony's male offspring or colony efficiency. In stingless bees, the rapid capping of brood cells could aid in controlling selfishness; to this end, we studied cell-sealing efficacy in Melipona bicolor. Execution of cell sealing was found to be both rapid and almost continuous. Comparing the performance of reproductive and non-reproductive workers, the former sealed the cells more efficiently when they contained their own eggs, but less so when the queens' eggs were involved. We argue that the occurrence of disruptions in cell sealing through self-serving reproductive workers is capable of undermining sealing efficacy as a policing instrument, thus making reproductive workers potential rogue individuals.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We present the design and implementation of the and-parallel component of ACE. ACE is a computational model for the full Prolog language that simultaneously exploits both or-parallelism and independent and-parallelism. A high performance implementation of the ACE model has been realized and its performance reported in this paper. We discuss how some of the standard problems which appear when implementing and-parallel systems are solved in ACE. We then propose a number of optimizations aimed at reducing the overheads and the increased memory consumption which occur in such systems when using previously proposed solutions. Finally, we present results from an implementation of ACE which includes the optimizations proposed. The results show that ACE exploits and-parallelism with high efficiency and high speedups. Furthermore, they also show that the proposed optimizations, which are applicable to many other and-parallel systems, significantly decrease memory consumption and increase speedups and absolute performance both in forwards execution and during backtracking.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper presents some fundamental properties of independent and-parallelism and extends its applicability by enlarging the class of goals eligible for parallel execution. A simple model of (independent) and-parallel execution is proposed and issues of correctness and efficiency discussed in the light of this model. Two conditions, "strict" and "non-strict" independence, are defined and then proved sufficient to ensure correctness and efficiency of parallel execution: if goals which meet these conditions are executed in parallel the solutions obtained are the same as those produced by standard sequential execution. Also, in the absence of failure, the parallel proof procedure does not generate any additional work (with respect to standard SLD-resolution) while the actual execution time is reduced. Finally, in case of failure of any of the goals no slowdown will occur. For strict independence the results are shown to hold independently of whether the parallel goals execute in the same environment or in separate environments. In addition, a formal basis is given for the automatic compile-time generation of independent and-parallelism: compile-time conditions to efficiently check goal independence at run-time are proposed and proved sufficient. Also, rules are given for constructing simpler conditions if information regarding the binding context of the goals to be executed in parallel is available to the compiler.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper presents and proves some fundamental results for independent and-parallelism (IAP). First, the paper treats the issues of correctness and efficiency: after defining strict and non-strict goal independence, it is proved that if strictly independent goals are executed in parallel the solutions obtained are the same as those produced by standard sequential execution. It is also shown that, in the absence of failure, the parallel proof procedure doesn't generate any additional work (with respect to standard SLD-resolution) while the actual execution time is reduced. The same results hold even if non-strictly independent goals are executed in parallel, provided a trivial rewriting of such goals is performed. In addition, and most importantly, the paper treats the issue of compile-time generation of IAP by proposing conditions, to be written at compile-time, to efficiently check strict and non-strict goal independence at run-time and proving the sufficiency of such conditions. It is also shown how simpler conditions can be constructed if some information regarding the binding context of the goals to be executed in parallel is available to the compiler through either local or program-level analysis. These results therefore provide a formal basis for the automatic compile-time generation of IAP. As a corollary of such results, the paper also proves that negative goals are always non-strictly independent, and that goals which share a first occurrence of an existential variable are never independent.