970 resultados para Efficient implementation


Relevância:

60.00% 60.00%

Publicador:

Resumo:

For communication-intensive parallel applications, the maximum degree of concurrency achievable is limited by the communication throughput made available by the network. In previous work [HPS94], we showed experimentally that the performance of certain parallel applications running on a workstation network can be improved significantly if a congestion control protocol is used to enhance network performance. In this paper, we characterize and analyze the communication requirements of a large class of supercomputing applications that fall under the category of fixed-point problems, amenable to solution by parallel iterative methods. This results in a set of interface and architectural features sufficient for the efficient implementation of the applications over a large-scale distributed system. In particular, we propose a direct link between the application and network layer, supporting congestion control actions at both ends. This in turn enhances the system's responsiveness to network congestion, improving performance. Measurements are given showing the efficacy of our scheme to support large-scale parallel computations.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We propose Trade & Cap (T&C), an economics-inspired mechanism that incentivizes users to voluntarily coordinate their consumption of the bandwidth of a shared resource (e.g., a DSLAM link) so as to converge on what they perceive to be an equitable allocation, while ensuring efficient resource utilization. Under T&C, rather than acting as an arbiter, an Internet Service Provider (ISP) acts as an enforcer of what the community of rational users sharing the resource decides is a fair allocation of that resource. Our T&C mechanism proceeds in two phases. In the first, software agents acting on behalf of users engage in a strategic trading game in which each user agent selfishly chooses bandwidth slots to reserve in support of primary, interactive network usage activities. In the second phase, each user is allowed to acquire additional bandwidth slots in support of presumed open-ended need for fluid bandwidth, catering to secondary applications. The acquisition of this fluid bandwidth is subject to the remaining "buying power" of each user and by prevalent "market prices" – both of which are determined by the results of the trading phase and a desirable aggregate cap on link utilization. We present analytical results that establish the underpinnings of our T&C mechanism, including game-theoretic results pertaining to the trading phase, and pricing of fluid bandwidth allocation pertaining to the capping phase. Using real network traces, we present extensive experimental results that demonstrate the benefits of our scheme, which we also show to be practical by highlighting the salient features of an efficient implementation architecture.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We describe an approach aimed at addressing the issue of joint exploitation of control (stream) and data parallelism in a skeleton based parallel programming environment, based on annotations and refactoring. Annotations drive efficient implementation of a parallel computation. Refactoring is used to transform the associated skeleton tree into a more efficient, functionally equivalent skeleton tree. In most cases, cost models are used to drive the refactoring process. We show how sample use case applications/kernels may be optimized and discuss preliminary experiments with FastFlow assessing the theoretical results. © 2013 Springer-Verlag.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Data flow techniques have been around since the early '70s when they were used in compilers for sequential languages. Shortly after their introduction they were also consideredas a possible model for parallel computing, although the impact here was limited. Recently, however, data flow has been identified as a candidate for efficient implementation of various programming models on multi-core architectures. In most cases, however, the burden of determining data flow "macro" instructions is left to the programmer, while the compiler/run time system manages only the efficient scheduling of these instructions. We discuss a structured parallel programming approach supporting automatic compilation of programs to macro data flow and we show experimental results demonstrating the feasibility of the approach and the efficiency of the resulting "object" code on different classes of state-of-the-art multi-core architectures. The experimental results use different base mechanisms to implement the macro data flow run time support, from plain pthreads with condition variables to more modern and effective lock- and fence-free parallel frameworks. Experimental results comparing efficiency of the proposed approach with those achieved using other, more classical, parallel frameworks are also presented. © 2012 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We consider a multiple femtocell deployment in a small area which shares spectrum with the underlaid macrocell. We design a joint energy and radio spectrum scheme which aims not only for co-existence with the macrocell, but also for an energy-efficient implementation of the multi-femtocells. Particularly, aggregate energy usage on dense femtocell channels is formulated taking into account the cost of both the spectrum and energy usage. We investigate an energy-and-spectral efficient approach to balance between the two costs by varying the number of active sub-channels and their energy. The proposed scheme is addressed by deriving closed-form expressions for the interference towards the macrocell and the outage capacity. Analytically, discrete regions under which the most promising outage capacity is achieved by the same size of active sub-channels are introduced. Through a joint optimization of the sub-channels and their energy, properties can be found for the maximum outage capacity under realistic constraints. Using asymptotic and numerical analysis, it can be noticed that in a dense femtocell deployment, the optimum utilization of the energy and the spectrum to maximize the outage capacity converges towards a round-robin scheduling approach for a very small outage threshold. This is the inverse of the traditional greedy approach. © 2012 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present BDDT, a task-parallel runtime system that dynamically discovers and resolves dependencies among parallel tasks. BDDT allows the programmer to specify detailed task footprints on any memory address range, multidimensional array tile or dynamic region. BDDT uses a block-based dependence analysis with arbitrary granularity. The analysis is applicable to existing C programs without having to restructure object or array allocation, and provides flexibility in array layouts and tile dimensions.
We evaluate BDDT using a representative set of benchmarks, and we compare it to SMPSs (the equivalent runtime system in StarSs) and OpenMP. BDDT performs comparable to or better than SMPSs and is able to cope with task granularity as much as one order of magnitude finer than SMPSs. Compared to OpenMP, BDDT performs up to 3.9× better for benchmarks that benefit from dynamic dependence analysis. BDDT provides additional data annotations to bypass dependence analysis. Using these annotations, BDDT outperforms OpenMP also in benchmarks where dependence analysis does not discover additional parallelism, thanks to a more efficient implementation of the runtime system.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present a fully-distributed self-healing algorithm DEX, that maintains a constant degree expander network in a dynamic setting. To the best of our knowledge, our algorithm provides the first efficient distributed construction of expanders - whose expansion properties hold deterministically - that works even under an all-powerful adaptive adversary that controls the dynamic changes to the network (the adversary has unlimited computational power and knowledge of the entire network state, can decide which nodes join and leave and at what time, and knows the past random choices made by the algorithm). Previous distributed expander constructions typically provide only probabilistic guarantees on the network expansion which rapidly degrade in a dynamic setting, in particular, the expansion properties can degrade even more rapidly under adversarial insertions and deletions. Our algorithm provides efficient maintenance and incurs a low overhead per insertion/deletion by an adaptive adversary: only O(log n) rounds and O(log n) messages are needed with high probability (n is the number of nodes currently in the network). The algorithm requires only a constant number of topology changes. Moreover, our algorithm allows for an efficient implementation and maintenance of a distributed hash table (DHT) on top of DEX, with only a constant additional overhead. Our results are a step towards implementing efficient self-healing networks that have guaranteed properties (constant bounded degree and expansion) despite dynamic changes.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present a fully-distributed self-healing algorithm dex that maintains a constant degree expander network in a dynamic setting. To the best of our knowledge, our algorithm provides the first efficient distributed construction of expanders—whose expansion properties holddeterministically—that works even under an all-powerful adaptive adversary that controls the dynamic changes to the network (the adversary has unlimited computational power and knowledge of the entire network state, can decide which nodes join and leave and at what time, and knows the past random choices made by the algorithm). Previous distributed expander constructions typically provide only probabilistic guarantees on the network expansion whichrapidly degrade in a dynamic setting; in particular, the expansion properties can degrade even more rapidly under adversarial insertions and deletions. Our algorithm provides efficient maintenance and incurs a low overhead per insertion/deletion by an adaptive adversary: only O(logn)O(log⁡n) rounds and O(logn)O(log⁡n) messages are needed with high probability (n is the number of nodes currently in the network). The algorithm requires only a constant number of topology changes. Moreover, our algorithm allows for an efficient implementation and maintenance of a distributed hash table on top of dex  with only a constant additional overhead. Our results are a step towards implementing efficient self-healing networks that have guaranteed properties (constant bounded degree and expansion) despite dynamic changes.

Gopal Pandurangan has been supported in part by Nanyang Technological University Grant M58110000, Singapore Ministry of Education (MOE) Academic Research Fund (AcRF) Tier 2 Grant MOE2010-T2-2-082, MOE AcRF Tier 1 Grant MOE2012-T1-001-094, and the United States-Israel Binational Science Foundation (BSF) Grant 2008348. Peter Robinson has been supported by Grant MOE2011-T2-2-042 “Fault-tolerant Communication Complexity in Wireless Networks” from the Singapore MoE AcRF-2. Work done in part while the author was at the Nanyang Technological University and at the National University of Singapore. Amitabh Trehan has been supported by the Israeli Centers of Research Excellence (I-CORE) program (Center No. 4/11). Work done in part while the author was at Hebrew University of Jerusalem and at the Technion and supported by a Technion fellowship.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

O desenvolvimento de sistemas computacionais é um processo complexo, com múltiplas etapas, que requer uma análise profunda do problema, levando em consideração as limitações e os requisitos aplicáveis. Tal tarefa envolve a exploração de técnicas alternativas e de algoritmos computacionais para optimizar o sistema e satisfazer os requisitos estabelecidos. Neste contexto, uma das mais importantes etapas é a análise e implementação de algoritmos computacionais. Enormes avanços tecnológicos no âmbito das FPGAs (Field-Programmable Gate Arrays) tornaram possível o desenvolvimento de sistemas de engenharia extremamente complexos. Contudo, o número de transístores disponíveis por chip está a crescer mais rapidamente do que a capacidade que temos para desenvolver sistemas que tirem proveito desse crescimento. Esta limitação já bem conhecida, antes de se revelar com FPGAs, já se verificava com ASICs (Application-Specific Integrated Circuits) e tem vindo a aumentar continuamente. O desenvolvimento de sistemas com base em FPGAs de alta capacidade envolve uma grande variedade de ferramentas, incluindo métodos para a implementação eficiente de algoritmos computacionais. Esta tese pretende proporcionar uma contribuição nesta área, tirando partido da reutilização, do aumento do nível de abstracção e de especificações algorítmicas mais automatizadas e claras. Mais especificamente, é apresentado um estudo que foi levado a cabo no sentido de obter critérios relativos à implementação em hardware de algoritmos recursivos versus iterativos. Depois de serem apresentadas algumas das estratégias para implementar recursividade em hardware mais significativas, descreve-se, em pormenor, um conjunto de algoritmos para resolver problemas de pesquisa combinatória (considerados enquanto exemplos de aplicação). Versões recursivas e iterativas destes algoritmos foram implementados e testados em FPGA. Com base nos resultados obtidos, é feita uma cuidada análise comparativa. Novas ferramentas e técnicas de investigação que foram desenvolvidas no âmbito desta tese são também discutidas e demonstradas.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Consideramos o problema de controlo óptimo de tempo mínimo para sistemas de controlo mono-entrada e controlo afim num espaço de dimensão finita com condições inicial e final fixas, onde o controlo escalar toma valores num intervalo fechado. Quando aplicamos o método de tiro a este problema, vários obstáculos podem surgir uma vez que a função de tiro não é diferenciável quando o controlo é bang-bang. No caso bang-bang os tempos conjugados são teoricamente bem definidos para este tipo de sistemas de controlo, contudo os algoritmos computacionais directos disponíveis são de difícil aplicação. Por outro lado, no caso suave o conceito teórico e prático de tempos conjugados é bem conhecido, e ferramentas computacionais eficazes estão disponíveis. Propomos um procedimento de regularização para o qual as soluções do problema de tempo mínimo correspondente dependem de um parâmetro real positivo suficientemente pequeno e são definidas por funções suaves em relação à variável tempo, facilitando a aplicação do método de tiro simples. Provamos, sob hipóteses convenientes, a convergência forte das soluções do problema regularizado para a solução do problema inicial, quando o parâmetro real tende para zero. A determinação de tempos conjugados das trajectórias localmente óptimas do problema regularizado enquadra-se na teoria suave conhecida. Provamos, sob hipóteses adequadas, a convergência do primeiro tempo conjugado do problema regularizado para o primeiro tempo conjugado do problema inicial bang-bang, quando o parâmetro real tende para zero. Consequentemente, obtemos um algoritmo eficiente para a computação de tempos conjugados no caso bang-bang.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Cette étude est consacrée à l’université publique marocaine. Elle se situe dans le champ de l’enseignement supérieur public. Les chercheurs du secteur universitaire au Maroc qualifient la gestion de l’enseignement supérieur de centralisée, bureaucratisée, rigide et incapable de trouver des réponses efficaces à la société. L’université publique marocaine vit une crise : elle a fait l’objet de nombreux critiques sur la nature des services universitaires. Sur le plan académique, elle est inappropriée pour faire face à la demande sociale en matière de l’enseignement universitaire. Sur le plan interne, elle est inadaptée à cause de dysfonctionnement pédagogie, organisationnel et administratif. L’université publique n’a pas été apte à s’adapter au secteur privé en créant des débouchés viables pour ses diplômés. Devant la gravité de la situation de l’enseignement supérieur public marocain, une Commission Royale Spéciale a été créée, dont le mandat était de trouver une meilleure façon de rationaliser le système universitaire. C’est ainsi qu’en 1999, la Commission a établi une Charte nationale de l’éducation et de la formation. Les premiers éléments de la nouvelle réforme ont été mis en application dès la rentrée universitaire 2003-2004. Cette nouvelle réforme est perçue comme un moyen d’améliorer le fonctionnement des établissements universitaires publics. Son objectif principal est de réformer d’une manière globale le système universitaire public. Dans les recherches qui se sont intéressées à la réforme de l’université publique marocaine, nous avons constaté qu’il y a une absence de documentation en ce qui trait aux réactions des acteurs universitaires et professionnels face aux orientations de cette réforme. Dans le but d’apporter des éclaircissements, nous nous sommes fixé un double objectif : déterminer, à partir de la perception d’acteurs universitaires, les effets des orientations de la nouvelle réforme et de ses modalités; connaître les changements organisationnels et leurs exigences. La stratégie de recherche répondant le mieux à notre double objectif était la recherche exploratoire. La démarche que nous avons privilégiée fut celle d’une première étude avant l’implantation de la nouvelle réforme et d’une autre après trois semestres de son implantation. Les questions qui ont soutenu notre recherche sont les suivantes : les attitudes des acteurs universitaires ont-elles été modifiées par l’introduction de la nouvelle réforme? Si oui, dans quel sens ont-elles été modifiées? Est-ce que la nouvelle réforme a modifié les pratiques pédagogiques et financières dans le sens indiqué par la charte? Quelles formes de contribution des acteurs universitaires peuvent-ils apporter à une implantation efficace de la nouvelle réforme? Parmi les quatorze universités publiques que compte le Maroc, nous avons choisi l’Université Mohammed V de Rabat-Salé. Cet établissement est l'une des universités les plus anciennes au Maroc. Elle est caractérisée par un nombre significatif de départements qui ont un potentiel de recherche et une réputation nationale. Aucune université ne dispose d’autant de facultés et de différentes disciplines : lettres, sciences, économie, droit, médecine et pharmacie, médecine dentaire, ingénierie, technologie et autres. La démarche méthodologique retenue est axée sur des entrevues auprès des acteurs universitaires et professionnels de trois facultés : 1) faculté des Lettres et Sciences humaines, 2) faculté des Sciences juridiques, économiques et sociales, 3) faculté des Sciences. Celles-ci sont considérées comme des facultés pilotes par rapport à la nouvelle réforme. Nous avons entrepris deux séries d’entrevues : la première en 2001 avant l’implantation de la nouvelle réforme de l’université et la deuxième en 2005 après son implantation. Nous avons mené au total quarante-cinq (45) entrevues qui se sont déroulées en deux périodes : la première a eu lieu entre décembre 2000 et janvier 2001 et la deuxième entre décembre 2004 et janvier 2005. Lors de la première série d’entrevues, notre protocole était composé de questions spécifiques portant sur les initiatives inhérentes à la mise en application d’un système modulaire, sur les procédures pour restructurer la formation universitaire publique, sur le développement de projets spéciaux et de matériel didactique en rapport avec le nouveau système pédagogique et sur les propositions et les procédures pour la participation de l’université au marché du travail. Nous avons aussi posé des questions concernant les aspects financiers. Enfin, pour mieux comprendre le contexte, des questions portaient sur les évaluations et les recommandations de la nouvelle réforme de l’université publique. Au cours de la deuxième période d’entrevues, nous avons recueilli des données sur le soutien du département au pilotage des objectifs de la nouvelle réforme universitaire, le soutien des instances professionnelles à l’avancement de la réforme, la coopération des enseignants au plan de l’avancement des pratiques pédagogiques et les conditions nécessaires à une implantation efficace. Les réponses obtenues auprès des acteurs universitaires et professionnels ont été soumises à une analyse de contenu. Nous avons opté pour le modèle politique comme cadre conceptuel de notre recherche. Ce modèle nous a aidés à montrer l’importance des acteurs universitaires et professionnels dans les démarches pour l’application de la nouvelle réforme. Il nous a aidés également à comprendre comment les caractéristiques de la communauté universitaire peuvent faciliter ou bloquer la réussite de la réforme en cours. Cette recherche montre dans quelle mesure les objectifs de la nouvelle réforme fixés par la Commission Royale Spéciale sont en voie de réalisation. En ce sens, notre recherche pourrait être utile au plan national marocain : elle pourrait aider les responsables politiques et les administrateurs universitaires à prendre des décisions appropriées au processus d’implantation de la nouvelle réforme universitaire.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A new information-theoretic approach is presented for finding the pose of an object in an image. The technique does not require information about the surface properties of the object, besides its shape, and is robust with respect to variations of illumination. In our derivation, few assumptions are made about the nature of the imaging process. As a result the algorithms are quite general and can foreseeably be used in a wide variety of imaging situations. Experiments are presented that demonstrate the approach registering magnetic resonance (MR) images with computed tomography (CT) images, aligning a complex 3D object model to real scenes including clutter and occlusion, tracking a human head in a video sequence and aligning a view-based 2D object model to real images. The method is based on a formulation of the mutual information between the model and the image called EMMA. As applied here the technique is intensity-based, rather than feature-based. It works well in domains where edge or gradient-magnitude based methods have difficulty, yet it is more robust than traditional correlation. Additionally, it has an efficient implementation that is based on stochastic approximation. Finally, we will describe a number of additional real-world applications that can be solved efficiently and reliably using EMMA. EMMA can be used in machine learning to find maximally informative projections of high-dimensional data. EMMA can also be used to detect and correct corruption in magnetic resonance images (MRI).

Relevância:

60.00% 60.00%

Publicador:

Resumo:

MPJ Express is our implementation of MPI-like bindings for Java. In this paper we discuss our intermediate buffering layer that makes use of the so-called direct byte buffers introduced in the Java New I/O package. The purpose of this layer is to support the implementation of derived datatypes. MPJ Express is the first Java messaging library that implements this feature using pure Java. In addition, this buffering layer allows efficient implementation of communication devices based on proprietary networks such as Myrinet. In this paper we evaluate the performance of our buffering layer and demonstrate the usefulness of direct byte buffers. Also, we evaluate the performance of MPJ Express against other messaging systems using Myrinet and show that our buffering layer has made it possible to avoid the overheads suffered by other Java systems such as mpiJava that relies on the Java Native Interface.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

MPJ Express is our implementation of MPI-like bindings for Java. In this paper we discuss our intermediate buffering layer that makes use of the so-called direct byte buffers introduced in the Java New I/O package. The purpose of this layer is to support the implementation of derived datatypes. MPJ Express is the first Java messaging library that implements this feature using pure Java. In addition, this buffering layer allows efficient implementation of communication devices based on proprietary networks such as Myrinet. In this paper we evaluate the performance of our buffering layer and demonstrate the usefulness of direct byte buffers. Also, we evaluate the performance of MPJ Express against other messaging systems using Myrinet and show that our buffering layer has made it possible to avoid the overheads suffered by other Java systems such as mpiJava that relies on the Java Native Interface.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper, we present a polynomial-based noise variance estimator for multiple-input multiple-output single-carrier block transmission (MIMO-SCBT) systems. It is shown that the optimal pilots for noise variance estimation satisfy the same condition as that for channel estimation. Theoretical analysis indicates that the proposed estimator is statistically more efficient than the conventional sum of squared residuals (SSR) based estimator. Furthermore, we obtain an efficient implementation of the estimator by exploiting its special structure. Numerical results confirm our theoretical analysis.