171 resultados para Networking and Communications
Resumo:
Multicore computational accelerators such as GPUs are now commodity components for highperformance computing at scale. While such accelerators have been studied in some detail as stand-alone computational engines, their integration in large-scale distributed systems raises new challenges and trade-offs. In this paper, we present an exploration of resource management alternatives for building asymmetric accelerator-based distributed systems. We present these alternatives in the context of a capabilities-aware framework for data-intensive computing, which uses an enhanced implementation of the MapReduce programming model for accelerator-based clusters, compared to the state of the art. The framework can transparently utilize heterogeneous accelerators for deriving high performance with low programming effort. Our work is the first to compare heterogeneous types of accelerators, GPUs and a Cell processors, in the same environment and the first to explore the trade-offs between compute-efficient and control-efficient accelerators on data-intensive systems. Our investigation shows that our framework scales well with the number of different compute nodes. Furthermore, it runs simultaneously on two different types of accelerators, successfully adapts to the resource capabilities, and performs 26.9% better on average than a static execution approach.
Resumo:
Advances in silicon technology have been a key development in the realisation of many telecommunication and signal processing systems. In many cases, the development of application-specific digital signal processing (DSP) chips is the most cost-effective solution and provides the highest performance. Advances made in computer-aided design (CAD) tools and design methodologies now allow designers to develop complex chips within months or even weeks. This paper gives an insight into the challenges and design methodologies of implementing advanced highperformance chips for DSP. In particular, the paper reviews some of the techniques used to develop circuit architectures from high-level descriptions and the tools which are then used to realise silicon layout.
Resumo:
Traditional static analysis fails to auto-parallelize programs with a complex control and data flow. Furthermore, thread-level parallelism in such programs is often restricted to pipeline parallelism, which can be hard to discover by a programmer. In this paper we propose a tool that, based on profiling information, helps the programmer to discover parallelism. The programmer hand-picks the code transformations from among the proposed candidates which are then applied by automatic code transformation techniques.
This paper contributes to the literature by presenting a profiling tool for discovering thread-level parallelism. We track dependencies at the whole-data structure level rather than at the element level or byte level in order to limit the profiling overhead. We perform a thorough analysis of the needs and costs of this technique. Furthermore, we present and validate the belief that programs with complex control and data flow contain significant amounts of exploitable coarse-grain pipeline parallelism in the program’s outer loops. This observation validates our approach to whole-data structure dependencies. As state-of-the-art compilers focus on loops iterating over data structure members, this observation also explains why our approach finds coarse-grain pipeline parallelism in cases that have remained out of reach for state-of-the-art compilers. In cases where traditional compilation techniques do find parallelism, our approach allows to discover higher degrees of parallelism, allowing a 40% speedup over traditional compilation techniques. Moreover, we demonstrate real speedups on multiple hardware platforms.
Resumo:
As a result of resource limitations, state in branch predictors is frequently shared between uncorrelated branches. This interference can significantly limit prediction accuracy. In current predictor designs, the branches sharing prediction information are determined by their branch addresses and thus branch groups are arbitrarily chosen during compilation. This feasibility study explores a more analytic and systematic approach to classify branches into clusters with similar behavioral characteristics. We present several ways to incorporate this cluster information as an additional information source in branch predictors.
Resumo:
Engagement with globalisation is growing in the field of youth transitions from out of home care. This includes cross national exchange of research, policy and practise, regional advocacy networking and global policy development. Furthering this emerging international child welfare perspective requires extending it to countries in the developing world and building conceptual frameworks which encompass a social ecology of care leaving, including its global dimension, the latter needs to address not only the needs, expectations and rights of care leavers but also the theories of change underpinning service design and delivery. Such a model is presented combining resilience and social capital as personal assets situated within a social ecology of support. To illustrate how this provides a means to help engage with the experience of countries where there appears to be very little information available on care leaving, a small scale South African initiative is considered. SA-YES is a youth mentoring project for young people leaving a variety of out of home placements. Planned as a three-year pilot, initial results are encouraging but require more rigorous evaluation focusing on program process and outcomes, quality of interpersonal relationships and synchronisation with cultural expectations and policy environment.
Resumo:
A cyberwar exists between malware writers and antimalware researchers. At this war's heart rages a weapons race that originated in the 80s with the first computer virus. Obfuscation is one of the latest strategies to camouflage the telltale signs of malware, undermine antimalware software, and thwart malware analysis. Malware writers use packers, polymorphic techniques, and metamorphic techniques to evade intrusion detection systems. The need exists for new antimalware approaches that focus on what malware is doing rather than how it's doing it.
Resumo:
This paper proposes a hybrid scanning antenna architecture for applications in mm-wave intelligent mobile sensing and communications. We experimentally demonstrate suitable W-band leaky-wave antenna prototypes in substrate integrated waveguide (SIW) technology. Three SIW antennas have been designed that within a 6.5 % fractional bandwidth provide beam scanning over three adjacent angular sectors. Prototypes have been fabricated and their performance has been experimentally evaluated. The measured radiation patterns have shown three frequency scanning beams covering angles from 11 to 56 degrees with beamwidth of 10?±?3 degrees within the 88-94 GHz frequency range.
Resumo:
Autonomic management can be used to improve the QoS provided by parallel/distributed applications. We discuss behavioural skeletons introduced in earlier work: rather than relying on programmer ability to design “from scratch” efficient autonomic policies, we encapsulate general autonomic controller features into algorithmic skeletons. Then we leave to the programmer the duty of specifying the parameters needed to specialise the skeletons to the needs of the particular application at hand. This results in the programmer having the ability to fast prototype and tune distributed/parallel applications with non-trivial autonomic management capabilities. We discuss how behavioural skeletons have been implemented in the framework of GCM(the Grid ComponentModel developed within the CoreGRID NoE and currently being implemented within the GridCOMP STREP project). We present results evaluating the overhead introduced by autonomic management activities as well as the overall behaviour of the skeletons. We also present results achieved with a long running application subject to autonomic management and dynamically adapting to changing features of the target architecture.
Overall the results demonstrate both the feasibility of implementing autonomic control via behavioural skeletons and the effectiveness of our sample behavioural skeletons in managing the “functional replication” pattern(s).
Resumo:
Mixture of Gaussians (MoG) modelling [13] is a popular approach to background subtraction in video sequences. Although the algorithm shows good empirical performance, it lacks theoretical justification. In this paper, we give a justification for it from an online stochastic expectation maximization (EM) viewpoint and extend it to a general framework of regularized online classification EM for MoG with guaranteed convergence. By choosing a special regularization function, l1 norm, we derived a new set of updating equations for l1 regularized online MoG. It is shown empirically that l1 regularized online MoG converge faster than the original online MoG .