459 results for T-parallelism
Abstract:
The work presented in this report aims to implement a cost-effective offline mission path planner for aerial inspection tasks of large linear infrastructures. Like most real-world optimisation problems, mission path planning involves a number of objectives which ideally should be minimised simultaneously. In practice, the objectives of an optimisation problem conflict with each other, and minimising one of them makes it impossible to minimise the others. This leads to the need to find a set of optimal solutions for the problem; once such a set of available options is produced, the mission planning problem is reduced to a decision-making problem for the mission specialists, who will choose the solution which best fits the requirements of the mission. The goal of this work is then to develop a Multi-Objective optimisation tool able to provide the mission specialists with a set of optimal solutions for the inspection task, amongst which the final trajectory will be chosen, given the environment data, the mission requirements and the definition of the objectives to minimise. All the possible optimal solutions of a Multi-Objective optimisation problem are said to form the Pareto-optimal front of the problem. For any of the Pareto-optimal solutions, it is impossible to improve one objective without worsening at least one other. Amongst a set of Pareto-optimal solutions, no solution is absolutely better than another and the final choice must be a trade-off between the objectives of the problem. Multi-Objective Evolutionary Algorithms (MOEAs) are recognised to be a convenient method for exploring the Pareto-optimal front of Multi-Objective optimisation problems. Their efficiency is due to their parallel architecture, which allows several optimal solutions to be found at each time
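The Pareto-optimality notion described in this abstract can be illustrated with a minimal sketch. It is not the planner or the MOEA from the report: it is a brute-force filter over a hand-made list of candidates, and the objective pairs (read here as hypothetical flight time and energy values) are assumptions chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` dominates `b` when every objective is minimised.

    `a` dominates `b` when it is no worse in every objective and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidate trajectories as (flight_time, energy) pairs.
candidates = [(10, 5), (8, 7), (12, 4), (9, 9), (11, 6)]
front = pareto_front(candidates)
# (9, 9) is dominated by (8, 7) and (11, 6) by (10, 5); the rest trade off.
```

None of the three remaining points is better than another in both objectives, which is the trade-off the mission specialists would have to resolve.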
Abstract:
Multiprocessor systems which afford a high degree of parallelism are used in a variety of applications. The extremely stringent reliability requirement has made the provision of fault-tolerance an important aspect in the design of such systems. This paper presents a review of the various approaches towards tolerating hardware faults in multiprocessor systems. It emphasizes the basic concepts of fault-tolerant design and the various problems to be taken care of by the designer. An in-depth survey of the various models, techniques and methods for fault diagnosis is given. Further, we consider the strategies for fault-tolerance in specialized multiprocessor architectures which have the ability of dynamic reconfiguration and are suited to VLSI implementation. An analysis of the state-of-the-art is given which points out the major aspects of fault-tolerance in such architectures.
Abstract:
The further development of Taqman quantitative real-time PCR (qPCR) assays for the absolute quantitation of Marek's disease virus serotype 1 (MDV1) and Herpesvirus of turkeys (HVT) viruses is described and the sensitivity and reproducibility of each assay reported. Using plasmid DNA copies, the lower limit of detection was determined to be 5 copies for the MDV1 assay and 75 copies for the HVT assay. Both assays were found to be highly reproducible for Ct values and calculated copy numbers with mean intra- and inter-assay coefficients of variation being less than 5% for Ct and 20% for calculated copy number. The genome copy number of MDV1 and HVT viruses was quantified in PBL and feather tips from experimentally infected chickens, and field poultry dust samples. Parallelism was demonstrated between the plasmid-based standard curves, and standard curves derived from infected spleen material containing both viral and host DNA, allowing the latter to be used for absolute quantification. These methods should prove useful for the reliable differentiation and absolute quantitation of MDV1 and HVT viruses in a wide range of samples.
Abstract:
The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general purpose multicore architectures. The StreamIt graphs describe task, data and pipeline parallelism which can be exploited on accelerators such as Graphics Processing Units (GPUs) or CellBE which support abundant parallelism in hardware. In this paper, we describe a novel method to orchestrate the execution of a StreamIt program on a multicore platform equipped with an accelerator. The proposed approach identifies, using profiling, the relative benefits of executing a task on the superscalar CPU cores and the accelerator. We formulate the problem of partitioning the work between the CPU cores and the GPU, taking into account the latencies for data transfers and the required buffer layout transformations associated with the partitioning, as an integrated Integer Linear Program (ILP) which can then be solved by an ILP solver. We also propose an efficient heuristic algorithm for the work-partitioning between the CPU and the GPU, which provides solutions which are within 9.05% of the optimal solution on an average across the benchmark suite. The partitioned tasks are then software pipelined to execute on the multiple CPU cores and the Streaming Multiprocessors (SMs) of the GPU. The software pipelining algorithm orchestrates the execution between CPU cores and the GPU by emitting the code for the CPU and the GPU, and the code for the required data transfers. Our experiments on a platform with 8 CPU cores and a GeForce 8800 GTS 512 GPU show a geometric mean speedup of 6.94X with a maximum of 51.96X over a single threaded CPU execution across the StreamIt benchmarks. This is an 18.9% improvement over a partitioning strategy that maps only the filters that cannot be executed on the GPU - the filters with state that is persistent across firings - onto the CPU.
Abstract:
This paper presents an inverse dynamic formulation by the Newton–Euler approach for the Stewart platform manipulator of the most general architecture and models all the dynamic and gravity effects as well as the viscous friction at the joints. It is shown that a proper elimination procedure results in a remarkably economical and fast algorithm for the solution of actuator forces, which makes the method quite suitable for on-line control purposes. In addition, the parallelism inherent in the manipulator and in the modelling makes the algorithm quite efficient in a parallel computing environment, where it can be made as fast as the corresponding formulation for the 6-dof serial manipulator. The formulation has been implemented in a program and has been used for a few trajectories planned for a test manipulator. Results of simulation presented in the paper reveal the nature of the variation of actuator forces in the Stewart platform and justify the dynamic modelling for control.
Abstract:
Previous studies have shown that buffering packets in DRAM is a performance bottleneck. In order to understand the impediments in accessing the DRAM, we developed a detailed Petri net model of an IP forwarding application on the IXP2400 that models the different levels of the memory hierarchy. The cell based interface used to receive and transmit packets in a network processor leads to some small size DRAM accesses. Such narrow accesses to the DRAM expose the bank access latency, reducing the bandwidth that can be realized. With real traces, up to 30% of the accesses are smaller than the cell size, resulting in a 7.7% reduction in DRAM bandwidth. To overcome this problem, we propose buffering these small chunks of data in the on-chip scratchpad memory. This scheme also exploits a greater degree of parallelism between different levels of the memory hierarchy. Using real traces from the Internet, we show that the transmit rate can be improved by an average of 21% over the base scheme without the use of additional hardware. Further, the impact of different traffic patterns on the network processor resources is studied. Under real traffic conditions, we show that the data bus which connects the off-chip packet buffer to the micro-engines is the obstacle to achieving higher throughput.
Abstract:
The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general purpose multi-core architectures. This model allows programmers to specify the structure of a program as a set of filters that act upon data, and a set of communication channels between them. The StreamIt graphs describe task, data and pipeline parallelism which can be exploited on modern Graphics Processing Units (GPUs), as they support abundant parallelism in hardware. In this paper, we describe the challenges in mapping StreamIt to GPUs and propose an efficient technique to software pipeline the execution of stream programs on GPUs. We formulate this problem - both scheduling and assignment of filters to processors - as an efficient Integer Linear Program (ILP), which is then solved using ILP solvers. We also describe a novel buffer layout technique for GPUs which facilitates exploiting the high memory bandwidth available in GPUs. The proposed scheduling utilizes both the scalar units in the GPU, to exploit data parallelism, and the multiprocessors, to exploit task and pipeline parallelism. Further, it takes into consideration the synchronization and bandwidth limitations of GPUs, and yields speedups between 1.87X and 36.83X over a single threaded CPU.
Abstract:
REDEFINE is a reconfigurable SoC architecture that provides a unique platform for high performance and low power computing by exploiting the synergistic interaction between coarse grain dynamic dataflow model of computation (to expose abundant parallelism in applications) and runtime composition of efficient compute structures (on the reconfigurable computation resources). We propose and study the throttling of execution in REDEFINE to maximize the architecture efficiency. A feature-specific, fast hybrid (mixed level) simulation framework for study early in the design phase is developed and implemented to make the huge design space exploration practical. We do performance modeling in terms of selection of important performance criteria and ranking of the explored throttling schemes, and investigate the effectiveness of the design space exploration using statistical hypothesis testing. We find throttling schemes which give appreciable (24.8%) overall performance gain in the architecture and 37% resource usage gain in the throttling unit simultaneously.
Abstract:
Administration of 3,5-diethoxy carbonyl-1,4-dihydrocollidine (DDC) to mice resulted in a striking increase in the level of δ-aminolevulinic acid (ALA) synthetase in liver. Although the enzyme activity was primarily localized in mitochondria and postmicrosomal supernatant fluid, a significant level of activity was also detected in purified nuclei. The time course of induction showed a close parallelism between the bound and free enzyme activities with the former always accounting for a higher percentage of the total activity as compared to the latter. Studies with cycloheximide indicated a half-life of around 3 hr for both the bound and free ALA synthetase. Actinomycin D and hemin prevented enzyme induction when administered along with DDC, but when administered 12 hr after DDC treatment, Actinomycin D did not lead to a decay of either the bound or free enzyme activity, and hemin inhibited the bound enzyme activity but not the free enzyme level. The molecular sizes of the mitochondrial and cytosolic ALA synthetase(s) were found to be similar on Sephadex columns.
Abstract:
The announcement of Turkey as a European Union (EU) candidate country at the Helsinki Summit (10-11 December 1999) marked a distinct change in its identity policy and attitudes towards its citizens. One result of this shift in mindset has been the launch of the first public service broadcasting TV channel for Kurdish people on 1 January 2009. TRT 6 (Şeş), which broadcasts in the unofficial Kurdish language, is run by the Turkish Radio and Television Corporation (TRT). The thesis attempts to elaborate on the discussions surrounding the launch of TRT 6, Turkey’s first public service broadcasting TV channel for its Kurdish citizens. The research aims to identify the discourses of multiculturalism and public service broadcasting in the mainstream Turkish newspapers Cumhuriyet, Hurriyet, Sabah, Taraf and Zaman. The method used for the research is Critical Discourse Analysis (CDA), and these representative newspapers of the Turkish print media are examined under the question: How has the launch of TRT 6, as the first public service broadcasting channel of Turkey in the Kurdish language, been discussed by Turkish daily newspapers in terms of multiculturalism and minority media? The most significant result of the research is that the newspapers concerned have mostly discussed the launch of TRT 6 in line with their political affiliation. It is thus concluded that the selected newspapers exhibit a high level of political parallelism and low professionalism. However, it should be noted that Taraf distinguishes itself from the others by challenging the hegemonic discourses embedded in the articles of the other newspapers. Moreover, the study detected three types of discourse: Pro-multiculturalism discourse, Unification discourse, and Assimilation discourse. It can be concluded that in Turkey, media owners and even individual journalists have incentives to form ideological alliances with political parties, and the media appears to be an instrument of power struggle.
Today, Turkey seems to be restoring Kurdish identity in its identity policy and aims to proceed with the negotiations for membership of the European Union (EU). The country still strives to transform from a traditional nation-state into a multiethnic democratic state, with multiculturalism as a policy discussed throughout the two terms that the AKP government has been in power. However, this transformation is not an easy process because of the deep-rooted traditions of the nation-state structure, which have also polarized the Turkish press.
Abstract:
Massively parallel SIMD computing is applied to obtain an order of magnitude improvement in the execution speed of an important algorithm in VLSI design automation. The physical design of a VLSI circuit involves logic module placement as a subtask. The paper is concerned with accelerating the well-known Min-cut placement technique for logic cell placement. The inherent parallelism of the Min-cut algorithm is identified, and it is shown that a parallel machine based on the efficient execution of the placement procedure.
Abstract:
Conventional random access scan (RAS) for testing has lower test application time, lower power dissipation, and lower test data volume compared to standard serial scan chain based design. In this paper, we present two cluster based techniques, namely, Serial Input Random Access Scan and Variable Word Length Random Access Scan, to reduce test application time even further by exploiting the parallelism among the clusters and performing write operations on multiple bits. Experimental results on benchmark circuits show, on average, a 2-3 times speed-up in test write time and an average 60% reduction in write test data volume.
Abstract:
We present a distributed algorithm that finds a maximal edge packing in O(Δ + log* W) synchronous communication rounds in a weighted graph, independent of the number of nodes in the network; here Δ is the maximum degree of the graph and W is the maximum weight. As a direct application, we have a distributed 2-approximation algorithm for minimum-weight vertex cover, with the same running time. We also show how to find an f-approximation of minimum-weight set cover in O(f²k² + fk log* W) rounds; here k is the maximum size of a subset in the set cover instance, f is the maximum frequency of an element, and W is the maximum weight of a subset. The algorithms are deterministic, and they can be applied in anonymous networks.
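The link between a maximal edge packing and a 2-approximate vertex cover can be sketched as follows. This is a sequential greedy illustration, not the distributed algorithm of the abstract: it assumes integer node weights, processes edges in the order given, and the function name and the small example graph are hypothetical.

```python
from typing import Dict, Hashable, List, Set, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def maximal_edge_packing(
    weights: Dict[Node, int], edges: List[Edge]
) -> Tuple[Dict[Edge, int], Set[Node]]:
    """Greedy (sequential) maximal edge packing.

    An edge packing assigns y[e] >= 0 to each edge so that, for every node,
    the total of y over its incident edges does not exceed the node weight.
    It is maximal when every edge has at least one saturated endpoint; the
    saturated nodes then form a vertex cover of weight at most twice optimal.
    """
    residual = dict(weights)  # remaining capacity of each node
    y: Dict[Edge, int] = {}
    for u, v in edges:
        # Raise y on this edge until one endpoint becomes saturated.
        increase = min(residual[u], residual[v])
        y[(u, v)] = increase
        residual[u] -= increase
        residual[v] -= increase
    cover = {node for node, left in residual.items() if left == 0}
    return y, cover


# Hypothetical instance: a path a - b - c with node weights 3, 2, 3.
node_weights = {"a": 3, "b": 2, "c": 3}
path_edges = [("a", "b"), ("b", "c")]
packing, cover = maximal_edge_packing(node_weights, path_edges)
```

On this path the first edge saturates node b, so the second edge cannot be raised, and the saturated set is the single node b.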
Abstract:
In a max-min LP, the objective is to maximise ω subject to Ax ≤ 1, Cx ≥ ω1, and x ≥ 0 for nonnegative matrices A and C. We present a local algorithm (constant-time distributed algorithm) for approximating max-min LPs. The approximation ratio of our algorithm is the best possible for any local algorithm; there is a matching unconditional lower bound.
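To make the max-min LP formulation concrete, here is a minimal sketch that only evaluates a candidate solution; it does not implement the local approximation algorithm of the abstract. The function name, the tolerance parameter and the tiny instance are assumptions made for illustration.

```python
from typing import List, Optional, Sequence

Matrix = List[List[float]]


def max_min_utility(
    A: Matrix, C: Matrix, x: Sequence[float], tol: float = 1e-9
) -> Optional[float]:
    """Return the largest omega with Cx >= omega*1 for a feasible x, else None.

    Feasibility means x >= 0 and every row of Ax is at most 1. For a feasible
    x, the best achievable omega is the smallest entry of Cx.
    """
    if any(value < -tol for value in x):
        return None
    for row in A:
        if sum(a * value for a, value in zip(row, x)) > 1 + tol:
            return None
    return min(sum(c * value for c, value in zip(row, x)) for row in C)


# Hypothetical instance: one shared resource, two beneficiaries.
A = [[1.0, 1.0]]              # x1 + x2 <= 1
C = [[1.0, 0.0], [0.0, 1.0]]  # beneficiary k receives x_k

balanced = max_min_utility(A, C, [0.5, 0.5])
skewed = max_min_utility(A, C, [0.75, 0.25])
infeasible = max_min_utility(A, C, [0.75, 0.5])
```

In this instance the balanced split gives the larger minimum, which is the quantity a max-min LP asks to maximise.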