960 resultados para Multi-threaded parallel tasks


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Modular product architectures have generated numerous benefits for companies in terms of cost, lead-time and quality. The defined interfaces and the module’s properties decrease the effort to develop new product variants, and provide an opportunity to perform parallel tasks in design, manufacturing and assembly. The background of this thesis is that companies perform verifications (tests, inspections and controls) of products late, when most of the parts have been assembled. This extends the lead-time to delivery and ruins benefits from a modular product architecture; specifically when the verifications are extensive and the frequency of detected defects is high. Due to the number of product variants obtained from the modular product architecture, verifications must handle a wide range of equipment, instructions and goal values to ensure that high quality products can be delivered. As a result, the total benefits from a modular product architecture are difficult to achieve. This thesis describes a method for planning and performing verifications within a modular product architecture. The method supports companies by utilizing the defined modules for verifications already at module level, so called MPV (Module Property Verification). With MPV, defects are detected at an earlier point, compared to verification of a complete product, and the number of verifications is decreased. The MPV method is built up of three phases. In Phase A, candidate modules are evaluated on the basis of costs and lead-time of the verifications and the repair of defects. An MPV-index is obtained which quantifies the module and indicates if the module should be verified at product level or by MPV. In Phase B, the interface interaction between the modules is evaluated, as well as the distribution of properties among the modules. The purpose is to evaluate the extent to which supplementary verifications at product level is needed. Phase C supports a selection of the final verification strategy. The cost and lead-time for the supplementary verifications are considered together with the results from Phase A and B. The MPV method is based on a set of qualitative and quantitative measures and tools which provide an overview and support the achievement of cost and time efficient company specific verifications. A practical application in industry shows how the MPV method can be used, and the subsequent benefits

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data classification is a task with high applicability in a lot of areas. Most methods for treating classification problems found in the literature dealing with single-label or traditional problems. In recent years has been identified a series of classification tasks in which the samples can be labeled at more than one class simultaneously (multi-label classification). Additionally, these classes can be hierarchically organized (hierarchical classification and hierarchical multi-label classification). On the other hand, we have also studied a new category of learning, called semi-supervised learning, combining labeled data (supervised learning) and non-labeled data (unsupervised learning) during the training phase, thus reducing the need for a large amount of labeled data when only a small set of labeled samples is available. Thus, since both the techniques of multi-label and hierarchical multi-label classification as semi-supervised learning has shown favorable results with its use, this work is proposed and used to apply semi-supervised learning in hierarchical multi-label classication tasks, so eciently take advantage of the main advantages of the two areas. An experimental analysis of the proposed methods found that the use of semi-supervised learning in hierarchical multi-label methods presented satisfactory results, since the two approaches were statistically similar results

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Unzipping carbon nanotubes (CNTs) is considered one of the most promising approaches for the controlled and large-scale production of graphene nanoribbons (GNR). These structures are considered of great importance for the development of nanoelectronics because of its dimensions and intrinsic nonzero band gap value. Despite many years of investigations some details on the dynamics of the CNT fracture/unzipping processes remain unclear. In this work we have investigated some of these process through molecular dynamics simulations using reactive force fields (ReaxFF), as implemented in the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) code. We considered multi-walled CNTs of different dimensions and chiralities and under induced mechanical stretching. Our preliminary results show that the unzipping mechanisms are highly dependent on CNT chirality. Well-defined and distinct fracture patterns were observed for the different chiralities. Armchair CNTs favor the creation of GNRs with well-defined armchair edges, while zigzag and chiral ones produce GNRs with less defined and defective edges. © 2012 Materials Research Society.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

La presente tesi è dedicata al riuso nel software. Eccettuata un'introduzione organica al tema, l'analisi è a livello dei meccanismi offerti dai linguaggi di programmazione e delle tecniche di sviluppo, con speciale attenzione rivolta al tema della concorrenza. Il primo capitolo fornisce un quadro generale nel quale il riuso del software è descritto, assieme alle ragioni che ne determinano l'importanza e ai punti cruciali relativi alla sua attuazione. Si individuano diversi livelli di riuso sulla base dell'astrazione e degli artefatti in gioco, e si sottolinea come i linguaggi contribuiscano alla riusabilità e alla realizzazione del riuso. In seguito, viene esplorato, con esempi di codice, il supporto al riuso da parte del paradigma ad oggetti, in termini di incapsulamento, ereditarietà, polimorfismo, composizione. La trattazione prosegue analizzando differenti feature – tipizzazione, interfacce, mixin, generics – offerte da vari linguaggi di programmazione, mostrando come esse intervengano sulla riusabilità dei componenti software. A chiudere il capitolo, qualche parola contestualizzata sull'inversione di controllo, la programmazione orientata agli aspetti, e il meccanismo della delega. Il secondo capitolo abbraccia il tema della concorrenza. Dopo aver introdotto l'argomento, vengono approfonditi alcuni significativi modelli di concorrenza: programmazione multi-threaded, task nel linguaggio Ada, SCOOP, modello ad Attori. Essi vengono descritti negli elementi fondamentali e ne vengono evidenziati gli aspetti cruciali in termini di contributo al riuso, con esempi di codice. Relativamente al modello ad Attori, viene presentata la sua implementazione in Scala/Akka come caso studio. Infine, viene esaminato il problema dell'inheritance anomaly, sulla base di esempi e delle tre classi principali di anomalia, e si analizza la suscettibilità del supporto di concorrenza di Scala/Akka a riscontrare tali problemi. Inoltre, in questo capitolo si nota come alcuni aspetti relativi al binomio riuso/concorrenza, tra cui il significato profondo dello stesso, non siano ancora stati adeguatamente affrontati dalla comunità informatica. Il terzo e ultimo capitolo esordisce con una panoramica dell'agent-oriented programming, prendendo il linguaggio simpAL come riferimento. In seguito, si prova ad estendere al caso degli agenti la nozione di riuso approfondita nei capitoli precedenti.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Highly available software systems occasionally need to be updated while avoiding downtime. Dynamic software updates reduce down-time, but still require the system to reach a quiescent state in which a global update can be performed. This can be difficult for multi-threaded systems. We present a novel approach to dynamic updates using first-class contexts, called Theseus. First-class contexts make global updates unnecessary: existing threads run to termination in an old context, while new threads start in a new, updated context; consistency between contexts is ensured with the help of bidirectional transformations. We show that for multi-threaded systems with coherent memory, first-class contexts offer a practical and flexible approach to dynamic updates, with acceptable overhead.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Factorial designs for clinical trials are often encountered in medical, dental, and orthodontic research. Factorial designs assess two or more interventions simultaneously and the main advantage of this design is its efficiency in terms of sample size as more than one intervention may be assessed on the same participants. However, the factorial design is efficient only under the assumption of no interaction (no effect modification) between the treatments under investigation and, therefore, this should be considered at the design stage. Conversely, the factorial study design may also be used for the purpose of detecting an interaction between two interventions if the study is powered accordingly. However, a factorial design powered to detect an interaction has no advantage in terms of the required sample size compared to a multi-arm parallel trial for assessing more than one intervention. It is the purpose of this article to highlight the methodological issues that should be considered when planning, analysing, and reporting the simplest form of this design, which is the 2 × 2 factorial design. An example from the field of orthodontics using two parameters (bracket type and wire type) on maxillary incisor torque loss will be utilized in order to explain the design requirements, the advantages and disadvantages of this design, and its application in orthodontic research.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Several types of parallelism can be exploited in logic programs while preserving correctness and efficiency, i.e. ensuring that the parallel execution obtains the same results as the sequential one and the amount of work performed is not greater. However, such results do not take into account a number of overheads which appear in practice, such as process creation and scheduling, which can induce a slow-down, or, at least, limit speedup, if they are not controlled in some way. This paper describes a methodology whereby the granularity of parallel tasks, i.e. the work available under them, is efficiently estimated and used to limit parallelism so that the effect of such overheads is controlled. The run-time overhead associated with the approach is usually quite small, since as much work is done at compile time as possible. Also,a number of run-time optimizations are proposed. Moreover, a static analysis of the overhead associated with the granularity control process is performed in order to decide its convenience. The performance improvements resulting from the incorporation of grain size control are shown to be quite good, specially for systems with medium to large parallel execution overheads.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Several types of parallelism can be exploited in logic programs while preserving correctness and efficiency, i.e. ensuring that the parallel execution obtains the same results as the sequential one and the amount of work performed is not greater. However, such results do not take into account a number of overheads which appear in practice, such as process creation and scheduling, which can induce a slow-down, or, at least, limit speedup, if they are not controlled in some way. This paper describes a methodology whereby the granularity of parallel tasks, i.e. the work available under them, is efficiently estimated and used to limit parallelism so that the effect of such overheads is controlled. The run-time overhead associated with the approach is usually quite small, since as much work is done at compile time as possible. Also, a number of run-time optimizations are proposed. Moreover, a static analysis of the overhead associated with the granularity control process is performed in order to decide its convenience. The performance improvements resulting from the incorporation of grain size control are shown to be quite good, specially for systems with médium to large parallel execution overheads.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Retrieving large amounts of information over wide area networks, including the Internet, is problematic due to issues arising from latency of response, lack of direct memory access to data serving resources, and fault tolerance. This paper describes a design pattern for solving the issues of handling results from queries that return large amounts of data. Typically these queries would be made by a client process across a wide area network (or Internet), with one or more middle-tiers, to a relational database residing on a remote server. The solution involves implementing a combination of data retrieval strategies, including the use of iterators for traversing data sets and providing an appropriate level of abstraction to the client, double-buffering of data subsets, multi-threaded data retrieval, and query slicing. This design has recently been implemented and incorporated into the framework of a commercial software product developed at Oracle Corporation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Since the development of large scale power grid interconnections and power markets, research on available transfer capability (ATC) has attracted great attention. The challenges for accurate assessment of ATC originate from the numerous uncertainties in electricity generation, transmission, distribution and utilization sectors. Power system uncertainties can be mainly described as two types: randomness and fuzziness. However, the traditional transmission reliability margin (TRM) approach only considers randomness. Based on credibility theory, this paper firstly built models of generators, transmission lines and loads according to their features of both randomness and fuzziness. Then a random fuzzy simulation is applied, along with a novel method proposed for ATC assessment, in which both randomness and fuzziness are considered. The bootstrap method and multi-core parallel computing technique are introduced to enhance the processing speed. By implementing simulation for the IEEE-30-bus system and a real-life system located in Northwest China, the viability of the models and the proposed method is verified.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Processors with large numbers of cores are becoming commonplace. In order to utilise the available resources in such systems, the programming paradigm has to move towards increased parallelism. However, increased parallelism does not necessarily lead to better performance. Parallel programming models have to provide not only flexible ways of defining parallel tasks, but also efficient methods to manage the created tasks. Moreover, in a general-purpose system, applications residing in the system compete for the shared resources. Thread and task scheduling in such a multiprogrammed multithreaded environment is a significant challenge. In this thesis, we introduce a new task-based parallel reduction model, called the Glasgow Parallel Reduction Machine (GPRM). Our main objective is to provide high performance while maintaining ease of programming. GPRM supports native parallelism; it provides a modular way of expressing parallel tasks and the communication patterns between them. Compiling a GPRM program results in an Intermediate Representation (IR) containing useful information about tasks, their dependencies, as well as the initial mapping information. This compile-time information helps reduce the overhead of runtime task scheduling and is key to high performance. Generally speaking, the granularity and the number of tasks are major factors in achieving high performance. These factors are even more important in the case of GPRM, as it is highly dependent on tasks, rather than threads. We use three basic benchmarks to provide a detailed comparison of GPRM with Intel OpenMP, Cilk Plus, and Threading Building Blocks (TBB) on the Intel Xeon Phi, and with GNU OpenMP on the Tilera TILEPro64. GPRM shows superior performance in almost all cases, only by controlling the number of tasks. GPRM also provides a low-overhead mechanism, called “Global Sharing”, which improves performance in multiprogramming situations. We use OpenMP, as the most popular model for shared-memory parallel programming as the main GPRM competitor for solving three well-known problems on both platforms: LU factorisation of Sparse Matrices, Image Convolution, and Linked List Processing. We focus on proposing solutions that best fit into the GPRM’s model of execution. GPRM outperforms OpenMP in all cases on the TILEPro64. On the Xeon Phi, our solution for the LU Factorisation results in notable performance improvement for sparse matrices with large numbers of small blocks. We investigate the overhead of GPRM’s task creation and distribution for very short computations using the Image Convolution benchmark. We show that this overhead can be mitigated by combining smaller tasks into larger ones. As a result, GPRM can outperform OpenMP for convolving large 2D matrices on the Xeon Phi. Finally, we demonstrate that our parallel worksharing construct provides an efficient solution for Linked List processing and performs better than OpenMP implementations on the Xeon Phi. The results are very promising, as they verify that our parallel programming framework for manycore processors is flexible and scalable, and can provide high performance without sacrificing productivity.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Hardware vendors make an important effort creating low-power CPUs that keep battery duration and durability above acceptable levels. In order to achieve this goal and provide good performance-energy for a wide variety of applications, ARM designed the big.LITTLE architecture. This heterogeneous multi-core architecture features two different types of cores: big cores oriented to performance and little cores, slower and aimed to save energy consumption. As all the cores have access to the same memory, multi-threaded applications must resort to some mutual exclusion mechanism to coordinate the access to shared data by the concurrent threads. Transactional Memory (TM) represents an optimistic approach for shared-memory synchronization. To take full advantage of the features offered by software TM, but also benefit from the characteristics of the heterogeneous big.LITTLE architectures, our focus is to propose TM solutions that take into account the power/performance requirements of the application and what it is offered by the architecture. In order to understand the current state-of-the-art and obtain useful information for future power-aware software TM solutions, we have performed an analysis of a popular TM library running on top of an ARM big.LITTLE processor. Experiments show, in general, better scalability for the LITTLE cores for most of the applications except for one, which requires the computing performance that the big cores offer.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Concurrent software executes multiple threads or processes to achieve high performance. However, concurrency results in a huge number of different system behaviors that are difficult to test and verify. The aim of this dissertation is to develop new methods and tools for modeling and analyzing concurrent software systems at design and code levels. This dissertation consists of several related results. First, a formal model of Mondex, an electronic purse system, is built using Petri nets from user requirements, which is formally verified using model checking. Second, Petri nets models are automatically mined from the event traces generated from scientific workflows. Third, partial order models are automatically extracted from some instrumented concurrent program execution, and potential atomicity violation bugs are automatically verified based on the partial order models using model checking. Our formal specification and verification of Mondex have contributed to the world wide effort in developing a verified software repository. Our method to mine Petri net models automatically from provenance offers a new approach to build scientific workflows. Our dynamic prediction tool, named McPatom, can predict several known bugs in real world systems including one that evades several other existing tools. McPatom is efficient and scalable as it takes advantage of the nature of atomicity violations and considers only a pair of threads and accesses to a single shared variable at one time. However, predictive tools need to consider the tradeoffs between precision and coverage. Based on McPatom, this dissertation presents two methods for improving the coverage and precision of atomicity violation predictions: 1) a post-prediction analysis method to increase coverage while ensuring precision; 2) a follow-up replaying method to further increase coverage. Both methods are implemented in a completely automatic tool.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

In this work, we present a multi-camera surveillance system based on the use of self-organizing neural networks to represent events on video. The system processes several tasks in parallel using GPUs (graphic processor units). It addresses multiple vision tasks at various levels, such as segmentation, representation or characterization, analysis and monitoring of the movement. These features allow the construction of a robust representation of the environment and interpret the behavior of mobile agents in the scene. It is also necessary to integrate the vision module into a global system that operates in a complex environment by receiving images from multiple acquisition devices at video frequency. Offering relevant information to higher level systems, monitoring and making decisions in real time, it must accomplish a set of requirements, such as: time constraints, high availability, robustness, high processing speed and re-configurability. We have built a system able to represent and analyze the motion in video acquired by a multi-camera network and to process multi-source data in parallel on a multi-GPU architecture.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Modern multicore processors for the embedded market are often heterogeneous in nature. One feature often available are multiple sleep states with varying transition cost for entering and leaving said sleep states. This research effort explores the energy efficient task-mapping on such a heterogeneous multicore platform to reduce overall energy consumption of the system. This is performed in the context of a partitioned scheduling approach and a very realistic power model, which improves over some of the simplifying assumptions often made in the state-of-the-art. The developed heuristic consists of two phases, in the first phase, tasks are allocated to minimise their active energy consumption, while the second phase trades off a higher active energy consumption for an increased ability to exploit savings through more efficient sleep states. Extensive simulations demonstrate the effectiveness of the approach.