936 resultados para Thread safe parallel run-time
Resumo:
As plataformas com múltiplos núcleos tornaram a programação paralela/concorrente num tópico de interesse geral. Diversos modelos de programação têm vindo a ser propostos, facilitando aos programadores a identificação de regiões de código potencialmente paralelizáveis, deixando ao sistema operativo a tarefa de as escalonar dinamicamente em tempo de execução, explorando o maior grau possível de paralelismo. O Java não foge a esta tendência, disponibilizando ao programador um número crescente de bibliotecas de mecanismos de sincronização e paralelização de código. Neste contexto, esta tese apresenta e discute um conjunto de resultados obtidos através de testes intensivos à eficiência de algoritmos de ordenação implementados com recurso aos mecanismos de concorrência da API do Java 8 (Threads, Threadpools, ExecutorService, CountdownLach, ExecutorCompletionService e ForkJoinPools) em sistemas com um número de núcleos variável. Para cada um dos mecanismos, são apresentadas conclusões sobre o seu funcionamento e discutidos os cenários em que o seu uso pode ser rentabilizado de modo a serem obtidos melhores tempos de execução.
Resumo:
This study examined the validity and reliability of a sequential "Run-Bike-Run" test (RBR) in age-group triathletes. Eight Olympic distance (OD) specialists (age 30.0 ± 2.0 years, mass 75.6 ± 1.6 kg, run VO2max 63.8 ± 1.9 ml· kg(-1)· min(-1), cycle VO2peak 56.7 ± 5.1 ml· kg(-1)· min(-1)) performed four trials over 10 days. Trial 1 (TRVO2max) was an incremental treadmill running test. Trials 2 and 3 (RBR1 and RBR2) involved: 1) a 7-min run at 15 km· h(-1) (R1) plus a 1-min transition to 2) cycling to fatigue (2 W· kg(-1) body mass then 30 W each 3 min); 3) 10-min cycling at 3 W· kg(-1) (Bsubmax); another 1-min transition and 4) a second 7-min run at 15 km· h(-1) (R2). Trial 4 (TT) was a 30-min cycle - 20-min run time trial. No significant differences in absolute oxygen uptake (VO2), heart rate (HR), or blood lactate concentration ([BLA]) were evidenced between RBR1 and RBR2. For all measured physiological variables, the limits of agreement were similar, and the mean differences were physiologically unimportant, between trials. Low levels of test-retest error (i.e. ICC <0.8, CV<10%) were observed for most (logged) measurements. However [BLA] post R1 (ICC 0.87, CV 25.1%), [BLA] post Bsubmax (ICC 0.99, CV 16.31) and [BLA] post R2 (ICC 0.51, CV 22.9%) were least reliable. These error ranges may help coaches detect real changes in training status over time. Moreover, RBR test variables can be used to predict discipline specific and overall TT performance. Cycle VO2peak, cycle peak power output, and the change between R1 and R2 (deltaR1R2) in [BLA] were most highly related to overall TT distance (r = 0.89, p < 0. 01; r = 0.94, p < 0.02; r = 0.86, p < 0.05, respectively). The percentage of TR VO2max at 15 km· h(-1), and deltaR1R2 HR, were also related to run TT distance (r = -0.83 and 0.86, both p < 0.05).
Resumo:
With the shift towards many-core computer architectures, dataflow programming has been proposed as one potential solution for producing software that scales to a varying number of processor cores. Programming for parallel architectures is considered difficult as the current popular programming languages are inherently sequential and introducing parallelism is typically up to the programmer. Dataflow, however, is inherently parallel, describing an application as a directed graph, where nodes represent calculations and edges represent a data dependency in form of a queue. These queues are the only allowed communication between the nodes, making the dependencies between the nodes explicit and thereby also the parallelism. Once a node have the su cient inputs available, the node can, independently of any other node, perform calculations, consume inputs, and produce outputs. Data ow models have existed for several decades and have become popular for describing signal processing applications as the graph representation is a very natural representation within this eld. Digital lters are typically described with boxes and arrows also in textbooks. Data ow is also becoming more interesting in other domains, and in principle, any application working on an information stream ts the dataflow paradigm. Such applications are, among others, network protocols, cryptography, and multimedia applications. As an example, the MPEG group standardized a dataflow language called RVC-CAL to be use within reconfigurable video coding. Describing a video coder as a data ow network instead of with conventional programming languages, makes the coder more readable as it describes how the video dataflows through the different coding tools. While dataflow provides an intuitive representation for many applications, it also introduces some new problems that need to be solved in order for data ow to be more widely used. The explicit parallelism of a dataflow program is descriptive and enables an improved utilization of available processing units, however, the independent nodes also implies that some kind of scheduling is required. The need for efficient scheduling becomes even more evident when the number of nodes is larger than the number of processing units and several nodes are running concurrently on one processor core. There exist several data ow models of computation, with different trade-offs between expressiveness and analyzability. These vary from rather restricted but statically schedulable, with minimal scheduling overhead, to dynamic where each ring requires a ring rule to evaluated. The model used in this work, namely RVC-CAL, is a very expressive language, and in the general case it requires dynamic scheduling, however, the strong encapsulation of dataflow nodes enables analysis and the scheduling overhead can be reduced by using quasi-static, or piecewise static, scheduling techniques. The scheduling problem is concerned with nding the few scheduling decisions that must be run-time, while most decisions are pre-calculated. The result is then an, as small as possible, set of static schedules that are dynamically scheduled. To identify these dynamic decisions and to find the concrete schedules, this thesis shows how quasi-static scheduling can be represented as a model checking problem. This involves identifying the relevant information to generate a minimal but complete model to be used for model checking. The model must describe everything that may affect scheduling of the application while omitting everything else in order to avoid state space explosion. This kind of simplification is necessary to make the state space analysis feasible. For the model checker to nd the actual schedules, a set of scheduling strategies are de ned which are able to produce quasi-static schedulers for a wide range of applications. The results of this work show that actor composition with quasi-static scheduling can be used to transform data ow programs to t many different computer architecture with different type and number of cores. This in turn, enables dataflow to provide a more platform independent representation as one application can be fitted to a specific processor architecture without changing the actual program representation. Instead, the program representation is in the context of design space exploration optimized by the development tools to fit the target platform. This work focuses on representing the dataflow scheduling problem as a model checking problem and is implemented as part of a compiler infrastructure. The thesis also presents experimental results as evidence of the usefulness of the approach.
Resumo:
This thesis estimates long-run time variant conditional correlation between stock and bond returns of CIVETS (Colombia, Indonesia, Vietnam, Egypt, Turkey, and South Africa) nations. Further, aims to analyse the presence of asymmetric volatility effect in both asset returns, as well as, obverses increment or decrement in conditional correlation during pre-crisis and crisis period, which lead to make a reliable diversification decision. The Constant Conditional Correlation (CCC) GARCH model of Bollerslev (1990), the Dynamic Conditional Correlation (DCC) GARCH model (Engle 2002), and the Asymmetric Dynamic Conditional Correlation (ADCC) GARCH model of Cappiello, Engle, and Sheppard (2006) were implemented in the study. The analyses present strong evidence of time-varying conditional correlation in CIVETS markets, excluding Vietnam, during 2005-2013. In addition, negative innovation effects were found in both conditional variance and correlation of the asset returns. The results of this study recommend investors to include financial assets from these markets in portfolios, in order to obtain better stock-bond diversification benefits, especially during high volatility periods.
Resumo:
Statistical Machine Translation (SMT) is one of the potential applications in the field of Natural Language Processing. The translation process in SMT is carried out by acquiring translation rules automatically from the parallel corpora. However, for many language pairs (e.g. Malayalam- English), they are available only in very limited quantities. Therefore, for these language pairs a huge portion of phrases encountered at run-time will be unknown. This paper focuses on methods for handling such out-of-vocabulary (OOV) words in Malayalam that cannot be translated to English using conventional phrase-based statistical machine translation systems. The OOV words in the source sentence are pre-processed to obtain the root word and its suffix. Different inflected forms of the OOV root are generated and a match is looked up for the word variants in the phrase translation table of the translation model. A Vocabulary filter is used to choose the best among the translations of these word variants by finding the unigram count. A match for the OOV suffix is also looked up in the phrase entries and the target translations are filtered out. Structuring of the filtered phrases is done and SMT translation model is extended by adding OOV with its new phrase translations. By the results of the manual evaluation done it is observed that amount of OOV words in the input has been reduced considerably
Resumo:
The increasing demand for cheaper-faster-better services anytime and anywhere has made radio network optimisation much more complex than ever before. In order to dynamically optimise the serving network, Dynamic Network Optimisation (DNO), is proposed as the ultimate solution and future trend. The realization of DNO, however, has been hindered by a significant bottleneck of the optimisation speed as the network complexity grows. This paper presents a multi-threaded parallel solution to accelerate complicated proprietary network optimisation algorithms, under a rigid condition of numerical consistency. ariesoACP product from Arieso Ltd serves as the platform for parallelisation. This parallel solution has been benchmarked and results exhibit a high scalability and a run-time reduction by 11% to 42% based on the technology, subscriber density and blocking rate of a given network in comparison with the original version. Further, it is highly essential that the parallel version produces equivalent optimisation quality in terms of identical optimisation outputs.
Resumo:
Hybrid multiprocessor architectures which combine re-configurable computing and multiprocessors on a chip are being proposed to transcend the performance of standard multi-core parallel systems. Both fine-grained and coarse-grained parallel algorithm implementations are feasible in such hybrid frameworks. A compositional strategy for designing fine-grained multi-phase regular processor arrays to target hybrid architectures is presented in this paper. The method is based on deriving component designs using classical regular array techniques and composing the components into a unified global design. Effective designs with phase-changes and data routing at run-time are characteristics of these designs. In order to describe the data transfer between phases, the concept of communication domain is introduced so that the producer–consumer relationship arising from multi-phase computation can be treated in a unified way as a data routing phase. This technique is applied to derive new designs of multi-phase regular arrays with different dataflow between phases of computation.
Resumo:
The aim of this study was to develop a fast capillary electrophoresis method for the determination of propranolol in pharmaceutical preparations. In the method development the pH and constituents of the background electrolyte were selected using the effective mobility versus pH curves. Benzylamine was used as the internal standard. The background electrolyte was composed of 60 mmol L(-1) tris(hydroxymethyl)aminomethane and 30 mmol L(-1) 2-hydroxyisobutyric acid,at pH 8.1. Separation was conducted in a fused-silica capillary (32 cm total length and 8.5 cm effective length, 50 mu m I.D.) with a short-end injection configuration and direct UV detection at 214 nm. The run time was only 14 s. Three different strategies were studied in order to develop a fast CE method with low total analysis time for propranolol analysis: low flush time (Lflush) 35 runs/h, without flush (Wflush) 52 runs/h, and Invert (switched polarity) 45 runs/h. Since the three strategies developed are statistically equivalent, Mush was selected due to the higher analytical frequency in comparison with the other methods. A few figures of merit of the proposed method include: good linearity (R(2) > 0.9999); limit of detection of 0.5 mg L(-1): inter-day precision better than 1.03% (n = 9) and recovery in the range of 95.1-104.5%. (C) 2009 Elsevier B.V. All rights reserved.
Resumo:
Conventional debugging tools present developers with means to explore the run-time context in which an error has occurred. In many cases this is enough to help the developer discover the faulty source code and correct it. However, rather often errors occur due to code that has executed in the past, leaving certain objects in an inconsistent state. The actual run-time error only occurs when these inconsistent objects are used later in the program. So-called back-in-time debuggers help developers step back through earlier states of the program and explore execution contexts not available to conventional debuggers. Nevertheless, even back-in-time debuggers do not help answer the question, ``Where did this object come from?'' The Object-Flow Virtual Machine, which we have proposed in previous work, tracks the flow of objects to answer precisely such questions, but this VM does not provide dedicated debugging support to explore faulty programs. In this paper we present a novel debugger, called Compass, to navigate between conventional run-time stack-oriented control flow views and object flows. Compass enables a developer to effectively navigate from an object contributing to an error back-in-time through all the code that has touched the object. We present the design and implementation of Compass, and we demonstrate how flow-centric, back-in-time debugging can be used to effectively locate the source of hard-to-find bugs.
Resumo:
Several types of parallelism can be exploited in logic programs while preserving correctness and efficiency, i.e. ensuring that the parallel execution obtains the same results as the sequential one and the amount of work performed is not greater. However, such results do not take into account a number of overheads which appear in practice, such as process creation and scheduling, which can induce a slow-down, or, at least, limit speedup, if they are not controlled in some way. This paper describes a methodology whereby the granularity of parallel tasks, i.e. the work available under them, is efficiently estimated and used to limit parallelism so that the effect of such overheads is controlled. The run-time overhead associated with the approach is usually quite small, since as much work is done at compile time as possible. Also,a number of run-time optimizations are proposed. Moreover, a static analysis of the overhead associated with the granularity control process is performed in order to decide its convenience. The performance improvements resulting from the incorporation of grain size control are shown to be quite good, specially for systems with medium to large parallel execution overheads.
Resumo:
Studying independence of goals has proven very useful in the context of logic programming. In particular, it has provided a formal basis for powerful automatic parallelization tools, since independence ensures that two goals may be evaluated in parallel while preserving correctness and eciency. We extend the concept of independence to constraint logic programs (CLP) and prove that it also ensures the correctness and eciency of the parallel evaluation of independent goals. Independence for CLP languages is more complex than for logic programming as search space preservation is necessary but no longer sucient for ensuring correctness and eciency. Two additional issues arise. The rst is that the cost of constraint solving may depend upon the order constraints are encountered. The second is the need to handle dynamic scheduling. We clarify these issues by proposing various types of search independence and constraint solver independence, and show how they can be combined to allow dierent optimizations, from parallelism to intelligent backtracking. Sucient conditions for independence which can be evaluated \a priori" at run-time are also proposed. Our study also yields new insights into independence in logic programming languages. In particular, we show that search space preservation is not only a sucient but also a necessary condition for ensuring correctness and eciency of parallel execution.
Resumo:
This paper addresses the issue of the practicality of global flow analysis in logic program compilation, in terms of speed of the analysis, precisión, and usefulness of the information obtained. To this end, design and implementation aspects are discussed for two practical abstract interpretation-based flow analysis systems: MA , the MCC And-parallel Analyzer and Annotator; and Ms, an experimental mode inference system developed for SB-Prolog. The paper also provides performance data obtained (rom these implementations and, as an example of an application, a study of the usefulness of the mode information obtained in reducing run-time checks in independent and-parallelism.Based on the results obtained, it is concluded that the overhead of global flow analysis is not prohibitive, while the results of analysis can be quite precise and useful.
Resumo:
We propose a general framework for assertion-based debugging of constraint logic programs. Assertions are linguistic constructions for expressing properties of programs. We define several assertion schemas for writing (partial) specifications for constraint logic programs using quite general properties, including user-defined programs. The framework is aimed at detecting deviations of the program behavior (symptoms) with respect to the given assertions, either at compile-time (i.e., statically) or run-time (i.e., dynamically). We provide techniques for using information from global analysis both to detect at compile-time assertions which do not hold in at least one of the possible executions (i.e., static symptoms) and assertions which hold for all possible executions (i.e., statically proved assertions). We also provide program transformations which introduce tests in the program for checking at run-time those assertions whose status cannot be determined at compile-time. Both the static and the dynamic checking are provably safe in the sense that all errors flagged are definite violations of the pecifications. Finally, we report briefly on the currently implemented instances of the generic framework.
Resumo:
We propose a general framework for assertion-based debugging of constraint logic programs. Assertions are linguistic constructions which allow expressing properties of programs. We define assertion schemas which allow writing (partial) specifications for constraint logic programs using quite general properties, including user-defined programs. The framework is aimed at detecting deviations of the program behavior (symptoms) with respect to the given assertions, either at compile-time or run-time. We provide techniques for using information from global analysis both to detect at compile-time assertions which do not hold in at least one of the possible executions (i.e., static symptoms) and assertions which hold for all possible executions (i.e., statically proved assertions). We also provide program transformations which introduce tests in the program for checking at run-time those assertions whose status cannot be determined at compile-time. Both the static and the dynamic checking are provably safe in the sense that all errors flagged are definite violations of the specifications. Finally, we report on an implemented instance of the assertion language and framework.
Resumo:
Several types of parallelism can be exploited in logic programs while preserving correctness and efficiency, i.e. ensuring that the parallel execution obtains the same results as the sequential one and the amount of work performed is not greater. However, such results do not take into account a number of overheads which appear in practice, such as process creation and scheduling, which can induce a slow-down, or, at least, limit speedup, if they are not controlled in some way. This paper describes a methodology whereby the granularity of parallel tasks, i.e. the work available under them, is efficiently estimated and used to limit parallelism so that the effect of such overheads is controlled. The run-time overhead associated with the approach is usually quite small, since as much work is done at compile time as possible. Also, a number of run-time optimizations are proposed. Moreover, a static analysis of the overhead associated with the granularity control process is performed in order to decide its convenience. The performance improvements resulting from the incorporation of grain size control are shown to be quite good, specially for systems with médium to large parallel execution overheads.