951 results for Worst-case execution-time
Abstract:
This paper analyzes the performance of a parallel implementation of Coupled Simulated Annealing (CSA) for the unconstrained optimization of continuous-variable problems. Parallel processing is an efficient form of information processing that emphasizes the exploitation of simultaneous events in the execution of software. It arises primarily from the demand for high computational performance and from the difficulty of increasing the speed of a single processing core. Although multicore processors are now commonplace, many algorithms are still not suited to parallel architectures. The CSA algorithm consists of a group of Simulated Annealing (SA) optimizers working together to refine the solution, each SA optimizer running on its own thread executed by a different processor. In the analysis of parallel performance and scalability, the following metrics were investigated: execution time; speedup with respect to the number of processors; and efficiency in the use of processing elements with respect to the size of the problem. The quality of the final solution was also verified. For this study, the paper proposes a parallel version of CSA and an equivalent serial version. Both algorithms were analyzed on 14 benchmark functions, and for each function CSA was evaluated using 2 to 24 optimizers. The results are presented and discussed in light of these metrics. The paper concludes that CSA is a good parallel algorithm, both in the quality of its solutions and in its parallel scalability and efficiency.
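As a rough illustration of the structure described in this abstract (a simplified sketch, not the authors' CSA implementation or its coupled acceptance rule), several SA optimizers can be launched on separate workers, each refining its own candidate solution on a benchmark function; the benchmark, cooling schedule, and neighborhood below are hypothetical placeholders.

    # Minimal sketch: m independent SA optimizers run in parallel, one per worker,
    # minimizing a benchmark function (not the paper's CSA coupling scheme).
    import math, random
    from concurrent.futures import ProcessPoolExecutor

    def sphere(x):                                             # hypothetical benchmark function
        return sum(xi * xi for xi in x)

    def sa_optimizer(seed, dim=10, iters=20000, t0=10.0, alpha=0.999):
        rng = random.Random(seed)
        x = [rng.uniform(-5, 5) for _ in range(dim)]
        fx, t = sphere(x), t0
        for _ in range(iters):
            y = [xi + rng.gauss(0, 0.1) for xi in x]           # neighbor solution
            fy = sphere(y)
            if fy < fx or rng.random() < math.exp((fx - fy) / t):
                x, fx = y, fy                                  # Metropolis acceptance
            t *= alpha                                         # geometric cooling
        return fx

    if __name__ == "__main__":
        m = 8                                                  # number of optimizers (the paper uses 2-24)
        with ProcessPoolExecutor(max_workers=m) as pool:
            results = list(pool.map(sa_optimizer, range(m)))
        print("best cost over", m, "optimizers:", min(results))

Speedup and efficiency would then be obtained by timing such a run against the equivalent serial loop over the m optimizers.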
Power performance evaluation of an electric home fan with triac-based automatic speed control system
Abstract:
In order to provide a low-cost thermal comfort system, a common model of home fan, 40 cm in diameter, had its manual four-button control system replaced by an automatic speed control. The new control system has a temperature sensor feeding a microcontroller that, through an optically coupled, DIAC- or TRIAC-based circuit, varies the RMS value of the fan motor input voltage, and hence its speed, according to the room temperature. Over a wide range of speeds, the fan net power and the fan motor input power were measured under both control systems. The motor stator temperature and the voltage waveforms were also recorded. Analysis of the measured values showed that the TRIAC-based control system makes the fan motor operate at very low power factor and efficiency. The worst case occurs in the low-speed range, where the highest stator temperatures were registered. The poor power factor and efficiency are correlated with the harmonics introduced into the motor input voltage waveform by the TRIAC commutation.
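For reference, the textbook relation between the TRIAC firing angle and the RMS voltage applied to the motor (a standard phase-control result, not taken from the paper) can be sketched as below; the temperature-to-angle mapping is a hypothetical placeholder.

    # Standard phase-angle control relation (textbook form, not from the paper):
    # V_rms(alpha) = V_nominal * sqrt(1 - alpha/pi + sin(2*alpha)/(2*pi)), alpha in [0, pi].
    import math

    def rms_output(v_nominal, alpha):
        return v_nominal * math.sqrt(1 - alpha / math.pi + math.sin(2 * alpha) / (2 * math.pi))

    def firing_angle(temp_c, t_min=22.0, t_max=32.0):
        # Hypothetical linear mapping: cooler room -> larger firing angle -> lower speed.
        frac = min(max((temp_c - t_min) / (t_max - t_min), 0.0), 1.0)
        return (1.0 - frac) * math.pi

    for temp in (22, 26, 30, 32):
        a = firing_angle(temp)
        print(f"{temp} C: alpha = {math.degrees(a):5.1f} deg, Vrms = {rms_output(220.0, a):6.1f} V")

The chopped waveform implied by this kind of control is also what introduces the harmonic content discussed above.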
Abstract:
The increasing capacity to integrate transistors has made it possible to develop complete systems, with several components, on a single chip, called SoCs (Systems-on-Chip). However, the interconnection subsystem can limit the scalability of SoCs, as with buses, or can be an ad hoc solution, as with bus hierarchies. The ideal interconnection subsystem for SoCs is therefore the Network-on-Chip (NoC). NoCs allow simultaneous point-to-point channels between components and can be reused in other designs. However, NoCs can increase design complexity, chip area, and power dissipation, so it is necessary either to change the way they are used or to change the development paradigm. Thus, a NoC-based system is proposed in which applications are described as packets and executed in each router between source and destination, without traditional processors. To execute applications regardless of the number of instructions and of the NoC dimensions, the spiral complement algorithm was developed, which finds a new destination until all instructions have been executed. The objective is to study the feasibility of developing this system, called IPNoSys. A cycle-accurate simulation tool was developed in SystemC to simulate the system executing applications written in a packet description language, also developed for this study. Several results obtained with the simulation tool were used to evaluate system performance. The methodology for describing an application transforms the high-level application into a data-flow graph, which becomes one or more packets. This methodology was applied to three applications: a counter, a 2-D DCT, and a floating-point addition. The counter was used to evaluate a deadlock solution and to execute a parallel application. The DCT was used for comparison with the STORM platform. Finally, the floating-point addition evaluated the efficiency of a software routine that executes an instruction not implemented in hardware. The simulation results confirm the feasibility of the IPNoSys system. They show that applications described as packets can be executed, sequentially or in parallel, without interruptions caused by deadlock, and that IPNoSys achieves shorter execution times than the STORM platform.
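To make the packet-driven execution model more concrete, the toy sketch below (a hypothetical illustration, not the IPNoSys design or its spiral complement routing) shows a packet carrying instructions and operands, with each router executing one instruction per hop before forwarding the packet.

    # Toy sketch of execution-in-the-network (not the IPNoSys implementation):
    # the packet carries instructions and operands; each router executes the next
    # instruction and forwards the packet toward its destination.
    OPS = {"ADD": lambda a, b: a + b, "MUL": lambda a, b: a * b}

    class Packet:
        def __init__(self, instructions, operands):
            self.instructions = list(instructions)   # e.g. ["ADD", "MUL"]
            self.operands = list(operands)           # operand queue

    def router_step(packet):
        if packet.instructions:
            op = packet.instructions.pop(0)
            a = packet.operands.pop(0)
            b = packet.operands.pop(0)
            packet.operands.append(OPS[op](a, b))    # result feeds later instructions

    def route(packet, hops):
        for _ in range(hops):                        # one instruction per router hop
            router_step(packet)
        return packet.operands

    pkt = Packet(["ADD", "MUL"], [2, 3, 4])          # computes (2 + 3) * 4
    print(route(pkt, hops=2))                        # -> [20]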
Abstract:
Reconfigurable computing is an intermediate solution for solving complex problems, making it possible to combine the speed of hardware with the flexibility of software. A reconfigurable architecture has several goals, among them increased performance. The use of reconfigurable architectures to increase system performance is a well-known technique, especially because algorithms that run slowly on current processors can be implemented directly in hardware. Among the various segments that use reconfigurable architectures, reconfigurable processors deserve special mention. These processors combine the functions of a microprocessor with reconfigurable logic and can be adapted after the development process. Reconfigurable Instruction Set Processors (RISP) are a subgroup of reconfigurable processors whose goal is the reconfiguration of the processor's instruction set, involving issues such as instruction formats, operands, and operations. The main objective of this work is the development of a RISP processor that combines the configuration of the processor's instruction set during development with its reconfiguration at execution time. The design and VHDL implementation of this RISP processor are intended to demonstrate the applicability and efficiency of two concepts: using more than one fixed instruction set, with only one set active at a given time; and creating and combining new instructions so that the processor recognizes and uses them in real time as if they belonged to the fixed instruction set. Instructions are created and combined through a reconfiguration unit incorporated into the processor. This unit allows the user to send custom instructions to the processor and later use them as if they were fixed instructions. The work also includes simulations of applications involving fixed and custom instructions, as well as comparisons of these applications in terms of power consumption and execution time, which confirm that the goals for which the processor was developed were achieved.
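A software analogy of the two concepts above (multiple fixed instruction sets with only one active at a time, plus run-time registration of custom instructions through a reconfiguration unit) is sketched below; it illustrates only the dispatch idea, not the VHDL processor, and all names are hypothetical.

    # Toy dispatch model (software analogy, not the VHDL RISP processor):
    # several fixed instruction sets, one active at a time, plus a reconfiguration
    # unit that registers custom instructions usable as if they were fixed ones.
    class RispModel:
        def __init__(self, fixed_sets, active="setA"):
            self.fixed_sets = fixed_sets        # name -> {opcode: function}
            self.active = active
            self.custom = {}                    # instructions added at run time

        def switch_set(self, name):             # activate another fixed instruction set
            self.active = name

        def reconfigure(self, opcode, func):    # reconfiguration unit: add a custom instruction
            self.custom[opcode] = func

        def execute(self, opcode, *args):
            table = {**self.fixed_sets[self.active], **self.custom}
            return table[opcode](*args)

    cpu = RispModel({"setA": {"ADD": lambda a, b: a + b},
                     "setB": {"SUB": lambda a, b: a - b}})
    cpu.reconfigure("MAC", lambda a, b, acc: acc + a * b)          # custom multiply-accumulate
    print(cpu.execute("ADD", 2, 3), cpu.execute("MAC", 2, 3, 10))  # 5 16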
Abstract:
This work develops a new methodology to discriminate between models for interval-censored data based on bootstrap residual simulation, observing the deviance difference of one model relative to another, following Hinde (1992). This sort of data can generate a large number of tied observations, in which case the survival time can be regarded as discrete. Therefore, the Cox proportional hazards model for grouped data (Prentice & Gloeckler, 1978) and the logistic model (Lawless, 1982) can be fitted by means of generalized linear models. Whitehead (1989) treated censoring as an indicator variable with a binomial distribution and fitted the Cox proportional hazards model using the complementary log-log link function; likewise, a logistic model can be fitted using the logit link function. The proposed methodology is an alternative to the score tests developed by Colosimo et al. (2000), in which such models are obtained for discrete binary data as particular cases of the asymmetric Aranda-Ordaz family; those tests are developed on the basis of the link functions used to generate the fit. The motivating example is a dataset from an experiment on a flax cultivar planted on four substrata susceptible to the pathogen Fusarium oxysporum. The response variable, the time until blighting, was observed at intervals over 52 days. The results were compared using the model fits and the AIC values.
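As an illustration of the model-fitting step (a sketch assuming the grouped-data Cox and logistic models are fitted as binomial GLMs with complementary log-log and logit links, as described above; the data, covariate, and column names are hypothetical, and the link-class names follow recent statsmodels releases):

    # Sketch: fitting the two competing models for one interval as binomial GLMs
    # (complementary log-log link ~ grouped-data Cox model; logit link ~ logistic model).
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 200
    substratum = rng.integers(0, 4, n).astype(float)   # hypothetical covariate (4 substrata)
    X = sm.add_constant(substratum)
    event = rng.integers(0, 2, n)                      # 1 = blighted within the interval

    cloglog_fit = sm.GLM(event, X, family=sm.families.Binomial(
        link=sm.families.links.CLogLog())).fit()
    logit_fit = sm.GLM(event, X, family=sm.families.Binomial(
        link=sm.families.links.Logit())).fit()

    # Observed deviance difference; its bootstrap distribution (simulating residuals,
    # as in Hinde, 1992) is what the proposed methodology uses to discriminate the models.
    print(cloglog_fit.deviance - logit_fit.deviance)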
Abstract:
A performance comparison between a recently proposed technique known as fast orthogonal frequency-division multiplexing (FOFDM) and conventional orthogonal frequency-division multiplexing (OFDM) is undertaken over unamplified, intensity-modulated, directly detected, directly modulated laser-based optical signals. Key transceiver parameters, such as the maximum achievable transmission capacity and the digital-to-analog/analog-to-digital converter (DAC/ADC) effects, are explored thoroughly. It is shown that, similarly to conventional OFDM, the least complex and bandwidth-efficient FOFDM can support up to ~20 Gb/s over 500 m worst-case multimode fiber (MMF) links having 3 dB effective bandwidths of ~200 MHz·km. For compensation of the DAC/ADC roll-off, a power-loading (PL) algorithm is adopted, leading to an FOFDM system improvement of ~4 dB. FOFDM and conventional OFDM give similar optimum DAC/ADC parameters over 500 m worst-case MMF, while over 50 km single-mode fiber a maximum deviation of only ~1 dB in clipping ratio is observed, due to the imperfect chromatic dispersion compensation of one-tap equalizers.
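The power-loading step mentioned above can be sketched generically as a pre-emphasis of the subcarrier powers against the converter roll-off (an illustrative sketch only, not the paper's specific PL algorithm; the roll-off model and parameters are hypothetical).

    # Generic pre-emphasis sketch of power loading against DAC/ADC roll-off
    # (illustrative only; the roll-off model and parameters are hypothetical).
    import numpy as np

    n_sub = 32
    f = np.arange(1, n_sub + 1) / n_sub                 # normalized subcarrier frequencies
    rolloff = 1.0 / np.sqrt(1.0 + (f / 0.6) ** 2)       # hypothetical first-order DAC/ADC response

    p = 1.0 / rolloff**2                                # load more power where the response is weaker
    p *= n_sub / p.sum()                                # renormalize to keep total power fixed
    print(np.round(p, 2))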
Abstract:
An accurate switched-current (SI) memory cell suitable for low-voltage low-power (LVLP) applications is proposed. Information is memorized as the gate voltage of the input transistor in a tunable gain-boosting triode transconductor. Additionally, four-quadrant multiplication between the input voltage to the transconductor regulation amplifier (X operand) and the stored voltage (Y operand) is provided. A simplified 2 x 2 memory array was prototyped in a standard 0.8 μm n-well CMOS process with a 1.8 V supply. The measured current-reproduction error is less than 0.26% for 0.25 μA ≤ I_SAMPLE ≤ 0.75 μA. Standby consumption is 6.75 μW per cell at I_SAMPLE = 0.75 μA. At room temperature, the leakage rate is 1.56 nA/ms. The four-quadrant multiplier (4QM) full-scale operands are 2x_max = 320 mVpp and 2y_max = 448 mVpp, yielding a maximum output swing of 0.9 μApp. The 4QM worst-case nonlinearity is 7.9%.
Abstract:
Although cluster environments have enormous potential processing power, real applications that take advantage of this power remain an elusive goal. This is due, in part, to the lack of understanding about the characteristics of the applications best suited for these environments. This paper focuses on Master/Slave applications for large heterogeneous clusters. It defines application, cluster and execution models to derive an analytic expression for the execution time. It defines speedup and derives speedup bounds based on the inherent parallelism of the application and the aggregated computing power of the cluster. The paper derives an analytical expression for efficiency and uses it to define scalability of the algorithm-cluster combination based on the isoefficiency metric. Furthermore, the paper establishes necessary and sufficient conditions for an algorithm-cluster combination to be scalable which are easy to verify and use in practice. Finally, it covers the impact of network contention as the number of processors grows. (C) 2007 Elsevier B.V. All rights reserved.
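For reference, the standard textbook definitions behind the metrics named above (not the paper's specific derived expressions) are

    S(p) = \frac{T(1)}{T(p)}, \qquad E(p) = \frac{S(p)}{p},

and, writing W for the problem size and T_o(W, p) for the total parallel overhead, holding the efficiency at a fixed value E requires the problem size to grow according to the isoefficiency relation

    W = \frac{E}{1 - E}\, T_o(W, p).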
Abstract:
A constant-current stimulator for high-impedance loads using only low-cost, standard high-voltage components is presented. A voltage regulator powers an oscillator built across the primary of a step-up transformer whose secondary supplies, after rectification, the high voltage to a switched current mirror in the driving stage. Adjusting the regulated voltage controls the pulsed-current intensity. A prototype produces stimuli with amplitude and pulse width within 0 ≤ I_skin ≤ 20 mA and 50 μs ≤ T_pulse ≤ 1 ms, respectively. The pulse-repetition rate spans 1 Hz to 10 Hz. Worst-case ripple is 3.7% at I_skin = 1 mA. Overall consumption is 5.6 W at I_skin = 20 mA.
Abstract:
An important stage in the design of active vibration control for flexible structures is the optimal placement of sensors and actuators. In many works, the positioning of these devices in distributed-parameter systems is based mainly on controllability approaches or performance criteria, and the positions that enhance such measures are considered optimal. These techniques do not take into account the spatial variation of disturbances. One way to enhance the robustness of the control design is to locate the actuators considering the spatial distribution of the worst-case disturbance. This paper addresses the inclusion of external disturbances in the formulation of the optimal placement problem for sensors and piezoelectric actuators. The paper concludes with a numerical simulation of a truss structure in which the disturbance is applied at a point known a priori. The system C norm is used as the objective function, and the LQR (Linear Quadratic Regulator) controller is used to quantify the performance of different sensor/actuator configurations.
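As a sketch of the LQR evaluation step (a generic continuous-time LQR on a toy two-state model; the system matrices and weights are hypothetical and do not represent the truss model of the paper):

    # Generic continuous-time LQR sketch (toy one-mode system, not the paper's truss).
    import numpy as np
    from scipy.linalg import solve_continuous_are

    A = np.array([[0.0, 1.0], [-4.0, -0.2]])      # hypothetical structural mode dynamics
    B = np.array([[0.0], [1.0]])                  # input matrix for one actuator placement
    Q = np.diag([10.0, 1.0])                      # state weighting
    R = np.array([[0.1]])                         # control-effort weighting

    P = solve_continuous_are(A, B, Q, R)          # algebraic Riccati equation
    K = np.linalg.solve(R, B.T @ P)               # LQR gain: K = R^{-1} B^T P
    print(K, np.trace(P))                         # trace(P) as one simple cost proxy per placement

Repeating such an evaluation for each candidate actuator position is one simple way to compare configurations.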
Abstract:
This work analyzes the application and the execution time of a numerical algorithm that simulates incompressible, isothermal flows. The explicit scheme of the Characteristic-Based Split (CBS) algorithm was used, together with the Artificial Compressibility (AC) scheme for coupling the pressure and velocity equations. The discretization was carried out with the finite element method on a grid of bilinear elements. The free software GNU Octave was used to implement and run the routines. The results were analyzed using the classic lid-driven cavity problem, with tests at several Reynolds numbers. The results of these tests show good agreement with previous results from the literature. The analysis of the code's runtime also shows that matrix assembly is the most time-consuming part of the implementation.
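For context, one common explicit form of the AC pressure-velocity coupling (a generic textbook form, not necessarily the exact expression implemented in this work) replaces the incompressibility constraint by

    \frac{1}{\beta^{2}} \frac{\partial p}{\partial t} + \nabla \cdot \mathbf{u} = 0
    \quad \Longrightarrow \quad
    p^{n+1} = p^{n} - \Delta t \, \beta^{2} \, \nabla \cdot \mathbf{u}^{*},

where β is the artificial compressibility (wave-speed) parameter and u* is the intermediate velocity from the CBS split.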
Abstract:
The multi-relational data mining approach has emerged as an alternative for the analysis of structured data, such as relational databases. Unlike traditional algorithms, multi-relational proposals allow mining multiple tables directly, avoiding costly join operations. This paper presents a comparative study involving the traditional Patricia Mine algorithm and its proposed multi-relational counterpart, MR-Radix, in order to evaluate the performance of the two approaches for mining association rules in relational databases. The study presents two original contributions: the proposal of the multi-relational algorithm MR-Radix, which is efficient for use in relational databases both in execution time and in memory usage; and the empirical demonstration that the multi-relational approach performs better over several tables, as it avoids costly join operations across multiple tables. © 2011 IEEE.
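A minimal illustration of why avoiding the join matters (hypothetical toy tables, not the datasets used in the paper): a one-to-many join materializes the attributes of each one-side row once per matching row on the many side, which is exactly the replication that direct multi-table mining seeks to avoid.

    # Toy illustration of join replication (hypothetical tables, not the paper's data).
    import pandas as pd

    customers = pd.DataFrame({"cust_id": [1, 2], "city": ["Natal", "Recife"]})
    purchases = pd.DataFrame({"cust_id": [1, 1, 1, 2, 2], "item": list("ABCAD")})

    joined = customers.merge(purchases, on="cust_id")
    print(len(customers), len(purchases), len(joined))   # 2 5 5 -> customer attributes replicated per item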
Abstract:
In high-energy heavy-ion collisions a hot and dense medium is formed, in which hadronic masses may be shifted from their asymptotic values. If this mass modification occurs, squeezed back-to-back correlations (BBC) of particle-antiparticle pairs are predicted to appear, both in the fermionic (fBBC) and bosonic (bBBC) sectors. Although they have unlimited intensity even for finite-size expanding systems, these hadronic squeezed correlations are very sensitive to the time distribution of particle emission. Here we discuss results for the case in which this emission time is parameterized by a Lévy-type distribution, showing that it reduces the signal even more dramatically than a Lorentzian distribution, which already reduces the intensity of the effect by orders of magnitude compared to sudden emission. However, we show that the signal could still survive if the duration of the process is short and if the effect is searched for with lighter mesons, such as kaons. We compare some of our results to recent PHENIX preliminary data on squeezed correlations of K+K− pairs. © 2011 Pleiades Publishing, Ltd.