961 resultados para graphics processing unit (GPU)
Resumo:
This paper aims to design and develop a control and monitoring system of vending machines, based on a Central Processing Unit with peripheral Internet communication. Coupled with the condom vending machines, a data acquisition module will be connected to the original circuits in order to collect and send, via internet, the information to the healthy government agencies, in the form of charts and reports. In the face of this, such agencies may analyze these data and compare them with the rates of reduction, in medium or long term, of the STD/AIDS in their respective regions, after the implementation of these vending machines, together with the conventional preventing programs. Reading the methodology, this paper is about an explaining and bibliography research, with the aspect of a qualitative-quantitative methodology, presenting a deductive method of approach and an indirect documentation technique research. About the results of the tests and simulations, we concluded that the implementation of this system will have the same success in any other type of dispenser machine
Resumo:
This paper traces the development of a software tool, based oil a combination of artificial neural networks (ANN) and a few process equations. aiming to serve as a backup operation instrument in the reference generation for real-time controllers of a steel tandem cold mill By emulating the mathematical model responsible for generating presets under normal operational conditions, the system works as ail option to maintain plant operation in the event of a failure in the processing unit that executes the mathematical model. The system, built from the production data collected over six years of plant operation, steered to the replacement of the former backup operation mode (based oil a lookup table). which degraded both product quality and plant productivity. The study showed that ANN are appropriated tools for the intended purpose and that by this instrument it is possible to achieve nearly the totality of the presets needed by this land of process. The text characterizes the problem, relates the investigated options to solve it. justifies the choice of the ANN approach, describes the methodology and system implementation and, finally, shows and discusses the attained results. (C) 2009 Elsevier Ltd. All rights reserved
Resumo:
Este artigo apresenta um estudo experimental de técnicas de identificação paramétrica aplicadas à modelagem dinâmica de um servidor web Apache. Foi desenvolvido um arranjo experimental para simular variações de carga no servidor. O arranjo é composto por dois computadores PC, sendo um deles utilizado para executar o servidor Apache e o outro utilizado como um gerador de carga, solicitando requisições de serviço ao servidor Apache. Foram estimados modelos paramétricos auto-regressivos (AR) para diferentes pontos de operação e de condição de carga. Cada ponto de operação foi definido em termos dos valores médios para o parâmetro de entrada MaxClients (parâmetro utilizado para definir o número máximo de processos ativos) e a saída percentual de consumo de CPU (Central Processing Unit) do servidor Apache. Para cada ponto de operação foram coletadas 600 amostras, com um intervalo de amostragem de 5 segundos. Metade do conjunto de amostras coletadas em cada ponto de operação foi utilizada para estimação do modelo, enquanto que a outra metade foi utilizada para validação. Um estudo da ordem mais adequada do modelo mostrou que, para um ponto de operação com valor reduzido de MaxClients, um modelo AR de 7a ordem pode ser satisfatório. Para valores mais elevados de MaxClients, os resultados mostraram que são necessários modelos de ordem mais elevada, devido às não-linearidades inerentes ao sistema.
Resumo:
In general, pattern recognition techniques require a high computational burden for learning the discriminating functions that are responsible to separate samples from distinct classes. As such, there are several studies that make effort to employ machine learning algorithms in the context of big data classification problems. The research on this area ranges from Graphics Processing Units-based implementations to mathematical optimizations, being the main drawback of the former approaches to be dependent on the graphic video card. Here, we propose an architecture-independent optimization approach for the optimum-path forest (OPF) classifier, that is designed using a theoretical formulation that relates the minimum spanning tree with the minimum spanning forest generated by the OPF over the training dataset. The experiments have shown that the approach proposed can be faster than the traditional one in five public datasets, being also as accurate as the original OPF. (C) 2014 Elsevier B. V. All rights reserved.
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
The manufacture of plywood panel already has about eighty years in Brazil. In general, plywood panels differ by their nature, composition and manufacturing; the most common MDF, OSB, particle and the joist. Some of these products can be generated fromindustrial waste, as the panelblockboard, you can use in their training waste destop boards (Battens), from industrial processes or processing of primary wood. This work aims to present the detailed study of the economic viability of utilization of industrial solid waste timber through the acquisition of a machine amarradeira of joists in a wood processing unit in city Itapeva-SP. We studied the application of wastein the composition of kernels blockboard panel tied its commercial acceptance and future prospects market. We noticed great economic advantage in investing in the study, compared with the transformation process of the joists in biomass energy, although the investment decision involves not only financial matters but also the acceptance market the product to be manufactured
Resumo:
Pós-graduação em Biofísica Molecular - IBILCE
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
Providing support for multimedia applications on low-power mobile devices remains a significant research challenge. This is primarily due to two reasons: • Portable mobile devices have modest sizes and weights, and therefore inadequate resources, low CPU processing power, reduced display capabilities, limited memory and battery lifetimes as compared to desktop and laptop systems. • On the other hand, multimedia applications tend to have distinctive QoS and processing requirementswhichmake themextremely resource-demanding. This innate conflict introduces key research challenges in the design of multimedia applications and device-level power optimization. Energy efficiency in this kind of platforms can be achieved only via a synergistic hardware and software approach. In fact, while System-on-Chips are more and more programmable thus providing functional flexibility, hardwareonly power reduction techniques cannot maintain consumption under acceptable bounds. It is well understood both in research and industry that system configuration andmanagement cannot be controlled efficiently only relying on low-level firmware and hardware drivers. In fact, at this level there is lack of information about user application activity and consequently about the impact of power management decision on QoS. Even though operating system support and integration is a requirement for effective performance and energy management, more effective and QoSsensitive power management is possible if power awareness and hardware configuration control strategies are tightly integratedwith domain-specificmiddleware services. The main objective of this PhD research has been the exploration and the integration of amiddleware-centric energymanagement with applications and operating-system. We choose to focus on the CPU-memory and the video subsystems, since they are the most power-hungry components of an embedded system. A second main objective has been the definition and implementation of software facilities (like toolkits, API, and run-time engines) in order to improve programmability and performance efficiency of such platforms. Enhancing energy efficiency and programmability ofmodernMulti-Processor System-on-Chips (MPSoCs) Consumer applications are characterized by tight time-to-market constraints and extreme cost sensitivity. The software that runs on modern embedded systems must be high performance, real time, and even more important low power. Although much progress has been made on these problems, much remains to be done. Multi-processor System-on-Chip (MPSoC) are increasingly popular platforms for high performance embedded applications. This leads to interesting challenges in software development since efficient software development is a major issue for MPSoc designers. An important step in deploying applications on multiprocessors is to allocate and schedule concurrent tasks to the processing and communication resources of the platform. The problem of allocating and scheduling precedenceconstrained tasks on processors in a distributed real-time system is NP-hard. There is a clear need for deployment technology that addresses thesemulti processing issues. This problem can be tackled by means of specific middleware which takes care of allocating and scheduling tasks on the different processing elements and which tries also to optimize the power consumption of the entire multiprocessor platform. This dissertation is an attempt to develop insight into efficient, flexible and optimalmethods for allocating and scheduling concurrent applications tomultiprocessor architectures. It is a well-known problem in literature: this kind of optimization problems are very complex even in much simplified variants, therefore most authors propose simplified models and heuristic approaches to solve it in reasonable time. Model simplification is often achieved by abstracting away platform implementation ”details”. As a result, optimization problems become more tractable, even reaching polynomial time complexity. Unfortunately, this approach creates an abstraction gap between the optimization model and the real HW-SW platform. The main issue with heuristic or, more in general, with incomplete search is that they introduce an optimality gap of unknown size. They provide very limited or no information on the distance between the best computed solution and the optimal one. The goal of this work is to address both abstraction and optimality gaps, formulating accurate models which accounts for a number of ”non-idealities” in real-life hardware platforms, developing novel mapping algorithms that deterministically find optimal solutions, and implementing software infrastructures required by developers to deploy applications for the targetMPSoC platforms. Energy Efficient LCDBacklightAutoregulation on Real-LifeMultimediaAp- plication Processor Despite the ever increasing advances in Liquid Crystal Display’s (LCD) technology, their power consumption is still one of the major limitations to the battery life of mobile appliances such as smart phones, portable media players, gaming and navigation devices. There is a clear trend towards the increase of LCD size to exploit the multimedia capabilities of portable devices that can receive and render high definition video and pictures. Multimedia applications running on these devices require LCD screen sizes of 2.2 to 3.5 inches andmore to display video sequences and pictures with the required quality. LCD power consumption is dependent on the backlight and pixel matrix driving circuits and is typically proportional to the panel area. As a result, the contribution is also likely to be considerable in future mobile appliances. To address this issue, companies are proposing low power technologies suitable for mobile applications supporting low power states and image control techniques. On the research side, several power saving schemes and algorithms can be found in literature. Some of them exploit software-only techniques to change the image content to reduce the power associated with the crystal polarization, some others are aimed at decreasing the backlight level while compensating the luminance reduction by compensating the user perceived quality degradation using pixel-by-pixel image processing algorithms. The major limitation of these techniques is that they rely on the CPU to perform pixel-based manipulations and their impact on CPU utilization and power consumption has not been assessed. This PhDdissertation shows an alternative approach that exploits in a smart and efficient way the hardware image processing unit almost integrated in every current multimedia application processors to implement a hardware assisted image compensation that allows dynamic scaling of the backlight with a negligible impact on QoS. The proposed approach overcomes CPU-intensive techniques by saving system power without requiring either a dedicated display technology or hardware modification. Thesis Overview The remainder of the thesis is organized as follows. The first part is focused on enhancing energy efficiency and programmability of modern Multi-Processor System-on-Chips (MPSoCs). Chapter 2 gives an overview about architectural trends in embedded systems, illustrating the principal features of new technologies and the key challenges still open. Chapter 3 presents a QoS-driven methodology for optimal allocation and frequency selection for MPSoCs. The methodology is based on functional simulation and full system power estimation. Chapter 4 targets allocation and scheduling of pipelined stream-oriented applications on top of distributed memory architectures with messaging support. We tackled the complexity of the problem by means of decomposition and no-good generation, and prove the increased computational efficiency of this approach with respect to traditional ones. Chapter 5 presents a cooperative framework to solve the allocation, scheduling and voltage/frequency selection problem to optimality for energyefficient MPSoCs, while in Chapter 6 applications with conditional task graph are taken into account. Finally Chapter 7 proposes a complete framework, called Cellflow, to help programmers in efficient software implementation on a real architecture, the Cell Broadband Engine processor. The second part is focused on energy efficient software techniques for LCD displays. Chapter 8 gives an overview about portable device display technologies, illustrating the principal features of LCD video systems and the key challenges still open. Chapter 9 shows several energy efficient software techniques present in literature, while Chapter 10 illustrates in details our method for saving significant power in an LCD panel. Finally, conclusions are drawn, reporting the main research contributions that have been discussed throughout this dissertation.
Resumo:
Modern embedded systems embrace many-core shared-memory designs. Due to constrained power and area budgets, most of them feature software-managed scratchpad memories instead of data caches to increase the data locality. It is therefore programmers’ responsibility to explicitly manage the memory transfers, and this make programming these platform cumbersome. Moreover, complex modern applications must be adequately parallelized before they can the parallel potential of the platform into actual performance. To support this, programming languages were proposed, which work at a high level of abstraction, and rely on a runtime whose cost hinders performance, especially in embedded systems, where resources and power budget are constrained. This dissertation explores the applicability of the shared-memory paradigm on modern many-core systems, focusing on the ease-of-programming. It focuses on OpenMP, the de-facto standard for shared memory programming. In a first part, the cost of algorithms for synchronization and data partitioning are analyzed, and they are adapted to modern embedded many-cores. Then, the original design of an OpenMP runtime library is presented, which supports complex forms of parallelism such as multi-level and irregular parallelism. In the second part of the thesis, the focus is on heterogeneous systems, where hardware accelerators are coupled to (many-)cores to implement key functional kernels with orders-of-magnitude of speedup and energy efficiency compared to the “pure software” version. However, three main issues rise, namely i) platform design complexity, ii) architectural scalability and iii) programmability. To tackle them, a template for a generic hardware processing unit (HWPU) is proposed, which share the memory banks with cores, and the template for a scalable architecture is shown, which integrates them through the shared-memory system. Then, a full software stack and toolchain are developed to support platform design and to let programmers exploiting the accelerators of the platform. The OpenMP frontend is extended to interact with it.
Resumo:
Theories and numerical modeling are fundamental tools for understanding, optimizing and designing present and future laser-plasma accelerators (LPAs). Laser evolution and plasma wave excitation in a LPA driven by a weakly relativistically intense, short-pulse laser propagating in a preformed parabolic plasma channel, is studied analytically in 3D including the effects of pulse steepening and energy depletion. At higher laser intensities, the process of electron self-injection in the nonlinear bubble wake regime is studied by means of fully self-consistent Particle-in-Cell simulations. Considering a non-evolving laser driver propagating with a prescribed velocity, the geometrical properties of the non-evolving bubble wake are studied. For a range of parameters of interest for laser plasma acceleration, The dependence of the threshold for self-injection in the non-evolving wake on laser intensity and wake velocity is characterized. Due to the nonlinear and complex nature of the Physics involved, computationally challenging numerical simulations are required to model laser-plasma accelerators operating at relativistic laser intensities. The numerical and computational optimizations, that combined in the codes INF&RNO and INF&RNO/quasi-static give the possibility to accurately model multi-GeV laser wakefield acceleration stages with present supercomputing architectures, are discussed. The PIC code jasmine, capable of efficiently running laser-plasma simulations on Graphics Processing Units (GPUs) clusters, is presented. GPUs deliver exceptional performance to PIC codes, but the core algorithms had to be redesigned for satisfying the constraints imposed by the intrinsic parallelism of the architecture. The simulation campaigns, run with the code jasmine for modeling the recent LPA experiments with the INFN-FLAME and CNR-ILIL laser systems, are also presented.
Resumo:
During the last few decades an unprecedented technological growth has been at the center of the embedded systems design paramount, with Moore’s Law being the leading factor of this trend. Today in fact an ever increasing number of cores can be integrated on the same die, marking the transition from state-of-the-art multi-core chips to the new many-core design paradigm. Despite the extraordinarily high computing power, the complexity of many-core chips opens the door to several challenges. As a result of the increased silicon density of modern Systems-on-a-Chip (SoC), the design space exploration needed to find the best design has exploded and hardware designers are in fact facing the problem of a huge design space. Virtual Platforms have always been used to enable hardware-software co-design, but today they are facing with the huge complexity of both hardware and software systems. In this thesis two different research works on Virtual Platforms are presented: the first one is intended for the hardware developer, to easily allow complex cycle accurate simulations of many-core SoCs. The second work exploits the parallel computing power of off-the-shelf General Purpose Graphics Processing Units (GPGPUs), with the goal of an increased simulation speed. The term Virtualization can be used in the context of many-core systems not only to refer to the aforementioned hardware emulation tools (Virtual Platforms), but also for two other main purposes: 1) to help the programmer to achieve the maximum possible performance of an application, by hiding the complexity of the underlying hardware. 2) to efficiently exploit the high parallel hardware of many-core chips in environments with multiple active Virtual Machines. This thesis is focused on virtualization techniques with the goal to mitigate, and overtake when possible, some of the challenges introduced by the many-core design paradigm.
Resumo:
This paper proposes the optimization relaxation approach based on the analogue Hopfield Neural Network (HNN) for cluster refinement of pre-classified Polarimetric Synthetic Aperture Radar (PolSAR) image data. We consider the initial classification provided by the maximum-likelihood classifier based on the complex Wishart distribution, which is then supplied to the HNN optimization approach. The goal is to improve the classification results obtained by the Wishart approach. The classification improvement is verified by computing a cluster separability coefficient and a measure of homogeneity within the clusters. During the HNN optimization process, for each iteration and for each pixel, two consistency coefficients are computed, taking into account two types of relations between the pixel under consideration and its corresponding neighbors. Based on these coefficients and on the information coming from the pixel itself, the pixel under study is re-classified. Different experiments are carried out to verify that the proposed approach outperforms other strategies, achieving the best results in terms of separability and a trade-off with the homogeneity preserving relevant structures in the image. The performance is also measured in terms of computational central processing unit (CPU) times.
Resumo:
The analysis of complex nonlinear systems is often carried out using simpler piecewise linear representations of them. A principled and practical technique is proposed to linearize and evaluate arbitrary continuous nonlinear functions using polygonal (continuous piecewise linear) models under the L1 norm. A thorough error analysis is developed to guide an optimal design of two kinds of polygonal approximations in the asymptotic case of a large budget of evaluation subintervals N. The method allows the user to obtain the level of linearization (N) for a target approximation error and vice versa. It is suitable for, but not limited to, an efficient implementation in modern Graphics Processing Units (GPUs), allowing real-time performance of computationally demanding applications. The quality and efficiency of the technique has been measured in detail on two nonlinear functions that are widely used in many areas of scientific computing and are expensive to evaluate.
Resumo:
We propose and experimentally demonstrate a potentially integrable optical scheme to generate high order UWB pulses. The technique is based on exploiting the cross phase modulation generated in an InGaAsP Mach-Zehnder interferometer containing integrated semiconductor optical amplifiers, and is also adaptable to different pulse modulation formats through an optical processing unit which allows to control of the amplitude, polarity and time delay of the generated taps.