17 results for PROCESSOR
in Digital Commons at Florida International University
Abstract:
A multipurpose open-architecture motion control system was developed with three platforms for control and monitoring. The Visual Basic user interface communicated with the operator and sent instructions to the electronic components. The first platform had a BASIC Stamp based controller and three stepping motors. The second platform had a controller, amplifiers, and two DC servomotors. The third platform had a DSP module. In this study, each platform was used on machine tools either to move the table or to evaluate the incoming signal. The study indicated that motion control of different systems can be realized in a short time by using advanced microcontrollers that support high-level languages, motor controllers, digital signal processors (DSPs), and microcomputers. Although the proposed systems had some limitations, they performed their jobs effectively.
Abstract:
Buffered crossbar switches have recently attracted considerable attention as the next generation of high speed interconnects. They are a special type of crossbar switches with an exclusive buffer at each crosspoint of the crossbar. They demonstrate unique advantages over traditional unbuffered crossbar switches, such as high throughput, low latency, and asynchronous packet scheduling. However, since crosspoint buffers are expensive on-chip memories, it is desired that each crosspoint has only a small buffer. This dissertation proposes a series of practical algorithms and techniques for efficient packet scheduling for buffered crossbar switches. To reduce the hardware cost of such switches and make them scalable, we considered partially buffered crossbars, whose crosspoint buffers can be of an arbitrarily small size. Firstly, we introduced a hybrid scheme called Packet-mode Asynchronous Scheduling Algorithm (PASA) to schedule best effort traffic. PASA combines the features of both distributed and centralized scheduling algorithms and can directly handle variable length packets without Segmentation And Reassembly (SAR). We showed by theoretical analysis that it achieves 100% throughput for any admissible traffic in a crossbar with a speedup of two. Moreover, outputs in PASA have a large probability to avoid the more time-consuming centralized scheduling process, and thus make fast scheduling decisions. Secondly, we proposed the Fair Asynchronous Segment Scheduling (FASS) algorithm to handle guaranteed performance traffic with explicit flow rates. FASS reduces the crosspoint buffer size by dividing packets into shorter segments before transmission. It also provides tight constant performance guarantees by emulating the ideal Generalized Processor Sharing (GPS) model. Furthermore, FASS requires no speedup for the crossbar, lowering the hardware cost and improving the switch capacity. 
Thirdly, we presented a bandwidth allocation scheme called Queue Length Proportional (QLP) to apply FASS to best effort traffic. QLP dynamically obtains a feasible bandwidth allocation matrix based on the queue length information, and thus assists the crossbar switch to be more work-conserving. The feasibility and stability of QLP were proved, no matter whether the traffic distribution is uniform or non-uniform. Hence, based on bandwidth allocation of QLP, FASS can also achieve 100% throughput for best effort traffic in a crossbar without speedup.
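As a rough illustration of the queue-length-proportional idea, the sketch below derives a bandwidth allocation matrix from a matrix of queue lengths. The abstract does not give the exact QLP formula, so the normalization used here (dividing each queue length by the larger of its input total and its output total) and the name `qlp_allocate` are assumptions, chosen only because they guarantee a feasible result: no input or output is allocated more than its full capacity of 1.

```python
def qlp_allocate(queue_lengths):
    """Return a feasible allocation matrix proportional to queue lengths.

    queue_lengths[i][j] is the backlog queued at input i for output j.
    Hypothetical rule (not taken from the dissertation): each entry is divided
    by the larger of its row total and its column total.
    """
    n_in = len(queue_lengths)
    n_out = len(queue_lengths[0]) if n_in else 0
    row_totals = [sum(row) for row in queue_lengths]
    col_totals = [sum(queue_lengths[i][j] for i in range(n_in))
                  for j in range(n_out)]

    alloc = [[0.0] * n_out for _ in range(n_in)]
    for i in range(n_in):
        for j in range(n_out):
            q = queue_lengths[i][j]
            if q > 0:
                # Dividing by the larger total keeps both the row sum and the
                # column sum of the allocation at or below 1.
                alloc[i][j] = q / max(row_totals[i], col_totals[j])
    return alloc


def is_feasible(alloc, eps=1e-9):
    """Check that no input (row) or output (column) exceeds capacity 1."""
    rows_ok = all(sum(row) <= 1 + eps for row in alloc)
    cols_ok = all(sum(col) <= 1 + eps for col in zip(*alloc))
    return rows_ok and cols_ok
```

For example, `qlp_allocate([[2, 2], [6, 0]])` gives the first output, which is the most heavily loaded port, a fully used column, while every other row and column stays below capacity.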
Abstract:
Today, most conventional surveillance networks are based on analog systems, which have many constraints, such as manpower and high bandwidth requirements. These constraints have become a barrier to the development of today's surveillance networks. This dissertation describes a digital surveillance network architecture based on the H.264 coding/decoding (CODEC) System-on-a-Chip (SoC) platform. The proposed architecture includes three major layers: the software layer, the hardware layer, and the network layer. The contributions to the proposed architecture are as follows. (1) We implement an object recognition system and an object categorization system on the software layer by applying several Digital Image Processing (DIP) algorithms. (2) For a better compression ratio and higher-quality video transfer, we implement two new modules on the hardware layer of the H.264 CODEC core: the background elimination module and the Directional Discrete Cosine Transform (DDCT) module. (3) Furthermore, we introduce a Digital Signal Processor (DSP) sub-system on the main bus of H.264 SoC platforms as the major hardware support system for our software architecture. We thus combine the software and hardware platforms into an intelligent surveillance node. Lab results show that the proposed surveillance node can dramatically save network resources such as bandwidth and storage capacity.
Abstract:
High efficiency of power converters placed between renewable energy sources and the utility grid is required to maximize the utilization of these sources. Power quality is another aspect that requires large passive elements (inductors, capacitors) to be placed between these sources and the grid. The main objective is to develop a higher-level high-frequency-based power converter system (HFPCS) that optimizes the use of hybrid renewable power injected into the power grid. The HFPCS provides high efficiency, reduced size of passive components, higher power density, lower harmonic distortion, higher reliability, and lower cost. A dynamic model of each part of this system was developed, simulated, and tested. The steady-state performance of the grid-connected hybrid power system with battery storage was analyzed. Various types of simulations were performed, and a number of algorithms were developed and tested to verify the effectiveness of the power conversion topologies. A modified hysteresis-control strategy for the rectifier and the battery charging/discharging system was developed and implemented. A voltage oriented control (VOC) scheme was developed to control the energy injected into the grid. The developed HFPCS was compared experimentally with other currently available power converters. The developed HFPCS was employed inside a microgrid system infrastructure, connecting it to the power grid to verify its power transfer capabilities and grid connectivity. Grid connectivity tests verified the power transfer capabilities of the developed converter as well as its ability to serve the load in a shared manner. To investigate the performance of the developed system, an experimental setup for the HF-based hybrid generation system was constructed. We designed a board containing a digital signal processor chip on which the developed control system was embedded. The board was fabricated and experimentally tested. 
The system's high precision requirements were verified. Each component of the system was built and tested separately, and then the whole system was connected and tested. The simulation and experimental results confirm the effectiveness of the developed converter system for grid-connected hybrid renewable energy systems as well as for hybrid electric vehicles and other industrial applications.
Abstract:
Over the past few decades, we have enjoyed tremendous benefits from the revolutionary advancement of computing systems, driven mainly by remarkable semiconductor technology scaling and increasingly complicated processor architectures. However, exponentially increasing transistor density has led directly to exponentially increasing power consumption and dramatically elevated system temperature, which not only adversely affects the system's cost, performance, and reliability, but also increases leakage and thus overall power consumption. Today, power and thermal issues pose enormous challenges and threaten to slow the continued evolution of computer technology. Effective power/thermal-aware design techniques are urgently needed at all design abstraction levels, from the circuit level and the logic level to the architectural level and the system level. In this dissertation, we present our research efforts to employ real-time scheduling techniques to solve resource-constrained power/thermal-aware design-optimization problems. We developed a set of simple yet accurate system-level models to capture the processor's thermal dynamics as well as the interdependency of leakage power consumption, temperature, and supply voltage. Based on these models, we investigated the fundamental principles of power/thermal-aware scheduling and developed real-time scheduling techniques targeting a variety of design objectives, including peak temperature minimization, overall energy reduction, and performance maximization. The novelty of this work is that we integrate cutting-edge research on power and thermal issues at the circuit and architectural levels into a set of accurate yet simplified system-level models, and are able to conduct system-level analysis and design based on these models. 
The theoretical study in this work serves as a solid foundation for guiding the development of power/thermal-aware scheduling algorithms in practical computing systems.
Abstract:
Fueled by an increasing human appetite for high computing performance, semiconductor technology has now marched into the deep sub-micron era. As transistor size keeps shrinking, more and more transistors are integrated into a single chip. This has tremendously increased the power consumption and heat generation of IC chips. The rapidly growing heat dissipation greatly increases packaging and cooling costs, and adversely affects the performance and reliability of a computing system. In addition, it reduces the processor's life span and may even crash the entire computing system. Therefore, dynamic thermal management (DTM) is becoming a critical problem in modern computer system design. Extensive theoretical research has been conducted to study the DTM problem. However, most of this work is based on theoretically idealized assumptions or simplified models. While these models and assumptions help to greatly simplify a complex problem and make it theoretically manageable, practical computer systems and applications must deal with many practical factors and details beyond these models or assumptions. The goal of our research was to develop a test platform that can be used to validate theoretical results on DTM under well-controlled conditions, to identify the limitations of existing theoretical results, and also to develop new and practical DTM techniques. This dissertation details the background and our research efforts in this endeavor. Specifically, we first developed a customized test platform based on an Intel desktop. We then tested a number of related theoretical works and examined their limitations under the practical hardware environment. With these limitations in mind, we developed a new reactive thermal management algorithm for single-core computing systems to optimize throughput under a peak temperature constraint. 
We further extended our research to a multicore platform and developed an effective proactive DTM technique for throughput maximization on multicore processors, based on task migration and dynamic voltage and frequency scaling. The significance of our research lies in the fact that it complements the current extensive theoretical research in dealing with increasingly critical thermal problems and in enabling the continuous evolution of high-performance computing systems.
Abstract:
Memory (cache, DRAM, and disk) is in charge of providing data and instructions to a computer's processor. In order to maximize performance, the speeds of the memory and the processor should be equal. However, using memory that always matches the speed of the processor is prohibitively expensive. Computer hardware designers have managed to drastically lower the cost of the system with the use of memory caches by sacrificing some performance. A cache is a small piece of fast memory that stores popular data so it can be accessed faster. Modern computers have evolved into a hierarchy of caches, where a memory level is the cache for a larger and slower memory level immediately below it. Thus, by using caches, manufacturers are able to store terabytes of data at the cost of the cheapest memory while achieving speeds close to the speed of the fastest one. The most important decision about managing a cache is what data to store in it. Failing to make good decisions can lead to performance overheads and over-provisioning. Surprisingly, caches choose data to store based on policies that have not changed in principle for decades. However, computing paradigms have changed radically, leading to two noticeably different trends. First, caches are now consolidated across hundreds to even thousands of processes. Second, caching is being employed at new levels of the storage hierarchy due to the availability of high-performance flash-based persistent media. This brings four problems. First, as the workloads sharing a cache increase, it is more likely that they contain duplicated data. Second, consolidation creates contention for caches, and if not managed carefully, it translates to wasted space and sub-optimal performance. Third, as contended caches are shared by more workloads, administrators need to carefully estimate specific per-workload requirements across the entire memory hierarchy in order to meet per-workload performance goals. 
And finally, current cache write policies are unable to simultaneously provide performance and consistency guarantees for the new levels of the storage hierarchy. We addressed these problems by modeling their impact and by proposing solutions for each of them. First, we measured and modeled the amount of duplication at the buffer cache level and contention in real production systems. Second, we created a unified model of workload cache usage under contention to be used by administrators for provisioning, or by process schedulers to decide what processes to run together. Third, we proposed methods for removing cache duplication and eliminating space wasted because of contention. And finally, we proposed a technique to improve the consistency guarantees of write-back caches while preserving their performance benefits.
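The question of what data a cache should keep can be made concrete with a small sketch. The abstract does not name a specific replacement policy, so least-recently-used (LRU) eviction is used here purely as an assumed, well-known example of a long-standing policy; the class name `LRUCache` and its hit and miss counters are hypothetical and are not taken from the dissertation.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        return None

    def put(self, key, value):
        """Insert or update an entry, evicting the oldest one if full."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
```

When several workloads share one such cache, a burst of accesses from one workload can push out another workload's entries, which is the kind of contention the abstract describes.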
Abstract:
In his discussion, Database As A Tool For Hospitality Management, William O’Brien, Assistant Professor in the School of Hospitality Management at Florida International University, offers at the outset, “Database systems offer sweeping possibilities for better management of information in the hospitality industry. The author discusses what such systems are capable of accomplishing.” The author opens with a bit of background on database system development, which also gives an impression of the complexion of the rest of the article: it is a shade technical. “In early 1981, Ashton-Tate introduced dBase II. It was the first microcomputer database management processor to offer relational capabilities and a user-friendly query system combined with a fast, convenient report writer,” O’Brien informs. “When 16-bit microcomputers such as the IBM PC series were introduced late the following year, more powerful database products followed: dBase III, Friday!, and Framework. The effect on the entire business community, and the hospitality industry in particular, has been remarkable,” he further offers. Professor O’Brien offers a few anecdotal situations to illustrate how much a comprehensive database system means to a hospitality operation, especially when billing is involved. Although attitudes about computer systems, as well as the systems themselves, have changed since this article was written, there is pertinent, fundamental information to be gleaned. 
Regarding the loss of the personal touch when a customer deals with a computer system, O’Brien says, “A modern data processing system should not force an employee to treat valued customers as numbers…” He also cautions, “Any computer system that decreases the availability of the personal touch is simply unacceptable.” On a system’s ability to process information, O’Brien suggests that in the past businesses were so enamored with just having an automated system that they failed to take full advantage of its capabilities. O’Brien says that a lot of savings, in time and money, went unnoticed or underappreciated. Today, everyone has an integrated system, and the wise business manager is the one who takes full advantage of all available resources. O’Brien invokes the 80/20 rule, and offers, “…the last 20 percent of results costs 80 percent of the effort. But times have changed. Everyone is automating data management, so that last 20 percent that could be ignored a short time ago represents a significant competitive differential.” The evolution of data systems takes center stage for much of the article; pitfalls also emerge.
Abstract:
Modern System-on-a-Chip (SoC) systems have grown rapidly thanks to increased processing power, while the size of the hardware circuit has been maintained. The number of transistors on a chip continues to increase, but current SoC designs may not be able to exploit the potential performance, especially with energy consumption and chip area becoming two major concerns. Traditional SoC designs usually separate software and hardware. Thus, improving system performance is a complicated task for both software and hardware designers. The aim of this research is to develop a hardware acceleration workflow for software applications, so that system performance can be improved under constraints on energy consumption and on-chip resource costs. The characteristics of software applications can be identified by using profiling tools. Hardware acceleration can yield significant performance improvements for heavily mathematical calculations or repeated functions. The performance of SoC systems can then be improved if the hardware acceleration method is used to accelerate the element that incurs performance overheads. The concepts mentioned in this study can be easily applied to a variety of sophisticated software applications. The contributions of SoC-based hardware acceleration in the hardware-software co-design platform include the following: (1) Software profiling methods are applied to the H.264 Coder-Decoder (CODEC) core. The hotspot function of the target application is identified by using critical attributes such as cycles per loop and loop rounds. (2) A hardware acceleration method based on a Field-Programmable Gate Array (FPGA) is used to resolve system bottlenecks and improve system performance. The identified hotspot function is then converted to a hardware accelerator and mapped onto the hardware platform. Two types of hardware acceleration methods, central bus design and co-processor design, are implemented for comparison in the proposed architecture. 
(3) System specifications, such as performance, energy consumption, and resource costs, are measured and analyzed. The trade-off among these three factors is compared and balanced. Different hardware accelerators are implemented and evaluated based on system requirements. (4) The system verification platform is designed based on the Integrated Circuit (IC) workflow. Hardware optimization techniques are used for higher performance and lower resource costs. Experimental results show that the proposed hardware acceleration workflow for software applications is an efficient technique. The system can reach a 2.8X performance improvement and save 31.84% of energy consumption by applying the Bus-IP design. The Co-processor design can reach a 7.9X performance improvement and save 75.85% of energy consumption.
Abstract:
Hardware/software (HW/SW) co-simulation integrates software simulation and hardware simulation simultaneously. Usually, an HW/SW co-simulation platform is used to ease debugging and verification for very large-scale integration (VLSI) design. To accelerate the computation of the gesture recognition technique, an HW/SW implementation using field programmable gate array (FPGA) technology is presented in this paper. The major contributions of this work are: (1) A novel design of a memory controller in the Verilog Hardware Description Language (Verilog HDL) reduces memory consumption and the load on the processor. (2) The testing part of the neural network algorithm is hardwired to improve speed and performance. American Sign Language gesture recognition is chosen to verify the performance of the approach. Several experiments were carried out on four databases of the gestures (alphabet signs A to Z). (3) The major benefit of this design is that it takes only a few milliseconds to recognize a hand gesture, which makes it computationally more efficient.
Abstract:
During the past two decades, many researchers have developed methods for detecting structural defects at an early stage in order to operate aerospace vehicles safely and to reduce operating costs. The Surface Response to Excitation (SuRE) method is one of these approaches, developed at FIU to reduce the cost and size of the equipment. The SuRE method excites the surface at a series of frequencies and monitors the propagation characteristics of the generated waves. The amplitude of the waves reaching any point on the surface varies with frequency; however, it remains consistent as long as the integrity and strain distribution of the part are consistent. These spectral characteristics change when cracks develop or the strain distribution changes. Structural health monitoring (SHM) methods may be used for many applications, from the detection of loose screws to the monitoring of manufacturing operations. A scanning laser vibrometer was used in this study to investigate the characteristics of the spectral changes at different points on the parts. The study started with detecting a load on a plate and estimating its location. Modifications made to the part by manufacturing operations were detected, and the Part-Based Manufacturing Process Performance Monitoring (PbPPM) method was developed. Hardware was prepared to demonstrate the feasibility of the proposed methods in real time. The data for the SuRE and PbPPM methods were collected successfully using low-cost piezoelectric elements and the non-contact scanning laser vibrometer. Locational force, loose bolts, and material loss could be easily detected by comparing the spectral characteristics of the arriving waves. On-line methods used fast computational methods for estimating the spectrum and detecting changing operational conditions from the sum of the squares of the variations. Neural networks classified the spectra when the desktop-DSP combination was used. The results demonstrated the feasibility of the SuRE and PbPPM methods.
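The sum-of-squares comparison mentioned for the on-line methods can be sketched as follows. The abstract gives no further detail, so this is only a minimal illustration under stated assumptions: spectra are plain lists of amplitudes sampled at the same excitation frequencies, and the function names and the `threshold` parameter are hypothetical.

```python
def sum_of_squared_differences(baseline, current):
    """Sum of squared amplitude differences between two spectra.

    Both spectra are lists of amplitudes sampled at the same frequencies.
    """
    if len(baseline) != len(current):
        raise ValueError("spectra must have the same number of points")
    return sum((c - b) ** 2 for b, c in zip(baseline, current))


def condition_changed(baseline, current, threshold):
    """Flag a change when the spectral deviation exceeds a chosen threshold.

    The threshold is an assumed tuning parameter, not a value from the study.
    """
    return sum_of_squared_differences(baseline, current) > threshold
```

A spectrum identical to the baseline gives a deviation of zero, so in this sketch any load, loose bolt, or material loss would have to alter the spectrum by more than the chosen threshold before it is flagged.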
Abstract:
Communication has become an essential function in our civilization. With the increasing demand for communication channels, it is now necessary to find ways to optimize the use of their bandwidth. One way to achieve this is by transforming the information before it is transmitted. This transformation can be performed by several techniques. One of the newest of these techniques is the use of wavelets. Wavelet transformation refers to the act of breaking down a signal into components called details and trends by using small waveforms that have a zero average in the time domain. After this transformation, the data can be compressed by discarding the details and transmitting only the trends. At the receiving end, the trends are used to reconstruct the image. In this work, the wavelet used for the transformation of an image is selected from a library of available bases. The accuracy of the reconstruction, after the details are discarded, depends on the wavelets chosen from the wavelet basis library. The system developed in this thesis takes a 2-D image and decomposes it using a wavelet bank. A digital signal processor is used to achieve near real-time performance in this transformation task. A contribution of this thesis project is the development of a DSP-based test bed for the future development of new real-time wavelet transformation algorithms.
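The split into trends and details can be illustrated with a single level of the Haar wavelet in its simple, unnormalized averaging form. The choice of Haar is an assumption made for brevity, since the thesis selects wavelets from a library of bases, and the function names are hypothetical. The sketch works on one row of samples; a 2-D image would typically be processed by applying the same step to rows and then to columns.

```python
def haar_decompose(signal):
    """Split an even-length signal into trends (averages) and details."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    trends = [(a + b) / 2 for a, b in pairs]   # low-pass: local averages
    details = [(a - b) / 2 for a, b in pairs]  # high-pass: local differences
    return trends, details


def haar_reconstruct(trends, details=None):
    """Rebuild the signal; missing details are treated as zeros (lossy)."""
    if details is None:
        details = [0.0] * len(trends)
    signal = []
    for t, d in zip(trends, details):
        signal.extend([t + d, t - d])
    return signal
```

Reconstructing with both components returns the original samples exactly, while reconstructing from the trends alone replaces each pair of samples with its average, which is the loss incurred by discarding the details.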
Abstract:
Unequal improvements in processor and I/O speeds have made many applications, such as databases and operating systems, increasingly I/O bound. Many schemes, such as disk caching and disk mirroring, have been proposed to address the problem. In this thesis we focus only on disk mirroring. In disk mirroring, a logical disk image is maintained on two physical disks, allowing a single disk failure to be transparent to application programs. Although disk mirroring improves data availability and reliability, it has two major drawbacks. First, writes are expensive because both disks must be updated. Second, load balancing during failure mode operation is poor because all requests are serviced by the surviving disk. Distorted mirrors were proposed to address the write problem, and interleaved declustering to address the load balancing problem. In this thesis we perform a comparative study of these two schemes under various operating modes. In addition, we study traditional mirroring to provide a common basis for comparison.
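The two drawbacks of traditional mirroring can be seen in a toy model. This is only a sketch under stated assumptions: disks are modeled as in-memory dictionaries, reads alternate between surviving disks in round-robin order (a policy the abstract does not specify), and the class and attribute names are hypothetical. It models traditional mirroring only, not distorted mirrors or interleaved declustering.

```python
class MirroredDisk:
    """Toy model of a two-disk mirror that counts I/O operations per disk."""

    def __init__(self):
        self.disks = [{}, {}]      # block number -> data, one dict per disk
        self.alive = [True, True]
        self.io_counts = [0, 0]
        self._next = 0             # round-robin cursor for reads

    def write(self, block, data):
        """A logical write must update every surviving copy."""
        for i, disk in enumerate(self.disks):
            if self.alive[i]:
                disk[block] = data
                self.io_counts[i] += 1

    def read(self, block):
        """Serve the read from one surviving disk, alternating between them."""
        candidates = [i for i in range(2) if self.alive[i]]
        if not candidates:
            raise IOError("both disks have failed")
        i = candidates[self._next % len(candidates)]
        self._next += 1
        self.io_counts[i] += 1
        return self.disks[i][block]

    def fail(self, index):
        """Mark one disk as failed; the survivor then takes all requests."""
        self.alive[index] = False
```

In normal mode one logical write costs two physical writes, and after `fail(1)` every request lands on disk 0, which is the load imbalance described above.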
Abstract:
The purpose of this study was to analyze network performance by observing the effect of varying network size and data link rate on one of the most commonly found network configurations. Computer networks have been growing explosively. Networking is used in every aspect of business, including advertising, production, shipping, planning, billing, and accounting. Communication takes place through networks that form the basis for the transfer of information. The number and type of components may vary from network to network depending on several factors, such as requirements and the actual physical placement of the networks. There is no fixed size for networks: they can be very small, consisting of, say, five or six nodes, or very large, consisting of over two thousand nodes. The varying network sizes make it very important to study network performance in order to predict the functioning and suitability of a network. The findings demonstrated that network performance parameters such as global delay, load, router processor utilization, and router processor delay are affected significantly by an increase in the size of the network, and that there is a correlation between these parameters and the size of the network. These variations depend not only on the magnitude of the change in the actual physical area of the network but also on the data link rate used to connect the various components of the network.