985 resultados para Field Programmable Gate Array (FPGA)
Resumo:
Esta tesis est incluida dentro del campo del campo de Multiband Orthogonal Frequency Division Multiplexing Ultra Wideband (MB-OFDM UWB), el cual ha adquirido una gran importancia en las comunicaciones inalmbricas de alta tasa de datos en la ltima dcada. UWB surgi con el objetivo de satisfacer la creciente demanda de conexiones inalmbricas en interiores y de uso domstico, con bajo coste y alta velocidad. La disponibilidad de un ancho de banda grande, el potencial para alta velocidad de transmisin, baja complejidad y bajo consumo de energa, unido al bajo coste de implementacin, representa una oportunidad nica para que UWB se convierta en una solucin ampliamente utilizada en aplicaciones de Wireless Personal Area Network (WPAN). UWB est definido como cualquier transmisin que ocupa un ancho de banda de ms de 20% de su frecuencia central, o ms de 500 MHz. En 2002, la Comisin Federal de Comunicaciones (FCC) defini que el rango de frecuencias de transmisin de UWB legal es de 3.1 a 10.6 GHz, con una energa de transmisin de -41.3 dBm/Hz. Bajo las directrices de FCC, el uso de la tecnologa UWB puede aportar una enorme capacidad en las comunicaciones de corto alcance. Considerando las ecuaciones de capacidad de Shannon, incrementar la capacidad del canal requiere un incremento lineal en el ancho de banda, mientras que un aumento similar de la capacidad de canal requiere un aumento exponencial en la energa de transmisin. En los ltimos aos, s diferentes desarrollos del UWB han sido extensamente estudiados en diferentes reas, entre los cuales, el protocolo de comunicaciones inalmbricas MB-OFDM UWB est considerado como la mejor eleccin y ha sido adoptado como estndar ISO/IEC para los WPANs. Combinando la modulacin OFDM y la transmisin de datos utilizando las tcnicas de salto de frecuencia, el sistema MB-OFDM UWB es capaz de soportar tasas de datos con que pueden variar de los 55 a los 480 Mbps, alcanzando una distancia mxima de hasta 10 metros. Se esperara que la tecnologa MB-OFDM tenga un consumo energtico muy bajo copando un are muy reducida en silicio, proporcionando soluciones de bajo coste que satisfagan las demandas del mercado. Para cumplir con todas estas expectativas, el desarrollo y la investigacin del MBOFDM UWB deben enfrentarse a varios retos, como son la sincronizacin de alta sensibilidad, las restricciones de baja complejidad, las estrictas limitaciones energticas, la escalabilidad y la flexibilidad. Tales retos requieren un procesamiento digital de la seal de ltima generacin, capaz de desarrollar sistemas que puedan aprovechar por completo las ventajas del espectro UWB y proporcionar futuras aplicaciones inalmbricas en interiores. Esta tesis se centra en la completa optimizacin de un sistema de transceptor de banda base MB-OFDM UWB digital, cuyo objetivo es investigar y disear un subsistema de comunicacin inalmbrica para la aplicacin de las Redes de Sensores Inalmbricas Visuales. La complejidad inherente de los procesadores FFT/IFFT y el sistema de sincronizacin as como la alta frecuencia de operacin para todos los elementos de procesamiento, se convierten en el cuello de la botella para el diseo y la implementacin del sistema de UWB digital en base de banda basado en MB-OFDM de baja energa. El objetivo del transceptor propuesto es conseguir baja energa y baja complejidad bajo la premisa de un alto rendimiento. Las optimizaciones estn realizadas tanto a nivel algortmico como a nivel arquitectural para todos los elementos del sistema. Una arquitectura hardware eficiente en consumo se propone en primer lugar para aquellos mdulos correspondientes a ncleos de computacin. Para el procesado de la Transformada Rpida de Fourier (FFT/IFFT), se propone un algoritmo mixed-radix, basado en una arquitectura con pipeline y se ha desarrollado un mdulo de Decodificador de Viterbi (VD) equilibrado en coste-velocidad con el objetivo de reducir el consumo energtico e incrementar la velocidad de procesamiento. Tambin se ha implementado un correlador signo-bit simple basado en la sincronizacin del tiempo de smbolo es presentado. Este correlador es usado para detectar y sincronizar los paquetes de OFDM de forma robusta y precisa. Para el desarrollo de los subsitemas de procesamiento y realizar la integracin del sistema completo se han empleado tecnologas de ltima generacin. El dispositivo utilizado para el sistema propuesto es una FPGA Virtex 5 XC5VLX110T del fabricante Xilinx. La validacin el propuesta para el sistema transceptor se ha implementado en dicha placa de FPGA. En este trabajo se presenta un algoritmo, y una arquitectura, diseado con filosofa de co-diseo hardware/software para el desarrollo de sistemas de FPGA complejos. El objetivo principal de la estrategia propuesta es de encontrar una metodologa eficiente para el diseo de un sistema de FPGA configurable optimizado con el empleo del mnimo esfuerzo posible en el sistema de procedimiento de verificacin, por tanto acelerar el periodo de desarrollo del sistema. La metodologa de co-diseo presentada tiene la ventaja de ser fcil de usar, contiene todos los pasos desde la propuesta del algoritmo hasta la verificacin del hardware, y puede ser ampliamente extendida para casi todos los tipos de desarrollos de FPGAs. En este trabajo se ha desarrollado slo el sistema de transceptor digital de banda base por lo que la comprobacin de seales transmitidas a travs del canal inalmbrico en los entornos reales de comunicacin sigue requiriendo componentes RF y un front-end analgico. No obstante, utilizando la metodologa de co-simulacin hardware/software citada anteriormente, es posible comunicar el sistema de transmisor y el receptor digital utilizando los modelos de canales propuestos por IEEE 802.15.3a, implementados en MATLAB. Por tanto, simplemente ajustando las caractersticas de cada modelo de canal, por ejemplo, un incremento del retraso y de la frecuencia central, podemos estimar el comportamiento del sistema propuesto en diferentes escenarios y entornos. Las mayores contribuciones de esta tesis son: Se ha propuesto un nuevo algoritmo 128-puntos base mixto FFT usando la arquitectura pipeline multi-ruta. Los complejos multiplicadores para cada etapa de procesamiento son diseados usando la arquitectura modificada shiftadd. Los sistemas word length y twiddle word length son comparados y seleccionados basndose en la seal para cuantizacin del SQNR y el anlisis de energas. El desempeo del procesador IFFT es analizado bajo diferentes situaciones aritmticas de bloques de punto flotante (BFP) para el control de desbordamiento, por tanto, para encontrar la arquitectura perfecta del algoritmo IFFT basado en el procesador FFT propuesto. Para el sistema de receptor MB-OFDM UWB se ha empleado una sincronizacin del tiempo innovadora, de baja complejidad y esquema de compensacin, que consiste en funciones de Detector de Paquetes (PD) y Estimacin del Offset del tiempo. Simplificando el cross-correlation y maximizar las funciones probables solo a sign-bit, la complejidad computacional se ve reducida significativamente. Se ha propuesto un sistema de decodificadores Viterbi de 64 estados de decisin-dbil usando velocidad base-4 de arquitectura suma-comparaselecciona. El algoritmo Two-pointer Even tambin es introducido en la unidad de rastreador de origen con el objetivo de conseguir la eficiencia en el hardware. Se han integrado varias tecnologas de ltima generacin en el completo sistema transceptor basebanda , con el objetivo de implementar un sistema de comunicacin UWB altamente optimizado. Un diseo de flujo mejorado es propuesto para el complejo sistema de implementacin, el cual puede ser usado para diseos de Cadena de puertas de campo programable general (FPGA). El diseo mencionado no slo reduce dramticamente el tiempo para la verificacin funcional, sino tambin provee un anlisis automtico como los errores del retraso del output para el sistema de hardware implementado. Un ambiente de comunicacin virtual es establecido para la validacin del propuesto sistema de transceptores MB-OFDM. Este mtodo es provisto para facilitar el uso y la conveniencia de analizar el sistema digital de basebanda sin parte frontera analgica bajo diferentes ambientes de comunicacin. Esta tesis doctoral est organizada en seis captulos. En el primer captulo se encuentra una breve introduccin al campo del UWB, tanto relacionado con el proyecto como la motivacin del desarrollo del sistema de MB-OFDM. En el captulo 2, se presenta la informacin general y los requisitos del protocolo de comunicacin inalmbrica MBOFDM UWB. En el captulo 3 se habla de la arquitectura del sistema de transceptor digital MB-OFDM de banda base . El diseo del algoritmo propuesto y la arquitectura para cada elemento del procesamiento est detallado en este captulo. Los retos de diseo del sistema que involucra un compromiso de discusin entre la complejidad de diseo, el consumo de energa, el coste de hardware, el desempeo del sistema, y otros aspectos. En el captulo 4, se ha descrito la co-diseada metodologa de hardware/software. Cada parte del flujo del diseo ser detallado con algunos ejemplos que se ha hecho durante el desarrollo del sistema. Aprovechando esta estrategia de diseo, el procedimiento de comunicacin virtual es llevado a cabo para probar y analizar la arquitectura del transceptor propuesto. Los resultados experimentales de la co-simulacin y el informe sinttico de la implementacin del sistema FPGA son reflejados en el captulo 5. Finalmente, en el captulo 6 se incluye las conclusiones y los futuros proyectos, y tambin los resultados derivados de este proyecto de doctorado. ABSTRACT In recent years, the Wireless Visual Sensor Network (WVSN) has drawn great interest in wireless communication research area. They enable a wealth of new applications such as building security control, image sensing, and target localization. However, nowadays wireless communication protocols (ZigBee, Wi-Fi, and Bluetooth for example) cannot fully satisfy the demands of high data rate, low power consumption, short range, and high robustness requirements. New communication protocol is highly desired for such kind of applications. The Ultra Wideband (UWB) wireless communication protocol, which has increased in importance for high data rate wireless communication field, are emerging as an important topic for WVSN research. UWB has emerged as a technology that offers great promise to satisfy the growing demand for low-cost, high-speed digital wireless indoor and home networks. The large bandwidth available, the potential for high data rate transmission, and the potential for low complexity and low power consumption, along with low implementation cost, all present a unique opportunity for UWB to become a widely adopted radio solution for future Wireless Personal Area Network (WPAN) applications. UWB is defined as any transmission that occupies a bandwidth of more than 20% of its center frequency, or more than 500 MHz. In 2002, the Federal Communications Commission (FCC) has mandated that UWB radio transmission can legally operate in the range from 3.1 to 10.6 GHz at a transmitter power of 41.3 dBm/Hz. Under the FCC guidelines, the use of UWB technology can provide enormous capacity over short communication ranges. Considering Shannons capacity equations, increasing the channel capacity requires linear increasing in bandwidth, whereas similar channel capacity increases would require exponential increases in transmission power. In recent years, several different UWB developments has been widely studied in different area, among which, the MB-OFDM UWB wireless communication protocol is considered to be the leading choice and has recently been adopted in the ISO/IEC standard for WPANs. By combing the OFDM modulation and data transmission using frequency hopping techniques, the MB-OFDM UWB system is able to support various data rates, ranging from 55 to 480 Mbps, over distances up to 10 meters. The MB-OFDM technology is expected to consume very little power and silicon area, as well as provide low-cost solutions that can satisfy consumer market demands. To fulfill these expectations, MB-OFDM UWB research and development have to cope with several challenges, which consist of high-sensitivity synchronization, low- complexity constraints, strict power limitations, scalability, and flexibility. Such challenges require state-of-the-art digital signal processing expertise to develop systems that could fully take advantages of the UWB spectrum and support future indoor wireless applications. This thesis focuses on fully optimization for the MB-OFDM UWB digital baseband transceiver system, aiming at researching and designing a wireless communication subsystem for the Wireless Visual Sensor Networks (WVSNs) application. The inherent high complexity of the FFT/IFFT processor and synchronization system, and high operation frequency for all processing elements, becomes the bottleneck for low power MB-OFDM based UWB digital baseband system hardware design and implementation. The proposed transceiver system targets low power and low complexity under the premise of high performance. Optimizations are made at both algorithm and architecture level for each element of the transceiver system. The low-power hardwareefficient structures are firstly proposed for those core computation modules, i.e., the mixed-radix algorithm based pipelined architecture is proposed for the Fast Fourier Transform (FFT/IFFT) processor, and the cost-speed balanced Viterbi Decoder (VD) module is developed, in the aim of lowering the power consumption and increasing the processing speed. In addition, a low complexity sign-bit correlation based symbol timing synchronization scheme is presented so as to detect and synchronize the OFDM packets robustly and accurately. Moreover, several state-of-the-art technologies are used for developing other processing subsystems and an entire MB-OFDM digital baseband transceiver system is integrated. The target device for the proposed transceiver system is Xilinx Virtex 5 XC5VLX110T FPGA board. In order to validate the proposed transceiver system in the FPGA board, a unified algorithm-architecture-circuit hardware/software co-design environment for complex FPGA system development is presented in this work. The main objective of the proposed strategy is to find an efficient methodology for designing a configurable optimized FPGA system by using as few efforts as possible in system verification procedure, so as to speed up the system development period. The presented co-design methodology has the advantages of easy to use, covering all steps from algorithm proposal to hardware verification, and widely spread for almost all kinds of FPGA developments. Because only the digital baseband transceiver system is developed in this thesis, the validation of transmitting signals through wireless channel in real communication environments still requires the analog front-end and RF components. However, by using the aforementioned hardware/software co-simulation methodology, the transmitter and receiver digital baseband systems get the opportunity to communicate with each other through the channel models, which are proposed from the IEEE 802.15.3a research group, established in MATLAB. Thus, by simply adjust the characteristics of each channel model, e.g. mean excess delay and center frequency, we can estimate the transmission performance of the proposed transceiver system through different communication situations. The main contributions of this thesis are: A novel mixed radix 128-point FFT algorithm by using multipath pipelined architecture is proposed. The complex multipliers for each processing stage are designed by using modified shift-add architectures. The system wordlength and twiddle word-length are compared and selected based on Signal to Quantization Noise Ratio (SQNR) and power analysis. IFFT processor performance is analyzed under different Block Floating Point (BFP) arithmetic situations for overflow control, so as to find out the perfect architecture of IFFT algorithm based on the proposed FFT processor. An innovative low complex timing synchronization and compensation scheme, which consists of Packet Detector (PD) and Timing Offset Estimation (TOE) functions, for MB-OFDM UWB receiver system is employed. By simplifying the cross-correlation and maximum likelihood functions to signbit only, the computational complexity is significantly reduced. A 64 state soft-decision Viterbi Decoder system by using high speed radix-4 Add-Compare-Select architecture is proposed. Two-pointer Even algorithm is also introduced into the Trace Back unit in the aim of hardware-efficiency. Several state-of-the-art technologies are integrated into the complete baseband transceiver system, in the aim of implementing a highly-optimized UWB communication system. An improved design flow is proposed for complex system implementation which can be used for general Field-Programmable Gate Array (FPGA) designs. The design method not only dramatically reduces the time for functional verification, but also provides automatic analysis such as errors and output delays for the implemented hardware systems. A virtual communication environment is established for validating the proposed MB-OFDM transceiver system. This methodology is proved to be easy for usage and convenient for analyzing the digital baseband system without analog frontend under different communication environments. This PhD thesis is organized in six chapters. In the chapter 1 a brief introduction to the UWB field, as well as the related work, is done, along with the motivation of MBOFDM system development. In the chapter 2, the general information and requirement of MB-OFDM UWB wireless communication protocol is presented. In the chapter 3, the architecture of the MB-OFDM digital baseband transceiver system is presented. The design of the proposed algorithm and architecture for each processing element is detailed in this chapter. Design challenges of such system involve trade-off discussions among design complexity, power consumption, hardware cost, system performance, and some other aspects. All these factors are analyzed and discussed. In the chapter 4, the hardware/software co-design methodology is proposed. Each step of this design flow will be detailed by taking some examples that we met during system development. Then, taking advantages of this design strategy, the Virtual Communication procedure is carried out so as to test and analyze the proposed transceiver architecture. Experimental results from the co-simulation and synthesis report of the implemented FPGA system are given in the chapter 5. The chapter 6 includes conclusions and future work, as well as the results derived from this PhD work.
Resumo:
Conventional dual-rail precharge logic suffers from difficult implementations of dual-rail structure for obtaining strict compensation between the counterpart rails. As a light-weight and high-speed dual-rail style, balanced cell-based dual-rail logic (BCDL) uses synchronised compound gates with global precharge signal to provide high resistance against differential power or electromagnetic analyses. BCDL can be realised from generic field programmable gate array (FPGA) design flows with constraints. However, routings still exist as concerns because of the deficient flexibility on routing control, which unfavourably results in bias between complementary nets in security-sensitive parts. In this article, based on a routing repair technique, novel verifications towards routing effect are presented. An 8 bit simplified advanced encryption processing (AES)-co-processor is executed that is constructed on block random access memory (RAM)-based BCDL in Xilinx Virtex-5 FPGAs. Since imbalanced routing are major defects in BCDL, the authors can rule out other influences and fairly quantify the security variants. A series of asymptotic correlation electromagnetic (EM) analyses are launched towards a group of circuits with consecutive routing schemes to be able to verify routing impact on side channel analyses. After repairing the non-identical routings, Mutual information analyses are executed to further validate the concrete security increase obtained from identical routing pairs in BCDL.
Resumo:
La obtencin de energa a partir de la fusin nuclear por confinamiento magntico del plasma, es uno de los principales objetivos dentro de la comunidad cientfica dedicada a la energa nuclear. Desde la construccin del primer dispositivo de fusin, hasta la actualidad, se han llevado a cabo multitud de experimentos, que hoy en da, gran parte de ellos dan soporte al proyecto International Thermonuclear Experimental Reactor (ITER). El principal problema al que se enfrenta ITER, se basa en la monitorizacin y el control del plasma. Gracias a las nuevas tecnologas, los sistemas de instrumentacin y control permiten acercarse ms a la solucin del problema, pero a su vez, es ms complicado estandarizar los sistemas de adquisicin de datos que se usan, no solo en ITER, sino en otros proyectos de igual complejidad. Desarrollar nuevas implementaciones hardware y software bajo los requisitos de los diagnsticos definidos por los cientficos, supone una gran inversin de tiempo, retrasando la ejecucin de nuevos experimentos. Por ello, la solucin que plantea esta tesis, consiste en la definicin de una metodologa de diseo que permite implementar sistemas de adquisicin de datos inteligentes y su fcil integracin en entornos de fusin para la implementacin de diagnsticos. Esta metodologa requiere del uso de los dispositivos Reconfigurable Input/Output (RIO) y Flexible RIO (FlexRIO), que son sistemas embebidos basados en tecnologa Field-Programmable Gate Array (FPGA). Para completar la metodologa de diseo, estos dispositivos van a ser soportados por un software basado en EPICS Device Support utilizando la tecnologa EPICS software asynDriver. Esta metodologa se ha evaluado implementando prototipos para los controladores rpidos de planta de ITER, tanto para casos prcticos de mbito general como adquisicin de datos e imgenes, como para casos concretos como el diagnstico del fission chamber, implementando pre-procesado en tiempo real. Adems de casos prcticos, esta metodologa se ha utilizado para implementar casos reales, como el Ion Source Hydrogen Positive (ISHP), desarrollada por el European Spallation Source (ESS Bilbao) y la Universidad del Pas Vasco. Finalmente, atendiendo a las necesidades que los experimentos en los entornos de fusin requieren, se ha diseado un mecanismo mediante el cual los sistemas de adquisicin de datos, que pueden ser implementados mediante la metodologa de diseo propuesta, pueden integrar un reloj hardware capaz de sincronizarse con el protocolo IEEE1588-V2, permitiendo a estos, obtener los TimeStamps de las muestras adquiridas con una exactitud y precisin de decenas de nanosegundos y realizar streaming de datos con TimeStamps. ABSTRACT Fusion energy reaching by means of nuclear fusion plasma confinement is one of the main goals inside nuclear energy scientific community. Since the first fusion device was built, many experiments have been carried out and now, most of them give support to the International Thermonuclear Experimental Reactor (ITER) project. The main difficulty that ITER has to overcome is the plasma monitoring and control. Due to new technologies, the instrumentation and control systems allow an approaching to the solution, but in turn, the standardization of the used data acquisition systems, not only in ITER but also in other similar projects, is more complex. To develop new hardware and software implementations under scientific diagnostics requirements, entail time costs, delaying new experiments execution. Thus, this thesis presents a solution that consists in a design methodology definition, that permits the implementation of intelligent data acquisition systems and their easy integration into fusion environments for diagnostic purposes. This methodology requires the use of Reconfigurable Input/Output (RIO) and Flexible RIO (FlexRIO) devices, based on Field-Programmable Gate Array (FPGA) embedded technology. In order to complete the design methodology, these devices are going to be supported by an EPICS Device Support software, using asynDriver technology. This methodology has been evaluated implementing ITER PXIe fast controllers prototypes, as well as data and image acquisition, so as for concrete solutions like the fission chamber diagnostic use case, using real time preprocessing. Besides of these prototypes solutions, this methodology has been applied for the implementation of real experiments like the Ion Source Hydrogen Positive (ISHP), developed by the European Spallation Source and the Basque country University. Finally, a hardware mechanism has been designed to integrate a hardware clock into RIO/FlexRIO devices, to get synchronization with the IEEE1588-V2 precision time protocol. This implementation permits to data acquisition systems implemented under the defined methodology, to timestamp all data acquired with nanoseconds accuracy, permitting high throughput timestamped data streaming.
Resumo:
Esta tesis se centra en el estudio y desarrollo de algoritmos de guerra electrnica {electronic warfare, EW) y radar para su implementacin en sistemas de tiempo real. La llegada de los sistemas de radio, radar y navegacin al terreno militar llev al desarrollo de tecnologas para combatirlos. As, el objetivo de los sistemas de guerra electrnica es el control del espectro electomagntico. Una de la funciones de la guerra electrnica es la inteligencia de seales {signals intelligence, SIGINT), cuya labor es detectar, almacenar, analizar, clasificar y localizar la procedencia de todo tipo de seales presentes en el espectro. El subsistema de inteligencia de seales dedicado a las seales radar es la inteligencia electrnica {electronic intelligence, ELINT). Un sistema de tiempo real es aquel cuyo factor de mrito depende tanto del resultado proporcionado como del tiempo en que se da dicho resultado. Los sistemas radar y de guerra electrnica tienen que proporcionar informacin lo ms rpido posible y de forma continua, por lo que pueden encuadrarse dentro de los sistemas de tiempo real. La introduccin de restricciones de tiempo real implica un proceso de realimentacin entre el diseo del algoritmo y su implementacin en plataformas hardware. Las restricciones de tiempo real son dos: latencia y rea de la implementacin. En esta tesis, todos los algoritmos presentados se han implementado en plataformas del tipo field programmable gate array (FPGA), ya que presentan un buen compromiso entre velocidad, coste total, consumo y reconfigurabilidad. La primera parte de la tesis est centrada en el estudio de diferentes subsistemas de un equipo ELINT: deteccin de seales mediante un detector canalizado, extraccin de los parmetros de pulsos radar, clasificacin de modulaciones y localization pasiva. La transformada discreta de Fourier {discrete Fourier transform, DFT) es un detector y estimador de frecuencia quasi-ptimo para seales de banda estrecha en presencia de ruido blanco. El desarrollo de algoritmos eficientes para el clculo de la DFT, conocidos como fast Fourier transform (FFT), han situado a la FFT como el algoritmo ms utilizado para la deteccin de seales de banda estrecha con requisitos de tiempo real. As, se ha diseado e implementado un algoritmo de deteccin y anlisis espectral para su implementacin en tiempo real. Los parmetros ms caractersticos de un pulso radar son su tiempo de llegada y anchura de pulso. Se ha diseado e implementado un algoritmo capaz de extraer dichos parmetros. Este algoritmo se puede utilizar con varios propsitos: realizar un reconocimiento genrico del radar que transmite dicha seal, localizar la posicin de dicho radar o bien puede utilizarse como la parte de preprocesado de un clasificador automtico de modulaciones. La clasificacin automtica de modulaciones es extremadamente complicada en entornos no cooperativos. Un clasificador automtico de modulaciones se divide en dos partes: preprocesado y el algoritmo de clasificacin. Los algoritmos de clasificacin basados en parmetros representativos calculan diferentes estadsticos de la seal de entrada y la clasifican procesando dichos estadsticos. Los algoritmos de localization pueden dividirse en dos tipos: triangulacin y sistemas cuadrticos. En los algoritmos basados en triangulacin, la posicin se estima mediante la interseccin de las rectas proporcionadas por la direccin de llegada de la seal. En cambio, en los sistemas cuadrticos, la posicin se estima mediante la interseccin de superficies con igual diferencia en el tiempo de llegada (time difference of arrival, TDOA) o diferencia en la frecuencia de llegada (frequency difference of arrival, FDOA). Aunque slo se ha implementado la estimacin del TDOA y FDOA mediante la diferencia de tiempos de llegada y diferencia de frecuencias, se presentan estudios exhaustivos sobre los diferentes algoritmos para la estimacin del TDOA, FDOA y localizacin pasiva mediante TDOA-FDOA. La segunda parte de la tesis est dedicada al diseo e implementacin filtros discretos de respuesta finita (finite impulse response, FIR) para dos aplicaciones radar: phased array de banda ancha mediante filtros retardadores (true-time delay, TTD) y la mejora del alcance de un radar sin modificar el hardware existente para que la solucin sea de bajo coste. La operacin de un phased array de banda ancha mediante desfasadores no es factible ya que el retardo temporal no puede aproximarse mediante un desfase. La solucin adoptada e implementada consiste en sustituir los desfasadores por filtros digitales con retardo programable. El mximo alcance de un radar depende de la relacin seal a ruido promedio en el receptor. La relacin seal a ruido depende a su vez de la energa de seal transmitida, potencia multiplicado por la anchura de pulso. Cualquier cambio hardware que se realice conlleva un alto coste. La solucin que se propone es utilizar una tcnica de compresin de pulsos, consistente en introducir una modulacin interna a la seal, desacoplando alcance y resolucin. ABSTRACT This thesis is focused on the study and development of electronic warfare (EW) and radar algorithms for real-time implementation. The arrival of radar, radio and navigation systems to the military sphere led to the development of technologies to fight them. Therefore, the objective of EW systems is the control of the electromagnetic spectrum. Signals Intelligence (SIGINT) is one of the EW functions, whose mission is to detect, collect, analyze, classify and locate all kind of electromagnetic emissions. Electronic intelligence (ELINT) is the SIGINT subsystem that is devoted to radar signals. A real-time system is the one whose correctness depends not only on the provided result but also on the time in which this result is obtained. Radar and EW systems must provide information as fast as possible on a continuous basis and they can be defined as real-time systems. The introduction of real-time constraints implies a feedback process between the design of the algorithms and their hardware implementation. Moreover, a real-time constraint consists of two parameters: Latency and area of the implementation. All the algorithms in this thesis have been implemented on field programmable gate array (FPGAs) platforms, presenting a trade-off among performance, cost, power consumption and reconfigurability. The first part of the thesis is related to the study of different key subsystems of an ELINT equipment: Signal detection with channelized receivers, pulse parameter extraction, modulation classification for radar signals and passive location algorithms. The discrete Fourier transform (DFT) is a nearly optimal detector and frequency estimator for narrow-band signals buried in white noise. The introduction of fast algorithms to calculate the DFT, known as FFT, reduces the complexity and the processing time of the DFT computation. These properties have placed the FFT as one the most conventional methods for narrow-band signal detection for real-time applications. An algorithm for real-time spectral analysis for user-defined bandwidth, instantaneous dynamic range and resolution is presented. The most characteristic parameters of a pulsed signal are its time of arrival (TOA) and the pulse width (PW). The estimation of these basic parameters is a fundamental task in an ELINT equipment. A basic pulse parameter extractor (PPE) that is able to estimate all these parameters is designed and implemented. The PPE may be useful to perform a generic radar recognition process, perform an emitter location technique and can be used as the preprocessing part of an automatic modulation classifier (AMC). Modulation classification is a difficult task in a non-cooperative environment. An AMC consists of two parts: Signal preprocessing and the classification algorithm itself. Featurebased algorithms obtain different characteristics or features of the input signals. Once these features are extracted, the classification is carried out by processing these features. A feature based-AMC for pulsed radar signals with real-time requirements is studied, designed and implemented. Emitter passive location techniques can be divided into two classes: Triangulation systems, in which the emitter location is estimated with the intersection of the different lines of bearing created from the estimated directions of arrival, and quadratic position-fixing systems, in which the position is estimated through the intersection of iso-time difference of arrival (TDOA) or iso-frequency difference of arrival (FDOA) quadratic surfaces. Although TDOA and FDOA are only implemented with time of arrival and frequency differences, different algorithms for TDOA, FDOA and position estimation are studied and analyzed. The second part is dedicated to FIR filter design and implementation for two different radar applications: Wideband phased arrays with true-time delay (TTD) filters and the range improvement of an operative radar with no hardware changes to minimize costs. Wideband operation of phased arrays is unfeasible because time delays cannot be approximated by phase shifts. The presented solution is based on the substitution of the phase shifters by FIR discrete delay filters. The maximum range of a radar depends on the averaged signal to noise ratio (SNR) at the receiver. Among other factors, the SNR depends on the transmitted signal energy that is power times pulse width. Any possible hardware change implies high costs. The proposed solution lies in the use of a signal processing technique known as pulse compression, which consists of introducing an internal modulation within the pulse width, decoupling range and resolution.
Resumo:
Esta tesis recoje un trabajo experimental centrado en profundizar sobre el conocimiento de los bloques detectores monolticos como alternativa a los detectores segmentados para tomografa por emisin de positrones (Positron Emission Tomography, PET). El trabajo llevado a cabo incluye el desarrollo, la caracterizacin, la puesta a punto y la evaluacin de prototipos demostradores PET utilizando bloques monolticos de ortosilicato de lutecio ytrio dopado con cerio (Cerium-Doped Lutetium Yttrium Orthosilicate, LYSO:Ce) usando sensores compatibles con altos campos magnticos, tanto fotodiodos de avalancha (Avalanche Photodiodes, APDs) como fotomultiplicadores de silicio (Silicon Photomultipliers, SiPMs). Los prototipos implementados con APDs se construyeron para estudiar la viabilidad de un prototipo PET de alta sensibilidad previamente simulado, denominado BrainPET. En esta memoria se describe y caracteriza la electrnica frontal integrada utilizada en estos prototipos junto con la electrnica de lectura desarrollada especficamente para los mismos. Se muestran los montajes experimentales para la obtencin de las imgenes tomogrficas PET y para el entrenamiento de los algoritmos de red neuronal utilizados para la estimacin de las posiciones de incidencia de los fotones sobre la superficie de los bloques monolticos. Con el prototipo BrainPET se obtuvieron resultados satisfactorios de resolucin energtica (13 % FWHM), precisin espacial de los bloques monolticos (~ 2 mm FWHM) y resolucin espacial de la imagen PET de 1,5 - 1,7 mm FWHM. Adems se demostr una capacidad resolutiva en la imagen PET de ~ 2 mm al adquirir simultneamente imgenes de fuentes radiactivas separadas a distancias conocidas. Sin embargo, con este prototipo se detectaron tambin dos limitaciones importantes. En primer lugar, se constat una falta de flexibilidad a la hora de trabajar con un circuito integrado de aplicacin especfica (Application Specific Integrated Circuit, ASIC) cuyo diseo electrnico no era propio sino comercial, unido al elevado coste que requieren las modificaciones del diseo de un ASIC con tales caractersticas. Por otra parte, la caracterizacin final de la electrnica integrada del BrainPET mostr una resolucin temporal con amplio margen de mejora (~ 13 ns FWHM). Tomando en cuenta estas limitaciones obtenidas con los prototipos BrainPET, junto con la evolucin tecnolgica hacia matrices de SiPM, el conocimiento adquirido con los bloques monolticos se traslad a la nueva tecnologa de sensores disponible, los SiPMs. A su vez se inici una nueva estrategia para la electrnica frontal, con el ASIC FlexToT, un ASIC de diseo propio basado en un esquema de medida del tiempo sobre umbral (Time over Threshold, ToT), en donde la duracin del pulso de salida es proporcional a la energa depositada. Una de las caractersticas ms interesantes de este esquema es la posibilidad de manejar directamente seales de pulsos digitales, en lugar de procesar la amplitud de las seales analgicas. Con esta arquitectura electrnica se sustituyen los conversores analgicos digitales (Analog to Digital Converter, ADCs) por conversores de tiempo digitales (Time to Digital Converter, TDCs), pudiendo implementar stos de forma sencilla en matrices de puertas programmable in situ (Field Programmable Gate Array, FPGA), reduciendo con ello el consumo y la complejidad del diseo. Se construy un nuevo prototipo demostrador FlexToT para validar dicho ASIC para bloques monolticos o segmentados. Se ha llevado a cabo el diseo y caracterizacin de la electrnica frontal necesaria para la lectura del ASIC FlexToT, evaluando su linealidad y rango dinmico, el comportamiento frente a ruido as como la no linealidad diferencial obtenida con los TDCs implementados en la FPGA. Adems, la electrnica presentada en este trabajo es capaz de trabajar con altas tasas de actividad y de discriminar diferentes centelleadores para aplicaciones phoswich. El ASIC FlexToT proporciona una excelente resolucin temporal en coincidencia para los eventos correspondientes con el fotopico de 511 keV (128 ps FWHM), solventando las limitaciones de resolucin temporal del prototipo BrainPET. Por otra parte, la resolucin energtica con bloques monolticos leidos por ASICs FlexToT proporciona una resolucin energtica de 15,4 % FWHM a 511 keV. Finalmente, se obtuvieron buenos resultados en la calidad de la imagen PET y en la capacidad resolutiva del demostrador FlexToT, proporcionando resoluciones espaciales en el centro del FoV en torno a 1,4 mm FWHM. ABSTRACT This thesis is focused on the development of experimental activities used to deepen the knowledge of monolithic detector blocks as an alternative to segmented detectors for Positron Emission Tomography (PET). It includes the development, characterization, setting up, running and evaluation of PET demonstrator prototypes with monolithic detector blocks of Cerium-doped Lutetium Yttrium Orthosilicate (LYSO:Ce) using magnetically compatible sensors such as Avalanche Photodiodes (APDs) and Silicon Photomultipliers (SiPMs). The prototypes implemented with APDs were constructed to validate the viability of a high-sensitivity PET prototype that had previously been simulated, denominated BrainPET. This work describes and characterizes the integrated front-end electronics used in these prototypes, as well as the electronic readout system developed especially for them. It shows the experimental set-ups to obtain the tomographic PET images and to train neural networks algorithms used for position estimation of photons impinging on the surface of monolithic blocks. Using the BrainPET prototype, satisfactory energy resolution (13 % FWHM), spatial precision of monolithic blocks (~ 2 mm FWHM) and spatial resolution of the PET image (1.5 1.7 mm FWHM) in the center of the Field of View (FoV) were obtained. Moreover, we proved the imaging capabilities of this demonstrator with extended sources, considering the acquisition of two simultaneous sources of 1 mm diameter placed at known distances. However, some important limitations were also detected with the BrainPET prototype. In the first place, it was confirmed that there was a lack of flexibility working with an Application Specific Integrated Circuit (ASIC) whose electronic design was not own but commercial, along with the high cost required to modify an ASIC design with such features. Furthermore, the final characterization of the BrainPET ASIC showed a timing resolution with room for improvement (~ 13 ns FWHM). Taking into consideration the limitations obtained with the BrainPET prototype, along with the technological evolution in magnetically compatible devices, the knowledge acquired with the monolithic blocks were transferred to the new technology available, the SiPMs. Moreover, we opted for a new strategy in the front-end electronics, the FlexToT ASIC, an own design ASIC based on a Time over Threshold (ToT) scheme. One of the most interesting features underlying a ToT architecture is the encoding of the analog input signal amplitude information into the duration of the output signals, delivering directly digital pulses. The electronic architecture helps substitute the Analog to Digital Converters (ADCs) for Time to Digital Converters (TDCs), and they are easily implemented in Field Programmable Gate Arrays (FPGA), reducing the consumption and the complexity of the design. A new prototype demonstrator based on SiPMs was implemented to validate the FlexToT ASIC for monolithic or segmented blocks. The design and characterization of the necessary front-end electronic to read-out the signals from the ASIC was carried out by evaluating its linearity and dynamic range, its performance with an external noise signal, as well as the differential nonlinearity obtained with the TDCs implemented in the FPGA. Furthermore, the electronic presented in this work is capable of working at high count rates and discriminates different phoswich scintillators. The FlexToT ASIC provides an excellent coincidence time resolution for events that correspond to 511 keV photopeak (128 ps FWHM), resolving the limitations of the poor timing resolution of the BrainPET prototype. Furthermore, the energy resolution with monolithic blocks read by FlexToT ASICs provides an energy resolution of 15.4 % FWHM at 511 keV. Finally, good results were obtained in the quality of the PET image and the resolving power of the FlexToT demonstrator, providing spatial resolutions in the centre of the FoV at about 1.4 mm FWHM.
Resumo:
O objetivo deste trabalho o projeto, construo e teste de um sistema que permita a medida de viscosidade de lquidos utilizando o mtodo do fio vibrante em regime livre. A principal contribuio original deste trabalho o modo de controlo do estmulo e aquisio de sinal no contexto da medio de viscosidade, utilizando uma interface grfica (GUI) onde o utilizador pode controlar o estmulo e a taxa de aquisio do sensor. A Field Programmable Gate Array (FPGA) utilizada para controlar e sincronizar todo o sistema e fazer a ligao entre a interface grfica e o hardware desenvolvido para condicionar o sinal de estmulo do fio vibrante e de resposta do fio vibrante. A amplitude eficaz mxima de corrente a estimular o fio vibrante permite utilizar fios vibrantes com um dimetro maior e efetuar medies em lquidos mais viscosos. Utilizando o prottipo desenvolvido para adquirir a resposta do fio vibrante fez-se o ajuste dos pontos experimentais equao que descreve o comportamento terico da resposta do fio vibrante, obtendo-se valores de frequncia e amortecimento da resposta do fio vibrante com um desvio padro de e respetivamente, permitindo assim calcular a viscosidade do lquido em estudo.
Resumo:
The purpose of this investigation was to develop new techniques to generate segmental assessments of body composition based on Segmental Bioelectrical Impedance Analysis (SBIA). An equally important consideration was the design, simulation, development, and the software and hardware integration of the SBIA system. This integration was carried out with a Very Large Scale Integration (VLSI) Field Programmable Gate Array (FPGA) microcontroller that analyzed the measurements obtained from segments of the body, and provided full body and segmental Fat Free Mass (FFM) and Fat Mass (FM) percentages. Also, the issues related to the estimate of the body's composition in persons with spinal cord injury (SCI) were addressed and investigated. This investigation demonstrated that the SBIA methodology provided accurate segmental body composition measurements. Disabled individuals are expected to benefit from these SBIA evaluations, as they are non-invasive methods, suitable for paralyzed individuals. The SBIA VLSI system may replace bulky, non flexible electronic modules attached to human bodies. ^
Resumo:
It has been well documented that traffic accidents that can be avoided occur when the motorists miss or ignore traffic signs. With the attention of drivers getting diverted due to distractions like cell phone conversations, missing traffic signs has become more prevalent. Also, poor weather and other unfriendly driving conditions sometimes makes the motorists not to be alert all the time and see every traffic sign on the road. Besides, most cars do not have any form of traffic assistance. Because of heavy traffic and proliferation of traffic signs on the roads, there is a need for a system that assists the driver not to miss a traffic sign to reduce the probability of an accident. Since visual information is critical for driving, processed video signals from cameras have been chosen to assist drivers. These inexpensive cameras can be easily mounted on the automobile. The objective of the present investigation and the traffic system development is to recognize the traffic signs electronically and alert drivers. For the case study and the system development, five important and critical traffic signs have been selected. They are: STOP, NO ENTER, NO RIGHT TURN, NO LEFT TURN, and YIELD. The system was evaluated processing still pictures taken from the public roads, and the recognition results were presented in an analysis table to indicate the correct identifications and the false ones. The system reached the acceptable recognition rate of 80% for all five traffic signs. The processing rate was about three seconds. The capabilities of MATLAB, VLSI design platforms and coding have been used to generate a visual warning to complement the visual driver support system with a Field Programmable Gate Array (FPGA) on a XUP Virtex-II Pro Development System.
Resumo:
Today, modern System-on-a-Chip (SoC) systems have grown rapidly due to the increased processing power, while maintaining the size of the hardware circuit. The number of transistors on a chip continues to increase, but current SoC designs may not be able to exploit the potential performance, especially with energy consumption and chip area becoming two major concerns. Traditional SoC designs usually separate software and hardware. Thus, the process of improving the system performance is a complicated task for both software and hardware designers. The aim of this research is to develop hardware acceleration workflow for software applications. Thus, system performance can be improved with constraints of energy consumption and on-chip resource costs. The characteristics of software applications can be identified by using profiling tools. Hardware acceleration can have significant performance improvement for highly mathematical calculations or repeated functions. The performance of SoC systems can then be improved, if the hardware acceleration method is used to accelerate the element that incurs performance overheads. The concepts mentioned in this study can be easily applied to a variety of sophisticated software applications. The contributions of SoC-based hardware acceleration in the hardware-software co-design platform include the following: (1) Software profiling methods are applied to H.264 Coder-Decoder (CODEC) core. The hotspot function of aimed application is identified by using critical attributes such as cycles per loop, loop rounds, etc. (2) Hardware acceleration method based on Field-Programmable Gate Array (FPGA) is used to resolve system bottlenecks and improve system performance. The identified hotspot function is then converted to a hardware accelerator and mapped onto the hardware platform. Two types of hardware acceleration methods central bus design and co-processor design, are implemented for comparison in the proposed architecture. (3) System specifications, such as performance, energy consumption, and resource costs, are measured and analyzed. The trade-off of these three factors is compared and balanced. Different hardware accelerators are implemented and evaluated based on system requirements. 4) The system verification platform is designed based on Integrated Circuit (IC) workflow. Hardware optimization techniques are used for higher performance and less resource costs. Experimental results show that the proposed hardware acceleration workflow for software applications is an efficient technique. The system can reach 2.8X performance improvements and save 31.84% energy consumption by applying the Bus-IP design. The Co-processor design can have 7.9X performance and save 75.85% energy consumption.
Resumo:
Hardware/software (HW/SW) cosimulation integrates software simulation and hardware simulation simultaneously. Usually, HW/SW co-simulation platform is used to ease debugging and verification for very large-scale integration (VLSI) design. To accelerate the computation of the gesture recognition technique, an HW/SW implementation using field programmable gate array (FPGA) technology is presented in this paper. The major contributions of this work are: (1) a novel design of memory controller in the Verilog Hardware Description Language (Verilog HDL) to reduce memory consumption and load on the processor. (2) The testing part of the neural network algorithm is being hardwired to improve the speed and performance. The American Sign Language gesture recognition is chosen to verify the performance of the approach. Several experiments were carried out on four databases of the gestures (alphabet signs A to Z). (3) The major benefit of this design is that it takes only few milliseconds to recognize the hand gesture which makes it computationally more efficient.
Resumo:
Computational Intelligence Methods have been expanding to industrial applications motivated by their ability to solve problems in engineering. Therefore, the embedded systems follow the same idea of using computational intelligence tools embedded on machines. There are several works in the area of embedded systems and intelligent systems. However, there are a few papers that have joined both areas. The aim of this study was to implement an adaptive fuzzy neural hardware with online training embedded on Field Programmable Gate Array FPGA. The system adaptation can occur during the execution of a given application, aiming online performance improvement. The proposed system architecture is modular, allowing different configurations of fuzzy neural network topologies with online training. The proposed system was applied to: mathematical function interpolation, pattern classification and selfcompensation of industrial sensors. The proposed system achieves satisfactory performance in both tasks. The experiments results shows the advantages and disadvantages of online training in hardware when performed in parallel and sequentially ways. The sequentially training method provides economy in FPGA area, however, increases the complexity of architecture actions. The parallel training method achieves high performance and reduced processing time, the pipeline technique is used to increase the proposed architecture performance. The study development was based on available tools for FPGA circuits.
Resumo:
The Artificial Neural Networks (ANN), which is one of the branches of Artificial Intelligence (AI), are being employed as a solution to many complex problems existing in several areas. To solve these problems, it is essential that its implementation is done in hardware. Among the strategies to be adopted and met during the design phase and implementation of RNAs in hardware, connections between neurons are the ones that need more attention. Recently, are RNAs implemented both in application specific integrated circuits's (Application Specific Integrated Circuits - ASIC) and in integrated circuits configured by the user, like the Field Programmable Gate Array (FPGA), which have the ability to be partially rewritten, at runtime, forming thus a system Partially Reconfigurable (SPR), the use of which provides several advantages, such as flexibility in implementation and cost reduction. It has been noted a considerable increase in the use of FPGAs for implementing ANNs. Given the above, it is proposed to implement an array of reconfigurable neurons for topologies Description of artificial neural network multilayer perceptrons (MLPs) in FPGA, in order to encourage feedback and reuse of neural processors (perceptrons) used in the same area of the circuit. It is further proposed, a communication network capable of performing the reuse of artificial neurons. The architecture of the proposed system will configure various topologies MLPs networks through partial reconfiguration of the FPGA. To allow this flexibility RNAs settings, a set of digital components (datapath), and a controller were developed to execute instructions that define each topology for MLP neural network.
Resumo:
The Artificial Neural Networks (ANN), which is one of the branches of Artificial Intelligence (AI), are being employed as a solution to many complex problems existing in several areas. To solve these problems, it is essential that its implementation is done in hardware. Among the strategies to be adopted and met during the design phase and implementation of RNAs in hardware, connections between neurons are the ones that need more attention. Recently, are RNAs implemented both in application specific integrated circuits's (Application Specific Integrated Circuits - ASIC) and in integrated circuits configured by the user, like the Field Programmable Gate Array (FPGA), which have the ability to be partially rewritten, at runtime, forming thus a system Partially Reconfigurable (SPR), the use of which provides several advantages, such as flexibility in implementation and cost reduction. It has been noted a considerable increase in the use of FPGAs for implementing ANNs. Given the above, it is proposed to implement an array of reconfigurable neurons for topologies Description of artificial neural network multilayer perceptrons (MLPs) in FPGA, in order to encourage feedback and reuse of neural processors (perceptrons) used in the same area of the circuit. It is further proposed, a communication network capable of performing the reuse of artificial neurons. The architecture of the proposed system will configure various topologies MLPs networks through partial reconfiguration of the FPGA. To allow this flexibility RNAs settings, a set of digital components (datapath), and a controller were developed to execute instructions that define each topology for MLP neural network.
Resumo:
The philosophy of minimalism in robotics promotes gaining an understanding of sensing and computational requirements for solving a task. This minimalist approach lies in contrast to the common practice of first taking an existing sensory motor system, and only afterwards determining how to apply the robotic system to the task. While it may seem convenient to simply apply existing hardware systems to the task at hand, this design philosophy often proves to be wasteful in terms of energy consumption and cost, along with unnecessary complexity and decreased reliability. While impressive in terms of their versatility, complex robots such as the PR2 (which cost hundreds of thousands of dollars) are impractical for many common applications. Instead, if a specific task is required, sensing and computational requirements can be determined specific to that task, and a clever hardware implementation can be built to accomplish the task. Since this minimalist hardware would be designed around accomplishing the specified task, significant reductions in hardware complexity can be obtained. This can lead to huge advantages in battery life, cost, and reliability. Even if cost is of no concern, battery life is often a limiting factor in many applications. Thus, a minimalist hardware system is critical in achieving the system requirements. In this thesis, we will discuss an implementation of a counting, tracking, and actuation system as it relates to ergodic bodies to illustrate a minimalist design methodology.
Resumo:
Today, modern System-on-a-Chip (SoC) systems have grown rapidly due to the increased processing power, while maintaining the size of the hardware circuit. The number of transistors on a chip continues to increase, but current SoC designs may not be able to exploit the potential performance, especially with energy consumption and chip area becoming two major concerns. Traditional SoC designs usually separate software and hardware. Thus, the process of improving the system performance is a complicated task for both software and hardware designers. The aim of this research is to develop hardware acceleration workflow for software applications. Thus, system performance can be improved with constraints of energy consumption and on-chip resource costs. The characteristics of software applications can be identified by using profiling tools. Hardware acceleration can have significant performance improvement for highly mathematical calculations or repeated functions. The performance of SoC systems can then be improved, if the hardware acceleration method is used to accelerate the element that incurs performance overheads. The concepts mentioned in this study can be easily applied to a variety of sophisticated software applications. The contributions of SoC-based hardware acceleration in the hardware-software co-design platform include the following: (1) Software profiling methods are applied to H.264 Coder-Decoder (CODEC) core. The hotspot function of aimed application is identified by using critical attributes such as cycles per loop, loop rounds, etc. (2) Hardware acceleration method based on Field-Programmable Gate Array (FPGA) is used to resolve system bottlenecks and improve system performance. The identified hotspot function is then converted to a hardware accelerator and mapped onto the hardware platform. Two types of hardware acceleration methods central bus design and co-processor design, are implemented for comparison in the proposed architecture. (3) System specifications, such as performance, energy consumption, and resource costs, are measured and analyzed. The trade-off of these three factors is compared and balanced. Different hardware accelerators are implemented and evaluated based on system requirements. 4) The system verification platform is designed based on Integrated Circuit (IC) workflow. Hardware optimization techniques are used for higher performance and less resource costs. Experimental results show that the proposed hardware acceleration workflow for software applications is an efficient technique. The system can reach 2.8X performance improvements and save 31.84% energy consumption by applying the Bus-IP design. The Co-processor design can have 7.9X performance and save 75.85% energy consumption.