979 results for minimalist hardware architecture


Relevance: 100.00%

Abstract:

The philosophy of minimalism in robotics promotes gaining an understanding of sensing and computational requirements for solving a task. This minimalist approach lies in contrast to the common practice of first taking an existing sensory motor system, and only afterwards determining how to apply the robotic system to the task. While it may seem convenient to simply apply existing hardware systems to the task at hand, this design philosophy often proves to be wasteful in terms of energy consumption and cost, along with unnecessary complexity and decreased reliability. While impressive in terms of their versatility, complex robots such as the PR2 (which cost hundreds of thousands of dollars) are impractical for many common applications. Instead, if a specific task is required, sensing and computational requirements can be determined specific to that task, and a clever hardware implementation can be built to accomplish the task. Since this minimalist hardware would be designed around accomplishing the specified task, significant reductions in hardware complexity can be obtained. This can lead to huge advantages in battery life, cost, and reliability. Even if cost is of no concern, battery life is often a limiting factor in many applications. Thus, a minimalist hardware system is critical in achieving the system requirements. In this thesis, we will discuss an implementation of a counting, tracking, and actuation system as it relates to ergodic bodies to illustrate a minimalist design methodology.

Relevance: 100.00%

Abstract:

Video decoders used in emerging applications need to be flexible enough to handle a large variety of video formats and to deliver scalable performance across wide variations in workload. In this paper we propose a unified software and hardware architecture for video decoding that achieves scalable performance with flexibility. The lightweight processor tiles and the reconfigurable hardware tiles in our architecture enable software and hardware implementations to co-exist, while a programmable interconnect enables dynamic interconnection of the tiles. Our process-network-oriented compilation flow achieves realization-agnostic application partitioning and enables seamless migration across uniprocessor, multi-processor, semi-hardware and full-hardware implementations of a video decoder. An application quality-of-service-aware scheduler monitors and controls the operation of the entire system. We prove the concept through a prototype of the architecture on an off-the-shelf FPGA. The FPGA prototype shows performance scaling from QCIF to 1080p resolutions in four discrete steps. We also demonstrate that the reconfiguration time is short enough to allow migration from one configuration to another without any frame loss.

Relevance: 100.00%

Abstract:

This paper proposes a parallel hardware architecture for image feature detection based on the Scale Invariant Feature Transform algorithm and applied to the Simultaneous Localization And Mapping problem. The work also proposes specific hardware optimizations considered fundamental to embedding such a robotic control system on a chip. The proposed architecture is completely stand-alone; it reads the input data directly from a CMOS image sensor and provides the results via a field-programmable gate array coupled to an embedded processor. The results may either be used directly in an on-chip application or accessed through an Ethernet connection. The system is able to detect features at up to 30 frames per second (320 x 240 pixels) and has accuracy similar to a PC-based implementation. The achieved system performance is at least one order of magnitude better than a PC-based solution, a result achieved by investigating the impact of several hardware-oriented optimizations on performance, area and accuracy.

Relevance: 90.00%

Abstract:

DNA samples are found as fragments, obtained from traces at a crime scene or collected from hair or blood samples, for genetic or paternity testing. To determine whether such a fragment belongs to a DNA sequence, it must be compared against a given sequence, which may be stored in a database in order, for example, to identify a suspect. This requires an efficient tool for aligning the DNA sequence that was found with the one stored in the database. DNA sequence alignment, or DNA matching, is the field of bioinformatics that seeks to understand the relationship between genetic sequences and their functional and parental relationships. This task is often performed by software that scans clusters of databases, which demands high computational power and raises the cost of a DNA sequence alignment project. This dissertation presents a parallel hardware architecture for the BLAST algorithm that allows a pair of DNA sequences to be aligned. BLAST is a heuristic method and is currently the fastest. Its strategy is to split the original sequences into smaller subsequences of size w. After comparing these small subsequences, the BLAST stages analyze only the subsequences that are identical. In this way, the algorithm reduces the number of tests and combinations needed to perform the alignment. For each identical sequence, the algorithm carries out three stages: seeding, extension, and evaluation. The proposed solution draws on the characteristics of the algorithm to implement fully parallel hardware, pipelined across the basic BLAST stages. The proposed hardware architecture was implemented on an FPGA, and the results compare occupied area, number of cycles, and maximum permitted operating frequency as functions of the alignment parameters.
The result is a scalable, efficient, low-cost hardware architecture in reconfigurable logic, capable of aligning pairs of sequences using the BLAST algorithm.
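
The three stages named in the abstract (seeding, extension, evaluation) can be illustrated in software with a minimal sketch. This is not the dissertation's hardware design: the function names are hypothetical, the extension is simplified to grow only while bases match exactly (real BLAST scores mismatches and stops on a score drop-off), and the evaluation simply keeps the longest match.

```python
from collections import defaultdict


def seed(query, subject, w):
    """Seeding: find (query_pos, subject_pos) pairs whose length-w words are identical."""
    index = defaultdict(list)
    for i in range(len(query) - w + 1):
        index[query[i:i + w]].append(i)
    hits = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], ()):
            hits.append((i, j))
    return hits


def extend(query, subject, i, j, w):
    """Extension (simplified assumption): grow the seed left and right while bases match."""
    left = 0
    while (i - left - 1 >= 0 and j - left - 1 >= 0
           and query[i - left - 1] == subject[j - left - 1]):
        left += 1
    right = 0
    while (i + w + right < len(query) and j + w + right < len(subject)
           and query[i + w + right] == subject[j + w + right]):
        right += 1
    return i - left, j - left, w + left + right


def best_alignment(query, subject, w=3):
    """Evaluation (simplified assumption): keep the longest extended match, or None."""
    best = None
    for i, j in seed(query, subject, w):
        candidate = extend(query, subject, i, j, w)
        if best is None or candidate[2] > best[2]:
            best = candidate
    return best


if __name__ == "__main__":
    # Returns (query_start, subject_start, length) of the longest exact match found.
    print(best_alignment("ACGTACGT", "TTACGTAA", w=3))
```

Each seed is extended independently of the others, which is the property a parallel, pipelined implementation can exploit.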

Relevance: 90.00%

Abstract:

Since the dawn of humanity, discovering how the brain processes sound, and consequently music, has been part of the human imagination. Research on this process therefore constitutes one of the broadest fields of study in the sciences. Among the many attempts to understand the biological processing of sound, humans invented automatic musical composition, in order to assess whether quality compositions can be produced without emotional input, that is, using only existing musical definitions and structures. This automatic composition procedure, also called aleatoric music or chance music, has been widely explored over the centuries and was used by some of the great names in music, such as Mozart. Advances in engineering and computing have allowed the methods used to compose aleatoric music to evolve, making cellular automata a viable way to determine the sequence of musical notes and other items used when composing this kind of music. This dissertation proposes an architecture for generating harmonized music from melodic intervals determined by cellular automata, implemented in FPGA-type reconfigurable hardware. The proposed architecture has four types of cellular automata, developed using the models of the one-dimensional Wolfram neighborhood, the two-dimensional von Neumann neighborhood, the two-dimensional Moore neighborhood, and the three-dimensional von Neumann neighborhood, which can be combined in 16 different ways to generate melodies.
The results of the processing performed by the proposed architecture are melodies in .mid format, composed using two cellular automata, one to choose the notes and the other to choose the instruments to be emulated, in accordance with the MIDI protocol. To this end, the architecture comprises three main units: the frequency divider unit, which synchronizes the tasks executed by the architecture; the cellular automata set unit, which controls and enables the cellular automata; and the MIDI machine unit, which organizes the results of each current iteration of the cellular automata and converts them according to the structure of the MIDI protocol, thereby generating the musical product. The proposed architecture is parameterizable, so that the configuration of the data that influence the generated musical product, such as the definition of the rule sets for the enabled cellular automata, is left to the user, and there are therefore no limits on the combinations that can be made in the architecture. To validate the functionality and applicability of the proposed architecture, some of the results obtained are presented and detailed using music information retrieval techniques.
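
As a rough software illustration of melodic intervals driven by a cellular automaton, the sketch below steps a one-dimensional Wolfram automaton and turns each generation into a MIDI note number. The mapping from cell states to intervals, the function names, and the starting note are assumptions made for this example; the abstract does not specify the mapping used in the dissertation, and the second automaton (instrument selection) and the .mid file encoding are omitted.

```python
def step(cells, rule):
    """Advance a 1-D binary cellular automaton one generation.

    `rule` is a Wolfram rule number (0..255); the row wraps around at the edges.
    """
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] << 2 | cells[i] << 1 | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def melody(cells, rule, length, start_note=60):
    """Generate `length` MIDI note numbers from successive generations.

    Hypothetical mapping: the number of live cells, centred on half the row
    width, is used as a signed melodic interval in semitones.
    """
    notes = []
    note = start_note
    for _ in range(length):
        cells = step(cells, rule)
        interval = sum(cells) - len(cells) // 2
        note = min(127, max(0, note + interval))  # keep within the MIDI note range
        notes.append(note)
    return notes


if __name__ == "__main__":
    print(melody([0, 0, 0, 1, 0, 0, 0], rule=90, length=8))
```

Changing the rule number or the initial row changes the melody, which mirrors the user-configurable rule sets the abstract describes.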

Relevance: 90.00%

Abstract:

Continuing achievements in hardware technology are bringing ubiquitous computing closer to reality. The notion of a connected, interactive and autonomous environment is common to all sensor networks, biosystems and radio frequency identification (RFID) devices, and the emergence of significant deployments and sophisticated applications can be expected. However, as more information is collected and transmitted, security issues will become vital for such a fully connected environment. In this study the authors consider adding security features to low-cost devices such as RFID tags. In particular, the authors consider the implementation of a digital signature architecture that can be used for device authentication, to prevent tag cloning, and for data authentication, to prevent transmission forgery. The scheme is built around the signature variant of the cryptoGPS identification scheme and the SHA-1 hash function. When implemented on 130 nm CMOS the full design uses 7494 gates and consumes 4.72 µW of power, making it smaller and more power efficient than previous low-cost digital signature designs. The study also presents a low-cost SHA-1 hardware architecture which is the smallest standardised hash function design to date.

Relevance: 90.00%

Abstract:

A novel hardware architecture for elliptic curve cryptography (ECC) over GF(p) is introduced. It can perform the main prime field arithmetic functions needed in these cryptosystems, including modular inversion and multiplication. It is based on a new unified modular inversion algorithm that offers considerable improvement over previous ECC techniques that use Fermat's Little Theorem for this operation. The processor described uses a full-word multiplier which requires far fewer clock cycles than previous methods, while still maintaining a competitive critical path delay. The benefits of the approach have been demonstrated by utilizing these techniques to create a field-programmable gate array (FPGA) design. This can perform a 256-bit prime field scalar point multiplication in 3.86 ms, the fastest FPGA time reported to date. The ECC architecture described can also perform four different types of modular inversion, making it suitable for use in many different ECC applications. © 2006 IEEE.
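
For context on the operation being optimized, the sketch below contrasts inversion via Fermat's Little Theorem (one modular exponentiation, a^(p-2) mod p) with the standard extended Euclidean algorithm. Neither function is the paper's unified inversion algorithm, which the abstract does not describe; the function names are hypothetical, and the NIST P-256 field prime is used only as a convenient 256-bit example.

```python
# NIST P-256 field prime, chosen here only as an example 256-bit prime.
P256 = 2**256 - 2**224 + 2**192 + 2**96 - 1


def inverse_fermat(a, p):
    """Inverse of a modulo prime p via Fermat's Little Theorem: a^(p-2) mod p."""
    return pow(a, p - 2, p)


def inverse_euclid(a, p):
    """Inverse of a modulo p via the extended Euclidean algorithm."""
    r0, r1 = p, a % p
    t0, t1 = 0, 1
    while r1:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    if r0 != 1:
        raise ValueError("a has no inverse modulo p")
    return t0 % p


if __name__ == "__main__":
    a = 123456789
    inv = inverse_euclid(a, P256)
    # Both methods agree, and a * inv is congruent to 1 modulo the prime.
    print(inv == inverse_fermat(a, P256), (a * inv) % P256 == 1)
```

The Fermat route costs a full exponentiation with a 256-bit exponent, which is why dedicated inversion algorithms are attractive in hardware.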

Relevance: 90.00%

Abstract:

As ubiquitous computing becomes a reality, sensitive information is increasingly processed and transmitted by smart cards, mobile devices and various types of embedded systems. This has led to the requirement of a new class of lightweight cryptographic algorithm to ensure security in these resource constrained environments. The International Organization for Standardization (ISO) has recently standardised two low-cost block ciphers for this purpose, Clefia and Present. In this paper we provide the first comprehensive hardware architecture comparison between these ciphers, as well as a comparison with the current National Institute of Standards and Technology (NIST) standard, the Advanced Encryption Standard.

Relevance: 90.00%

Abstract:

Network management tools must be able to monitor and analyze traffic flowing through network systems. According to the OpenFlow protocol applied in Software-Defined Networking (SDN), packets are classified into flows that are searched in flow tables. Further actions, such as packet forwarding, modification, and redirection to a group table, are made in the flow table with respect to the search results. A novel hardware solution for SDN-enabled packet classification is presented in this paper. The proposed scheme is focused on a label-based search method, achieving high flexibility in memory usage. The implemented hardware architecture provides optimal lookup performance by configuring the search algorithm and by performing fast incremental updates as programmed by the software controller.

Relevance: 90.00%

Abstract:

Field programmable gate array (FPGA) technology is a powerful platform for implementing computationally complex digital signal processing (DSP) systems. Applications that are multi-modal, however, are designed for worst-case conditions. In this paper, genetic sequencing techniques are applied to give a more sophisticated decomposition of the algorithmic variations, thus allowing a unified hardware architecture which gives a 10-25% area saving and a 15% power saving for a digital radar receiver.

Relevance: 90.00%

Abstract:

Lattice-based cryptography has gained credence recently as a replacement for current public-key cryptosystems, due to its quantum-resilience, versatility, and relatively low key sizes. To date, encryption based on the learning with errors (LWE) problem has only been investigated from an ideal lattice standpoint, due to its computation and size efficiencies. However, a thorough investigation of standard lattices in practice has yet to be considered. Standard lattices may be preferred to ideal lattices due to their stronger security assumptions and less restrictive parameter selection process. In this paper, an area-optimised hardware architecture of a standard lattice-based cryptographic scheme is proposed. The design is implemented on an FPGA and it is found that both encryption and decryption fit comfortably on a Spartan-6 FPGA. This is the first hardware architecture for standard lattice-based cryptography reported in the literature to date, and thus is a benchmark for future implementations.
Additionally, a revised discrete Gaussian sampler is proposed which is the fastest of its type to date, and also is the first to investigate the cost savings of implementing with λ/2 bits of precision. Performance results are promising in comparison to the hardware designs of the equivalent ring-LWE scheme: in addition to providing a stronger security proof, the proposed design generates 1272 encryptions per second and 4395 decryptions per second.

Relevance: 90.00%

Abstract:

In this paper we investigate various algorithms for performing the Fast Fourier Transform (FFT)/Inverse Fast Fourier Transform (IFFT), and suitable techniques for maximizing FFT/IFFT execution speed, such as pipelining or parallel processing, and the use of memory structures with pre-computed values (look-up tables, LUTs) or other dedicated hardware components (usually multipliers). Furthermore, we discuss the hardware architectures that best suit various FFT/IFFT algorithms, along with their ability to exploit parallel processing with minimal data dependences in the FFT/IFFT calculations. An interesting approach that is also considered in this paper is the application of the integrated processing-in-memory Intelligent RAM (IRAM) chip to high-speed FFT/IFFT computing. The results of the assessment study emphasize that the execution speed of FFT/IFFT algorithms is tightly connected to the capability of the FFT/IFFT hardware to support the parallelism provided by the given algorithm. Therefore, we suggest that the basic Discrete Fourier Transform (DFT)/Inverse Discrete Fourier Transform (IDFT) can also provide high performance, by utilizing a specialized FFT/IFFT hardware architecture that can exploit the parallelism provided by the DFT/IDFT operations. The proposed improvements include simplified multiplications over symbols given in the polar coordinate system, using sine and cosine look-up tables, and an approach for performing parallel addition of N input symbols.
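
One way to read the proposed polar-coordinate simplification is sketched below, under a loudly stated assumption: each input symbol is given as a magnitude and an integer phase index, with phase equal to 2*pi*index/N (as for PSK-style symbols). Multiplying by a DFT twiddle factor then reduces to an integer subtraction of phase indices, and the sine and cosine look-up tables convert the result to Cartesian form for the final summation. The function names are hypothetical and this is not taken from the paper itself.

```python
import math


def make_tables(n):
    """Pre-compute cosine and sine look-up tables for the n-th roots of unity."""
    cos_lut = [math.cos(2 * math.pi * k / n) for k in range(n)]
    sin_lut = [math.sin(2 * math.pi * k / n) for k in range(n)]
    return cos_lut, sin_lut


def dft_polar(symbols):
    """DFT of symbols given as (magnitude, phase_index) pairs.

    Assumption: phase = 2*pi*phase_index/N with N = len(symbols), so the
    twiddle multiplication becomes (phase_index - k*t) mod N.
    """
    n = len(symbols)
    cos_lut, sin_lut = make_tables(n)
    out = []
    for k in range(n):
        re = im = 0.0
        # The N terms below are independent and could be added in parallel.
        for t, (mag, ph) in enumerate(symbols):
            idx = (ph - k * t) % n  # complex multiply replaced by index arithmetic
            re += mag * cos_lut[idx]
            im += mag * sin_lut[idx]
        out.append(complex(re, im))
    return out


if __name__ == "__main__":
    # A single complex tone at bin 1: energy should concentrate in out[1].
    print(dft_polar([(1.0, 0), (1.0, 1), (1.0, 2), (1.0, 3)]))
```

The inner loop has no data dependences between terms, which is the kind of parallelism the abstract argues a dedicated DFT architecture can exploit.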

Relevance: 90.00%

Abstract:

The real-time hardware architecture of a deterministic video echo canceller (deghoster) system is presented. The deghoster is capable of calculating all the multipath channel distortion characteristics from terrestrial and cable television in a single pass while performing real-time in-line video ghost cancellation. The results from the actual system are also presented in this paper.

Relevance: 90.00%

Abstract:

This work proposes a hardware architecture, described in VHDL, for embedding Multilayer Perceptron (MLP) Artificial Neural Networks (ANNs). The architecture is intended to let ANN applications in industrial settings easily embed several different MLP topologies. The MLP topology to which the architecture is configured is defined by a simple, dedicated data input (instructions) that determines the number of layers and of Perceptrons in the network. To support several MLP topologies, datapath components and a controller were developed to execute these instructions. The user thus defines a group of previously known instructions that determine the ANN characteristics, and the system guarantees MLP execution through the neural processors (Perceptrons), the datapath components, and the controller. On the other hand, the biases and weights must be static: the ANN to be embedded must have been trained previously, off-line. The user does not need to know the system's internal characteristics or the VHDL language. A reconfigurable FPGA device was used to implement, simulate, and test the whole system, allowing application to several real everyday problems.
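
The idea of a topology fixed by configuration data, with static pre-trained weights, can be sketched in software as follows. The data layout (a list of per-layer weight rows and biases), the sigmoid activation, and the function names are assumptions for illustration; the abstract does not give the instruction format or the activation function used in the VHDL design.

```python
import math


def sigmoid(x):
    """Logistic activation (assumed here; the abstract does not name one)."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_forward(layers, inputs):
    """Run a forward pass through an MLP with static weights.

    `layers` is a list of (weights, biases) pairs; weights[j] holds the input
    weights of Perceptron j in that layer, so the list itself fixes the topology.
    """
    activations = list(inputs)
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


# Hand-set weights (not trained) for a 2-2-1 network approximating XOR.
XOR_NET = [
    ([[20.0, 20.0], [-20.0, -20.0]], [-10.0, 30.0]),  # hidden layer: OR, NAND
    ([[20.0, 20.0]], [-30.0]),                        # output layer: AND
]

if __name__ == "__main__":
    for x in ((0, 0), (0, 1), (1, 0), (1, 1)):
        print(x, round(mlp_forward(XOR_NET, x)[0]))
```

Swapping in a different `layers` list changes the number of layers and Perceptrons without touching the evaluation code, which parallels the instruction-driven configuration the abstract describes.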