Digital Library

611 results for SCIL processor

Video anomaly detection in real-time on a Power-Aware Heterogeneous Platform

Abstract:

FPGAs and GPUs are often used when real-time performance in video processing is required. An accelerated processor is chosen based on task-specific priorities (power consumption, processing time and detection accuracy), and this decision is normally made once at design time. All three characteristics are important, particularly in battery-powered systems. Here we propose a method for moving selection of processing platform from a single design-time choice to a continuous run-time one. We implement Histogram of Oriented Gradients (HOG) detectors for cars and people and Mixture of Gaussians (MoG) motion detectors running across FPGA, GPU and CPU in a heterogeneous system. We use this to detect illegally parked vehicles in urban scenes. Power, time and accuracy information for each detector is characterised. An anomaly measure is assigned to each detected object based on its trajectory and location, when compared to learned contextual movement patterns. This drives processor and implementation selection, so that scenes with high behavioural anomalies are processed with faster but more power-hungry implementations, but routine or static time periods are processed with power-optimised, less accurate, slower versions. Real-time performance is evaluated on video datasets including i-LIDS. Compared to power-optimised static selection, automatic dynamic implementation mapping is 10% more accurate but draws 12W extra power in our testbed desktop system.
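
The selection step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the mapping rule, so the weighted-cost policy, the `Implementation` record, the `select_implementation` function and all power, time and accuracy figures below are hypothetical placeholders rather than values or methods from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Implementation:
    name: str        # hypothetical label for a detector/processor pairing
    power_w: float   # placeholder average power draw in watts
    time_ms: float   # placeholder processing time per frame
    accuracy: float  # placeholder detection accuracy in [0, 1]


# Placeholder characterisation table; these numbers are NOT from the paper.
CANDIDATES = [
    Implementation("low_power", power_w=20.0, time_ms=120.0, accuracy=0.80),
    Implementation("balanced", power_w=45.0, time_ms=60.0, accuracy=0.86),
    Implementation("fast", power_w=90.0, time_ms=15.0, accuracy=0.92),
]


def select_implementation(anomaly, candidates=CANDIDATES):
    """Pick an implementation for the next frames from an anomaly level in [0, 1].

    Assumed policy: minimise a cost that weights normalised processing time
    by the anomaly level and normalised power by its complement, so quiet
    scenes favour low power and anomalous scenes favour speed.
    """
    a = min(max(anomaly, 0.0), 1.0)  # clamp to [0, 1]
    max_power = max(c.power_w for c in candidates)
    max_time = max(c.time_ms for c in candidates)

    def cost(c):
        return a * c.time_ms / max_time + (1.0 - a) * c.power_w / max_power

    return min(candidates, key=cost)


for level in (0.0, 0.5, 1.0):
    print(level, select_implementation(level).name)
```

With these placeholder figures, an anomaly level of 0 selects the low-power entry and a level of 1 selects the fastest one, which mirrors the trade-off the abstract describes.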

VLSI design and implementation of 2-D Inverse Discrete Wavelet Transform

Abstract:

This paper proposes a JPEG-2000 compliant architecture capable of computing the 2-D Inverse Discrete Wavelet Transform. The proposed architecture uses a single processor and a row-based schedule to minimize control and routing complexity and to ensure that processor utilization is kept at 100%. The design incorporates the handling of borders through the use of symmetric extension. The architecture has been implemented on the Xilinx Virtex2 FPGA.
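
Border handling by symmetric extension can be sketched in a few lines. This sketch assumes the whole-sample variant, in which the signal is mirrored about its first and last samples without repeating them; the abstract does not say which variant the hardware uses, and the function name `symmetric_extend` is illustrative only.

```python
def symmetric_extend(row, pad):
    """Return `row` extended by `pad` samples on each side.

    Assumes whole-sample symmetric extension: the signal is reflected
    about its boundary samples, and the boundary samples themselves
    are not duplicated.
    """
    n = len(row)
    if n == 0:
        raise ValueError("row must not be empty")
    if n == 1:
        return [row[0]] * (1 + 2 * pad)
    period = 2 * (n - 1)  # the mirrored signal repeats with this period
    extended = []
    for i in range(-pad, n + pad):
        k = i % period
        if k >= n:
            k = period - k  # fold the index back into the valid range
        extended.append(row[k])
    return extended


print(symmetric_extend([1, 2, 3, 4], 2))  # prints [3, 2, 1, 2, 3, 4, 3, 2]
```

Extending each row in this way lets a filter run over the border samples without special-case logic, which fits the stated aim of low control complexity.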

A low overhead error confinement method based on application statistical characteristics

Abstract:

Reliability has emerged as a critical design constraint, especially in memories. Designers are going to great lengths to guarantee fault-free operation of the underlying silicon by adopting redundancy-based techniques, which essentially try to detect and correct every single error. However, such techniques come at a cost of large area, power and performance overheads, which leads many researchers to doubt their efficiency, especially for error-resilient systems where 100% accuracy is not always required. In this paper, we present an alternative method focusing on the confinement of the resulting output error induced by any reliability issues. Focusing on memory faults, rather than correcting every single error, the proposed method exploits the statistical characteristics of any target application and replaces any erroneous data with the best available estimate of that data. To realize the proposed method, a RISC processor is augmented with custom instructions and special-purpose functional units. We apply the method on the proposed enhanced processor by studying the statistical characteristics of the various algorithms involved in a popular multimedia application. Our experimental results show that, in contrast to state-of-the-art fault tolerance approaches, we are able to reduce runtime and area overhead by 71.3% and 83.3% respectively.

Fabrication and Characterization of Efficient Optical Filter Arrays Implemented by 3D Nanoimprint

Abstract:

In this work, optical filter arrays for high-quality spectroscopic applications in the visible (VIS) wavelength range are investigated. The optical filters, consisting of Fabry-Pérot (FP) filters for high-resolution miniaturised optical nanospectrometers, are based on two highly reflective dielectric mirrors and an intermediate polymer resonance cavity. Each filter allows a narrow spectral band (called a filter line in this work) to pass, depending on the height of the resonance cavity. The efficiency of such an optical filter depends on the precise fabrication of the highly selective multispectral arrays of FP filters by low-cost, high-throughput methods. The multiple spectral filters covering the entire visible range are fabricated on one substrate in a single imprint step using 3D nanoimprint technology with very high vertical resolution. The key to this process integration is the fabrication of 3D nanoimprint stamps with the desired arrays of filter cavities. The spectral sensitivity of these efficient optical filters depends on the accuracy of the vertically varying cavities, which are produced by a large-area 'soft' nanoimprint technology, UV substrate conformal imprint lithography (UV-SCIL). The main problems of UV-based SCIL processes, such as non-uniform residual layer thickness and polymer shrinkage, limit the potential applications of this technology. It is very important that the residual layer is thin and uniform so that the critical dimensions of the functional 3D pattern can be controlled during the plasma etching that removes the residual layer.
In the case of the nanospectrometer, the cavities of neighbouring FP filters vary vertically, so that the volume of each individual filter changes, which leads to a variation in the residual layer thickness under each filter. The volumetric shrinkage caused by the polymerisation process affects the size and dimensions of the imprinted polymer cavities. The behaviour of the large-area UV-SCIL process is improved by using a volume-equalised design and by optimising the process conditions. The volume-equalised stamp design distributes 64 vertically varying filter cavities into units of 4 cavities that share a common average volume. Using the equalised volumes, uniform residual layer thicknesses (110 nm) are obtained across all filter heights. Polymer shrinkage is analysed quantitatively in the lateral and vertical directions of the FP filters. Shrinkage in the vertical direction has the greatest influence on the spectral response of the filters and is reduced from 12% to 4% by changing the exposure time. FP filters fabricated with the volume-equalised stamp and the optimised imprint process show a high-quality spectral response with a linear dependence between the cavity heights and the spectral positions of the corresponding filter lines.

3D nanoimprint technology for NIR Fabry-Pérot filter arrays

Abstract:

This work addresses the fabrication of miniaturised NIR spectrometers based on Fabry-Pérot (FP) filter arrays. Until now, low-cost patterning of homogeneous and vertically extended cavities for NIR FP filters by nanoimprint technology has not been available, because the quality of the imprint material layers is insufficient and the low mobility of the imprint materials is not enough to fill the vertically extended cavities. This work focuses on reducing the technical effort required to fabricate homogeneous and vertically extended cavities. The cavities are patterned with a large-area substrate conformal UV nanoimprint process (SCIL, Substrate Conformal Imprint Lithography), which is based on a hybrid stamp and combines the advantages of hard and soft stamps. To overcome the limitations mentioned, alternative cavity designs are investigated and a new imprint material is used. Three design solutions for fabricating homogeneous and extended cavities are investigated and compared: (i) applying the imprint material by multiple spin coating to obtain a thicker layer of imprint material before imprinting, (ii) using a hybrid cavity consisting of a patterned layer of imprint material embedded between two silicon oxide layers to extend the thickness of the organic cavity, and (iii) optimising the imprint process by using a new imprint material. The FP filter arrays fabricated with these three approaches show high transmission (best transmission > 90%) and narrow linewidths (full width at half maximum < 5 nm).

Real-time tracking system using C80 DSPs and a binocular robotic head

Abstract:

An active vision system to perform tracking of moving objects in real time is described. The main goal is to obtain a system integrating off-the-shelf components. These components include a stereoscopic robotic head as active perception hardware; an SDB C80 DSP-based board as massive data processor and image acquisition board; and finally a Pentium PC running Windows NT that interconnects and manages the whole system. Real-time performance is achieved by taking advantage of the special architecture of the DSP. An evaluation of the performance is included.

Risque de prix et décisions de production et d'exportation : le cas de l'agriculture au Québec [Price risk and production and export decisions: the case of agriculture in Quebec]

Abstract:

This thesis examines the effect of price risk on the decisions of Quebec farmers and processors. It is divided into three chapters. The first chapter reviews the literature. The second chapter examines the effect of price risk on the production of three commodities, namely grain corn, pork and lamb, in the province of Quebec. The last chapter analyses the change in the Quebec pork processor's preferences regarding market choice. The first chapter aims to show the importance of the effect of price risk on the quantity produced by farmers, as highlighted in the literature. Indeed, the literature reveals the importance of export price risk for international trade. The second chapter studies the risk factors (price expectations and price volatility) in the supply function. A generalised autoregressive conditional heteroscedasticity (GARCH) model is used to model these risk factors. The model parameters are estimated by Full Information Maximum Likelihood (FIML). The empirical results show a negative effect of price volatility on production, whereas price predictability has a positive effect on the quantity produced. As expected, we find that the application of the Farm Income Stabilization Insurance program (ASRA, assurance-stabilisation des revenus agricoles) in Quebec makes supply more sensitive to the effective price (the price including the ASRA compensation) than to the market price. Moreover, supply is less sensitive to input prices than to the output price. A decrease in the producer's risk aversion is another consequence of the application of this program.
In addition, the estimate of the marginal risk premium reveals that the corn producer is the least risk-averse producer (compared with the pork or lamb producer). The third chapter analyses the change in the Quebec pork processor's preferences regarding market choice. We assume that the processor can supply products to two markets: foreign and local. The theoretical model explains relative supply as a function of both relative price expectations and relative price volatility. This model thus reveals that the sensitivity of relative supply to relative price volatility depends on two factors: the share of exports in total production, and the elasticity of substitution between the two markets. An error correction model is used to estimate the model parameters. The results show a positive and significant effect of the relative price expectation on relative supply in the short run. These results therefore show that a rise in price volatility in the foreign market relative to the local market leads to a fall in relative supply to the foreign market in the long run. In addition, according to the results, the foreign and local markets are more substitutable in the long run than in the short run.

Parallel Cyclostationarity-Exploiting Algorithm for Energy-Efficient Spectrum Sensing

Abstract:

The evolution of wireless communication systems leads to Dynamic Spectrum Allocation for Cognitive Radio, which requires reliable spectrum sensing techniques. Among the spectrum sensing methods proposed in the literature, those that exploit cyclostationary characteristics of radio signals are particularly suitable for communication environments with low signal-to-noise ratios, or with non-stationary noise. However, such methods have high computational complexity that directly raises the power consumption of devices which often have very stringent low-power requirements. We propose a strategy for cyclostationary spectrum sensing with reduced energy consumption. This strategy is based on the principle that p processors working at slower frequencies consume less power than a single processor for the same execution time. We devise a strict relation between the energy savings and common parallel system metrics. The results of simulations show that our strategy promises very significant savings in actual devices.
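
The principle that several slower processors can draw less power than one fast processor for the same execution time can be made concrete with a small worked calculation. The sketch below is not the relation derived in the paper. It assumes the common textbook model in which dynamic power scales as C·V²·f and supply voltage scales in proportion to frequency, so that power scales with the cube of frequency; the function name `relative_power` and the `efficiency` parameter are illustrative.

```python
def relative_power(p, efficiency=1.0):
    """Total power of `p` processors relative to a single processor,
    with both configurations finishing the work in the same time.

    Assumptions (not taken from the paper):
      * dynamic power per processor is proportional to frequency cubed
        (P ~ C * V^2 * f with V proportional to f);
      * static power is ignored;
      * `efficiency` is the parallel efficiency, so each of the `p`
        processors must run at 1 / (p * efficiency) of the original
        frequency to keep the execution time unchanged.
    """
    if p < 1 or not (0.0 < efficiency <= 1.0):
        raise ValueError("need p >= 1 and 0 < efficiency <= 1")
    freq_scale = 1.0 / (p * efficiency)
    return p * freq_scale ** 3


print(relative_power(4))        # prints 0.0625 (ideal speedup)
print(relative_power(4, 0.5))   # prints 0.5 (50% parallel efficiency)
```

Under this model the saving shrinks quickly as parallel efficiency drops, which suggests why a relation between energy savings and parallel system metrics is useful.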

High-performance parallel systems for data-intensive computing

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-08

Modelling continuum mechanics phenomena using three dimensional unstructured meshes on massively parallel processors

Abstract:

The difficulties encountered in implementing large scale CM codes on multiprocessor systems are now fairly well understood. Despite the claims of shared memory architecture manufacturers to provide effective parallelizing compilers, these have not proved to be adequate for large or complex programs. Significant programmer effort is usually required to achieve reasonable parallel efficiencies on significant numbers of processors. The paradigm of Single Program Multi Data (SPMD) domain decomposition with message passing, where each processor runs the same code on a subdomain of the problem, communicating through exchange of messages, has for some time been demonstrated to provide the required level of efficiency, scalability, and portability across both shared and distributed memory systems, without the need to re-author the code into a new language or even to support differing message passing implementations. Extension of the methods into three dimensions has been enabled through the engineering of PHYSICA, a framework for supporting 3D, unstructured mesh and continuum mechanics modeling. In PHYSICA, six inspectors are used. Part of the challenge for automation of parallelization is being able to prove the equivalence of inspectors so that they can be merged into as few as possible.

Partition alignment in three dimensional unstructured mesh multi-physics modelling

Abstract:

Unstructured mesh codes for modelling continuum physics phenomena have evolved to provide the facility to model complex interacting systems. Parallelisation of such codes using Single Program Multi Data (SPMD) domain decomposition techniques implemented with message passing has been demonstrated to provide high parallel efficiency, scalability to large numbers of processors P and portability across a wide range of parallel platforms. High efficiency, especially for large P, requires that load balance is achieved in each parallel loop. For a code in which loops span a variety of mesh entity types, for example elements, faces and vertices, some compromise is required between load balance for each entity type and the quantity of inter-processor communication required to satisfy data dependence between processors.

Automatic implementation of dynamic load balancing strategies for structured computational mechanics codes

Abstract:

This paper presents a new dynamic load balancing technique for structured mesh computational mechanics codes in which the processor partition range limits of just one of the partitioned dimensions use non-coincidental limits, as opposed to using coincidental limits in all of the partitioned dimensions. The partition range limits are 'staggered', allowing greater flexibility in obtaining a balanced load distribution than when the limits are changed 'globally', as the load increase/decrease on one processor no longer restricts the load decrease/increase on a neighbouring processor. The automatic implementation of this 'staggered' load balancing strategy within an existing parallel code is presented in this paper, along with some preliminary results.
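
The difference between coincidental ('global') and staggered partition limits can be illustrated on a small 2-D grid of per-cell loads. The sketch below is an assumption-laden illustration, not the paper's algorithm: the greedy prefix-sum splitter `split_limits`, the helper names and the example grid are all hypothetical.

```python
def split_limits(loads, parts):
    """Greedy contiguous split of `loads` into `parts` ranges.

    Returns end-exclusive upper limits, one per range. This simple
    prefix-sum rule is a placeholder balancing heuristic.
    """
    total = sum(loads)
    limits, running, k = [], 0.0, 1
    for i, w in enumerate(loads):
        running += w
        # Place the k-th cut once the running sum reaches k/parts of the total.
        while k < parts and running >= total * k / parts:
            limits.append(i + 1)
            k += 1
    limits.append(len(loads))
    return limits


def column_sums(grid, r0, r1):
    """Per-column load summed over rows r0..r1-1."""
    return [sum(grid[i][j] for i in range(r0, r1)) for j in range(len(grid[0]))]


def block_loads(grid, row_limits, col_limits_per_strip):
    """Load of every processor block, given one set of column limits per row strip."""
    loads, r0 = [], 0
    for r1, col_limits in zip(row_limits, col_limits_per_strip):
        c0 = 0
        for c1 in col_limits:
            loads.append(sum(grid[i][j] for i in range(r0, r1) for j in range(c0, c1)))
            c0 = c1
        r0 = r1
    return loads


# Hypothetical load map: heavy cells on the left of row 0 and the right of row 1.
grid = [[3, 3, 3, 1, 1, 1],
        [1, 1, 1, 3, 3, 3]]
parts_rows, parts_cols = 2, 2
row_limits = split_limits([sum(row) for row in grid], parts_rows)

# Coincidental limits: one set of column limits shared by every row strip.
shared_cols = split_limits(column_sums(grid, 0, len(grid)), parts_cols)
global_loads = block_loads(grid, row_limits, [shared_cols] * parts_rows)

# Staggered limits: each row strip chooses its own column limits.
staggered_cols, r0 = [], 0
for r1 in row_limits:
    staggered_cols.append(split_limits(column_sums(grid, r0, r1), parts_cols))
    r0 = r1
staggered_loads = block_loads(grid, row_limits, staggered_cols)

print(max(global_loads), max(staggered_loads))  # prints 9 6
```

In this toy case the shared column limit leaves the busiest processor with a load of 9, while letting each strip move its own limit brings every processor to 6, which is the extra flexibility the abstract attributes to staggering.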

The MUSE Machine -- an Architecture for Structured Data Flow Computation

Abstract:

Computers employing some degree of data flow organisation are now well established as providing a possible vehicle for concurrent computation. Although data-driven computation frees the architecture from the constraints of the single program counter, processor and global memory, inherent in the classic von Neumann computer, there can still be problems with the unconstrained generation of fresh result tokens if a pure data flow approach is adopted. The advantages of allowing serial processing for those parts of a program which are inherently serial, and of permitting a demand-driven, as well as data-driven, mode of operation are identified and described. The MUSE machine described here is a structured architecture supporting both serial and parallel processing which allows the abstract structure of a program to be mapped onto the machine in a logical way.

Design and implementation of a floating point unit for Rigel, a massively parallel accelerator

Abstract:

Scientific applications rely heavily on floating point data types. Floating point operations are complex and require complicated hardware that is both area and power intensive. The emergence of massively parallel architectures like Rigel creates new challenges and poses new questions with respect to floating point support. The massively parallel aspect of Rigel places great emphasis on area efficient, low power designs. At the same time, Rigel is a general purpose accelerator and must provide high performance for a wide class of applications. This thesis presents an analysis of various floating point unit (FPU) components with respect to Rigel, and attempts to present a candidate design of an FPU that balances performance, area, and power and is suitable for massively parallel architectures like Rigel.

Development of a Real Time System in Algol 68.

Abstract:

For various reasons, many Algol 68 compilers do not directly implement the parallel processing operations defined in the Revised Algol 68 Report. It is still possible, however, to perform parallel processing, multitasking and simulation, provided that the implementation permits the creation of a master routine for the coordination and initiation of processes under its control. The package described here is intended for real-time applications and runs in conjunction with the Algol 68R system; it extends and develops the original Algol 68RT package, which was designed for use with multiplexers at the Royal Radar Establishment, Malvern. The facilities provided, in addition to the synchronising operations, include an interface to an ICL Communications Processor, enabling the abstract processes to be realised as the interaction of several teletypes or visual display units with a real-time program providing a useful service.
