936 results for Architectures profondes
Abstract:
This thesis, carried out at the laboratories of the X-ray Imaging Group of the Department of Physics and Astronomy of the University of Bologna and within COSA (Computing on SoC Architectures), a project of the INFN National Scientific Committee V, aims to port a tomographic reconstruction code to GPU architectures installed on low-power Systems-on-Chip and to analyze it, in order to develop a portable, inexpensive, and relatively fast method. Based on the computational analysis, three versions of the CUDA C port were developed: the first merely moves the most expensive part of the computation to the graphics card; the second exploits the coprocessor's speed in matrix computation (mapping each pixel to a single parallel computing unit); the third is a further optimized refinement of the second. The third version was chosen as the final one because it performs best both in reconstruction time for a single slice and in energy savings. The port was compared with two other parallelizations, in OpenMP and MPI. The efficiency of each paradigm in terms of computing speed and energy consumed was then studied both on an HPC cluster and on a low-power SoC cluster (using in particular the quad-core Tegra K1 board). The solution we propose combines the OpenMP port with the CUDA C port: three CPU cores are reserved for running the OpenMP code, and the fourth manages the GPU using the CUDA C port. This double parallelization has the highest efficiency in terms of power and energy, while the HPC cluster has the highest efficiency in computing speed. The proposed method would therefore make it possible to exploit almost the full potential of the CPU and GPU at a very low cost.
A possible future optimization could reconstruct two slices simultaneously on the GPU, roughly doubling the overall speed and making the best use of the hardware. This study gave very satisfactory results: with only three TK1 boards it is possible to match, and perhaps later exceed, the computing power of a traditional server, with the added advantage of a portable, low-power, low-cost system. This research stands as one of the first practical studies of low-power SoC architectures and their use in science, with very promising results.
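The per-pixel mapping mentioned in this abstract can be illustrated with a small sketch. It is not the thesis code: the abstract does not name the reconstruction algorithm, so the example assumes a plain (unfiltered) parallel-beam backprojection with nearest-neighbour sampling, and the names `backproject_pixel` and `backproject` are hypothetical. It is written in plain Python rather than CUDA C only to show that each pixel is computed independently, which is what would allow one GPU thread per pixel.

```python
import math

def backproject_pixel(sinogram, angles, x, y, n_det):
    """Value of one output pixel (assumed unfiltered parallel-beam backprojection)."""
    center = (n_det - 1) / 2.0
    total = 0.0
    for projection, theta in zip(sinogram, angles):
        # Detector coordinate of the ray passing through (x, y) at this angle
        t = x * math.cos(theta) + y * math.sin(theta) + center
        idx = int(round(t))  # nearest-neighbour sampling, an assumption
        if 0 <= idx < n_det:
            total += projection[idx]
    return total

def backproject(sinogram, angles, size):
    """Reconstruct a size x size slice; every pixel is independent of the others."""
    n_det = len(sinogram[0])
    half = (size - 1) / 2.0
    # On a GPU, each (row, col) pair could be handled by its own thread
    return [[backproject_pixel(sinogram, angles, col - half, row - half, n_det)
             for col in range(size)]
            for row in range(size)]

# Two projections (0 and 90 degrees) of a single point at the slice centre
sinogram = [[0, 0, 1, 0, 0], [0, 0, 1, 0, 0]]
angles = [0.0, math.pi / 2]
image = backproject(sinogram, angles, size=5)
```

Because no pixel reads another pixel's result, the two nested loops in `backproject` carry no dependencies between iterations, which is the property a one-thread-per-pixel kernel relies on.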
Abstract:
Almost 10 years after the article “The Free Lunch Is Over”, when the need to parallelize programs started to become a real and mainstream issue, a lot has happened: • Processor manufacturers are reaching the physical limits of most of their approaches to boosting CPU performance, and are instead turning to hyperthreading and multicore architectures; • Applications increasingly need to support concurrency; • Programming languages and systems are increasingly forced to deal well with concurrency. This thesis is an attempt to propose an overview of a paradigm that aims to properly abstract the problem of propagating data changes: Reactive Programming (RP). This paradigm proposes an asynchronous non-blocking approach to concurrency and computations, abstracting from the low-level concurrency mechanisms.
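The idea of propagating data changes can be sketched in a few lines. This is a deliberately simplified illustration, not an API from the thesis or from any reactive library: `Cell` and `derived` are hypothetical names, and propagation here is synchronous, whereas the paradigm described above is asynchronous and non-blocking.

```python
class Cell:
    """A value that notifies its subscribers whenever it changes (hypothetical)."""

    def __init__(self, value):
        self._value = value
        self._subscribers = []

    @property
    def value(self):
        return self._value

    def set(self, value):
        # Propagate only real changes to avoid redundant recomputation
        if value != self._value:
            self._value = value
            for callback in list(self._subscribers):
                callback(value)

    def subscribe(self, callback):
        self._subscribers.append(callback)


def derived(fn, *sources):
    """Create a cell whose value is recomputed whenever any source changes."""
    out = Cell(fn(*(s.value for s in sources)))

    def recompute(_changed_value):
        out.set(fn(*(s.value for s in sources)))

    for source in sources:
        source.subscribe(recompute)
    return out


a = Cell(1)
b = Cell(2)
total = derived(lambda x, y: x + y, a, b)
a.set(10)  # total is updated without any explicit call by the user
```

The caller never recomputes `total` by hand; the dependency is declared once and updates flow from the sources, which is the abstraction the paradigm aims for.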
Abstract:
Supramolecular two-dimensional engineering epitomizes the design of complex molecular architectures through recognition events in multicomponent self-assembly. Despite being the subject of in-depth experimental studies, such articulated phenomena have not yet been elucidated in time and space with atomic precision. Here we use atomistic molecular dynamics to simulate the recognition of complementary hydrogen-bonding modules forming 2D porous networks on graphite. We describe the transition path from the melt to the crystalline hexagonal phase and show that self-assembly proceeds through a series of intermediate states featuring a plethora of polygonal types. Finally, we design a novel bicomponent system possessing kinetically improved self-healing ability in silico, thus demonstrating that a priori engineering of 2D self-assembly is possible.
Abstract:
This work covers the synthesis of second-generation, ethylene glycol dendrons covalently linked to a surface anchor that contains two, three, or four catechol groups, the molecular assembly in aqueous buffer on titanium oxide surfaces, and the evaluation of the resistance of the monomolecular adlayers against nonspecific protein adsorption in contact with full blood serum. The results were compared to those of a linear poly(ethylene glycol) (PEG) analogue with the same molecular weight. The adsorption kinetics as well as resulting surface coverages were monitored by ex situ spectroscopic ellipsometry (VASE), in situ optical waveguide lightmode spectroscopy (OWLS), and quartz crystal microbalance with dissipation (QCM-D) investigations. The expected compositions of the macromolecular films were verified by X-ray photoelectron spectroscopy (XPS). The results of the adsorption study, performed in a high ionic strength ("cloud-point") buffer at room temperature, demonstrate that the adsorption kinetics increase with increasing number of catechol binding moieties and exceed the values found for the linear PEG analogue. This is attributed to the comparatively smaller and more confined molecular volume of the dendritic macromolecules in solution, the improved presentation of the catechol anchor, and/or their much lower cloud-point in the chosen buffer (close to room temperature). Interestingly, in terms of mechanistic aspects of "nonfouling" surface properties, the dendron films were found to be much stiffer and considerably less hydrated in comparison to the linear PEG brush surface, closer in their physicochemical properties to oligo(ethylene glycol) alkanethiol self-assembled monolayers than to conventional brush surfaces. Despite these differences, both types of polymer architectures at saturation coverage proved to be highly resistant toward protein adsorption. 
Although associated with higher synthesis costs, dendritic macromolecules are considered to be an attractive alternative to linear polymers for surface (bio)functionalization in view of their spontaneous formation of ultrathin, confluent, and nonfouling monolayers at room temperature and their outstanding ability to present functional ligands (coupled to the termini of the dendritic structure) at high surface densities.
Abstract:
Content-centric networking is a novel paradigm for the Future Internet that treats content as a first class citizen. This paper argues that content-centric networking should be generalized towards a service-centric networking scheme. We propose a service-centric networking design based on an object-oriented approach, in which content and services are considered objects. We show implementation architectures for example services and how these can benefit from service-oriented networking.
Abstract:
Range estimation is the core of many positioning systems such as radar and Wireless Local Positioning Systems (WLPS). The estimation of range is achieved by estimating Time-of-Arrival (TOA). TOA represents the signal propagation delay between a transmitter and a receiver. Thus, error in TOA estimation causes degradation in range estimation performance. In wireless environments, noise, multipath, and limited bandwidth reduce TOA estimation performance. TOA estimation algorithms that are designed for wireless environments aim to improve the TOA estimation performance by mitigating the effect of closely spaced paths in practical (positive) signal-to-noise ratio (SNR) regions. Limited bandwidth prevents the discrimination of closely spaced paths. This reduces TOA estimation performance. TOA estimation methods are evaluated as a function of SNR, bandwidth, and the number of reflections in multipath wireless environments, as well as their complexity. In this research, a TOA estimation technique based on Blind Signal Separation (BSS) is proposed. This frequency domain method estimates TOA in wireless multipath environments for a given signal bandwidth. The structure of the proposed technique is presented and its complexity and performance are theoretically evaluated. It is shown that the proposed method is not sensitive to SNR, number of reflections, and bandwidth. In general, as bandwidth increases, TOA estimation performance improves. However, spectrum is the most valuable resource in wireless systems and usually a large portion of spectrum to support high performance TOA estimation is not available. In addition, the radio frequency (RF) components of wideband systems suffer from high cost and complexity. Thus, a novel, multiband positioning structure is proposed. The proposed technique uses the available (non-contiguous) bands to support high performance TOA estimation.
This system incorporates the capabilities of cognitive radio (CR) systems to sense the available spectrum (also called white spaces) and to incorporate white spaces for high-performance localization. First, contiguous bands that are divided into several non-equal, narrow sub-bands that possess the same SNR are concatenated to attain an accuracy corresponding to the equivalent full band. Two radio architectures are proposed and investigated: the signal is transmitted over available spectrum either simultaneously (parallel concatenation) or sequentially (serial concatenation). Low complexity radio designs that handle the concatenation process sequentially and in parallel are introduced. Different TOA estimation algorithms that are applicable to multiband scenarios are studied and their performance is theoretically evaluated and compared to simulations. Next, the results are extended to non-contiguous, non-equal sub-bands with the same SNR. These are more realistic assumptions in practical systems. The performance and complexity of the proposed technique are investigated as well. This study’s results show that selecting bandwidth, center frequency, and SNR levels for each sub-band can adapt positioning accuracy.
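The relationship between TOA, range, and bandwidth described in this abstract can be made concrete with a short numerical sketch. It does not implement the proposed BSS technique or the concatenation radio designs. The function names are hypothetical, the 1/B delay resolution is a common rule of thumb rather than a result from this work, and treating concatenated sub-bands as a single band whose width is the sum of the sub-band widths is a simplifying assumption.

```python
SPEED_OF_LIGHT = 299_792_458.0  # propagation speed in m/s (free space)

def range_from_toa(toa_seconds):
    """Range implied by a one-way propagation delay."""
    return SPEED_OF_LIGHT * toa_seconds

def range_error_from_toa_error(toa_error_seconds):
    """A TOA error maps linearly onto a range error."""
    return SPEED_OF_LIGHT * abs(toa_error_seconds)

def path_resolution_m(bandwidth_hz):
    """Rule of thumb (assumption): paths closer than about 1/B in delay are hard to separate."""
    return SPEED_OF_LIGHT / bandwidth_hz

def concatenated_bandwidth(sub_bands_hz):
    """Simplified view of concatenation: total width of the available sub-bands."""
    return sum(sub_bands_hz)

# Three unequal narrow sub-bands (illustrative values only)
sub_bands = [5e6, 10e6, 5e6]
narrow = path_resolution_m(max(sub_bands))
combined = path_resolution_m(concatenated_bandwidth(sub_bands))
```

Under these assumptions a 10 ns TOA error corresponds to roughly 3 m of range error, and the combined 20 MHz band separates paths about twice as finely as the widest single sub-band.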
Abstract:
We report on our experiences with the Spy project, including implementation details and benchmark results. Spy is a re-implementation of the Squeak (i.e., Smalltalk-80) VM using the PyPy toolchain. The PyPy project allows code written in RPython, a subset of Python, to be translated to a multitude of different backends and architectures. During the translation, many aspects of the implementation can be independently tuned, such as the garbage collection algorithm or threading implementation. In this way, a whole host of interpreters can be derived from one abstract interpreter definition. Spy aims to bring these benefits to Squeak, allowing for greater portability and, eventually, improved performance. The current Spy codebase is able to run a small set of benchmarks that demonstrate performance superior to many similar Smalltalk VMs, but which still run slower than in Squeak itself. Spy was built from scratch over the course of a week during a joint Squeak-PyPy Sprint in Bern last autumn.
Abstract:
In this paper, two models for the simulation of glucose-insulin metabolism of children with Type 1 diabetes are presented. The models are based on the combined use of Compartmental Models (CMs) and artificial Neural Networks (NNs). Data from children with Type 1 diabetes, stored in a database, have been used as input to the models. The data are taken from four children with Type 1 diabetes and contain information about glucose levels from a continuous glucose monitoring system, insulin intake, and food intake, along with the corresponding times. The influence of administered insulin on plasma insulin concentration, as well as the effect of food intake on glucose input into the blood from the gut, are estimated from the CMs. The outputs of the CMs, along with previous glucose measurements, are fed to an NN, which provides short-term prediction of glucose values. For comparison, two different NN architectures have been tested: a Feed-Forward NN (FFNN) trained with the back-propagation algorithm with adaptive learning rate and momentum, and a Recurrent NN (RNN) trained with the Real Time Recurrent Learning (RTRL) algorithm. The results indicate that the best prediction performance can be achieved by the use of the RNN.
Abstract:
The aim of the present study is to define an optimally performing computer-aided diagnosis (CAD) architecture for the classification of liver tissue from non-enhanced computed tomography (CT) images into normal liver (C1), hepatic cyst (C2), hemangioma (C3), and hepatocellular carcinoma (C4). To this end, various CAD architectures, based on texture features and ensembles of classifiers (ECs), are comparatively assessed.
Abstract:
Wireless Mesh Networks (WMNs) have proven to be a key technology for increased network coverage of Internet infrastructures. The development process for new protocols and architectures in the area of WMNs is typically split into evaluation by network simulation and testing of a prototype in a test-bed. Testing a prototype in a real test-bed is time-consuming and expensive. Uncontrollable external interference can occur, which makes debugging difficult. Moreover, the test-bed usually supports only a limited number of test topologies. Finally, mobility tests are impractical. Therefore, we propose VirtualMesh as a new testing architecture which can be used before going to a real test-bed. It provides instruments to test the real communication software, including the network stack, inside a controlled environment. VirtualMesh is implemented by capturing real traffic through a virtual interface at the mesh nodes. The traffic is then redirected to the network simulator OMNeT++. In our experiments, VirtualMesh has proven to be scalable and introduces moderate delays. Therefore, it is suitable for predeployment testing of communication software for WMNs.
Abstract:
PDP++ is a freely available, open source software package designed to support the development, simulation, and analysis of research-grade connectionist models of cognitive processes. It supports most popular parallel distributed processing paradigms and artificial neural network architectures, and it also provides an implementation of the LEABRA computational cognitive neuroscience framework. Models are typically constructed and examined using the PDP++ graphical user interface, but the system may also be extended through the incorporation of user-written C++ code. This article briefly reviews the features of PDP++, focusing on its utility for teaching cognitive modeling concepts and skills to university undergraduate and graduate students. An informal evaluation of the software as a pedagogical tool is provided, based on the author’s classroom experiences at three research universities and several conference-hosted tutorials.
Abstract:
As education providers increasingly integrate digital learning media into their education processes, the need for the systematic management of learning materials and learning arrangements becomes clearer. Digital repositories, often called Learning Object Repositories (LOR), promise to provide an answer to this challenge. This article is composed of two parts. In this part, we derive technological and pedagogical requirements for LORs from a concretization of information quality criteria for e-learning technology. We review the evolution of learning object repositories and discuss their core features in the context of pedagogical requirements, information quality demands, and e-learning technology standards. We conclude with an outlook in Part 2, which presents concrete technical solutions, in particular networked repository architectures.
Abstract:
In Part 1 of this article we discussed the need for information quality and the systematic management of learning materials and learning arrangements. Digital repositories, often called Learning Object Repositories (LOR), were introduced as a promising answer to this challenge. We also derived technological and pedagogical requirements for LORs from a concretization of information quality criteria for e-learning technology. This second part presents technical solutions that particularly address the demands of open education movements, which aspire to a global reuse and sharing culture. From this viewpoint, we develop core requirements for scalable network architectures for educational content management. We then present edu-sharing, an advanced example of a network of homogeneous repositories for learning resources, and discuss related technology. We conclude with an outlook in terms of emerging developments towards open and networked system architectures in e-learning.