10 resultados para Computer Vision and Pattern Recognition

em AMS Tesi di Laurea - Alm@DL - Università di Bologna


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In recent years, Deep Learning techniques have shown to perform well on a large variety of problems both in Computer Vision and Natural Language Processing, reaching and often surpassing the state of the art on many tasks. The rise of deep learning is also revolutionizing the entire field of Machine Learning and Pattern Recognition pushing forward the concepts of automatic feature extraction and unsupervised learning in general. However, despite the strong success both in science and business, deep learning has its own limitations. It is often questioned if such techniques are only some kind of brute-force statistical approaches and if they can only work in the context of High Performance Computing with tons of data. Another important question is whether they are really biologically inspired, as claimed in certain cases, and if they can scale well in terms of "intelligence". The dissertation is focused on trying to answer these key questions in the context of Computer Vision and, in particular, Object Recognition, a task that has been heavily revolutionized by recent advances in the field. Practically speaking, these answers are based on an exhaustive comparison between two, very different, deep learning techniques on the aforementioned task: Convolutional Neural Network (CNN) and Hierarchical Temporal memory (HTM). They stand for two different approaches and points of view within the big hat of deep learning and are the best choices to understand and point out strengths and weaknesses of each of them. CNN is considered one of the most classic and powerful supervised methods used today in machine learning and pattern recognition, especially in object recognition. CNNs are well received and accepted by the scientific community and are already deployed in large corporation like Google and Facebook for solving face recognition and image auto-tagging problems. HTM, on the other hand, is known as a new emerging paradigm and a new meanly-unsupervised method, that is more biologically inspired. It tries to gain more insights from the computational neuroscience community in order to incorporate concepts like time, context and attention during the learning process which are typical of the human brain. In the end, the thesis is supposed to prove that in certain cases, with a lower quantity of data, HTM can outperform CNN.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

La tesi tratta i temi di computer vision connessi alle problematiche di inserimento in una piattaforma Web. Nel testo sono spiegate alcune soluzioni per includere una libreria software per l'emotion recognition in un'applicazione web e tecnologie per la registrazione di un video, catturando le immagine da una webcam.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Vision systems are powerful tools playing an increasingly important role in modern industry, to detect errors and maintain product standards. With the enlarged availability of affordable industrial cameras, computer vision algorithms have been increasingly applied in industrial manufacturing processes monitoring. Until a few years ago, industrial computer vision applications relied only on ad-hoc algorithms designed for the specific object and acquisition setup being monitored, with a strong focus on co-designing the acquisition and processing pipeline. Deep learning has overcome these limits providing greater flexibility and faster re-configuration. In this work, the process to be inspected consists in vials’ pack formation entering a freeze-dryer, which is a common scenario in pharmaceutical active ingredient packaging lines. To ensure that the machine produces proper packs, a vision system is installed at the entrance of the freeze-dryer to detect eventual anomalies with execution times compatible with the production specifications. Other constraints come from sterility and safety standards required in pharmaceutical manufacturing. This work presents an overview about the production line, with particular focus on the vision system designed, and about all trials conducted to obtain the final performance. Transfer learning, alleviating the requirement for a large number of training data, combined with data augmentation methods, consisting in the generation of synthetic images, were used to effectively increase the performances while reducing the cost of data acquisition and annotation. The proposed vision algorithm is composed by two main subtasks, designed respectively to vials counting and discrepancy detection. The first one was trained on more than 23k vials (about 300 images) and tested on 5k more (about 75 images), whereas 60 training images and 52 testing images were used for the second one.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The usage of Optical Character Recognition’s (OCR, systems is a widely spread technology into the world of Computer Vision and Machine Learning. It is a topic that interest many field, for example the automotive, where becomes a specialized task known as License Plate Recognition, useful for many application from the automation of toll road to intelligent payments. However, OCR systems need to be very accurate and generalizable in order to be able to extract the text of license plates under high variable conditions, from the type of camera used for acquisition to light changes. Such variables compromise the quality of digitalized real scenes causing the presence of noise and degradation of various type, which can be minimized with the application of modern approaches for image iper resolution and noise reduction. Oneclass of them is known as Generative Neural Networks, which are very strong ally for the solution of this popular problem.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Lo studio dell’intelligenza artificiale si pone come obiettivo la risoluzione di una classe di problemi che richiedono processi cognitivi difficilmente codificabili in un algoritmo per essere risolti. Il riconoscimento visivo di forme e figure, l’interpretazione di suoni, i giochi a conoscenza incompleta, fanno capo alla capacità umana di interpretare input parziali come se fossero completi, e di agire di conseguenza. Nel primo capitolo della presente tesi sarà costruito un semplice formalismo matematico per descrivere l’atto di compiere scelte. Il processo di “apprendimento” verrà descritto in termini della massimizzazione di una funzione di prestazione su di uno spazio di parametri per un ansatz di una funzione da uno spazio vettoriale ad un insieme finito e discreto di scelte, tramite un set di addestramento che descrive degli esempi di scelte corrette da riprodurre. Saranno analizzate, alla luce di questo formalismo, alcune delle più diffuse tecniche di artificial intelligence, e saranno evidenziate alcune problematiche derivanti dall’uso di queste tecniche. Nel secondo capitolo lo stesso formalismo verrà applicato ad una ridefinizione meno intuitiva ma più funzionale di funzione di prestazione che permetterà, per un ansatz lineare, la formulazione esplicita di un set di equazioni nelle componenti del vettore nello spazio dei parametri che individua il massimo assoluto della funzione di prestazione. La soluzione di questo set di equazioni sarà trattata grazie al teorema delle contrazioni. Una naturale generalizzazione polinomiale verrà inoltre mostrata. Nel terzo capitolo verranno studiati più nel dettaglio alcuni esempi a cui quanto ricavato nel secondo capitolo può essere applicato. Verrà introdotto il concetto di grado intrinseco di un problema. Verranno inoltre discusse alcuni accorgimenti prestazionali, quali l’eliminazione degli zeri, la precomputazione analitica, il fingerprinting e il riordino delle componenti per lo sviluppo parziale di prodotti scalari ad alta dimensionalità. Verranno infine introdotti i problemi a scelta unica, ossia quella classe di problemi per cui è possibile disporre di un set di addestramento solo per una scelta. Nel quarto capitolo verrà discusso più in dettaglio un esempio di applicazione nel campo della diagnostica medica per immagini, in particolare verrà trattato il problema della computer aided detection per il rilevamento di microcalcificazioni nelle mammografie.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this work we focus on pattern recognition methods related to EMG upper-limb prosthetic control. After giving a detailed review of the most widely used classification methods, we propose a new classification approach. It comes as a result of comparison in the Fourier analysis between able-bodied and trans-radial amputee subjects. We thus suggest a different classification method which considers each surface electrodes contribute separately, together with five time domain features, obtaining an average classification accuracy equals to 75% on a sample of trans-radial amputees. We propose an automatic feature selection procedure as a minimization problem in order to improve the method and its robustness.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Nell'elaborato viene introdotto l'ambito della Computer Vision e come l'algoritmo SIFT si inserisce nel suo panorama. Viene inoltre descritto SIFT stesso, le varie fasi di cui si compone e un'applicazione al problema dell'object recognition. Infine viene presentata un'implementazione di SIFT in linguaggio Python creata per ottenere un'applicazione didattica interattiva e vengono mostrati esempi di questa applicazione.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis work has been developed in collaboration between the Department of Physics and Astronomy of the University of Bologna and the IRCCS Rizzoli Orthopedic Institute during an internship period. The study aims to investigate the sensitivity of single-sided NMR in detecting structural differences of the articular cartilage tissue and their correlation with mechanical behavior. Suitable cartilage indicators for osteoarthritis (OA) severity (e.g., water and proteoglycans content, collagen structure) were explored through four NMR parameters: T2, T1, D, and Slp. Structural variations of the cartilage among its three layers (i.e., superficial, middle, and deep) were investigated performing several NMR pulses sequences on bovine knee joint samples using the NMR-MOUSE device. Previously, cartilage degradation studies were carried out, performing tests in three different experimental setups. The monitoring of the parameters and the best experimental setup were determined. An NMR automatized procedure based on the acquisition of these quantitative parameters was implemented, tested, and used for the investigation of the layers of twenty bovine cartilage samples. Statistical and pattern recognition analyses on these parameters have been performed. The results obtained from the analyses are very promising: the discrimination of the three cartilage layers shows very good results in terms of significance, paving the way for extensive use of NMR single-sided devices for biomedical applications. These results will be also integrated with analyses of tissue mechanical properties for a complete evaluation of cartilage changes throughout OA disease. The use of low-priced and mobile devices towards clinical applications could concern the screening of diseases related to cartilage tissue. This could have a positive impact both economically (including for underdeveloped countries) and socially, providing screening possibilities to a large part of the population.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Miniaturized flying robotic platforms, called nano-drones, have the potential to revolutionize the autonomous robots industry sector thanks to their very small form factor. The nano-drones’ limited payload only allows for a sub-100mW microcontroller unit for the on-board computations. Therefore, traditional computer vision and control algorithms are too computationally expensive to be executed on board these palm-sized robots, and we are forced to rely on artificial intelligence to trade off accuracy in favor of lightweight pipelines for autonomous tasks. However, relying on deep learning exposes us to the problem of generalization since the deployment scenario of a convolutional neural network (CNN) is often composed by different visual cues and different features from those learned during training, leading to poor inference performances. Our objective is to develop and deploy and adaptation algorithm, based on the concept of latent replays, that would allow us to fine-tune a CNN to work in new and diverse deployment scenarios. To do so we start from an existing model for visual human pose estimation, called PULPFrontnet, which is used to identify the pose of a human subject in space through its 4 output variables, and we present the design of our novel adaptation algorithm, which features automatic data gathering and labeling and on-device deployment. We therefore showcase the ability of our algorithm to adapt PULP-Frontnet to new deployment scenarios, improving the R2 scores of the four network outputs, with respect to an unknown environment, from approximately [−0.2, 0.4, 0.0,−0.7] to [0.25, 0.45, 0.2, 0.1]. Finally we demonstrate how it is possible to fine-tune our neural network in real time (i.e., under 76 seconds), using the target parallel ultra-low power GAP 8 System-on-Chip on board the nano-drone, and we show how all adaptation operations can take place using less than 2mWh of energy, a small fraction of the available battery power.