784 results for deep learning, convolutional neural network, computer aided detection, mammografie
Abstract:
Over the past two years, the pandemic caused by the Covid-19 virus has drastically changed life in every corner of the planet. To date, more than two hundred and twenty million people worldwide have contracted the virus and almost five million have died. At some points there were as many as one million new infections per day, and over the past six months the average has been more than half a million per day. Hospitals, especially in less developed countries, have come under great strain and have often lacked the resources to cope with this serious pandemic. For this reason, all research in this field is extremely important, especially research that uses artificial intelligence to support physicians. Once developed and approved, these technologies can be distributed at very low cost and made accessible to everyone. In this work, two different approaches to diagnosing Covid-19 from patients' chest X-rays were tested and evaluated. The first is based on transfer learning of a convolutional network originally designed for image classification. The second uses Vision Transformers (ViT), an architecture widely used in Natural Language Processing and adapted to computer vision tasks. The first solution achieved an accuracy of 0.85 and the second 0.92. These results, particularly the second, are very encouraging, especially given the minimal amount of training data required.
Abstract:
The main objective of the thesis is to compare solutions based on different technologies and to identify the best one for determining whether the people shown in an image are wearing a protective mask correctly, as required by anti-Covid regulations. To this end, several architectures built for the same purpose and based on Machine Learning and Deep Learning principles are compared and run on a set of selected datasets that were created for similar purposes.
Abstract:
Recent developments in artificial intelligence have enabled more accurate classification of EEG signals. In recent years it has been shown that excellent classification performance can be obtained with Machine Learning (ML) and Deep Learning (DL) techniques, the latter relying on Convolutional Neural Networks (CNNs). Deep Learning, however, requires large amounts of training data, whereas EEG datasets are often limited, which makes high performance difficult to achieve. Data Augmentation methods can alleviate this problem. Starting from real data, this technique creates artificial data that are essential for increasing the size of the original dataset. Its most common application is to enlarge the training set so that the model or neural network is trained on a larger number of samples, reducing classification errors. Building on this idea, Data Augmentation has been applied in many fields, and in particular to EEG signal classification. This thesis first describes Data Augmentation methods implemented over the years that can also be used in EEG applications. It then presents specific studies that apply Data Augmentation methods to improve the performance of EEG-based classifiers for sleep/wake state identification, emotion recognition, and motor imagery classification.
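The idea of creating artificial training samples from real recordings can be sketched in a few lines. This is only an illustration, not one of the methods surveyed in the thesis: the helper names `add_gaussian_noise`, `sliding_windows` and `augment`, the noise level, and the window sizes are all assumptions chosen for the example.

```python
import random


def add_gaussian_noise(signal, sigma, rng):
    """Return a copy of `signal` with zero-mean Gaussian noise added to each sample."""
    return [x + rng.gauss(0.0, sigma) for x in signal]


def sliding_windows(signal, window, step):
    """Cut `signal` into overlapping windows of length `window`, advancing by `step`."""
    return [signal[i:i + window] for i in range(0, len(signal) - window + 1, step)]


def augment(signal, window=4, step=2, sigma=0.1, seed=0):
    """Build an augmented set: every cropped window plus one noisy copy of it."""
    rng = random.Random(seed)
    crops = sliding_windows(signal, window, step)
    noisy = [add_gaussian_noise(crop, sigma, rng) for crop in crops]
    return crops + noisy


# Hypothetical single-channel trial with 10 samples (made-up values).
trial = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5, 0.0, 0.5]
augmented = augment(trial)
print(len(augmented))  # 8: four cropped windows and four noisy copies
```

In a real pipeline each augmented window would inherit the label of the trial it was cut from, and only the training set would be augmented so that the test set stays untouched.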
Abstract:
This work applies dataset fusion techniques to an object detection task, using deep learning in the form of convolutional neural networks, in order to build a single RCNN architecture that performs inference well on two distinct datasets from different domains.
Abstract:
The objective of this thesis by articles is to present, modestly, a few steps along the path that will (one hopes) lead to a general solution to the problem of artificial intelligence. The thesis contains four articles, each presenting a new method of perceptual inference using machine learning and, more specifically, deep neural networks. Each article demonstrates the usefulness of its proposed method on a computer vision task. These methods are applicable in a more general context, and in some cases they have been applied elsewhere, but this is not addressed in this thesis. In the first article, we present two new variational inference algorithms for the generative image model called spike-and-slab sparse coding (S3C). These faster inference methods allow us to use much larger S3C models than before. We show that they are better at extracting feature detectors when very few labeled examples are available for training. Starting from an S3C model, we then build a deep architecture, the partially directed deep Boltzmann machine (PD-DBM). This model was designed to simplify the training of deep Boltzmann machines, which normally require a greedy pre-training phase for each layer. The problem is solved to some extent, but the cost of inference in the new model is too high for it to be used in practice. In the second article, we return to the problem of jointly training deep Boltzmann machines. This time, instead of changing the model family, we introduce a new training criterion that gives rise to multi-prediction deep Boltzmann machines (MP-DBM). 
MP-DBMs can be trained in a single stage and achieve better classification accuracy than classic DBMs. They are also trained with standard variational methods rather than requiring a discriminative classifier to achieve good classification accuracy. One drawback of such models is their inability to generate samples, but this is not too serious, since the classification performance of deep Boltzmann machines is no longer a priority given recent advances in supervised learning. Even so, MP-DBMs remain interesting because they can perform certain tasks that purely supervised models cannot, such as classifying incomplete data or intelligently filling in the missing information in such data. The work presented in this thesis took place in the middle of a period of major transformation in the field of deep neural network learning, triggered by Geoffrey Hinton's discovery of the "dropout" algorithm. Dropout makes purely supervised training of feed-forward architectures possible without exposure to the danger of overfitting. The third article in this thesis introduces a new activation function designed specifically to work with the dropout algorithm. This activation function, called maxout, enables the use of cross-channel pooling in a purely supervised learning setting. We show that several object recognition tasks are performed better with maxout. Finally, we present a real industrial use case: the transcription of multi-digit house numbers. 
By combining maxout with a new kind of output layer for convolutional neural networks, we show that it is possible to achieve accuracy comparable to that of humans on a challenging dataset of photos taken by Google's cars. This system has been successfully deployed at Google to read around one hundred million house numbers.
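As a small illustration of the maxout activation mentioned above, the sketch below uses the commonly cited definition of a maxout unit, the maximum over several affine functions of the input. The abstract itself does not spell out the formula, so the definition, the function name `maxout`, and the example weights are assumptions made for this example.

```python
def maxout(x, weights, biases):
    """Maxout unit: the maximum over k affine pieces w_j . x + b_j."""
    pieces = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    return max(pieces)


# With the two pieces (x, 0) a maxout unit reproduces ReLU;
# with the two pieces (x, -x) it reproduces the absolute value.
relu_w, relu_b = [[1.0], [0.0]], [0.0, 0.0]
abs_w, abs_b = [[1.0], [-1.0]], [0.0, 0.0]

print(maxout([-2.0], relu_w, relu_b))  # 0.0
print(maxout([-2.0], abs_w, abs_b))    # 2.0
```

Because the pieces are learned, a single unit can take the shape of different piecewise-linear convex functions, which is one informal way to read the "cross-channel pooling" described in the abstract.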
Abstract:
Deep Brain Stimulator devices are becoming widely used for their therapeutic benefits in movement disorders such as Parkinson's disease. Prolonging the battery life of such devices could dramatically reduce the risks and cumulative costs associated with surgical replacement. This paper demonstrates how an artificial neural network can be trained, using frequency analysis of deep brain electrode recordings as a pre-processing step, to detect the onset of tremor in Parkinsonian patients. Implementing this solution in an 'intelligent' neurostimulator device would remove the need for the continuous stimulation currently used and open up the possibility of demand-driven stimulation. Such a methodology could potentially decrease the power consumption of a deep brain pulse generator.
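The frequency-analysis pre-processing step can be pictured with a minimal sketch. Everything specific here is an assumption made for illustration: the 3 to 7 Hz band, the sampling rate, the 0.5 threshold, and the function names are not taken from the paper, and a fixed threshold stands in for the trained neural network.

```python
import cmath
import math


def band_power_ratio(signal, fs, f_lo, f_hi):
    """Fraction of spectral power (DC excluded) that falls in [f_lo, f_hi] Hz."""
    n = len(signal)
    total = 0.0
    in_band = 0.0
    for k in range(1, n // 2):  # positive-frequency DFT bins, skipping DC
        coeff = sum(
            signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power = abs(coeff) ** 2
        freq = k * fs / n  # centre frequency of bin k in Hz
        total += power
        if f_lo <= freq <= f_hi:
            in_band += power
    return in_band / total if total > 0 else 0.0


def tremor_detected(signal, fs, f_lo=3.0, f_hi=7.0, threshold=0.5):
    """Stand-in detector: flags a window whose power is mostly in the assumed band."""
    return band_power_ratio(signal, fs, f_lo, f_hi) > threshold


# Synthetic test signals: a 5 Hz and a 20 Hz sine sampled at 64 Hz for 2 seconds.
fs, n = 64, 128
tremor_like = [math.sin(2 * math.pi * 5 * t / fs) for t in range(n)]
other = [math.sin(2 * math.pi * 20 * t / fs) for t in range(n)]
print(tremor_detected(tremor_like, fs), tremor_detected(other, fs))
```

In the setting the paper describes, band-power features of this kind would be the inputs to the neural network, and stimulation would be switched on only when tremor onset is detected.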
Abstract:
In this paper, a computer-aided diagnostic (CAD) system for the classification of hepatic lesions from computed tomography (CT) images is presented. Regions of interest (ROIs) taken from nonenhanced CT images of normal liver, hepatic cysts, hemangiomas, and hepatocellular carcinomas have been used as input to the system. The proposed system consists of two modules: the feature extraction and the classification modules. The feature extraction module calculates the average gray level and 48 texture characteristics, which are derived from the spatial gray-level co-occurrence matrices, obtained from the ROIs. The classifier module consists of three sequentially placed feed-forward neural networks (NNs). The first NN classifies into normal or pathological liver regions. The pathological liver regions are characterized by the second NN as cyst or "other disease." The third NN classifies "other disease" into hemangioma or hepatocellular carcinoma. Three feature selection techniques have been applied to each individual NN: the sequential forward selection, the sequential floating forward selection, and a genetic algorithm for feature selection. The comparative study of the above dimensionality reduction methods shows that genetic algorithms result in lower dimension feature vectors and improved classification performance.
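The two modules described above can be sketched as follows. This is a simplified illustration under stated assumptions: the co-occurrence matrix uses a single horizontal offset, only one texture characteristic (contrast) is computed instead of 48, and the three classifiers passed to `cascade` are hypothetical callables standing in for the trained feed-forward networks.

```python
def glcm(image, levels, dr=0, dc=1):
    """Gray-level co-occurrence counts for pixel pairs separated by (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def contrast(counts):
    """Contrast texture feature: sum of p(i, j) * (i - j)^2 over the normalized matrix."""
    total = sum(sum(row) for row in counts)
    size = len(counts)
    weighted = sum(
        counts[i][j] * (i - j) ** 2 for i in range(size) for j in range(size)
    )
    return weighted / total


def cascade(features, is_pathological, is_cyst, is_hemangioma):
    """Three sequential binary decisions, mirroring the three-network classifier."""
    if not is_pathological(features):
        return "normal liver"
    if is_cyst(features):
        return "hepatic cyst"
    if is_hemangioma(features):
        return "hemangioma"
    return "hepatocellular carcinoma"


# Hypothetical 3x3 region of interest quantized to 3 gray levels.
roi = [[0, 0, 1],
       [1, 2, 2],
       [0, 1, 2]]
print(glcm(roi, levels=3))
print(contrast(glcm(roi, levels=3)))
```

The cascade structure means each network only has to solve a two-class problem, and feature selection can be run separately for each stage, as the abstract describes.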
Abstract:
Correctness of the information gathered in production environments is an essential part of quality assurance processes in many industries. This task is often performed by human operators who visually take annotations at various steps of the production flow. Depending on the task performed, the correlation between exactly where the information is gathered and what it represents is more often than not lost in the process. The lack of labeled data severely limits the application of deep neural networks to object detection tasks; moreover, supervised training of deep models requires a large amount of data. Building an adequately large collection of labeled images through classic data annotation techniques is an exhausting and costly task that is not suitable for every scenario. A possible solution is to generate synthetic data that replicates the real data and use it to fine-tune a deep neural network, trained on one or more source domains, to a different target domain. The purpose of this thesis is to show a real-case scenario in which the provided data were both very scarce and missing the required annotations. A possible approach is then presented in which synthetic data is generated to address those issues and serves as the training base for deep neural networks for object detection capable of working on images taken in production-like environments. Lastly, the thesis compares performance across different types of synthetic data and different convolutional neural networks used as backbones for the model.
Abstract:
The growing availability of 3D scanners has made it easier to acquire 3D models of the environment. Because of the inevitable imperfections and errors that can occur during scanning, the acquired models may sometimes be noisy and unusable. Denoising techniques aim to remove noise-induced disturbances from the surface of the scanned 3D mesh, restoring the original surface characteristics without introducing false information. An innovative approach to this problem is to use Geometric Deep Learning to train a neural network to perform mesh denoising effectively. The objective of this thesis is to describe Geometric Deep Learning in the context of this problem.
Abstract:
Deep learning algorithms form a new set of powerful methods for machine learning. The idea is to combine layers of latent factors into hierarchies. This often entails a higher computational cost and also increases the number of model parameters. Using these methods on larger-scale problems therefore requires reducing their cost and improving their regularization and optimization. This thesis addresses the question from these three perspectives. We first study the problem of reducing the cost of certain deep algorithms. We propose two methods for training restricted Boltzmann machines and denoising auto-encoders on sparse, high-dimensional distributions. This is important for applying these algorithms to natural language processing. Both methods (Dauphin et al., 2011; Dauphin and Bengio, 2013) use importance sampling to sample the objective of these models. We observe that this significantly reduces training time. The speed-up reaches 2 orders of magnitude on several benchmarks. Second, we introduce a powerful regularizer for deep methods. Experimental results show that a good regularizer is crucial for obtaining good performance with large networks (Hinton et al., 2012). In Rifai et al. (2011), we propose a new regularizer that combines unsupervised learning and tangent propagation (Simard et al., 1992). This method exploits geometric principles and achieved state-of-the-art results at the time of publication. Finally, we consider the problem of optimizing high-dimensional non-convex surfaces such as those of neural networks. Traditionally, the abundance of local minima was considered the main difficulty in these problems. 
In Dauphin et al. (2014a), we argue, based on results from statistical physics, random matrix theory, and neural network theory, as well as experimental results, that a deeper difficulty arises from the proliferation of saddle points. In that paper we also propose a new method for non-convex optimization.
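A toy example shows why saddle points can slow down gradient descent. The function f(x, y) = x^2 - y^2 and the step size below are assumptions chosen for illustration; the sketch uses plain gradient descent and does not reproduce the optimization method proposed in the thesis.

```python
def grad(x, y):
    """Gradient of f(x, y) = x**2 - y**2, which has a saddle point at the origin."""
    return 2.0 * x, -2.0 * y


def descend(x, y, lr=0.1, steps=50):
    """Plain gradient descent for a fixed number of steps."""
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - lr * gx, y - lr * gy
    return x, y


# Started exactly on the x-axis, descent converges to the saddle and stays there.
on_axis = descend(1.0, 0.0)
# A tiny perturbation in y eventually escapes, but only after many steps,
# because the gradient is almost zero close to the saddle.
perturbed = descend(1.0, 1e-6)
print(on_axis, perturbed)
```

In this toy case the descent direction exists (along y) but the gradient there is tiny, so progress is slow even though the point is not a local minimum.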
Abstract:
Neural networks have emerged as the topic of the day. Their applications range from ECG noise filtering to seismic data analysis and from elementary particle detection to electronic music composition. The focal point of the proposed work is the application of a massively parallel connectionist network to the detection of a sonar target. The task is divided into: (i) generation of training patterns, for teaching the network, from sea noise that contains the radiated noise of a target; (ii) selection of a suitable network topology and learning algorithm; and (iii) training of the network and its subsequent testing, in which the network detects the presence of the features it has already learned in unknown patterns applied to it. A three-layer perceptron using backpropagation learning is first trained repeatedly on example patterns (derived from ambient sea noise with and without the radiated noise of a target). On every presentation, the error in the network output is propagated back, and the weights and bias associated with each neuron are modified in proportion to this error measure. During this iterative process, the network converges and extracts the target features, which become encoded in its weights and biases. In every unknown pattern that the converged network subsequently encounters, it searches for the features already learned and outputs an indication of their presence or absence. This capability for target detection is shown by the response of the network to various test patterns. Three network topologies are tried with two variants of backpropagation learning, and the performance of each combination is then graded.
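The training loop described above (present a pattern, propagate the output error back, adjust weights and biases) can be sketched for a tiny network. The 2-2-1 layout, the sigmoid units, the initial weights, the learning rate, and the two made-up patterns are all assumptions for illustration; the abstract does not give the topologies or the sonar features that were actually used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(net, x):
    """Return hidden activations and the single output for input pattern x."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
        for ws, b in zip(net["w_hidden"], net["b_hidden"])
    ]
    output = sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"])
    return hidden, output


def backprop_step(net, x, target, lr):
    """One presentation: propagate the output error back and update all parameters."""
    hidden, output = forward(net, x)
    delta_out = (output - target) * output * (1.0 - output)
    # Hidden deltas are computed from the output weights before they are changed.
    delta_hidden = [
        delta_out * w * h * (1.0 - h) for w, h in zip(net["w_out"], hidden)
    ]
    for j, h in enumerate(hidden):
        net["w_out"][j] -= lr * delta_out * h
        net["b_hidden"][j] -= lr * delta_hidden[j]
        for i, xi in enumerate(x):
            net["w_hidden"][j][i] -= lr * delta_hidden[j] * xi
    net["b_out"] -= lr * delta_out


def total_error(net, patterns):
    """Sum of squared errors (halved) over all patterns."""
    return sum(0.5 * (forward(net, x)[1] - t) ** 2 for x, t in patterns)


# Hypothetical network and patterns: target 1.0 means "target present".
net = {
    "w_hidden": [[0.1, -0.2], [0.3, 0.4]],
    "b_hidden": [0.0, 0.0],
    "w_out": [0.2, -0.1],
    "b_out": 0.0,
}
patterns = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]

error_before = total_error(net, patterns)
for _ in range(2000):
    for x, t in patterns:
        backprop_step(net, x, t, lr=0.5)
error_after = total_error(net, patterns)
print(error_before, error_after)
```

After training, the learned features live only in the weights and biases, which is what allows the converged network to be applied to patterns it has not seen.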
Abstract:
The foundation construction process is a key factor in a successful construction project. Among the many deep excavation methods, the diaphragm wall method is used more frequently in Taiwan than anywhere else in the world. Traditionally, the construction sequence of diaphragm wall units is established phase by phase using heuristics. However, this can cause conflicts between the final phase of construction and unit construction, and it affects the planned construction time. To avoid this situation, we apply management science to diaphragm wall unit construction and formulate it as a multi-objective combinatorial optimization problem. Because the mathematical model is multi-objective and subject to combinatorial explosion (it belongs to the class of NP-complete problems), we use the 2-type Self-Learning Neural Network (SLNN) to solve the construction sequencing problem for N = 12, 24, and 36 diaphragm wall units. To assess the reliability of the results, the SLNN is compared with a random search method. The tests show that the SLNN outperforms random search in both solution quality and solving efficiency.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
This article proposes an implementation of automated structural health monitoring based on vibration measurements. The work presents an alternative approach whose intent is to exploit model updating techniques combined with neural networks in order to automate fault detection. The updating procedure supplies a reliable model that makes it possible to simulate any damage condition and thus to establish a direct correlation between faults and deviations in the response of the model. The ability of neural networks to recognize, from known signatures, changes in the actual data of a model in real time is explored to investigate changes in the actual operating conditions of the system. The network is trained using a compressed spectrum signal created for each specific type of fault. Different fault conditions for a frame structure are evaluated using simulated data as well as measured experimental data.
Abstract:
Quantitative characterisation of carotid atherosclerosis and its classification as symptomatic or asymptomatic is crucial in planning optimal treatment of atheromatous plaque. The computer-aided diagnosis (CAD) system described in this paper can analyse ultrasound (US) images of the carotid artery and classify them as symptomatic or asymptomatic based on their echogenicity characteristics. The CAD system consists of three modules: a) the feature extraction module, where first-order statistical (FOS) features and Laws' texture energy are estimated, b) the dimensionality reduction module, where the number of features is reduced using analysis of variance (ANOVA), and c) the classifier module, consisting of a neural network (NN) trained by a novel hybrid method based on genetic algorithms (GAs) along with the back propagation algorithm. The hybrid method is able to select the most robust features, to adjust the NN architecture automatically and to optimise the classification performance. The performance is measured by the accuracy, sensitivity, specificity and the area under the receiver-operating characteristic (ROC) curve. The CAD design and development are based on images from 54 symptomatic and 54 asymptomatic plaques. This study demonstrates the ability of a CAD system based on US image analysis and a hybrid trained NN to identify atheromatous plaques at high risk of stroke.
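The first two modules can be sketched briefly: computing first-order statistics from pixel intensities, then scoring a feature with a one-way ANOVA F statistic so that weakly discriminating features can be dropped. The function names and the toy numbers are assumptions for illustration, and the selection rule actually used in the paper is not specified in the abstract.

```python
def first_order_stats(pixels):
    """Mean, variance and skewness of a flat list of gray-level values."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = variance ** 0.5
    if std > 0:
        skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3)
    else:
        skewness = 0.0
    return {"mean": mean, "variance": variance, "skewness": skewness}


def anova_f(groups):
    """One-way ANOVA F statistic for one feature measured in several groups.

    Assumes at least two groups and some variation inside the groups.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Toy values of one feature for two groups of plaques (made-up numbers).
symptomatic = [1.0, 2.0, 3.0]
asymptomatic = [4.0, 5.0, 6.0]
print(anova_f([symptomatic, asymptomatic]))  # 13.5
print(first_order_stats([1.0, 2.0, 3.0]))
```

A feature with a large F value separates the two groups well relative to its spread within each group, so ranking features by F is one simple way to reduce dimensionality before training the classifier.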