823 results for Convolutional Neural Network (CNN)


Relevance:

100.00%

Publisher:

Abstract:

Sea-level variations have a significant impact on coastal areas, and predicting them is among the most critical information needs associated with the marine environment. Various methods exist for this purpose. In this study, carried out on the northern coast of the Persian Gulf, the influence of local parameters such as pressure, temperature and wind speed on sea level, together with global parameters such as the North Atlantic Oscillation (NAO) index, is examined, and statistical models for sea-level prediction are presented. In the next step, an artificial neural network is used to predict sea level for the first time in this region, and the results of the two approaches are compared. The statistical models yield a correlation coefficient of R = 0.84 and a root mean square error (RMSE) of 21.9 cm for the Bushehr station, and R = 0.85 with an RMSE of 48.4 cm for the Rajai station. The neural network, for which a 4-layer architecture with six neurons in each hidden layer proved best, produces more reliable results, with R = 0.90126 and an RMSE of 13.7 cm for the Bushehr station, and R = 0.93916 and an RMSE of 22.6 cm for the Rajai station. The proposed methodology could therefore be successfully used in the study area.
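
As a purely illustrative sketch (not the study's implementation), a 4-layer feed-forward network with two hidden layers of six neurons, matching the architecture described above, could be set up as follows; the feature matrix (pressure, temperature, wind speed, NAO index) and the sea-level target are hypothetical placeholders.

```python
# Illustrative sketch only: a 4-layer MLP (input, two hidden layers of 6
# neurons, output) for sea-level regression, as described in the abstract.
# Feature/target arrays are hypothetical placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
# Hypothetical inputs: pressure, temperature, wind speed, NAO index
X = rng.normal(size=(1000, 4))
y = rng.normal(size=1000)          # sea-level anomaly (cm), placeholder

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

model = MLPRegressor(hidden_layer_sizes=(6, 6), activation="tanh",
                     max_iter=2000, random_state=0)
model.fit(X_tr, y_tr)

pred = model.predict(X_te)
rmse = np.sqrt(mean_squared_error(y_te, pred))
r = np.corrcoef(y_te, pred)[0, 1]
print(f"RMSE = {rmse:.1f} cm, R = {r:.3f}")
```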

Relevance:

100.00%

Publisher:

Abstract:

Prostate cancer is the most common non-dermatological cancer among men in the developed world. The current definitive diagnosis is core needle biopsy guided by transrectal ultrasound. However, this method suffers from low sensitivity and specificity in detecting cancer. Recently, a new ultrasound-based tissue typing approach has been proposed, known as temporal enhanced ultrasound (TeUS). In this approach, a set of temporal ultrasound frames is collected from a stationary tissue location without any intentional mechanical excitation. The main aim of this thesis is to implement a deep learning-based solution for prostate cancer detection and grading using TeUS data. In the proposed solution, convolutional neural networks are trained to extract high-level features from time-domain TeUS data in temporally and spatially adjacent frames in nine in vivo prostatectomy cases. This approach avoids information loss due to feature extraction and also improves the cancer detection rate. The output likelihoods of two TeUS arrangements are then combined to form our novel decision support system. This deep learning-based approach yields an area under the receiver operating characteristic curve (AUC) of 0.80 and 0.73 for prostate cancer detection and grading, respectively, in leave-one-patient-out cross-validation. Recently, multi-parametric magnetic resonance imaging (mp-MRI) has been utilized to improve the detection rate of aggressive prostate cancer. In this thesis, for the first time, we present the fusion of mp-MRI and TeUS for characterization of prostate cancer, to compensate for the deficiencies of each imaging modality and improve the cancer detection rate. The results obtained using TeUS are fused with those attained using consolidated mp-MRI maps from multiple MR modalities and cancer delineations made on them by multiple clinicians. The proposed fusion approach yields an AUC of 0.86 for prostate cancer detection. The outcomes of this thesis emphasize the viable potential of TeUS as a tissue typing method. Employing this ultrasound-based technique, which is non-invasive and inexpensive, can be a valuable and practical addition to enhance current prostate cancer detection.
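
A minimal sketch of the kind of 1D convolutional network that could extract features along the time axis of TeUS data is given below; the frame count, layer sizes, and binary output are assumptions for illustration, not the thesis architecture.

```python
# Minimal PyTorch sketch (assumption, not the thesis implementation):
# a 1D CNN that maps a TeUS time series at one tissue location to a
# cancer-likelihood score.
import torch
import torch.nn as nn

class TeUSCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(32, 1)

    def forward(self, x):            # x: (batch, 1, n_frames)
        h = self.features(x).squeeze(-1)
        return self.classifier(h)    # raw logit; apply sigmoid for a likelihood

model = TeUSCNN()
dummy = torch.randn(8, 1, 100)       # 8 locations, 100 temporal frames each
print(torch.sigmoid(model(dummy)).shape)   # torch.Size([8, 1])
```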

Relevance:

100.00%

Publisher:

Abstract:

In this thesis, the problem of controlling a quadrotor UAV is considered. This is done by presenting an original control system, designed as a combination of Neural Networks and a Disturbance Observer and using a composite learning approach for a second-order system, which is a novel methodology in the literature. After a brief introduction to quadrotors, the concepts needed to understand the controller are presented, such as the main notions of advanced control, the basic structure and design of a Neural Network, and the modeling of a quadrotor and its dynamics. The full simulator, developed in the MATLAB Simulink environment and used throughout the whole thesis, is also shown. For guidance and control purposes, a Sliding Mode Controller, used as a reference, is first introduced, and its theory and implementation in the simulator are illustrated. Finally, the original controller is introduced through its novel formulation and its implementation in the model. The effectiveness and robustness of the two controllers are then proven by extensive simulations under different conditions of external disturbance and faults.
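
To illustrate the baseline mentioned above, here is a toy sliding-mode controller for a double-integrator plant under a constant disturbance; the gains, plant, and disturbance are assumptions, and this is not the thesis controller or simulator.

```python
# Toy illustration (not the thesis controller): a sliding-mode controller for a
# double-integrator plant, the kind of reference baseline referred to above.
import numpy as np

def smc_step(pos, vel, ref, lam=2.0, k=5.0):
    """Control for x'' = u + d with sliding surface s = e_dot + lam * e."""
    e, e_dot = pos - ref, vel
    s = e_dot + lam * e
    return -lam * e_dot - k * np.sign(s)      # drive s to zero despite disturbances

# Simulate tracking of a constant reference under a constant unknown disturbance.
pos, vel, ref, dt = 1.0, 0.0, 0.0, 0.001
for _ in range(5000):
    u = smc_step(pos, vel, ref)
    acc = u + 0.5                             # unknown disturbance d = 0.5
    vel += acc * dt
    pos += vel * dt
print(f"final position error: {pos - ref:.3f}")
```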

Relevance:

100.00%

Publisher:

Abstract:

Resolution of multisensory deficits has been observed in teenagers with Autism Spectrum Disorders (ASD) for complex, social speech stimuli; this resolution extends to more basic multisensory processing, involving low-level stimuli. In particular, a delayed transition of multisensory integration (MSI) from a default state of competition to one of facilitation has been observed in ASD children. In other words, the complete maturation of MSI is achieved later in ASD. In the present study, a neuro-computational model is used to reproduce some patterns of behavior observed experimentally, modeling a bisensory reaction time task in which auditory and visual stimuli are presented in random sequence alone (A or V) or together (AV). The model explains how the default competitive state can be implemented via mutual inhibition between primary sensory areas, and how the shift toward the classical multisensory facilitation observed in adults results from inhibitory cross-modal connections becoming excitatory during development. Model results are consistent with a stronger cross-modal inhibition in ASD children, compared to normotypical (NT) ones, suggesting that the transition toward a cooperative interaction between sensory modalities takes longer to occur. Interestingly, the model also predicts the difference between unisensory switch trials (in which the sensory modality switches) and unisensory repeat trials (in which the sensory modality repeats). This is due to an inhibitory mechanism with slow dynamics, driven by the preceding stimulus, which inhibits the processing of the incoming one when it belongs to the opposite sensory modality. These findings link the cognitive framework delineated by the empirical results to a plausible neural implementation.
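
A toy two-unit rate model, not the published neuro-computational model, can illustrate the mechanism: a negative cross-modal coupling (the assumed immature, ASD-like state) suppresses the bisensory response, while a positive coupling (the mature state) facilitates it. All parameter values are assumptions.

```python
# Toy illustration (not the published model): two sensory units (A, V) with a
# cross-modal coupling w. Negative w (inhibition, immature/ASD-like state)
# suppresses the bisensory response; positive w (excitation, mature state)
# facilitates it.
import numpy as np

def simulate(w_cross, input_a=1.0, input_v=1.0, dt=1.0, steps=300, tau=20.0):
    a = v = 0.0
    for _ in range(steps):
        a += dt / tau * (-a + max(input_a + w_cross * v, 0.0))
        v += dt / tau * (-v + max(input_v + w_cross * a, 0.0))
    return a, v

for w in (-0.5, 0.0, 0.5):
    a, v = simulate(w)
    print(f"w = {w:+.1f}  steady-state responses: A = {a:.2f}, V = {v:.2f}")
```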

Relevance:

100.00%

Publisher:

Abstract:

In recent years, machine learning has gained increasing popularity in scientific research and its applications. The purpose of this thesis was to study machine learning in its general aspects and to apply it to computer vision problems. The thesis addressed the difficulty of explaining, from a theoretical point of view, the algorithms underlying convolutional neural networks, and then tackled two concrete image recognition problems: the MNIST dataset (images of handwritten digits) and a dataset referred to as the "MELANOMA dataset" (images of melanomas and healthy nevi). Using the techniques explained in the theoretical section, satisfactory results were obtained for both datasets, reaching an accuracy of 98% on MNIST and 76.8% on the MELANOMA dataset.
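
For illustration only (this is not the thesis code), a small CNN of the kind typically used to reach roughly 98% accuracy on MNIST could be defined in PyTorch as follows; the layer sizes are assumptions.

```python
# Illustrative PyTorch sketch (not the thesis code): a small CNN of the kind
# typically used for MNIST digit classification.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 28 -> 14
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 14 -> 7
            nn.Flatten(),
            nn.Linear(32 * 7 * 7, 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, x):           # x: (batch, 1, 28, 28)
        return self.net(x)

model = SmallCNN()
logits = model(torch.randn(4, 1, 28, 28))
print(logits.shape)                 # torch.Size([4, 10])
```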

Relevance:

100.00%

Publisher:

Abstract:

Pervasive and distributed Internet of Things (IoT) devices demand ubiquitous coverage, even beyond no-man's land. To provide the plethora of IoT devices with resilient connectivity, Non-Terrestrial Networks (NTN) will be pivotal in assisting and complementing terrestrial systems. In a massive MTC scenario over NTN, characterized by sporadic uplink data reports, all the terminals within a satellite beam must be served during the short visibility window of the flying platform, which generates congestion due to simultaneous access attempts of IoT devices on the same radio resource. The more terminals collide, the longer the average time needed to complete an access, because of the reduced number of successful attempts caused by the back-off commands of legacy methods. A possible countermeasure is the Non-Orthogonal Multiple Access (NOMA) scheme, which requires knowledge of the number of superimposed NPRACH preambles. This work addresses this problem by proposing a Neural Network (NN) algorithm to cope with the uncoordinated random access performed by a prodigious number of Narrowband-IoT devices. Our proposed method classifies the number of colliding users and estimates the Time of Arrival (ToA) for each of them. The performance assessment, under Line-of-Sight (LoS) and Non-LoS conditions in sub-urban environments with two different satellite configurations, shows significant benefits of the proposed NN algorithm with respect to traditional methods for ToA estimation.
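
As an illustration of the idea, not the paper's architecture, a network with a classification head for the number of colliding users and a regression head for one ToA per user slot might look like the following sketch; the input dimensionality and the maximum number of users are assumptions.

```python
# Illustrative sketch (not the paper's architecture): a network that takes a
# received NPRACH preamble observation (here a flat feature vector) and jointly
# classifies the number of colliding users and regresses a ToA per user slot.
import torch
import torch.nn as nn

MAX_USERS = 4          # hypothetical upper bound on superimposed preambles

class CollisionToANet(nn.Module):
    def __init__(self, in_dim: int = 256, hidden: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.count_head = nn.Linear(hidden, MAX_USERS)   # how many users collided
        self.toa_head = nn.Linear(hidden, MAX_USERS)     # one ToA estimate per slot

    def forward(self, x):
        h = self.backbone(x)
        return self.count_head(h), self.toa_head(h)

net = CollisionToANet()
counts, toas = net(torch.randn(16, 256))
print(counts.shape, toas.shape)     # torch.Size([16, 4]) torch.Size([16, 4])
```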

Relevance:

100.00%

Publisher:

Abstract:

This work began with a theoretical study of the main image classification techniques known in the literature, with particular attention to the most widespread image representation models, such as the Bag of Visual Words model, and to the main Machine Learning tools. Attention then focused on what constitutes the state of the art in image classification, namely Deep Learning. To experiment with the advantages of this set of image classification methodologies, Torch7 was used: an open-source numerical computing framework, scriptable through the Lua language, with broad support for state-of-the-art Deep Learning methodologies. The actual image classification was implemented with Torch7 because this framework, thanks also to the analysis previously carried out by some of my colleagues, proved to be very effective at categorizing objects in images. The images used in the experimental tests belong to a dataset created ad hoc for a 3D vision system, with the aim of testing the system for visually impaired and blind people; it contains some of the main obstacles that a visually impaired person may encounter in everyday life. In particular, the dataset consists of potential obstacles relating to a hypothetical outdoor usage scenario. Having therefore established that Torch7 was the tool to use for classification, attention turned to the possibility of exploiting Stereo Vision to increase the accuracy of the classification itself. In fact, the images in the above-mentioned dataset were acquired with a Stereo Camera with FPGA processing developed by the research group where this work was carried out. This made it possible to use 3D information, such as the depth level of every object in the image, to segment the objects of interest through an algorithm implemented in C++, excluding the rest of the scene. The last phase of the work was to test Torch7 on the image dataset, previously segmented with the segmentation algorithm just outlined, in order to recognize the type of obstacle detected by the system.
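
A rough Python sketch of the depth-based segmentation idea is shown below; the thesis used a C++ implementation on stereo/FPGA depth maps, so the depth values, band width, and function name here are purely illustrative assumptions.

```python
# Illustrative sketch (the thesis used a C++ implementation on FPGA-derived
# depth maps): isolating the nearest obstacle in a depth map by keeping only
# pixels within a depth band around the closest valid region, so that the
# classifier sees the object of interest rather than the whole scene.
import numpy as np

def segment_nearest(depth, band=0.5):
    """depth: (H, W) metric depth map; returns a boolean mask of the nearest object."""
    valid = depth > 0                       # 0 = invalid / no disparity
    d_min = depth[valid].min()
    return valid & (depth <= d_min + band)

rng = np.random.default_rng(0)
depth = rng.uniform(3.0, 8.0, size=(120, 160))
depth[40:80, 60:100] = 1.2                  # hypothetical obstacle ~1.2 m away
mask = segment_nearest(depth)
print(mask.sum(), "pixels belong to the nearest obstacle")
```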

Relevance:

100.00%

Publisher:

Abstract:

In product reviews, it is observed that the distribution of polarity ratings over reviews written by different users, or given to different products, is often skewed in the real world. As such, incorporating user and product information would be helpful for the task of sentiment classification of reviews. However, existing approaches ignore the temporal nature of reviews posted by the same user or about the same product. We argue that the temporal relations of reviews might be potentially useful for learning user and product embeddings, and we thus propose employing a sequence model to embed these temporal relations into user and product representations so as to improve the performance of document-level sentiment analysis. Specifically, we first learn a distributed representation of each review using a one-dimensional convolutional neural network. Then, taking these representations as pretrained vectors, we use a recurrent neural network with gated recurrent units to learn distributed representations of users and products. Finally, we feed the user, product and review representations into a machine learning classifier for sentiment classification. Our approach has been evaluated on three large-scale review datasets from IMDB and Yelp. Experimental results show that: (1) sequence modeling for the purpose of distributed user and product representation learning can improve the performance of document-level sentiment classification; (2) the proposed approach achieves state-of-the-art results on these benchmark datasets.
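
A minimal PyTorch sketch of the two stages, a 1D CNN that encodes each review and a GRU that turns a user's review sequence into a user representation, is given below; the embedding sizes and shapes are assumptions, not the authors' settings.

```python
# Illustrative PyTorch sketch (assumed shapes, not the authors' code):
# 1) a 1-D CNN encodes each review from its word embeddings;
# 2) a GRU runs over a user's reviews in temporal order and its final hidden
#    state serves as the user representation.
import torch
import torch.nn as nn

EMB, FILT, HID = 100, 64, 64

review_cnn = nn.Sequential(
    nn.Conv1d(EMB, FILT, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveMaxPool1d(1),
)
user_gru = nn.GRU(input_size=FILT, hidden_size=HID, batch_first=True)

# Hypothetical input: one user, 5 reviews of 20 words each, 100-d embeddings.
reviews = torch.randn(5, EMB, 20)
review_vecs = review_cnn(reviews).squeeze(-1)        # (5, 64): one vector per review
_, user_state = user_gru(review_vecs.unsqueeze(0))   # sequence of the user's reviews
user_embedding = user_state.squeeze(0).squeeze(0)    # (64,): user representation
print(user_embedding.shape)
```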

Relevance:

100.00%

Publisher:

Abstract:

The goal of image retrieval and matching is to find and locate object instances in images from a large-scale image database. While visual features are abundant, how to combine them to improve on the performance of individual features remains a challenging task. In this work, we focus on leveraging multiple features for accurate and efficient image retrieval and matching. We first propose two graph-based approaches to rerank initially retrieved images for generic image retrieval. In the graph, vertices are images while edges are similarities between image pairs. Our first approach employs a mixture Markov model, based on a random walk over multiple graphs, to fuse the graphs. We introduce a probabilistic model to compute the importance of each feature for graph fusion under a naive Bayesian formulation, which requires statistics of similarities from a manually labeled dataset containing irrelevant images. To reduce human labeling, we further propose a fully unsupervised reranking algorithm based on a submodular objective function that can be efficiently optimized by a greedy algorithm. By maximizing an information gain term over the graph, our submodular function favors a subset of database images that are similar to the query images and resemble each other. The function also exploits the rank relationships of images from multiple ranked lists obtained by different features. We then study a more well-defined application, person re-identification, where the database contains labeled images of human bodies captured by multiple cameras. Re-identifications from multiple cameras are regarded as related tasks to exploit shared information. We apply a novel multi-task learning algorithm using both low-level features and attributes. A low-rank attribute embedding is jointly learned within the multi-task learning formulation to embed original binary attributes into a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered. To locate objects in images, we design an object detector based on object proposals and deep convolutional neural networks (CNN), in view of the emergence of deep networks. We improve the Fast RCNN framework and investigate two new strategies to detect objects accurately and efficiently: scale-dependent pooling (SDP) and cascaded rejection classifiers (CRC). SDP improves detection accuracy by exploiting appropriate convolutional features depending on the scale of input object proposals. CRC effectively utilizes convolutional features and greatly eliminates negative proposals in a cascaded manner, while maintaining high recall for true objects. The two strategies together improve detection accuracy and reduce the computational cost.
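
A small numpy sketch of the graph-fusion reranking idea, not the paper's exact formulation, is given below: per-feature similarity graphs are combined with importance weights, and database images are reranked by a random walk with restart at the query node. Graph sizes and weights are hypothetical.

```python
# Illustrative numpy sketch (not the paper's exact formulation): fuse
# per-feature similarity graphs with importance weights, then rerank database
# images by a random walk restarted at the query node (index 0).
import numpy as np

def rerank(sim_graphs, weights, alpha=0.85, iters=100):
    W = sum(w * S for w, S in zip(weights, sim_graphs))   # fused similarity graph
    P = W / W.sum(axis=1, keepdims=True)                  # row-stochastic transitions
    n = P.shape[0]
    restart = np.zeros(n); restart[0] = 1.0               # restart at the query
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = alpha * r @ P + (1 - alpha) * restart
    return np.argsort(-r)                                 # high score = high rank

rng = np.random.default_rng(0)
graphs = [rng.random((6, 6)) for _ in range(2)]           # two feature modalities
print(rerank(graphs, weights=[0.7, 0.3]))
```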

Relevance:

100.00%

Publisher:

Abstract:

Image (video) retrieval is the problem of retrieving images (videos) similar to a query. Images (videos) are represented in an input (feature) space, and similar images (videos) are obtained by finding nearest neighbors in that representation space. Numerous input representations, in both real-valued and binary spaces, have been proposed for conducting faster retrieval. In this thesis, we present techniques that obtain improved input representations for retrieval in both supervised and unsupervised settings for images and videos. Supervised retrieval is the well-known problem of retrieving images of the same class as the query. In the first part, we address the practical aspects of achieving faster retrieval with binary codes as input representations for the supervised setting, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, as similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing (binary encoding) method that aims to explicitly map all the images of the same class, ideally, to a unique binary code. We refer to the binary codes of the images as 'Semantic Binary Codes' and to the unique code for all same-class images as the 'Class Binary Code'. We also propose a new class-based Hamming metric that dramatically reduces retrieval times for larger databases, where the Hamming distance is computed only to the class binary codes. We also propose a deep semantic binary code model, obtained by replacing the output layer of a popular convolutional neural network (AlexNet) with the class binary codes, and show that the hashing functions learned in this way outperform the state of the art while providing fast retrieval times. In the second part, we also address the problem of supervised retrieval by taking into account the relationships between classes. For a given query image, we want to retrieve images that preserve the relative order, i.e., we want to retrieve all same-class images first and then images of related classes before images of different classes. We learn such relationship-aware binary codes by minimizing the difference between the inner product of the binary codes and the similarity between the classes. We calculate the similarity between classes using output embedding vectors, which are vector representations of classes. Our method deviates from other supervised binary encoding schemes as it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that take related-class retrieval results into account and show significant gains over the state of the art. High-dimensional descriptors like Fisher Vectors or the Vector of Locally Aggregated Descriptors have been shown to improve the performance of many computer vision applications, including retrieval. In the third part, we discuss an unsupervised technique for compressing high-dimensional vectors into high-dimensional binary codes, to reduce storage complexity. In this approach, we deviate from adopting traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. The proposed method overcomes the computational challenges of directly applying the spherical hashing algorithm, which is intractable for compressing high-dimensional vectors. A practical hierarchical model that uses divide-and-conquer techniques, via the Random Select and Adjust (RSA) procedure, to compress such high-dimensional vectors is presented. We show that our proposed high-dimensional binary codes outperform the binary codes obtained using traditional hyperplane methods at higher compression ratios. In the last part of the thesis, we propose a retrieval-based solution to the zero-shot event classification problem: a setting where no training videos are available for the event. To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute the similarity between the query event and each video in the concept space, and videos similar to the query event are classified as belonging to the event. We show that we significantly boost performance using concept features from other modalities.
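
As a minimal sketch of the class-based Hamming metric idea (code length, class count, and codes are hypothetical, not the thesis values), retrieval reduces to computing Hamming distances from the query's semantic binary code to the class binary codes only:

```python
# Illustrative sketch (assumed code length and classes, not the thesis code):
# with one 'class binary code' per class, retrieval only needs the Hamming
# distance between the query's semantic binary code and each class code.
import numpy as np

rng = np.random.default_rng(0)
N_CLASSES, BITS = 10, 64
class_codes = rng.integers(0, 2, size=(N_CLASSES, BITS), dtype=np.uint8)

def nearest_class(query_code, class_codes):
    hamming = (query_code[None, :] != class_codes).sum(axis=1)
    return int(hamming.argmin()), hamming

query = rng.integers(0, 2, size=BITS, dtype=np.uint8)
cls, dists = nearest_class(query, class_codes)
print(f"closest class: {cls}, Hamming distances: {dists}")
```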

Relevance:

100.00%

Publisher:

Abstract:

Intersubjectivity is an important concept in psychology and sociology. It refers to sharing conceptualizations through social interactions in a community, and to using such shared conceptualizations as a resource to interpret things that happen in everyday life. In this work, we make use of intersubjectivity as the basis for modeling shared stance and subjectivity for sentiment analysis. We construct an intersubjectivity network which links review writers, the terms they used, and the polarities of those terms. Based on this network model, we propose a method to learn writer embeddings, which are subsequently incorporated into a convolutional neural network for sentiment analysis. Evaluations on the IMDB, Yelp 2013 and Yelp 2014 datasets show that the proposed approach achieves state-of-the-art performance.
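
A small sketch of how such an intersubjectivity network could be represented, with writers and terms as nodes, term polarity as a node attribute, and usage counts as edge weights, is given below; the structure and the networkx representation are assumptions, not the paper's construction.

```python
# Illustrative sketch (assumed structure, not the paper's construction): an
# intersubjectivity graph linking review writers to the terms they use, with
# term polarity stored on the term nodes; node embeddings learned on such a
# graph can then feed a CNN sentiment classifier.
import networkx as nx

g = nx.Graph()
g.add_node("writer_1", kind="writer")
g.add_node("writer_2", kind="writer")
g.add_node("excellent", kind="term", polarity=+1)
g.add_node("boring", kind="term", polarity=-1)
g.add_edge("writer_1", "excellent", weight=3)   # writer_1 used "excellent" 3 times
g.add_edge("writer_2", "boring", weight=1)
print(g.number_of_nodes(), g.number_of_edges())
```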

Relevance:

100.00%

Publisher:

Abstract:

Over the last two years, because of the pandemic caused by the Covid-19 virus, life in every corner of our planet has changed drastically. To date, more than two hundred and twenty million people worldwide have contracted this virus and almost five million people have died. In some periods there have been as many as one million new infections per day and, on average over the last six months, this figure has exceeded half a million per day. Hospitals, especially in less developed countries, have been under great stress and have often lacked the resources to face this serious pandemic. For this reason, every piece of research in this field becomes extremely important, especially research that, with the help of artificial intelligence, can support physicians. Once developed and approved, these technologies can be distributed at very low cost and made accessible to everyone. In this work, two different approaches to Covid-19 diagnosis from patients' chest X-rays were tested and evaluated: the first method is based on transfer learning of a convolutional network originally designed for image classification. The second approach uses Vision Transformers (ViT), an architecture widely adopted in Natural Language Processing and adapted to Computer Vision tasks. The first solution achieved an accuracy of 0.85 and the second of 0.92; these results, especially the second one, are very encouraging, particularly considering the minimal amount of training data required.
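
A minimal transfer-learning sketch of the first approach, assuming torchvision 0.13 or later for the pretrained-weights API and a ResNet-18 backbone (the thesis does not name the network here), could look as follows:

```python
# Illustrative PyTorch sketch (not the thesis code): transfer learning from an
# ImageNet-pretrained CNN to a binary Covid / non-Covid chest X-ray classifier.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():          # freeze the pretrained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)   # new trainable classification head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# Hypothetical mini-batch of 3-channel X-ray crops and labels.
x, y = torch.randn(8, 3, 224, 224), torch.randint(0, 2, (8,))
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
print(float(loss))
```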

Relevance:

100.00%

Publisher:

Abstract:

The main goal of the Airborne project is to develop, at technology readiness level 8 (TRL8), a few selected robotic aerial technologies for the quick localization of avalanche victims, by equipping drones with the two forefront sensors used in avalanche SAR operations, namely ARVA and RECCO. This thesis focuses on the design, development, and guidance of the TRL8 quadrotor developed during the project. We present and describe the design method that allowed us to obtain an EMI-shielded UAV capable of integrating both the RECCO and ARVA sensors. In addition, the avionics and power-train design and building procedure are presented, yielding a modular UAV frame that can be easily carried by rescuers and achieves all the performance benchmarks of the project. Among the onboard algorithms, a multivariate regression convolutional neural network whose goal is the localization of the ARVA signal is presented. Regarding guidance, the automatic flight procedure is described and the onboard waypoint-generator algorithm is presented. The goal of this algorithm is the generation and execution of an automatic grid pattern without the need to know the map in advance and without the support of a control ground station (CGS). Moreover, we present an iterative trajectory planner that does not need prior knowledge of the map and uses Bézier curves to produce optimal, dynamically feasible, safe, and re-plannable trajectories. The goal is to develop a method that allows local and fast replanning in case an obstacle pops up or some waypoints change, which makes the novel planner suitable for SAR operations. The final version of the quadrotor is validated by internal flight tests and by field tests performed in real operational scenarios with the Club Alpino Italiano (CAI).
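
A short sketch of evaluating a Bézier segment between waypoints with the Bernstein basis is given below; the control points are hypothetical and this is not the planner's implementation.

```python
# Illustrative sketch (not the planner's code): evaluating a Bézier segment
# between waypoints with the Bernstein form, the building block of smooth,
# re-plannable trajectories.
import numpy as np
from math import comb

def bezier(control_points, t):
    """Evaluate a Bézier curve of arbitrary degree at parameter values t in [0, 1]."""
    pts = np.asarray(control_points, dtype=float)       # (n+1, dim) control points
    n = len(pts) - 1
    t = np.atleast_1d(t)
    basis = np.stack([comb(n, i) * t**i * (1 - t)**(n - i)
                      for i in range(n + 1)], axis=1)    # Bernstein basis, (len(t), n+1)
    return basis @ pts                                   # points on the curve

ctrl = [(0, 0), (1, 2), (3, 2), (4, 0)]                  # hypothetical waypoint segment
print(bezier(ctrl, np.linspace(0, 1, 5)))
```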

Relevance:

100.00%

Publisher:

Abstract:

This thesis focuses on automating the time-consuming task of manually counting activated neurons in fluorescence microscopy images, which is used to study the mechanisms underlying torpor. The traditional method of manual annotation can introduce bias and delay the outcome of experiments, so the author investigates a deep-learning-based procedure to automate this task. The author explores two of the main state-of-the-art convolutional neural network (CNN) architectures, the UNet and the ResUnet family of models, and uses a counting-by-segmentation strategy to provide a justification of the objects considered during the counting process. The author also explores a weakly supervised learning strategy that exploits only dot annotations. The author quantifies the advantages, in terms of data reduction and counting-performance boost, obtainable with a transfer-learning approach and, specifically, with a fine-tuning procedure. The author released the dataset used for the supervised use case and all the pre-trained models, and designed a web application to share both the counting pipeline developed in this work and the models pre-trained on the analyzed dataset.
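
A minimal sketch of the counting-by-segmentation step is shown below: once a network such as a UNet has produced a probability map, counting reduces to thresholding and labeling connected components. The threshold, minimum area, and toy probability map are assumptions, not the thesis settings.

```python
# Illustrative sketch (not the thesis pipeline): once a segmentation network
# has produced a probability map, "counting by segmentation" reduces to
# thresholding the map and counting connected components.
import numpy as np
from scipy import ndimage

def count_neurons(prob_map, threshold=0.5, min_area=5):
    mask = prob_map > threshold
    labels, n = ndimage.label(mask)                 # connected components
    areas = ndimage.sum(mask, labels, range(1, n + 1))
    return int(np.sum(areas >= min_area))           # drop tiny spurious blobs

# Hypothetical probability map with two bright "cells".
prob = np.zeros((64, 64))
prob[10:18, 10:18] = 0.9
prob[40:46, 30:36] = 0.8
print(count_neurons(prob))                          # -> 2
```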

Relevance:

100.00%

Publisher:

Abstract:

Recent developments in the field of artificial intelligence have allowed a more adequate classification of the EEG signal. In recent years it has been shown that excellent classification performance can be obtained using Machine Learning (ML) and Deep Learning (DL) techniques, with the latter relying on convolutional neural networks (Convolutional Neural Networks, CNN). In particular, Deep Learning requires a lot of training data, while EEG datasets are often limited, making it difficult to reach high performance. Data Augmentation methods can alleviate this problem. Starting from real data, this technique allows the creation of artificial data that are essential to increase the size of the original dataset. The most common application is to use Data Augmentation to enlarge the training set, so that the model/neural network is trained on a larger number of samples, reducing classification errors. Starting from this idea, Data Augmentation has been applied in many fields, and in particular to EEG signal classification. In this thesis, Data Augmentation methods developed over the years and applicable to EEG applications are first described. Then, some specific studies are presented that apply Data Augmentation methods to improve the performance of EEG-based classifiers for sleep/wake state identification, emotion recognition, and motor imagery classification.
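
As an illustration, not code from the cited studies, two simple augmentations often applied to EEG epochs, additive Gaussian noise and a random circular time shift, could be implemented as follows; the channel count, sampling rate, and parameter values are assumptions.

```python
# Illustrative sketch (not from the cited studies): two simple augmentations
# often applied to EEG epochs, additive Gaussian noise and a random circular
# time shift, used to enlarge the training set.
import numpy as np

rng = np.random.default_rng(0)

def augment_epoch(epoch, noise_std=0.05, max_shift=25):
    """epoch: array of shape (channels, samples)."""
    noisy = epoch + rng.normal(0.0, noise_std, size=epoch.shape)
    shift = rng.integers(-max_shift, max_shift + 1)
    return np.roll(noisy, shift, axis=-1)

# Hypothetical epoch: 8 channels, 250 samples (1 s at 250 Hz).
epoch = rng.standard_normal((8, 250))
augmented = np.stack([augment_epoch(epoch) for _ in range(4)])
print(augmented.shape)                               # (4, 8, 250)
```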