783 results for Neural Network-models
Abstract:
This dissertation introduces a new approach for assessing the effects of pediatric epilepsy on the language connectome. Two novel data-driven network construction approaches are presented. These methods rely on connecting different brain regions using either the extent or the intensity of language-related activations, as identified by independent component analysis of fMRI data. An auditory description decision task (ADDT) paradigm was used to activate the language network for 29 patients and 30 controls recruited from three major pediatric hospitals. Empirical evaluations illustrated that pediatric epilepsy can cause, or is associated with, a network efficiency reduction. Patients showed a propensity to inefficiently employ the whole brain network to perform the ADDT language task; in contrast, controls seemed to efficiently use smaller segregated network components to achieve the same task. To explain the causes of the decreased efficiency, graph theoretical analysis was carried out. The analysis revealed no substantial global network feature differences between the patient and control groups. It also showed that for both subject groups the language network exhibited small-world characteristics; however, the patients' extent-of-activation network showed a tendency towards more random networks. It was also shown that the intensity-of-activation network displayed ipsilateral hub reorganization on the local level. The left hemispheric hubs displayed greater centrality values for patients, whereas the right hemispheric hubs displayed greater centrality values for controls. This hub hemispheric disparity was not correlated with a right atypical language laterality found in six patients. Finally, it was shown that a multi-level unsupervised clustering scheme based on self-organizing maps, a type of artificial neural network, and k-means was able to fairly and blindly separate the subjects into their respective patient or control groups.
The clustering was initiated using the local nodal centrality measurements only. Compared to the extent-of-activation network, the intensity-of-activation network clustering demonstrated better precision. This outcome supports the assertion that the local centrality differences presented by the intensity-of-activation network can be associated with focal epilepsy.
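To make "network efficiency" concrete, here is a minimal sketch of one common graph-theoretical definition, global efficiency (the mean inverse shortest-path length over all node pairs). The abstract does not state which efficiency measure the dissertation uses, so this choice, the function names, and the toy graphs are all assumptions made for illustration.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search hop counts from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def global_efficiency(adj):
    """Mean of 1/d(i, j) over all ordered pairs of distinct nodes.

    Unreachable pairs contribute 0, as in the usual definition.
    """
    nodes = list(adj)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for source in nodes:
        dist = shortest_path_lengths(adj, source)
        total += sum(1.0 / d for target, d in dist.items() if target != source)
    return total / (n * (n - 1))


# Hypothetical toy graphs (adjacency lists), not data from the study.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
complete_graph = {i: [j for j in range(4) if j != i] for i in range(4)}

print(global_efficiency(path_graph))
print(global_efficiency(complete_graph))
```

A fully connected graph reaches the maximum efficiency of 1, while sparser or more chain-like graphs score lower.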
Abstract:
A recommender system is a specific type of intelligent system that exploits historical user ratings on items and/or auxiliary information to make item recommendations to users. It plays a critical role in a wide range of online shopping, e-commerce services and social networking applications. Collaborative filtering (CF) is the most popular approach used for recommender systems, but it suffers from the complete cold start (CCS) problem, where no rating records are available, and the incomplete cold start (ICS) problem, where only a small number of rating records are available for some new items or users in the system. In this paper, we propose two recommendation models to solve the CCS and ICS problems for new items, which are based on a framework that tightly couples a CF approach with a deep learning neural network. A specific deep neural network, SADE, is used to extract the content features of the items. The state-of-the-art CF model, timeSVD++, which models and utilizes temporal dynamics of user preferences and item features, is modified to take the content features into account when predicting ratings for cold start items. Extensive experiments on a large Netflix rating dataset of movies are performed, which show that our proposed recommendation models largely outperform the baseline models for rating prediction of cold start items. The two proposed recommendation models are also evaluated and compared on ICS items, and a flexible scheme of model retraining and switching is proposed to deal with the transition of items from cold start to non-cold start status. The experimental results on Netflix movie recommendation show that the tight coupling of the CF approach and the deep learning neural network is feasible and very effective for cold start item recommendation. The design is general and can be applied to many other recommender systems for online shopping and social networking applications.
Solving the cold start item problem can substantially improve user experience and trust in recommender systems, and effectively promote cold start items.
Abstract:
Recommender systems (RS) are used by many social networking applications and online e-commerce services. Collaborative filtering (CF) is one of the most popular approaches used for RS. However, the traditional CF approach suffers from sparsity and cold start problems. In this paper, we propose a hybrid recommendation model to address the cold start problem, which explores the item content features learned from a deep learning neural network and applies them to the timeSVD++ CF model. Extensive experiments are run on a large Netflix rating dataset for movies. Experimental results show that the proposed hybrid recommendation model provides a good prediction for cold start items, and performs better than four existing recommendation models for rating of non-cold start items.
Abstract:
In this paper, a real-time optimal control technique for non-linear plants is proposed. The control system makes use of the cell-mapping (CM) techniques, widely used for the global analysis of highly non-linear systems. The CM framework is employed for designing approximate optimal controllers via a control variable discretization. Furthermore, CM-based designs can be improved by the use of supervised feedforward artificial neural networks (ANNs), which have proved to be universal and efficient tools for function approximation, providing also very fast responses. The quantitative nature of the approximate CM solutions fits very well with ANNs characteristics. Here, we propose several control architectures which combine, in a different manner, supervised neural networks and CM control algorithms. On the one hand, different CM control laws computed for various target objectives can be employed for training a neural network, explicitly including the target information in the input vectors. This way, tracking problems, in addition to regulation ones, can be addressed in a fast and unified manner, obtaining smooth, averaged and global feedback control laws. On the other hand, adjoining CM and ANNs are also combined into a hybrid architecture to address problems where accuracy and real-time response are critical. Finally, some optimal control problems are solved with the proposed CM, neural and hybrid techniques, illustrating their good performance.
Abstract:
This paper presents a methodology for short-term load forecasting based on genetic algorithm feature selection and artificial neural network modeling. A feedforward artificial neural network is used to model the 24-h-ahead load based on past consumption, weather and stock index data. A genetic algorithm is used to find the best subset of variables for modeling. Three data sets from different geographical locations, encompassing areas of different dimensions with distinct load profiles, are used to evaluate the methodology. The developed approach was found to generate models achieving a minimum mean average percentage error under 2%. The feature selection algorithm was able to significantly reduce the number of features used and increase the accuracy of the models.
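The abstract reports a "mean average percentage error"; assuming this refers to the usual mean absolute percentage error (MAPE), the metric can be sketched as below. The function name and the load values are hypothetical and are not taken from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes both sequences have the same length and no actual value is zero.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical hourly loads (MW) and their forecasts.
actual_load = [100.0, 200.0, 400.0]
forecast_load = [98.0, 204.0, 396.0]

print(mape(actual_load, forecast_load))
```

Under this definition, a value under 2% means the forecasts deviate from the measured load by less than 2% on average.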
Abstract:
Digital soil mapping is an alternative for the recognition of soil classes in areas where pedological surveys are not available. The main aim of this study was to obtain a digital soil map using artificial neural networks (ANN) and environmental variables that express soil-landscape relationships. This study was carried out in an area of 11,072 ha located in the Barra Bonita municipality, state of São Paulo, Brazil. A soil survey was obtained from a reference area of approximately 500 ha located in the center of the area studied. With the mapping units identified together with the environmental variables elevation, slope, slope plan, slope profile, convergence index, geology and geomorphic surfaces, a supervised classification by ANN was implemented. The neural network simulator used was the Java NNS with the learning algorithm "back propagation." Reference points were collected for evaluating the performance of the digital map produced. The occurrence of soils in the landscape obtained in the reference area was observed in the following digital classification: medium-textured soils at the highest positions of the landscape, originating from sandstone, and clayey loam soils in the end thirds of the hillsides due to the greater presence of basalt. The variables elevation and slope were the most important factors for discriminating soil class through the ANN. An accuracy level of 82% between the reference points and the digital classification was observed. The methodology proposed allowed for a preliminary soil classification of an area not previously mapped using mapping units obtained in a reference area.
Abstract:
Combinatorial optimization problems are typically tackled by the branch-and-bound paradigm. We propose to learn a variable selection policy for branch-and-bound in mixed-integer linear programming, by imitation learning on a diversified variant of the strong branching expert rule. We encode states as bipartite graphs and parameterize the policy as a graph convolutional neural network. Experiments on a series of synthetic problems demonstrate that our approach produces policies that can improve upon expert-designed branching rules on large problems, and generalize to instances significantly larger than seen during training.
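As a rough illustration of encoding a mixed-integer linear program as a bipartite graph, the sketch below places one node per constraint, one node per variable, and an edge wherever a variable appears in a constraint with a non-zero coefficient. The node features shown (right-hand side, objective coefficient) are a simplifying assumption; the abstract does not list the features actually used, and the function name is hypothetical.

```python
def milp_to_bipartite(A, b, c):
    """Encode min c^T x subject to A x <= b as a bipartite graph.

    Returns (constraint_nodes, variable_nodes, edges), where each edge is
    (constraint_index, variable_index, coefficient).
    """
    constraint_nodes = [{"rhs": rhs} for rhs in b]
    variable_nodes = [{"obj": coef} for coef in c]
    edges = [
        (i, j, a_ij)
        for i, row in enumerate(A)
        for j, a_ij in enumerate(row)
        if a_ij != 0
    ]
    return constraint_nodes, variable_nodes, edges


# Hypothetical toy problem with 2 constraints and 3 variables.
A = [[1, 0, 2],
     [0, 3, 0]]
b = [4, 5]
c = [1, 1, 1]

cons, variables, edges = milp_to_bipartite(A, b, c)
print(edges)
```

A graph convolutional network can then pass messages along these edges between the constraint side and the variable side before scoring candidate branching variables.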
Abstract:
Acoustic Emission (AE) monitoring can be used to detect the presence of damage as well as determine its location in Structural Health Monitoring (SHM) applications. Information on the time difference of the signal generated by the damage event arriving at different sensors is essential in performing localization. This makes the time of arrival (ToA) an important piece of information to retrieve from the AE signal. Generally, this is determined using statistical methods such as the Akaike Information Criterion (AIC), which is particularly prone to errors in the presence of noise. Given that the structures of interest are surrounded by harsh environments, a way to accurately estimate the arrival time in such noisy scenarios is of particular interest. In this work, two new machine-learning-based methods are presented to estimate the arrival times of AE signals. Inspired by strong results in the field, both are deep learning models, a subset of machine learning, based on a Convolutional Neural Network (CNN) and a Capsule Neural Network (CapsNet), respectively. The primary advantage of such models is that they do not require the user to pre-define selected features; they only require raw data, and the models establish non-linear relationships between the inputs and outputs. The performance of the models is evaluated using AE signals generated by a custom ray-tracing algorithm that propagates them on an aluminium plate, and is compared to that of AIC. It was found that the relative error in estimation on the test set was < 5% for the models, compared to around 45% for AIC. The testing process was continued by preparing an experimental setup and acquiring real AE signals to test on. Similar performance was observed: the two models not only outperformed AIC by more than an order of magnitude in their average errors, but were also shown to be considerably more robust than AIC, which fails in the presence of noise.
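For context on the AIC baseline, the sketch below implements a widely used variance-based AIC onset picker, AIC(k) = k·log(var(x[0:k])) + (N − k − 1)·log(var(x[k:N])), and takes the minimising index as the arrival sample. The abstract does not specify which AIC variant was used, so this formulation, the `eps` guard, and the synthetic signal are assumptions.

```python
import math


def variance(values):
    """Population variance of a non-empty sequence."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def aic_onset(signal, eps=1e-12):
    """Index k minimising k*log(var(x[:k])) + (N-k-1)*log(var(x[k:])).

    `eps` avoids log(0) on perfectly flat segments.
    """
    n = len(signal)
    best_k, best_aic = None, math.inf
    for k in range(2, n - 2):
        aic = (k * math.log(variance(signal[:k]) + eps)
               + (n - k - 1) * math.log(variance(signal[k:]) + eps))
        if aic < best_aic:
            best_k, best_aic = k, aic
    return best_k


# Hypothetical synthetic trace: 50 samples of low-level background,
# followed by a 50-sample burst starting at index 50.
quiet = [0.001 * (-1) ** i for i in range(50)]
burst = [math.sin(0.9 * i + 1.0) for i in range(50)]
trace = quiet + burst

print(aic_onset(trace))
```

On a clean trace like this the minimum falls at the transition between background and burst; as the background noise level approaches the signal amplitude, the two variances become similar and the minimum is much less well defined, which is the weakness the abstract refers to.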
Abstract:
Many real-world datasets, including textual data, can be represented using graph structures. The use of graphs to represent textual data has many advantages, mainly related to maintaining a more significant amount of information, such as the relationships between words and their types. In recent years, many neural network architectures have been proposed to deal with tasks on graphs. Many of them consider only node features, ignoring or not giving proper relevance to the relationships between them. However, in many node classification tasks, these relationships play a fundamental role. This thesis aims to analyze the main GNNs, evaluate their advantages and disadvantages, propose an innovative solution considered as an extension of GAT, and apply them to a case study in the biomedical field. We present the reference GNNs, implemented with methodologies analyzed later, and then apply them to a question answering system in the biomedical field as a replacement for the pre-existing GNN. We attempt to obtain better results by using models that can accept both node and edge features as input. As shown later, our proposed models can beat the original solution and define the state of the art for the task under analysis.
Abstract:
Intelligent systems are now inherent to society, supporting synergistic human-machine collaboration. Beyond economic and climate factors, energy consumption is strongly affected by the performance of computing systems. The quality of software functioning may invalidate any improvement attempt. In addition, data-driven machine learning algorithms are the basis for human-centered applications, and their interpretability is one of the most important features of computational systems. Software maintenance is a critical discipline for supporting automatic and life-long system operation. As most software registers its inner events by means of logs, log analysis is an approach to keeping systems in operation. Logs are characterized as big data assembled in large-flow streams, being unstructured, heterogeneous, imprecise, and uncertain. This thesis addresses fuzzy and neuro-granular methods to provide maintenance solutions applied to anomaly detection (AD) and log parsing (LP), dealing with data uncertainty and identifying ideal time periods for detailed software analyses. LP provides deeper semantic interpretation of the anomalous occurrences. The solutions evolve over time and are general-purpose, being highly applicable, scalable, and maintainable. Granular classification models, namely the Fuzzy set-Based evolving Model (FBeM), the evolving Granular Neural Network (eGNN), and the evolving Gaussian Fuzzy Classifier (eGFC), are compared on the AD problem. The evolving Log Parsing (eLP) method is proposed to address the automatic parsing of system logs. All the methods use recursive mechanisms to create, update, merge, and delete information granules according to the data behavior. For the first time in the evolving intelligent systems literature, the proposed method, eLP, is able to process streams of words and sentences.
Regarding AD accuracy, FBeM achieved (85.64 ± 3.69)%; eGNN reached (96.17 ± 0.78)%; eGFC obtained (92.48 ± 1.21)%; and eLP reached (96.05 ± 1.04)%. Besides being competitive, eLP generates a log grammar and presents a higher level of model interpretability.
Abstract:
The first topic analyzed in the thesis will be Neural Architecture Search (NAS). I will focus on two different tools that I developed, one to optimize the architecture of Temporal Convolutional Networks (TCNs), a convolutional model for time-series processing that has recently emerged, and one to optimize the data precision of tensors inside CNNs. The first NAS proposed explicitly targets the optimization of the most peculiar architectural parameters of TCNs, namely dilation, receptive field, and the number of features in each layer. Note that this is the first NAS that explicitly targets these networks. The second NAS proposed instead focuses on finding the most efficient data format for a target CNN, with the granularity of the layer filter. Note that applying these two NASes in sequence allows an "application designer" to minimize the structure of the neural network employed, minimizing the number of operations or the memory usage of the network. After that, the second topic described is the optimization of neural network deployment on edge devices. Importantly, exploiting edge platforms' scarce resources is critical for efficient NN execution on MCUs. To do so, I will introduce DORY (Deployment Oriented to memoRY), an automatic tool to deploy CNNs on low-cost MCUs. DORY, in different steps, can manage different levels of memory inside the MCU automatically, offload the computation workload (i.e., the different layers of a neural network) to dedicated hardware accelerators, and automatically generate ANSI C code that orchestrates off- and on-chip transfers with the computation phases. On top of this, I will introduce two optimized computation libraries that DORY can exploit to deploy TCNs and Transformers on edge devices efficiently. I conclude the thesis with two different applications on bio-signal analysis, i.e., heart rate tracking and sEMG-based gesture recognition.
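To illustrate how dilation and receptive field are linked in a TCN, the sketch below uses the standard relation for a stack of stride-1 dilated convolutions, RF = 1 + Σ (k − 1)·d_l, where k is the kernel size and d_l the dilation of layer l. It assumes one convolution per layer with a shared kernel size; the thesis's actual search space is not described in the abstract, and the function name is hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in time steps) of stacked stride-1 dilated convolutions.

    Each layer with dilation d widens the receptive field by (kernel_size - 1) * d.
    """
    if kernel_size < 1 or any(d < 1 for d in dilations):
        raise ValueError("kernel_size and dilations must be positive integers")
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Four layers with kernel size 3: exponentially growing dilations
# versus no dilation at all.
print(receptive_field(3, [1, 2, 4, 8]))  # 31
print(receptive_field(3, [1, 1, 1, 1]))  # 9
```

With the same number of layers and weights, the dilated stack covers 31 time steps instead of 9, which is why dilation is a natural parameter to optimize when trading accuracy against operations and memory.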
Assessing brain connectivity through electroencephalographic signal processing and modeling analysis
Abstract:
Brain functioning relies on the interaction of several neural populations connected through complex connectivity networks, enabling the transmission and integration of information. Recent advances in neuroimaging techniques, such as electroencephalography (EEG), have deepened our understanding of the reciprocal roles played by brain regions during cognitive processes. The underlying idea of this PhD research is that EEG-related functional connectivity (FC) changes in the brain may incorporate important neuromarkers of behavior and cognition, as well as brain disorders, even at subclinical levels. However, a complete understanding of the reliability of the wide range of existing connectivity estimation techniques is still lacking. The first part of this work addresses this limitation by employing Neural Mass Models (NMMs), which simulate EEG activity and offer a unique tool to study interconnected networks of brain regions in controlled conditions. NMMs were employed to test FC estimators like Transfer Entropy and Granger Causality in linear and nonlinear conditions. Results revealed that connectivity estimates reflect information transmission between brain regions, a quantity that can be significantly different from the connectivity strength, and that Granger causality outperforms the other estimators. A second objective of this thesis was to assess brain connectivity and network changes on EEG data reconstructed at the cortical level. Functional brain connectivity has been estimated through Granger Causality, in both temporal and spectral domains, with the following goals: a) detect task-dependent functional connectivity network changes, focusing on internal-external attention competition and fear conditioning and reversal; b) identify resting-state network alterations in a subclinical population with high autistic traits. 
Connectivity-based neuromarkers, compared to the canonical EEG analysis, can provide deeper insights into brain mechanisms and may drive future diagnostic methods and therapeutic interventions. However, further methodological studies are required to fully understand the accuracy and information captured by FC estimates, especially concerning nonlinear phenomena.
Abstract:
This thesis focuses on automating the time-consuming task of manually counting activated neurons in fluorescent microscopy images, which is used to study the mechanisms underlying torpor. The traditional method of manual annotation can introduce bias and delay the outcome of experiments, so the author investigates a deep-learning-based procedure to automate this task. The author explores two of the main state-of-the-art convolutional neural network (CNN) architecture families, UNet and ResUnet, and uses a counting-by-segmentation strategy to provide a justification of the objects considered during the counting process. The author also explores a weakly-supervised learning strategy that exploits only dot annotations. The author quantifies the advantages in terms of data reduction and counting performance boost obtainable with a transfer-learning approach and, specifically, a fine-tuning procedure. The author released the dataset used for the supervised use case and all the pre-trained models, and designed a web application to share both the counting process pipeline developed in this work and the models pre-trained on the dataset analyzed in this work.
Abstract:
The scientific success of the LHC experiments at CERN depends heavily on the availability of computing resources that efficiently store, process, and analyse the data collected every year. This is ensured by the Worldwide LHC Computing Grid infrastructure, which connects computing centres distributed all over the world through a high-performance network. The LHC has an ambitious experimental program for the coming years, which includes large investments and improvements both for the hardware of the detectors and for the software and computing systems, in order to deal with the huge increase in the event rate expected from the High Luminosity LHC (HL-LHC) phase and, consequently, with the huge amount of data that will be produced. In recent years, the role of Artificial Intelligence has become relevant in the High Energy Physics (HEP) world. Machine Learning (ML) and Deep Learning algorithms have been successfully used in many areas of HEP, such as online and offline reconstruction programs, detector simulation, object reconstruction, identification, and Monte Carlo generation, and they will surely be crucial in the HL-LHC phase. This thesis aims to contribute to a CMS R&D project on an ML "as a Service" solution for HEP needs (MLaaS4HEP). It consists of a data service able to perform an entire ML pipeline (reading data, processing data, training ML models, and serving predictions) in a completely model-agnostic fashion, directly using ROOT files of arbitrary size from local or distributed data sources. This framework has been updated by adding new features in the data preprocessing phase, allowing more flexibility to the user. Since the MLaaS4HEP framework is experiment-agnostic, the ATLAS Higgs Boson ML challenge has been chosen as the physics use case, with the aim of testing MLaaS4HEP and the contribution made by this work.
Abstract:
Correctness of the information gathered in production environments is an essential part of quality assurance processes in many industries. This task is often performed by human operators who visually take annotations at various steps of the production flow. Depending on the task performed, the correlation between where exactly the information is gathered and what it represents is often lost in the process. The lack of labeled data severely limits the application of deep neural networks to object detection tasks; moreover, supervised training of deep models requires a large amount of data. Building an adequately large collection of labeled images through classic data annotation techniques is an exhausting and costly task, not suitable for every scenario. A possible solution is to generate synthetic data that replicates the real data and use it to fine-tune a deep neural network trained on one or more source domains to a different target domain. The purpose of this thesis is to show a real-case scenario where the provided data were both scarce and missing the required annotations. Subsequently, a possible approach is presented in which synthetic data is generated to address those issues and serves as a training base for deep neural networks for object detection, capable of working on images taken in production-like environments. Lastly, the thesis compares performance across different types of synthetic data and different convolutional neural networks used as backbones for the model.