880 results for Neural networks training
Abstract:
Current state-of-the-art techniques for landmine detection in ground penetrating radar (GPR) utilize statistical methods to identify characteristics of a landmine response. This research makes use of 2-D slices of data in which subsurface landmine responses have hyperbolic shapes. Various methods from the field of visual image processing are adapted to the 2-D GPR data, producing superior landmine detection results. This research goes on to develop a physics-based GPR augmentation method motivated by current advances in visual object detection. This GPR-specific augmentation is used to mitigate issues caused by insufficient training sets. This work shows that augmentation improves detection performance under training conditions that are normally very difficult. Finally, this work introduces the use of convolutional neural networks as a method to learn feature extraction parameters. These learned convolutional features outperform hand-designed features in GPR detection tasks. This work presents a number of methods, both borrowed from and motivated by the substantial work in visual image processing. The methods developed and presented in this work show an improvement in overall detection performance and introduce a method to improve the robustness of statistical classification.
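As an illustration of the learned-feature idea, a minimal sketch of a convolutional network for 2-D GPR slices follows; the patch size, layer widths, and binary mine/clutter labeling are assumptions made for the example, not the network developed in this work.

    # Minimal sketch (assumed sizes): a small CNN that scores 64x64 GPR patches.
    import torch
    import torch.nn as nn

    class GPRConvNet(nn.Module):
        def __init__(self, num_classes: int = 2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.classifier = nn.Linear(32 * 16 * 16, num_classes)

        def forward(self, x):                      # x: (batch, 1, 64, 64) GPR patches
            h = self.features(x)
            return self.classifier(h.flatten(1))

    model = GPRConvNet()
    logits = model(torch.randn(8, 1, 64, 64))      # scores for 8 candidate patches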
Abstract:
Finding rare events in multidimensional data is an important detection problem that has applications in many fields, such as risk estimation in the insurance industry, finance, flood prediction, medical diagnosis, quality assurance, security, or safety in transportation. The occurrence of such anomalies is so infrequent that there is usually not enough training data to learn an accurate statistical model of the anomaly class. In some cases, such events may have never been observed, so the only information that is available is a set of normal samples and an assumed pairwise similarity function. Such a metric may only be known up to a certain number of unspecified parameters, which would either need to be learned from training data or fixed by a domain expert. Sometimes, the anomalous condition may be formulated algebraically, such as a measure exceeding a predefined threshold, but nuisance variables may complicate the estimation of such a measure. Change detection methods used in time series analysis are not easily extendable to the multidimensional case, where discontinuities are not localized to a single point. On the other hand, in higher dimensions, data exhibits more complex interdependencies, and there is redundancy that could be exploited to adaptively model the normal data. In the first part of this dissertation, we review the theoretical framework for anomaly detection in images and previous anomaly detection work done in the context of crack detection and detection of anomalous components in railway tracks. In the second part, we propose new anomaly detection algorithms. The fact that curvilinear discontinuities in images are sparse with respect to the frame of shearlets allows us to pose this anomaly detection problem as basis pursuit optimization. Therefore, we pose the problem of detecting curvilinear anomalies in noisy textured images as a blind source separation problem under sparsity constraints, and propose an iterative shrinkage algorithm to solve it. Taking advantage of the parallel nature of this algorithm, we describe how this method can be accelerated using graphics processing units (GPUs). Then, we propose a new method for finding defective components on railway tracks using cameras mounted on a train. We describe how to extract features and use a combination of classifiers to solve this problem. Next, we scale anomaly detection to bigger datasets with complex interdependencies. We show that the anomaly detection problem fits naturally in the multitask learning framework. The first task consists of learning a compact representation of the good samples, while the second task consists of learning the anomaly detector. Using deep convolutional neural networks, we show that it is possible to train a deep model with a limited number of anomalous examples. In sequential detection problems, the presence of time-variant nuisance parameters affects the detection performance. In the last part of this dissertation, we present a method for adaptively estimating the threshold of sequential detectors using Extreme Value Theory within a Bayesian framework. Finally, conclusions on the results obtained are provided, followed by a discussion of possible future work.
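To make the sparsity-constrained formulation concrete, the sketch below implements plain iterative soft-thresholding (ISTA) for a problem of the form min_x 0.5*||y - Ax||^2 + lam*||x||_1; the dense random matrix stands in for the shearlet synthesis operator, and all sizes and constants are illustrative, not those used in the dissertation.

    # Minimal ISTA sketch for sparsity-constrained separation (assumed operator and sizes).
    import numpy as np

    def soft_threshold(x, t):
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    def ista(A, y, lam=0.1, n_iter=200):
        L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the data-fit gradient
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            grad = A.T @ (A @ x - y)
            x = soft_threshold(x - grad / L, lam / L)
        return x

    rng = np.random.default_rng(0)
    A = rng.standard_normal((64, 256))             # stand-in for the shearlet synthesis operator
    x_true = (rng.random(256) < 0.05).astype(float)
    y = A @ x_true + 0.01 * rng.standard_normal(64)
    x_hat = ista(A, y)                             # sparse estimate of the anomaly component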
Abstract:
We evaluate the integration of 3D preoperative computed tomography angiography of the coronary arteries with intraoperative 2D X-ray angiographies by a recently proposed registration-by-regression method. The method relates image features of 2D projection images to the transformation parameters of the 3D image. We compared different sets of features and studied the influence of preprocessing the training set. For the registration evaluation, a gold standard was developed from eight X-ray angiography sequences from six different patients. The alignment quality was measured using the 3D mean target registration error (mTRE). The registration-by-regression method achieved moderate accuracy (median mTRE of 15 mm) on real images. It therefore does not yet provide a complete solution to the 3D–2D registration problem, but it could be used as an initialisation method to eliminate the need for manual initialisation.
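For reference, a small sketch of the 3D mean target registration error (mTRE) is given below, assuming a set of target points and 4x4 homogeneous transforms; the points and transforms shown are placeholders, not the gold-standard data of the study.

    # Sketch of mTRE: mean distance between target points mapped by the estimated
    # and the gold-standard transforms (illustrative inputs).
    import numpy as np

    def mtre(points, T_est, T_gold):
        """points: (N, 3) target points; T_est, T_gold: 4x4 homogeneous transforms."""
        hom = np.hstack([points, np.ones((points.shape[0], 1))])
        p_est = (hom @ T_est.T)[:, :3]
        p_gold = (hom @ T_gold.T)[:, :3]
        return np.linalg.norm(p_est - p_gold, axis=1).mean()

    points = np.random.rand(50, 3) * 100.0         # 50 target points, in mm
    T_gold = np.eye(4)
    T_est = np.eye(4)
    T_est[:3, 3] = [5.0, 2.0, -1.0]                # small translation error
    print(mtre(points, T_est, T_gold))             # mean target registration error, in mm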
Abstract:
The analysis of steel and composite frames has traditionally been carried out by idealizing beam-to-column connections as either rigid or pinned. Although some advanced analysis methods have been proposed to account for semi-rigid connections, the performance of these methods strongly depends on the proper modeling of connection behavior. The primary challenge of modeling beam-to-column connections is their inelastic response and continuously varying stiffness, strength, and ductility. In this dissertation, two distinct approaches, mathematical models and informational models, are proposed to account for the complex hysteretic behavior of beam-to-column connections. The performance of the two approaches is examined and is then followed by a discussion of their merits and deficiencies. To capitalize on the merits of both mathematical and informational representations, a new approach, a hybrid modeling framework, is developed and demonstrated through the modeling of beam-to-column connections. Component-based modeling is a compromise spanning two extremes in the field of mathematical modeling: simplified global models and finite element models. In the component-based modeling of angle connections, the five critical components of excessive deformation are identified. Constitutive relationships of angles, column panel zones, and contact between angles and column flanges are derived using only material and geometric properties and theoretical mechanics considerations. Those of slip and bolt-hole ovalization are simplified using empirically suggested mathematical representations and expert opinion. A mathematical model is then assembled as a macro-element by combining rigid bars and springs that represent the constitutive relationships of the components. Lastly, the moment-rotation curves of the mathematical models are compared with those of experimental tests. In the case of a top-and-seat angle connection with double web angles, a pinched hysteretic response is predicted quite well by complete mechanical models, which use only material and geometric properties. On the other hand, to reproduce the highly pinched behavior of a top-and-seat angle connection without web angles, a mathematical model requires the slip and bolt-hole ovalization components, which are more amenable to informational modeling. An alternative method is informational modeling, which constitutes a fundamental shift from mathematical equations to data that contain the required information about the underlying mechanics. The information is extracted from observed data and stored in neural networks. Two different training data sets, analytically generated and experimental, are tested to examine the performance of informational models. Both informational models show acceptable agreement with the moment-rotation curves of the experiments. Adding a degradation parameter improves the informational models when modeling highly pinched hysteretic behavior. However, informational models cannot represent the contribution of individual components and therefore do not provide insight into the underlying mechanics of the components. In this study, a new hybrid modeling framework is proposed. In the hybrid framework, a conventional mathematical model is complemented by informational methods. The basic premise of the proposed hybrid methodology is that not all features of system response are amenable to mathematical modeling, hence the consideration of informational alternatives.
This may be because (i) the underlying theory is not available or not sufficiently developed, or (ii) the existing theory is too complex and therefore not suitable for modeling within building frame analysis. The role of informational methods is to model the aspects that the mathematical model leaves out. The autoprogressive algorithm and self-learning simulation extract these missing aspects from the system response. In the hybrid framework, experimental data is an integral part of modeling, rather than being used strictly for validation. The potential of the hybrid methodology is illustrated through modeling the complex hysteretic behavior of beam-to-column connections. Mechanics-based components of deformation, such as angles, flange plates, and the column panel zone, are idealized in a mathematical model using a complete mechanical approach. Although the mathematical model represents the envelope curves in terms of initial stiffness and yield strength, it is not capable of capturing the pinching effects. Pinching is caused mainly by separation between angles and column flanges, as well as slip between angles/flange plates and beam flanges. These components of deformation are suitable for informational modeling. Finally, the moment-rotation curves of the hybrid models are validated against those of the experimental tests. The comparison shows that the hybrid models are capable of representing the highly pinched hysteretic behavior of beam-to-column connections. In addition, the developed hybrid model is successfully used to predict the behavior of a newly designed connection.
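As a concrete illustration of the informational side of the hybrid framework, the sketch below shows a small feedforward network that maps the current rotation and a short response history to the connection moment; the input layout, history length, and layer sizes are assumptions made for the example, not the models calibrated in the dissertation.

    # Minimal sketch of an "informational" connection model (assumed form):
    # a network mapping current rotation plus past rotations/moments to the moment.
    import torch
    import torch.nn as nn

    class ConnectionNet(nn.Module):
        def __init__(self, history: int = 3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(1 + 2 * history, 32), nn.Tanh(),
                nn.Linear(32, 32), nn.Tanh(),
                nn.Linear(32, 1),
            )

        def forward(self, rotation, past_rotations, past_moments):
            x = torch.cat([rotation, past_rotations, past_moments], dim=-1)
            return self.net(x)                      # predicted moment

    net = ConnectionNet(history=3)
    moment = net(torch.zeros(1, 1), torch.zeros(1, 3), torch.zeros(1, 3))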
Abstract:
This study aims to model and forecast the tourism demand for Mozambique for the period from January 2004 to December 2013 using artificial neural network models. The number of overnight stays in hotels was used as a proxy for tourism demand. A set of independent variables of the outbound tourism markets (South Africa, the United States of America, Mozambique, Portugal, and the United Kingdom) was tested as model inputs, namely the Consumer Price Index, Gross Domestic Product, and Exchange Rates. The best model achieved a Mean Absolute Percentage Error of 6.5% and a Pearson correlation coefficient of 0.696. An accurate forecasting model such as this is important for economic agents to anticipate the future growth of this activity sector, for stakeholders to plan products, services, and infrastructure, and for hotel establishments to adjust their capacity to the tourism demand.
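For clarity, the two reported figures of merit can be computed as below, assuming arrays of observed and forecast overnight stays; the numbers used here are placeholders, not the study's data.

    # Sketch of the evaluation metrics (placeholder values, not the study's series).
    import numpy as np

    def mape(y_true, y_pred):
        # mean absolute percentage error, in percent
        return np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0

    y_true = np.array([120_000.0, 135_000.0, 150_000.0])   # observed overnight stays
    y_pred = np.array([118_000.0, 140_000.0, 146_000.0])   # forecast overnight stays
    print(mape(y_true, y_pred))                    # MAPE; the study reports 6.5%
    print(np.corrcoef(y_true, y_pred)[0, 1])       # Pearson correlation; the study reports 0.696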
Abstract:
In contemporary societies, higher education must shape individuals able to solve problems in a practical and simple manner; therefore, a multidisciplinary view of problems, with insights from disciplines such as psychology, mathematics, or computer science, becomes mandatory. Undeniably, the great challenge for teachers is to provide comprehensive training in General Chemistry to high standards of quality, aiming not only at promoting students' academic success but also at the understanding of the competences and skills required in their future work. Thus, this work focuses on the development of an intelligent system to assess the Quality-of-General-Chemistry-Learning, based on factors related to the subject, the teachers, and the students.
Abstract:
The inclusion of General Chemistry (GC) in the curricula of higher education courses in science and technology aims, on the one hand, to develop the students' skills necessary for further studies and, on the other hand, to respond to the need to endow future professionals with the knowledge to analyze and solve multidisciplinary problems in a sustainable way. The participation of students in the evaluation of the role played by GC in their training is crucial, and the analysis of the results can be an essential tool for increasing student success and improving practices in various professions. Accordingly, this work focuses on the development of an intelligent system to assess the role of GC. The computational framework is built on top of a Logic Programming approach to Knowledge Representation and Reasoning, complemented with a problem-solving methodology based on Artificial Neural Networks. The results obtained so far show that the proposed model is a good starting point, with an overall accuracy higher than 95%.
Abstract:
Many tissue level models of neural networks are written in the language of nonlinear integro-differential equations. Analytical solutions have only been obtained for the special case that the nonlinearity is a Heaviside function. Thus the pursuit of even approximate solutions to such models is of interest to the broad mathematical neuroscience community. Here we develop one such scheme, for stationary and travelling wave solutions, that can deal with a certain class of smoothed Heaviside functions. The distribution that smoothes the Heaviside is viewed as a fundamental object, and all expressions describing the scheme are constructed in terms of integrals over this distribution. The comparison of our scheme and results from direct numerical simulations is used to highlight the very good levels of approximation that can be achieved by iterating the process only a small number of times.
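For reference, a typical single-population model of the kind described is the Amari-type neural field below (the precise equations studied in the work may differ); analytical solutions are known when the firing rate is a Heaviside function, and the scheme above targets smoothed versions of it.

    \frac{\partial u(x,t)}{\partial t} \;=\; -u(x,t) \;+\; \int_{-\infty}^{\infty} w(x-y)\, f\big(u(y,t)\big)\,\mathrm{d}y,
    \qquad f(u) \;=\; H(u-h)

Here $w$ is the synaptic connectivity kernel, $h$ the firing threshold, and $H$ the Heaviside function that the smoothing distribution replaces.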
Abstract:
The artificial lifting of oil is needed when the reservoir pressure is not high enough for the fluid contained in it to reach the surface spontaneously; artificial lift thus supplies additional energy to the fluid in the well so that it can reach the surface. The rod pump is the most widely used artificial lift method in the world, and the dynamometer card (surface and down-hole) is the best tool for analyzing a well equipped with this method. A computational method using MLP Artificial Neural Networks was developed: pre-established patterns of down-hole cards, based on their geometry, are used to train the network, which then provides the knowledge to classify new cards, allowing the diagnosis of failures and of the operating conditions of the lifting system. These routines could be integrated into a supervisory system that collects the cards to be analyzed.
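A minimal sketch of the classification stage is given below, assuming each down-hole card has already been reduced to a fixed-length vector of geometric features; the feature count, class set, and network sizes are illustrative, not the ones used in the work.

    # Sketch of an MLP classifier for down-hole dynamometer cards (assumed features/classes).
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    X_train = np.random.rand(200, 16)             # 200 cards, 16 geometric features (placeholder)
    y_train = np.random.randint(0, 5, size=200)   # 5 assumed pump-condition classes
    clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500)
    clf.fit(X_train, y_train)

    new_card = np.random.rand(1, 16)              # features of a newly collected card
    print(clf.predict(new_card))                  # diagnosed operating condition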
Abstract:
The brain is a network spanning multiple scales, from subcellular to macroscopic. In this thesis I present four projects studying brain networks at different levels of abstraction. The first involves determining a functional connectivity network based on neural spike trains and using a graph theoretical method to cluster groups of neurons into putative cell assemblies. In the second project I model neural networks at a microscopic level. Using different clustered wiring schemes, I show that almost identical spatiotemporal activity patterns can be observed, demonstrating that there is a broad neuro-architectural basis for attaining structured spatiotemporal dynamics. Remarkably, irrespective of the precise topological mechanism, this behavior can be predicted by examining the spectral properties of the synaptic weight matrix. The third project introduces, via two circuit architectures, a new paradigm for feedforward processing in which inhibitory neurons play a complex and pivotal role in governing information flow in cortical network models. Finally, I analyze axonal projections in sleep-deprived mice using data collected as part of the Allen Institute's Mesoscopic Connectivity Atlas. After normalizing for experimental variability, the results indicate there is no single explanatory difference in the mesoscale network between control and sleep-deprived mice. Using machine learning techniques, however, animal classification could be performed at levels significantly above chance. This reveals that intricate changes in connectivity do occur due to chronic sleep deprivation.
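As an illustration of the spectral argument in the second project, the sketch below builds a random synaptic weight matrix with strengthened within-cluster connections and inspects its leading eigenvalues; the cluster sizes and weight scales are arbitrary choices for the example, not those of the models in the thesis.

    # Sketch: clustered random connectivity produces outlier eigenvalues (assumed parameters).
    import numpy as np

    n, n_clusters = 400, 4
    W = np.random.randn(n, n) * 0.05               # background random connectivity
    size = n // n_clusters
    for c in range(n_clusters):
        block = slice(c * size, (c + 1) * size)
        W[block, block] += 0.02                    # strengthen within-cluster weights

    eigvals = np.linalg.eigvals(W)
    print(np.sort(np.abs(eigvals))[-n_clusters:])  # outlier eigenvalues reflect the clustered structure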
Abstract:
Nowadays robotic applications are widespread and most manipulation tasks are efficiently solved. However, Deformable-Objects (DOs) still represent a huge limitation for robots. The main difficulty in DO manipulation is dealing with shape and dynamics uncertainties, which prevents the use of model-based approaches (since they are excessively computationally complex) and makes sensory data difficult to interpret. This thesis reports the research activities aimed at addressing some applications in robotic manipulation and sensing of Deformable-Linear-Objects (DLOs), with a particular focus on electric wires. In all the works, a significant effort was made in the study of an effective strategy for analyzing sensory signals with various machine learning algorithms. In the first part of the document, the main focus concerns the wire terminals, i.e. detection, grasping, and insertion. First, a pipeline that integrates vision and tactile sensing is developed; then further improvements are proposed for each module. A novel procedure is proposed to gather and label massive amounts of training images for object detection with minimal human intervention. Together with this strategy, we extend a generic object detector based on Convolutional-Neural-Networks for orientation prediction. The insertion task is also extended by developing a closed-loop controller capable of guiding the insertion of a longer and curved segment of wire through a hole, where the contact forces are estimated by means of a Recurrent-Neural-Network. In the second part of the thesis, the interest shifts to the DLO shape. Robotic reshaping of a DLO is addressed by means of a sequence of pick-and-place primitives, while a decision-making process driven by visual data learns the optimal grasping locations by exploiting Deep Q-learning and finds the best release point. The success of the solution relies on a reliable interpretation of the DLO shape. For this reason, further developments are made on the visual segmentation.
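As one example of the learning components mentioned above, the sketch below shows a recurrent estimator of contact forces during wire insertion, assuming a sequence of sensor readings as input; the sensor dimensionality, the GRU choice, and the three-component force output are assumptions for illustration, not the thesis's exact recurrent network.

    # Minimal sketch of a recurrent contact-force estimator (assumed input/output layout).
    import torch
    import torch.nn as nn

    class ForceEstimator(nn.Module):
        def __init__(self, n_sensors: int = 12, hidden: int = 64):
            super().__init__()
            self.rnn = nn.GRU(n_sensors, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 3)       # estimated contact force (Fx, Fy, Fz)

        def forward(self, readings):               # readings: (batch, time, n_sensors)
            out, _ = self.rnn(readings)
            return self.head(out)                  # force estimate at every time step

    estimator = ForceEstimator()
    forces = estimator(torch.randn(4, 50, 12))     # 4 sequences of 50 readings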
Abstract:
Much real-world data, including textual data, can be represented using graph structures. The use of graphs to represent textual data has many advantages, mainly related to retaining a greater amount of information, such as the relationships between words and their types. In recent years, many neural network architectures have been proposed to deal with tasks on graphs. Many of them consider only node features, ignoring or not giving proper relevance to the relationships between them. However, in many node classification tasks, these relationships play a fundamental role. This thesis aims to analyze the main GNNs, evaluate their advantages and disadvantages, propose an innovative solution conceived as an extension of GAT, and apply them to a case study in the biomedical field. We implement the reference GNNs with the methodologies analyzed later and then apply them to a question answering system in the biomedical field as a replacement for the pre-existing GNN. We attempt to obtain better results by using models that can accept both node and edge features as input. As shown later, our proposed models can beat the original solution and define the state of the art for the task under analysis.
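To make the idea of consuming edge features concrete, a minimal sketch built with PyTorch Geometric's GATConv is shown below; it is only an illustration of attention with edge attributes, not the extension proposed in the thesis, and all sizes are arbitrary.

    # Sketch: a graph attention layer that also takes edge features (illustrative sizes).
    import torch
    from torch_geometric.nn import GATConv

    conv = GATConv(in_channels=16, out_channels=8, heads=4, edge_dim=5)  # edge_dim enables edge attributes
    x = torch.randn(10, 16)                        # 10 nodes, 16-dimensional features
    edge_index = torch.randint(0, 10, (2, 30))     # 30 directed edges
    edge_attr = torch.randn(30, 5)                 # 5-dimensional relation (edge) features
    out = conv(x, edge_index, edge_attr)           # (10, 4 * 8) attention-weighted node embeddings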
Abstract:
In recent years, there has been exponential growth in the use of virtual spaces, including dialogue systems, that handle personal information. The concept of personal privacy is widely discussed and controversial in the literature, while in the technological field it directly influences the degree of reliability perceived in an information system (privacy ‘as trust’). This work aims to protect the right to privacy over personal data (GDPR, 2018) and avoid the loss of sensitive content by exploring the sensitive information detection (SID) task. It is grounded on the following research questions: (RQ1) What does sensitive data mean? How can a personal sensitive information domain be defined? (RQ2) How can a state-of-the-art model for SID be created? (RQ3) How should the model be evaluated? RQ1 theoretically investigates the concepts of privacy and the ontological state-of-the-art representation of personal information. The Data Privacy Vocabulary (DPV) is the taxonomic resource taken as an authoritative reference for the definition of the knowledge domain. Concerning RQ2, we investigate two approaches to classify sensitive data: the first, bottom-up, explores automatic learning methods based on transformer networks; the second, top-down, proposes logical-symbolic methods with the construction of privaframe, a knowledge graph of compositional frames representing personal data categories. Both approaches are tested. For the evaluation (RQ3), we create SPeDaC, a sentence-level labeled resource. This can be used as a benchmark or as training data for the SID task, filling the gap left by the absence of a shared resource in this field. While the approach based on artificial neural networks confirms the validity of the direction adopted in the most recent studies on SID, the logical-symbolic approach emerges as the preferred way to classify fine-grained personal data categories, thanks to the semantically grounded, tailored modeling it allows. At the same time, the results highlight the strong potential of hybrid architectures in solving automatic tasks.
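As a sketch of the bottom-up, transformer-based direction, the snippet below performs sentence-level classification with a pretrained encoder; the model name, the two-label scheme, and the example sentences are assumptions for illustration, not the exact setup or data of the work (and the classification head would still need fine-tuning on labeled sentences).

    # Sketch of transformer-based sentence classification (assumed model and labels).
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

    sentences = ["My health insurance number is 12345.", "The weather is nice today."]
    batch = tokenizer(sentences, padding=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    print(logits.argmax(dim=-1))                   # assumed labels: 1 = sensitive, 0 = non-sensitive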
Abstract:
Neural representations (NR) have emerged in the last few years as a powerful tool to represent signals from several domains, such as images, 3D shapes, or audio. Indeed, deep neural networks have been shown to be capable of approximating the continuous functions that describe a given signal at theoretically infinite resolution. This finding makes it possible to obtain representations whose memory footprint is fixed and decoupled from the resolution at which the underlying signal can be sampled, something that is not possible with traditional discrete representations, e.g., grids of pixels for images or voxels for 3D shapes. During the last two years, many techniques have been proposed to improve the capability of NR to approximate high-frequency details and to make the optimization procedures required to obtain NR less demanding in terms of both time and data, motivating many researchers to deploy NR as the main form of data representation for complex pipelines. Following this line of research, we first show that NR can precisely approximate Unsigned Distance Functions, providing an effective way to represent garments that feature open 3D surfaces and unknown topology. Then, we present a pipeline to obtain, in a few minutes, a compact Neural Twin® for a given object by exploiting recent advances in modeling neural radiance fields. Furthermore, we move a step in the direction of adopting NR as a standalone representation by considering the possibility of performing downstream tasks through direct processing of the NR weights. We first show that deep neural networks can be compressed into compact latent codes. Then, we show how this technique can be exploited to perform deep learning on implicit neural representations (INRs) of 3D shapes by looking only at the weights of the networks.
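As a minimal illustration of a neural representation of an unsigned distance function, the sketch below uses an MLP that maps a 3D query point to a non-negative distance; the architecture and the Softplus output are assumptions made for the example, not the networks used in the thesis.

    # Sketch of an MLP unsigned-distance-function representation (assumed architecture).
    import torch
    import torch.nn as nn

    udf = nn.Sequential(
        nn.Linear(3, 256), nn.ReLU(),
        nn.Linear(256, 256), nn.ReLU(),
        nn.Linear(256, 1), nn.Softplus(),          # keeps the predicted distance non-negative
    )
    points = torch.rand(1024, 3) * 2 - 1           # query points in [-1, 1]^3
    distances = udf(points)                        # one unsigned distance per query point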
Abstract:
Deep Neural Networks (DNNs) have revolutionized a wide range of applications beyond traditional machine learning and artificial intelligence fields, e.g., computer vision, healthcare, natural language processing, and others. At the same time, edge devices have become central in our society, generating an unprecedented amount of data which could be used to train data-hungry models such as DNNs. However, the potentially sensitive or confidential nature of gathered data poses privacy concerns when storing and processing them in centralized locations. To this end, decentralized learning decouples model training from the need to directly access raw data, by alternating on-device training and periodic communications. The ability to distill knowledge from decentralized data, however, comes at the cost of facing more challenging learning settings, such as coping with heterogeneous hardware and network connectivity and statistical diversity of data, and ensuring verifiable privacy guarantees. This Thesis provides an extensive overview of the decentralized learning literature, including a novel taxonomy and a detailed description of the most relevant system-level contributions in the related literature for privacy, communication efficiency, data and system heterogeneity, and poisoning defense. Next, this Thesis presents the design of an original solution to tackle communication efficiency and system heterogeneity, and empirically evaluates it in federated settings. For communication efficiency, an original method, specifically designed for Convolutional Neural Networks, is also described and evaluated against the state of the art. Furthermore, this Thesis provides an in-depth review of recently proposed methods to tackle the performance degradation introduced by data heterogeneity, followed by empirical evaluations on challenging data distributions, highlighting strengths and possible weaknesses of the considered solutions. Finally, this Thesis presents a novel perspective on the usage of Knowledge Distillation as a means of optimizing decentralized learning systems in settings characterized by data or system heterogeneity. Our vision of relevant future research directions closes the manuscript.
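As an illustration of the alternating on-device training and periodic communication described above, the sketch below shows a FedAvg-style weighted aggregation of client model states; the plain weighted average and the client data sizes are assumptions for the example, not the specific methods proposed in the Thesis.

    # Minimal FedAvg-style aggregation sketch (assumed weighting by local data size).
    import torch

    def federated_average(client_states, client_sizes):
        """Weighted average of client state_dicts, weights proportional to local data size."""
        total = float(sum(client_sizes))
        averaged = {}
        for key in client_states[0]:
            averaged[key] = sum(state[key].float() * (n / total)
                                for state, n in zip(client_states, client_sizes))
        return averaged

    # Usage: new_global_state = federated_average([c1.state_dict(), c2.state_dict()], [1200, 800])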