804 resultados para Machine Learning Techniques


Relevância:

90.00% 90.00%

Publicador:

Resumo:

Les logiciels actuels sont de grandes tailles, complexes et critiques. Le besoin de qualité exige beaucoup de tests, ce qui consomme de grandes quantités de ressources durant le développement et la maintenance de ces systèmes. Différentes techniques permettent de réduire les coûts liés aux activités de test. Notre travail s’inscrit dans ce cadre, est a pour objectif d’orienter l’effort de test vers les composants logiciels les plus à risque à l’aide de certains attributs du code source. À travers plusieurs démarches empiriques menées sur de grands logiciels open source, développés avec la technologie orientée objet, nous avons identifié et étudié les métriques qui caractérisent l’effort de test unitaire sous certains angles. Nous avons aussi étudié les liens entre cet effort de test et les métriques des classes logicielles en incluant les indicateurs de qualité. Les indicateurs de qualité sont une métrique synthétique, que nous avons introduite dans nos travaux antérieurs, qui capture le flux de contrôle ainsi que différentes caractéristiques du logiciel. Nous avons exploré plusieurs techniques permettant d’orienter l’effort de test vers des composants à risque à partir de ces attributs de code source, en utilisant des algorithmes d’apprentissage automatique. En regroupant les métriques logicielles en familles, nous avons proposé une approche basée sur l’analyse du risque des classes logicielles. Les résultats que nous avons obtenus montrent les liens entre l’effort de test unitaire et les attributs de code source incluant les indicateurs de qualité, et suggèrent la possibilité d’orienter l’effort de test à l’aide des métriques.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper provides an overview of IDS types and how they work as well as configuration considerations and issues that affect them. Advanced methods of increasing the performance of an IDS are explored such as specification based IDS for protecting Supervisory Control And Data Acquisition (SCADA) and Cloud networks. Also by providing a review of varied studies ranging from issues in configuration and specific problems to custom techniques and cutting edge studies a reference can be provided to others interested in learning about and developing IDS solutions. Intrusion Detection is an area of much required study to provide solutions to satisfy evolving services and networks and systems that support them. This paper aims to be a reference for IDS technologies other researchers and developers interested in the field of intrusion detection.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

While news stories are an important traditional medium to broadcast and consume news, microblogging has recently emerged as a place where people can dis- cuss, disseminate, collect or report information about news. However, the massive information in the microblogosphere makes it hard for readers to keep up with these real-time updates. This is especially a problem when it comes to breaking news, where people are more eager to know “what is happening”. Therefore, this dis- sertation is intended as an exploratory effort to investigate computational methods to augment human effort when monitoring the development of breaking news on a given topic from a microblog stream by extractively summarizing the updates in a timely manner. More specifically, given an interest in a topic, either entered as a query or presented as an initial news report, a microblog temporal summarization system is proposed to filter microblog posts from a stream with three primary concerns: topical relevance, novelty, and salience. Considering the relatively high arrival rate of microblog streams, a cascade framework consisting of three stages is proposed to progressively reduce quantity of posts. For each step in the cascade, this dissertation studies methods that improve over current baselines. In the relevance filtering stage, query and document expansion techniques are applied to mitigate sparsity and vocabulary mismatch issues. The use of word embedding as a basis for filtering is also explored, using unsupervised and supervised modeling to characterize lexical and semantic similarity. In the novelty filtering stage, several statistical ways of characterizing novelty are investigated and ensemble learning techniques are used to integrate results from these diverse techniques. These results are compared with a baseline clustering approach using both standard and delay-discounted measures. In the salience filtering stage, because of the real-time prediction requirement a method of learning verb phrase usage from past relevant news reports is used in conjunction with some standard measures for characterizing writing quality. Following a Cranfield-like evaluation paradigm, this dissertation includes a se- ries of experiments to evaluate the proposed methods for each step, and for the end- to-end system. New microblog novelty and salience judgments are created, building on existing relevance judgments from the TREC Microblog track. The results point to future research directions at the intersection of social media, computational jour- nalism, information retrieval, automatic summarization, and machine learning.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A variety of physical and biomedical imaging techniques, such as digital holography, interferometric synthetic aperture radar (InSAR), or magnetic resonance imaging (MRI) enable measurement of the phase of a physical quantity additionally to its amplitude. However, the phase can commonly only be measured modulo 2π, as a so called wrapped phase map. Phase unwrapping is the process of obtaining the underlying physical phase map from the wrapped phase. Tile-based phase unwrapping algorithms operate by first tessellating the phase map, then unwrapping individual tiles, and finally merging them to a continuous phase map. They can be implemented computationally efficiently and are robust to noise. However, they are prone to failure in the presence of phase residues or erroneous unwraps of single tiles. We tried to overcome these shortcomings by creating novel tile unwrapping and merging algorithms as well as creating a framework that allows to combine them in modular fashion. To increase the robustness of the tile unwrapping step, we implemented a model-based algorithm that makes efficient use of linear algebra to unwrap individual tiles. Furthermore, we adapted an established pixel-based unwrapping algorithm to create a quality guided tile merger. These original algorithms as well as previously existing ones were implemented in a modular phase unwrapping C++ framework. By examining different combinations of unwrapping and merging algorithms we compared our method to existing approaches. We could show that the appropriate choice of unwrapping and merging algorithms can significantly improve the unwrapped result in the presence of phase residues and noise. Beyond that, our modular framework allows for efficient design and test of new tile-based phase unwrapping algorithms. The software developed in this study is freely available.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

SQL Injection Attack (SQLIA) remains a technique used by a computer network intruder to pilfer an organisation’s confidential data. This is done by an intruder re-crafting web form’s input and query strings used in web requests with malicious intent to compromise the security of an organisation’s confidential data stored at the back-end database. The database is the most valuable data source, and thus, intruders are unrelenting in constantly evolving new techniques to bypass the signature’s solutions currently provided in Web Application Firewalls (WAF) to mitigate SQLIA. There is therefore a need for an automated scalable methodology in the pre-processing of SQLIA features fit for a supervised learning model. However, obtaining a ready-made scalable dataset that is feature engineered with numerical attributes dataset items to train Artificial Neural Network (ANN) and Machine Leaning (ML) models is a known issue in applying artificial intelligence to effectively address ever evolving novel SQLIA signatures. This proposed approach applies numerical attributes encoding ontology to encode features (both legitimate web requests and SQLIA) to numerical data items as to extract scalable dataset for input to a supervised learning model in moving towards a ML SQLIA detection and prevention model. In numerical attributes encoding of features, the proposed model explores a hybrid of static and dynamic pattern matching by implementing a Non-Deterministic Finite Automaton (NFA). This combined with proxy and SQL parser Application Programming Interface (API) to intercept and parse web requests in transition to the back-end database. In developing a solution to address SQLIA, this model allows processed web requests at the proxy deemed to contain injected query string to be excluded from reaching the target back-end database. This paper is intended for evaluating the performance metrics of a dataset obtained by numerical encoding of features ontology in Microsoft Azure Machine Learning (MAML) studio using Two-Class Support Vector Machines (TCSVM) binary classifier. This methodology then forms the subject of the empirical evaluation.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Humans have a high ability to extract visual data information acquired by sight. Trought a learning process, which starts at birth and continues throughout life, image interpretation becomes almost instinctively. At a glance, one can easily describe a scene with reasonable precision, naming its main components. Usually, this is done by extracting low-level features such as edges, shapes and textures, and associanting them to high level meanings. In this way, a semantic description of the scene is done. An example of this, is the human capacity to recognize and describe other people physical and behavioral characteristics, or biometrics. Soft-biometrics also represents inherent characteristics of human body and behaviour, but do not allow unique person identification. Computer vision area aims to develop methods capable of performing visual interpretation with performance similar to humans. This thesis aims to propose computer vison methods which allows high level information extraction from images in the form of soft biometrics. This problem is approached in two ways, unsupervised and supervised learning methods. The first seeks to group images via an automatic feature extraction learning , using both convolution techniques, evolutionary computing and clustering. In this approach employed images contains faces and people. Second approach employs convolutional neural networks, which have the ability to operate on raw images, learning both feature extraction and classification processes. Here, images are classified according to gender and clothes, divided into upper and lower parts of human body. First approach, when tested with different image datasets obtained an accuracy of approximately 80% for faces and non-faces and 70% for people and non-person. The second tested using images and videos, obtained an accuracy of about 70% for gender, 80% to the upper clothes and 90% to lower clothes. The results of these case studies, show that proposed methods are promising, allowing the realization of automatic high level information image annotation. This opens possibilities for development of applications in diverse areas such as content-based image and video search and automatica video survaillance, reducing human effort in the task of manual annotation and monitoring.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Sequences of timestamped events are currently being generated across nearly every domain of data analytics, from e-commerce web logging to electronic health records used by doctors and medical researchers. Every day, this data type is reviewed by humans who apply statistical tests, hoping to learn everything they can about how these processes work, why they break, and how they can be improved upon. To further uncover how these processes work the way they do, researchers often compare two groups, or cohorts, of event sequences to find the differences and similarities between outcomes and processes. With temporal event sequence data, this task is complex because of the variety of ways single events and sequences of events can differ between the two cohorts of records: the structure of the event sequences (e.g., event order, co-occurring events, or frequencies of events), the attributes about the events and records (e.g., gender of a patient), or metrics about the timestamps themselves (e.g., duration of an event). Running statistical tests to cover all these cases and determining which results are significant becomes cumbersome. Current visual analytics tools for comparing groups of event sequences emphasize a purely statistical or purely visual approach for comparison. Visual analytics tools leverage humans' ability to easily see patterns and anomalies that they were not expecting, but is limited by uncertainty in findings. Statistical tools emphasize finding significant differences in the data, but often requires researchers have a concrete question and doesn't facilitate more general exploration of the data. Combining visual analytics tools with statistical methods leverages the benefits of both approaches for quicker and easier insight discovery. Integrating statistics into a visualization tool presents many challenges on the frontend (e.g., displaying the results of many different metrics concisely) and in the backend (e.g., scalability challenges with running various metrics on multi-dimensional data at once). I begin by exploring the problem of comparing cohorts of event sequences and understanding the questions that analysts commonly ask in this task. From there, I demonstrate that combining automated statistics with an interactive user interface amplifies the benefits of both types of tools, thereby enabling analysts to conduct quicker and easier data exploration, hypothesis generation, and insight discovery. The direct contributions of this dissertation are: (1) a taxonomy of metrics for comparing cohorts of temporal event sequences, (2) a statistical framework for exploratory data analysis with a method I refer to as high-volume hypothesis testing (HVHT), (3) a family of visualizations and guidelines for interaction techniques that are useful for understanding and parsing the results, and (4) a user study, five long-term case studies, and five short-term case studies which demonstrate the utility and impact of these methods in various domains: four in the medical domain, one in web log analysis, two in education, and one each in social networks, sports analytics, and security. My dissertation contributes an understanding of how cohorts of temporal event sequences are commonly compared and the difficulties associated with applying and parsing the results of these metrics. It also contributes a set of visualizations, algorithms, and design guidelines for balancing automated statistics with user-driven analysis to guide users to significant, distinguishing features between cohorts. This work opens avenues for future research in comparing two or more groups of temporal event sequences, opening traditional machine learning and data mining techniques to user interaction, and extending the principles found in this dissertation to data types beyond temporal event sequences.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Las organizaciones y sus entornos son sistemas complejos. Tales sistemas son difíciles de comprender y predecir. Pese a ello, la predicción es una tarea fundamental para la gestión empresarial y para la toma de decisiones que implica siempre un riesgo. Los métodos clásicos de predicción (entre los cuales están: la regresión lineal, la Autoregresive Moving Average y el exponential smoothing) establecen supuestos como la linealidad, la estabilidad para ser matemática y computacionalmente tratables. Por diferentes medios, sin embargo, se han demostrado las limitaciones de tales métodos. Pues bien, en las últimas décadas nuevos métodos de predicción han surgido con el fin de abarcar la complejidad de los sistemas organizacionales y sus entornos, antes que evitarla. Entre ellos, los más promisorios son los métodos de predicción bio-inspirados (ej. redes neuronales, algoritmos genéticos /evolutivos y sistemas inmunes artificiales). Este artículo pretende establecer un estado situacional de las aplicaciones actuales y potenciales de los métodos bio-inspirados de predicción en la administración.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A Histologia, o estudo de tecidos, é uma das áreas fundamentais da Biologia que permitiu enormes avanços científicos. Sendo uma tarefa exigente, meticulosa e demorada, será importante aproveitar a existência de ferramentas e algoritmos computacionais no seu auxílio, tornando o processo mais rápido e possibilitando a descoberta de informação que poderá não estar visível à partida. Esta dissertação tem como principal objectivo averiguar se um animal foi ou não sujeito à ingestão de um xenobiótico. Com esse objectivo em vista, utilizaram-se técnicas de processamento e segmentação de imagem aplicadas a imagens de tecido renal de ratos saudáveis e ratos que ingeriram o xenobiótico. Destas imagens extraíram-se inúmeras características do corpúsculo renal que após serem analisadas através de vários algoritmos de classificação mostraram ser possível saber se o animal ingeriu ou não o xenobiótico, com um reduzido grau de incerteza. ABSTRACT: Histology, the study of tissues, is one of the key areas of Biology that has allowed huge advances in Science. Being a demanding, meticulous and time consuming task, it is important to use the existence of computational tools and algorithms in its aid, making the process faster and enabling the discovery of information that may not be initially visible. The main goal of this thesis is to ascertain if an animal was subjected or not to the ingestion of a xenobiotic. With this in mind, were used image processing and segmentation techniques applied on images of kidney tissue from healthy rats and rats that ingested the xenobiotic. From these images were extracted several features of renal glomeruli that after being analyzed by various classification algorithms had shown to be possible to know, with an acceptable degree of certainty, if the animal ingested or not the xenobiotic.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Nowadays robotic applications are widespread and most of the manipulation tasks are efficiently solved. However, Deformable-Objects (DOs) still represent a huge limitation for robots. The main difficulty in DOs manipulation is dealing with the shape and dynamics uncertainties, which prevents the use of model-based approaches (since they are excessively computationally complex) and makes sensory data difficult to interpret. This thesis reports the research activities aimed to address some applications in robotic manipulation and sensing of Deformable-Linear-Objects (DLOs), with particular focus to electric wires. In all the works, a significant effort was made in the study of an effective strategy for analyzing sensory signals with various machine learning algorithms. In the former part of the document, the main focus concerns the wire terminals, i.e. detection, grasping, and insertion. First, a pipeline that integrates vision and tactile sensing is developed, then further improvements are proposed for each module. A novel procedure is proposed to gather and label massive amounts of training images for object detection with minimal human intervention. Together with this strategy, we extend a generic object detector based on Convolutional-Neural-Networks for orientation prediction. The insertion task is also extended by developing a closed-loop control capable to guide the insertion of a longer and curved segment of wire through a hole, where the contact forces are estimated by means of a Recurrent-Neural-Network. In the latter part of the thesis, the interest shifts to the DLO shape. Robotic reshaping of a DLO is addressed by means of a sequence of pick-and-place primitives, while a decision making process driven by visual data learns the optimal grasping locations exploiting Deep Q-learning and finds the best releasing point. The success of the solution leverages on a reliable interpretation of the DLO shape. For this reason, further developments are made on the visual segmentation.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Radars are expected to become the main sensors in various civilian applications, especially for autonomous driving. Their success is mainly due to the availability of low cost integrated devices, equipped with compact antenna arrays, and computationally efficient signal processing techniques. This thesis focuses on the study and the development of different deterministic and learning based techniques for colocated multiple-input multiple-output (MIMO) radars. In particular, after providing an overview on the architecture of these devices, the problem of detecting and estimating multiple targets in stepped frequency continuous wave (SFCW) MIMO radar systems is investigated and different deterministic techniques solving it are illustrated. Moreover, novel solutions, based on an approximate maximum likelihood approach, are developed. The accuracy achieved by all the considered algorithms is assessed on the basis of the raw data acquired from low power wideband radar devices. The results demonstrate that the developed algorithms achieve reasonable accuracies, but at the price of different computational efforts. Another important technical problem investigated in this thesis concerns the exploitation of machine learning and deep learning techniques in the field of colocated MIMO radars. In this thesis, after providing a comprehensive overview of the machine learning and deep learning techniques currently being considered for use in MIMO radar systems, their performance in two different applications is assessed on the basis of synthetically generated and experimental datasets acquired through a commercial frequency modulated continuous wave (FMCW) MIMO radar. Finally, the application of colocated MIMO radars to autonomous driving in smart agriculture is illustrated.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Al giorno d'oggi il reinforcement learning ha dimostrato di essere davvero molto efficace nel machine learning in svariati campi, come ad esempio i giochi, il riconoscimento vocale e molti altri. Perciò, abbiamo deciso di applicare il reinforcement learning ai problemi di allocazione, in quanto sono un campo di ricerca non ancora studiato con questa tecnica e perchè questi problemi racchiudono nella loro formulazione un vasto insieme di sotto-problemi con simili caratteristiche, per cui una soluzione per uno di essi si estende ad ognuno di questi sotto-problemi. In questo progetto abbiamo realizzato un applicativo chiamato Service Broker, il quale, attraverso il reinforcement learning, apprende come distribuire l'esecuzione di tasks su dei lavoratori asincroni e distribuiti. L'analogia è quella di un cloud data center, il quale possiede delle risorse interne - possibilmente distribuite nella server farm -, riceve dei tasks dai suoi clienti e li esegue su queste risorse. L'obiettivo dell'applicativo, e quindi del data center, è quello di allocare questi tasks in maniera da minimizzare il costo di esecuzione. Inoltre, al fine di testare gli agenti del reinforcement learning sviluppati è stato creato un environment, un simulatore, che permettesse di concentrarsi nello sviluppo dei componenti necessari agli agenti, invece che doversi anche occupare di eventuali aspetti implementativi necessari in un vero data center, come ad esempio la comunicazione con i vari nodi e i tempi di latenza di quest'ultima. I risultati ottenuti hanno dunque confermato la teoria studiata, riuscendo a ottenere prestazioni migliori di alcuni dei metodi classici per il task allocation.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Acoustic Emission (AE) monitoring can be used to detect the presence of damage as well as determine its location in Structural Health Monitoring (SHM) applications. Information on the time difference of the signal generated by the damage event arriving at different sensors is essential in performing localization. This makes the time of arrival (ToA) an important piece of information to retrieve from the AE signal. Generally, this is determined using statistical methods such as the Akaike Information Criterion (AIC) which is particularly prone to errors in the presence of noise. And given that the structures of interest are surrounded with harsh environments, a way to accurately estimate the arrival time in such noisy scenarios is of particular interest. In this work, two new methods are presented to estimate the arrival times of AE signals which are based on Machine Learning. Inspired by great results in the field, two models are presented which are Deep Learning models - a subset of machine learning. They are based on Convolutional Neural Network (CNN) and Capsule Neural Network (CapsNet). The primary advantage of such models is that they do not require the user to pre-define selected features but only require raw data to be given and the models establish non-linear relationships between the inputs and outputs. The performance of the models is evaluated using AE signals generated by a custom ray-tracing algorithm by propagating them on an aluminium plate and compared to AIC. It was found that the relative error in estimation on the test set was < 5% for the models compared to around 45% of AIC. The testing process was further continued by preparing an experimental setup and acquiring real AE signals to test on. Similar performances were observed where the two models not only outperform AIC by more than a magnitude in their average errors but also they were shown to be a lot more robust as compared to AIC which fails in the presence of noise.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

L'image captioning è un task di machine learning che consiste nella generazione di una didascalia, o caption, che descriva le caratteristiche di un'immagine data in input. Questo può essere applicato, ad esempio, per descrivere in dettaglio i prodotti in vendita su un sito di e-commerce, migliorando l'accessibilità del sito web e permettendo un acquisto più consapevole ai clienti con difficoltà visive. La generazione di descrizioni accurate per gli articoli di moda online è importante non solo per migliorare le esperienze di acquisto dei clienti, ma anche per aumentare le vendite online. Oltre alla necessità di presentare correttamente gli attributi degli articoli, infatti, descrivere i propri prodotti con il giusto linguaggio può contribuire a catturare l'attenzione dei clienti. In questa tesi, ci poniamo l'obiettivo di sviluppare un sistema in grado di generare una caption che descriva in modo dettagliato l'immagine di un prodotto dell'industria della moda dato in input, sia esso un capo di vestiario o un qualche tipo di accessorio. A questo proposito, negli ultimi anni molti studi hanno proposto soluzioni basate su reti convoluzionali e LSTM. In questo progetto proponiamo invece un'architettura encoder-decoder, che utilizza il modello Vision Transformer per la codifica delle immagini e GPT-2 per la generazione dei testi. Studiamo inoltre come tecniche di deep metric learning applicate in end-to-end durante l'addestramento influenzino le metriche e la qualità delle caption generate dal nostro modello.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Reinforcement Learning is an increasingly popular area of Artificial Intelligence. The applications of this learning paradigm are many, but its application in mobile computing is in its infancy. This study aims to provide an overview of current Reinforcement Learning applications on mobile devices, as well as to introduce a new framework for iOS devices: Swift-RL Lib. This new Swift package allows developers to easily support and integrate two of the most common RL algorithms, Q-Learning and Deep Q-Network, in a fully customizable environment. All processes are performed on the device, without any need for remote computation. The framework was tested in different settings and evaluated through several use cases. Through an in-depth performance analysis, we show that the platform provides effective and efficient support for Reinforcement Learning for mobile applications.