882 resultados para Machine Learning. Semissupervised learning. Multi-label classification. Reliability Parameter


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abstract Reputation, influenced by ratings from past clients, is crucial for providers competing for custom. For new providers with less track record, a few negative ratings can harm their chances of growing. In the JASPR project, we aim to look at how to ensure automated reputation assessments are justified and informative. Even an honest balanced review of a service provision may still be an unreliable predictor of future performance if the circumstances differ. For example, a service may have previously relied on different sub-providers to now, or been affected by season-specific weather events. A common way to ameliorate the ratings that may not reflect future performance is by weighting by recency. We argue that better results are obtained by querying provenance records on how services are provided for the circumstances of provision, to determine the significance of past interactions. Informed by case studies in global logistics, taxi hire, and courtesy car leasing, we are going on to explore the generation of explanations for reputation assessments, which can be valuable both for clients and for providers wishing to improve their match to the market, and applying machine learning to predict aspects of service provision which may influence decisions on the appropriateness of a provider. In this talk, I will give an overview of the research conducted and planned on JASPR. Speaker Biography Dr Simon Miles Simon Miles is a Reader in Computer Science at King's College London, UK, and head of the Agents and Intelligent Systems group. He conducts research in the areas of normative systems, data provenance, and medical informatics at King's, and has published widely and manages a number of research projects in these areas. He was previously a researcher at the University of Southampton after graduating from his PhD at Warwick. He has twice been an organising committee member for the Autonomous Agents and Multi-Agent Systems conference series, and was a member of the W3C working group which published standards on interoperable provenance data in 2013.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abstract Heading into the 2020s, Physics and Astronomy are undergoing experimental revolutions that will reshape our picture of the fabric of the Universe. The Large Hadron Collider (LHC), the largest particle physics project in the world, produces 30 petabytes of data annually that need to be sifted through, analysed, and modelled. In astrophysics, the Large Synoptic Survey Telescope (LSST) will be taking a high-resolution image of the full sky every 3 days, leading to data rates of 30 terabytes per night over ten years. These experiments endeavour to answer the question why 96% of the content of the universe currently elude our physical understanding. Both the LHC and LSST share the 5-dimensional nature of their data, with position, energy and time being the fundamental axes. This talk will present an overview of the experiments and data that is gathered, and outlines the challenges in extracting information. Common strategies employed are very similar to industrial data! Science problems (e.g., data filtering, machine learning, statistical interpretation) and provide a seed for exchange of knowledge between academia and industry. Speaker Biography Professor Mark Sullivan Mark Sullivan is a Professor of Astrophysics in the Department of Physics and Astronomy. Mark completed his PhD at Cambridge, and following postdoctoral study in Durham, Toronto and Oxford, now leads a research group at Southampton studying dark energy using exploding stars called "type Ia supernovae". Mark has many years' experience of research that involves repeatedly imaging the night sky to track the arrival of transient objects, involving significant challenges in data handling, processing, classification and analysis.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In computer vision, training a model that performs classification effectively is highly dependent on the extracted features, and the number of training instances. Conventionally, feature detection and extraction are performed by a domain-expert who, in many cases, is expensive to employ and hard to find. Therefore, image descriptors have emerged to automate these tasks. However, designing an image descriptor still requires domain-expert intervention. Moreover, the majority of machine learning algorithms require a large number of training examples to perform well. However, labelled data is not always available or easy to acquire, and dealing with a large dataset can dramatically slow down the training process. In this paper, we propose a novel Genetic Programming based method that automatically synthesises a descriptor using only two training instances per class. The proposed method combines arithmetic operators to evolve a model that takes an image and generates a feature vector. The performance of the proposed method is assessed using six datasets for texture classification with different degrees of rotation, and is compared with seven domain-expert designed descriptors. The results show that the proposed method is robust to rotation, and has significantly outperformed, or achieved a comparable performance to, the baseline methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Current Ambient Intelligence and Intelligent Environment research focuses on the interpretation of a subject’s behaviour at the activity level by logging the Activity of Daily Living (ADL) such as eating, cooking, etc. In general, the sensors employed (e.g. PIR sensors, contact sensors) provide low resolution information. Meanwhile, the expansion of ubiquitous computing allows researchers to gather additional information from different types of sensor which is possible to improve activity analysis. Based on the previous research about sitting posture detection, this research attempts to further analyses human sitting activity. The aim of this research is to use non-intrusive low cost pressure sensor embedded chair system to recognize a subject’s activity by using their detected postures. There are three steps for this research, the first step is to find a hardware solution for low cost sitting posture detection, second step is to find a suitable strategy of sitting posture detection and the last step is to correlate the time-ordered sitting posture sequences with sitting activity. The author initiated a prototype type of sensing system called IntelliChair for sitting posture detection. Two experiments are proceeded in order to determine the hardware architecture of IntelliChair system. The prototype looks at the sensor selection and integration of various sensor and indicates the best for a low cost, non-intrusive system. Subsequently, this research implements signal process theory to explore the frequency feature of sitting posture, for the purpose of determining a suitable sampling rate for IntelliChair system. For second and third step, ten subjects are recruited for the sitting posture data and sitting activity data collection. The former dataset is collected byasking subjects to perform certain pre-defined sitting postures on IntelliChair and it is used for posture recognition experiment. The latter dataset is collected by asking the subjects to perform their normal sitting activity routine on IntelliChair for four hours, and the dataset is used for activity modelling and recognition experiment. For the posture recognition experiment, two Support Vector Machine (SVM) based classifiers are trained (one for spine postures and the other one for leg postures), and their performance evaluated. Hidden Markov Model is utilized for sitting activity modelling and recognition in order to establish the selected sitting activities from sitting posture sequences.2. After experimenting with possible sensors, Force Sensing Resistor (FSR) is selected as the pressure sensing unit for IntelliChair. Eight FSRs are mounted on the seat and back of a chair to gather haptic (i.e., touch-based) posture information. Furthermore, the research explores the possibility of using alternative non-intrusive sensing technology (i.e. vision based Kinect Sensor from Microsoft) and find out the Kinect sensor is not reliable for sitting posture detection due to the joint drifting problem. A suitable sampling rate for IntelliChair is determined according to the experiment result which is 6 Hz. The posture classification performance shows that the SVM based classifier is robust to “familiar” subject data (accuracy is 99.8% with spine postures and 99.9% with leg postures). When dealing with “unfamiliar” subject data, the accuracy is 80.7% for spine posture classification and 42.3% for leg posture classification. The result of activity recognition achieves 41.27% accuracy among four selected activities (i.e. relax, play game, working with PC and watching video). The result of this thesis shows that different individual body characteristics and sitting habits influence both sitting posture and sitting activity recognition. In this case, it suggests that IntelliChair is suitable for individual usage but a training stage is required.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The continuous technology evaluation is benefiting our lives to a great extent. The evolution of Internet of things and deployment of wireless sensor networks is making it possible to have more connectivity between people and devices used extensively in our daily lives. Almost every discipline of daily life including health sector, transportation, agriculture etc. is benefiting from these technologies. There is a great potential of research and refinement of health sector as the current system is very often dependent on manual evaluations conducted by the clinicians. There is no automatic system for patient health monitoring and assessment which results to incomplete and less reliable heath information. Internet of things has a great potential to benefit health care applications by automated and remote assessment, monitoring and identification of diseases. Acute pain is the main cause of people visiting to hospitals. An automatic pain detection system based on internet of things with wireless devices can make the assessment and redemption significantly more efficient. The contribution of this research work is proposing pain assessment method based on physiological parameters. The physiological parameters chosen for this study are heart rate, electrocardiography, breathing rate and galvanic skin response. As a first step, the relation between these physiological parameters and acute pain experienced by the test persons is evaluated. The electrocardiography data collected from the test persons is analyzed to extract interbeat intervals. This evaluation clearly demonstrates specific patterns and trends in these parameters as a consequence of pain. This parametric behavior is then used to assess and identify the pain intensity by implementing machine learning algorithms. Support vector machines are used for classifying these parameters influenced by different pain intensities and classification results are achieved. The classification results with good accuracy rates between two and three levels of pain intensities shows clear indication of pain and the feasibility of this pain assessment method. An improved approach on the basis of this research work can be implemented by using both physiological parameters and electromyography data of facial muscles for classification.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A primary goal of context-aware systems is delivering the right information at the right place and right time to users in order to enable them to make effective decisions and improve their quality of life. There are three key requirements for achieving this goal: determining what information is relevant, personalizing it based on the users’ context (location, preferences, behavioral history etc.), and delivering it to them in a timely manner without an explicit request from them. These requirements create a paradigm that we term as “Proactive Context-aware Computing”. Most of the existing context-aware systems fulfill only a subset of these requirements. Many of these systems focus only on personalization of the requested information based on users’ current context. Moreover, they are often designed for specific domains. In addition, most of the existing systems are reactive - the users request for some information and the system delivers it to them. These systems are not proactive i.e. they cannot anticipate users’ intent and behavior and act proactively without an explicit request from them. In order to overcome these limitations, we need to conduct a deeper analysis and enhance our understanding of context-aware systems that are generic, universal, proactive and applicable to a wide variety of domains. To support this dissertation, we explore several directions. Clearly the most significant sources of information about users today are smartphones. A large amount of users’ context can be acquired through them and they can be used as an effective means to deliver information to users. In addition, social media such as Facebook, Flickr and Foursquare provide a rich and powerful platform to mine users’ interests, preferences and behavioral history. We employ the ubiquity of smartphones and the wealth of information available from social media to address the challenge of building proactive context-aware systems. We have implemented and evaluated a few approaches, including some as part of the Rover framework, to achieve the paradigm of Proactive Context-aware Computing. Rover is a context-aware research platform which has been evolving for the last 6 years. Since location is one of the most important context for users, we have developed ‘Locus’, an indoor localization, tracking and navigation system for multi-story buildings. Other important dimensions of users’ context include the activities that they are engaged in. To this end, we have developed ‘SenseMe’, a system that leverages the smartphone and its multiple sensors in order to perform multidimensional context and activity recognition for users. As part of the ‘SenseMe’ project, we also conducted an exploratory study of privacy, trust, risks and other concerns of users with smart phone based personal sensing systems and applications. To determine what information would be relevant to users’ situations, we have developed ‘TellMe’ - a system that employs a new, flexible and scalable approach based on Natural Language Processing techniques to perform bootstrapped discovery and ranking of relevant information in context-aware systems. In order to personalize the relevant information, we have also developed an algorithm and system for mining a broad range of users’ preferences from their social network profiles and activities. For recommending new information to the users based on their past behavior and context history (such as visited locations, activities and time), we have developed a recommender system and approach for performing multi-dimensional collaborative recommendations using tensor factorization. For timely delivery of personalized and relevant information, it is essential to anticipate and predict users’ behavior. To this end, we have developed a unified infrastructure, within the Rover framework, and implemented several novel approaches and algorithms that employ various contextual features and state of the art machine learning techniques for building diverse behavioral models of users. Examples of generated models include classifying users’ semantic places and mobility states, predicting their availability for accepting calls on smartphones and inferring their device charging behavior. Finally, to enable proactivity in context-aware systems, we have also developed a planning framework based on HTN planning. Together, these works provide a major push in the direction of proactive context-aware computing.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We propose three research problems to explore the relations between trust and security in the setting of distributed computation. In the first problem, we study trust-based adversary detection in distributed consensus computation. The adversaries we consider behave arbitrarily disobeying the consensus protocol. We propose a trust-based consensus algorithm with local and global trust evaluations. The algorithm can be abstracted using a two-layer structure with the top layer running a trust-based consensus algorithm and the bottom layer as a subroutine executing a global trust update scheme. We utilize a set of pre-trusted nodes, headers, to propagate local trust opinions throughout the network. This two-layer framework is flexible in that it can be easily extensible to contain more complicated decision rules, and global trust schemes. The first problem assumes that normal nodes are homogeneous, i.e. it is guaranteed that a normal node always behaves as it is programmed. In the second and third problems however, we assume that nodes are heterogeneous, i.e, given a task, the probability that a node generates a correct answer varies from node to node. The adversaries considered in these two problems are workers from the open crowd who are either investing little efforts in the tasks assigned to them or intentionally give wrong answers to questions. In the second part of the thesis, we consider a typical crowdsourcing task that aggregates input from multiple workers as a problem in information fusion. To cope with the issue of noisy and sometimes malicious input from workers, trust is used to model workers' expertise. In a multi-domain knowledge learning task, however, using scalar-valued trust to model a worker's performance is not sufficient to reflect the worker's trustworthiness in each of the domains. To address this issue, we propose a probabilistic model to jointly infer multi-dimensional trust of workers, multi-domain properties of questions, and true labels of questions. Our model is very flexible and extensible to incorporate metadata associated with questions. To show that, we further propose two extended models, one of which handles input tasks with real-valued features and the other handles tasks with text features by incorporating topic models. Our models can effectively recover trust vectors of workers, which can be very useful in task assignment adaptive to workers' trust in the future. These results can be applied for fusion of information from multiple data sources like sensors, human input, machine learning results, or a hybrid of them. In the second subproblem, we address crowdsourcing with adversaries under logical constraints. We observe that questions are often not independent in real life applications. Instead, there are logical relations between them. Similarly, workers that provide answers are not independent of each other either. Answers given by workers with similar attributes tend to be correlated. Therefore, we propose a novel unified graphical model consisting of two layers. The top layer encodes domain knowledge which allows users to express logical relations using first-order logic rules and the bottom layer encodes a traditional crowdsourcing graphical model. Our model can be seen as a generalized probabilistic soft logic framework that encodes both logical relations and probabilistic dependencies. To solve the collective inference problem efficiently, we have devised a scalable joint inference algorithm based on the alternating direction method of multipliers. The third part of the thesis considers the problem of optimal assignment under budget constraints when workers are unreliable and sometimes malicious. In a real crowdsourcing market, each answer obtained from a worker incurs cost. The cost is associated with both the level of trustworthiness of workers and the difficulty of tasks. Typically, access to expert-level (more trustworthy) workers is more expensive than to average crowd and completion of a challenging task is more costly than a click-away question. In this problem, we address the problem of optimal assignment of heterogeneous tasks to workers of varying trust levels with budget constraints. Specifically, we design a trust-aware task allocation algorithm that takes as inputs the estimated trust of workers and pre-set budget, and outputs the optimal assignment of tasks to workers. We derive the bound of total error probability that relates to budget, trustworthiness of crowds, and costs of obtaining labels from crowds naturally. Higher budget, more trustworthy crowds, and less costly jobs result in a lower theoretical bound. Our allocation scheme does not depend on the specific design of the trust evaluation component. Therefore, it can be combined with generic trust evaluation algorithms.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Requirement engineering is a key issue in the development of a software project. Like any other development activity it is not without risks. This work is about the empirical study of risks of requirements by applying machine learning techniques, specifically Bayesian networks classifiers. We have defined several models to predict the risk level for a given requirement using three dataset that collect metrics taken from the requirement specifications of different projects. The classification accuracy of the Bayesian models obtained is evaluated and compared using several classification performance measures. The results of the experiments show that the Bayesians networks allow obtaining valid predictors. Specifically, a tree augmented network structure shows a competitive experimental performance in all datasets. Besides, the relations established between the variables collected to determine the level of risk in a requirement, match with those set by requirement engineers. We show that Bayesian networks are valid tools for the automation of risks assessment in requirement engineering.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Melanoma is a type of skin cancer and is caused by the uncontrolled growth of atypical melanocytes. In recent decades, computer aided diagnosis is used to support medical professionals; however, there is still no globally accepted tool. In this context, similar to state-of-the-art we propose a system that receives a dermatoscopy image and provides a diagnostic if the lesion is benign or malignant. This tool is composed with next modules: Preprocessing, Segmentation, Feature Extraction, and Classification. Preprocessing involves the removal of hairs. Segmentation is to isolate the lesion. Feature extraction is considering the ABCD dermoscopy rule. The classification is performed by the Support Vector Machine. Experimental evidence indicates that the proposal has 90.63 % accuracy, 95 % sensitivity, and 83.33 % specificity on a data-set of 104 dermatoscopy images. These results are favorable considering the performance of diagnosis by traditional progress in the area of dermatology

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Tämä diplomityö tarkastelee pelaajatyyppien ja pelaajamotivaatioiden tunnistamista videopeleissä. Aiempi tutkimus tuntee monia pelaajatyyppien malleja, mutta niitä ei ole liiemmin sovellettu käytäntöön peleissä. Tässä työssä suoritetaan systemaattinen kirjallisuuskartoitus erilaisista pelaajatyyppien malleista, jonka pohjalta esitetään useita pelaajien luokittelutapoja. Lisäksi toteutetaan tapaustutkimus, jossa kirjallisuuden pohjalta valitaan pelaajien luokittelumalli ja testataan mallia käytännössä tunnistamalla pelaajatyyppejä data-analytiikan avulla reaaliaikaisessa strategiapelissä.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Recent years have seen an astronomical rise in SQL Injection Attacks (SQLIAs) used to compromise the confidentiality, authentication and integrity of organisations’ databases. Intruders becoming smarter in obfuscating web requests to evade detection combined with increasing volumes of web traffic from the Internet of Things (IoT), cloud-hosted and on-premise business applications have made it evident that the existing approaches of mostly static signature lack the ability to cope with novel signatures. A SQLIA detection and prevention solution can be achieved through exploring an alternative bio-inspired supervised learning approach that uses input of labelled dataset of numerical attributes in classifying true positives and negatives. We present in this paper a Numerical Encoding to Tame SQLIA (NETSQLIA) that implements a proof of concept for scalable numerical encoding of features to a dataset attributes with labelled class obtained from deep web traffic analysis. In the numerical attributes encoding: the model leverages proxy in the interception and decryption of web traffic. The intercepted web requests are then assembled for front-end SQL parsing and pattern matching by applying traditional Non-Deterministic Finite Automaton (NFA). This paper is intended for a technique of numerical attributes extraction of any size primed as an input dataset to an Artificial Neural Network (ANN) and statistical Machine Learning (ML) algorithms implemented using Two-Class Averaged Perceptron (TCAP) and Two-Class Logistic Regression (TCLR) respectively. This methodology then forms the subject of the empirical evaluation of the suitability of this model in the accurate classification of both legitimate web requests and SQLIA payloads.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Artificial immune systems, more specifically the negative selection algorithm, have previously been applied to intrusion detection. The aim of this research is to develop an intrusion detection system based on a novel concept in immunology, the Danger Theory. Dendritic Cells (DCs) are antigen presenting cells and key to the activation of the human immune system. DCs perform the vital role of combining signals from the host tissue and correlate these signals with proteins known as antigens. In algorithmic terms, individual DCs perform multi-sensor data fusion based on time-windows. The whole population of DCs asynchronously correlates the fused signals with a secondary data stream. The behaviour of human DCs is abstracted to form the DC Algorithm (DCA), which is implemented using an immune inspired framework, libtissue. This system is used to detect context switching for a basic machine learning dataset and to detect outgoing portscans in real-time. Experimental results show a significant difference between an outgoing portscan and normal traffic.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Humans have a high ability to extract visual data information acquired by sight. Trought a learning process, which starts at birth and continues throughout life, image interpretation becomes almost instinctively. At a glance, one can easily describe a scene with reasonable precision, naming its main components. Usually, this is done by extracting low-level features such as edges, shapes and textures, and associanting them to high level meanings. In this way, a semantic description of the scene is done. An example of this, is the human capacity to recognize and describe other people physical and behavioral characteristics, or biometrics. Soft-biometrics also represents inherent characteristics of human body and behaviour, but do not allow unique person identification. Computer vision area aims to develop methods capable of performing visual interpretation with performance similar to humans. This thesis aims to propose computer vison methods which allows high level information extraction from images in the form of soft biometrics. This problem is approached in two ways, unsupervised and supervised learning methods. The first seeks to group images via an automatic feature extraction learning , using both convolution techniques, evolutionary computing and clustering. In this approach employed images contains faces and people. Second approach employs convolutional neural networks, which have the ability to operate on raw images, learning both feature extraction and classification processes. Here, images are classified according to gender and clothes, divided into upper and lower parts of human body. First approach, when tested with different image datasets obtained an accuracy of approximately 80% for faces and non-faces and 70% for people and non-person. The second tested using images and videos, obtained an accuracy of about 70% for gender, 80% to the upper clothes and 90% to lower clothes. The results of these case studies, show that proposed methods are promising, allowing the realization of automatic high level information image annotation. This opens possibilities for development of applications in diverse areas such as content-based image and video search and automatica video survaillance, reducing human effort in the task of manual annotation and monitoring.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Este trabajo se enfoca en la implementación de un detector de arrecife de coral de desempeño rápido que se utiliza para un vehículo autónomo submarino (Autonomous Underwater Vehicle, AUV, por sus siglas en inglés). Una detección rápida de la presencia de coral asegura la estabilización del AUV frente al arrecife en el menor tiempo posible, evitando colisiones con el coral. La detección de coral se hace en una imagen que captura la escena que percibe la cámara del AUV. Se realiza una clasificación píxel por píxel entre dos clases: arrecife de coral y el plano de fondo que no es coral. A cada píxel de la imagen se le asigna un vector característico, el mismo que se genera mediante el uso de filtros Gabor Wavelets. Éstos son implementados en C++ y la librería OpenCV. Los vectores característicos son clasificados a través de nueve algoritmos de máquinas de aprendizaje. El desempeño de cada algoritmo se compara mediante la precisión y el tiempo de ejecución. El algoritmo de Árboles de Decisión resultó ser el más rápido y preciso de entre todos los algoritmos. Se creó una base de datos de 621 imágenes de corales de Belice (110 imágenes de entrenamiento y 511 imágenes de prueba).