720 resultados para Machine Learning. Semissupervised learning. Multi-label classification. Reliability Parameter


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The driving task requires sustained attention during prolonged periods, and can be performed in highly predictable or repetitive environments. Such conditions could create hypovigilance and impair performance towards critical events. Identifying such impairment in monotonous conditions has been a major subject of research, but no research to date has attempted to predict it in real-time. This pilot study aims to show that performance decrements due to monotonous tasks can be predicted through mathematical modelling taking into account sensation seeking levels. A short vigilance task sensitive to short periods of lapses of vigilance called Sustained Attention to Response Task is used to assess participants‟ performance. The framework for prediction developed on this task could be extended to a monotonous driving task. A Hidden Markov Model (HMM) is proposed to predict participants‟ lapses in alertness. Driver‟s vigilance evolution is modelled as a hidden state and is correlated to a surrogate measure: the participant‟s reactions time. This experiment shows that the monotony of the task can lead to an important decline in performance in less than five minutes. This impairment can be predicted four minutes in advance with an 86% accuracy using HMMs. This experiment showed that mathematical models such as HMM can efficiently predict hypovigilance through surrogate measures. The presented model could result in the development of an in-vehicle device that detects driver hypovigilance in advance and warn the driver accordingly, thus offering the potential to enhance road safety and prevent road crashes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

As more and more information is available on the Web finding quality and reliable information is becoming harder. To help solve this problem, Web search models need to incorporate users’ cognitive styles. This paper reports the preliminary results from a user study exploring the relationships between Web users’ searching behavior and their cognitive style. The data was collected using a questionnaire, Web search logs and think-aloud strategy. The preliminary findings reveal a number of cognitive factors, such as information searching processes, results evaluations and cognitive style, having an influence on users’ Web searching behavior. Among these factors, the cognitive style of the user was observed to have a greater impact. Based on the key findings, a conceptual model of Web searching and cognitive styles is presented.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With regard to the long-standing problem of the semantic gap between low-level image features and high-level human knowledge, the image retrieval community has recently shifted its emphasis from low-level features analysis to high-level image semantics extrac- tion. User studies reveal that users tend to seek information using high-level semantics. Therefore, image semantics extraction is of great importance to content-based image retrieval because it allows the users to freely express what images they want. Semantic content annotation is the basis for semantic content retrieval. The aim of image anno- tation is to automatically obtain keywords that can be used to represent the content of images. The major research challenges in image semantic annotation are: what is the basic unit of semantic representation? how can the semantic unit be linked to high-level image knowledge? how can the contextual information be stored and utilized for image annotation? In this thesis, the Semantic Web technology (i.e. ontology) is introduced to the image semantic annotation problem. Semantic Web, the next generation web, aims at mak- ing the content of whatever type of media not only understandable to humans but also to machines. Due to the large amounts of multimedia data prevalent on the Web, re- searchers and industries are beginning to pay more attention to the Multimedia Semantic Web. The Semantic Web technology provides a new opportunity for multimedia-based applications, but the research in this area is still in its infancy. Whether ontology can be used to improve image annotation and how to best use ontology in semantic repre- sentation and extraction is still a worth-while investigation. This thesis deals with the problem of image semantic annotation using ontology and machine learning techniques in four phases as below. 1) Salient object extraction. A salient object servers as the basic unit in image semantic extraction as it captures the common visual property of the objects. Image segmen- tation is often used as the �rst step for detecting salient objects, but most segmenta- tion algorithms often fail to generate meaningful regions due to over-segmentation and under-segmentation. We develop a new salient object detection algorithm by combining multiple homogeneity criteria in a region merging framework. 2) Ontology construction. Since real-world objects tend to exist in a context within their environment, contextual information has been increasingly used for improving object recognition. In the ontology construction phase, visual-contextual ontologies are built from a large set of fully segmented and annotated images. The ontologies are composed of several types of concepts (i.e. mid-level and high-level concepts), and domain contextual knowledge. The visual-contextual ontologies stand as a user-friendly interface between low-level features and high-level concepts. 3) Image objects annotation. In this phase, each object is labelled with a mid-level concept in ontologies. First, a set of candidate labels are obtained by training Support Vectors Machines with features extracted from salient objects. After that, contextual knowledge contained in ontologies is used to obtain the �nal labels by removing the ambiguity concepts. 4) Scene semantic annotation. The scene semantic extraction phase is to get the scene type by using both mid-level concepts and domain contextual knowledge in ontologies. Domain contextual knowledge is used to create scene con�guration that describes which objects co-exist with which scene type more frequently. The scene con�guration is represented in a probabilistic graph model, and probabilistic inference is employed to calculate the scene type given an annotated image. To evaluate the proposed methods, a series of experiments have been conducted in a large set of fully annotated outdoor scene images. These include a subset of the Corel database, a subset of the LabelMe dataset, the evaluation dataset of localized semantics in images, the spatial context evaluation dataset, and the segmented and annotated IAPR TC-12 benchmark.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Traditional approaches to the use of machine learning algorithms do not provide a method to learn multiple tasks in one-shot on an embodied robot. It is proposed that grounding actions within the sensory space leads to the development of action-state relationships which can be re-used despite a change in task. A novel approach called an Experience Network is developed and assessed on a real-world robot required to perform three separate tasks. After grounded representations were developed in the initial task, only minimal further learning was required to perform the second and third task.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Genetic research of complex diseases is a challenging, but exciting, area of research. The early development of the research was limited, however, until the completion of the Human Genome and HapMap projects, along with the reduction in the cost of genotyping, which paves the way for understanding the genetic composition of complex diseases. In this thesis, we focus on the statistical methods for two aspects of genetic research: phenotype definition for diseases with complex etiology and methods for identifying potentially associated Single Nucleotide Polymorphisms (SNPs) and SNP-SNP interactions. With regard to phenotype definition for diseases with complex etiology, we firstly investigated the effects of different statistical phenotyping approaches on the subsequent analysis. In light of the findings, and the difficulties in validating the estimated phenotype, we proposed two different methods for reconciling phenotypes of different models using Bayesian model averaging as a coherent mechanism for accounting for model uncertainty. In the second part of the thesis, the focus is turned to the methods for identifying associated SNPs and SNP interactions. We review the use of Bayesian logistic regression with variable selection for SNP identification and extended the model for detecting the interaction effects for population based case-control studies. In this part of study, we also develop a machine learning algorithm to cope with the large scale data analysis, namely modified Logic Regression with Genetic Program (MLR-GEP), which is then compared with the Bayesian model, Random Forests and other variants of logic regression.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We investigate the use of certain data-dependent estimates of the complexity of a function class, called Rademacher and Gaussian complexities. In a decision theoretic setting, we prove general risk bounds in terms of these complexities. We consider function classes that can be expressed as combinations of functions from basis classes and show how the Rademacher and Gaussian complexities of such a function class can be bounded in terms of the complexity of the basis classes. We give examples of the application of these techniques in finding data-dependent risk bounds for decision trees, neural networks and support vector machines.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We consider the problem of choosing, sequentially, a map which assigns elements of a set A to a few elements of a set B. On each round, the algorithm suffers some cost associated with the chosen assignment, and the goal is to minimize the cumulative loss of these choices relative to the best map on the entire sequence. Even though the offline problem of finding the best map is provably hard, we show that there is an equivalent online approximation algorithm, Randomized Map Prediction (RMP), that is efficient and performs nearly as well. While drawing upon results from the "Online Prediction with Expert Advice" setting, we show how RMP can be utilized as an online approach to several standard batch problems. We apply RMP to online clustering as well as online feature selection and, surprisingly, RMP often outperforms the standard batch algorithms on these problems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The topic of fault detection and diagnostics (FDD) is studied from the perspective of proactive testing. Unlike most research focus in the diagnosis area in which system outputs are analyzed for diagnosis purposes, in this paper the focus is on the other side of the problem: manipulating system inputs for better diagnosis reasoning. In other words, the question of how diagnostic mechanisms can direct system inputs for better diagnosis analysis is addressed here. It is shown how the problem can be formulated as decision making problem coupled with a Bayesian Network based diagnostic mechanism. The developed mechanism is applied to the problem of supervised testing in HVAC systems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In fault detection and diagnostics, limitations coming from the sensor network architecture are one of the main challenges in evaluating a system’s health status. Usually the design of the sensor network architecture is not solely based on diagnostic purposes, other factors like controls, financial constraints, and practical limitations are also involved. As a result, it quite common to have one sensor (or one set of sensors) monitoring the behaviour of two or more components. This can significantly extend the complexity of diagnostic problems. In this paper a systematic approach is presented to deal with such complexities. It is shown how the problem can be formulated as a Bayesian network based diagnostic mechanism with latent variables. The developed approach is also applied to the problem of fault diagnosis in HVAC systems, an application area with considerable modeling and measurement constraints.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The risk, or probability of error, of the classifier produced by the AdaBoost algorithm is investigated. In particular, we consider the stopping strategy to be used in AdaBoost to achieve universal consistency. We show that provided AdaBoost is stopped after n1-ε iterations---for sample size n and ε ∈ (0,1)---the sequence of risks of the classifiers it produces approaches the Bayes risk.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the ocean science community, researchers have begun employing novel sensor platforms as integral pieces in oceanographic data collection, which have significantly advanced the study and prediction of complex and dynamic ocean phenomena. These innovative tools are able to provide scientists with data at unprecedented spatiotemporal resolutions. This paper focuses on the newly developed Wave Glider platform from Liquid Robotics. This vehicle produces forward motion by harvesting abundant natural energy from ocean waves, and provides a persistent ocean presence for detailed ocean observation. This study is targeted at determining a kinematic model for offline planning that provides an accurate estimation of the vehicle speed for a desired heading and set of environmental parameters. Given the significant wave height, ocean surface and subsurface currents, wind speed and direction, we present the formulation of a system identification to provide the vehicle’s speed over a range of possible directions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We have used microarray gene expression profiling and machine learning to predict the presence of BRAF mutations in a panel of 61 melanoma cell lines. The BRAF gene was found to be mutated in 42 samples (69%) and intragenic mutations of the NRAS gene were detected in seven samples (11%). No cell line carried mutations of both genes. Using support vector machines, we have built a classifier that differentiates between melanoma cell lines based on BRAF mutation status. As few as 83 genes are able to discriminate between BRAF mutant and BRAF wild-type samples with clear separation observed using hierarchical clustering. Multidimensional scaling was used to visualize the relationship between a BRAF mutation signature and that of a generalized mitogen-activated protein kinase (MAPK) activation (either BRAF or NRAS mutation) in the context of the discriminating gene list. We observed that samples carrying NRAS mutations lie somewhere between those with or without BRAF mutations. These observations suggest that there are gene-specific mutation signals in addition to a common MAPK activation that result from the pleiotropic effects of either BRAF or NRAS on other signaling pathways, leading to measurably different transcriptional changes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Trees, shrubs and other vegetation are of continued importance to the environment and our daily life. They provide shade around our roads and houses, offer a habitat for birds and wildlife, and absorb air pollutants. However, vegetation touching power lines is a risk to public safety and the environment, and one of the main causes of power supply problems. Vegetation management, which includes tree trimming and vegetation control, is a significant cost component of the maintenance of electrical infrastructure. For example, Ergon Energy, the Australia’s largest geographic footprint energy distributor, currently spends over $80 million a year inspecting and managing vegetation that encroach on power line assets. Currently, most vegetation management programs for distribution systems are calendar-based ground patrol. However, calendar-based inspection by linesman is labour-intensive, time consuming and expensive. It also results in some zones being trimmed more frequently than needed and others not cut often enough. Moreover, it’s seldom practicable to measure all the plants around power line corridors by field methods. Remote sensing data captured from airborne sensors has great potential in assisting vegetation management in power line corridors. This thesis presented a comprehensive study on using spiking neural networks in a specific image analysis application: power line corridor monitoring. Theoretically, the thesis focuses on a biologically inspired spiking cortical model: pulse coupled neural network (PCNN). The original PCNN model was simplified in order to better analyze the pulse dynamics and control the performance. Some new and effective algorithms were developed based on the proposed spiking cortical model for object detection, image segmentation and invariant feature extraction. The developed algorithms were evaluated in a number of experiments using real image data collected from our flight trails. The experimental results demonstrated the effectiveness and advantages of spiking neural networks in image processing tasks. Operationally, the knowledge gained from this research project offers a good reference to our industry partner (i.e. Ergon Energy) and other energy utilities who wants to improve their vegetation management activities. The novel approaches described in this thesis showed the potential of using the cutting edge sensor technologies and intelligent computing techniques in improve power line corridor monitoring. The lessons learnt from this project are also expected to increase the confidence of energy companies to move from traditional vegetation management strategy to a more automated, accurate and cost-effective solution using aerial remote sensing techniques.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Rule extraction from neural network algorithms have been investigated for two decades and there have been significant applications. Despite this level of success, rule extraction from neural network methods are generally not part of data mining tools, and a significant commercial breakthrough may still be some time away. This paper briefly reviews the state-of-the-art and points to some of the obstacles, namely a lack of evaluation techniques in experiments and larger benchmark data sets. A significant new development is the view that rule extraction from neural networks is an interactive process which actively involves the user. This leads to the application of assessment and evaluation techniques from information retrieval which may lead to a range of new methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Monitoring the natural environment is increasingly important as habit degradation and climate change reduce theworld’s biodiversity.We have developed software tools and applications to assist ecologists with the collection and analysis of acoustic data at large spatial and temporal scales.One of our key objectives is automated animal call recognition, and our approach has three novel attributes. First, we work with raw environmental audio, contaminated by noise and artefacts and containing calls that vary greatly in volume depending on the animal’s proximity to the microphone. Second, initial experimentation suggested that no single recognizer could dealwith the enormous variety of calls. Therefore, we developed a toolbox of generic recognizers to extract invariant features for each call type. Third, many species are cryptic and offer little data with which to train a recognizer. Many popular machine learning methods require large volumes of training and validation data and considerable time and expertise to prepare. Consequently we adopt bootstrap techniques that can be initiated with little data and refined subsequently. In this paper, we describe our recognition tools and present results for real ecological problems.