171 results for machine learning algorithms


Relevance:

90.00%

Publisher:

Abstract:

The topic of the present work is to study the relationship between the power of the learning algorithms on the one hand, and the expressive power of the logical language which is used to represent the problems to be learned on the other hand. The central question is whether enriching the language results in more learning power. In order to make the question relevant and nontrivial, it is required that both texts (sequences of data) and hypotheses (guesses) be translatable from the “rich” language into the “poor” one. The issue is considered for several logical languages suitable to describe structures whose domain is the set of natural numbers. It is shown that enriching the language does not give any advantage for those languages which define a monadic second-order language being decidable in the following sense: there is a fixed interpretation in the structure of natural numbers such that the set of sentences of this extended language true in that structure is decidable. But enriching the original language even by only one constant gives an advantage if this language contains a binary function symbol (which will be interpreted as addition). Furthermore, it is shown that behaviourally correct learning has exactly the same power as learning in the limit for those languages which define a monadic second-order language with the property given above, but has more power in case of languages containing a binary function symbol. Adding the natural requirement that the set of all structures to be learned is recursively enumerable, it is shown that it pays off to enrich the language of arithmetic for both finite learning and learning in the limit, but it does not pay off to enrich the language for behaviourally correct learning.

Relevance:

90.00%

Publisher:

Abstract:

In price negotiation for an open railway access market, it is feasible to achieve higher cost recovery by applying the principles of price discrimination. The price negotiation can be modeled as an optimization problem of revenue intake. In this paper, we present a pricing negotiation model based on reinforcement learning. A negotiated-price setting technique based on agent learning is introduced, and the feasible applications of the proposed method for open railway access market simulation are discussed.

Relevance:

90.00%

Publisher:

Abstract:

With regard to the long-standing problem of the semantic gap between low-level image features and high-level human knowledge, the image retrieval community has recently shifted its emphasis from low-level feature analysis to high-level image semantics extraction. User studies reveal that users tend to seek information using high-level semantics. Therefore, image semantics extraction is of great importance to content-based image retrieval because it allows the users to freely express what images they want. Semantic content annotation is the basis for semantic content retrieval. The aim of image annotation is to automatically obtain keywords that can be used to represent the content of images. The major research challenges in image semantic annotation are: what is the basic unit of semantic representation? how can the semantic unit be linked to high-level image knowledge? how can the contextual information be stored and utilized for image annotation? In this thesis, the Semantic Web technology (i.e. ontology) is introduced to the image semantic annotation problem. Semantic Web, the next generation web, aims at making the content of whatever type of media not only understandable to humans but also to machines. Due to the large amounts of multimedia data prevalent on the Web, researchers and industries are beginning to pay more attention to the Multimedia Semantic Web. The Semantic Web technology provides a new opportunity for multimedia-based applications, but the research in this area is still in its infancy. Whether ontology can be used to improve image annotation and how to best use ontology in semantic representation and extraction is still a worthwhile investigation. This thesis deals with the problem of image semantic annotation using ontology and machine learning techniques in four phases as below. 1) Salient object extraction. A salient object serves as the basic unit in image semantic extraction as it captures the common visual property of the objects. Image segmentation is often used as the first step for detecting salient objects, but most segmentation algorithms often fail to generate meaningful regions due to over-segmentation and under-segmentation. We develop a new salient object detection algorithm by combining multiple homogeneity criteria in a region merging framework. 2) Ontology construction. Since real-world objects tend to exist in a context within their environment, contextual information has been increasingly used for improving object recognition. In the ontology construction phase, visual-contextual ontologies are built from a large set of fully segmented and annotated images. The ontologies are composed of several types of concepts (i.e. mid-level and high-level concepts), and domain contextual knowledge. The visual-contextual ontologies stand as a user-friendly interface between low-level features and high-level concepts. 3) Image objects annotation. In this phase, each object is labelled with a mid-level concept in ontologies. First, a set of candidate labels is obtained by training Support Vector Machines with features extracted from salient objects. After that, contextual knowledge contained in ontologies is used to obtain the final labels by removing ambiguous concepts. 4) Scene semantic annotation. The scene semantic extraction phase is to get the scene type by using both mid-level concepts and domain contextual knowledge in ontologies. Domain contextual knowledge is used to create a scene configuration that describes which objects co-exist with which scene type more frequently. The scene configuration is represented in a probabilistic graph model, and probabilistic inference is employed to calculate the scene type given an annotated image.
To evaluate the proposed methods, a series of experiments have been conducted in a large set of fully annotated outdoor scene images. These include a subset of the Corel database, a subset of the LabelMe dataset, the evaluation dataset of localized semantics in images, the spatial context evaluation dataset, and the segmented and annotated IAPR TC-12 benchmark.

Relevance:

90.00%

Publisher:

Abstract:

This paper reports on three primary school students’ explorations of 3D rotation in a virtual reality learning environment (VRLE) named VRMath. When asked to investigate if you would face the same direction when you turn right 45 degrees first then roll up 45 degrees, or when you roll up 45 degrees first then turn right 45 degrees, the students found that the different order of the two turns ended up with different directions in the VRLE. This was contrary to the students’ prior predictions based on using pen, paper and body movements. The findings of this study showed the difficulty young children have in perceiving and understanding the non-commutative nature of 3D rotation and the power of the computational VRLE in giving students experiences that they rarely have in real life with 3D manipulations and 3D mental movements.
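The order dependence the students observed can be checked numerically. The sketch below is not part of VRMath. It assumes a right-handed frame with x forward, y to the left and z up, models "turn right" as a rotation about the vertical axis and "roll up" as a nose-up rotation about the lateral axis, and applies each turn about the agent's own axes. These conventions and all function names are assumptions chosen for illustration.

```python
import numpy as np


def rot_z(deg):
    """Rotation matrix about the vertical (z) axis; negative angles turn right."""
    t = np.radians(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def rot_y(deg):
    """Rotation matrix about the lateral (y) axis; negative angles raise the nose."""
    t = np.radians(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def heading(*body_frame_turns):
    """Forward direction after applying the turns in order, each about the agent's own axes."""
    orientation = np.eye(3)
    for turn in body_frame_turns:
        # Post-multiplying composes each new turn in the current body frame.
        orientation = orientation @ turn
    return orientation @ np.array([1.0, 0.0, 0.0])


turn_right = rot_z(-45)  # assumed convention: clockwise seen from above
roll_up = rot_y(-45)     # assumed convention: nose pitches upward

turn_then_roll = heading(turn_right, roll_up)
roll_then_turn = heading(roll_up, turn_right)

print("turn right, then roll up:", np.round(turn_then_roll, 3))
print("roll up, then turn right:", np.round(roll_then_turn, 3))
print("same direction?", np.allclose(turn_then_roll, roll_then_turn))
```

Both final headings point forward, to the right and upward, but in different proportions, so the two orders do not leave the agent facing the same way.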

Relevance:

90.00%

Publisher:

Abstract:

This paper presents a general methodology for learning articulated motions that, despite having non-linear correlations, are cyclical and have a defined pattern of behavior. Using conventional algorithms to extract features from images, a Bayesian classifier is applied to cluster and classify features of the moving object. Clusters are then associated in different frames and structure learning algorithms for Bayesian networks are used to recover the structure of the motion. This framework is applied to human gait analysis and tracking, but applications include any coordinated movement such as multi-robot behavior analysis.

Relevance:

90.00%

Publisher:

Abstract:

Digital forensic examiners often need to identify the type of a file or file fragment based only on the content of the file. Content-based file type identification schemes typically use a byte frequency distribution with statistical machine learning to classify file types. Most algorithms analyze the entire file content to obtain the byte frequency distribution, a technique that is inefficient and time consuming. This paper proposes two techniques for reducing the classification time. The first technique selects a subset of features based on the frequency of occurrence. The second speeds classification by sampling several blocks from the file. Experimental results demonstrate that up to a fifteen-fold reduction in file size analysis time can be achieved with limited impact on accuracy.
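The two speed-ups described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the paper's implementation: the block size, the number of sampled blocks, the choice of the k most frequent byte values as the feature subset, and all function names are assumptions.

```python
import random
from collections import Counter


def byte_frequency(data: bytes) -> list[float]:
    """Normalized frequency of each of the 256 possible byte values."""
    counts = Counter(data)
    total = len(data) or 1  # avoid division by zero on empty input
    return [counts.get(b, 0) / total for b in range(256)]


def sample_blocks(data: bytes, block_size: int = 512, n_blocks: int = 8, seed: int = 0) -> bytes:
    """Concatenate a few randomly chosen blocks instead of reading the whole file."""
    if len(data) <= block_size * n_blocks:
        return data  # small input: nothing to gain from sampling
    rng = random.Random(seed)
    n_total = len(data) // block_size
    starts = sorted(rng.sample(range(n_total), n_blocks))
    return b"".join(data[i * block_size:(i + 1) * block_size] for i in starts)


def top_features(freq: list[float], k: int = 16) -> list[int]:
    """Indices of the k most frequent byte values (one possible feature subset)."""
    return sorted(range(256), key=lambda b: freq[b], reverse=True)[:k]


if __name__ == "__main__":
    content = b"%PDF-1.4 example payload " * 2000  # hypothetical file content
    sampled = sample_blocks(content)
    freq = byte_frequency(sampled)
    print(len(content), "bytes reduced to", len(sampled), "sampled bytes")
    print("most frequent byte values:", top_features(freq, k=5))
```

The reduced feature vector would then be passed to whatever statistical classifier is in use; that step is omitted here because the abstract does not name one.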

Relevance:

90.00%

Publisher:

Abstract:

Many of the classification algorithms developed in the machine learning literature, including the support vector machine and boosting, can be viewed as minimum contrast methods that minimize a convex surrogate of the 0–1 loss function. The convexity makes these algorithms computationally efficient. The use of a surrogate, however, has statistical consequences that must be balanced against the computational virtues of convexity. To study these issues, we provide a general quantitative relationship between the risk as assessed using the 0–1 loss and the risk as assessed using any nonnegative surrogate loss function. We show that this relationship gives nontrivial upper bounds on excess risk under the weakest possible condition on the loss function—that it satisfies a pointwise form of Fisher consistency for classification. The relationship is based on a simple variational transformation of the loss function that is easy to compute in many applications. We also present a refined version of this result in the case of low noise, and show that in this case, strictly convex loss functions lead to faster rates of convergence of the risk than would be implied by standard uniform convergence arguments. Finally, we present applications of our results to the estimation of convergence rates in function classes that are scaled convex hulls of a finite-dimensional base class, with a variety of commonly used loss functions.
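The abstract does not spell out the variational transformation, so the following is a sketch under an assumed definition. For a margin loss phi and a conditional class probability eta, let C(alpha) = eta * phi(alpha) + (1 - eta) * phi(-alpha), let H(eta) be the minimum of C over all alpha, and let H_minus(eta) be the minimum of C over predictions alpha whose sign disagrees with 2 * eta - 1. The transform is then psi(theta) = H_minus((1 + theta) / 2) - H((1 + theta) / 2), and for convex phi the bound takes the form psi(excess 0-1 risk) <= excess surrogate risk. The grid-based minimization and all names below are illustrative assumptions.

```python
import numpy as np

# Grid used to approximate the minimization over the real-valued prediction alpha.
ALPHAS = np.linspace(-5.0, 5.0, 20001)


def hinge(a):
    return np.maximum(0.0, 1.0 - a)


def exponential(a):
    return np.exp(-a)


def psi_transform(phi, theta):
    """Approximate H_minus(eta) - H(eta) at eta = (1 + theta) / 2 on the grid."""
    eta = (1.0 + theta) / 2.0
    risk = eta * phi(ALPHAS) + (1.0 - eta) * phi(-ALPHAS)
    h = risk.min()
    # Predictions whose sign disagrees with the Bayes rule sign(2 * eta - 1);
    # the small tolerance keeps alpha = 0 despite floating-point rounding.
    disagree = ALPHAS * (2.0 * eta - 1.0) <= 1e-12
    h_minus = risk[disagree].min()
    return h_minus - h


for theta in (0.2, 0.6, 0.9):
    print(theta, psi_transform(hinge, theta), psi_transform(exponential, theta))
```

Under this definition the hinge loss gives psi(theta) = |theta| and the exponential loss gives psi(theta) = 1 - sqrt(1 - theta^2), which the grid approximation should reproduce closely.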

Relevance:

90.00%

Publisher:

Abstract:

This important work describes recent theoretical advances in the study of artificial neural networks. It explores probabilistic models of supervised learning problems, and addresses the key statistical and computational questions. Chapters survey research on pattern classification with binary-output networks, including a discussion of the relevance of the Vapnik-Chervonenkis dimension, and of estimates of the dimension for several neural network models. In addition, Anthony and Bartlett develop a model of classification by real-output networks, and demonstrate the usefulness of classification with a "large margin." The authors explain the role of scale-sensitive versions of the Vapnik-Chervonenkis dimension in large margin classification, and in real prediction. Key chapters also discuss the computational complexity of neural network learning, describing a variety of hardness results, and outlining two efficient, constructive learning algorithms. The book is self-contained and accessible to researchers and graduate students in computer science, engineering, and mathematics.

Relevance:

90.00%

Publisher:

Abstract:

We consider the problem of choosing, sequentially, a map which assigns elements of a set A to a few elements of a set B. On each round, the algorithm suffers some cost associated with the chosen assignment, and the goal is to minimize the cumulative loss of these choices relative to the best map on the entire sequence. Even though the offline problem of finding the best map is provably hard, we show that there is an equivalent online approximation algorithm, Randomized Map Prediction (RMP), that is efficient and performs nearly as well. While drawing upon results from the "Online Prediction with Expert Advice" setting, we show how RMP can be utilized as an online approach to several standard batch problems. We apply RMP to online clustering as well as online feature selection and, surprisingly, RMP often outperforms the standard batch algorithms on these problems.
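For readers unfamiliar with the "Online Prediction with Expert Advice" setting that this abstract draws on, a generic randomized exponential-weights forecaster is sketched below. It is not the RMP algorithm, whose details the abstract does not give. The learning rate `eta`, the fixed seed and the function name are assumptions made for illustration.

```python
import math
import random


def exponential_weights(loss_rounds, eta=0.5, seed=0):
    """Follow one randomly drawn expert per round, favouring experts with low past loss.

    loss_rounds: list of rounds, each a list with one loss in [0, 1] per expert.
    Returns the learner's cumulative loss and the final (unnormalized) weights.
    """
    rng = random.Random(seed)
    n_experts = len(loss_rounds[0])
    weights = [1.0] * n_experts
    total_loss = 0.0
    for losses in loss_rounds:
        # Draw an expert with probability proportional to its current weight.
        choice = rng.choices(range(n_experts), weights=weights)[0]
        total_loss += losses[choice]
        # Multiplicative update: experts that just incurred loss are down-weighted.
        weights = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    return total_loss, weights


if __name__ == "__main__":
    # Hypothetical data: expert 0 is always right, experts 1 and 2 always wrong.
    rounds = [[0.0, 1.0, 1.0] for _ in range(50)]
    loss, final_weights = exponential_weights(rounds)
    print("learner loss:", loss, "best expert loss:", 0.0)
    print("final weights:", final_weights)
```

In the map-selection problem each candidate map would play the role of an expert; making this efficient when the number of maps is very large is the part that the paper's algorithm addresses and that this sketch does not attempt.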

Relevance:

90.00%

Publisher:

Abstract:

Machine learning has become a valuable tool for detecting and preventing malicious activity. However, as more applications employ machine learning techniques in adversarial decision-making situations, increasingly powerful attacks become possible against machine learning systems. In this paper, we present three broad research directions towards the end of developing truly secure learning. First, we suggest that finding bounds on adversarial influence is important to understand the limits of what an attacker can and cannot do to a learning system. Second, we investigate the value of adversarial capabilities: the success of an attack depends largely on what types of information and influence the attacker has. Finally, we propose directions in technologies for secure learning and suggest lines of investigation into secure techniques for learning in adversarial environments. We intend this paper to foster discussion about the security of machine learning, and we believe that the research directions we propose represent the most important directions to pursue in the quest for secure learning.

Relevance:

90.00%

Publisher:

Abstract:

Trees, shrubs and other vegetation are of continued importance to the environment and our daily life. They provide shade around our roads and houses, offer a habitat for birds and wildlife, and absorb air pollutants. However, vegetation touching power lines is a risk to public safety and the environment, and one of the main causes of power supply problems. Vegetation management, which includes tree trimming and vegetation control, is a significant cost component of the maintenance of electrical infrastructure. For example, Ergon Energy, the Australian energy distributor with the largest geographic footprint, currently spends over $80 million a year inspecting and managing vegetation that encroaches on power line assets. Currently, most vegetation management programs for distribution systems are calendar-based ground patrols. However, calendar-based inspection by linesmen is labour-intensive, time consuming and expensive. It also results in some zones being trimmed more frequently than needed and others not cut often enough. Moreover, it is seldom practicable to measure all the plants around power line corridors by field methods. Remote sensing data captured from airborne sensors has great potential in assisting vegetation management in power line corridors. This thesis presented a comprehensive study on using spiking neural networks in a specific image analysis application: power line corridor monitoring. Theoretically, the thesis focuses on a biologically inspired spiking cortical model: the pulse coupled neural network (PCNN). The original PCNN model was simplified in order to better analyze the pulse dynamics and control the performance. Some new and effective algorithms were developed based on the proposed spiking cortical model for object detection, image segmentation and invariant feature extraction. The developed algorithms were evaluated in a number of experiments using real image data collected from our flight trials.
The experimental results demonstrated the effectiveness and advantages of spiking neural networks in image processing tasks. Operationally, the knowledge gained from this research project offers a good reference to our industry partner (i.e. Ergon Energy) and other energy utilities that want to improve their vegetation management activities. The novel approaches described in this thesis showed the potential of using cutting-edge sensor technologies and intelligent computing techniques to improve power line corridor monitoring. The lessons learnt from this project are also expected to increase the confidence of energy companies to move from a traditional vegetation management strategy to a more automated, accurate and cost-effective solution using aerial remote sensing techniques.

Relevance:

90.00%

Publisher:

Abstract:

Algorithms for rule extraction from neural networks have been investigated for two decades and there have been significant applications. Despite this level of success, methods for rule extraction from neural networks are generally not part of data mining tools, and a significant commercial breakthrough may still be some time away. This paper briefly reviews the state of the art and points to some of the obstacles, namely a lack of evaluation techniques in experiments and of larger benchmark data sets. A significant new development is the view that rule extraction from neural networks is an interactive process which actively involves the user. This leads to the application of assessment and evaluation techniques from information retrieval which may lead to a range of new methods.

Relevance:

90.00%

Publisher:

Abstract:

A rule-based approach for classifying previously identified medical concepts in clinical free text into an assertion category is presented. There are six different categories of assertions for the task: Present, Absent, Possible, Conditional, Hypothetical and Not associated with the patient. The assertion classification algorithms were largely based on extending the popular NegEx and Context algorithms. In addition, a health-based clinical terminology called SNOMED CT and other publicly available dictionaries were used to classify assertions that did not fit the NegEx/Context model. The data for this task includes discharge summaries from Partners HealthCare and from Beth Israel Deaconess Medical Centre, as well as discharge summaries and progress notes from University of Pittsburgh Medical Centre. The set consists of 349 discharge reports, each with pairs of ground truth concept and assertion files for system development, and 477 reports for evaluation. The system’s performance on the evaluation data set was 0.83, 0.83 and 0.83 for recall, precision and F1-measure, respectively. Although the rule-based system shows promise, further improvements can be made by incorporating machine learning approaches.
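The flavour of trigger-based assertion classification can be sketched as below. This is a toy illustration in the spirit of NegEx-style rules, not the system described in the abstract: the trigger phrases are a small hypothetical list rather than the NegEx or Context lexicons, only words preceding the concept are inspected, and the rule order and window size are assumptions.

```python
import re

# Hypothetical trigger patterns, checked in order; the first match wins.
RULES = [
    ("Not associated with the patient", r"\b(mother|father|sister|brother|family history)\b"),
    ("Hypothetical", r"\b(if|should)\b"),
    ("Absent", r"\b(no|not|denies|without|negative for)\b"),
    ("Possible", r"\b(possible|probable|suspected|may have)\b"),
    ("Conditional", r"\b(when|upon)\b"),
]


def classify_assertion(sentence: str, concept: str, window: int = 6) -> str:
    """Assign an assertion category to a concept already located in the sentence."""
    lowered = sentence.lower()
    position = lowered.find(concept.lower())
    if position < 0:
        raise ValueError("concept not found in sentence")
    # Only the few words immediately before the concept are treated as in scope.
    preceding = " ".join(lowered[:position].split()[-window:])
    for label, pattern in RULES:
        if re.search(pattern, preceding):
            return label
    return "Present"  # default when no trigger is in scope


if __name__ == "__main__":
    examples = [
        ("The patient denies chest pain.", "chest pain"),
        ("Possible pneumonia on imaging.", "pneumonia"),
        ("Mother had breast cancer.", "breast cancer"),
        ("Return if fever develops.", "fever"),
        ("Patient has a fever.", "fever"),
    ]
    for sentence, concept in examples:
        print(f"{concept!r}: {classify_assertion(sentence, concept)}")
```

A realistic system would also need triggers that follow the concept, scope termination terms and dictionary lookups, which is where the terminology resources mentioned in the abstract would come in.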

Relevance:

90.00%

Publisher:

Abstract:

The Toolbox, combined with MATLAB® and a modern workstation computer, is a useful and convenient environment for investigation of machine vision algorithms. For modest image sizes the processing rate can be sufficiently "real-time" to allow for closed-loop control. Focus of attention methods such as dynamic windowing (not provided) can be used to increase the processing rate. With input from a firewire or web camera (support provided) and output to a robot (not provided) it would be possible to implement a visual servo system entirely in MATLAB. The Toolbox provides many functions that are useful in machine vision and vision-based control, and is useful for photometry, photogrammetry and colorimetry. It includes over 100 functions spanning operations such as image file reading and writing, acquisition, display, filtering, blob, point and line feature extraction, mathematical morphology, homographies, visual Jacobians, camera calibration and color space conversion.

Relevance:

90.00%

Publisher:

Abstract:

Electronic services are a leitmotif in ‘hot’ topics like Software as a Service, Service Oriented Architecture (SOA), Service Oriented Computing, Cloud Computing, application markets and smart devices. We propose to consider these in what has been termed the Service Ecosystem (SES). The SES encompasses all levels of electronic services and their interaction, with human consumption and initiation on its periphery, in much the same way the ‘Web’ describes a plethora of technologies that eventuate to connect information and expose it to humans. Presently, the SES is heterogeneous, fragmented and confined to semi-closed systems. A key issue hampering the emergence of an integrated SES is Service Discovery (SD). A SES will be dynamic with areas of structured and unstructured information within which service providers and ‘lay’ human consumers interact; until now the two are disjointed, e.g., SOA-enabled organisations, industries and domains are choreographed by domain experts or ‘hard-wired’ to smart device application markets and web applications. In a SES, services are accessible, comparable and exchangeable to human consumers, closing the gap to the providers. This requires a new SD with which humans can discover services transparently and effectively without special knowledge or training. We propose two modes of discovery: directed search following an agenda, and explorative search, which speculatively expands knowledge of an area of interest by means of categories. Inspired by conceptual space theory from cognitive science, we propose to implement the modes of discovery using concepts to map a lay consumer’s service need to terminologically sophisticated descriptions of services. To this end, we reframe SD as an information retrieval task on the information attached to services, such as descriptions, reviews, documentation and web sites: the Service Information Shadow.
The Semantic Space model transforms the shadow's unstructured semantic information into a geometric, concept-like representation. We introduce an improved and extended Semantic Space including categorization, calling it the Semantic Service Discovery model. We evaluate our model with a highly relevant, service-related corpus simulating a Service Information Shadow, including manually constructed complex service agendas as well as manual groupings of services. We compare our model against state-of-the-art information retrieval systems and clustering algorithms. By means of an extensive series of empirical evaluations, we establish optimal parameter settings for the semantic space model. The evaluations demonstrate the model’s effectiveness for SD in terms of retrieval precision over state-of-the-art information retrieval models (directed search) and the meaningful, automatic categorization of service-related information, which shows potential to form the basis of a useful, cognitively motivated map of the SES for exploratory search.