776 results for Machine learning methods



Relevance: 90.00%

Abstract:

In the pattern recognition research field, Support Vector Machines (SVMs) have been an effective tool for classification, and have been successfully employed in many applications. The SVM input data is transformed into a high-dimensional space using a kernel function, in which linear separation is more likely. However, there are some computational drawbacks associated with SVMs. One of them is the computational burden of finding the most adequate kernel parameters for each non-linearly separable input space, on which the performance of the SVM depends. This paper introduces the Polynomial Powers of Sigmoid for SVM kernel mapping and shows their advantages over well-known kernel functions on real and synthetic datasets.
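As a hedged illustration of how a custom kernel such as the Polynomial Powers of Sigmoid could be plugged into an off-the-shelf SVM, the sketch below passes a callable Gram-matrix function to scikit-learn's SVC. The kernel form (a sum of powers of a tanh base kernel) and its coefficients are assumptions for illustration only; the paper's exact formulation is not given in this abstract.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

def pps_kernel(X, Y, degree=3, alpha=1.0, c=0.0):
    """Hypothetical Polynomial-Powers-of-Sigmoid-style kernel:
    a polynomial combination of sigmoid (tanh) terms. The exact
    form used in the paper may differ."""
    s = np.tanh(alpha * (X @ Y.T) + c)                # sigmoid base kernel
    return sum(s ** p for p in range(1, degree + 1))  # sum of its powers

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# SVC accepts any callable that returns the Gram matrix between two sets.
clf = SVC(kernel=pps_kernel).fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```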


Relevance: 90.00%

Abstract:

In active learning, a machine learning algorithm is given an unlabeled set of examples U and is allowed to request labels for a relatively small subset of U to use for training. The goal is to judiciously choose which examples in U to have labeled in order to optimize some performance criterion, e.g. classification accuracy. We study how active learning affects AUC. We examine two existing algorithms from the literature and present our own active learning algorithms designed to maximize the AUC of the hypothesis. One of our algorithms was consistently the top performer, and Closest Sampling from the literature often came in second behind it. When good posterior probability estimates were available, our heuristics were by far the best.
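The abstract names Closest Sampling, which queries the unlabeled examples closest to the current decision boundary. Below is a minimal sketch of that baseline (not the authors' AUC-maximizing heuristics) using scikit-learn; the seed set size and labeling budget are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=1000, random_state=0)
labeled = list(range(10))                       # small seed labeled set
pool = list(range(10, len(y)))                  # unlabeled pool U

for _ in range(40):                             # assumed labeling budget
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    margins = np.abs(clf.decision_function(X[pool]))
    query = pool.pop(int(np.argmin(margins)))   # closest to the boundary
    labeled.append(query)                       # oracle supplies its label

clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
print("AUC on remaining pool:",
      roc_auc_score(y[pool], clf.decision_function(X[pool])))
```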

Relevance: 90.00%

Abstract:

We explore the problem of budgeted machine learning, in which the learning algorithm has free access to the training examples' labels but has to pay for each attribute that is specified. This learning model is appropriate in many areas, including medical applications. We present new algorithms, based on algorithms for the multi-armed bandit problem, for choosing which attributes of which examples to purchase in the budgeted learning model. All of our approaches outperformed the current state of the art. Furthermore, we present a new means of selecting the example to purchase once the attribute has been selected, instead of selecting an example uniformly at random, as is typically done. Our new example selection method improved the performance of all the algorithms we tested, both ours and those in the literature.
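A toy sketch of the bandit framing: treat each attribute as an arm, maintain a running estimate of the value of purchasing it, and spend the budget epsilon-greedily. The reward signal, budget, and epsilon here are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n_attrs, budget, eps = 8, 100, 0.1
counts = np.zeros(n_attrs)          # purchases per attribute (arm)
values = np.zeros(n_attrs)          # running mean reward per arm

def purchase_reward(attr):
    """Stand-in for the real signal: how much buying this attribute
    for some example improved the model (hypothetical simulator)."""
    return rng.normal(loc=attr / n_attrs, scale=0.5)

for _ in range(budget):
    if rng.random() < eps:                      # explore
        arm = int(rng.integers(n_attrs))
    else:                                       # exploit best estimate
        arm = int(np.argmax(values))
    r = purchase_reward(arm)
    counts[arm] += 1
    values[arm] += (r - values[arm]) / counts[arm]   # incremental mean

print("estimated attribute values:", np.round(values, 2))
```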

Relevance: 90.00%

Abstract:

Semi-supervised learning techniques have gained increasing attention in the machine learning community as a result of two main factors: (1) the amount of available data is increasing exponentially; (2) labeling data is cumbersome and expensive, as it involves human experts. In this paper, we propose a network-based semi-supervised learning method inspired by the modularity greedy algorithm, which was originally applied to unsupervised learning. Changes have been made to the modularity maximization process to adapt the model to propagate labels throughout the network. Furthermore, a network reduction technique is introduced, together with an extensive analysis of its impact on the network. Computer simulations are performed on artificial and real-world databases, providing a quantitative numerical basis for the performance of the proposed method.
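The paper's method adapts modularity maximization to propagate labels; as a rough illustration of the general network-based semi-supervised setting it operates in (not the modularity algorithm itself), here is label spreading over a kNN similarity graph with scikit-learn:

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.semi_supervised import LabelSpreading

X, y = make_moons(n_samples=300, noise=0.1, random_state=0)
y_partial = np.full_like(y, -1)       # -1 marks unlabeled points
known = np.random.default_rng(0).choice(len(y), size=10, replace=False)
y_partial[known] = y[known]           # only 10 labeled examples

# Propagate labels through a kNN graph built over the data.
model = LabelSpreading(kernel="knn", n_neighbors=7).fit(X, y_partial)
mask = y_partial == -1
print("accuracy on unlabeled points:",
      (model.transduction_[mask] == y[mask]).mean())
```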

Relevance: 90.00%

Abstract:

Recently, research has shown that the performance of metaheuristics can be affected by population initialization. Opposition-based Differential Evolution (ODE), Quasi-Oppositional Differential Evolution (QODE), and Uniform-Quasi-Opposition Differential Evolution (UQODE) are three state-of-the-art methods that improve the performance of the Differential Evolution algorithm through population initialization and different search strategies. Taking a different approach to achieve similar results, this paper presents a technique to discover promising regions in the continuous search space of an optimization problem. Using machine-learning techniques, the algorithm, named Smart Sampling (SS), finds regions with a high probability of containing a global optimum. A metaheuristic can then be initialized inside each region to find that optimum. SS and DE were combined (originating the SSDE algorithm) to evaluate our approach, and experiments were conducted on the same set of benchmark functions used by the ODE, QODE and UQODE authors. Results show that the total number of function evaluations required by DE to reach the global optimum can be significantly reduced, and that the success rate improves, when SS is employed first. These results agree with the literature on the importance of an adequate starting population. Moreover, SS finds initial populations of superior quality compared to the other three algorithms that employ oppositional learning. Finally, and most importantly, the performance of SS in finding promising regions is independent of the metaheuristic with which it is combined, making SS suitable for improving the performance of a large variety of optimization techniques.
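A hedged sketch of the idea as described here: sample the search space, keep the most promising points, and seed differential evolution inside the region they span. The region estimate and all parameters are illustrative assumptions, and scipy's differential_evolution stands in for the authors' DE implementation.

```python
import numpy as np
from scipy.optimize import differential_evolution

def rastrigin(x):
    """Standard multimodal benchmark function."""
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

rng = np.random.default_rng(0)
dim, low, high = 5, -5.12, 5.12

# Smart-Sampling-like step (assumed form): evaluate random samples and
# keep the best fraction to delimit a promising region.
samples = rng.uniform(low, high, size=(500, dim))
scores = np.apply_along_axis(rastrigin, 1, samples)
best = samples[np.argsort(scores)[:50]]
bounds = list(zip(best.min(axis=0), best.max(axis=0)))

# Run DE only inside the promising region.
result = differential_evolution(rastrigin, bounds, seed=0)
print(result.x, result.fun)
```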

Relevance: 90.00%

Abstract:

Shared attention is a type of communication that is very important among human beings. It is sometimes regarded as the more complex form of communication, constituted by a sequence of four steps: mutual gaze, gaze following, imperative pointing and declarative pointing. Some approaches have been proposed in the Human-Robot Interaction area to solve part of the shared attention process; most of the proposed works try to solve the first two steps. Models based on temporal difference, neural networks, probabilistic methods and reinforcement learning have been used in several works. In this article, we present a robotic architecture that provides a robot or agent with the capacity to learn mutual gaze, gaze following and declarative pointing, using a robotic head interacting with a caregiver. Three learning methods have been incorporated into this architecture, and their performance has been compared to find the most adequate for use in real experiments. The learning capabilities of this architecture were analyzed by observing the robot interacting with the human in a controlled environment. The experimental results show that the robotic head is able to produce appropriate behavior and to learn from social interaction.
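Since the abstract compares temporal-difference and reinforcement-learning models, here is a minimal, hypothetical tabular Q-learning sketch of gaze following framed as a one-step discrete task (caregiver head direction as state, where to look as action). The state/action encoding and reward are invented for illustration and are not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_dirs = 5                      # discretized caregiver head directions
Q = np.zeros((n_dirs, n_dirs))  # Q[state, action]: value of looking there
alpha, eps = 0.1, 0.2

for episode in range(2000):
    state = int(rng.integers(n_dirs))       # caregiver looks somewhere
    if rng.random() < eps:
        action = int(rng.integers(n_dirs))  # explore
    else:
        action = int(np.argmax(Q[state]))   # follow current policy
    reward = 1.0 if action == state else 0.0  # reward for following gaze
    # One-step task, so no bootstrapped next-state value is needed.
    Q[state, action] += alpha * (reward - Q[state, action])

print("learned policy:", Q.argmax(axis=1))  # should follow the caregiver
```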

Relevance: 90.00%

Abstract:

In multi-label classification, examples can be associated with multiple labels simultaneously. The task of learning from multi-label data can be addressed by methods that transform the multi-label classification problem into several single-label classification problems. The binary relevance approach is one of these methods: the multi-label learning task is decomposed into several independent binary classification problems, one for each label in the label set, and the final labels for each example are determined by aggregating the predictions of all binary classifiers. However, this approach fails to consider any dependency among the labels. Aiming to accurately predict label combinations, in this paper we propose a simple approach that enables the binary classifiers to discover existing label dependencies by themselves. An experimental study using decision trees, a kernel method and Naive Bayes as base learners shows the potential of the proposed approach to improve multi-label classification performance.
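One simple way to let binary classifiers see label dependencies, sketched here as a generic stacked binary-relevance scheme (not necessarily the paper's exact approach), is to feed each label's classifier the other labels' first-stage predictions as extra features:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_multilabel_classification

X, Y = make_multilabel_classification(n_samples=400, n_classes=4,
                                      random_state=0)
n_labels = Y.shape[1]

# Stage 1: ordinary binary relevance, one classifier per label.
stage1 = [DecisionTreeClassifier(random_state=0).fit(X, Y[:, j])
          for j in range(n_labels)]
P = np.column_stack([c.predict(X) for c in stage1])

# Stage 2: re-train each classifier with the other labels' predictions
# appended as features, exposing label dependencies to the learner.
stage2 = []
for j in range(n_labels):
    others = np.delete(P, j, axis=1)
    stage2.append(DecisionTreeClassifier(random_state=0)
                  .fit(np.hstack([X, others]), Y[:, j]))
print("trained", len(stage2), "dependency-aware classifiers")
```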

Relevance: 90.00%

Abstract:

Background: Educational computer games are examples of computer-assisted learning objects, representing an educational strategy of growing interest. Given the changes in the digital world over the last decades, students of the current generation expect technology to be used in advancing their learning, requiring a shift from traditional passive learning methodologies to an active, multisensory, experimental learning methodology. The objective of this study was to compare a computer game-based learning method with a traditional learning method, regarding learning gains and knowledge retention, as a means of teaching head and neck Anatomy and Physiology to Speech-Language and Hearing pathology undergraduate students.

Methods: Students were randomized to one of the learning methods, and the data analyst was blinded to which method each student had received. Students' prior knowledge (i.e. before undergoing the learning method), short-term knowledge retention and long-term knowledge retention (i.e. six months after undergoing the learning method) were assessed with a multiple-choice questionnaire. Students' performance was compared across the three assessment moments, both for the mean total score and for separate mean scores for the Anatomy questions and the Physiology questions.

Results: Students who received the game-based method performed better in the post-test assessment only for the Anatomy questions section. Students who received the traditional lecture performed better in both the post-test and the long-term post-test for the Anatomy and Physiology questions.

Conclusions: The game-based learning method is comparable to the traditional learning method in general and in short-term gains, while the traditional lecture still seems to be more effective in improving students' short- and long-term knowledge retention.

Relevance: 90.00%

Abstract:

The continuous increase in genome sequencing projects has produced a huge amount of data over the last 10 years: currently, more than 600 prokaryotic and 80 eukaryotic genomes are fully sequenced and publicly available. However, the sequencing process alone determines only raw nucleotide sequences. This is just the first step of the genome annotation process, which deals with assigning biological information to each sequence. Annotation is done at each level of the biological information processing mechanism, from DNA to protein, and cannot be accomplished by in vitro analysis procedures alone, which would be extremely expensive and time consuming at such a large scale. Thus, in silico methods are needed to accomplish the task. The aim of this work was the implementation of predictive computational methods allowing fast, reliable, and automated annotation of genomes and proteins starting from amino acid sequences. The first part of the work focused on the implementation of a new machine-learning-based method for predicting the subcellular localization of soluble eukaryotic proteins. The method, called BaCelLo, was developed in 2006. Its main peculiarity is its independence from biases in the training dataset, which cause the over-prediction of the most represented examples in all the other predictors developed so far. This important result was achieved by a modification, made by myself, to the standard Support Vector Machine (SVM) algorithm, creating the so-called Balanced SVM. BaCelLo predicts the most important subcellular localizations in eukaryotic cells, and three kingdom-specific predictors were implemented. In two extensive comparisons, carried out in 2006 and 2008, BaCelLo was reported to outperform all the currently available state-of-the-art methods for this prediction task. BaCelLo was subsequently used to completely annotate 5 eukaryotic genomes, by integrating it into a pipeline of predictors developed at the Bologna Biocomputing group by Dr. Pier Luigi Martelli and Dr. Piero Fariselli. An online database, called eSLDB, was developed by integrating, for each amino acid sequence extracted from the genomes, the predicted subcellular localization merged with experimental and similarity-based annotations. In the second part of the work, a new machine-learning-based method was implemented for the prediction of GPI-anchored proteins. The method efficiently predicts, from the raw amino acid sequence, both the presence of the GPI anchor (by means of an SVM) and the position in the sequence of the post-translational modification event, the so-called ω-site (by means of a Hidden Markov Model (HMM)). The method, called GPIPE, was reported to greatly improve the prediction of GPI-anchored proteins over all previously developed methods. GPIPE predicted up to 88% of the experimentally annotated GPI-anchored proteins while maintaining a false positive rate as low as 0.1%. GPIPE was used to completely annotate 81 eukaryotic genomes, and more than 15000 putative GPI-anchored proteins were predicted, 561 of which are found in H. sapiens. On average, 1% of a proteome is predicted as GPI-anchored. A statistical analysis of the composition of the regions surrounding the ω-site allowed the definition of specific amino acid abundances in the different regions considered.
Furthermore, the hypothesis, proposed in the literature, that compositional biases are present among the four major eukaryotic kingdoms was tested and rejected. All the developed predictors and databases are freely available at: BaCelLo http://gpcr.biocomp.unibo.it/bacello ; eSLDB http://gpcr.biocomp.unibo.it/esldb ; GPIPE http://gpcr.biocomp.unibo.it/gpipe
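The Balanced SVM itself is the author's modification to the SVM algorithm; as a hedged analogue, per-class penalty weighting, a standard way to counter training-set imbalance in SVMs, can be sketched with scikit-learn:

```python
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Imbalanced toy data standing in for under-represented localizations.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" scales the per-class penalty C inversely to
# class frequency, so rare classes are not systematically under-predicted.
for cw in (None, "balanced"):
    clf = SVC(class_weight=cw).fit(X_tr, y_tr)
    print(cw, balanced_accuracy_score(y_te, clf.predict(X_te)))
```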

Relevance: 90.00%

Abstract:

In the collective imagination, a robot is a human-like machine, like the androids of science fiction. However, the type of robot you will encounter most frequently is machinery that does work that is too dangerous, boring or onerous for humans. Most of the robots in the world are of this type. They can be found in the automotive, medical, manufacturing and space industries. A robot, then, is a system that contains sensors, control systems, manipulators, power supplies and software, all working together to perform a task. The development and use of such systems is an active area of research, and one of the main problems is the development of interaction skills with the surrounding environment, which include the ability to grasp objects. To perform this task the robot needs to sense the environment and acquire information about the object, the physical attributes that may influence a grasp. Humans solve this grasping problem easily thanks to their past experience, which is why many researchers approach it from a machine learning perspective, finding grasps for an object using information about already known objects. But humans can select the best grasp from a vast repertoire by considering not only the physical attributes of the object but also the effect they want to obtain. This is why, in our case, the study of robot manipulation focuses on grasping and on integrating symbolic tasks with data gained through sensors. The learning model is based on a Bayesian network that encodes the statistical dependencies between the data collected by the sensors and the symbolic task. This data representation has several advantages: it takes the uncertainty of the real world into account, allowing sensor noise to be handled; it encodes a notion of causality; and it provides a unified network for learning. Since the network currently implemented is based on human expert knowledge, it is very interesting to implement an automated method to learn its structure: as more tasks and object features are introduced in the future, a complex network design based only on human expert knowledge can become unreliable. Since structure learning algorithms present some weaknesses, the goal of this thesis is to analyze the real data used in the network modeled by the human expert, implement a feasible structure learning approach, and compare the results with the network designed by the expert in order to possibly enhance it.
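As a hedged sketch of score-based Bayesian network structure learning of the kind the thesis evaluates, here is greedy hill-climbing with a BIC score over hypothetical discrete sensor/task variables. The pgmpy library and the variable set are assumptions for illustration; the thesis does not specify its tooling or features.

```python
import numpy as np
import pandas as pd
from pgmpy.estimators import HillClimbSearch, BicScore

# Hypothetical discrete dataset: object feature, task, grasp outcome.
rng = np.random.default_rng(0)
size = rng.integers(0, 3, 500)                       # small/medium/large
task = rng.integers(0, 2, 500)                       # e.g. hand-over vs. pour
grip = (size + task + rng.integers(0, 2, 500)) % 3   # depends on both
data = pd.DataFrame({"size": size, "task": task, "grip": grip})

# Greedy hill-climbing over DAGs, scored by BIC.
dag = HillClimbSearch(data).estimate(scoring_method=BicScore(data))
print("learned edges:", list(dag.edges()))
```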

Relevance: 90.00%

Abstract:

The thesis work presented here investigates the application of learning techniques aimed at a more efficient execution of a portfolio of constraint solvers. A constraint solver is a program that, given a constraint problem as input, computes a solution using a variety of techniques. Constraint problems are pervasive in real life: examples such as organizing train journeys or scheduling the crews of an airline are all constraint problems. A constraint problem is formalized as a constraint satisfaction problem (CSP). A CSP is described by a set of variables, each taking values from a specific domain, and a set of constraints relating the variables and the values they can take. One technique for optimizing the resolution of such problems is suggested by the portfolio approach. This technique, also used in fields such as economics, combines several solvers, which together can produce better results than a single-solver approach. In this work we create a new technique that combines a portfolio of constraint solvers with machine learning techniques. Machine learning is a field of artificial intelligence whose goal is to endow machines with a kind of 'intelligence'. An applicative example is evaluating past cases of a problem and using them to make future choices, a process also found in human cognition. Specifically, we reason in terms of classification. A classification assigns a discrete output value to a set of input features, such as true or false when a mail is classified as spam or not. The learning phase is carried out using part of CPHydra, a constraint solver portfolio developed at University College Cork (UCC). Of this portfolio algorithm we use only the characteristics that describe certain aspects of one CSP with respect to another; these characteristics are also called features. We then create a series of classifiers based on the specific behavior of the solvers. The combination of these classifiers with the portfolio approach aims to assess whether the CPHydra features are good and whether the classifiers based on them are reliable. To justify the first result, we perform a comparison with one of the best state-of-the-art portfolios, SATzilla. Once the quality of the features used for classification is established, we solve the problems by simulating a scheduler. These simulations test different rules built with the previously introduced classifiers. We first act on a single-processor scenario and then expand to a multi-processor scenario. In these experiments we verify that the performance obtained by applying the rules built on the classifiers is better than an execution limited to using the best solver of the portfolio. The thesis work was carried out in collaboration with the 4C research centre at University College Cork. Based on this work, a scientific article was written and submitted to the International Joint Conference on Artificial Intelligence (IJCAI) 2011.
At the time of submitting the thesis, we had not yet been informed whether the article was accepted; however, the reviewers' responses indicated that the presented method is interesting.
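As a hedged sketch of the classification step described above (CSP features in, solver choice out), here is a toy per-solver classifier setup; the feature names and data are invented for illustration and are not CPHydra's actual features.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Hypothetical CSP features: n_vars, n_constraints, mean domain size.
X = rng.uniform(size=(200, 3))
solvers = ["solver_a", "solver_b", "solver_c"]
# Hypothetical ground truth: which solvers solved each CSP within time.
solved = rng.random((200, len(solvers))) > 0.5

# One classifier per solver predicts "will this solver succeed here?".
clfs = {s: DecisionTreeClassifier(random_state=0).fit(X, solved[:, i])
        for i, s in enumerate(solvers)}

def schedule(features):
    """Rank solvers by predicted success probability for one CSP."""
    probs = {s: c.predict_proba([features])[0, 1] for s, c in clfs.items()}
    return sorted(probs, key=probs.get, reverse=True)

print(schedule([0.5, 0.2, 0.9]))
```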