Biblioteca Digital

819 resultados para Machine learning.

Brownian bridge and other path-dependent Gaussian processes vectorial simulation

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The iterative simulation of the Brownian bridge is well known. In this article, we present a vectorial simulation alternative based on Gaussian processes for machine learning regression that is suitable for interpreted programming languages implementations. We extend the vectorial simulation of path-dependent trajectories to other Gaussian processes, namely, sequences of Brownian bridges, geometric Brownian motion, fractional Brownian motion, and Ornstein-Ulenbeck mean reversion process.

Probabilistic consensus clustering using evidence accumulation

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Clustering ensemble methods produce a consensus partition of a set of data points by combining the results of a collection of base clustering algorithms. In the evidence accumulation clustering (EAC) paradigm, the clustering ensemble is transformed into a pairwise co-association matrix, thus avoiding the label correspondence problem, which is intrinsic to other clustering ensemble schemes. In this paper, we propose a consensus clustering approach based on the EAC paradigm, which is not limited to crisp partitions and fully exploits the nature of the co-association matrix. Our solution determines probabilistic assignments of data points to clusters by minimizing a Bregman divergence between the observed co-association frequencies and the corresponding co-occurrence probabilities expressed as functions of the unknown assignments. We additionally propose an optimization algorithm to find a solution under any double-convex Bregman divergence. Experiments on both synthetic and real benchmark data show the effectiveness of the proposed approach.

Decision Support Concerning Demand Response Programs Design and Use – A Conceptual Framework and Simulation Tool

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Demand response can play a very relevant role in the context of power systems with an intensive use of distributed energy resources, from which renewable intermittent sources are a significant part. More active consumers participation can help improving the system reliability and decrease or defer the required investments. Demand response adequate use and management is even more important in competitive electricity markets. However, experience shows difficulties to make demand response be adequately used in this context, showing the need of research work in this area. The most important difficulties seem to be caused by inadequate business models and by inadequate demand response programs management. This paper contributes to developing methodologies and a computational infrastructure able to provide the involved players with adequate decision support on demand response programs and contracts design and use. The presented work uses DemSi, a demand response simulator that has been developed by the authors to simulate demand response actions and programs, which includes realistic power system simulation. It includes an optimization module for the application of demand response programs and contracts using deterministic and metaheuristic approaches. The proposed methodology is an important improvement in the simulator while providing adequate tools for demand response programs adoption by the involved players. A machine learning method based on clustering and classification techniques, resulting in a rule base concerning DR programs and contracts use, is also used. A case study concerning the use of demand response in an incident situation is presented.

Feature Discretization with Relevance and Mutual Information Criteria

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Feature discretization (FD) techniques often yield adequate and compact representations of the data, suitable for machine learning and pattern recognition problems. These representations usually decrease the training time, yielding higher classification accuracy while allowing for humans to better understand and visualize the data, as compared to the use of the original features. This paper proposes two new FD techniques. The first one is based on the well-known Linde-Buzo-Gray quantization algorithm, coupled with a relevance criterion, being able perform unsupervised, supervised, or semi-supervised discretization. The second technique works in supervised mode, being based on the maximization of the mutual information between each discrete feature and the class label. Our experimental results on standard benchmark datasets show that these techniques scale up to high-dimensional data, attaining in many cases better accuracy than existing unsupervised and supervised FD approaches, while using fewer discretization intervals.

Exploiting the bin-class histograms for feature selection on discrete data

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In machine learning and pattern recognition tasks, the use of feature discretization techniques may have several advantages. The discretized features may hold enough information for the learning task at hand, while ignoring minor fluctuations that are irrelevant or harmful for that task. The discretized features have more compact representations that may yield both better accuracy and lower training time, as compared to the use of the original features. However, in many cases, mainly with medium and high-dimensional data, the large number of features usually implies that there is some redundancy among them. Thus, we may further apply feature selection (FS) techniques on the discrete data, keeping the most relevant features, while discarding the irrelevant and redundant ones. In this paper, we propose relevance and redundancy criteria for supervised feature selection techniques on discrete data. These criteria are applied to the bin-class histograms of the discrete features. The experimental results, on public benchmark data, show that the proposed criteria can achieve better accuracy than widely used relevance and redundancy criteria, such as mutual information and the Fisher ratio.

Text classification using compression-based dissimilarity measures

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Arguably, the most difficult task in text classification is to choose an appropriate set of features that allows machine learning algorithms to provide accurate classification. Most state-of-the-art techniques for this task involve careful feature engineering and a pre-processing stage, which may be too expensive in the emerging context of massive collections of electronic texts. In this paper, we propose efficient methods for text classification based on information-theoretic dissimilarity measures, which are used to define dissimilarity-based representations. These methods dispense with any feature design or engineering, by mapping texts into a feature space using universal dissimilarity measures; in this space, classical classifiers (e.g. nearest neighbor or support vector machines) can then be used. The reported experimental evaluation of the proposed methods, on sentiment polarity analysis and authorship attribution problems, reveals that it approximates, sometimes even outperforms previous state-of-the-art techniques, despite being much simpler, in the sense that they do not require any text pre-processing or feature engineering.

Híper-heurísticas com aprendizagem

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A otimização nos sistemas de suporte à decisão atuais assume um carácter fortemente interdisciplinar relacionando-se com a necessidade de integração de diferentes técnicas e paradigmas na resolução de problemas reais complexos, sendo que a computação de soluções ótimas em muitos destes problemas é intratável. Os métodos de pesquisa heurística são conhecidos por permitir obter bons resultados num intervalo temporal aceitável. Muitas vezes, necessitam que a parametrização seja ajustada de forma a permitir obter bons resultados. Neste sentido, as estratégias de aprendizagem podem incrementar o desempenho de um sistema, dotando-o com a capacidade de aprendizagem, por exemplo, qual a técnica de otimização mais adequada para a resolução de uma classe particular de problemas, ou qual a parametrização mais adequada de um dado algoritmo num determinado cenário. Alguns dos métodos de otimização mais usados para a resolução de problemas do mundo real resultaram da adaptação de ideias de várias áreas de investigação, principalmente com inspiração na natureza - Meta-heurísticas. O processo de seleção de uma Meta-heurística para a resolução de um dado problema é em si um problema de otimização. As Híper-heurísticas surgem neste contexto como metodologias eficientes para selecionar ou gerar heurísticas (ou Meta-heurísticas) na resolução de problemas de otimização NP-difícil. Nesta dissertação pretende-se dar uma contribuição para o problema de seleção de Metaheurísticas respetiva parametrização. Neste sentido é descrita a especificação de uma Híperheurística para a seleção de técnicas baseadas na natureza, na resolução do problema de escalonamento de tarefas em sistemas de fabrico, com base em experiência anterior. O módulo de Híper-heurística desenvolvido utiliza um algoritmo de aprendizagem por reforço (QLearning), que permite dotar o sistema da capacidade de seleção automática da Metaheurística a usar no processo de otimização, assim como a respetiva parametrização. Finalmente, procede-se à realização de testes computacionais para avaliar a influência da Híper- Heurística no desempenho do sistema de escalonamento AutoDynAgents. Como conclusão genérica, é possível afirmar que, dos resultados obtidos é possível concluir existir vantagem significativa no desempenho do sistema quando introduzida a Híper-heurística baseada em QLearning.

Scenarios Generation for Multi-Agent simulation of Electricity Markets based on Intelligent Data Analysis

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This document presents a tool able to automatically gather data provided by real energy markets and to generate scenarios, capture and improve market players’ profiles and strategies by using knowledge discovery processes in databases supported by artificial intelligence techniques, data mining algorithms and machine learning methods. It provides the means for generating scenarios with different dimensions and characteristics, ensuring the representation of real and adapted markets, and their participating entities. The scenarios generator module enhances the MASCEM (Multi-Agent Simulator of Competitive Electricity Markets) simulator, endowing a more effective tool for decision support. The achievements from the implementation of the proposed module enables researchers and electricity markets’ participating entities to analyze data, create real scenarios and make experiments with them. On the other hand, applying knowledge discovery techniques to real data also allows the improvement of MASCEM agents’ profiles and strategies resulting in a better representation of real market players’ behavior. This work aims to improve the comprehension of electricity markets and the interactions among the involved entities through adequate multi-agent simulation.

Data Extraction Tool to Analyse, Transform and Store Real Data from Electricity Markets

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The study of electricity markets operation has been gaining an increasing importance in the last years, as result of the new challenges that the restructuring process produced. Currently, lots of information concerning electricity markets is available, as market operators provide, after a period of confidentiality, data regarding market proposals and transactions. These data can be used as source of knowledge to define realistic scenarios, which are essential for understanding and forecast electricity markets behavior. The development of tools able to extract, transform, store and dynamically update data, is of great importance to go a step further into the comprehension of electricity markets and of the behaviour of the involved entities. In this paper an adaptable tool capable of downloading, parsing and storing data from market operators’ websites is presented, assuring constant updating and reliability of the stored data.

An automatic tool to Extract, Transform and Load data from real electricity markets

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The study of Electricity Markets operation has been gaining an increasing importance in the last years, as result of the new challenges that the restructuring produced. Currently, lots of information concerning Electricity Markets is available, as market operators provide, after a period of confidentiality, data regarding market proposals and transactions. These data can be used as source of knowledge, to define realistic scenarios, essential for understanding and forecast Electricity Markets behaviour. The development of tools able to extract, transform, store and dynamically update data, is of great importance to go a step further into the comprehension of Electricity Markets and the behaviour of the involved entities. In this paper we present an adaptable tool capable of downloading, parsing and storing data from market operators’ websites, assuring actualization and reliability of stored data.

Solar Intensity Forecasting using Artificial Neural Networks and Support Vector Machines

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents several forecasting methodologies based on the application of Artificial Neural Networks (ANN) and Support Vector Machines (SVM), directed to the prediction of the solar radiance intensity. The methodologies differ from each other by using different information in the training of the methods, i.e, different environmental complementary fields such as the wind speed, temperature, and humidity. Additionally, different ways of considering the data series information have been considered. Sensitivity testing has been performed on all methodologies in order to achieve the best parameterizations for the proposed approaches. Results show that the SVM approach using the exponential Radial Basis Function (eRBF) is capable of achieving the best forecasting results, and in half execution time of the ANN based approaches.

Data Mining Approach to support the Generation of Realistic Scenarios for Multi-Agent simulation of Electricity Markets

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents the Realistic Scenarios Generator (RealScen), a tool that processes data from real electricity markets to generate realistic scenarios that enable the modeling of electricity market players’ characteristics and strategic behavior. The proposed tool provides significant advantages to the decision making process in an electricity market environment, especially when coupled with a multi-agent electricity markets simulator. The generation of realistic scenarios is performed using mechanisms for intelligent data analysis, which are based on artificial intelligence and data mining algorithms. These techniques allow the study of realistic scenarios, adapted to the existing markets, and improve the representation of market entities as software agents, enabling a detailed modeling of their profiles and strategies. This work contributes significantly to the understanding of the interactions between the entities acting in electricity markets by increasing the capability and realism of market simulations.

Exploring distributed computing tools through data mining tasks

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Harnessing idle PCs CPU cycles, storage space and other resources of networked computers to collaborative are mainly fixated on for all major grid computing research projects. Most of the university computers labs are occupied with the high puissant desktop PC nowadays. It is plausible to notice that most of the time machines are lying idle or wasting their computing power without utilizing in felicitous ways. However, for intricate quandaries and for analyzing astronomically immense amounts of data, sizably voluminous computational resources are required. For such quandaries, one may run the analysis algorithms in very puissant and expensive computers, which reduces the number of users that can afford such data analysis tasks. Instead of utilizing single expensive machines, distributed computing systems, offers the possibility of utilizing a set of much less expensive machines to do the same task. BOINC and Condor projects have been prosperously utilized for solving authentic scientific research works around the world at a low cost. In this work the main goal is to explore both distributed computing to implement, Condor and BOINC, and utilize their potency to harness the ideal PCs resources for the academic researchers to utilize in their research work. In this thesis, Data mining tasks have been performed in implementation of several machine learning algorithms on the distributed computing environment.

Automated image tagging through tag propagation

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Trabalho apresentado no âmbito do Mestrado em Engenharia Informática, como requisito parcial Para obtenção do grau de Mestre em Engenharia Informática

Automatic musical instrument recognition for multimedia indexing

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Trabalho apresentado no âmbito do Mestrado em Engenharia Informática, como requisito parcial para obtenção do grau de Mestre em Engenharia Informática

«
1
2
...
19
20
21
22
23
24
25
...
54
55
»