47 results for Selection methods
Abstract:
Final Master's project for the degree of Master in Civil Engineering
Abstract:
The investigation, which employed the action research method (qualitative analysis), was divided into four phases. In phases 1-3 the participants were six double bass students at Nossa Senhora do Cabo Music School. Pilot exercises in creativity were followed by broader and more ambitious projects. In phase 4 the techniques were tested and extended during a summer course for twelve double bass students at Santa Cecilia College.
Abstract:
Short-term electricity load forecasting is very important for the operation of power systems. In this work, a classical exponential smoothing model, Holt-Winters with double seasonality, was applied to the Portuguese demand time series to test for accurate predictions. Several metaheuristic algorithms were used for the optimal selection of the smoothing parameters of the Holt-Winters forecast function; after testing on the time series, the results showed little difference among methods, so simple local search algorithms are recommended, as they are easier to implement.
Abstract:
In research on Silent Speech Interfaces (SSI), different sources of information (modalities) have been combined, aiming at obtaining better performance than the individual modalities. However, when combining these modalities, the dimensionality of the feature space rapidly increases, yielding the well-known "curse of dimensionality". As a consequence, in order to extract useful information from this data, one has to resort to feature selection (FS) techniques to lower the dimensionality of the learning space. In this paper, we assess the impact of FS techniques for silent speech data, in a dataset with 4 non-invasive and promising modalities, namely: video, depth, ultrasonic Doppler sensing, and surface electromyography. We consider two supervised (mutual information and Fisher's ratio) and two unsupervised (mean-median and arithmetic-mean geometric-mean) FS filters. The evaluation was made by assessing the classification accuracy (word recognition error) of three well-known classifiers (k-nearest neighbors, support vector machines, and dynamic time warping). The key results of this study show that both unsupervised and supervised FS techniques improve on the classification accuracy on both individual and combined modalities. For instance, on the video component, we attain relative performance gains of 36.2% in error rates. FS is also useful as pre-processing for feature fusion. Copyright © 2014 ISCA.
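Fisher's ratio, one of the supervised filters mentioned, scores each feature by its between-class scatter relative to its within-class scatter, then keeps the top-ranked features. A minimal sketch (pure Python, with illustrative function names; not the paper's implementation):

```python
def fisher_ratio_scores(X, y):
    """Score each feature: between-class scatter / within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        overall = sum(col) / len(col)
        between = within = 0.0
        for c in classes:
            vals = [col[i] for i, yi in enumerate(y) if yi == c]
            mu = sum(vals) / len(vals)
            between += len(vals) * (mu - overall) ** 2
            within += sum((v - mu) ** 2 for v in vals)
        scores.append(between / within if within > 0 else float("inf"))
    return scores

def top_k_features(scores, k):
    """Indices of the k highest-scoring features."""
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]
```

A feature whose class-conditional means are far apart relative to its spread gets a high score; a feature distributed identically across classes scores near zero.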
Abstract:
Discrete data representations are necessary, or at least convenient, in many machine learning problems. While feature selection (FS) techniques aim at finding relevant subsets of features, the goal of feature discretization (FD) is to find concise (quantized) data representations, adequate for the learning task at hand. In this paper, we propose two incremental methods for FD. The first method belongs to the filter family, in which the quality of the discretization is assessed by a (supervised or unsupervised) relevance criterion. The second method is a wrapper, where discretized features are assessed using a classifier. Both methods can be coupled with any static (unsupervised or supervised) discretization procedure and can be used to perform FS as pre-processing or post-processing stages. The proposed methods attain efficient representations suitable for binary and multi-class problems with different types of data, being competitive with existing methods. Moreover, using well-known FS methods with the features discretized by our techniques leads to better accuracy than with the features discretized by other methods or with the original features. (C) 2013 Elsevier B.V. All rights reserved.
Abstract:
Master's programme in Occupational Safety and Hygiene
Abstract:
In cluster analysis, it can be useful to interpret the partition built from the data in the light of external categorical variables which are not directly involved in clustering the data. An approach is proposed in the model-based clustering context to select a number of clusters which both fits the data well and takes advantage of the potential illustrative ability of the external variables. This approach makes use of the integrated joint likelihood of the data and the partitions at hand, namely the model-based partition and the partitions associated with the external variables. It is noteworthy that each mixture model is fitted to the data by the maximum likelihood methodology, excluding the external variables, which are used only to select a relevant mixture model. Numerical experiments illustrate the promising behaviour of the derived criterion. © 2014 Springer-Verlag Berlin Heidelberg.
Abstract:
In the field of appearance-based robot localization, the mainstream approach uses a quantized representation of local image features. An alternative strategy is the exploitation of raw feature descriptors, thus avoiding approximations due to quantization. In this work, the quantized and non-quantized representations are compared with respect to their discriminativity, in the context of the robot global localization problem. Having demonstrated the advantages of the non-quantized representation, the paper proposes mechanisms to reduce the computational burden this approach would carry, when applied in its simplest form. This reduction is achieved through a hierarchical strategy which gradually discards candidate locations and by exploring two simplifying assumptions about the training data. The potential of the non-quantized representation is exploited by resorting to the entropy-discriminativity relation. The idea behind this approach is that the non-quantized representation facilitates the assessment of the distinctiveness of features, through the entropy measure. Building on this finding, the robustness of the localization system is enhanced by modulating the importance of features according to the entropy measure. Experimental results support the effectiveness of this approach, as well as the validity of the proposed computation reduction methods.
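The entropy-discriminativity relation the abstract builds on can be illustrated with a toy computation: a feature whose similarity profile over candidate locations is peaked (low entropy) is distinctive and gets a high weight, while a flat profile is ambiguous and gets a low weight. This is a sketch of the general idea with illustrative names, not the paper's actual weighting scheme:

```python
import math

def entropy(p):
    """Shannon entropy (bits) of a discrete probability distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def feature_weight(similarities):
    """Map a feature's similarity scores over candidate locations to [0, 1]:
    a peaked (distinctive) profile has low entropy -> weight near 1;
    a flat (ambiguous) profile approaches the uniform maximum -> weight near 0."""
    total = sum(similarities)
    p = [s / total for s in similarities]
    return 1.0 - entropy(p) / math.log2(len(similarities))
```

In a localization system, such weights would modulate how much each raw feature descriptor contributes when scoring candidate locations.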
Abstract:
Many learning problems require handling high dimensional datasets with a relatively small number of instances. Learning algorithms are thus confronted with the curse of dimensionality, and need to address it in order to be effective. Examples of these types of data include the bag-of-words representation in text classification problems and gene expression data for tumor detection/classification. Usually, among the high number of features characterizing the instances, many may be irrelevant (or even detrimental) for the learning tasks. It is thus clear that there is a need for adequate techniques for feature representation, reduction, and selection, to improve both the classification accuracy and the memory requirements. In this paper, we propose combined unsupervised feature discretization and feature selection techniques, suitable for medium and high-dimensional datasets. The experimental results on several standard datasets, with both sparse and dense features, show the efficiency of the proposed techniques as well as improvements over previous related techniques.
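On the discretization side, one simple unsupervised quantizer (a common baseline, not necessarily the technique proposed in the paper) is equal-frequency binning, which places cut points at empirical quantiles so each bin receives roughly the same number of samples:

```python
def equal_frequency_bins(values, n_bins):
    """Unsupervised discretization: cut points at empirical quantiles,
    so each bin holds roughly the same number of samples."""
    s = sorted(values)
    cuts = [s[(len(s) * i) // n_bins] for i in range(1, n_bins)]
    # bin index of a value = number of cut points it reaches or exceeds
    return [sum(v >= c for c in cuts) for v in values]
```

Feature selection would then operate on these compact integer codes instead of the raw continuous values, reducing both memory and sensitivity to minor fluctuations.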
Abstract:
Final Master's project for the degree of Master in Maintenance Engineering
Abstract:
Materials selection is a matter of great importance to engineering design, and software tools are valuable to inform decisions in the early stages of product development. However, when a set of alternative materials is available for the different parts a product is made of, the question of which optimal material mix to choose for a group of parts is not trivial, so the engineer/designer typically proceeds part by part. Optimizing each part per se can lead to a globally sub-optimal solution from the product point of view. An optimization procedure is therefore needed that can deal with products with multiple parts, each with discrete design variables, and determine the optimal solution under different objectives. To solve this multiobjective optimization problem, a new routine based on the Direct MultiSearch (DMS) algorithm is created. Results from the Pareto front can help the designer align his or her materials selection for a complete set of materials with product attribute objectives, depending on the relative importance of each objective.
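Independently of the DMS routine itself, the core multiobjective idea can be illustrated by enumerating discrete material mixes and keeping only the Pareto-optimal ones (no mix in the front is beaten on all objectives at once). The material table and part volumes below are made up purely for illustration:

```python
from itertools import product

def dominates(f, g):
    """f dominates g if f is no worse in every objective and strictly
    better in at least one (minimization)."""
    return all(a <= b for a, b in zip(f, g)) and any(a < b for a, b in zip(f, g))

# Hypothetical material table: name -> (density, cost per unit mass)
MATERIALS = {"steel": (7.8, 1.0), "aluminium": (2.7, 3.0), "cfrp": (1.6, 30.0)}
PART_VOLUMES = [2.0, 1.0]  # one entry per part of the product

def objectives(mix):
    """Total mass and total cost of a mix (one material per part)."""
    mass = sum(MATERIALS[m][0] * v for m, v in zip(mix, PART_VOLUMES))
    cost = sum(MATERIALS[m][0] * v * MATERIALS[m][1] for m, v in zip(mix, PART_VOLUMES))
    return (mass, cost)

def pareto_front(candidates):
    """Keep the candidates not dominated by any other candidate."""
    evaluated = [(c, objectives(c)) for c in candidates]
    return [c for c, f in evaluated
            if not any(dominates(g, f) for _, g in evaluated)]

mixes = list(product(MATERIALS, repeat=len(PART_VOLUMES)))
front = pareto_front(mixes)
```

Exhaustive enumeration only works for small products; DMS-style direct search is what makes larger discrete design spaces tractable, but the dominance test above is the same.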
Abstract:
Introduction – The information searching that higher education students carry out in electronic resources does not necessarily reflect mastery of the skills of searching, analysing, evaluating, selecting, and making good use of the retrieved information. The concept of information literacy gains relevance and prominence in that it encompasses the competences needed to recognise when information is required and to act efficiently and effectively in obtaining and using it. Objective – The goal of the Escola Superior de Tecnologia da Saúde de Lisboa (ESTeSL) was to provide training in information literacy skills, outside ESTeSL, to students, teachers, and researchers. Methods – The training was integrated into national and international projects, depending on the target audiences, topics, contents, workload, and the requests of the partner institution. The Calouste Gulbenkian Foundation was the main financial sponsor. Results – Several interventions took place in Portugal and abroad. In 2010, in Angola, at the Instituto Médio de Saúde do Bengo, 10 librarians were trained in building and managing a health library, with an introduction to information literacy (35h). In 2014, as part of the ERASMUS Intensive Programme, OPTIMAX (Radiation Dose and Image Quality Optimisation in Medical Imaging) trained 40 radiography teachers and students (from Portugal, the United Kingdom, Norway, the Netherlands, and Switzerland) in methodology and information searching in MEDLINE and Web of Science, and in Mendeley as a reference manager (4h). The final works of this course were published as an ebook (http://usir.salford.ac.uk/34439/1/Final%20complete%20version.pdf), whose editorial review was the responsibility of the librarians.
Throughout 2014, at the Escola Superior de Educação, Escola Superior de Dança, Instituto Politécnico de Setúbal, and Faculdade de Medicina de Lisboa, and throughout 2015, at the Universidade Aberta, Escola Superior de Comunicação Social, Instituto Egas Moniz, Faculdade de Letras de Lisboa, and Centro de Linguística da Universidade de Lisboa, contents were designed on the use of ZOTERO and Mendeley for managing bibliographic references and on a new way of doing research. Each of these sessions (2.5h) involved about 25 final-year undergraduates, master's students, and teachers. In 2015, in Mozambique, at the Instituto Superior de Ciências da Saúde, 5 librarians and 46 students and teachers were trained (70h). The contents taught were: 1) management and organisation of a health library (for librarians); 2) information literacy: information searching in MEDLINE, SciELO, and RCAAP, reference managers, and how to avoid plagiarism (for librarians and final-year radiography students). The hours allocated to the students included tutoring of their undergraduate monographs, in collaboration with two other teachers from the project. Training at other national higher education institutions is scheduled for 2016. Similar training is also envisaged for Timor-Leste, whose contents, dates, and workload are yet to be scheduled. Conclusions – These initiatives benefit the institution (through visibility), the librarians (by demonstrating their skills), and the students, teachers, and researchers (through new competences and acquired autonomy). The ESTeSL information literacy project has contributed effectively to the construction and production of knowledge in the academic environment, nationally and internationally, with the library as the privileged partner in this culture of collaboration.
Abstract:
In machine learning and pattern recognition tasks, the use of feature discretization techniques may have several advantages. The discretized features may hold enough information for the learning task at hand, while ignoring minor fluctuations that are irrelevant or harmful for that task. The discretized features have more compact representations that may yield both better accuracy and lower training time, as compared to the use of the original features. However, in many cases, mainly with medium and high-dimensional data, the large number of features usually implies that there is some redundancy among them. Thus, we may further apply feature selection (FS) techniques on the discrete data, keeping the most relevant features, while discarding the irrelevant and redundant ones. In this paper, we propose relevance and redundancy criteria for supervised feature selection techniques on discrete data. These criteria are applied to the bin-class histograms of the discrete features. The experimental results, on public benchmark data, show that the proposed criteria can achieve better accuracy than widely used relevance and redundancy criteria, such as mutual information and the Fisher ratio.
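Mutual information, one of the baseline relevance criteria the proposed criteria are compared against, can be estimated directly from the bin-class contingency counts of a discretized feature. A minimal sketch with illustrative names (plug-in estimate, in bits):

```python
import math
from collections import Counter

def mutual_information(feature_bins, labels):
    """Plug-in estimate of MI (bits) between a discretized feature and the
    class labels, computed from bin-class contingency counts."""
    n = len(labels)
    joint = Counter(zip(feature_bins, labels))   # (bin, class) counts
    pb = Counter(feature_bins)                   # bin counts
    pc = Counter(labels)                         # class counts
    mi = 0.0
    for (b, c), nbc in joint.items():
        # p(b,c) * log2( p(b,c) / (p(b) p(c)) ), with counts substituted
        mi += (nbc / n) * math.log2(nbc * n / (pb[b] * pc[c]))
    return mi
```

A feature whose bins align perfectly with the classes attains MI equal to the class entropy; a feature independent of the classes scores zero. Redundancy criteria apply the same quantity between pairs of features.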