38 results for Clustering search algorithm
Abstract:
Given a set of mixed spectral (multispectral or hyperspectral) vectors, linear spectral mixture analysis, or linear unmixing, aims at estimating the number of reference substances, also called endmembers, their spectral signatures, and their abundance fractions. This paper presents a new method for unsupervised endmember extraction from hyperspectral data, termed vertex component analysis (VCA). The algorithm exploits two facts: (1) the endmembers are the vertices of a simplex and (2) the affine transformation of a simplex is also a simplex. In a series of experiments using simulated and real data, the VCA algorithm competes with state-of-the-art methods, with a computational complexity between one and two orders of magnitude lower than the best available method.
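As a loose illustration of the geometric idea (not the authors' reference implementation), the sketch below iteratively projects the data onto a direction orthogonal to the subspace spanned by the endmembers found so far and takes the pixel with the most extreme projection as the next simplex vertex. All names are illustrative, and the data are assumed to have already been reduced to a signal subspace.

```python
import numpy as np

def extract_vertices(X, p, seed=0):
    """X: (bands, pixels) mixed spectra; p: number of endmembers to extract."""
    rng = np.random.default_rng(seed)
    bands = X.shape[0]
    E = np.zeros((bands, p))  # columns will hold the endmember signatures
    for i in range(p):
        w = rng.standard_normal(bands)  # random search direction
        if i > 0:
            # Project w onto the orthogonal complement of the found vertices.
            Q, _ = np.linalg.qr(E[:, :i])
            w -= Q @ (Q.T @ w)
        f = w / np.linalg.norm(w)
        idx = np.argmax(np.abs(f @ X))  # extreme projection = simplex vertex
        E[:, i] = X[:, idx]
    return E
```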
Abstract:
The calculation of the dose is one of the key steps in radiotherapy planning [1-5]. This calculation should be as accurate as possible, and over the years it has become feasible through the implementation of new dose-calculation algorithms in the treatment planning systems used in radiotherapy. When a breast tumour is irradiated, a precise dose distribution is fundamental to ensure coverage of the planning target volume (PTV) and to prevent skin complications. Some investigations using breast cases showed that the pencil beam convolution (PBC) algorithm overestimates the dose in the PTV and in the proximal region of the ipsilateral lung, but underestimates the dose in the distal region of the ipsilateral lung, when compared with the analytical anisotropic algorithm (AAA). With this study we aim to compare the performance of the PBC and AAA algorithms in breast tumours.
Abstract:
Conference - 16th International Symposium on Wireless Personal Multimedia Communications (WPMC), Jun 24-27, 2013
Abstract:
Clustering analysis is a useful tool to detect and monitor disease patterns and, consequently, to contribute to effective population disease management. Portugal has the highest incidence of tuberculosis in the European Union (21.6 cases per 100,000 inhabitants in 2012), although it has been decreasing consistently. Two critical PTB (Pulmonary Tuberculosis) areas, the metropolitan Oporto and metropolitan Lisbon regions, were previously identified through spatial and space-time clustering of PTB incidence rates and risk factors. Identifying clusters of temporal trends can further inform policy makers about municipalities showing faster or slower TB control improvement.
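Purely as an illustration of clustering temporal trends (this does not reproduce the study's actual methodology), one simple approach fits a linear trend to each municipality's yearly incidence series and clusters the resulting slopes; the data and the choice of three clusters below are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

years = np.arange(2002, 2013)
rng = np.random.default_rng(1)
# Synthetic stand-in: yearly PTB incidence series for 50 municipalities.
incidence = 30 - 0.8 * (years - 2002) + rng.normal(0, 2, size=(50, years.size))

slopes = np.polyfit(years, incidence.T, deg=1)[0]  # per-municipality trend
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    slopes.reshape(-1, 1))                         # faster vs. slower improvement
```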
Abstract:
Cluster analysis for categorical data has been an active area of research. A well-known problem in this area is the determination of the number of clusters, which is unknown and must be inferred from the data. In order to estimate the number of clusters, one often resorts to information criteria, such as BIC (Bayesian information criterion), MML (minimum message length, proposed by Wallace and Boulton, 1968), and ICL (integrated classification likelihood). In this work, we adopt the approach developed by Figueiredo and Jain (2002) for clustering continuous data. They use an MML criterion to select the number of clusters and a variant of the EM algorithm to estimate the model parameters. This EM variant seamlessly integrates model estimation and selection in a single algorithm. For clustering categorical data, we assume a finite mixture of multinomial distributions and implement a new EM algorithm, following a previous version (Silvestre et al., 2008). Results obtained with synthetic datasets are encouraging. The main advantage of the proposed approach, when compared with the criteria referred to above, is the speed of execution, which is especially relevant when dealing with large datasets.
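A loose sketch of the integrated estimation-and-selection idea, in the spirit of Figueiredo and Jain (2002), adapted to a mixture of multinomials over one-hot categorical data; the simultaneous weight update and the starting value k_max are simplifications of this sketch, not the authors' exact algorithm.

```python
import numpy as np

def em_multinomial(X, k_max=10, n_iter=200, eps=1e-9):
    """X: (n, d) one-hot rows over d categories; returns (weights, thetas)."""
    n, d = X.shape
    rng = np.random.default_rng(0)
    theta = rng.dirichlet(np.ones(d), size=k_max)  # per-component category probs
    w = np.full(k_max, 1.0 / k_max)                # mixing weights
    half_params = (d - 1) / 2.0                    # free parameters per component
    for _ in range(n_iter):
        logp = X @ np.log(theta.T + eps) + np.log(w + eps)  # (n, k)
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)          # responsibilities (E-step)
        nk = r.sum(axis=0)
        w = np.maximum(nk - half_params, 0.0)      # MML-flavoured annihilation
        if w.sum() == 0:                           # everything died: stop early
            break
        w /= w.sum()
        keep = w > 0                               # drop annihilated components
        w, r, theta = w[keep], r[:, keep], theta[keep]
        theta = (r.T @ X + eps) / (r.sum(axis=0)[:, None] + d * eps)  # M-step
    return w, theta
```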
Abstract:
Objective of the study: to compare the performance of the Pencil Beam Convolution (PBC) algorithm and the Analytical Anisotropic Algorithm (AAA) in 3D conformal radiotherapy treatment planning for breast tumours.
Abstract:
This paper focuses on the analysis of a demand response model in a smart grid context under a contingency scenario. A fuzzy clustering technique is applied to the developed demand response model, and an analysis is performed for the contingency scenario. Model considerations and architecture are described. The developed demand response model aims to support consumers' decisions regarding their consumption needs and possible economic benefits.
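The abstract does not specify which fuzzy clustering variant is used; as a point of reference, a minimal fuzzy c-means sketch over (synthetic or real) consumption profiles could look as follows.

```python
import numpy as np

def fuzzy_cmeans(X, c=3, m=2.0, n_iter=100, seed=0):
    """X: (n, d) consumption profiles; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))     # fuzzy memberships (n, c)
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]               # (c, d)
        d2 = ((X[:, None, :] - centers[None]) ** 2).sum(-1) + 1e-12  # (n, c)
        inv = d2 ** (-1.0 / (m - 1))
        U = inv / inv.sum(axis=1, keepdims=True)   # standard FCM update
    return centers, U
```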
Abstract:
In visual sensor networks, local feature descriptors can be computed at the sensing nodes, which work collaboratively on the acquired data to perform efficient visual analysis. In fact, with a minimal amount of computational effort, the detection and extraction of local features, such as binary descriptors, can provide a reliable and compact image representation. In this paper, we propose to extract and code binary descriptors to meet the energy and bandwidth constraints at each sensing node. The major contribution is a binary descriptor coding technique that exploits correlation through two different coding modes: Intra, which exploits the correlation between the elements that compose a descriptor; and Inter, which exploits the correlation between descriptors of the same image. The experimental results show bitrate savings of up to 35% without any impact on the performance of the image retrieval task. © 2014 EURASIP.
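A crude sketch of the Intra/Inter mode-decision idea (the cost model below is a stand-in for a real entropy coder, not the paper's technique): each descriptor is coded either on its own or as an XOR residual against the previously coded descriptor of the same image, whichever proxy costs fewer bits.

```python
import numpy as np

def choose_modes(descriptors):
    """descriptors: (n, bits) array of 0/1 binary descriptors of one image."""
    modes, prev = [], None
    for d in descriptors:
        intra_cost = d.size                        # raw bits, Intra proxy
        if prev is not None:
            residual = np.bitwise_xor(d, prev)     # Inter: XOR against previous
            inter_cost = 2 * int(residual.sum())   # sparse residuals cost less
            modes.append("Inter" if inter_cost < intra_cost else "Intra")
        else:
            modes.append("Intra")                  # first descriptor: Intra only
        prev = d
    return modes
```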
Abstract:
Biosignal analysis has become widespread, moving beyond its typical use in clinical settings. Electrocardiography (ECG) plays a central role in patient monitoring, as a diagnostic tool in today's medicine, and as an emerging biometric trait. In this paper we adopt a consensus clustering approach for the unsupervised analysis of ECG-based biometric records. This type of analysis highlights natural groups within the population under investigation, which can be correlated with ground truth information in order to gain more insight into the data. Preliminary results are promising, as meaningful clusters are extracted from the population under analysis. © 2014 EURASIP.
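Evidence accumulation is one common instantiation of consensus clustering; a minimal sketch (assuming the ECG processing step has already produced a feature matrix X) accumulates a co-association matrix over several k-means runs and cuts a hierarchical clustering of it.

```python
import numpy as np
from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def consensus_cluster(X, n_runs=30, k_final=4, seed=0):
    """X: (n, d) feature matrix; returns consensus labels in 1..k_final."""
    n = len(X)
    co = np.zeros((n, n))                          # co-association matrix
    rng = np.random.default_rng(seed)
    for _ in range(n_runs):
        k = int(rng.integers(5, 15))               # vary k across runs
        labels = KMeans(n_clusters=k, n_init=5,
                        random_state=int(rng.integers(1 << 30))).fit_predict(X)
        co += labels[:, None] == labels[None, :]
    dist = 1.0 - co / n_runs                       # co-association -> distance
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=k_final, criterion="maxclust")
```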
Abstract:
In cluster analysis, it can be useful to interpret the partition built from the data in the light of external categorical variables which are not directly involved in clustering the data. An approach is proposed in the model-based clustering context to select a number of clusters which both fits the data well and takes advantage of the potential illustrative ability of the external variables. This approach makes use of the integrated joint likelihood of the data and the partitions at hand, namely the model-based partition and the partitions associated with the external variables. It is noteworthy that each mixture model is fitted by maximum likelihood to the data alone; the external variables are used only to select a relevant mixture model. Numerical experiments illustrate the promising behaviour of the derived criterion. © 2014 Springer-Verlag Berlin Heidelberg.
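Not the authors' criterion, but as a rough illustration of letting external variables inform the choice of K: score each candidate K by the mixture model's fit to the data alone plus a term measuring how well the induced partition predicts the external categorical variable. The additive combination and all names below are assumptions of this sketch.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def score_K(X, y, K):
    """X: (n, d) data; y: (n,) external categorical labels as ints >= 0."""
    gm = GaussianMixture(n_components=K, random_state=0).fit(X)
    z = gm.predict(X)
    fit_term = -gm.bic(X)                 # data-fit term (higher is better)
    link_term = 0.0
    for k in range(K):                    # how well clusters predict y
        counts = np.bincount(y[z == k], minlength=y.max() + 1) + 0.5
        link_term += (counts * np.log(counts / counts.sum())).sum()
    return fit_term + link_term

# Hypothetical usage: best_K = max(range(1, 10), key=lambda K: score_K(X, y, K))
```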
Abstract:
In the present paper we focus on the performance of clustering algorithms, using indices of paired agreement to measure the agreement between clusters and an a priori known structure. We specifically propose a method to correct all the indices considered for agreement by chance; the adjusted indices are meant to provide a realistic measure of clustering performance. The proposed method enables the correction of virtually any index, overcoming limitations previously known in the literature, and provides very precise results. We use simulated datasets under diverse scenarios and discuss the pertinence of our proposal, which is particularly relevant when poorly separated clusters are considered. Finally, we compare the performance of the EM and K-means algorithms within each of the simulated scenarios and conclude that EM generally yields the best results.
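A sketch of the generic correction-for-chance recipe (the paper's exact estimator may differ): estimate the expected value of any paired-agreement index under random permutations of one partition, then rescale so that chance agreement maps to 0 and perfect agreement to 1. The plain Rand index is used here as the example index.

```python
import numpy as np

def rand_index(a, b):
    """Plain (unadjusted) Rand index between two label vectors."""
    n = len(a)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    agree = (same_a == same_b).sum() - n      # drop the always-agreeing diagonal
    return agree / (n * (n - 1))

def adjusted_for_chance(index_fn, a, b, n_perm=200, seed=0):
    """Rescale so chance agreement -> 0, assuming the index's maximum is 1."""
    rng = np.random.default_rng(seed)
    observed = index_fn(a, b)
    expected = np.mean([index_fn(a, rng.permutation(b)) for _ in range(n_perm)])
    return (observed - expected) / (1.0 - expected)
```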
Abstract:
Feature selection is a central problem in machine learning and pattern recognition. On large datasets (in terms of dimension and/or number of instances), using search-based or wrapper techniques can be computationally prohibitive. Moreover, many filter methods based on relevance/redundancy assessment also take a prohibitively long time on high-dimensional datasets. In this paper, we propose efficient unsupervised and supervised feature selection/ranking filters for high-dimensional datasets. These methods use low-complexity relevance and redundancy criteria, applicable to supervised, semi-supervised, and unsupervised learning, and can act as pre-processors that let computationally intensive methods focus their attention on smaller subsets of promising features. The experimental results, with up to 10^5 features, show the time efficiency of our methods, which achieve lower generalization error than state-of-the-art techniques while being dramatically simpler and faster.
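The paper's exact criteria are not reproduced here; as a flavour of what a low-complexity relevance/redundancy filter looks like, the sketch below ranks features by a cheap dispersion measure and greedily drops candidates that are too correlated with already-selected features.

```python
import numpy as np

def rank_and_select(X, n_select=50, max_corr=0.9):
    """X: (samples, features); returns indices of selected features."""
    # Relevance: mean absolute deviation from the median (one pass per feature).
    relevance = np.abs(X - np.median(X, axis=0)).mean(axis=0)
    order = np.argsort(-relevance)
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)  # standardize once
    selected = []
    for j in order:
        # Redundancy: skip features too correlated with any already selected.
        if all(abs(np.mean(Xs[:, j] * Xs[:, s])) < max_corr for s in selected):
            selected.append(j)
        if len(selected) == n_select:
            break
    return np.array(selected)
```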
Abstract:
Project work carried out to obtain the degree of Master in Informatics and Computer Engineering
Abstract:
Recent integrated circuit technologies have opened the possibility of designing parallel architectures with hundreds of cores on a single chip. The design space of these parallel architectures is huge, with many architectural options. Exploring the design space gets even more difficult if, beyond performance and area, we also consider extra metrics like performance and area efficiency, where the designer tries to achieve the best performance per chip area and the best sustainable performance. In this paper we present an algorithm-oriented approach to designing a many-core architecture. Instead of exploring the design space of the many-core architecture based on the experimental execution results of a particular benchmark of algorithms, our approach is to perform a formal analysis of the algorithms, considering the main architectural aspects, and to determine how each particular architectural aspect is related to the performance of the architecture when running an algorithm or set of algorithms. The architectural aspects considered include the number of cores, the local memory available in each core, the communication bandwidth between the many-core architecture and the external memory, and the memory hierarchy. To exemplify the approach, we carried out a theoretical analysis of a dense matrix multiplication algorithm and derived an equation that relates the number of execution cycles to the architectural parameters. Based on this equation, a many-core architecture has been designed. The results obtained indicate that a 100 mm² integrated circuit design of the proposed architecture, using a 65 nm technology, is able to achieve 464 GFLOPs (double-precision floating point) for a memory bandwidth of 16 GB/s. This corresponds to a performance efficiency of 71%. Considering a 45 nm technology, a 100 mm² chip attains 833 GFLOPs, which corresponds to 84% of peak performance. These figures are better than those obtained by previous many-core architectures, except for the area efficiency, which is limited by the lower memory bandwidth considered. The results achieved are also better than those of previous state-of-the-art many-core architectures designed specifically to achieve high performance for matrix multiplication.
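The paper's actual cycle equation is more detailed; a back-of-the-envelope, roofline-style sketch of the same kind of reasoning bounds sustained blocked matrix-multiplication performance by compute throughput and by external memory bandwidth, with the per-core local memory setting the block size and hence the data reuse. All parameter values in the usage comment are hypothetical, not the paper's configuration.

```python
def sustained_gflops(n_cores, flops_per_cycle, freq_ghz,
                     bandwidth_gbs, local_mem_bytes, word_bytes=8):
    """Roofline-style bound for blocked dense N x N matrix multiplication."""
    peak = n_cores * flops_per_cycle * freq_ghz      # GFLOP/s compute roof
    b = (local_mem_bytes / (3 * word_bytes)) ** 0.5  # tile edge (A, B, C tiles)
    # Blocked matmul does 2*N^3 FLOPs and moves ~2*N^3/b words of traffic,
    # so arithmetic intensity is b / word_bytes FLOP per byte.
    intensity = b / word_bytes
    return min(peak, bandwidth_gbs * intensity), peak

# Hypothetical configuration:
# gflops, peak = sustained_gflops(n_cores=256, flops_per_cycle=2, freq_ghz=1.0,
#                                 bandwidth_gbs=16, local_mem_bytes=64 * 1024)
# efficiency = gflops / peak
```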