938 resultados para Elaborazione d’immagini, Microscopia, Istopatologia, Classificazione, K-means


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Os avanços tecnológicos e científicos, na área da saúde, têm vindo a aliar áreas como a Medicina e a Matemática, cabendo à ciência adequar de forma mais eficaz os meios de investigação, diagnóstico, monitorização e terapêutica. Os métodos desenvolvidos e os estudos apresentados nesta dissertação resultam da necessidade de encontrar respostas e soluções para os diferentes desafios identificados na área da anestesia. A índole destes problemas conduz, necessariamente, à aplicação, adaptação e conjugação de diferentes métodos e modelos das diversas áreas da matemática. A capacidade para induzir a anestesia em pacientes, de forma segura e confiável, conduz a uma enorme variedade de situações que devem ser levadas em conta, exigindo, por isso, intensivos estudos. Assim, métodos e modelos de previsão, que permitam uma melhor personalização da dosagem a administrar ao paciente e por monitorizar, o efeito induzido pela administração de cada fármaco, com sinais mais fiáveis, são fundamentais para a investigação e progresso neste campo. Neste contexto, com o objetivo de clarificar a utilização em estudos na área da anestesia de um ajustado tratamento estatístico, proponho-me abordar diferentes análises estatísticas para desenvolver um modelo de previsão sobre a resposta cerebral a dois fármacos durante sedação. Dados obtidos de voluntários serão utilizados para estudar a interação farmacodinâmica entre dois fármacos anestésicos. Numa primeira fase são explorados modelos de regressão lineares que permitam modelar o efeito dos fármacos no sinal cerebral BIS (índice bispectral do EEG – indicador da profundidade de anestesia); ou seja estimar o efeito que as concentrações de fármacos têm na depressão do eletroencefalograma (avaliada pelo BIS). Na segunda fase deste trabalho, pretende-se a identificação de diferentes interações com Análise de Clusters bem como a validação do respetivo modelo com Análise Discriminante, identificando grupos homogéneos na amostra obtida através das técnicas de agrupamento. O número de grupos existentes na amostra foi, numa fase exploratória, obtido pelas técnicas de agrupamento hierárquicas, e a caracterização dos grupos identificados foi obtida pelas técnicas de agrupamento k-means. A reprodutibilidade dos modelos de agrupamento obtidos foi testada através da análise discriminante. As principais conclusões apontam que o teste de significância da equação de Regressão Linear indicou que o modelo é altamente significativo. As variáveis propofol e remifentanil influenciam significativamente o BIS e o modelo melhora com a inclusão do remifentanil. Este trabalho demonstra ainda ser possível construir um modelo que permite agrupar as concentrações dos fármacos, com base no efeito no sinal cerebral BIS, com o apoio de técnicas de agrupamento e discriminantes. Os resultados desmontram claramente a interacção farmacodinâmica dos dois fármacos, quando analisamos o Cluster 1 e o Cluster 3. Para concentrações semelhantes de propofol o efeito no BIS é claramente diferente dependendo da grandeza da concentração de remifentanil. Em suma, o estudo demostra claramente, que quando o remifentanil é administrado com o propofol (um hipnótico) o efeito deste último é potenciado, levando o sinal BIS a valores bastante baixos.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In recent decades, all over the world, competition in the electric power sector has deeply changed the way this sector’s agents play their roles. In most countries, electric process deregulation was conducted in stages, beginning with the clients of higher voltage levels and with larger electricity consumption, and later extended to all electrical consumers. The sector liberalization and the operation of competitive electricity markets were expected to lower prices and improve quality of service, leading to greater consumer satisfaction. Transmission and distribution remain noncompetitive business areas, due to the large infrastructure investments required. However, the industry has yet to clearly establish the best business model for transmission in a competitive environment. After generation, the electricity needs to be delivered to the electrical system nodes where demand requires it, taking into consideration transmission constraints and electrical losses. If the amount of power flowing through a certain line is close to or surpasses the safety limits, then cheap but distant generation might have to be replaced by more expensive closer generation to reduce the exceeded power flows. In a congested area, the optimal price of electricity rises to the marginal cost of the local generation or to the level needed to ration demand to the amount of available electricity. Even without congestion, some power will be lost in the transmission system through heat dissipation, so prices reflect that it is more expensive to supply electricity at the far end of a heavily loaded line than close to an electric power generation. Locational marginal pricing (LMP), resulting from bidding competition, represents electrical and economical values at nodes or in areas that may provide economical indicator signals to the market agents. This article proposes a data-mining-based methodology that helps characterize zonal prices in real power transmission networks. To test our methodology, we used an LMP database from the California Independent System Operator for 2009 to identify economical zones. (CAISO is a nonprofit public benefit corporation charged with operating the majority of California’s high-voltage wholesale power grid.) To group the buses into typical classes that represent a set of buses with the approximate LMP value, we used two-step and k-means clustering algorithms. By analyzing the various LMP components, our goal was to extract knowledge to support the ISO in investment and network-expansion planning.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A methodology based on data mining techniques to support the analysis of zonal prices in real transmission networks is proposed in this paper. The mentioned methodology uses clustering algorithms to group the buses in typical classes that include a set of buses with similar LMP values. Two different clustering algorithms have been used to determine the LMP clusters: the two-step and K-means algorithms. In order to evaluate the quality of the partition as well as the best performance algorithm adequacy measurements indices are used. The paper includes a case study using a Locational Marginal Prices (LMP) data base from the California ISO (CAISO) in order to identify zonal prices.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Dissertação para a obtenção do grau de Mestre em Engenharia Electrotécnica Ramo de Energia

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A procura de padrões nos dados de modo a formar grupos é conhecida como aglomeração de dados ou clustering, sendo uma das tarefas mais realizadas em mineração de dados e reconhecimento de padrões. Nesta dissertação é abordado o conceito de entropia e são usados algoritmos com critérios entrópicos para fazer clustering em dados biomédicos. O uso da entropia para efetuar clustering é relativamente recente e surge numa tentativa da utilização da capacidade que a entropia possui de extrair da distribuição dos dados informação de ordem superior, para usá-la como o critério na formação de grupos (clusters) ou então para complementar/melhorar algoritmos existentes, numa busca de obtenção de melhores resultados. Alguns trabalhos envolvendo o uso de algoritmos baseados em critérios entrópicos demonstraram resultados positivos na análise de dados reais. Neste trabalho, exploraram-se alguns algoritmos baseados em critérios entrópicos e a sua aplicabilidade a dados biomédicos, numa tentativa de avaliar a adequação destes algoritmos a este tipo de dados. Os resultados dos algoritmos testados são comparados com os obtidos por outros algoritmos mais “convencionais" como o k-médias, os algoritmos de spectral clustering e um algoritmo baseado em densidade.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the present paper we compare clustering solutions using indices of paired agreement. We propose a new method - IADJUST - to correct indices of paired agreement, excluding agreement by chance. This new method overcomes previous limitations known in the literature as it permits the correction of any index. We illustrate its use in external clustering validation, to measure the accordance between clusters and an a priori known structure. The adjusted indices are intended to provide a realistic measure of clustering performance that excludes agreement by chance with ground truth. We use simulated data sets, under a range of scenarios - considering diverse numbers of clusters, clusters overlaps and balances - to discuss the pertinence and the precision of our proposal. Precision is established based on comparisons with the analytical approach for correction specific indices that can be corrected in this way are used for this purpose. The pertinence of the proposed correction is discussed when making a detailed comparison between the performance of two classical clustering approaches, namely Expectation-Maximization (EM) and K-Means (KM) algorithms. Eight indices of paired agreement are studied and new corrected indices are obtained.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Com base no modelo de Resposta à Intervenção (RtI), este estudo centrouse em três objetivos: construir um instrumento vocacionado para a determinação do nível de competências fundamentais, do 1º ao 6º anos, na disciplina de Matemática; avaliar o valor preditivo do instrumento sobre a necessidade de intervenção; examinar o efeito de uma intervenção planeada com base na avaliação diagnóstica desse instrumento. Para dar resposta ao primeiro e segundo objetivos foram consideradas duas amostras de conveniência: a primeira, constituída por 5 docentes, avaliou a versão teste do instrumento e a segunda, constituída por 6 docentes, avaliou a sua versão final (perfazendo um total de 75 alunos). Recorrendo ao método kmeans, os resultados mostraram que o instrumento é de útil e fácil aplicação, permitindo aos docentes avaliarem e identificarem o grupo de desempenho a que pertence cada aluno, em relação à média dos resultados da respetiva turma. Relativamente ao terceiro objetivo, foi constituída uma amostra de 7 alunos de uma turma do 4º ano. A intervenção decorreu ao longo de 11 semanas, com 2 sessões semanais, cuja duração variou entre 10 a 35 minutos. Para avaliar os efeitos da intervenção, foi realizado um pré e um pós-teste, assim como 2 sessões de avaliação intermédia (checkpoints), tendo-se recorrido ao teste não paramétrico de Friedman e ao teste de Wilcoxon, para avaliar a significância das diferenças entre os tempos e os níveis de suporte, para o aluno resolver a tarefa com sucesso, respetivamente. Os resultados mostraram diferenças estatiscamente significativas, particularmente entre as duas avaliações intermédia consideradas.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

O objetivo desta dissertação foi estudar um conjunto de empresas cotadas na bolsa de valores de Lisboa, para identificar aquelas que têm um comportamento semelhante ao longo do tempo. Para isso utilizamos algoritmos de Clustering tais como K-Means, PAM, Modelos hierárquicos, Funny e C-Means tanto com a distância euclidiana como com a distância de Manhattan. Para selecionar o melhor número de clusters identificado por cada um dos algoritmos testados, recorremos a alguns índices de avaliação/validação de clusters como o Davies Bouldin e Calinski-Harabasz entre outros.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

O paradigma de avaliação do ensino superior foi alterado em 2005 para ter em conta, para além do número de entradas, o número de alunos diplomados. Esta alteração pressiona as instituições académicas a melhorar o desempenho dos alunos. Um fenómeno perceptível ao analisar esse desempenho é que a performance registada não é nem uniforme nem constante ao longo da estadia do aluno no curso. Estas variações não estão a ser consideradas no esforço de melhorar o desempenho académico e surge motivação para detectar os diferentes perfis de desempenho e utilizar esse conhecimento para melhorar a o desempenho das instituições académicas. Este documento descreve o trabalho realizado no sentido de propor uma metodologia para detectar padrões de desempenho académico, num curso do ensino superior. Como ferramenta de análise são usadas técnicas de data mining, mais precisamente algoritmos de agrupamento. O caso de estudo para este trabalho é a população estudantil da licenciatura em Eng. Informática da FCT-UNL. Propõe-se dois modelos para o aluno, que servem de base para a análise. Um modelo analisa os alunos tendo em conta a sua performance num ano lectivo e o segundo analisa os alunos tendo em conta o seu percurso académico pelo curso, desde que entrou até se diplomar, transferir ou desistir. Esta análise é realizada recorrendo aos algoritmos de agrupamento: algoritmo aglomerativo hierárquico, k-means, SOM e SNN, entre outros.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The purpose of this study is to examine the psychographic (product attributes, motivation opinions, interest, lifestyle, values) characteristics of wine tourists along the Niagara wine r,~ute, located in Ontario, Canada, using a multiple case study method. Four wineries were selected, two wineries each on the East, and West sides of the wine route during the shoulder-season (January, February, 2004). Using a computer generated survey technique, tourists were approached to fill out a questionnaire on one of the available laptop computers, where a sample ofN=321 was obtained. The study findings revealed that there are three distinct wine tourist segments in the Niagara region. The segments were determined using an exploratory factor analysis (EFA) and a K-means cluster analysis: Wine Lovers, Wine Interested, and Wine Curious wine tourists. These three segments displayed significant differences in their, motivation for visiting a winery, lifestyles, values, and wine purchasing behaviour. This study also examined differences between winery locations, on the East and West sides of the Niagara wine route, with respect to the aforementioned variables. The results indicated that there were significant differences between the regions with respect to these variables. The findings suggest that these differences present opportunities for more effective marketing strategies based on the uniqueness of each region. The results of this study provide insight for academia into a method of psychographic market segmentation of wine tourists and consumer behaviour. This study also contributes to the literature on wine tourism, and the identification of psychographic characteristics of wine tourists, an area where little research has taken place.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Ce mémoire de maîtrise présente une nouvelle approche non supervisée pour détecter et segmenter les régions urbaines dans les images hyperspectrales. La méthode proposée n ́ecessite trois étapes. Tout d’abord, afin de réduire le coût calculatoire de notre algorithme, une image couleur du contenu spectral est estimée. A cette fin, une étape de réduction de dimensionalité non-linéaire, basée sur deux critères complémentaires mais contradictoires de bonne visualisation; à savoir la précision et le contraste, est réalisée pour l’affichage couleur de chaque image hyperspectrale. Ensuite, pour discriminer les régions urbaines des régions non urbaines, la seconde étape consiste à extraire quelques caractéristiques discriminantes (et complémentaires) sur cette image hyperspectrale couleur. A cette fin, nous avons extrait une série de paramètres discriminants pour décrire les caractéristiques d’une zone urbaine, principalement composée d’objets manufacturés de formes simples g ́eométriques et régulières. Nous avons utilisé des caractéristiques texturales basées sur les niveaux de gris, la magnitude du gradient ou des paramètres issus de la matrice de co-occurrence combinés avec des caractéristiques structurelles basées sur l’orientation locale du gradient de l’image et la détection locale de segments de droites. Afin de réduire encore la complexité de calcul de notre approche et éviter le problème de la ”malédiction de la dimensionnalité” quand on décide de regrouper des données de dimensions élevées, nous avons décidé de classifier individuellement, dans la dernière étape, chaque caractéristique texturale ou structurelle avec une simple procédure de K-moyennes et ensuite de combiner ces segmentations grossières, obtenues à faible coût, avec un modèle efficace de fusion de cartes de segmentations. Les expérimentations données dans ce rapport montrent que cette stratégie est efficace visuellement et se compare favorablement aux autres méthodes de détection et segmentation de zones urbaines à partir d’images hyperspectrales.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Computational Biology is the research are that contributes to the analysis of biological data through the development of algorithms which will address significant research problems.The data from molecular biology includes DNA,RNA ,Protein and Gene expression data.Gene Expression Data provides the expression level of genes under different conditions.Gene expression is the process of transcribing the DNA sequence of a gene into mRNA sequences which in turn are later translated into proteins.The number of copies of mRNA produced is called the expression level of a gene.Gene expression data is organized in the form of a matrix. Rows in the matrix represent genes and columns in the matrix represent experimental conditions.Experimental conditions can be different tissue types or time points.Entries in the gene expression matrix are real values.Through the analysis of gene expression data it is possible to determine the behavioral patterns of genes such as similarity of their behavior,nature of their interaction,their respective contribution to the same pathways and so on. Similar expression patterns are exhibited by the genes participating in the same biological process.These patterns have immense relevance and application in bioinformatics and clinical research.Theses patterns are used in the medical domain for aid in more accurate diagnosis,prognosis,treatment planning.drug discovery and protein network analysis.To identify various patterns from gene expression data,data mining techniques are essential.Clustering is an important data mining technique for the analysis of gene expression data.To overcome the problems associated with clustering,biclustering is introduced.Biclustering refers to simultaneous clustering of both rows and columns of a data matrix. Clustering is a global whereas biclustering is a local model.Discovering local expression patterns is essential for identfying many genetic pathways that are not apparent otherwise.It is therefore necessary to move beyond the clustering paradigm towards developing approaches which are capable of discovering local patterns in gene expression data.A biclusters is a submatrix of the gene expression data matrix.The rows and columns in the submatrix need not be contiguous as in the gene expression data matrix.Biclusters are not disjoint.Computation of biclusters is costly because one will have to consider all the combinations of columans and rows in order to find out all the biclusters.The search space for the biclustering problem is 2 m+n where m and n are the number of genes and conditions respectively.Usually m+n is more than 3000.The biclustering problem is NP-hard.Biclustering is a powerful analytical tool for the biologist.The research reported in this thesis addresses the problem of biclustering.Ten algorithms are developed for the identification of coherent biclusters from gene expression data.All these algorithms are making use of a measure called mean squared residue to search for biclusters.The objective here is to identify the biclusters of maximum size with the mean squared residue lower than a given threshold. All these algorithms begin the search from tightly coregulated submatrices called the seeds.These seeds are generated by K-Means clustering algorithm.The algorithms developed can be classified as constraint based,greedy and metaheuristic.Constarint based algorithms uses one or more of the various constaints namely the MSR threshold and the MSR difference threshold.The greedy approach makes a locally optimal choice at each stage with the objective of finding the global optimum.In metaheuristic approaches particle Swarm Optimization(PSO) and variants of Greedy Randomized Adaptive Search Procedure(GRASP) are used for the identification of biclusters.These algorithms are implemented on the Yeast and Lymphoma datasets.Biologically relevant and statistically significant biclusters are identified by all these algorithms which are validated by Gene Ontology database.All these algorithms are compared with some other biclustering algorithms.Algorithms developed in this work overcome some of the problems associated with the already existing algorithms.With the help of some of the algorithms which are developed in this work biclusters with very high row variance,which is higher than the row variance of any other algorithm using mean squared residue, are identified from both Yeast and Lymphoma data sets.Such biclusters which make significant change in the expression level are highly relevant biologically.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Biometrics deals with the physiological and behavioral characteristics of an individual to establish identity. Fingerprint based authentication is the most advanced biometric authentication technology. The minutiae based fingerprint identification method offer reasonable identification rate. The feature minutiae map consists of about 70-100 minutia points and matching accuracy is dropping down while the size of database is growing up. Hence it is inevitable to make the size of the fingerprint feature code to be as smaller as possible so that identification may be much easier. In this research, a novel global singularity based fingerprint representation is proposed. Fingerprint baseline, which is the line between distal and intermediate phalangeal joint line in the fingerprint, is taken as the reference line. A polygon is formed with the singularities and the fingerprint baseline. The feature vectors are the polygonal angle, sides, area, type and the ridge counts in between the singularities. 100% recognition rate is achieved in this method. The method is compared with the conventional minutiae based recognition method in terms of computation time, receiver operator characteristics (ROC) and the feature vector length. Speech is a behavioural biometric modality and can be used for identification of a speaker. In this work, MFCC of text dependant speeches are computed and clustered using k-means algorithm. A backpropagation based Artificial Neural Network is trained to identify the clustered speech code. The performance of the neural network classifier is compared with the VQ based Euclidean minimum classifier. Biometric systems that use a single modality are usually affected by problems like noisy sensor data, non-universality and/or lack of distinctiveness of the biometric trait, unacceptable error rates, and spoof attacks. Multifinger feature level fusion based fingerprint recognition is developed and the performances are measured in terms of the ROC curve. Score level fusion of fingerprint and speech based recognition system is done and 100% accuracy is achieved for a considerable range of matching threshold

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Any automatically measurable, robust and distinctive physical characteristic or personal trait that can be used to identify an individual or verify the claimed identity of an individual, referred to as biometrics, has gained significant interest in the wake of heightened concerns about security and rapid advancements in networking, communication and mobility. Multimodal biometrics is expected to be ultra-secure and reliable, due to the presence of multiple and independent—verification clues. In this study, a multimodal biometric system utilising audio and facial signatures has been implemented and error analysis has been carried out. A total of one thousand face images and 250 sound tracks of 50 users are used for training the proposed system. To account for the attempts of the unregistered signatures data of 25 new users are tested. The short term spectral features were extracted from the sound data and Vector Quantization was done using K-means algorithm. Face images are identified based on Eigen face approach using Principal Component Analysis. The success rate of multimodal system using speech and face is higher when compared to individual unimodal recognition systems

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Biclustering is simultaneous clustering of both rows and columns of a data matrix. A measure called Mean Squared Residue (MSR) is used to simultaneously evaluate the coherence of rows and columns within a submatrix. In this paper a novel algorithm is developed for biclustering gene expression data using the newly introduced concept of MSR difference threshold. In the first step high quality bicluster seeds are generated using K-Means clustering algorithm. Then more genes and conditions (node) are added to the bicluster. Before adding a node the MSR X of the bicluster is calculated. After adding the node again the MSR Y is calculated. The added node is deleted if Y minus X is greater than MSR difference threshold or if Y is greater than MSR threshold which depends on the dataset. The MSR difference threshold is different for gene list and condition list and it depends on the dataset also. Proper values should be identified through experimentation in order to obtain biclusters of high quality. The results obtained on bench mark dataset clearly indicate that this algorithm is better than many of the existing biclustering algorithms