17 results for Computing Classification Systems


Relevance:

100.00%

Publisher:

Abstract:

Dissertation submitted to obtain the degree of Master in Informatics Engineering

Relevance:

90.00%

Publisher:

Abstract:

This thesis describes the application of automatic learning methods to a) the classification of organic and metabolic reactions, and b) the mapping of Potential Energy Surfaces (PES). The classification of reactions was approached with two distinct methodologies: a representation of chemical reactions based on NMR data, and a representation of chemical reactions from the reaction equation based on the physico-chemical and topological features of chemical bonds. NMR-based classification of photochemical and enzymatic reactions. Photochemical and metabolic reactions were classified by Kohonen Self-Organizing Maps (Kohonen SOMs) and Random Forests (RFs), taking as input the difference between the 1H NMR spectra of the products and the reactants. Such a representation can be applied to the automatic analysis of changes in the 1H NMR spectrum of a mixture and their interpretation in terms of the chemical reactions taking place. Possible applications include the monitoring of reaction processes, the evaluation of the stability of chemicals, and the interpretation of metabonomic data. A Kohonen SOM trained with a data set of metabolic reactions catalysed by transferases correctly classified 75% of an independent test set in terms of the EC number subclass. Random Forests improved the correct predictions to 79%. With photochemical reactions classified into 7 groups, an independent test set was classified with 86-93% accuracy. The data set of photochemical reactions was also used to simulate mixtures with two reactions occurring simultaneously. Kohonen SOMs and Feed-Forward Neural Networks (FFNNs) were trained to classify the reactions occurring in a mixture based on the 1H NMR spectra of the products and reactants. Kohonen SOMs allowed the correct assignment of 53-63% of the mixtures (in a test set). Counter-Propagation Neural Networks (CPNNs) gave similar results.
Supervised learning techniques improved the results: correct assignments rose to 77% when an ensemble of ten FFNNs was used and to 80% when Random Forests were used. This study was performed with NMR data simulated from the molecular structure by the SPINUS program. In the design of one test set, simulated data was combined with experimental data. The results support the proposal of linking databases of chemical reactions to experimental or simulated NMR data for automatic classification of reactions and mixtures of reactions. Genome-scale classification of enzymatic reactions from their reaction equation. The MOLMAP descriptor relies on a Kohonen SOM that defines types of bonds on the basis of their physico-chemical and topological properties. The MOLMAP descriptor of a molecule represents the types of bonds available in that molecule. The MOLMAP descriptor of a reaction is defined as the difference between the MOLMAPs of the products and the reactants, and numerically encodes the pattern of bonds that are broken, changed, and made during a chemical reaction. The automatic perception of chemical similarities between metabolic reactions is required for a variety of applications, ranging from the computer validation of classification systems and the genome-scale reconstruction (or comparison) of metabolic pathways to the classification of enzymatic mechanisms. Catalytic functions of proteins are generally described by EC numbers, which are simultaneously employed as identifiers of reactions, enzymes, and enzyme genes, thus linking metabolic and genomic information. Different methods should be available to automatically compare metabolic reactions and to automatically assign EC numbers to reactions not yet officially classified.
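The reaction descriptor defined above (products minus reactants) can be illustrated with a minimal sketch. It assumes that a molecule's MOLMAP is a flat list of counts with one cell per SOM neuron, and that the MOLMAPs of several reactants or products are summed cell by cell before subtracting. The 4-neuron map, the counts, and the function names `sum_molmaps` and `reaction_molmap` are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def sum_molmaps(molmaps: Sequence[Sequence[float]]) -> List[float]:
    """Add the MOLMAPs of several molecules cell by cell."""
    if not molmaps:
        raise ValueError("at least one MOLMAP is required")
    size = len(molmaps[0])
    if any(len(m) != size for m in molmaps):
        raise ValueError("all MOLMAPs must have the same number of cells")
    return [sum(cells) for cells in zip(*molmaps)]


def reaction_molmap(
    reactants: Sequence[Sequence[float]],
    products: Sequence[Sequence[float]],
) -> List[float]:
    """Reaction descriptor: MOLMAP(products) minus MOLMAP(reactants)."""
    reactant_map = sum_molmaps(reactants)
    product_map = sum_molmaps(products)
    if len(reactant_map) != len(product_map):
        raise ValueError("reactant and product MOLMAPs differ in size")
    return [p - r for p, r in zip(product_map, reactant_map)]


# Toy example on a hypothetical 4-neuron map (each cell counts bonds of one type).
reactant_a = [2, 1, 0, 0]
reactant_b = [0, 1, 0, 0]
product = [2, 0, 1, 1]

descriptor = reaction_molmap([reactant_a, reactant_b], [product])
print(descriptor)  # [0, -2, 1, 1]
```

Negative cells mark bond types that disappear in the reaction, positive cells mark bond types that are formed, and zero cells mark bonds that are unchanged.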
In this study, the genome-scale data set of enzymatic reactions available in the KEGG database was encoded by the MOLMAP descriptors and submitted to Kohonen SOMs in order to compare the resulting map with the official EC number classification, to explore the possibility of predicting EC numbers from the reaction equation, and to assess the internal consistency of the EC classification at the class level. A general agreement with the EC classification was observed, i.e. a relationship between the similarity of MOLMAPs and the similarity of EC numbers. At the same time, MOLMAPs were able to discriminate between EC sub-subclasses. EC numbers could be assigned at the class, subclass, and sub-subclass levels with accuracies up to 92%, 80%, and 70% for independent test sets. The correspondence between chemical similarity of metabolic reactions and their MOLMAP descriptors was applied to the identification of a number of reactions mapped into the same neuron but belonging to different EC classes, which demonstrated the ability of the MOLMAP/SOM approach to verify the internal consistency of classifications in databases of metabolic reactions. RFs were also used to assign the four levels of the EC hierarchy from the reaction equation. EC numbers were correctly assigned in 95%, 90%, 85% and 86% of the cases (for independent test sets) at the class, subclass, sub-subclass and full EC number level, respectively. Experiments on the classification of reactions from the main reactants and products were performed with RFs; EC numbers were assigned at the class, subclass and sub-subclass level with accuracies of 78%, 74% and 63%, respectively. In the course of the experiments with metabolic reactions we suggested that the MOLMAP/SOM concept could be extended to the representation of other levels of metabolic information such as metabolic pathways.
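The per-level accuracies quoted above (class, subclass, sub-subclass, full EC number) can be computed by comparing EC numbers on their leading fields. The following is a minimal sketch of that bookkeeping, not the evaluation code used in the thesis; the function names and the toy label lists are hypothetical.

```python
from typing import Sequence, Tuple


def ec_prefix(ec_number: str, level: int) -> Tuple[str, ...]:
    """Return the first `level` fields of an EC number such as '2.7.1.1'."""
    parts = ec_number.split(".")
    if len(parts) != 4 or not 1 <= level <= 4:
        raise ValueError("expected a four-field EC number and a level in 1..4")
    return tuple(parts[:level])


def accuracy_at_level(
    true_ecs: Sequence[str], predicted_ecs: Sequence[str], level: int
) -> float:
    """Fraction of predictions that match the true EC number up to `level`."""
    if not true_ecs or len(true_ecs) != len(predicted_ecs):
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(
        ec_prefix(t, level) == ec_prefix(p, level)
        for t, p in zip(true_ecs, predicted_ecs)
    )
    return hits / len(true_ecs)


# Toy labels, for illustration only.
true_ecs = ["1.1.1.1", "2.7.1.1", "3.4.21.4", "2.7.1.2"]
predicted_ecs = ["1.1.1.2", "2.7.2.1", "3.4.21.4", "1.1.1.1"]

for level, name in enumerate(["class", "subclass", "sub-subclass", "full EC"], 1):
    print(name, accuracy_at_level(true_ecs, predicted_ecs, level))
```

Because a match at a deeper level implies a match at every shallower level, accuracy on a fixed test set can only stay equal or drop as the level increases. The higher full-EC figure reported above (86% against 85%) therefore suggests that the test sets differed between levels.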
Following the MOLMAP idea, the pattern of neurons activated by the reactions of a metabolic pathway is a representation of the reactions involved in that pathway, that is, a descriptor of the metabolic pathway. This reasoning enabled the comparison of different pathways, the automatic classification of pathways, and a classification of organisms based on their biochemical machinery. The three levels of classification (from bonds to metabolic pathways) made it possible to map and perceive chemical similarities between metabolic pathways, even for pathways of different types of metabolism and pathways that do not share similarities in terms of EC numbers. Mapping of PES by neural networks (NNs). In a first series of experiments, ensembles of Feed-Forward NNs (EnsFFNNs) and Associative Neural Networks (ASNNs) were trained to reproduce PES represented by the Lennard-Jones (LJ) analytical potential function. The accuracy of the method was assessed by comparing the results of molecular dynamics simulations (thermal, structural, and dynamic properties) obtained from the NN-based PES and from the LJ function. The results indicated that, for LJ-type potentials, NNs can be trained to generate accurate PES to be used in molecular simulations. EnsFFNNs and ASNNs gave better results than single FFNNs. The NN models showed a remarkable ability to interpolate between distant curves and to accurately reproduce potentials to be used in molecular simulations. The purpose of the first study was to systematically analyse the accuracy of different NNs. Our main motivation, however, is reflected in the next study: the mapping of multidimensional PES by NNs to simulate, by Molecular Dynamics or Monte Carlo, the adsorption and self-assembly of solvated organic molecules on noble-metal electrodes. Indeed, for such complex and heterogeneous systems, the development of suitable analytical functions that fit quantum mechanical interaction energies is a non-trivial or even impossible task.
The data consisted of energy values, from Density Functional Theory (DFT) calculations, at different distances, for several molecular orientations and three electrode adsorption sites. The results indicate that NNs require a data set large enough to adequately cover the diversity of possible interaction sites, distances, and orientations. NNs trained with such data sets can perform as well as or even better than analytical functions. Therefore, they can be used in molecular simulations, particularly for the ethanol/Au(111) interface, which is the case studied in the present thesis. Once properly trained, the networks are able to produce, as output, any required number of energy points for accurate interpolations.
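For reference, the Lennard-Jones potential mentioned in this abstract has the standard 12-6 form V(r) = 4ε[(σ/r)^12 - (σ/r)^6]. The sketch below evaluates it and samples (distance, energy) pairs of the kind a network could be trained to reproduce. The reduced units (ε = σ = 1), the sampling range, and the helper name `sample_curve` are assumptions made for illustration and are not taken from the thesis.

```python
from typing import List, Tuple


def lennard_jones(r: float, epsilon: float = 1.0, sigma: float = 1.0) -> float:
    """Standard 12-6 Lennard-Jones pair potential at separation r."""
    if r <= 0:
        raise ValueError("separation must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def sample_curve(
    r_min: float, r_max: float, n_points: int
) -> List[Tuple[float, float]]:
    """Evenly spaced (r, V(r)) pairs, e.g. as training targets for a network."""
    if n_points < 2 or r_max <= r_min:
        raise ValueError("need at least two points and r_max > r_min")
    step = (r_max - r_min) / (n_points - 1)
    return [
        (r_min + i * step, lennard_jones(r_min + i * step))
        for i in range(n_points)
    ]


training_pairs = sample_curve(0.9, 2.5, 5)
for r, energy in training_pairs:
    print(f"r = {r:.2f}  V = {energy:+.4f}")
```

The potential crosses zero at r = σ and reaches its minimum of -ε at r = 2^(1/6) σ, which gives two quick sanity checks for any fitted model.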

Relevance:

80.00%

Publisher:

Abstract:

Work presented within the Master's programme in Informatics Engineering, as a partial requirement for obtaining the degree of Master in Informatics Engineering

Relevance:

80.00%

Publisher:

Abstract:

ABSTRACT - Problem statement: The inadequacy and ineffectiveness of the "per diem" financing system for rehabilitation care led many countries to create classification systems for rehabilitation inpatients. Portugal also needs to implement a financing system based on a patient classification system, adjusted for the complexity and care needs of these patients. Objectives: To characterize rehabilitation care in Portugal and the current financing system for these patients; to review the literature on existing classification systems for rehabilitation patients, in order to understand which grouping variables they use and how well they predict costs; and to understand the importance of implementing one of these classification systems in Portugal, and its advantages. Methodology: The literature review identified four patient classification systems that have been implemented, or are about to be implemented, as the basis for a financing system, in the USA, Australia and Canada. These systems were extensively characterized and critically analysed. Conclusions: Of the few existing classification systems for rehabilitation patients, the North American system was chosen for a study of its possible adoption in the Portuguese context, because it is the only classification system that has been used for financing purposes for all rehabilitation patients since 2002, it includes the most decision variables in the classification of patients, and it predicts the largest share of patient costs in percentage terms.

Relevance:

80.00%

Publisher:

Abstract:

Dissertation submitted in fulfillment of the requirements for the Degree of Master in Biomedical Engineering

Relevance:

80.00%

Publisher:

Abstract:

ABSTRACT - Defining and measuring output are central issues for hospital administration. Hospital output, when treated cases are considered, rests on two aspects: the definition of patient classification systems as a methodology for identifying products, and the creation of casemix indices for comparing those products. Their definition and implementation may consider characteristics related to the complexity of cases (a supply-side attribute), to their severity (a demand-side attribute), or mixed characteristics. In turn, the analysis of hospitals' admission profile and admission policy gains importance in the context of new experiments planned and under way in the SNS (Serviço Nacional de Saúde, the Portuguese National Health Service) and of the renewed need for evaluation and regulation that follows from them. This study set out to discuss the methodology for computing the casemix index of hospitals, introducing the severity of treated cases as a relevant attribute. To that end, a sample of 950,443 cases from the 2002 discharge summary database was analysed, with particular attention to the 31 hospitals later constituted as SA hospitals. Three casemix indices were considered: a complexity index (based on the relative weight of the DRGs), a severity index (based on the expected mortality scale of disease staging, recalibrated for Portugal), and a joint index (the mean of the previous two). The complexity, severity and joint indices were found to give distinct information about the admission profile of the hospitals considered. The complexity and severity indices show distinct associations with the characteristics of the hospitals and of the patients treated. In addition, there is a clear difference between cases with medical and with surgical treatment.
However, across the hospitals analysed as a whole, the hospitals that treat the most severe cases also treat the most complex ones. Some hospitals where this does not hold were also identified and, where possible, potential reasons for this behaviour were suggested.
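The three indices described in this abstract can be illustrated with a small sketch. The abstract states only that the joint index is the mean of the complexity and severity indices. The normalisation used here (a hospital's mean value divided by the mean over all reference cases, so that the reference population scores 1.0) is an assumption, and all names and numbers are hypothetical.

```python
from statistics import mean
from typing import Sequence


def casemix_index(
    hospital_values: Sequence[float], reference_values: Sequence[float]
) -> float:
    """Hospital mean relative to the reference mean (assumed normalisation)."""
    return mean(hospital_values) / mean(reference_values)


def joint_index(complexity: float, severity: float) -> float:
    """Joint index: the mean of the complexity and severity indices."""
    return (complexity + severity) / 2.0


# Toy data: DRG relative weights and expected mortality per treated case.
hospital_drg_weights = [0.5, 1.5, 2.5]
national_drg_weights = [0.5, 1.5, 2.5, 0.5, 1.0]
hospital_expected_mortality = [0.02, 0.04, 0.06]
national_expected_mortality = [0.02, 0.04, 0.06, 0.01, 0.02]

complexity = casemix_index(hospital_drg_weights, national_drg_weights)
severity = casemix_index(hospital_expected_mortality, national_expected_mortality)
combined = joint_index(complexity, severity)

print(f"complexity = {complexity:.3f}")
print(f"severity   = {severity:.3f}")
print(f"joint      = {combined:.3f}")
```

In this toy case the hospital sits above the reference on both indices, but by different margins, which is the kind of divergence between complexity and severity that the study examines.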

Relevance:

30.00%

Publisher:

Abstract:

This work was carried out under the supervision of Prof. António Brandão Moniz for the course “Factores Sociais da Inovação” (Social Factors of Innovation) of the Master's programme in Informatics Engineering at the Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa (Portugal)

Relevance:

30.00%

Publisher:

Abstract:

Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies

Relevance:

30.00%

Publisher:

Abstract:

Signal Processing, Vol. 83, No. 11

Relevance:

30.00%

Publisher:

Abstract:

Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies

Relevance:

30.00%

Publisher:

Abstract:

Thesis submitted to the Instituto Superior de Estatística e Gestão de Informação da Universidade Nova de Lisboa in partial fulfillment of the requirements for the Degree of Doctor of Philosophy in Information Management – Geographic Information Systems

Relevance:

30.00%

Publisher:

Abstract:

Dissertation presented to obtain the degree of Doctor in Informatics from the Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia

Relevance:

30.00%

Publisher:

Abstract:

Dissertation submitted to obtain the degree of Master in Informatics Engineering

Relevance:

30.00%

Publisher:

Abstract:

The Graphics Processing Unit (GPU) is present in almost every modern-day personal computer. Despite their special-purpose design, GPUs have been increasingly used for general computations with very good results. Hence, there is a growing effort from the community to seamlessly integrate this kind of device into everyday computing. However, to fully exploit the potential of a system comprising GPUs and CPUs, these devices should be presented to the programmer as a single platform. The efficient combination of the power of CPU and GPU devices is highly dependent on each device’s characteristics, resulting in platform-specific applications that cannot be ported to different systems. Also, the most efficient work balance among devices is highly dependent on the computations to be performed and the respective data sizes. In this work, we propose a solution for heterogeneous environments based on the abstraction level provided by algorithmic skeletons. Our goal is to take full advantage of the power of all CPU and GPU devices present in a system, without the need for different kernel implementations or explicit work distribution. To that end, we extended Marrow, an algorithmic skeleton framework for multi-GPU systems, to support CPU computations and efficiently balance the workload between devices. Our approach is based on an offline training execution that identifies the ideal work balance and platform configurations for a given application and input data size. The evaluation of this work shows that the combination of CPU and GPU devices can significantly boost the performance of our benchmarks in the tested environments, when compared to GPU-only executions.
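The idea of searching offline for a good CPU/GPU work balance can be sketched as follows. This is not Marrow's API or its training procedure: the sketch replaces measured runs with a simple assumed cost model in which each device processes items at a constant rate and both run concurrently, and all names and rates are hypothetical.

```python
def makespan(cpu_share: float, n_items: int, cpu_rate: float, gpu_rate: float) -> float:
    """Completion time when the CPU and GPU work concurrently on their shares.

    Assumed model: each device processes items at a constant rate (items/s).
    """
    cpu_time = cpu_share * n_items / cpu_rate
    gpu_time = (1.0 - cpu_share) * n_items / gpu_rate
    return max(cpu_time, gpu_time)


def best_cpu_share(n_items: int, cpu_rate: float, gpu_rate: float, steps: int = 100) -> float:
    """Try evenly spaced CPU shares and keep the one with the smallest makespan."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda s: makespan(s, n_items, cpu_rate, gpu_rate))


# Hypothetical rates: the GPU is three times faster than the CPU.
n_items, cpu_rate, gpu_rate = 1000, 1.0, 3.0

share = best_cpu_share(n_items, cpu_rate, gpu_rate)
balanced = makespan(share, n_items, cpu_rate, gpu_rate)
gpu_only = makespan(0.0, n_items, cpu_rate, gpu_rate)

print(f"best CPU share: {share:.2f}")
print(f"balanced time:  {balanced:.1f}")
print(f"GPU-only time:  {gpu_only:.1f}")
```

In an actual training run, `makespan` would be replaced by timed executions of the application for each candidate split and input size, since real devices rarely follow a constant-rate model.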

Relevance:

30.00%

Publisher:

Abstract:

In recent years, volunteers have contributed massively to what is now known as Volunteered Geographic Information. This huge amount of data may hide vast geographical richness, so research is needed to explore its potential and use it to solve real-world problems. In this study we conduct an exploratory analysis of data from the OpenStreetMap initiative. Using the Corine Land Cover database as reference and continental Portugal as the study area, we establish a possible correspondence between the two classification nomenclatures, evaluate the quality of the classification of OpenStreetMap polygon features against the level 1 classes of the Corine Land Cover nomenclature, and analyze the spatial distribution of OpenStreetMap classes over continental Portugal. A global classification accuracy of around 76% and interesting coverage-area values are promising results that encourage future research on this topic.
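A global classification accuracy of the kind reported above can be computed by mapping each OpenStreetMap feature to a Corine Land Cover level 1 class and counting agreements with the reference. The sketch below is illustrative only: the tag-to-class correspondence is a hypothetical example rather than the one established in the study, the samples are invented, and the accuracy is count-based, whereas the study may have weighted features by area.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical correspondence from OpenStreetMap tags to CLC level 1 classes.
OSM_TO_CLC1: Dict[str, str] = {
    "landuse=residential": "Artificial surfaces",
    "landuse=farmland": "Agricultural areas",
    "landuse=forest": "Forest and semi natural areas",
    "natural=wetland": "Wetlands",
    "natural=water": "Water bodies",
}


def overall_accuracy(samples: Sequence[Tuple[str, str]]) -> float:
    """Share of features whose mapped OSM class equals the CLC reference class.

    Each sample is (osm_tag, clc_level1_reference); unmapped tags are skipped.
    """
    compared = [(OSM_TO_CLC1[tag], ref) for tag, ref in samples if tag in OSM_TO_CLC1]
    if not compared:
        raise ValueError("no samples with a known OSM-to-CLC correspondence")
    hits = sum(1 for predicted, ref in compared if predicted == ref)
    return hits / len(compared)


# Invented samples: three agreements and one disagreement.
samples = [
    ("landuse=residential", "Artificial surfaces"),
    ("landuse=farmland", "Agricultural areas"),
    ("landuse=forest", "Agricultural areas"),
    ("natural=water", "Water bodies"),
]

print(overall_accuracy(samples))  # 0.75
```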