This Thesis describes the application of automatic learning methods to a) the classification of organic and metabolic reactions, and b) the mapping of Potential Energy Surfaces (PES). The classification of reactions was approached with two distinct methodologies: a representation of chemical reactions based on NMR data, and a representation of chemical reactions from the reaction equation based on the physico-chemical and topological features of chemical bonds.

NMR-based classification of photochemical and enzymatic reactions. Photochemical and metabolic reactions were classified by Kohonen Self-Organizing Maps (Kohonen SOMs) and Random Forests (RFs), taking as input the difference between the 1H NMR spectra of the products and the reactants. Such a representation can be applied to the automatic analysis of changes in the 1H NMR spectrum of a mixture and to their interpretation in terms of the chemical reactions taking place. Possible applications include the monitoring of reaction processes, the evaluation of the stability of chemicals, and the interpretation of metabonomic data. A Kohonen SOM trained with a data set of metabolic reactions catalysed by transferases correctly classified 75% of an independent test set in terms of the EC number subclass. Random Forests improved the correct predictions to 79%. With photochemical reactions classified into 7 groups, an independent test set was classified with 86-93% accuracy. The data set of photochemical reactions was also used to simulate mixtures with two reactions occurring simultaneously. Kohonen SOMs and Feed-Forward Neural Networks (FFNNs) were trained to classify the reactions occurring in a mixture based on the 1H NMR spectra of the products and reactants. Kohonen SOMs correctly assigned 53-63% of the mixtures in a test set. Counter-Propagation Neural Networks (CPNNs) gave similar results.
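The difference-spectrum input described above can be illustrated with a minimal sketch. It assumes that each spectrum is a list of (chemical shift, intensity) peaks and that the descriptor is built by summing intensities in fixed-width shift intervals; the 0-12 ppm range, the 0.5 ppm bin width, and the function names are illustrative assumptions, not the settings used in the Thesis.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (chemical shift in ppm, intensity)


def bin_spectrum(peaks: List[Peak], lo: float = 0.0, hi: float = 12.0,
                 width: float = 0.5) -> List[float]:
    """Sum peak intensities into fixed-width chemical-shift bins.

    The range and bin width are assumed values chosen for illustration.
    """
    n_bins = int(round((hi - lo) / width))
    bins = [0.0] * n_bins
    for shift, intensity in peaks:
        if lo <= shift < hi:
            index = min(int((shift - lo) / width), n_bins - 1)
            bins[index] += intensity
    return bins


def reaction_descriptor(reactants: List[List[Peak]],
                        products: List[List[Peak]]) -> List[float]:
    """Binned spectra of all products minus binned spectra of all reactants."""
    n_bins = len(bin_spectrum([]))
    descriptor = [0.0] * n_bins
    for spectrum in products:
        for i, value in enumerate(bin_spectrum(spectrum)):
            descriptor[i] += value
    for spectrum in reactants:
        for i, value in enumerate(bin_spectrum(spectrum)):
            descriptor[i] -= value
    return descriptor


# Toy example with made-up peaks: one signal is unchanged by the reaction,
# one signal moves from about 2.1 ppm to about 3.6 ppm.
toy_reactants = [[(7.2, 5.0), (2.1, 3.0)]]
toy_products = [[(7.3, 5.0), (3.6, 3.0)]]
toy_descriptor = reaction_descriptor(toy_reactants, toy_products)
```

In this toy case the unchanged signal cancels out, while the shifted signal leaves a negative entry in the bin it left and a positive entry in the bin it moved to. A vector of this kind is what a SOM or Random Forest would receive as input.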
The use of supervised learning techniques improved the results: 77% of the mixtures were correctly assigned when an ensemble of ten FFNNs was used, and 80% when Random Forests were used. This study was performed with NMR data simulated from the molecular structure by the SPINUS program. In the design of one test set, simulated data were combined with experimental data. The results support the proposal of linking databases of chemical reactions to experimental or simulated NMR data for the automatic classification of reactions and mixtures of reactions.

Genome-scale classification of enzymatic reactions from their reaction equation. The MOLMAP descriptor relies on a Kohonen SOM that defines types of bonds on the basis of their physico-chemical and topological properties. The MOLMAP descriptor of a molecule represents the types of bonds available in that molecule. The MOLMAP descriptor of a reaction is defined as the difference between the MOLMAPs of the products and the reactants, and numerically encodes the pattern of bonds that are broken, changed, and made during a chemical reaction. The automatic perception of chemical similarities between metabolic reactions is required for a variety of applications, ranging from the computer validation of classification systems and the genome-scale reconstruction (or comparison) of metabolic pathways to the classification of enzymatic mechanisms. Catalytic functions of proteins are generally described by EC numbers, which are simultaneously employed as identifiers of reactions, enzymes, and enzyme genes, thus linking metabolic and genomic information. Different methods should be available to compare metabolic reactions automatically and to assign EC numbers automatically to reactions that are not yet officially classified.
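The definition of the reaction MOLMAP as a difference of molecular MOLMAPs can be sketched as follows. This is a simplified illustration and not the implementation used in the Thesis: it assumes that every bond has already been mapped to the index of its winning neuron on a trained SOM, and it simply counts bonds per neuron. The four-neuron map, the neuron indices, and the function names are hypothetical.

```python
from collections import Counter
from typing import List


def molecule_molmap(bond_neurons: List[int], n_neurons: int) -> List[int]:
    """Count how many bonds of a molecule fall on each SOM neuron.

    bond_neurons holds one winning-neuron index per bond (assumed given).
    """
    counts = Counter(bond_neurons)
    return [counts.get(i, 0) for i in range(n_neurons)]


def reaction_molmap(reactants: List[List[int]], products: List[List[int]],
                    n_neurons: int) -> List[int]:
    """MOLMAPs of all products minus MOLMAPs of all reactants."""
    result = [0] * n_neurons
    for molecule in products:
        for i, count in enumerate(molecule_molmap(molecule, n_neurons)):
            result[i] += count
    for molecule in reactants:
        for i, count in enumerate(molecule_molmap(molecule, n_neurons)):
            result[i] -= count
    return result


# Toy example on a hypothetical 4-neuron map (indices 0..3).
toy_reactants = [[0, 1, 1, 2], [3, 3]]
toy_products = [[0, 1, 3], [1, 3, 3]]
toy_reaction = reaction_molmap(toy_reactants, toy_products, n_neurons=4)
```

Bond types that are untouched by the reaction cancel out, so the nonzero entries mark the bond types that disappear (negative) or appear (positive). In the toy example one bond of type 2 is lost and one bond of type 3 is gained.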
In this study, the genome-scale data set of enzymatic reactions available in the KEGG database was encoded by MOLMAP descriptors and submitted to Kohonen SOMs in order to compare the resulting map with the official EC number classification, to explore the possibility of predicting EC numbers from the reaction equation, and to assess the internal consistency of the EC classification at the class level. A general agreement with the EC classification was observed, i.e. a relationship between the similarity of MOLMAPs and the similarity of EC numbers. At the same time, MOLMAPs were able to discriminate between EC sub-subclasses. EC numbers could be assigned at the class, subclass, and sub-subclass levels with accuracies of up to 92%, 80%, and 70% for independent test sets. The correspondence between the chemical similarity of metabolic reactions and their MOLMAP descriptors was applied to the identification of a number of reactions mapped into the same neuron but belonging to different EC classes, which demonstrated the ability of the MOLMAP/SOM approach to verify the internal consistency of classifications in databases of metabolic reactions. RFs were also used to assign the four levels of the EC hierarchy from the reaction equation. EC numbers were correctly assigned in 95%, 90%, 85%, and 86% of the cases (for independent test sets) at the class, subclass, sub-subclass, and full EC number levels, respectively. Experiments on the classification of reactions from the main reactants and products only were also performed with RFs: EC numbers were assigned at the class, subclass, and sub-subclass levels with accuracies of 78%, 74%, and 63%, respectively. In the course of the experiments with metabolic reactions, we suggested that the MOLMAP/SOM concept could be extended to the representation of other levels of metabolic information, such as metabolic pathways.
Following the MOLMAP idea, the pattern of neurons activated by the reactions of a metabolic pathway is a representation of the reactions involved in that pathway, that is, a descriptor of the metabolic pathway. This reasoning enabled the comparison of different pathways, the automatic classification of pathways, and a classification of organisms based on their biochemical machinery. The three levels of classification (from bonds to metabolic pathways) made it possible to map and perceive chemical similarities between metabolic pathways, even for pathways of different types of metabolism and pathways that do not share similarities in terms of EC numbers.

Mapping of PES by neural networks (NNs). In a first series of experiments, ensembles of Feed-Forward NNs (EnsFFNNs) and Associative Neural Networks (ASNNs) were trained to reproduce PES represented by the Lennard-Jones (LJ) analytical potential function. The accuracy of the method was assessed by comparing the results of molecular dynamics simulations (thermal, structural, and dynamic properties) obtained from the NN-generated PES and from the LJ function. The results indicated that, for LJ-type potentials, NNs can be trained to generate accurate PES for use in molecular simulations. EnsFFNNs and ASNNs gave better results than single FFNNs. The NN models showed a remarkable ability to interpolate between distant curves and to reproduce accurately the potentials to be used in molecular simulations. The purpose of the first study was to analyse systematically the accuracy of different NNs. Our main motivation, however, is reflected in the next study: the mapping of multidimensional PES by NNs in order to simulate, by Molecular Dynamics or Monte Carlo, the adsorption and self-assembly of solvated organic molecules on noble-metal electrodes. Indeed, for such complex and heterogeneous systems, the development of suitable analytical functions that fit quantum mechanical interaction energies is a non-trivial or even impossible task.
The data consisted of energy values from Density Functional Theory (DFT) calculations at different distances, for several molecular orientations and three electrode adsorption sites. The results indicate that NNs require a data set large enough to cover the diversity of possible interaction sites, distances, and orientations. NNs trained with such data sets can perform as well as, or even better than, analytical functions. Therefore, they can be used in molecular simulations, particularly for the ethanol/Au(111) interface, which is the case studied in the present Thesis. Once properly trained, the networks are able to produce, as output, any required number of energy points for accurate interpolations.
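For reference, the Lennard-Jones function used as the target PES in the first series of experiments has the standard 12-6 form V(r) = 4ε[(σ/r)^12 − (σ/r)^6]. The sketch below shows how reference (distance, energy) pairs for training an NN on such a potential might be generated. The reduced units (ε = σ = 1), the evenly spaced distance grid, and the function names are assumptions made for illustration; the sampling scheme actually used in the Thesis is not stated here.

```python
from typing import List, Tuple


def lj_potential(r: float, epsilon: float = 1.0, sigma: float = 1.0) -> float:
    """Standard 12-6 Lennard-Jones pair potential."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def training_grid(r_min: float, r_max: float, n_points: int,
                  epsilon: float = 1.0,
                  sigma: float = 1.0) -> List[Tuple[float, float]]:
    """Evenly spaced (r, V(r)) pairs, an assumed sampling scheme."""
    step = (r_max - r_min) / (n_points - 1)
    distances = [r_min + i * step for i in range(n_points)]
    return [(r, lj_potential(r, epsilon, sigma)) for r in distances]


# Reference points in reduced units, from short range out to a cutoff.
reference = training_grid(r_min=0.9, r_max=2.5, n_points=50)
```

The potential is zero at r = σ and reaches its minimum of −ε at r = 2^(1/6) σ, so these two points are a convenient sanity check on whatever model is fitted to the pairs.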
SUMMARY: Studies on the functioning of the elderly population are an important part of current knowledge of world demography. Portugal is, and is projected to remain, among the most aged countries, and it has a post-acute care network, the National Network of Integrated Continuous Care (Rede Nacional de Cuidados Continuados Integrados, RNCCI), that assists a large share of that population. The conceptual aspects of functioning, as defined by the WHO and operationalized by the International Classification of Functioning (ICF), have so far not been sufficiently applied in our country, which has prevented us from contributing to their operationalization. Likewise, the Core Sets of the Classification have not been subjected to validation processes that include Portuguese samples, so the specificity of contextual factors in our population remains unknown. The present study aims to describe the evolution of the functioning of the elderly assisted by the RNCCI in the Algarve region, in convalescence and medium-term units, to validate the WHO Geriatric Core Set, and to propose a brief version of its comprehensive form in the context of this care. The sample of 451 elderly people, 62.1% of whom were women, showed favourable levels of functioning in the pre-morbid state, except for Domestic Activities. However, the oldest (≥ 85 years), individuals with no schooling, women, and the widowed or single showed more unfavourable cases than their peers. In the evolution of functioning we observed significant improvements in all domains assessed, with differences related to age and schooling; despite the positive results, the oldest and the individuals with no schooling showed lower levels of improvement.
However, the functioning achieved was significantly lower than the functioning the individuals had in the pre-morbid state. The regression models showed that Mental Functions, Perceived Health Status, and the activity Using the Telephone were the variables that best explained the outcomes of the functioning achieved. Validation of the Geriatric Core Set was possible for most categories, and Body Functions was the component in which the process proved weakest. Neuromusculoskeletal and Movement-Related Functions recorded the highest frequencies of impairment at both assessment times, while in the Activities & Participation component this occurred in the activity Fine Hand Use. The chapters Support and Relationships and Attitudes were considered the Environmental Factors that acted most as Facilitators but that also had the greatest impact as Barriers. The proposal for the Brief Geriatric Core Set resulted from the independent categories that explained the models of the functioning achieved; it comprises a set of 27 categories, with a strong emphasis on the Activities & Participation component, in which the domains of Mobility and Self-Care stand out. The functioning of individuals and populations should be considered an essential Public Health variable, whose assessment should reflect a biopsychosocial approach supported by the International Classification of Functioning. Operationalizing the Classification through the Core Sets requires further research into the psychometric characteristics of its qualifiers and into its validation processes.

ABSTRACT: Studies on the functioning of the elderly play an important role in current knowledge of world demography.
Portugal ranks among the most aged countries and has a network of post-acute care, the National Network of Integrated Continuous Care (RNCCI), which assists a large part of that population. The conceptual aspects of functioning according to the WHO, as operationalized by the International Classification of Functioning (ICF), have not yet been sufficiently applied in our country, which has prevented us from contributing to their operationalization. Likewise, the Core Sets of the Classification have not been subjected to validation procedures that include Portuguese samples, so the specificity of the contextual factors in our population remains unknown. The objectives of the present study were to describe the evolution of the functioning of the elderly assisted by the RNCCI in the Algarve region, in convalescence and medium-term units, to validate the WHO Geriatric Core Set, and to propose a brief version of this comprehensive core set in this healthcare context. The sample was composed of 451 elderly people, 62.1% of whom were women. They showed favourable levels of functioning in the pre-morbid state, except for Domestic Activities. However, the oldest (≥ 85 years), individuals with no education, women, and the widowed or unmarried showed more unfavourable cases than their peers. In the evolution of functioning we observed significant improvements in all domains assessed, with differences with respect to age and education. In spite of the positive results, the oldest and the individuals with no education showed lower levels of improvement. However, the functioning achieved was significantly lower than that observed in the pre-morbid state. Regression models revealed that Mental Functions, Perceived Health Status, and the Using the Telephone activity were the variables that best explained the outcomes of the functioning achieved.
The validation of the ICF Geriatric Core Set was possible in most categories, and Body Functions was the component in which this process showed the greatest weakness. Neuromusculoskeletal and Movement-Related Functions showed the highest rates of impairment at both evaluation times, while in the Activities & Participation component this occurred in the Fine Hand Use activity. The Support and Relationships and the Attitudes chapters were considered the Environmental Factors that acted most as Facilitators but that also had the greatest impact as Barriers. The proposal for the Brief Geriatric Core Set resulted from the independent categories that explained the regression models of functioning and includes a set of 27 categories, with an important emphasis on the Activities & Participation component, in which the Mobility and Self-Care domains stand out. The functioning of individuals and populations should be considered an essential Public Health variable, whose assessment should reflect a biopsychosocial approach based on the International Classification of Functioning. The operationalization of the Classification through the Core Sets requires further research into the psychometric characteristics of its qualifiers and into its validation procedures.