11 resultados para Class Numbers


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This Thesis describes the application of automatic learning methods for a) the classification of organic and metabolic reactions, and b) the mapping of Potential Energy Surfaces(PES). The classification of reactions was approached with two distinct methodologies: a representation of chemical reactions based on NMR data, and a representation of chemical reactions from the reaction equation based on the physico-chemical and topological features of chemical bonds. NMR-based classification of photochemical and enzymatic reactions. Photochemical and metabolic reactions were classified by Kohonen Self-Organizing Maps (Kohonen SOMs) and Random Forests (RFs) taking as input the difference between the 1H NMR spectra of the products and the reactants. The development of such a representation can be applied in automatic analysis of changes in the 1H NMR spectrum of a mixture and their interpretation in terms of the chemical reactions taking place. Examples of possible applications are the monitoring of reaction processes, evaluation of the stability of chemicals, or even the interpretation of metabonomic data. A Kohonen SOM trained with a data set of metabolic reactions catalysed by transferases was able to correctly classify 75% of an independent test set in terms of the EC number subclass. Random Forests improved the correct predictions to 79%. With photochemical reactions classified into 7 groups, an independent test set was classified with 86-93% accuracy. The data set of photochemical reactions was also used to simulate mixtures with two reactions occurring simultaneously. Kohonen SOMs and Feed-Forward Neural Networks (FFNNs) were trained to classify the reactions occurring in a mixture based on the 1H NMR spectra of the products and reactants. Kohonen SOMs allowed the correct assignment of 53-63% of the mixtures (in a test set). Counter-Propagation Neural Networks (CPNNs) gave origin to similar results. The use of supervised learning techniques allowed an improvement in the results. They were improved to 77% of correct assignments when an ensemble of ten FFNNs were used and to 80% when Random Forests were used. This study was performed with NMR data simulated from the molecular structure by the SPINUS program. In the design of one test set, simulated data was combined with experimental data. The results support the proposal of linking databases of chemical reactions to experimental or simulated NMR data for automatic classification of reactions and mixtures of reactions. Genome-scale classification of enzymatic reactions from their reaction equation. The MOLMAP descriptor relies on a Kohonen SOM that defines types of bonds on the basis of their physico-chemical and topological properties. The MOLMAP descriptor of a molecule represents the types of bonds available in that molecule. The MOLMAP descriptor of a reaction is defined as the difference between the MOLMAPs of the products and the reactants, and numerically encodes the pattern of bonds that are broken, changed, and made during a chemical reaction. The automatic perception of chemical similarities between metabolic reactions is required for a variety of applications ranging from the computer validation of classification systems, genome-scale reconstruction (or comparison) of metabolic pathways, to the classification of enzymatic mechanisms. Catalytic functions of proteins are generally described by the EC numbers that are simultaneously employed as identifiers of reactions, enzymes, and enzyme genes, thus linking metabolic and genomic information. Different methods should be available to automatically compare metabolic reactions and for the automatic assignment of EC numbers to reactions still not officially classified. In this study, the genome-scale data set of enzymatic reactions available in the KEGG database was encoded by the MOLMAP descriptors, and was submitted to Kohonen SOMs to compare the resulting map with the official EC number classification, to explore the possibility of predicting EC numbers from the reaction equation, and to assess the internal consistency of the EC classification at the class level. A general agreement with the EC classification was observed, i.e. a relationship between the similarity of MOLMAPs and the similarity of EC numbers. At the same time, MOLMAPs were able to discriminate between EC sub-subclasses. EC numbers could be assigned at the class, subclass, and sub-subclass levels with accuracies up to 92%, 80%, and 70% for independent test sets. The correspondence between chemical similarity of metabolic reactions and their MOLMAP descriptors was applied to the identification of a number of reactions mapped into the same neuron but belonging to different EC classes, which demonstrated the ability of the MOLMAP/SOM approach to verify the internal consistency of classifications in databases of metabolic reactions. RFs were also used to assign the four levels of the EC hierarchy from the reaction equation. EC numbers were correctly assigned in 95%, 90%, 85% and 86% of the cases (for independent test sets) at the class, subclass, sub-subclass and full EC number level,respectively. Experiments for the classification of reactions from the main reactants and products were performed with RFs - EC numbers were assigned at the class, subclass and sub-subclass level with accuracies of 78%, 74% and 63%, respectively. In the course of the experiments with metabolic reactions we suggested that the MOLMAP / SOM concept could be extended to the representation of other levels of metabolic information such as metabolic pathways. Following the MOLMAP idea, the pattern of neurons activated by the reactions of a metabolic pathway is a representation of the reactions involved in that pathway - a descriptor of the metabolic pathway. This reasoning enabled the comparison of different pathways, the automatic classification of pathways, and a classification of organisms based on their biochemical machinery. The three levels of classification (from bonds to metabolic pathways) allowed to map and perceive chemical similarities between metabolic pathways even for pathways of different types of metabolism and pathways that do not share similarities in terms of EC numbers. Mapping of PES by neural networks (NNs). In a first series of experiments, ensembles of Feed-Forward NNs (EnsFFNNs) and Associative Neural Networks (ASNNs) were trained to reproduce PES represented by the Lennard-Jones (LJ) analytical potential function. The accuracy of the method was assessed by comparing the results of molecular dynamics simulations (thermal, structural, and dynamic properties) obtained from the NNs-PES and from the LJ function. The results indicated that for LJ-type potentials, NNs can be trained to generate accurate PES to be used in molecular simulations. EnsFFNNs and ASNNs gave better results than single FFNNs. A remarkable ability of the NNs models to interpolate between distant curves and accurately reproduce potentials to be used in molecular simulations is shown. The purpose of the first study was to systematically analyse the accuracy of different NNs. Our main motivation, however, is reflected in the next study: the mapping of multidimensional PES by NNs to simulate, by Molecular Dynamics or Monte Carlo, the adsorption and self-assembly of solvated organic molecules on noble-metal electrodes. Indeed, for such complex and heterogeneous systems the development of suitable analytical functions that fit quantum mechanical interaction energies is a non-trivial or even impossible task. The data consisted of energy values, from Density Functional Theory (DFT) calculations, at different distances, for several molecular orientations and three electrode adsorption sites. The results indicate that NNs require a data set large enough to cover well the diversity of possible interaction sites, distances, and orientations. NNs trained with such data sets can perform equally well or even better than analytical functions. Therefore, they can be used in molecular simulations, particularly for the ethanol/Au (111) interface which is the case studied in the present Thesis. Once properly trained, the networks are able to produce, as output, any required number of energy points for accurate interpolations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

EUROPEAN MASTER’S DEGREE IN HUMAN RIGHTS AND DEMOCRATISATION Academic Year 2007/2008

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Biochem. J. (2011) 438,485–494 doi:10.1042/BJ20110836

Relevância:

20.00% 20.00%

Publicador:

Resumo:

J Biol Inorg Chem (2011) 16:443–460 DOI 10.1007/s00775-010-0741-z

Relevância:

20.00% 20.00%

Publicador:

Resumo:

J Biol Inorg Chem (2006) 11: 548–558 DOI 10.1007/s00775-006-0104-y

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A Work Project, presented as part of the requirements for the Award of a Masters Degree in Management from the NOVA – School of Business and Economics

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertação para obtenção do Grau de Mestre em Engenharia Electrotécnica e de Computadores

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertação apresentada para obtenção do Grau de Mestre em Engenharia Electrotécnica e de Computadores, pela Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This article develops a latent class model for estimating willingness-to-pay for public goods using simultaneously contingent valuation (CV) and attitudinal data capturing protest attitudes related to the lack of trust in public institutions providing those goods. A measure of the social cost associated with protest responses and the consequent loss in potential contributions for providing the public good is proposed. The presence of potential justification biases is further considered, that is, the possibility that for psychological reasons the response to the CV question affects the answers to the attitudinal questions. The results from our empirical application suggest that psychological factors should not be ignored in CV estimation for policy purposes, allowing for a correct identification of protest responses.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertação apresentada na Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa para obtenção do Grau de Mestre em Engenharia Electrotécnica e de Computadores

Relevância:

20.00% 20.00%

Publicador:

Resumo:

RESUMO: A estrutura demográfica portuguesa é marcada por baixas taxas de natalidade e mortalidade, onde a população idosa representa uma fatia cada vez mais representativa, fruto de uma maior longevidade. A incidência do cancro, na sua generalidade, é maior precisamente nessa classe etária. A par de outras doenças igualmente lesivas (e.g. cardiovasculares, degenerativas) cuja incidência aumenta com a idade, o cancro merece relevo. Estudos epidemiológicos apresentam o cancro como líder mundial na mortalidade. Em países desenvolvidos, o seu peso representa 25% do número total de óbitos, percentagem essa que mais que duplica noutros países. A obesidade, a baixa ingestão de frutas e vegetais, o sedentarismo, o consumo de tabaco e a ingestão de álcool, configuram-se como cinco dos fatores de risco presentes em 30% das mortes diagnosticadas por cancro. A nível mundial e, em particular no Sul de Portugal, os cancros do estômago, recto e cólon apresentam elevadas taxas de incidência e de mortalidade. Do ponto de vista estritamente económico, o cancro é a doença que mais recursos consome enquanto que do ponto de vista físico e psicológico é uma doença que não limita o seu raio de ação ao doente. O cancro é, portanto, uma doença sempre atual e cada vez mais presente, pois reflete os hábitos e o ambiente de uma sociedade, não obstante as características intrínsecas a cada indivíduo. A adoção de metodologia estatística aplicada à modelação de dados oncológicos é, sobretudo, valiosa e pertinente quando a informação é oriunda de Registos de Cancro de Base Populacional (RCBP). A pertinência é justificada pelo fato destes registos permitirem aferir numa população específica, o risco desta sofrer e/ou vir a sofrer de uma dada neoplasia. O peso que as neoplasias do estômago, cólon e recto assumem foi um dos elementos que motivou o presente estudo que tem por objetivo analisar tendências, projeções, sobrevivências relativas e a distribuição espacial destas neoplasias. Foram considerados neste estudo todos os casos diagnosticados no período 1998-2006, pelo RCBP da região sul de Portugal (ROR-Sul). O estudo descritivo inicial das taxas de incidência e da tendência em cada uma das referidas neoplasias teve como base uma única variável temporal - o ano de diagnóstico - também designada por período. Todavia, uma metodologia que contemple apenas uma única variável temporal é limitativa. No cancro, para além do período, a idade à data do diagnóstico e a coorte de nascimento, são variáveis temporais que poderão prestar um contributo adicional na caracterização das taxas de incidência. A relevância assumida por estas variáveis temporais justificou a sua inclusão numaclasse de modelos designada por modelos Idade-Período-Coorte (Age-Period-Cohort models - APC), utilizada na modelação das taxas de incidência para as neoplasias em estudo. Os referidos modelos permitem ultrapassar o problema de relações não lineares e/ou de mudanças súbitas na tendência linear das taxas. Nos modelos APC foram consideradas a abordagem clássica e a abordagem com recurso a funções suavizadoras. A modelação das taxas foi estratificada por sexo. Foram ainda estudados os respectivos submodelos (apenas com uma ou duas variáveis temporais). Conhecido o comportamento das taxas de incidência, uma questão subsequente prende-se com a sua projeção em períodos futuros. Porém, o efeito de mudanças estruturais na população, ao qual Portugal não é alheio, altera substancialmente o número esperado de casos futuros com cancro. Estimativas da incidência de cancro a nível mundial obtidas a partir de projeções demográficas apontam para um aumento de 25% dos casos de cancro nas próximas duas décadas. Embora a projeção da incidência esteja associada a alguma incerteza, as projeções auxiliam no planeamento de políticas de saúde para a afetação de recursos e permitem a avaliação de cenários e de intervenções que tenham como objetivo a redução do impacto do cancro. O desconhecimento de projeções da taxa de incidência destas neoplasias na área abrangida pelo ROR-Sul, levou à utilização de modelos de projeção que diferem entre si quanto à sua estrutura, linearidade (ou não) dos seus coeficientes e comportamento das taxas na série histórica de dados (e.g. crescente, decrescente ou estável). Os referidos modelos pautaram-se por duas abordagens: (i)modelos lineares no que concerne ao tempo e (ii) extrapolação de efeitos temporais identificados pelos modelos APC para períodos futuros. Foi feita a projeção das taxas de incidência para os anos de 2007 a 2010 tendo em conta o género, idade e neoplasia. É ainda apresentada uma estimativa do impacto económico destas neoplasias no período de projeção. Uma questão pertinente e habitual no contexto clínico e a que o presente estudo pretende dar resposta, reside em saber qual a contribuição da neoplasia em si para a sobrevivência do doente. Nesse sentido, a mortalidade por causa específica é habitualmente utilizada para estimar a mortalidade atribuível apenas ao cancro em estudo. Porém, existem muitas situações em que a causa de morte é desconhecida e, mesmo que esta informação esteja disponível através dos certificados de óbito, não é fácil distinguir os casos em que a principal causa de morte é devida ao cancro. A sobrevivência relativa surge como uma medida objetiva que não necessita do conhecimento da causa específica da morte para o seu cálculo e dar-nos-á uma estimativa da probabilidade de sobrevivência caso o cancro em análise, num cenário hipotético, seja a única causa de morte. Desconhecida a principal causa de morte nos casos diagnosticados com cancro no registo ROR-Sul, foi determinada a sobrevivência relativa para cada uma das neoplasias em estudo, para um período de follow-up de 5 anos, tendo em conta o sexo, a idade e cada uma das regiões que constituem o registo. Foi adotada uma análise por período e as abordagens convencional e por modelos. No epílogo deste estudo, é analisada a influência da variabilidade espaço-temporal nas taxas de incidência. O longo período de latência das doenças oncológicas, a dificuldade em identificar mudanças súbitas no comportamento das taxas, populações com dimensão e riscos reduzidos, são alguns dos elementos que dificultam a análise da variação temporal das taxas. Nalguns casos, estas variações podem ser reflexo de flutuações aleatórias. O efeito da componente temporal aferida pelos modelos APC dá-nos um retrato incompleto da incidência do cancro. A etiologia desta doença, quando conhecida, está associada com alguma frequência a fatores de risco tais como condições socioeconómicas, hábitos alimentares e estilo de vida, atividade profissional, localização geográfica e componente genética. O “contributo”, dos fatores de risco é, por vezes, determinante e não deve ser ignorado. Surge, assim, a necessidade em complementar o estudo temporal das taxas com uma abordagem de cariz espacial. Assim, procurar-se-á aferir se as variações nas taxas de incidência observadas entre os concelhos inseridos na área do registo ROR-Sul poderiam ser explicadas quer pela variabilidade temporal e geográfica quer por fatores socioeconómicos ou, ainda, pelos desiguais estilos de vida. Foram utilizados os Modelos Bayesianos Hierárquicos Espaço-Temporais com o objetivo de identificar tendências espaço-temporais nas taxas de incidência bem como quantificar alguns fatores de risco ajustados à influência simultânea da região e do tempo. Os resultados obtidos pela implementação de todas estas metodologias considera-se ser uma mais valia para o conhecimento destas neoplasias em Portugal.------------ABSTRACT: mortality rates, with the elderly being an increasingly representative sector of the population, mainly due to greater longevity. The incidence of cancer, in general, is greater precisely in that age group. Alongside with other equally damaging diseases (e.g. cardiovascular,degenerative), whose incidence rates increases with age, cancer is of special note. In epidemiological studies, cancer is the global leader in mortality. In developed countries its weight represents 25% of the total number of deaths, with this percentage being doubled in other countries. Obesity, a reduce consumption of fruit and vegetables, physical inactivity, smoking and alcohol consumption, are the five risk factors present in 30% of deaths due to cancer. Globally, and in particular in the South of Portugal, the stomach, rectum and colon cancer have high incidence and mortality rates. From a strictly economic perspective, cancer is the disease that consumes more resources, while from a physical and psychological point of view, it is a disease that is not limited to the patient. Cancer is therefore na up to date disease and one of increased importance, since it reflects the habits and the environment of a society, regardless the intrinsic characteristics of each individual. The adoption of statistical methodology applied to cancer data modelling is especially valuable and relevant when the information comes from population-based cancer registries (PBCR). In such cases, these registries allow for the assessment of the risk and the suffering associated to a given neoplasm in a specific population. The weight that stomach, colon and rectum cancers assume in Portugal was one of the motivations of the present study, that focus on analyzing trends, projections, relative survival and spatial distribution of these neoplasms. The data considered in this study, are all cases diagnosed between 1998 and 2006, by the PBCR of Portugal, ROR-Sul.Only year of diagnosis, also called period, was the only time variable considered in the initial descriptive analysis of the incidence rates and trends for each of the three neoplasms considered. However, a methodology that only considers one single time variable will probably fall short on the conclusions that could be drawn from the data under study. In cancer, apart from the variable period, the age at diagnosis and the birth cohort are also temporal variables and may provide an additional contribution to the characterization of the incidence. The relevance assumed by these temporal variables justified its inclusion in a class of models called Age-Period-Cohort models (APC). This class of models was used for the analysis of the incidence rates of the three cancers under study. APC models allow to model nonlinearity and/or sudden changes in linear relationships of rate trends. Two approaches of APC models were considered: the classical and the one using smoothing functions. The models were stratified by gender and, when justified, further studies explored other sub-models where only one or two temporal variables were considered. After the analysis of the incidence rates, a subsequent goal is related to their projections in future periods. Although the effect of structural changes in the population, of which Portugal is not oblivious, may substantially change the expected number of future cancer cases, the results of these projections could help planning health policies with the proper allocation of resources, allowing for the evaluation of scenarios and interventions that aim to reduce the impact of cancer in a population. Worth noting that cancer incidence worldwide obtained from demographic projections point out to an increase of 25% of cancer cases in the next two decades. The lack of projections of incidence rates of the three cancers under study in the area covered by ROR-Sul, led us to use a variety of forecasting models that differ in the nature and structure. For example, linearity or nonlinearity in their coefficients and the trend of the incidence rates in historical data series (e.g. increasing, decreasing or stable).The models followed two approaches: (i) linear models regarding time and (ii) extrapolation of temporal effects identified by the APC models for future periods. The study provide incidence rates projections and the numbers of newly diagnosed cases for the year, 2007 to 2010, taking into account gender, age and the type of cancer. In addition, an estimate of the economic impact of these neoplasms is presented for the projection period considered. This research also try to address a relevant and common clinical question in these type of studies, regarding the contribution of the type of cancer to the patient survival. In such studies, the primary cause of death is commonly used to estimate the mortality specifically due to the cancer. However, there are many situations in which the cause of death is unknown, or, even if this information is available through the death certificates, it is not easy to distinguish the cases where the primary cause of death is the cancer. With this in mind, the relative survival is an alternative measure that does not need the knowledge of the specific cause of death to be calculated. This estimate will represent the survival probability in the hypothetical scenario of a certain cancer be the only cause of death. For the patients with unknown cause of death that were diagnosed with cancer in the ROR-Sul, the relative survival was calculated for each of the cancers under study, for a follow-up period of 5 years, considering gender, age and each one of the regions that are part the registry. A period analysis was undertaken, considering both the conventional and the model approaches. In final part of this study, we analyzed the influence of space-time variability in the incidence rates. The long latency period of oncologic diseases, the difficulty in identifying subtle changes in the rates behavior, populations of reduced size and low risk are some of the elements that can be a challenge in the analysis of temporal variations in rates, that, in some cases, can reflect simple random fluctuations. The effect of the temporal component measured by the APC models gives an incomplete picture of the cancer incidence. The etiology of this disease, when known, is frequently associated to risk factors such as socioeconomic conditions, eating habits and lifestyle, occupation, geographic location and genetic component. The "contribution"of such risk factors is sometimes decisive in the evolution of the disease and should not be ignored. Therefore, there was the need to consider an additional approach in this study, one of spatial nature, addressing the fact that changes in incidence rates observed in the ROR-Sul area, could be explained either by temporal and geographical variability or by unequal socio-economic or lifestyle factors. Thus, Bayesian hierarchical space-time models were used with the purpose of identifying space-time trends in incidence rates together with the the analysis of the effect of the risk factors considered in the study. The results obtained and the implementation of all these methodologies are considered to be an added value to the knowledge of these neoplasms in Portugal.