936 results for textual similarity


Relevance:

10.00% 10.00%

Publisher:

Abstract:

This thesis describes the application of automatic learning methods for a) the classification of organic and metabolic reactions, and b) the mapping of Potential Energy Surfaces (PES). The classification of reactions was approached with two distinct methodologies: a representation of chemical reactions based on NMR data, and a representation of chemical reactions from the reaction equation based on the physico-chemical and topological features of chemical bonds. NMR-based classification of photochemical and enzymatic reactions. Photochemical and metabolic reactions were classified by Kohonen Self-Organizing Maps (Kohonen SOMs) and Random Forests (RFs), taking as input the difference between the 1H NMR spectra of the products and the reactants. Such a representation can be applied to the automatic analysis of changes in the 1H NMR spectrum of a mixture and their interpretation in terms of the chemical reactions taking place. Examples of possible applications are the monitoring of reaction processes, evaluation of the stability of chemicals, or even the interpretation of metabonomic data. A Kohonen SOM trained with a data set of metabolic reactions catalysed by transferases was able to correctly classify 75% of an independent test set in terms of the EC number subclass. Random Forests improved the correct predictions to 79%. With photochemical reactions classified into 7 groups, an independent test set was classified with 86-93% accuracy. The data set of photochemical reactions was also used to simulate mixtures with two reactions occurring simultaneously. Kohonen SOMs and Feed-Forward Neural Networks (FFNNs) were trained to classify the reactions occurring in a mixture based on the 1H NMR spectra of the products and reactants. Kohonen SOMs allowed the correct assignment of 53-63% of the mixtures (in a test set). Counter-Propagation Neural Networks (CPNNs) gave similar results.
The use of supervised learning techniques improved the results: correct assignments rose to 77% when an ensemble of ten FFNNs was used and to 80% when Random Forests were used. This study was performed with NMR data simulated from the molecular structure by the SPINUS program. In the design of one test set, simulated data was combined with experimental data. The results support the proposal of linking databases of chemical reactions to experimental or simulated NMR data for automatic classification of reactions and mixtures of reactions. Genome-scale classification of enzymatic reactions from their reaction equation. The MOLMAP descriptor relies on a Kohonen SOM that defines types of bonds on the basis of their physico-chemical and topological properties. The MOLMAP descriptor of a molecule represents the types of bonds available in that molecule. The MOLMAP descriptor of a reaction is defined as the difference between the MOLMAPs of the products and the reactants, and numerically encodes the pattern of bonds that are broken, changed, and made during a chemical reaction. The automatic perception of chemical similarities between metabolic reactions is required for a variety of applications, ranging from the computer validation of classification systems and the genome-scale reconstruction (or comparison) of metabolic pathways to the classification of enzymatic mechanisms. Catalytic functions of proteins are generally described by the EC numbers, which are simultaneously employed as identifiers of reactions, enzymes, and enzyme genes, thus linking metabolic and genomic information. Different methods should be available to automatically compare metabolic reactions and to automatically assign EC numbers to reactions not yet officially classified.
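The reaction descriptor defined above (products minus reactants) can be illustrated with a minimal sketch. It assumes, as a simplification not stated in the abstract, that each molecular MOLMAP is a flat list counting how many bonds fall on each SOM neuron, and that the maps of several reactants or products are summed before taking the difference. The function name `reaction_descriptor` and the toy values are hypothetical.

```python
def reaction_descriptor(reactant_maps, product_maps):
    """Return sum(product maps) - sum(reactant maps), element by element.

    Each map is a flat list with one entry per SOM neuron (bond type).
    Summing the maps on each side of the equation is an assumption
    made for this sketch.
    """
    size = len(reactant_maps[0])
    for m in list(reactant_maps) + list(product_maps):
        if len(m) != size:
            raise ValueError("all maps must have the same length")
    reactants = [sum(m[i] for m in reactant_maps) for i in range(size)]
    products = [sum(m[i] for m in product_maps) for i in range(size)]
    return [p - r for p, r in zip(products, reactants)]


# Toy example with 4 hypothetical bond types (neurons).
reactant_maps = [[2, 1, 0, 0], [0, 0, 1, 0]]
product_maps = [[1, 1, 0, 1], [0, 0, 1, 0]]
descriptor = reaction_descriptor(reactant_maps, product_maps)
print(descriptor)
```

In this toy case a negative entry marks a bond type lost in the reaction, a positive entry a bond type gained, and zero entries mark bond types left unchanged.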
In this study, the genome-scale data set of enzymatic reactions available in the KEGG database was encoded by the MOLMAP descriptors, and was submitted to Kohonen SOMs to compare the resulting map with the official EC number classification, to explore the possibility of predicting EC numbers from the reaction equation, and to assess the internal consistency of the EC classification at the class level. A general agreement with the EC classification was observed, i.e. a relationship between the similarity of MOLMAPs and the similarity of EC numbers. At the same time, MOLMAPs were able to discriminate between EC sub-subclasses. EC numbers could be assigned at the class, subclass, and sub-subclass levels with accuracies up to 92%, 80%, and 70% for independent test sets. The correspondence between chemical similarity of metabolic reactions and their MOLMAP descriptors was applied to the identification of a number of reactions mapped into the same neuron but belonging to different EC classes, which demonstrated the ability of the MOLMAP/SOM approach to verify the internal consistency of classifications in databases of metabolic reactions. RFs were also used to assign the four levels of the EC hierarchy from the reaction equation. EC numbers were correctly assigned in 95%, 90%, 85% and 86% of the cases (for independent test sets) at the class, subclass, sub-subclass and full EC number level, respectively. Experiments for the classification of reactions from the main reactants and products were performed with RFs: EC numbers were assigned at the class, subclass and sub-subclass level with accuracies of 78%, 74% and 63%, respectively. In the course of the experiments with metabolic reactions we suggested that the MOLMAP/SOM concept could be extended to the representation of other levels of metabolic information such as metabolic pathways.
Following the MOLMAP idea, the pattern of neurons activated by the reactions of a metabolic pathway is a representation of the reactions involved in that pathway, i.e. a descriptor of the metabolic pathway. This reasoning enabled the comparison of different pathways, the automatic classification of pathways, and a classification of organisms based on their biochemical machinery. The three levels of classification (from bonds to metabolic pathways) made it possible to map and perceive chemical similarities between metabolic pathways, even for pathways of different types of metabolism and pathways that do not share similarities in terms of EC numbers. Mapping of PES by neural networks (NNs). In a first series of experiments, ensembles of Feed-Forward NNs (EnsFFNNs) and Associative Neural Networks (ASNNs) were trained to reproduce PES represented by the Lennard-Jones (LJ) analytical potential function. The accuracy of the method was assessed by comparing the results of molecular dynamics simulations (thermal, structural, and dynamic properties) obtained from the NNs-PES and from the LJ function. The results indicated that for LJ-type potentials, NNs can be trained to generate accurate PES to be used in molecular simulations. EnsFFNNs and ASNNs gave better results than single FFNNs. The NN models showed a remarkable ability to interpolate between distant curves and to accurately reproduce potentials to be used in molecular simulations. The purpose of the first study was to systematically analyse the accuracy of different NNs. Our main motivation, however, is reflected in the next study: the mapping of multidimensional PES by NNs to simulate, by Molecular Dynamics or Monte Carlo, the adsorption and self-assembly of solvated organic molecules on noble-metal electrodes. Indeed, for such complex and heterogeneous systems the development of suitable analytical functions that fit quantum mechanical interaction energies is a non-trivial or even impossible task.
The data consisted of energy values, from Density Functional Theory (DFT) calculations, at different distances, for several molecular orientations and three electrode adsorption sites. The results indicate that NNs require a data set large enough to cover well the diversity of possible interaction sites, distances, and orientations. NNs trained with such data sets can perform as well as or even better than analytical functions. Therefore, they can be used in molecular simulations, particularly for the ethanol/Au(111) interface, which is the case studied in this thesis. Once properly trained, the networks are able to produce, as output, any required number of energy points for accurate interpolations.
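For reference, the Lennard-Jones analytical potential mentioned in this abstract has the standard 12-6 form V(r) = 4ε[(σ/r)^12 − (σ/r)^6]. The sketch below evaluates it and samples (distance, energy) pairs of the kind a network could be trained to reproduce. The sampling helper `sample_curve`, the reduced units (ε = σ = 1) and the distance range are assumptions for illustration; the abstract does not describe how the training data were generated.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones pair potential at distance r."""
    if r <= 0:
        raise ValueError("distance must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def sample_curve(r_min, r_max, n_points, epsilon=1.0, sigma=1.0):
    """Return n_points evenly spaced (r, V(r)) pairs (hypothetical helper)."""
    if n_points < 2:
        raise ValueError("need at least two points")
    step = (r_max - r_min) / (n_points - 1)
    distances = [r_min + i * step for i in range(n_points)]
    return [(r, lennard_jones(r, epsilon, sigma)) for r in distances]


# Sample one curve in reduced units.
training_points = sample_curve(0.9, 3.0, 8)
for r, energy in training_points:
    print(f"r = {r:.3f}  V = {energy:+.4f}")
```

The potential is zero at r = σ and reaches its minimum, −ε, at r = 2^(1/6) σ, which gives two quick sanity checks for any model fitted to such points.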

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A study was made of the physical appearance of pre-photographic, photomechanical, photographic and digital positive reflective prints, relating the images obtained to the history, materials and technology used to create them. The samples studied are from the Image Permanence Institute (IPI) study collection. The digital images were obtained using a digital SLR on a copystand and a compound light microscope, with different lighting angles (0°, 45° and 90°) and magnifications ranging from overall views on the copystand down to a 20x objective lens on the microscope. Most of these images were originally created by IPI for www.digitalsamplebook.org, a web tool for teaching print identification, and will be used on the www.graphicsatlas.org website, along with textual information on the identification, technology and history of these reproduction processes.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Dissertation presented at the Faculty of Science and Technology of the New University of Lisbon in fulfillment of the requirements for the Master's degree in Electrical and Computer Engineering

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Dissertation presented in partial fulfillment of the requirements for the degree of Master in Biotechnology

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Objective: To analyze the relations between the meanings of working and the levels of doctors' well-being at work in the context of their working conditions. Method: The research combined the qualitative methodology of textual analysis with the quantitative methodology of correspondence factor analysis. A convenience, intentional, and stratified sample of 305 Spanish and Latin American doctors completed an extensive questionnaire on the topics of the research. Results: The general meaning of working for the group located in the quartile of malaise included perceptions of discomfort, frustration, and exhaustion. However, those showing higher levels of well-being, located in the opposite quartile, associated their working experience with good conditions and the development of their professional and personal competences. Conclusions: The study provides empirical evidence of the relationship between contextual factors and the meanings of working for participants with higher levels of malaise, and of the importance granted to both intrinsic and extrinsic factors by those who scored highest on well-being.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A genetic algorithm used to design radio-frequency binary-weighted differential switched capacitor arrays (RFDSCAs) is presented in this article. The algorithm provides a set of circuits all having the same maximum performance. This article also describes the design, implementation, and measurement results of a 0.25 μm BiCMOS 3-bit RFDSCA. The experimental results show that the circuit presents the expected performance up to 40 GHz. The similarity between the evolutionary solutions, circuit simulations, and measured results indicates that the genetic synthesis method is a very useful tool for designing optimum performance RFDSCAs.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Feature selection is a central problem in machine learning and pattern recognition. On large datasets (in terms of dimension and/or number of instances), using search-based or wrapper techniques can be computationally prohibitive. Moreover, many filter methods based on relevance/redundancy assessment also take a prohibitively long time on high-dimensional datasets. In this paper, we propose efficient unsupervised and supervised feature selection/ranking filters for high-dimensional datasets. These methods use low-complexity relevance and redundancy criteria, applicable to supervised, semi-supervised, and unsupervised learning, and can act as pre-processors that let computationally intensive methods focus on smaller subsets of promising features. The experimental results, with up to 10^5 features, show the time efficiency of our methods, with lower generalization error than state-of-the-art techniques, while being dramatically simpler and faster.
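To make the relevance/redundancy idea concrete, here is a minimal unsupervised filter sketch. It is not the authors' method: the abstract does not name its criteria, so this example assumes variance as the relevance score and absolute Pearson correlation with already selected features as the redundancy measure. The names `select_features` and `max_redundancy`, and the toy data, are hypothetical.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def select_features(columns, k, max_redundancy=0.9):
    """Greedy filter: rank by variance, skip features too correlated
    with any feature already kept. Returns column indices."""
    relevance = [pvariance(col) for col in columns]
    # Constant features carry no information, so drop them up front.
    candidates = [j for j in range(len(columns)) if relevance[j] > 0]
    ranked = sorted(candidates, key=lambda j: relevance[j], reverse=True)
    selected = []
    for j in ranked:
        if len(selected) == k:
            break
        if all(abs(pearson(columns[j], columns[s])) < max_redundancy
               for s in selected):
            selected.append(j)
    return selected


# Toy data: one list per feature (column-wise layout).
columns = [
    [1, 2, 3, 4, 5],   # feature 0: scaled copy of feature 1 (redundant)
    [2, 4, 6, 8, 10],  # feature 1: highest variance
    [7, 1, 4, 2, 1],   # feature 2: weakly related to the others
    [1, 1, 1, 1, 1],   # feature 3: constant (irrelevant)
]
print(select_features(columns, k=3))
```

Ranking costs one pass over each feature, and each redundancy check touches only the features already kept, which is the kind of low-cost structure that lets a filter run ahead of a more expensive wrapper method.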

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Master's in Informatics Engineering - Specialization Area in Graphics and Multimedia Systems

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Master's in Electrical and Computer Engineering - Specialization Area in Telecommunications

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Project work presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Internship report presented to the Escola Superior de Educação de Lisboa to obtain the degree of Master in Teaching of the 1st and 2nd Cycle

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Final report presented to the Escola Superior de Educação de Lisboa to obtain the degree of Master in Teaching of the 1st and 2nd Cycle of Basic Education

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Extracting the semantic relatedness of terms is an important topic in several areas, including data mining, information retrieval and web recommendation. This paper presents an approach for computing the semantic relatedness of terms using the knowledge base of DBpedia, a community effort to extract structured information from Wikipedia. Several approaches to extracting semantic relatedness from Wikipedia using bag-of-words vector models are already available in the literature. The research presented in this paper explores a novel approach using paths on an ontological graph extracted from DBpedia. It is based on an algorithm for finding and weighting a collection of paths connecting concept nodes. This algorithm was implemented in a tool called Shakti, which extracts relevant ontological data for a given domain from DBpedia using its SPARQL endpoint. To validate the proposed approach, Shakti was used to recommend web pages on a Portuguese social site related to alternative music, and the results of that experiment are reported in this paper.
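The idea of finding and weighting paths between concept nodes can be sketched as follows. This is not Shakti's algorithm: the abstract does not give the weighting scheme, so the example assumes that each edge carries a weight between 0 and 1, that a path's weight is the product of its edge weights, and that relatedness is the sum over all simple paths up to a maximum length. The graph, node names and the function `relatedness` are hypothetical, and the graph is built by hand rather than queried from DBpedia.

```python
def relatedness(graph, source, target, max_edges=3):
    """Sum the weights of all simple paths from source to target that
    use at most max_edges edges. A path's weight is the product of its
    edge weights (an assumption made for this sketch)."""
    total = 0.0
    # Each stack entry: (current node, weight so far, nodes on the path).
    stack = [(source, 1.0, frozenset([source]))]
    while stack:
        node, weight, visited = stack.pop()
        if len(visited) - 1 >= max_edges:
            continue  # path already uses the maximum number of edges
        for neighbour, edge_weight in graph.get(node, []):
            if neighbour in visited:
                continue  # keep paths simple (no repeated nodes)
            path_weight = weight * edge_weight
            if neighbour == target:
                total += path_weight
            else:
                stack.append(
                    (neighbour, path_weight, visited | {neighbour}))
    return total


# Hand-built toy graph: node -> list of (neighbour, edge weight).
graph = {
    "band_a": [("genre_x", 0.5), ("city_y", 0.25)],
    "genre_x": [("band_a", 0.5), ("band_b", 0.5)],
    "city_y": [("band_a", 0.25), ("band_b", 0.5)],
    "band_b": [("genre_x", 0.5), ("city_y", 0.5)],
}
score = relatedness(graph, "band_a", "band_b")
print(score)
```

In this toy graph the two bands are connected through a shared genre and a shared city, and both paths contribute to the score; lowering `max_edges` to 1 leaves no connecting path.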

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Hepatobiliary alterations found in an autopsy case of massive biliary ascariasis are reported on histological grounds. Severe cholangitis was the main finding, but other changes were also detected, such as pyloric and intestinal metaplasia and hyperplasia of the epithelial lining, with intraductal papillomas and adenomatous proliferation. Remnants of the worm were observed tightly adhered to the epithelium, forming microscopic intrahepatic calculi. Mucopolysaccharides, especially acid ones, were strongly positive on the luminal border and in proliferated glands around the ducts. The authors discuss the similarity between these findings and Oriental cholangiohepatitis, and suggest that inflammation and the presence of the parasitic remnants are responsible for the hyperplastic and metaplastic changes, similar to what occurs in clonorchiasis, fascioliasis and schistosomiasis.