990 results for Automatic identification
Abstract:
This paper describes a method to automatically obtain, from a set of impedance measurements at different frequencies, an equivalent circuit composed of lumped elements based on the vector fitting algorithm. The method starts from the impedance measurement of the circuit and then, through the recursive use of vector fitting, identifies the circuit topology and the component values of the lumped elements. The method can be expanded to include other components commonly used in impedance spectroscopy. The method is first described, and then two examples highlight its robustness and showcase its applicability.
Abstract:
Signal subspace identification is a crucial first step in many hyperspectral processing algorithms such as target detection, change detection, classification, and unmixing. The identification of this subspace enables a correct dimensionality reduction, yielding gains in algorithm performance and complexity and in data storage. This paper introduces a new minimum mean square error-based approach to infer the signal subspace in hyperspectral imagery. The method, which is termed hyperspectral signal identification by minimum error, is eigendecomposition-based, unsupervised, and fully automatic (i.e., it does not depend on any tuning parameters). It first estimates the signal and noise correlation matrices and then selects the subset of eigenvalues that best represents the signal subspace in the least squared error sense. State-of-the-art performance of the proposed method is illustrated by using simulated and real hyperspectral images.
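The selection step described in this abstract can be illustrated with a simplified sketch. It is not the authors' implementation: the noise correlation matrix is assumed to be known (the paper estimates it from the data), and the function name `select_signal_subspace` and the rule "keep a direction when its projected power exceeds twice the projected noise power" are illustrative assumptions about how a minimum-error criterion can be evaluated per eigenvector. NumPy is assumed to be available.

```python
import numpy as np


def select_signal_subspace(Y, noise_cov):
    """Keep the eigenvectors that lower an estimated mean squared error.

    Y         : (bands, pixels) observed spectra, one pixel per column.
    noise_cov : (bands, bands) noise correlation matrix (assumed known here).
    """
    n_pixels = Y.shape[1]
    Ry = Y @ Y.T / n_pixels              # correlation matrix of the observations
    Rx = Ry - noise_cov                  # estimate of the signal correlation matrix
    _, E = np.linalg.eigh(Rx)            # columns of E are eigenvectors of Rx
    # Observed power and noise power along each eigenvector.
    p = np.sum(E * (Ry @ E), axis=0)
    s = np.sum(E * (noise_cov @ E), axis=0)
    delta = -p + 2.0 * s                 # estimated change in error if the direction is kept
    keep = delta < 0                     # keep only directions that reduce the error
    return E[:, keep], int(keep.sum())


# Synthetic example (hypothetical data): a rank-3 signal in 20 bands plus white noise.
rng = np.random.default_rng(0)
bands, pixels, rank, sigma = 20, 5000, 3, 0.1
basis, _ = np.linalg.qr(rng.normal(size=(bands, rank)))
coeffs = rng.normal(size=(rank, pixels)) * np.array([[3.0], [2.0], [1.0]])
Y = basis @ coeffs + sigma * rng.normal(size=(bands, pixels))
subspace, k = select_signal_subspace(Y, sigma**2 * np.eye(bands))
print("estimated subspace dimension:", k)
```

No threshold is tuned by hand in this sketch: the decision for each direction follows from the two correlation matrices alone, which is the sense in which the abstract calls the method fully automatic.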
Abstract:
This Thesis describes the application of automatic learning methods for a) the classification of organic and metabolic reactions, and b) the mapping of Potential Energy Surfaces (PES). The classification of reactions was approached with two distinct methodologies: a representation of chemical reactions based on NMR data, and a representation of chemical reactions from the reaction equation based on the physico-chemical and topological features of chemical bonds. NMR-based classification of photochemical and enzymatic reactions. Photochemical and metabolic reactions were classified by Kohonen Self-Organizing Maps (Kohonen SOMs) and Random Forests (RFs) taking as input the difference between the 1H NMR spectra of the products and the reactants. Such a representation can be applied in the automatic analysis of changes in the 1H NMR spectrum of a mixture and their interpretation in terms of the chemical reactions taking place. Examples of possible applications are the monitoring of reaction processes, evaluation of the stability of chemicals, or even the interpretation of metabonomic data. A Kohonen SOM trained with a data set of metabolic reactions catalysed by transferases was able to correctly classify 75% of an independent test set in terms of the EC number subclass. Random Forests improved the correct predictions to 79%. With photochemical reactions classified into 7 groups, an independent test set was classified with 86-93% accuracy. The data set of photochemical reactions was also used to simulate mixtures with two reactions occurring simultaneously. Kohonen SOMs and Feed-Forward Neural Networks (FFNNs) were trained to classify the reactions occurring in a mixture based on the 1H NMR spectra of the products and reactants. Kohonen SOMs allowed the correct assignment of 53-63% of the mixtures (in a test set). Counter-Propagation Neural Networks (CPNNs) gave similar results.
The use of supervised learning techniques allowed an improvement in the results. They were improved to 77% of correct assignments when an ensemble of ten FFNNs was used and to 80% when Random Forests were used. This study was performed with NMR data simulated from the molecular structure by the SPINUS program. In the design of one test set, simulated data was combined with experimental data. The results support the proposal of linking databases of chemical reactions to experimental or simulated NMR data for automatic classification of reactions and mixtures of reactions. Genome-scale classification of enzymatic reactions from their reaction equation. The MOLMAP descriptor relies on a Kohonen SOM that defines types of bonds on the basis of their physico-chemical and topological properties. The MOLMAP descriptor of a molecule represents the types of bonds available in that molecule. The MOLMAP descriptor of a reaction is defined as the difference between the MOLMAPs of the products and the reactants, and numerically encodes the pattern of bonds that are broken, changed, and made during a chemical reaction. The automatic perception of chemical similarities between metabolic reactions is required for a variety of applications ranging from the computer validation of classification systems and the genome-scale reconstruction (or comparison) of metabolic pathways to the classification of enzymatic mechanisms. Catalytic functions of proteins are generally described by the EC numbers that are simultaneously employed as identifiers of reactions, enzymes, and enzyme genes, thus linking metabolic and genomic information. Different methods should be available to automatically compare metabolic reactions and to automatically assign EC numbers to reactions not yet officially classified.
In this study, the genome-scale data set of enzymatic reactions available in the KEGG database was encoded by the MOLMAP descriptors, and was submitted to Kohonen SOMs to compare the resulting map with the official EC number classification, to explore the possibility of predicting EC numbers from the reaction equation, and to assess the internal consistency of the EC classification at the class level. A general agreement with the EC classification was observed, i.e. a relationship between the similarity of MOLMAPs and the similarity of EC numbers. At the same time, MOLMAPs were able to discriminate between EC sub-subclasses. EC numbers could be assigned at the class, subclass, and sub-subclass levels with accuracies up to 92%, 80%, and 70% for independent test sets. The correspondence between chemical similarity of metabolic reactions and their MOLMAP descriptors was applied to the identification of a number of reactions mapped into the same neuron but belonging to different EC classes, which demonstrated the ability of the MOLMAP/SOM approach to verify the internal consistency of classifications in databases of metabolic reactions. RFs were also used to assign the four levels of the EC hierarchy from the reaction equation. EC numbers were correctly assigned in 95%, 90%, 85% and 86% of the cases (for independent test sets) at the class, subclass, sub-subclass and full EC number level, respectively. Experiments for the classification of reactions from the main reactants and products were performed with RFs; EC numbers were assigned at the class, subclass and sub-subclass level with accuracies of 78%, 74% and 63%, respectively. In the course of the experiments with metabolic reactions we suggested that the MOLMAP/SOM concept could be extended to the representation of other levels of metabolic information such as metabolic pathways.
Following the MOLMAP idea, the pattern of neurons activated by the reactions of a metabolic pathway is a representation of the reactions involved in that pathway, i.e. a descriptor of the metabolic pathway. This reasoning enabled the comparison of different pathways, the automatic classification of pathways, and a classification of organisms based on their biochemical machinery. The three levels of classification (from bonds to metabolic pathways) made it possible to map and perceive chemical similarities between metabolic pathways, even for pathways of different types of metabolism and pathways that do not share similarities in terms of EC numbers. Mapping of PES by neural networks (NNs). In a first series of experiments, ensembles of Feed-Forward NNs (EnsFFNNs) and Associative Neural Networks (ASNNs) were trained to reproduce PES represented by the Lennard-Jones (LJ) analytical potential function. The accuracy of the method was assessed by comparing the results of molecular dynamics simulations (thermal, structural, and dynamic properties) obtained from the NNs-PES and from the LJ function. The results indicated that for LJ-type potentials, NNs can be trained to generate accurate PES to be used in molecular simulations. EnsFFNNs and ASNNs gave better results than single FFNNs. The NN models show a remarkable ability to interpolate between distant curves and accurately reproduce potentials to be used in molecular simulations. The purpose of the first study was to systematically analyse the accuracy of different NNs. Our main motivation, however, is reflected in the next study: the mapping of multidimensional PES by NNs to simulate, by Molecular Dynamics or Monte Carlo, the adsorption and self-assembly of solvated organic molecules on noble-metal electrodes. Indeed, for such complex and heterogeneous systems the development of suitable analytical functions that fit quantum mechanical interaction energies is a non-trivial or even impossible task.
The data consisted of energy values, from Density Functional Theory (DFT) calculations, at different distances, for several molecular orientations and three electrode adsorption sites. The results indicate that NNs require a data set large enough to cover the diversity of possible interaction sites, distances, and orientations well. NNs trained with such data sets can perform as well as or even better than analytical functions. Therefore, they can be used in molecular simulations, particularly for the ethanol/Au(111) interface, which is the case studied in the present Thesis. Once properly trained, the networks are able to produce, as output, any required number of energy points for accurate interpolations.
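The reaction descriptor defined in this thesis abstract (MOLMAPs of the products minus MOLMAPs of the reactants) can be sketched as follows. This is a minimal illustration, not the thesis code: it assumes each bond has already been assigned to a neuron of a trained Kohonen SOM, it counts hits per neuron without any neighbourhood weighting, and the function names and neuron indices are hypothetical.

```python
from collections import Counter


def molmap(bond_neurons, map_size):
    """Molecule descriptor: number of bonds of the molecule mapped to each neuron."""
    counts = Counter(bond_neurons)
    return [counts.get(i, 0) for i in range(map_size)]


def reaction_molmap(reactants, products, map_size):
    """Reaction descriptor: summed product MOLMAPs minus summed reactant MOLMAPs."""
    def total(molecules):
        vec = [0] * map_size
        for bonds in molecules:
            for i, count in enumerate(molmap(bonds, map_size)):
                vec[i] += count
        return vec

    return [p - r for p, r in zip(total(products), total(reactants))]


# Hypothetical example on a 4-neuron map: each inner list holds the neuron
# index of every bond in one molecule.
reactants = [[0, 0, 1], [2]]
products = [[0, 1, 3], [2]]
print(reaction_molmap(reactants, products, map_size=4))  # [-1, 0, 0, 1]
```

Negative entries mark bond types that disappear in the reaction and positive entries mark bond types that are formed, so unchanged bonds cancel out and only the reaction centre contributes to the descriptor.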
Abstract:
Dissertation presented at the Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa to obtain the degree of Master in Computer Engineering (Engenharia Informática)
Abstract:
Hyperspectral imaging sensors provide image data containing both spectral and spatial information from the Earth's surface. The huge data volumes produced by these sensors put stringent requirements on communications, storage, and processing. This paper presents a method, termed hyperspectral signal subspace identification by minimum error (HySime), that infers the signal subspace and determines its dimensionality without any prior knowledge. The identification of this subspace enables a correct dimensionality reduction, yielding gains in algorithm performance and complexity and in data storage. The HySime method is unsupervised and fully automatic, i.e., it does not depend on any tuning parameters. The effectiveness of the proposed method is illustrated using simulated data based on U.S.G.S. laboratory spectra and real hyperspectral data collected by the AVIRIS sensor over Cuprite, Nevada.
Abstract:
Electric utilities suffer large annual revenue losses due to commercial losses, which are caused mainly by consumer fraud and faulty meters. Automatic detection of such losses is a complex problem, given the large number of consumers and the high cost of each inspection, not to mention the strain on the relationship between company and consumer. Given the above, this paper aims to briefly present some methodologies applied by utilities to identify consumer fraud.
Abstract:
Enterococci are increasingly responsible for nosocomial infections worldwide. This study was undertaken to compare the identification and susceptibility profiles of Enterococcus spp. obtained using an automated MicroScan system, a PCR-based assay and a disk diffusion assay. We evaluated 30 clinical isolates of Enterococcus spp. Isolates were identified by the MicroScan system and the PCR-based assay. The presence of antibiotic resistance genes (vancomycin, gentamicin, tetracycline and erythromycin) was also determined by PCR. Antimicrobial susceptibilities to vancomycin (30 µg), gentamicin (120 µg), tetracycline (30 µg) and erythromycin (15 µg) were tested by the automated system and the disk diffusion method, and were interpreted according to the criteria recommended in CLSI guidelines. Concerning Enterococcus identification, the general agreement between data obtained by the PCR method and by the automatic system was 90.0% (27/30). For all isolates of E. faecium and E. faecalis we observed 100% agreement. Resistance frequencies were higher in E. faecium than in E. faecalis. Resistance rates were highest for erythromycin (86.7%), followed by vancomycin (80.0%), tetracycline (43.3%) and gentamicin (33.3%). The correlation between disk diffusion and automation revealed agreement for the majority of the antibiotics, with category agreement rates of > 80%. In the PCR-based assay, the van(A) gene was detected in 100% of vancomycin-resistant enterococci. This assay is simple to conduct and reliable in the identification of clinically relevant enterococci. The data obtained reinforce the need for an improvement of the automated system to identify some enterococci.
Abstract:
Nowadays, reducing energy consumption is one of the highest priorities and biggest challenges faced worldwide, and in particular in the industrial sector. Given the increasing trend of consumption and the current economic crisis, identifying cost reductions in the most energy-intensive sectors has become one of the main concerns among companies and researchers. Particularly in industrial environments, energy consumption is affected by several factors, namely production (e.g. equipment), human (e.g. operator experience) and environmental (e.g. temperature) factors, among others, which influence how energy is used across the plant. Therefore, several approaches for identifying consumption causes have been suggested and discussed. However, the existing methods only provide guidelines for energy consumption and have shown difficulties in explaining certain energy consumption patterns because they lack a structure to incorporate the influence of context, and hence are not able to track down the causes of consumption to the process level, where optimization measures can actually take place. This dissertation proposes a new approach to tackle this issue, by on-line estimation of context-based energy consumption models, which are able to map operating context to consumption patterns. Context identification is performed by regression tree algorithms. Energy consumption estimation is achieved by means of a multi-model architecture using multiple RLS algorithms, locally estimated for each operating context. Lastly, the proposed approach is applied to a real cement plant grinding circuit. Experimental results demonstrate the viability of the overall system, regarding both automatic context identification and energy consumption estimation.
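The multi-model architecture described in this abstract, with one recursive least squares (RLS) estimator per operating context, can be sketched as below. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the context label is assumed to be supplied by an external classifier (the abstract uses regression trees), the consumption model is assumed to be linear in the regressors, and the class names, forgetting factor, initial covariance and toy data are all hypothetical.

```python
class RLS:
    """Recursive least squares for a linear model y = theta . x."""

    def __init__(self, n_params, forgetting=0.99, p0=1000.0):
        self.theta = [0.0] * n_params
        # A large initial covariance means little confidence in the initial theta.
        self.P = [[p0 if i == j else 0.0 for j in range(n_params)]
                  for i in range(n_params)]
        self.lam = forgetting

    def predict(self, x):
        return sum(t * xi for t, xi in zip(self.theta, x))

    def update(self, x, y):
        n = len(x)
        Px = [sum(self.P[i][j] * x[j] for j in range(n)) for i in range(n)]
        denom = self.lam + sum(xi * pxi for xi, pxi in zip(x, Px))
        gain = [pxi / denom for pxi in Px]
        err = y - self.predict(x)            # a priori prediction error
        self.theta = [t + g * err for t, g in zip(self.theta, gain)]
        # P is symmetric, so x^T P equals (P x)^T.
        self.P = [[(self.P[i][j] - gain[i] * Px[j]) / self.lam
                   for j in range(n)] for i in range(n)]
        return err


class MultiModelRLS:
    """One local RLS model per operating context, created on first use."""

    def __init__(self, n_params, forgetting=0.99):
        self.n_params = n_params
        self.forgetting = forgetting
        self.models = {}

    def update(self, context, x, y):
        if context not in self.models:
            self.models[context] = RLS(self.n_params, self.forgetting)
        return self.models[context].update(x, y)

    def predict(self, context, x):
        return self.models[context].predict(x)


# Toy data (hypothetical): consumption is 1 + 2u in context "idle"
# and 5 + 0.5u in context "load"; the regressor is [1, u].
bank = MultiModelRLS(n_params=2)
for t in range(100):
    u = float(t % 10)
    context = "idle" if t % 2 == 0 else "load"
    y = 1.0 + 2.0 * u if context == "idle" else 5.0 + 0.5 * u
    bank.update(context, [1.0, u], y)
print({c: [round(v, 3) for v in m.theta] for c, m in bank.models.items()})
```

Keeping a separate parameter vector per context lets each local model stay simple while the bank as a whole follows consumption patterns that change with the operating regime.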
Abstract:
ETL conceptual modeling is a very important activity in any data warehousing system project implementation. Having a high-level system representation that allows a clear identification of the main parts of a data warehousing system is clearly a great advantage, especially in the early stages of design and development. However, the effort to conceptually model an ETL system is rarely properly rewarded. Translating ETL conceptual models directly into something that saves work and time in the concrete implementation of the system would, in fact, be a great help. In this paper we present and discuss a hybrid approach to this problem, combining the simplicity of interpretation and power of expression of BPMN for ETL system conceptualization with the use of ETL patterns to automatically produce an ETL skeleton, a first prototype system, which can be executed in a commercial ETL tool like Kettle.
Abstract:
Text Mining has opened a vast array of possibilities concerning automatic information retrieval from large amounts of text documents. A variety of themes and types of documents can be easily analyzed. More complex features such as those used in Forensic Linguistics can yield a deeper understanding of the documents, making it possible to perform difficult tasks such as author identification. In this work we explore the capabilities of simpler Text Mining approaches to author identification of unstructured documents, in particular the ability to distinguish poetic works from two of Fernando Pessoa's heteronyms: Álvaro de Campos and Ricardo Reis. Several processing options were tested and accuracies of 97% were reached, which encourages further developments.
Abstract:
System identification, evolutionary automatic, data-driven model, fuzzy Takagi-Sugeno grammar, genotype interpretability, toxicity prediction
Abstract:
Context: Understanding the process through which adolescents and young adults try legal and illegal substances is a crucial point for the development of tailored prevention and treatment programs. However, patterns of substance first use can be very complex when multiple substances are considered, requiring reduction into a small number of meaningful categories. Data: We used data from a survey on adolescent and young adult health conducted in 2002 in Switzerland. Answers from 2212 subjects aged 19 and 20 were included. The first-ever consumption of 10 substances (tobacco, cannabis, medicine to get high, sniff (volatile substances and inhalants), ecstasy, GHB, LSD, cocaine, methadone, and heroin) was considered, for a grand total of 516 different patterns. Methods: In a first step, automatic clustering was used to decrease the number of patterns to 50. Then, two groups of substance use experts, three social field workers, and three toxicologists and health professionals, were asked to reduce them into a maximum of 10 meaningful categories. Results: Classifications obtained through our methodology are of practical interest because they reveal associations invisible to purely automatic algorithms. The article includes a detailed analysis of both final classifications, and a discussion of the advantages and limitations of our approach.
Abstract:
Difficult tracheal intubation assessment is an important research topic in anesthesia, as failed intubations are important causes of mortality in anesthetic practice. The modified Mallampati score is widely used, alone or in conjunction with other criteria, to predict the difficulty of intubation. This work presents an automatic method to assess the modified Mallampati score from an image of a patient with the mouth wide open. For this purpose we propose an active appearance models (AAM) based method and use linear support vector machines (SVM) to select a subset of relevant features obtained using the AAM. This feature selection step proves to be essential, as it drastically improves the performance of the classification, which is obtained using SVM with an RBF kernel and majority voting. We test our method on images of 100 patients undergoing elective surgery and achieve 97.9% accuracy in the leave-one-out cross-validation test, providing a key element of an automatic difficult intubation assessment system.
Abstract:
We have initiated a gene discovery program in Schistosoma mansoni based on the technique of Expressed Sequence Tags (ESTs), i.e. partial sequences of cDNAs obtained from single passes in automatic DNA sequencers. ESTs can be used to identify genes on the basis of their homology with sequences from other species deposited in DNA or protein databases. Transcripts with sequences without matches in the databases may represent novel parasite-specific genes. This approach has proven to be very efficient, and in less than two years a broad range of novel genes has already been ascertained, more than doubling the number of known S. mansoni genes.
Abstract:
Automatic creation of polarity lexicons is a crucial issue to be solved in order to reduce time and effort in the first steps of Sentiment Analysis. In this paper we present a methodology based on linguistic cues that allows us to automatically discover, extract and label subjective adjectives that should be collected in a domain-based polarity lexicon. For this purpose, we designed a bootstrapping algorithm that, from a small set of seed polar adjectives, is capable of iteratively identifying, extracting and annotating positive and negative adjectives. Additionally, the method automatically creates lists of highly subjective elements that change their prior polarity even within the same domain. The proposed algorithm reached a precision of 97.5% for positive adjectives and 71.4% for negative ones in the semantic orientation identification task.
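The bootstrapping loop described in this abstract can be sketched as follows. The abstract does not say which linguistic cues the authors use, so this sketch assumes one simple cue for illustration only: adjectives joined by "and" share a polarity, and adjectives joined by "but" have opposite polarities. The function name, the cue table and the example pairs are hypothetical.

```python
# Assumed cue: coordinating conjunctions between two adjectives.
CUES = {"and": 1, "but": -1}


def bootstrap_polarity(seeds, pairs, max_iters=10):
    """Grow a polarity lexicon from seed adjectives.

    seeds : dict mapping adjective -> +1 (positive) or -1 (negative)
    pairs : iterable of (adjective, conjunction, adjective) triples
            extracted from a corpus
    """
    lexicon = dict(seeds)
    for _ in range(max_iters):
        added = {}
        for left, conj, right in pairs:
            if conj not in CUES:
                continue
            sign = CUES[conj]
            # Propagate a known label to an unlabeled partner in either direction.
            for known, new in ((left, right), (right, left)):
                if known in lexicon and new not in lexicon and new not in added:
                    added[new] = sign * lexicon[known]
        if not added:          # nothing new was labeled: stop iterating
            break
        lexicon.update(added)
    return lexicon


seeds = {"good": 1, "bad": -1}
pairs = [
    ("good", "and", "reliable"),
    ("reliable", "but", "noisy"),
    ("noisy", "and", "flimsy"),
    ("bad", "and", "slow"),
]
print(bootstrap_polarity(seeds, pairs))
```

Each pass only uses labels fixed in earlier passes, so "noisy" is labeled one iteration after "reliable" and "flimsy" one iteration after that; the loop ends as soon as a pass adds nothing.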