15 results for Local Cluster Neural Networks (LCNN)


Relevance: 100.00%

Abstract:

This paper presents an application of an Artificial Neural Network (ANN) to the prediction of stock market direction in the US. Using a multilayer perceptron neural network trained with a backpropagation algorithm, the model aims to learn the hidden patterns in the daily movement of the S&P 500 in order to correctly identify whether the market will exhibit Trend Following or Mean Reversion behavior. The ANN is able to produce a successful investment strategy that outperforms the buy-and-hold strategy, but its overall results are unstable, which compromises its practical application in real-life investment decisions.

Relevance: 100.00%

Abstract:

In this thesis, a feed-forward Artificial Neural Network trained by back-propagation with the gradient descent algorithm is developed to forecast the directional movement of daily returns for WTI, gold and copper futures. Out-of-sample back-test results vary, with some predictive ability for copper futures but none for either WTI or gold. The best statistically significant hit rate achieved was 57% for copper, with an absolute return Sharpe Ratio of 1.25 and a benchmarked Information Ratio of 2.11.
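
The two headline metrics in this abstract, hit rate and Sharpe Ratio, can be illustrated with a minimal sketch. It is not the thesis's back-test: the function names, the toy data, the zero risk-free rate and the 252-day annualisation factor are all assumptions made here for illustration, and days with a zero return are counted as misses.

```python
import math
from statistics import mean, stdev


def strategy_returns(predicted_signs, realized_returns):
    """Daily return of a strategy that goes long (+1) or short (-1) as predicted."""
    return [p * r for p, r in zip(predicted_signs, realized_returns)]


def hit_rate(predicted_signs, realized_returns):
    """Fraction of days on which the predicted sign matches the realized sign."""
    hits = sum(1 for p, r in zip(predicted_signs, realized_returns) if p * r > 0)
    return hits / len(realized_returns)


def annualized_sharpe(returns, periods_per_year=252):
    """Mean over sample standard deviation, scaled by sqrt(periods per year).

    Assumes a zero risk-free rate (an assumption of this sketch).
    """
    return mean(returns) / stdev(returns) * math.sqrt(periods_per_year)


# Toy data, invented for illustration only.
predicted = [1, -1, 1, 1, -1]
realized = [0.01, -0.02, -0.005, 0.015, 0.01]

print(hit_rate(predicted, realized))
print(annualized_sharpe(strategy_returns(predicted, realized)))
```

An Information Ratio would be computed in the same way on the strategy's returns in excess of a benchmark, rather than on its absolute returns.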

Relevance: 100.00%

Abstract:

Dissertation presented at the Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa to obtain the degree of Master in Biomedical Engineering

Relevance: 100.00%

Abstract:

The principal topic of this work is the application of data mining techniques, in particular machine learning, to the discovery of knowledge in a protein database. The first chapter presents a general background. Section 1.1 gives an overview of the methodology of a Data Mining project and its main algorithms. Section 1.2 introduces proteins and their supporting file formats. The chapter concludes with section 1.3, which defines the main problem we intend to address with this work: determining whether an amino acid is exposed or buried in a protein, in a discrete way (i.e. not continuous), for five exposure levels: 2%, 10%, 20%, 25% and 30%. The second chapter, closely following the CRISP-DM methodology, presents the whole process of constructing the database that supported this work. It describes the process of loading data from the Protein Data Bank, DSSP and SCOP. An initial data exploration is then performed and a simple prediction model (baseline) of the relative solvent accessibility of an amino acid is introduced. The chapter also introduces the Data Mining Table Creator, a program developed to produce the data mining tables required for this problem. In the third chapter, the results obtained are analyzed with statistical significance tests. First, the classifiers used (Neural Networks, C5.0, CART and CHAID) are compared, and it is concluded that C5.0 is the most suitable for the problem at hand. The influence of parameters such as the amino acid information level, the amino acid window size and the SCOP class type on the accuracy of the predictive models is also compared. The fourth chapter starts with a brief review of the literature on amino acid relative solvent accessibility. We then summarize the main results achieved and finally discuss possible future work. The fifth and last chapter consists of appendices. Appendix A has the schema of the database that supported this thesis.
Appendix B has a set of tables with additional information. Appendix C describes the software provided on the DVD accompanying this thesis, which allows the reconstruction of the present work.
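
The discrete exposed/buried problem stated in this abstract can be sketched as a thresholding of relative solvent accessibility (RSA) at the five levels listed. This is only an illustration, not the thesis's code: the function names are hypothetical, RSA is assumed to be given as a fraction between 0 and 1, and treating a value equal to the threshold as exposed is an assumed convention.

```python
# The five exposure levels named in the abstract, written as fractions.
THRESHOLDS = (0.02, 0.10, 0.20, 0.25, 0.30)


def label_exposure(rsa, threshold):
    """Return 'exposed' if the RSA reaches the threshold, else 'buried'.

    Using >= rather than > is an assumption of this sketch.
    """
    return "exposed" if rsa >= threshold else "buried"


def label_all_levels(rsa):
    """Binary label of one amino acid at each of the five exposure levels."""
    return {t: label_exposure(rsa, t) for t in THRESHOLDS}


# A residue with 22% relative solvent accessibility (invented value).
print(label_all_levels(0.22))
```

Each threshold therefore defines its own binary classification task, which is why the abstract speaks of five exposure levels rather than a single model.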

Relevance: 100.00%

Abstract:

This Thesis describes the application of automatic learning methods for a) the classification of organic and metabolic reactions, and b) the mapping of Potential Energy Surfaces (PES). The classification of reactions was approached with two distinct methodologies: a representation of chemical reactions based on NMR data, and a representation of chemical reactions from the reaction equation based on the physico-chemical and topological features of chemical bonds.
NMR-based classification of photochemical and enzymatic reactions.
Photochemical and metabolic reactions were classified by Kohonen Self-Organizing Maps (Kohonen SOMs) and Random Forests (RFs) taking as input the difference between the 1H NMR spectra of the products and the reactants. Such a representation can be applied to the automatic analysis of changes in the 1H NMR spectrum of a mixture and their interpretation in terms of the chemical reactions taking place. Examples of possible applications are the monitoring of reaction processes, the evaluation of the stability of chemicals, or even the interpretation of metabonomic data. A Kohonen SOM trained with a data set of metabolic reactions catalysed by transferases was able to correctly classify 75% of an independent test set in terms of the EC number subclass. Random Forests improved the correct predictions to 79%. With photochemical reactions classified into 7 groups, an independent test set was classified with 86-93% accuracy. The data set of photochemical reactions was also used to simulate mixtures with two reactions occurring simultaneously. Kohonen SOMs and Feed-Forward Neural Networks (FFNNs) were trained to classify the reactions occurring in a mixture based on the 1H NMR spectra of the products and reactants. Kohonen SOMs allowed the correct assignment of 53-63% of the mixtures (in a test set). Counter-Propagation Neural Networks (CPNNs) gave similar results.
Supervised learning techniques improved the results: correct assignments rose to 77% when an ensemble of ten FFNNs was used and to 80% when Random Forests were used. This study was performed with NMR data simulated from the molecular structure by the SPINUS program. In the design of one test set, simulated data was combined with experimental data. The results support the proposal of linking databases of chemical reactions to experimental or simulated NMR data for automatic classification of reactions and mixtures of reactions.
Genome-scale classification of enzymatic reactions from their reaction equation.
The MOLMAP descriptor relies on a Kohonen SOM that defines types of bonds on the basis of their physico-chemical and topological properties. The MOLMAP descriptor of a molecule represents the types of bonds available in that molecule. The MOLMAP descriptor of a reaction is defined as the difference between the MOLMAPs of the products and the reactants, and numerically encodes the pattern of bonds that are broken, changed, and made during a chemical reaction. The automatic perception of chemical similarities between metabolic reactions is required for a variety of applications, ranging from the computer validation of classification systems and the genome-scale reconstruction (or comparison) of metabolic pathways to the classification of enzymatic mechanisms. Catalytic functions of proteins are generally described by the EC numbers, which are simultaneously employed as identifiers of reactions, enzymes, and enzyme genes, thus linking metabolic and genomic information. Different methods should be available to automatically compare metabolic reactions and to automatically assign EC numbers to reactions not yet officially classified.
In this study, the genome-scale data set of enzymatic reactions available in the KEGG database was encoded by the MOLMAP descriptors and submitted to Kohonen SOMs to compare the resulting map with the official EC number classification, to explore the possibility of predicting EC numbers from the reaction equation, and to assess the internal consistency of the EC classification at the class level. A general agreement with the EC classification was observed, i.e. a relationship between the similarity of MOLMAPs and the similarity of EC numbers. At the same time, MOLMAPs were able to discriminate between EC sub-subclasses. EC numbers could be assigned at the class, subclass, and sub-subclass levels with accuracies up to 92%, 80%, and 70% for independent test sets. The correspondence between chemical similarity of metabolic reactions and their MOLMAP descriptors was applied to the identification of a number of reactions mapped into the same neuron but belonging to different EC classes, which demonstrated the ability of the MOLMAP/SOM approach to verify the internal consistency of classifications in databases of metabolic reactions. RFs were also used to assign the four levels of the EC hierarchy from the reaction equation. EC numbers were correctly assigned in 95%, 90%, 85% and 86% of the cases (for independent test sets) at the class, subclass, sub-subclass and full EC number level, respectively. Experiments for the classification of reactions from the main reactants and products were performed with RFs: EC numbers were assigned at the class, subclass and sub-subclass level with accuracies of 78%, 74% and 63%, respectively. In the course of the experiments with metabolic reactions we suggested that the MOLMAP/SOM concept could be extended to the representation of other levels of metabolic information such as metabolic pathways.
Following the MOLMAP idea, the pattern of neurons activated by the reactions of a metabolic pathway is a representation of the reactions involved in that pathway, i.e. a descriptor of the metabolic pathway. This reasoning enabled the comparison of different pathways, the automatic classification of pathways, and a classification of organisms based on their biochemical machinery. The three levels of classification (from bonds to metabolic pathways) made it possible to map and perceive chemical similarities between metabolic pathways, even for pathways of different types of metabolism and pathways that do not share similarities in terms of EC numbers.
Mapping of PES by neural networks (NNs).
In a first series of experiments, ensembles of Feed-Forward NNs (EnsFFNNs) and Associative Neural Networks (ASNNs) were trained to reproduce PES represented by the Lennard-Jones (LJ) analytical potential function. The accuracy of the method was assessed by comparing the results of molecular dynamics simulations (thermal, structural, and dynamic properties) obtained from the NNs-PES and from the LJ function. The results indicated that for LJ-type potentials, NNs can be trained to generate accurate PES to be used in molecular simulations. EnsFFNNs and ASNNs gave better results than single FFNNs. The NN models show a remarkable ability to interpolate between distant curves and accurately reproduce potentials to be used in molecular simulations. The purpose of the first study was to systematically analyse the accuracy of different NNs. Our main motivation, however, is reflected in the next study: the mapping of multidimensional PES by NNs to simulate, by Molecular Dynamics or Monte Carlo, the adsorption and self-assembly of solvated organic molecules on noble-metal electrodes. Indeed, for such complex and heterogeneous systems the development of suitable analytical functions that fit quantum mechanical interaction energies is a non-trivial or even impossible task.
The data consisted of energy values, from Density Functional Theory (DFT) calculations, at different distances, for several molecular orientations and three electrode adsorption sites. The results indicate that NNs require a data set large enough to cover the diversity of possible interaction sites, distances, and orientations well. NNs trained with such data sets can perform as well as, or even better than, analytical functions. Therefore, they can be used in molecular simulations, particularly for the ethanol/Au (111) interface, which is the case studied in the present Thesis. Once properly trained, the networks are able to produce, as output, any required number of energy points for accurate interpolations.
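
The reaction descriptor defined in this abstract (the difference between the MOLMAPs of the products and the reactants) can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes each bond has already been mapped to a neuron index by a trained Kohonen SOM (the training itself is omitted), it represents a molecule's MOLMAP as plain per-neuron bond counts, and all function names and data are hypothetical.

```python
from collections import Counter


def molmap(bond_neurons, n_neurons):
    """Per-neuron bond counts for one molecule.

    `bond_neurons` lists, for each bond, the index of the SOM neuron it
    activates (assumed to be precomputed elsewhere).
    """
    counts = Counter(bond_neurons)
    return [counts.get(i, 0) for i in range(n_neurons)]


def sum_maps(molecules, n_neurons):
    """Element-wise sum of the MOLMAPs of several molecules."""
    total = [0] * n_neurons
    for bonds in molecules:
        for i, value in enumerate(molmap(bonds, n_neurons)):
            total[i] += value
    return total


def reaction_molmap(reactants, products, n_neurons):
    """MOLMAP of the products minus MOLMAP of the reactants."""
    before = sum_maps(reactants, n_neurons)
    after = sum_maps(products, n_neurons)
    return [a - b for a, b in zip(after, before)]


# Toy reaction on a 4-neuron map: one bond of type 1 becomes a bond of type 2.
reactants = [[0, 1, 1]]
products = [[0, 1, 2]]
print(reaction_molmap(reactants, products, 4))  # [0, -1, 1, 0]
```

Negative entries mark bond types that are consumed, positive entries bond types that are formed, and bonds unchanged by the reaction cancel out, which is what lets the descriptor encode the pattern of bonds broken, changed, and made.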

Relevance: 100.00%

Abstract:

Dissertation presented as a partial requirement for obtaining the degree of Master in Geographic Information Science and Systems

Relevance: 100.00%

Abstract:

Dissertation presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management

Relevance: 100.00%

Abstract:

Dissertation presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management

Relevance: 100.00%

Abstract:

Dissertation presented as a partial requirement for obtaining the degree of Master in Geographic Information Science and Systems

Relevance: 100.00%

Abstract:

Dissertation presented as a partial requirement for obtaining the degree of Master in Geographic Information Science and Systems

Relevance: 100.00%

Abstract:

A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems.

Relevance: 100.00%

Abstract:

Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.

Relevance: 100.00%

Abstract:

In the last few years, we have observed an exponential growth of information systems, and parking information is one more example. Obtaining reliable and up-to-date information on parking slot availability is very important for the goal of traffic reduction. Parking slot prediction is also a new topic that has already started to be applied. San Francisco in the United States and Santander in Spain are examples of projects carried out to obtain this kind of information. The aim of this thesis is the study and evaluation of methodologies for parking slot prediction and their integration in a web application, where all kinds of users will be able to know the current parking status and also the future status according to parking model predictions. The source of the data is ancillary to this work, but it still needs to be understood in order to understand the parking behaviour. There are many modelling techniques used for this purpose, such as time series analysis, decision trees, neural networks and clustering. In this work, the author explains the best techniques for this work, analyzes the results and points out the advantages and disadvantages of each one. The model learns the periodic and seasonal patterns of the parking status behaviour, and with this knowledge it can predict future status values given a date. The data used comes from the Smart Park Ontinyent; it consists of parking occupancy status together with timestamps and is stored in a database. After data acquisition, data analysis and pre-processing were needed for the model implementations. The first test was done with a boosting ensemble classifier, applied to a set of decision trees created with the C5.0 algorithm from a set of training samples, to assign a prediction value to each object. In addition to the predictions, this work provides error measurements that indicate the reliability of the outcome predictions.
The second test was done using the function-fitting seasonal exponential smoothing TBATS model. Finally, as the last test, a model that is a combination of the previous two models was tried, to see the result of this combination. The results were quite good for all of them, with average errors of 6.2, 6.6 and 5.4 in vacancy predictions for the three models, respectively. For a car park of 47 places, this means an average error of 10% in parking slot predictions. This result could be even better with a longer data record. In order to make this kind of information visible and reachable by everyone with an internet-connected device, a web application was built. Besides displaying the data, this application also offers different functions to improve the task of searching for parking. The new functions, apart from parking prediction, were:
- Park distances from user location. It provides the distances from the user's current location to the different car parks in the city.
- Geocoding. The service for matching a literal description or an address to a concrete location.
- Geolocation. The service for positioning the user.
- Parking list panel. This is neither a service nor a function, just a better visualization and better handling of the information.

Relevance: 100.00%

Abstract:

Fully comprehending brain function, at the scale of neural networks, will only be possible with the development of tools by micro- and nanofabrication. Regarding silicon microelectrode arrays specifically, a significant improvement in the long-term performance of these implants is essential. This project aims to create a silicon microelectrode coating that provides high-quality electrical recordings while limiting the inflammatory response of chronic implants. For this purpose, a combined chitosan and gold nanoparticle coating was produced, together with modification of the electrodes by electrodeposition with PEDOT/PSS in order to reduce the impedance at 1 kHz. Using a dip-coating mechanism, the silicon probe was coated and then characterized both morphologically and electrochemically, with a focus on the stability of post-surgery performance in anesthetized rodents. Since the analysis of the inflammatory response is not the only vital aspect, the degradation of the electrode recordings over time was also studied. The produced film had a thickness of approximately 50 μm, which led to an increase in impedance of less than 20 kΩ on average. In a 3-week chronic implant, the impedance increase on the coated probe was 641 kΩ, compared with 2.4 MΩ for the uncoated probe. The inflammatory response was also significantly reduced due to the biocompatible film, as shown by histological tests.

Relevance: 100.00%

Abstract:

Polysaccharides are gaining increasing attention as potential environmentally friendly and sustainable building blocks in many fields of the (bio)chemical industry. The microbial production of polysaccharides is envisioned as a promising path, since higher biomass growth rates are possible and therefore higher productivities may be achieved compared to plant or animal polysaccharide sources. This Ph.D. thesis focuses on the modeling and optimization of the production of a particular microbial polysaccharide, namely extracellular polysaccharides (EPS) produced by the bacterial strain Enterobacter A47. Enterobacter A47 was found to be a metabolically versatile organism in terms of its adaptability to complex media, notably capable of achieving high growth rates in media containing glycerol byproduct from the biodiesel industry. However, the industrial implementation of this production process is still hampered because the process is largely unoptimized. Kinetic rates from the bioreactor operation are heavily dependent on operational parameters such as temperature, pH, stirring and aeration rate. The increase of culture broth viscosity is a common feature of this culture and has a major impact on the overall performance. This fact complicates the mathematical modeling of the process, limiting the possibility to understand, control and optimize productivity. In order to tackle this difficulty, data-driven mathematical methodologies such as Artificial Neural Networks can be employed to incorporate additional process data to complement the known mathematical description of the fermentation kinetics. In this Ph.D. thesis, we have adopted such a hybrid modeling framework, which enabled the incorporation of temperature, pH and viscosity effects on the fermentation kinetics in order to improve the dynamical modeling and optimization of the process.
A model-based optimization method was implemented that enabled the design of optimal bioreactor control strategies in the sense of EPS productivity maximization. It is also critical to understand EPS synthesis at the level of the bacterial metabolism, since the production of EPS is a tightly regulated process. Methods of pathway analysis provide a means to unravel the fundamental pathways and their controls in bioprocesses. In the present Ph.D. thesis, a novel methodology called Principal Elementary Mode Analysis (PEMA) was developed and implemented, which made it possible to identify which cellular fluxes are activated under different conditions of temperature and pH. It is shown that differences in these two parameters affect the chemical composition of EPS; hence they are critical for the regulation of the product synthesis. In future studies, the knowledge provided by PEMA could foster the development of metabolically meaningful control strategies that target the EPS sugar content and other product quality parameters.