5 resultados para classifiers
Resumo:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Resumo:
The principal topic of this work is the application of data mining techniques, in particular of machine learning, to the discovery of knowledge in a protein database. In the first chapter a general background is presented. Namely, in section 1.1 we overview the methodology of a Data Mining project and its main algorithms. In section 1.2 an introduction to the proteins and its supporting file formats is outlined. This chapter is concluded with section 1.3 which defines that main problem we pretend to address with this work: determine if an amino acid is exposed or buried in a protein, in a discrete way (i.e.: not continuous), for five exposition levels: 2%, 10%, 20%, 25% and 30%. In the second chapter, following closely the CRISP-DM methodology, whole the process of construction the database that supported this work is presented. Namely, it is described the process of loading data from the Protein Data Bank, DSSP and SCOP. Then an initial data exploration is performed and a simple prediction model (baseline) of the relative solvent accessibility of an amino acid is introduced. It is also introduced the Data Mining Table Creator, a program developed to produce the data mining tables required for this problem. In the third chapter the results obtained are analyzed with statistical significance tests. Initially the several used classifiers (Neural Networks, C5.0, CART and Chaid) are compared and it is concluded that C5.0 is the most suitable for the problem at stake. It is also compared the influence of parameters like the amino acid information level, the amino acid window size and the SCOP class type in the accuracy of the predictive models. The fourth chapter starts with a brief revision of the literature about amino acid relative solvent accessibility. Then, we overview the main results achieved and finally discuss about possible future work. The fifth and last chapter consists of appendices. Appendix A has the schema of the database that supported this thesis. Appendix B has a set of tables with additional information. Appendix C describes the software provided in the DVD accompanying this thesis that allows the reconstruction of the present work.
Resumo:
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies
Resumo:
Dissertação para obtenção do Grau de Mestre em Engenharia Informática
Resumo:
The life of humans and most living beings depend on sensation and perception for the best assessment of the surrounding world. Sensorial organs acquire a variety of stimuli that are interpreted and integrated in our brain for immediate use or stored in memory for later recall. Among the reasoning aspects, a person has to decide what to do with available information. Emotions are classifiers of collected information, assigning a personal meaning to objects, events and individuals, making part of our own identity. Emotions play a decisive role in cognitive processes as reasoning, decision and memory by assigning relevance to collected information. The access to pervasive computing devices, empowered by the ability to sense and perceive the world, provides new forms of acquiring and integrating information. But prior to data assessment on its usefulness, systems must capture and ensure that data is properly managed for diverse possible goals. Portable and wearable devices are now able to gather and store information, from the environment and from our body, using cloud based services and Internet connections. Systems limitations in handling sensorial data, compared with our sensorial capabilities constitute an identified problem. Another problem is the lack of interoperability between humans and devices, as they do not properly understand human’s emotional states and human needs. Addressing those problems is a motivation for the present research work. The mission hereby assumed is to include sensorial and physiological data into a Framework that will be able to manage collected data towards human cognitive functions, supported by a new data model. By learning from selected human functional and behavioural models and reasoning over collected data, the Framework aims at providing evaluation on a person’s emotional state, for empowering human centric applications, along with the capability of storing episodic information on a person’s life with physiologic indicators on emotional states to be used by new generation applications.