75 resultados para Information discovery


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Based in internet growth, through semantic web, together with communication speed improvement and fast development of storage device sizes, data and information volume rises considerably every day. Because of this, in the last few years there has been a growing interest in structures for formal representation with suitable characteristics, such as the possibility to organize data and information, as well as the reuse of its contents aimed for the generation of new knowledge. Controlled Vocabulary, specifically Ontologies, present themselves in the lead as one of such structures of representation with high potential. Not only allow for data representation, as well as the reuse of such data for knowledge extraction, coupled with its subsequent storage through not so complex formalisms. However, for the purpose of assuring that ontology knowledge is always up to date, they need maintenance. Ontology Learning is an area which studies the details of update and maintenance of ontologies. It is worth noting that relevant literature already presents first results on automatic maintenance of ontologies, but still in a very early stage. Human-based processes are still the current way to update and maintain an ontology, which turns this into a cumbersome task. The generation of new knowledge aimed for ontology growth can be done based in Data Mining techniques, which is an area that studies techniques for data processing, pattern discovery and knowledge extraction in IT systems. This work aims at proposing a novel semi-automatic method for knowledge extraction from unstructured data sources, using Data Mining techniques, namely through pattern discovery, focused in improving the precision of concept and its semantic relations present in an ontology. In order to verify the applicability of the proposed method, a proof of concept was developed, presenting its results, which were applied in building and construction sector.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Tese de doutoramento em Engenharia do Ambiente, especialidade em Sistemas Sociais

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The principal topic of this work is the application of data mining techniques, in particular of machine learning, to the discovery of knowledge in a protein database. In the first chapter a general background is presented. Namely, in section 1.1 we overview the methodology of a Data Mining project and its main algorithms. In section 1.2 an introduction to the proteins and its supporting file formats is outlined. This chapter is concluded with section 1.3 which defines that main problem we pretend to address with this work: determine if an amino acid is exposed or buried in a protein, in a discrete way (i.e.: not continuous), for five exposition levels: 2%, 10%, 20%, 25% and 30%. In the second chapter, following closely the CRISP-DM methodology, whole the process of construction the database that supported this work is presented. Namely, it is described the process of loading data from the Protein Data Bank, DSSP and SCOP. Then an initial data exploration is performed and a simple prediction model (baseline) of the relative solvent accessibility of an amino acid is introduced. It is also introduced the Data Mining Table Creator, a program developed to produce the data mining tables required for this problem. In the third chapter the results obtained are analyzed with statistical significance tests. Initially the several used classifiers (Neural Networks, C5.0, CART and Chaid) are compared and it is concluded that C5.0 is the most suitable for the problem at stake. It is also compared the influence of parameters like the amino acid information level, the amino acid window size and the SCOP class type in the accuracy of the predictive models. The fourth chapter starts with a brief revision of the literature about amino acid relative solvent accessibility. Then, we overview the main results achieved and finally discuss about possible future work. The fifth and last chapter consists of appendices. Appendix A has the schema of the database that supported this thesis. Appendix B has a set of tables with additional information. Appendix C describes the software provided in the DVD accompanying this thesis that allows the reconstruction of the present work.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertação apresentada na Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa para obtenção do Grau de Mestre em Engenharia do Ambiente, perfil Gestão e Sistemas Ambientais

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertation presented to obtain a Masters degree in Computer Science

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Thesis submitted to Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa, in partial fulfillment of the requirements for the degree of Master in Computer Science

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Thesis submitted to the Instituto Superior de Estatística e Gestão de Informação da Universidade Nova de Lisboa in partial fulfillment of the requirements for the Degree of Doctor of Philosophy in Information Management – Geographic Information Systems

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertação apresentada para obtenção de Grau de Doutor em Bioquímica,Bioquímica Estrutural, pela Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Thèse pour obtenir le grade de DOCTEUR DE L' UNIVERSITÉ PARIS XII, Discipline: Urbanisme Aménagement