52 results for Data mining, Business intelligence, Market forecasts


Relevance:

100.00%

Publisher:

Abstract:

The principal topic of this work is the application of data mining techniques, in particular machine learning, to the discovery of knowledge in a protein database. The first chapter presents the general background. Section 1.1 gives an overview of the methodology of a Data Mining project and its main algorithms. Section 1.2 introduces proteins and their supporting file formats. The chapter concludes with section 1.3, which defines the main problem we intend to address with this work: determining whether an amino acid is exposed or buried in a protein, in a discrete (i.e., not continuous) way, for five exposure levels: 2%, 10%, 20%, 25% and 30%. The second chapter, following the CRISP-DM methodology closely, presents the whole process of constructing the database that supported this work. It describes the process of loading data from the Protein Data Bank, DSSP and SCOP. An initial data exploration is then performed, and a simple prediction model (baseline) of the relative solvent accessibility of an amino acid is introduced. The Data Mining Table Creator, a program developed to produce the data mining tables required for this problem, is also introduced. In the third chapter the results obtained are analyzed with statistical significance tests. First, the classifiers used (Neural Networks, C5.0, CART and CHAID) are compared, and it is concluded that C5.0 is the most suitable for the problem at hand. The influence of parameters such as the amino acid information level, the amino acid window size and the SCOP class type on the accuracy of the predictive models is also compared. The fourth chapter starts with a brief review of the literature on amino acid relative solvent accessibility. We then give an overview of the main results achieved and finally discuss possible future work. The fifth and last chapter consists of appendices. Appendix A contains the schema of the database that supported this thesis. Appendix B contains a set of tables with additional information. Appendix C describes the software provided on the DVD accompanying this thesis, which allows the present work to be reconstructed.
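
The discrete exposed/buried labelling described in this abstract can be illustrated with a minimal Python sketch. Only the five exposure levels come from the abstract. The function names, the assumption that relative solvent accessibility (RSA) is given as a percentage, and the choice to treat a value exactly at the threshold as exposed are all hypothetical; the thesis may define the boundary differently.

```python
# Exposure levels (in percent) listed in the abstract.
EXPOSURE_THRESHOLDS = (2, 10, 20, 25, 30)


def classify_exposure(rsa_percent, threshold):
    """Label one residue as 'exposed' or 'buried' at a given threshold.

    Assumption: a residue whose RSA is at or above the threshold counts
    as exposed. The abstract does not state how the boundary is handled.
    """
    return "exposed" if rsa_percent >= threshold else "buried"


def label_all_levels(rsa_percent):
    """Return the binary label of one residue at each of the five levels."""
    return {t: classify_exposure(rsa_percent, t) for t in EXPOSURE_THRESHOLDS}


if __name__ == "__main__":
    # Hypothetical residue with 15% relative solvent accessibility.
    for threshold, label in label_all_levels(15.0).items():
        print(f"{threshold:>2}% threshold: {label}")
```

Each threshold yields a separate binary classification problem, which is presumably why the abstract lists the five levels individually.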

Relevance:

100.00%

Publisher:

Abstract:

A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems.

Relevance:

100.00%

Publisher:

Abstract:

Project work presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management.

Relevance:

100.00%

Publisher:

Abstract:

Project work presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management

Relevance:

100.00%

Publisher:

Abstract:

Dissertation presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management

Relevance:

100.00%

Publisher:

Abstract:

Dissertation presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management

Relevance:

100.00%

Publisher:

Abstract:

Project work presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management

Relevance:

100.00%

Publisher:

Abstract:

Dissertation for obtaining the degree of Master in Electrical, Systems and Computer Engineering

Relevance:

100.00%

Publisher:

Abstract:

The insurance sector currently faces several difficulties, caused not only by the international economic crisis and an increasingly competitive market, but also by the requirements imposed by the regulator, the Instituto de Seguros de Portugal (ISP). As a result, only insurers that manage to monitor their risks and adjust the premiums they charge accordingly will survive. The way to do this is through adequate tariff setting. In this context of high instability, Business Intelligence (BI) platforms have come to play an increasingly important role in the decision-making process, in particular Business Analytics (BA), which provides the analysis methods and tools. The goal of this project is to develop a prototype BA solution that supplies the inputs needed for the decision-making process, by monitoring the tariff in force and simulating the impact of introducing a new tariff. The solution developed covers only the motor third-party liability (RCA) tariff. As for analytical tools, the focus was on visual analysis, namely the construction of dashboards, which include sensitivity analysis, or what-if analysis (WIF). The motivation for developing this project was the finding that no solutions for this purpose existed in the professional environments in which I have been involved.

Relevance:

100.00%

Publisher:

Abstract:

A significant increase in the demand for interoperable systems for exchanging data in collaborative business environments has recently been observed. Consequently, cooperation agreements between the enterprises involved have come to light. However, even within the same community or domain, there is a wide variety of knowledge representations that do not coincide semantically, which leads to interoperability problems in enterprise information systems that need to be addressed. Moreover, most organizations face other problems with their information systems, such as: 1) domain knowledge not being easily accessible to all stakeholders (even within the enterprise); 2) domain knowledge not being represented in a standard format; and 3) even when it is available in a standard format, it is not supported by semantic annotations or described using a common and understandable lexicon. This dissertation proposes an approach for establishing an enterprise reference lexicon from business models. It addresses the automation of information model mapping for the construction of the reference lexicon. It combines a formal and conceptual representation of the business domain with a clear definition of the lexicon used, to facilitate an overall understanding by all the stakeholders involved, including non-IT personnel.

Relevance:

100.00%

Publisher:

Abstract:

Driven by the growth of the internet and the semantic web, together with improvements in communication speed and the rapid growth of storage capacity, the volume of data and information rises considerably every day. Because of this, in the last few years there has been growing interest in formal representation structures with suitable characteristics, such as the ability to organize data and information and to reuse their contents to generate new knowledge. Controlled vocabularies, and specifically ontologies, stand out as one such representation structure with high potential. They allow not only for data representation but also for the reuse of such data for knowledge extraction, coupled with its subsequent storage through relatively simple formalisms. However, to ensure that the knowledge in an ontology is always up to date, ontologies need maintenance. Ontology Learning is the area that studies the update and maintenance of ontologies. The relevant literature already presents first results on the automatic maintenance of ontologies, but these are still at a very early stage. Human-based processes are still the current way to update and maintain an ontology, which makes this a cumbersome task. New knowledge for ontology growth can be generated using Data Mining techniques; Data Mining is an area that studies techniques for data processing, pattern discovery and knowledge extraction in IT systems. This work proposes a novel semi-automatic method for knowledge extraction from unstructured data sources using Data Mining techniques, namely pattern discovery, focused on improving the precision of the concepts and semantic relations present in an ontology. To verify the applicability of the proposed method, a proof of concept was developed and applied to the building and construction sector, and its results are presented.

Relevance:

100.00%

Publisher:

Abstract:

The interest in using information to improve the quality of living in large urban areas and the efficiency of their governance has been around for decades. Nevertheless, improvements in Information and Communications Technology have sparked a new dynamic in academic research, usually under the umbrella term of Smart Cities. The concept of a Smart City can probably be translated, in simplified form, into cities that are lived in, managed and developed in an information-saturated environment. While this makes perfect sense and the benefits of such a concept are easy to foresee, there are still several significant challenges to be tackled before this vision can materialize. In this work we aim to provide a small contribution in this direction, one that maximizes the relevance of the available information resources. One of the most detailed and geographically relevant information resources available for the study of cities is the census, more specifically the data available at block level (Subsecção Estatística). In this work, we use Self-Organizing Maps (SOM) and the Geo-SOM variant to explore the block-level data from the Portuguese census for the city of Lisbon, for the years 2001 and 2011. We focus on gauging change, proposing ways to compare the two time periods, which have two different underlying geographical bases. We proceed with the analysis of the data using different SOM variants, aiming to produce a two-fold portrait: one of the evolution of Lisbon during the first decade of the 21st century, and another of how the census dataset and SOMs can be used to produce an informational framework for the study of cities.

Relevance:

100.00%

Publisher:

Abstract:

Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies

Relevance:

100.00%

Publisher:

Abstract:

Project report presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management

Relevance:

100.00%

Publisher:

Abstract:

Dissertation presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management