17 resultados para Methods: Data Analysis
em Instituto Politécnico do Porto, Portugal
Resumo:
This document presents a tool able to automatically gather data provided by real energy markets and to generate scenarios, capture and improve market players’ profiles and strategies by using knowledge discovery processes in databases supported by artificial intelligence techniques, data mining algorithms and machine learning methods. It provides the means for generating scenarios with different dimensions and characteristics, ensuring the representation of real and adapted markets, and their participating entities. The scenarios generator module enhances the MASCEM (Multi-Agent Simulator of Competitive Electricity Markets) simulator, endowing a more effective tool for decision support. The achievements from the implementation of the proposed module enables researchers and electricity markets’ participating entities to analyze data, create real scenarios and make experiments with them. On the other hand, applying knowledge discovery techniques to real data also allows the improvement of MASCEM agents’ profiles and strategies resulting in a better representation of real market players’ behavior. This work aims to improve the comprehension of electricity markets and the interactions among the involved entities through adequate multi-agent simulation.
Resumo:
Controlled fires in forest areas are frequently used in most Mediterranean countries as a preventive technique to avoid severe wildfires in summer season. In Portugal, this forest management method of fuel mass availability is also used and has shown to be beneficial as annual statistical reports confirm that the decrease of wildfires occurrence have a direct relationship with the controlled fire practice. However prescribed fire can have serious side effects in some forest soil properties. This work shows the changes that occurred in some forest soils properties after a prescribed fire action. The experiments were carried out in soil cover over a natural site of Andaluzitic schist, in Gramelas, Caminha, Portugal, that had not been burn for four years. The composed soil samples were collected from five plots at three different layers (0-3cm, 3-6cm and 6-18cm) during a three-year monitoring period after the prescribed burning. Principal Component Analysis was used to reach the presented conclusions.
Resumo:
The industrial activity is inevitably associated with a certain degradation of the environmental quality, because is not possible to guarantee that a manufacturing process can be totally innocuous. The eco-efficiency concept is globally accepted as a philosophy of entreprise management, that encourages the companies to become more competitive, innovative and environmentally responsible by promoting the link between its companies objectives for excellence and its objectives of environmental excellence issues. This link imposes the creation of an organizational methodology where the performance of the company is concordant with the sustainable development. The main propose of this project is to apply the concept of eco-efficiency to the particular case of the metallurgical and metal workshop industries through the development of the particular indicators needed and to produce a manual of procedures for implementation of the accurate solution.
Resumo:
This paper presents the creation and development of technological schools directly linked to the business community and to higher public education. Establishing themselves as the key interface between the two sectors they make a signigicant contribution by having a greater competitive edge when faced with increasing competition in the tradional markets. The development of new business strategies supported by references of excellence, quality and competitiveness also provides a good link between the estalishment of partnerships aiming at the qualification of education boards at a medium level between the technological school and higher education with a technological foundation. We present a case study as an example depicting the success of Escola Tecnológica de Vale de Cambra.
Resumo:
Objectives : The purpose of this article is to find out differences between surveys using paper and online questionnaires. The author has deep knowledge in the case of questions concerning opinions in the development of survey based research, e.g. the limits of postal and online questionnaires. Methods : In the physician studies carried out in 1995 (doctors graduated in 1982-1991), 2000 (doctors graduated in 1982-1996), 2005 (doctors graduated in 1982-2001), 2011 (doctors graduated in 1977-2006) and 457 family doctors in 2000, were used paper and online questionnaires. The response rates were 64%, 68%, 64%, 49% and 73%, respectively. Results : The results of the physician studies showed that there were differences between methods. These differences were connected with using paper-based questionnaire and online questionnaire and response rate. The online-based survey gave a lower response rate than the postal survey. The major advantages of online survey were short response time; very low financial resource needs and data were directly loaded in the data analysis software, thus saved time and resources associated with the data entry process. Conclusions : The current article helps researchers with planning the study design and choosing of the right data collection method.
Resumo:
Seismic data is difficult to analyze and classical mathematical tools reveal strong limitations in exposing hidden relationships between earthquakes. In this paper, we study earthquake phenomena in the perspective of complex systems. Global seismic data, covering the period from 1962 up to 2011 is analyzed. The events, characterized by their magnitude, geographic location and time of occurrence, are divided into groups, either according to the Flinn-Engdahl (F-E) seismic regions of Earth or using a rectangular grid based in latitude and longitude coordinates. Two methods of analysis are considered and compared in this study. In a first method, the distributions of magnitudes are approximated by Gutenberg-Richter (G-R) distributions and the parameters used to reveal the relationships among regions. In the second method, the mutual information is calculated and adopted as a measure of similarity between regions. In both cases, using clustering analysis, visualization maps are generated, providing an intuitive and useful representation of the complex relationships that are present among seismic data. Such relationships might not be perceived on classical geographic maps. Therefore, the generated charts are a valid alternative to other visualization tools, for understanding the global behavior of earthquakes.
Resumo:
Beyond the classical statistical approaches (determination of basic statistics, regression analysis, ANOVA, etc.) a new set of applications of different statistical techniques has increasingly gained relevance in the analysis, processing and interpretation of data concerning the characteristics of forest soils. This is possible to be seen in some of the recent publications in the context of Multivariate Statistics. These new methods require additional care that is not always included or refered in some approaches. In the particular case of geostatistical data applications it is necessary, besides to geo-reference all the data acquisition, to collect the samples in regular grids and in sufficient quantity so that the variograms can reflect the spatial distribution of soil properties in a representative manner. In the case of the great majority of Multivariate Statistics techniques (Principal Component Analysis, Correspondence Analysis, Cluster Analysis, etc.) despite the fact they do not require in most cases the assumption of normal distribution, they however need a proper and rigorous strategy for its utilization. In this work, some reflections about these methodologies and, in particular, about the main constraints that often occur during the information collecting process and about the various linking possibilities of these different techniques will be presented. At the end, illustrations of some particular cases of the applications of these statistical methods will also be presented.
Resumo:
Complex industrial plants exhibit multiple interactions among smaller parts and with human operators. Failure in one part can propagate across subsystem boundaries causing a serious disaster. This paper analyzes the industrial accident data series in the perspective of dynamical systems. First, we process real world data and show that the statistics of the number of fatalities reveal features that are well described by power law (PL) distributions. For early years, the data reveal double PL behavior, while, for more recent time periods, a single PL fits better into the experimental data. Second, we analyze the entropy of the data series statistics over time. Third, we use the Kullback–Leibler divergence to compare the empirical data and multidimensional scaling (MDS) techniques for data analysis and visualization. Entropy-based analysis is adopted to assess complexity, having the advantage of yielding a single parameter to express relationships between the data. The classical and the generalized (fractional) entropy and Kullback–Leibler divergence are used. The generalized measures allow a clear identification of patterns embedded in the data.
Resumo:
Trabalho de Projeto apresentado ao Instituto Superior de Contabilidade e Administração do Porto para obtenção do grau de Mestre em Auditoria Orientado por: Doutora Alcina Augusta de Sena Portugal Dias
Resumo:
Catastrophic events, such as wars and terrorist attacks, tornadoes and hurricanes, earthquakes, tsunamis, floods and landslides, are always accompanied by a large number of casualties. The size distribution of these casualties has separately been shown to follow approximate power law (PL) distributions. In this paper, we analyze the statistical distributions of the number of victims of catastrophic phenomena, in particular, terrorism, and find double PL behavior. This means that the data sets are better approximated by two PLs instead of a single one. We plot the PL parameters, corresponding to several events, and observe an interesting pattern in the charts, where the lines that connect each pair of points defining the double PLs are almost parallel to each other. A complementary data analysis is performed by means of the computation of the entropy. The results reveal relationships hidden in the data that may trigger a future comprehensive explanation of this type of phenomena.
Resumo:
This study analysed 22 strawberry and soil samples after their collection over the course of 2 years to compare the residue profiles from organic farming with integrated pest management practices in Portugal. For sample preparation, we used the citrate-buffered version of the quick, easy, cheap, effective, rugged, and safe (QuEChERS) method. We applied three different methods for analysis: (1) 27 pesticides were targeted using LC-MS/MS; (2) 143 were targeted using low pressure GC-tandem mass spectrometry (LP-GC-MS/MS); and (3) more than 600 pesticides were screened in a targeted and untargeted approach using comprehensive, two-dimensional gas chromatography time-of-flight mass spectrometry (GC × GC-TOF-MS). Comparison was made of the analyses using the different methods for the shared samples. The results were similar, thereby providing satisfactory confirmation of both similarly positive and negative findings. No pesticides were found in the organic-farmed samples. In samples from integrated pest management practices, nine pesticides were determined and confirmed to be present, ranging from 2 μg kg−1 for fluazifop-pbutyl to 50 μg kg−1 for fenpropathrin. Concentrations of residues in strawberries were less than European maximum residue limits.
Resumo:
Mestrado em Engenharia Informática - Área de Especialização em Tecnologias do Conhecimento e Decisão
Resumo:
This paper presents the Realistic Scenarios Generator (RealScen), a tool that processes data from real electricity markets to generate realistic scenarios that enable the modeling of electricity market players’ characteristics and strategic behavior. The proposed tool provides significant advantages to the decision making process in an electricity market environment, especially when coupled with a multi-agent electricity markets simulator. The generation of realistic scenarios is performed using mechanisms for intelligent data analysis, which are based on artificial intelligence and data mining algorithms. These techniques allow the study of realistic scenarios, adapted to the existing markets, and improve the representation of market entities as software agents, enabling a detailed modeling of their profiles and strategies. This work contributes significantly to the understanding of the interactions between the entities acting in electricity markets by increasing the capability and realism of market simulations.
Resumo:
Harnessing idle PCs CPU cycles, storage space and other resources of networked computers to collaborative are mainly fixated on for all major grid computing research projects. Most of the university computers labs are occupied with the high puissant desktop PC nowadays. It is plausible to notice that most of the time machines are lying idle or wasting their computing power without utilizing in felicitous ways. However, for intricate quandaries and for analyzing astronomically immense amounts of data, sizably voluminous computational resources are required. For such quandaries, one may run the analysis algorithms in very puissant and expensive computers, which reduces the number of users that can afford such data analysis tasks. Instead of utilizing single expensive machines, distributed computing systems, offers the possibility of utilizing a set of much less expensive machines to do the same task. BOINC and Condor projects have been prosperously utilized for solving authentic scientific research works around the world at a low cost. In this work the main goal is to explore both distributed computing to implement, Condor and BOINC, and utilize their potency to harness the ideal PCs resources for the academic researchers to utilize in their research work. In this thesis, Data mining tasks have been performed in implementation of several machine learning algorithms on the distributed computing environment.
Resumo:
O presente documento de dissertação retrata o desenvolvimento do projeto PDS-Portal Institucional cujo cerne é um sistema para recolha, armazenamento e análise de dados (plataforma de Business Intelligence). Este portal está enquadrado na área da saúde e é uma peça fundamental no sistema da Plataforma de dados da Saúde, que é constituído por quatro portais distintos. Esta plataforma tem como base um sistema totalmente centrado no utente, que agrega dados de saúde dos utentes e distribui pelos diversos intervenientes: utente, profissionais de saúde nacionais e internacionais e organizações de saúde. O objetivo principal deste projeto é o desenvolvimento do PDS-Portal Institucional, recorrendo a uma plataforma de Business Intelligence, com o intuito de potenciar os utilizadores de uma ferramenta analítica para análise de dados. Estando a informação armazenada em dois dos portais da Plataforma de dados da Saúde (PDS-Portal Utente e PDS-Portal Profissional), é necessário modular um armazém de dados que agregue a informação de ambos e, através do PDS-PI, distribua um conjunto de análises ao utilizador final. Para tal este sistema comtempla um mecanismo totalmente automatizado para extração, tratamento e carregamento de dados para o armazém central, assim como uma plataforma de BI que disponibiliza os dados armazenados sobre a forma de análises específicas. Esta plataforma permite uma evolução constante e é extremamente flexível, pois fornece um mecanismo de gestão de utilizadores e perfis, assim como capacita o utilizador de um ambiente Web para análise de dados, permitindo a partilha e acesso a partir de dispositivos móveis. Após a implementação deste sistema foi possível explorar os dados e tirar diversas conclusões que são de extrema importância tanto para a evolução da PDS como para os métodos de praticar os cuidados de saúde em Portugal. Por fim são identificados alguns pontos de melhoria do sistema atual e delineada uma perspetiva de evolução futura. É certo que a partir do momento que este projeto seja lançado para produção, novas oportunidades surgirão e o contributo dos utilizadores será útil para evoluir o sistema progressivamente.