40 resultados para Web Mining, Data Mining, User Topic Model, Web User Profiles
em Instituto Politécnico do Porto, Portugal
Resumo:
This paper consists in the characterization of medium voltage (MV) electric power consumers based on a data clustering approach. It is intended to identify typical load profiles by selecting the best partition of a power consumption database among a pool of data partitions produced by several clustering algorithms. The best partition is selected using several cluster validity indices. These methods are intended to be used in a smart grid environment to extract useful knowledge about customers’ behavior. The data-mining-based methodology presented throughout the paper consists in several steps, namely the pre-processing data phase, clustering algorithms application and the evaluation of the quality of the partitions. To validate our approach, a case study with a real database of 1.022 MV consumers was used.
Resumo:
Para obtenção do grau de Doutor pela Universidade de Vigo com menção internacional Departamento de Informática
Resumo:
The performance of the Weather Research and Forecast (WRF) model in wind simulation was evaluated under different numerical and physical options for an area of Portugal, located in complex terrain and characterized by its significant wind energy resource. The grid nudging and integration time of the simulations were the tested numerical options. Since the goal is to simulate the near-surface wind, the physical parameterization schemes regarding the boundary layer were the ones under evaluation. Also, the influences of the local terrain complexity and simulation domain resolution on the model results were also studied. Data from three wind measuring stations located within the chosen area were compared with the model results, in terms of Root Mean Square Error, Standard Deviation Error and Bias. Wind speed histograms, occurrences and energy wind roses were also used for model evaluation. Globally, the model accurately reproduced the local wind regime, despite a significant underestimation of the wind speed. The wind direction is reasonably simulated by the model especially in wind regimes where there is a clear dominant sector, but in the presence of low wind speeds the characterization of the wind direction (observed and simulated) is very subjective and led to higher deviations between simulations and observations. Within the tested options, results show that the use of grid nudging in simulations that should not exceed an integration time of 2 days is the best numerical configuration, and the parameterization set composed by the physical schemes MM5–Yonsei University–Noah are the most suitable for this site. Results were poorer in sites with higher terrain complexity, mainly due to limitations of the terrain data supplied to the model. The increase of the simulation domain resolution alone is not enough to significantly improve the model performance. Results suggest that error minimization in the wind simulation can be achieved by testing and choosing a suitable numerical and physical configuration for the region of interest together with the use of high resolution terrain data, if available.
Resumo:
Este trabalho foi realizado no âmbito da disciplina de Dissertação/Estágio do ramo de Optimização Energética na Indústria Química, do Mestrado em Engenharia Química do Instituto Superior de Engenharia do Porto e foi desenvolvido na empresa GreenWatt. O principal objectivo é efectuar uma auditoria energética e uma auditoria QAI a uma clínica de fisiatria de forma a preparar as ferramentas necessárias para a Certificação Energética e da QAI no enquadramento do Sistema de Certificação Energética. Na auditoria QAI foram analisados parâmetros físicos - temperatura, humidade relativa e partículas respiráveis PM10, parâmetros químicos - CO2, CO, O3, COVs, HCOH e o radão, e ainda parâmetros microbiológicos - bactérias, fungos e legionella. Na auditoria energética foi feita a caracterização dos vectores de energia utilizados no edifício, nomeadamente, gás natural e electricidade. Para esta caracterização efectuou-se um levantamento de toda a informação disponível relativa aos combustíveis utilizados, iluminação instalada, outros equipamentos consumidores de energia e perfis de utilização. Com recurso a analisadores de energia foram ainda medidos os consumos eléctricos do edifício. Com suporte nos dados provenientes da auditoria energética e das facturas anuais efectuou-se a validação da simulação dinâmica do edifício. Esta simulação é a base do cálculo do IEEnominal do edifício. Os resultados da auditoria QAI, permitiram verificar que existem valores nãoregulamentares em relação aos compostos orgânicos voláteis, fungos e bactérias. Da auditoria energética concluiu-se que o principal consumo de energia é o gás natural utilizado pelas caldeiras existentes. Este valor representa cerca de 81% do consumo total de energia, reproduzindo os mesmos resultados obtidos pela desagregação das facturas energéticas. No que respeita à electricidade concluiu-se que as bombas de água e os equipamentos eléctricos são os maiores consumidores deste vector, com, respectivamente, 53% e 23% do consumo total de energia eléctrica. Após a realização da simulação dinâmica, com base nos levantamentos realizados no edifício e na auditoria energética efectuada, obteve-se uma fotografia do edifício no que respeita ao seu desempenho energético, e calculou-se um IEEnominal de 40,54 kgep/m2.ano o que qualifica o edifício com uma Classe Energética E. O valor de CO2 emitido por este edifício em termos nominais, anualmente, é de 76,39 toneladas.
Resumo:
E-Learning frameworks are conceptual tools to organize networks of elearning services. Most frameworks cover areas that go beyond the scope of e-learning, from course to financial management, and neglects the typical activities in everyday life of teachers and students at schools such as the creation, delivery, resolution and evaluation of assignments. This paper presents the Ensemble framework - an e-learning framework exclusively focused on the teaching-learning process through the coordination of pedagogical services. The framework presents an abstract data, integration and evaluation model based on content and communications specifications. These specifications must base the implementation of networks in specialized domains with complex evaluations. In this paper we specialize the framework for two domains with complex evaluation: computer programming and computer-aided design (CAD). For each domain we highlight two Ensemble hotspots: data and evaluations procedures. In the former we formally describe the exercise and present possible extensions. In the latter, we describe the automatic evaluation procedures.
Resumo:
A new general fitting method based on the Self-Similar (SS) organization of random sequences is presented. The proposed analytical function helps to fit the response of many complex systems when their recorded data form a self-similar curve. The verified SS principle opens new possibilities for the fitting of economical, meteorological and other complex data when the mathematical model is absent but the reduced description in terms of some universal set of the fitting parameters is necessary. This fitting function is verified on economical (price of a commodity versus time) and weather (the Earth’s mean temperature surface data versus time) and for these nontrivial cases it becomes possible to receive a very good fit of initial data set. The general conditions of application of this fitting method describing the response of many complex systems and the forecast possibilities are discussed.
Resumo:
This contribution presents novel concepts for analysis of pressure–volume curves, which offer information about the time domain dynamics of the respiratory system. The aim is to verify whether a mapping of the respiratory diseases can be obtained, allowing analysis of (dis)similarities between the dynamical pattern in the breathing in children. The groups investigated here are children, diagnosed as healthy, asthmatic, and cystic fibrosis. The pressure–volume curves have been measured by means of the noninvasive forced oscillation technique during breathing at rest. The geometrical fractal dimension is extracted from the pressure–volume curves and a power-law behavior is observed in the data. The power-law model coefficients are identified from the three sets and the results show that significant differences are present between the groups. This conclusion supports the idea that the respiratory system changes with disease in terms of airway geometry, tissue parameters, leading in turn to variations in the fractal dimension of the respiratory tree and its dynamics.
Resumo:
Context-aware recommendation of personalised tourism resources is possible because of personal mobile devices and powerful data filtering algorithms. The devices contribute with computing capabilities, on board sensors, ubiquitous Internet access and continuous user monitoring, whereas the filtering algorithms provide the ability to match the profile (interests and the context) of the tourist against a large knowledge bases of tourism resources. While, in terms of technology, personal mobile devices can gather user-related information, including the user context and access multiple data sources, the creation and maintenance of an updated knowledge base of tourism-related resources requires a collaborative approach due to the heterogeneity, volume and dynamic nature of the resources. The current PhD thesis aims to contribute to the solution of this problem by adopting a Crowdsourcing approach for the collaborative maintenance of the knowledge base of resources, Trust and Reputation for the validation of uploaded resources as well as publishers, Big Data for user profiling and context-aware filtering algorithms for the personalised recommendation of tourism resources.
Resumo:
As technology is increasingly being seen as a facilitator to learning, open remote laboratories are increasingly available and in widespread use around the world. They provide some advantages over traditional hands-on labs or simulations. This paper presents the results of integrating the open remote laboratory VISIR into several courses, in various contexts and using various methodologies. These integrations, all related to higher education engineering, were designed by teachers with different perspectives to achieve a range of learning outcomes. The degree to which these VISIR-related outcomes were accomplished is discussed. The results reflect the levels of student engagement and learning and of teacher involvement. From the analysis, a connection between these two aspects was traced, although only related to the user profiles. VISIR is shown to be always of benefit for more motivated students, but this benefit can be maximized under particular conditions and characteristics.
Resumo:
This paper reports on the design and development of an Android-based context-aware system to support Erasmus students during their mobility in Porto. It enables: (i) guest users to create, rate and store personal points of interest (POI) in a private, local on board database; and (ii) authenticated users to upload and share POI as well as get and rate recommended POI from the shared central database. The system is a distributed client / server application. The server interacts with a central database that maintains the user profiles and the shared POI organized by category and rating. The Android GUI application works both as a standalone application and as a client module. In standalone mode, guest users have access to generic info, a map-based interface and a local database to store and retrieve personal POI. Upon successful authentication, users can, additionally, share POI as well as get and rate recommendations sorted by category, rating and distance-to-user.
Resumo:
O aumento de tecnologias disponíveis na Web favoreceu o aparecimento de diversas formas de informação, recursos e serviços. Este aumento aliado à constante necessidade de formação e evolução das pessoas, quer a nível pessoal como profissional, incentivou o desenvolvimento área de sistemas de hipermédia adaptativa educacional - SHAE. Estes sistemas têm a capacidade de adaptar o ensino consoante o modelo do aluno, características pessoais, necessidades, entre outros aspetos. Os SHAE permitiram introduzir mudanças relativamente à forma de ensino, passando do ensino tradicional que se restringia apenas ao uso de livros escolares até à utilização de ferramentas informáticas que através do acesso à internet disponibilizam material didático, privilegiando o ensino individualizado. Os SHAE geram grande volume de dados, informação contida no modelo do aluno e todos os dados relativos ao processo de aprendizagem de cada aluno. Facilmente estes dados são ignorados e não se procede a uma análise cuidada que permita melhorar o conhecimento do comportamento dos alunos durante o processo de ensino, alterando a forma de aprendizagem de acordo com o aluno e favorecendo a melhoria dos resultados obtidos. O objetivo deste trabalho foi selecionar e aplicar algumas técnicas de Data Mining a um SHAE, PCMAT - Mathematics Collaborative Educational System. A aplicação destas técnicas deram origem a modelos de dados que transformaram os dados em informações úteis e compreensíveis, essenciais para a geração de novos perfis de alunos, padrões de comportamento de alunos, regras de adaptação e pedagógicas. Neste trabalho foram criados alguns modelos de dados recorrendo à técnica de Data Mining de classificação, abordando diferentes algoritmos. Os resultados obtidos permitirão definir novas regras de adaptação e padrões de comportamento dos alunos, poderá melhorar o processo de aprendizagem disponível num SHAE.
Resumo:
Doctoral Thesis in Information Systems and Technologies Area of Engineering and Manag ement Information Systems
Resumo:
This paper describes a methodology that was developed for the classification of Medium Voltage (MV) electricity customers. Starting from a sample of data bases, resulting from a monitoring campaign, Data Mining (DM) techniques are used in order to discover a set of a MV consumer typical load profile and, therefore, to extract knowledge regarding to the electric energy consumption patterns. In first stage, it was applied several hierarchical clustering algorithms and compared the clustering performance among them using adequacy measures. In second stage, a classification model was developed in order to allow classifying new consumers in one of the obtained clusters that had resulted from the previously process. Finally, the interpretation of the discovered knowledge are presented and discussed.
Resumo:
In recent decades, all over the world, competition in the electric power sector has deeply changed the way this sector’s agents play their roles. In most countries, electric process deregulation was conducted in stages, beginning with the clients of higher voltage levels and with larger electricity consumption, and later extended to all electrical consumers. The sector liberalization and the operation of competitive electricity markets were expected to lower prices and improve quality of service, leading to greater consumer satisfaction. Transmission and distribution remain noncompetitive business areas, due to the large infrastructure investments required. However, the industry has yet to clearly establish the best business model for transmission in a competitive environment. After generation, the electricity needs to be delivered to the electrical system nodes where demand requires it, taking into consideration transmission constraints and electrical losses. If the amount of power flowing through a certain line is close to or surpasses the safety limits, then cheap but distant generation might have to be replaced by more expensive closer generation to reduce the exceeded power flows. In a congested area, the optimal price of electricity rises to the marginal cost of the local generation or to the level needed to ration demand to the amount of available electricity. Even without congestion, some power will be lost in the transmission system through heat dissipation, so prices reflect that it is more expensive to supply electricity at the far end of a heavily loaded line than close to an electric power generation. Locational marginal pricing (LMP), resulting from bidding competition, represents electrical and economical values at nodes or in areas that may provide economical indicator signals to the market agents. This article proposes a data-mining-based methodology that helps characterize zonal prices in real power transmission networks. To test our methodology, we used an LMP database from the California Independent System Operator for 2009 to identify economical zones. (CAISO is a nonprofit public benefit corporation charged with operating the majority of California’s high-voltage wholesale power grid.) To group the buses into typical classes that represent a set of buses with the approximate LMP value, we used two-step and k-means clustering algorithms. By analyzing the various LMP components, our goal was to extract knowledge to support the ISO in investment and network-expansion planning.
Resumo:
This paper presents a methodology supported on the data base knowledge discovery process (KDD), in order to find out the failure probability of electrical equipments’, which belong to a real electrical high voltage network. Data Mining (DM) techniques are used to discover a set of outcome failure probability and, therefore, to extract knowledge concerning to the unavailability of the electrical equipments such us power transformers and high-voltages power lines. The framework includes several steps, following the analysis of the real data base, the pre-processing data, the application of DM algorithms, and finally, the interpretation of the discovered knowledge. To validate the proposed methodology, a case study which includes real databases is used. This data have a heavy uncertainty due to climate conditions for this reason it was used fuzzy logic to determine the set of the electrical components failure probabilities in order to reestablish the service. The results reflect an interesting potential of this approach and encourage further research on the topic.