797 resultados para educational data mining


Relevância:

80.00% 80.00%

Publicador:

Resumo:

As plataformas de e-Learning são cada vez mais utilizadas na educação à distância, facto que se encontra diretamente relacionado com a possibilidade de proporcionarem aos seus alunos a valência de poderem assistir a cursos em qualquer lugar. Dentro do âmbito das plataformas de e-Learning encontra-se um grupo especialmente interessante: as plataformas adaptativas, que tendem a substituir o professor (presencial) através de interatividade, variabilidade de conteúdos, automatização e capacidade para resolução de problemas e simulação de comportamentos educacionais. O projeto ADAPT (plataforma adaptativa de e-Learning) consiste na criação de uma destas plataformas, implementando tutoria inteligente, resolução de problemas com base em experiências passadas, algoritmos genéticos e link-mining. É na área de link-mining que surge o desenvolvimento desta dissertação que documenta o desenvolvimento de quatro módulos distintos: O primeiro módulo consiste num motor de busca para sugestão de conteúdos alternativos; o segundo módulo consiste na identificação de mudanças de estilo de aprendizagem; o terceiro módulo consiste numa plataforma de análise de dados que implementa várias técnicas de data mining e estatística para fornecer aos professores/tutores informações importantes que não seriam visíveis sem recurso a este tipo de técnicas; por fim, o último módulo consiste num sistema de recomendações que sugere aos alunos os artigos mais adequados com base nas consultas de alunos com perfis semelhantes. Esta tese documenta o desenvolvimento dos vários protótipos para cada um destes módulos. Os testes efetuados para cada módulo mostram que as metodologias utilizadas são válidas e viáveis.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Mode of access: Internet.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

As with all new ideas, the concept of Open Innovation requires extensive empirical investigation, testing and development. This paper analyzes Procter and Gamble's 'Connect and Develop' strategy as a case study of the major organizational and technological changes associated with open innovation. It argues that although some of the organizational changes accompanying open innovation are beginning to be described in the literature, more analysis is warranted into the ways technological changes have facilitated open innovation strategies, particularly related to new product development. Information and communications technologies enable the exchange of distributed sources of information in the open innovation process. The case study shows that furthermore a suite of new technologies for data mining, simulation, prototyping and visual representation, what we call 'innovation technology', help to support open innovation in Procter and Gamble. The paper concludes with a suggested research agenda for furthering understanding of the role played by and consequences of this technology.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Sharing data among organizations often leads to mutual benefit. Recent technology in data mining has enabled efficient extraction of knowledge from large databases. This, however, increases risks of disclosing the sensitive knowledge when the database is released to other parties. To address this privacy issue, one may sanitize the original database so that the sensitive knowledge is hidden. The challenge is to minimize the side effect on the quality of the sanitized database so that nonsensitive knowledge can still be mined. In this paper, we study such a problem in the context of hiding sensitive frequent itemsets by judiciously modifying the transactions in the database. To preserve the non-sensitive frequent itemsets, we propose a border-based approach to efficiently evaluate the impact of any modification to the database during the hiding process. The quality of database can be well maintained by greedily selecting the modifications with minimal side effect. Experiments results are also reported to show the effectiveness of the proposed approach. © 2005 IEEE

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The design, development, and use of complex systems models raises a unique class of challenges and potential pitfalls, many of which are commonly recurring problems. Over time, researchers gain experience in this form of modeling, choosing algorithms, techniques, and frameworks that improve the quality, confidence level, and speed of development of their models. This increasing collective experience of complex systems modellers is a resource that should be captured. Fields such as software engineering and architecture have benefited from the development of generic solutions to recurring problems, called patterns. Using pattern development techniques from these fields, insights from communities such as learning and information processing, data mining, bioinformatics, and agent-based modeling can be identified and captured. Collections of such 'pattern languages' would allow knowledge gained through experience to be readily accessible to less-experienced practitioners and to other domains. This paper proposes a methodology for capturing the wisdom of computational modelers by introducing example visualization patterns, and a pattern classification system for analyzing the relationship between micro and macro behaviour in complex systems models. We anticipate that a new field of complex systems patterns will provide an invaluable resource for both practicing and future generations of modelers.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Spatial data mining recently emerges from a number of real applications, such as real-estate marketing, urban planning, weather forecasting, medical image analysis, road traffic accident analysis, etc. It demands for efficient solutions for many new, expensive, and complicated problems. In this paper, we investigate the problem of evaluating the top k distinguished “features” for a “cluster” based on weighted proximity relationships between the cluster and features. We measure proximity in an average fashion to address possible nonuniform data distribution in a cluster. Combining a standard multi-step paradigm with new lower and upper proximity bounds, we presented an efficient algorithm to solve the problem. The algorithm is implemented in several different modes. Our experiment results not only give a comparison among them but also illustrate the efficiency of the algorithm.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A major task of traditional temporal event sequence mining is to find all frequent event patterns from a long temporal sequence. In many real applications, however, events are often grouped into different types, and not all types are of equal importance. In this paper, we consider the problem of efficient mining of temporal event sequences which lead to an instance of a specific type of event. Temporal constraints are used to ensure sensibility of the mining results. We will first generalise and formalise the problem of event-oriented temporal sequence data mining. After discussing some unique issues in this new problem, we give a set of criteria, which are adapted from traditional data mining techniques, to measure the quality of patterns to be discovered. Finally we present an algorithm to discover potentially interesting patterns.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Pattern discovery in temporal event sequences is of great importance in many application domains, such as telecommunication network fault analysis. In reality, not every type of event has an accurate timestamp. Some of them, defined as inaccurate events may only have an interval as possible time of occurrence. The existence of inaccurate events may cause uncertainty in event ordering. The traditional support model cannot deal with this uncertainty, which would cause some interesting patterns to be missing. A new concept, precise support, is introduced to evaluate the probability of a pattern contained in a sequence. Based on this new metric, we define the uncertainty model and present an algorithm to discover interesting patterns in the sequence database that has one type of inaccurate event. In our model, the number of types of inaccurate events can be extended to k readily, however, at a cost of increasing computational complexity.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Esta dissertação visa apresentar o mapeamento do uso das teorias de sistemas de informações, usando técnicas de recuperação de informação e metodologias de mineração de dados e textos. As teorias abordadas foram Economia de Custos de Transações (Transactions Costs Economics TCE), Visão Baseada em Recursos da Firma (Resource-Based View-RBV) e Teoria Institucional (Institutional Theory-IT), sendo escolhidas por serem teorias de grande relevância para estudos de alocação de investimentos e implementação em sistemas de informação, tendo como base de dados o conteúdo textual (em inglês) do resumo e da revisão teórica dos artigos dos periódicos Information System Research (ISR), Management Information Systems Quarterly (MISQ) e Journal of Management Information Systems (JMIS) no período de 2000 a 2008. Os resultados advindos da técnica de mineração textual aliada à mineração de dados foram comparadas com a ferramenta de busca avançada EBSCO e demonstraram uma eficiência maior na identificação de conteúdo. Os artigos fundamentados nas três teorias representaram 10% do total de artigos dos três períodicos e o período mais profícuo de publicação foi o de 2001 e 2007.(AU)

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Purpose – Academic writing is often considered to be a weakness in contemporary students, while good reporting and writing skills are highly valued by graduate employers. A number of universities have introduced writing centres aimed at addressing this problem; however, the evaluation of such centres is usually qualitative. The paper seeks to consider the efficacy of a writing centre by looking at the impact of attendance on two “real world” quantitative outcomes – achievement and progression. Design/methodology/approach – Data mining was used to obtain records of 806 first-year students, of whom 45 had attended the writing centre and 761 had not. Findings – A highly significant association between writing centre attendance and achievement was found. Progression to year two was also significantly associated with writing centre attendance. Originality/value – Further, quantitative evaluation of writing centres is advocated using random allocation to a comparison condition to control for potential confounds such as motivation.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In the last two decades there have been substantial developments in the mathematical theory of inverse optimization problems, and their applications have expanded greatly. In parallel, time series analysis and forecasting have become increasingly important in various fields of research such as data mining, economics, business, engineering, medicine, politics, and many others. Despite the large uses of linear programming in forecasting models there is no a single application of inverse optimization reported in the forecasting literature when the time series data is available. Thus the goal of this paper is to introduce inverse optimization into forecasting field, and to provide a streamlined approach to time series analysis and forecasting using inverse linear programming. An application has been used to demonstrate the use of inverse forecasting developed in this study. © 2007 Elsevier Ltd. All rights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Customer Base Analysis is perhaps the first stage of analysis in customer value, aiming to predict purchase frequency and customer lifecycle. An important part of the customer purchase frequency and its retention has to do with the service upgrade. Many models have tried to predict purchase frequency as well as upgrading. The comparison of these models seems important to provide academics with a picture of the current situation. The purpose of this research is to evaluate how models can predict service upgrade among a customer database of an online DVD rental company and suggest an alternative based on data mining techniques and data on historical transactions.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This thesis introduces a flexible visual data exploration framework which combines advanced projection algorithms from the machine learning domain with visual representation techniques developed in the information visualisation domain to help a user to explore and understand effectively large multi-dimensional datasets. The advantage of such a framework to other techniques currently available to the domain experts is that the user is directly involved in the data mining process and advanced machine learning algorithms are employed for better projection. A hierarchical visualisation model guided by a domain expert allows them to obtain an informed segmentation of the input space. Two other components of this thesis exploit properties of these principled probabilistic projection algorithms to develop a guided mixture of local experts algorithm which provides robust prediction and a model to estimate feature saliency simultaneously with the training of a projection algorithm.Local models are useful since a single global model cannot capture the full variability of a heterogeneous data space such as the chemical space. Probabilistic hierarchical visualisation techniques provide an effective soft segmentation of an input space by a visualisation hierarchy whose leaf nodes represent different regions of the input space. We use this soft segmentation to develop a guided mixture of local experts (GME) algorithm which is appropriate for the heterogeneous datasets found in chemoinformatics problems. Moreover, in this approach the domain experts are more involved in the model development process which is suitable for an intuition and domain knowledge driven task such as drug discovery. We also derive a generative topographic mapping (GTM) based data visualisation approach which estimates feature saliency simultaneously with the training of a visualisation model.