844 results for Data mining and knowledge discovery


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Distributed and collaborative data stream mining in a mobile computing environment is referred to as Pocket Data Mining (PDM). The large number of available data streams that smartphones can subscribe to or sense, coupled with the increasing computational power of handheld devices, motivates the development of PDM as a decision-making system. This emerging area of study was shown to be feasible in an earlier study using the technological enablers of mobile software agents and stream mining techniques [1]. A typical PDM process starts with mobile agents roaming the network to discover relevant data streams and resources. Other (mobile) agents encapsulating stream mining techniques then visit the relevant nodes in the network to build evolving data mining models. Finally, a third type of mobile agent roams the network, consulting the mining agents for a final collaborative decision when required by one or more users. In this paper, we propose the use of distributed Hoeffding trees and Naive Bayes classifiers in the PDM framework over vertically partitioned data streams. Mobile policing, health monitoring and stock market analysis are among the possible applications of PDM. An extensive experimental study is reported, showing the effectiveness of collaborative data mining with the two classifiers.
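The last step of the PDM process described above, in which the mining agents are consulted for a collaborative decision, can be illustrated with a weighted vote. This is only a minimal sketch: the abstract does not say how the agents' outputs are combined, so the weighted-majority rule, the function name `collaborative_decision`, and the use of each agent's estimated accuracy as its weight are assumptions made for illustration.

```python
from collections import defaultdict


def collaborative_decision(agent_votes):
    """Combine per-agent predictions into one label by weighted vote.

    agent_votes: list of (predicted_label, weight) pairs, one per mining
    agent. The weight is assumed (hypothetically) to be the agent's own
    estimate of its accuracy on the features it can see.
    """
    scores = defaultdict(float)
    for label, weight in agent_votes:
        scores[label] += weight
    # Return the label with the highest accumulated weight.
    return max(scores, key=scores.get)


# Three agents, each trained on a different vertical slice of the stream.
votes = [("buy", 0.6), ("sell", 0.9), ("buy", 0.7)]
print(collaborative_decision(votes))
```

With vertically partitioned data each agent sees only a subset of the attributes, so some form of combination across agents is needed; a weighted vote is one simple option among many.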

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article clarifies what was done with the sub-7-man positions in data-mining Harold van der Heijden's 'HHdbIV' database of chess studies prior to its publication. It emphasises that only positions in the main lines of studies were examined and that the information about uniqueness of move was not incorporated in HHdbIV. There is some reflection on the separate technical and artistic dimensions of study evaluation.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article explores the contribution that artisanal and small-scale mining (ASM) makes to poverty reduction in Tanzania, based on data on gold and diamond mining in Mwanza Region. The evidence suggests that people working in mining or related services are less likely to be in poverty than those with other occupations. However, the picture is complex: while mining income can help reduce poverty and provide a buffer from livelihood shocks, people's inability to obtain a formal mineral claim, or to exploit their claims effectively, contributes to insecurity. This is reinforced by a context in which ASM is peripheral to large-scale mining interests, is only gradually being addressed within national poverty reduction policies, and is segregated from district-level planning.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The third chapter, on data mining in education, examines the potential and constraints of using data mining in education, summarizing how it can offer meaningful support to students, teachers, tutors, authors, developers, researchers, and the education and training institutions in which they work and study.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This chapter introduces the latest practices and technologies in the interactive interpretation of environmental data. With environmental data becoming ever larger, more diverse and more complex, there is a need for a new generation of tools that provides new capabilities over and above those of the standard workhorses of science. These new tools aid the scientist in discovering interesting new features (and also problems) in large datasets by allowing the data to be explored interactively using simple, intuitive graphical tools. In this way, new discoveries are made that are commonly missed by automated batch data processing. This chapter discusses the characteristics of environmental science data, common current practice in data analysis and the supporting tools and infrastructure. New approaches are introduced and illustrated from the points of view of both the end user and the underlying technology. We conclude by speculating as to future developments in the field and what must be achieved to fulfil this vision.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Much recent research in SLA is guided by the hypothesis of L2 interface vulnerability (see Sorace 2005). This study contributes to this general project by examining the acquisition of two classes of subjunctive complement clauses in L2 Spanish: subjunctive complements of volitional predicates (purely syntactic) and subjunctive vs. indicative complements with negated epistemic matrix predicates, where the mood distinction is discourse dependent (thus involving the syntax-discourse interface). We provide an analysis of the volitional subjunctive in English and Spanish, suggesting that English learners of L2 Spanish need to access the functional projection Mood P and an uninterpretable modal feature on the Force head available to them from their formal English register grammar, and simultaneously must unacquire the structure of English for-to clauses. For negated epistemic predicates, our analysis maintains that they need to revalue the modal feature on the Force head from uninterpretable to interpretable within the L2 grammar. With others (e.g. Borgonovo & Prévost 2003; Borgonovo, Bruhn de Garavito & Prévost 2005) and in line with Sorace's (2000, 2003, 2005) notion of interface vulnerability, we maintain that the latter case is more difficult for L2 learners, which is borne out in the data we present. However, the data also show that the indicative/subjunctive distinction with negated epistemics can be acquired by advanced stages of acquisition, calling into question the notion of obligatory residual optionality for all properties that require the integration of syntactic and discourse information.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Exascale systems are the next frontier in high-performance computing and are expected to deliver a performance of the order of 10^18 operations per second using massive multicore processors. Very large- and extreme-scale parallel systems pose critical algorithmic challenges, especially related to concurrency, locality and the need to avoid global communication patterns. This work investigates a novel protocol for dynamic group communication that can be used to remove the global communication requirement and to reduce the communication cost in parallel formulations of iterative data mining algorithms. The protocol is used to provide a communication-efficient parallel formulation of the k-means algorithm for cluster analysis. The approach is based on a collective communication operation for dynamic groups of processes and exploits non-uniform data distributions. Non-uniform data distributions can be either found in real-world distributed applications or induced by means of multidimensional binary search trees. The analysis of the proposed dynamic group communication protocol has shown that it does not introduce significant communication overhead. The parallel clustering algorithm has also been extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.
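For context, the global communication requirement mentioned above can be seen in the conventional parallel formulation of k-means, sketched below. This is not the dynamic group protocol proposed in the work, whose details the abstract does not give; it is the standard baseline in which every process contributes per-cluster partial sums to a global reduction at each iteration. The function names and the use of a plain Python loop to stand in for an all-reduce across processes are assumptions for illustration.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((x - c) ** 2 for x, c in zip(point, centroids[j])),
    )


def local_stats(points, centroids):
    """Per-cluster coordinate sums and counts for one process's data."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        j = nearest(p, centroids)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts


def kmeans_step(partitions, centroids):
    """One k-means iteration over data split across several partitions.

    The loop over partitions stands in for a global reduction (assumed
    here; in an MPI program this would be an all-reduce of sums/counts).
    """
    k, dim = len(centroids), len(centroids[0])
    total_sums = [[0.0] * dim for _ in range(k)]
    total_counts = [0] * k
    for part in partitions:
        sums, counts = local_stats(part, centroids)
        for j in range(k):
            total_counts[j] += counts[j]
            for d in range(dim):
                total_sums[j][d] += sums[j][d]
    # Empty clusters keep their previous centroid.
    return [
        [s / total_counts[j] for s in total_sums[j]]
        if total_counts[j]
        else list(centroids[j])
        for j in range(k)
    ]
```

Because every process takes part in the reduction at every iteration, its cost grows with the number of processes; that cost is what a group-based scheme such as the one described above aims to reduce.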

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Twitter is both a micro-blogging service and a platform for public conversation. Direct conversation is facilitated in Twitter through the use of @’s (mentions) and replies. While the conversational element of Twitter is of particular interest to the marketing sector, relatively few data-mining studies have focused on this area. We analyse conversations associated with reciprocated mentions in a dataset of approximately 4 million tweets, each containing at least one mention, collected over a period of 28 days. We ignore tweet content and instead use the mention network structure and its dynamical properties to identify and characterise Twitter conversations between pairs of users and within larger groups. We consider conversational balance, meaning the fraction of content contributed by each party. The goal of this work is to draw out some of the mechanisms driving conversation in Twitter, with the potential aim of developing conversational models.
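The notion of conversational balance described above can be sketched for pairs of users as follows. The abstract does not give the exact definition used in the study, so this sketch assumes that "content contributed" is measured by the number of mention tweets each party sends to the other; the function name and the input format are likewise hypothetical.

```python
from collections import Counter


def conversational_balance(mentions):
    """Balance of each reciprocated pair in a mention network.

    mentions: iterable of (sender, recipient) pairs, one per tweet.
    Returns {(a, b): (share_a, share_b)} with a < b, where each share is
    the fraction of the pair's mention tweets sent by that user.
    Tweet content is ignored; only the mention structure is used.
    """
    counts = Counter(mentions)
    balance = {}
    for (a, b), n_ab in counts.items():
        n_ba = counts.get((b, a), 0)
        # Skip unreciprocated mentions and self-mentions, and visit each
        # reciprocated pair only once (in sorted order).
        if n_ba == 0 or a >= b:
            continue
        total = n_ab + n_ba
        balance[(a, b)] = (n_ab / total, n_ba / total)
    return balance


tweets = [("alice", "bob"), ("bob", "alice"), ("alice", "bob"), ("carol", "alice")]
print(conversational_balance(tweets))
```

A pair with shares close to (0.5, 0.5) is balanced under this definition, while a pair near (1, 0) is dominated by one party.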

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This work presents a case study of data mining in retail. The business in question sells furniture and building materials. Mining was performed on information generated from sales transactions over a period of 8 months. Customer registration data were also used and cross-referenced with sales information, with the aim of obtaining results that can be converted into actions that, in turn, generate profit for the company. All modelling, preparation and transformation of the data was carried out to facilitate the application of the mining techniques that data mining tools provide for knowledge discovery. The process is described in detail for a better understanding of the results obtained. The CRISP methodology used in the work is also discussed, taking into account the difficulties and conveniences that arose during the phases of the process of obtaining the results. The strengths and weaknesses of the mining tools used, IBM Intelligent Miner and WEKA (Waikato Environment for Knowledge Analysis), are also analysed, as are those of all the other software required to carry out the work. Finally, the results obtained are presented and discussed, together with the opinion of the company's owners on those results and the value each of them could add to the business.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Motivated by the development of a graphical representation of networks with a large number of vertices, useful for collaborative filtering applications, this work proposes the use of cohesion surfaces over a multidimensionally scaled thematic base. To this end, it uses a combination of classical multidimensional scaling and Procrustes analysis, in an iterative algorithm that produces partial solutions, which are then combined into a global solution. Applied to an example of book loan transactions from the Biblioteca Karl A. Boedecker, the proposed algorithm produces interpretable and thematically coherent outputs, and exhibits lower stress than the classical scaling solution.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Industrial companies in developing countries are facing rapid growth, which requires having the best organizational processes in place to cope with market demand. Sales forecasting, as a tool aligned with the general strategy of the company, needs to be as accurate as possible in order to achieve sales targets by making the right information available for purchasing, planning and control of production areas, and ultimately meeting the resulting demand in a timely and proper manner. The present dissertation uses a single case study of Maxam, the Brazil-based subsidiary of an international explosives company, which is experiencing high growth in sales and therefore faces the challenge of adapting its structure and processes to the rapid growth expected. Diverse sales forecasting techniques have been analyzed to compare the actual monthly sales forecast, based on the sales force representatives’ market knowledge, with forecasts based on the analysis of historical sales data. The dissertation findings show how a combined forecast, which brings together the sales workforce’s knowledge of client demand (qualitative) and time series analysis (quantitative), improves the accuracy of the company’s sales forecast.
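The idea of combining a judgmental forecast with a time-series forecast can be illustrated with a simple weighted average, evaluated by mean absolute percentage error (MAPE). This is a sketch under stated assumptions: the abstract does not specify the combination scheme or the accuracy measure used in the dissertation, so the function names, the fixed weight, and the numbers in the usage lines are all hypothetical.

```python
def combine(judgmental, statistical, weight=0.5):
    """Weighted average of two forecasts, period by period.

    weight is the share given to the judgmental (sales-force) forecast;
    the remainder goes to the statistical (time-series) forecast.
    """
    return [weight * j + (1 - weight) * s for j, s in zip(judgmental, statistical)]


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values non-zero)."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)


# Made-up monthly figures, for illustration only.
actual = [100.0, 110.0, 120.0]
sales_force = [110.0, 105.0, 130.0]
time_series = [95.0, 115.0, 112.0]
combined = combine(sales_force, time_series)
for name, f in [("sales force", sales_force), ("time series", time_series), ("combined", combined)]:
    print(name, round(mape(actual, f), 2))
```

Whether a combination beats its components depends on how the two sets of errors relate to each other, so the weight would normally be chosen on held-out data rather than fixed in advance.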

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The literature has emphasized that absorptive capacity (AC) leads to performance, but its influence in projects is still unclear. Additionally, project success is not well understood in the literature, and AC can be an important mechanism to explain it. Therefore, the purpose of this study is to investigate the effect of absorptive capacity on project performance in the construction industry of São Paulo State. We study this influence through the potential and realized absorptive capacity proposed by Zahra and George (2002). To achieve this goal, we use a combination of qualitative and quantitative research. The qualitative research is based on 15 interviews with project managers in different sectors, conducted to understand the main constructs and support the subsequent quantitative phase. Content analysis was the technique used to analyze those interviews. In the quantitative phase, we collected 157 responses from project managers in the construction sector through a survey questionnaire. Confirmatory factor analysis and hierarchical linear regression were the techniques used to assess the data. Our findings suggest that realized absorptive capacity has a positive influence on performance, but potential absorptive capacity and the interaction effect have no influence on performance. Moreover, planning and monitoring have a positive impact on budget, schedule and customer satisfaction, while risk coping capacity has a positive impact on business success. In academic terms, this research enables a better understanding of the importance of absorptive capacity in the construction industry and confirms that knowledge application in processes and routines enhances performance. For management, absorptive capacity enables the improvement of internal capabilities, reflected in increased project management efficiency. Indeed, when a company manages project practices efficiently it enhances business and project performance; however, it first needs to improve its internal abilities to enrich processes and routines through relevant knowledge.