8 resultados para Information Mining

em Indian Institute of Science - Bangalore - Índia


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Land cover (LC) changes play a major role in global as well as at regional scale patterns of the climate and biogeochemistry of the Earth system. LC information presents critical insights in understanding of Earth surface phenomena, particularly useful when obtained synoptically from remote sensing data. However, for developing countries and those with large geographical extent, regular LC mapping is prohibitive with data from commercial sensors (high cost factor) of limited spatial coverage (low temporal resolution and band swath). In this context, free MODIS data with good spectro-temporal resolution meet the purpose. LC mapping from these data has continuously evolved with advances in classification algorithms. This paper presents a comparative study of two robust data mining techniques, the multilayer perceptron (MLP) and decision tree (DT) on different products of MODIS data corresponding to Kolar district, Karnataka, India. The MODIS classified images when compared at three different spatial scales (at district level, taluk level and pixel level) shows that MLP based classification on minimum noise fraction components on MODIS 36 bands provide the most accurate LC mapping with 86% accuracy, while DT on MODIS 36 bands principal components leads to less accurate classification (69%).

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper primarily intends to develop a GIS (geographical information system)-based data mining approach for optimally selecting the locations and determining installed capacities for setting up distributed biomass power generation systems in the context of decentralized energy planning for rural regions. The optimal locations within a cluster of villages are obtained by matching the installed capacity needed with the demand for power, minimizing the cost of transportation of biomass from dispersed sources to power generation system, and cost of distribution of electricity from the power generation system to demand centers or villages. The methodology was validated by using it for developing an optimal plan for implementing distributed biomass-based power systems for meeting the rural electricity needs of Tumkur district in India consisting of 2700 villages. The approach uses a k-medoid clustering algorithm to divide the total region into clusters of villages and locate biomass power generation systems at the medoids. The optimal value of k is determined iteratively by running the algorithm for the entire search space for different values of k along with demand-supply matching constraints. The optimal value of the k is chosen such that it minimizes the total cost of system installation, costs of transportation of biomass, and transmission and distribution. A smaller region, consisting of 293 villages was selected to study the sensitivity of the results to varying demand and supply parameters. The results of clustering are represented on a GIS map for the region.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Data mining involves nontrivial process of extracting knowledge or patterns from large databases. Genetic Algorithms are efficient and robust searching and optimization methods that are used in data mining. In this paper we propose a Self-Adaptive Migration Model GA (SAMGA), where parameters of population size, the number of points of crossover and mutation rate for each population are adaptively fixed. Further, the migration of individuals between populations is decided dynamically. This paper gives a mathematical schema analysis of the method stating and showing that the algorithm exploits previously discovered knowledge for a more focused and concentrated search of heuristically high yielding regions while simultaneously performing a highly explorative search on the other regions of the search space. The effective performance of the algorithm is then shown using standard testbed functions and a set of actual classification datamining problems. Michigan style of classifier was used to build the classifier and the system was tested with machine learning databases of Pima Indian Diabetes database, Wisconsin Breast Cancer database and few others. The performance of our algorithm is better than others.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

With the emergence of large-volume and high-speed streaming data, the recent techniques for stream mining of CFIpsilas (closed frequent itemsets) will become inefficient. When concept drift occurs at a slow rate in high speed data streams, the rate of change of information across different sliding windows will be negligible. So, the user wonpsilat be devoid of change in information if we slide window by multiple transactions at a time. Therefore, we propose a novel approach for mining CFIpsilas cumulatively by making sliding width(ges1) over high speed data streams. However, it is nontrivial to mine CFIpsilas cumulatively over stream, because such growth may lead to the generation of exponential number of candidates for closure checking. In this study, we develop an efficient algorithm, stream-close, for mining CFIpsilas over stream by exploring some interesting properties. Our performance study reveals that stream-close achieves good scalability and has promising results.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Rapid urbanisation in India has posed serious challenges to the decision makers in regional planning involving plethora of issues including provision of basic amenities (like electricity, water, sanitation, transport, etc.). Urban planning entails an understanding of landscape and urban dynamics with causal factors. Identifying, delineating and mapping landscapes on temporal scale provide an opportunity to monitor the changes, which is important for natural resource management and sustainable planning activities. Multi-source, multi-sensor, multi-temporal, multi-frequency or multi-polarization remote sensing data with efficient classification algorithms and pattern recognition techniques aid in capturing these dynamics. This paper analyses the landscape dynamics of Greater Bangalore by: (i) characterisation of direct impervious surface, (ii) computation of forest fragmentation indices and (iii) modeling to quantify and categorise urban changes. Linear unmixing is used for solving the mixed pixel problem of coarse resolution super spectral MODIS data for impervious surface characterisation. Fragmentation indices were used to classify forests – interior, perforated, edge, transitional, patch and undetermined. Based on this, urban growth model was developed to determine the type of urban growth – Infill, Expansion and Outlying growth. This helped in visualising urban growth poles and consequence of earlier policy decisions that can help in evolving strategies for effective land use policies.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Mining association rules from a large collection of databases is based on two main tasks. One is generation of large itemsets; and the other is finding associations between the discovered large itemsets. Existing formalism for association rules are based on a single transaction database which is not sufficient to describe the association rules based on multiple database environment. In this paper, we give a general characterization of association rules and also give a framework for knowledge-based mining of multiple databases for association rules.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The disclosure of information and its misuse in Privacy Preserving Data Mining (PPDM) systems is a concern to the parties involved. In PPDM systems data is available amongst multiple parties collaborating to achieve cumulative mining accuracy. The vertically partitioned data available with the parties involved cannot provide accurate mining results when compared to the collaborative mining results. To overcome the privacy issue in data disclosure this paper describes a Key Distribution-Less Privacy Preserving Data Mining (KDLPPDM) system in which the publication of local association rules generated by the parties is published. The association rules are securely combined to form the combined rule set using the Commutative RSA algorithm. The combined rule sets established are used to classify or mine the data. The results discussed in this paper compare the accuracy of the rules generated using the C4. 5 based KDLPPDM system and the CS. 0 based KDLPPDM system using receiver operating characteristics curves (ROC).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Online Social Networks (OSNs) facilitate to create and spread information easily and rapidly, influencing others to participate and propagandize. This work proposes a novel method of profiling Influential Blogger (IB) based on the activities performed on one's blog documents who influences various other bloggers in Social Blog Network (SBN). After constructing a social blogging site, a SBN is analyzed with appropriate parameters to get the Influential Blog Power (IBP) of each blogger in the network and demonstrate that profiling IB is adequate and accurate. The proposed Profiling Influential Blogger (PIB) Algorithm survival rate of IB is high and stable. (C) 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).