807 results for Link Mining


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Data mining is one of the most active research areas today, with a wide variety of applications in everyday life. It is about finding interesting hidden patterns in large historical databases. For example, from a sales database one can find a pattern such as “people who buy magazines also tend to buy newspapers”. From a sales point of view, the advantage is that these items can be placed together in the shop to increase sales. In this research work, data mining is applied to the domain of placement chance prediction, since making a wise career decision is crucial for everyone. In India, technical manpower analysis is carried out by an organization named the National Technical Manpower Information System (NTMIS), established in 1983-84 by India's Ministry of Education & Culture. The NTMIS comprises a lead centre at the IAMR, New Delhi, and 21 nodal centres located in different parts of the country. The Kerala State Nodal Centre is located at Cochin University of Science and Technology. The nodal centre collects placement information by regularly sending postal questionnaires to graduates. From this raw data, a historical database was prepared. Each record in this database includes entrance rank range, reservation, sector, sex, and a particular engineering branch. For each such combination of attributes, the corresponding placement chance is computed and stored in the database. From this data, various popular data mining models are built and tested. These models can be used to predict the most suitable branch for a new student with one of the above combinations of criteria. 
A detailed performance comparison of the various data mining models is also carried out. This research work proposes using a combination of data mining models, namely a hybrid stacking ensemble, for better predictions. A strategy for predicting the overall absorption rate for various branches, as well as the time it takes for all students of a particular branch to be placed, is also proposed. Finally, this research work puts forward a new data mining algorithm for numeric data sets, C 4.5 * stat, which has been shown to have competitive accuracy on the standard benchmark data sets known as the UCI data sets. It also proposes an optimization strategy called parameter tuning to improve the standard C 4.5 algorithm. In summary, this research work covers all four dimensions of a typical data mining research work, namely application to a domain, development of classifier models, optimization, and ensemble methods.
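
The placement-chance computation described in the abstract above can be sketched as follows. This is only an illustration under assumptions: the field order, the attribute values, and the function name `placement_chances` are hypothetical and do not come from the NTMIS data or the authors' implementation.

```python
from collections import defaultdict

# Hypothetical records: (rank_range, reservation, sector, sex, branch, placed).
# The values are invented for illustration only.
records = [
    ("1-2000", "general", "urban", "F", "CS", True),
    ("1-2000", "general", "urban", "F", "CS", True),
    ("1-2000", "general", "urban", "F", "CS", False),
    ("2001-5000", "reserved", "rural", "M", "Civil", False),
    ("2001-5000", "reserved", "rural", "M", "Civil", True),
]

def placement_chances(rows):
    """Return {attribute_combination: fraction of students placed}."""
    counts = defaultdict(lambda: [0, 0])  # combination -> [placed, total]
    for *combo, placed in rows:
        entry = counts[tuple(combo)]
        entry[0] += int(placed)
        entry[1] += 1
    return {combo: placed / total for combo, (placed, total) in counts.items()}

chances = placement_chances(records)
```

Each attribute combination together with its computed chance could then serve as one training example for the classifier models the abstract mentions.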

Relevance:

20.00% 20.00%

Publisher:

Abstract:

For years, choosing the right career by monitoring the trends and scope of different career paths has been a necessity for young people all over the world. In this paper we provide a scientific, data mining based method for predicting the job absorption rate and the waiting time needed for 100% placement for different engineering courses in India. This will greatly help students in India in deciding on the right discipline for a bright future. Information about graduates is obtained from the NTMIS (National Technical Manpower Information System) nodal centre in Kochi, India, located at Cochin University of Science and Technology.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In the current study, an epidemiological study is carried out by means of a literature survey, both in groups identified as having a higher potential for drug-drug interactions (DDIs) and in other cases, to explore patterns of DDIs and the factors affecting them. The structure of the FDA Adverse Event Reporting System (FAERS) database is studied and analyzed in detail to identify issues and challenges in mining it for drug-drug interactions. The necessary pre-processing algorithms are developed based on this analysis, and the Apriori algorithm is modified to suit the process. Finally, the modules are integrated into a tool to identify DDIs. The results are compared against a standard drug interaction database for validation. 31% of the associations obtained were identified as new, and the match with existing interactions was 69%. This match indicates the validity of the methodology and its applicability to similar databases. Formulating the results using generic names extends their relevance to a global scale. This global applicability helps health care professionals worldwide to exercise caution during the various stages of drug administration, thus considerably enhancing pharmacovigilance.
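
For orientation, a minimal sketch of the standard Apriori algorithm is given here. It is not the modified version the abstract refers to, whose details are not given in the abstract. The function name `apriori` and the example reports, including the drug names, are illustrative assumptions and not FAERS data or results.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: the candidates are the single items.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Hypothetical adverse-event reports, each listing co-administered drugs.
reports = [
    {"warfarin", "aspirin"},
    {"warfarin", "aspirin", "ibuprofen"},
    {"warfarin", "ibuprofen"},
    {"aspirin"},
]
frequent = apriori(reports, min_support=2)
```

Frequent drug pairs found this way are only candidates; as the abstract describes, they would still need to be validated against a known interaction database.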

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Data mining means summarizing information from large amounts of raw data. It is one of the key technologies in many areas of the economy, science, administration and the internet. In this report we introduce an approach for utilizing evolutionary algorithms to breed fuzzy classifier systems. This approach was exercised as part of a structured procedure by the students Achler, Göb and Voigtmann as a contribution to the 2006 Data-Mining-Cup contest, yielding encouragingly positive results.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This research is a study of the knowledge interface that aims to analyse knowledge discontinuities and the dynamic and emergent character of struggles and interactions within the gender system and across ethnic differences. The cacao boom in Central Sulawesi is the main context for changing social relations of production, especially as the mode of production has shifted, or is still shifting, from subsistence to petty commodity production. This agrarian change is not only about a change of relationships and practices but, as my previous research has shown, also about a shift in knowledge domination, because knowledge construes social practice in a dialectical process. Agroecological knowledge is accumulated through interaction, practice and experience. At the same time, the knowledge gained from new practices and experiences changes the mode of interaction, so such processes provide the arena where an interface of knowledge is manifested. In the process of the agroecological knowledge interface, gender and ethnic group interactions materialise in decision-making on production and resource allocation at the household and community level. At this point, power/knowledge comes into play to gain authority in decision-making. When authority dominates, power encounters resistance, while both the dominant power and the resistance to it aim to ensure socio-economic security. Eventually, the process of struggle can be identified through the pattern of resource utilisation as a realisation of production decision-making. Such processes vary from one community to another and therefore show both uniqueness and commonalities, especially when placed in the context of a shifting mode of production. The focus is placed on actors: men and women in their institutional and cultural setting, including the role of development agents. 
The inquiry is informed by four major questions: 1) How do women and men acquire, disseminate, and utilise their agroecological knowledge, specifically in rice farming as a subsistence commodity, as well as in cacao farming as a petty commodity? How and why do such mechanisms construct different knowledge domains between the two genders? How does the knowledge mechanism apply in different ethnic groups? What are the implications for gender- and ethnicity-based relations of production? 2) Using the concept of valued knowledge in the context of a shifting mode of production: is there any knowledge that dominates others? How does the process of domination occur and why? Is there any form of struggle, strategy, negotiation, or compromise over this domination? How do these processes take place at the household as well as the community level? How do they relate to production decision-making? 3) Putting the previous questions to two communities at different points on the path of agricultural commercialisation, how do the processes of struggle vary? What are the bases of the commonalities and peculiarities in both communities? 4) How do production decisions affect rice field - cacao plantation - forest utilisation in the two villages? How does that triangle of resource use reflect the constellation of local knowledge in those two communities? What is the implication of this knowledge constellation for the cacao-rice-forest agroecosystem in the forest margin area? Employing a qualitative approach as the main method of inquiry, in-depth and dialogic interviews, participant observation, and document review are used to gather information. A small survey and a children’s writing competition supplement these data collection methods. The latter two methods are intended to give wider information on household decision-making and perceptions of the forest. 
It was found that local knowledge, particularly knowledge pertaining to rice-forest-cacao agroecology, is divided according to gender and ethnicity. This constellation places the process of decision-making as ‘the arena of interface’ between feminine and masculine knowledge, as well as between dominant and less dominant ethnic groups. The transition from a subsistence to a commercial mode of production is the context that frames a process in which knowledge about cacao as a commodity is valued more highly than knowledge about rice. The market mechanism, as an external power, defines valued knowledge. Valued knowledge defines the dominant knowledge holder and the decision. Therefore, cacao cultivation becomes the dominant practice. Its existence sacrifices the presence of the rice field and the forest. Knowledge about rice production and the forest ecosystem exists, but is less valued, so it is unable to challenge the domination of cacao. Various forms of struggle - within the context of gender and ethnicity - to resist cacao domination are an expression of unequal knowledge possession. Knowledge inequality implies unequal access to the benefits of the market-valued crop. When unequal knowledge fails to construct a negotiated field, or struggles fail to bring about a ‘marginal’ decision, e.g. intensification instead of cacao expansion into the forest, the interface only produces divergence. Knowledge divided by gender and ethnicity remains unbridged, since negotiation is unable to produce new knowledge that accommodates both interests. Rice carries an ecological interest in conserving the forest, while cacao is driven by an economic interest in increasing welfare status. The implication of this unmediated dominant knowledge of cacao production is the construction of access: first, access to the forest, mainly to withdraw its economic benefit by eliminating its ecological benefit; then, access to cacao as the social relationship of production through which cacao knowledge is acquired; lastly, access to defend the sustained benefit from cacao by expansion. 
‘Socio-economic Security’ is defined by Access. The convergence of rice and cacao knowledge, however, should be made possible across gender and ethnicity, not only for the sake of forest conservation as the insurance of ecological security, but also for the community’s socio-economic security. The convergence might be found in a range of alternative ways to conduct sustainable cacao production, from agroforestry systems to intensification.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present a new algorithm called TITANIC for computing concept lattices. It is based on data mining techniques for computing frequent itemsets. The algorithm is experimentally evaluated and compared with B. Ganter's Next-Closure algorithm.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The problem of the relevance and usefulness of extracted association rules is of primary importance because, in the majority of cases, real-life databases lead to several thousand association rules with high confidence, among which are many redundancies. Using the closure of the Galois connection, we define two new bases for association rules whose union is a generating set for all valid association rules with support and confidence. These bases are characterized using frequent closed itemsets and their generators; they consist of the non-redundant exact and approximate association rules having minimal antecedents and maximal consequents, i.e. the most relevant association rules. Algorithms for extracting these bases are presented, and results of experiments carried out on real-life databases show that the proposed bases are useful and that their generation is not time consuming.
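
As a reminder of the notions the abstract above relies on, the standard definitions can be written as follows. The notation is an assumption chosen here for illustration (a database D of transactions, itemsets X and Y, closure operator h) and is not necessarily the notation used in the paper.

```latex
\begin{align*}
\operatorname{supp}(X) &= \frac{|\{\, t \in D : X \subseteq t \,\}|}{|D|}, \\
\operatorname{conf}(X \to Y) &= \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)}, \\
h(X) &= \bigcap \{\, t \in D : X \subseteq t \,\}
  \quad \text{(for $X$ contained in at least one transaction)}.
\end{align*}
% X is closed when h(X) = X; a generator of a closed itemset C is a
% minimal itemset G with h(G) = C.
% Since supp(G) = supp(h(G)), the rule G -> h(G) \ G has confidence 1 (exact).
% For a closed C' strictly containing h(G), the rule G -> C' \ G has
% confidence supp(C') / supp(G) < 1 (approximate).
```

Rules of this shape have a minimal antecedent (a generator) and a maximal consequent (the rest of a closed itemset), which matches the characterization given in the abstract.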

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. The idea is to improve, on the one hand, the results of Web Mining by exploiting the new semantic structures in the Web; and to make use of Web Mining, on the other hand, for building up the Semantic Web. This paper gives an overview of where the two areas meet today, and sketches ways of how a closer integration could be profitable.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Formal Concept Analysis is an unsupervised learning technique for conceptual clustering. We introduce the notion of iceberg concept lattices and show their use in Knowledge Discovery in Databases (KDD). Iceberg lattices are designed for analyzing very large databases. In particular they serve as a condensed representation of frequent patterns as known from association rule mining. In order to show the interplay between Formal Concept Analysis and association rule mining, we discuss the algorithm TITANIC. We show that iceberg concept lattices are a starting point for computing condensed sets of association rules without loss of information, and are a visualization method for the resulting rules.
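
To illustrate what an iceberg concept lattice contains, the following sketch enumerates the frequent closed itemsets (the concept intents above a support threshold) by brute force. It is not the TITANIC algorithm, which the abstract names but does not detail; the function names and the toy data are assumptions, and the approach is only feasible for very small item sets.

```python
from itertools import combinations

def closure(itemset, transactions):
    """Intersect all transactions containing itemset (the Galois closure)."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return None  # the itemset occurs in no transaction
    return frozenset.intersection(*covering)

def iceberg_intents(transactions, min_support):
    """Return {closed itemset: support count} for all frequent closed itemsets.

    Brute force over all item subsets; suitable for toy data only.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    intents = {}
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            closed = closure(frozenset(combo), transactions)
            if closed is None:
                continue
            support = sum(1 for t in transactions if closed <= t)
            if support >= min_support:
                intents[closed] = support
    return intents

# Hypothetical transactions over the items a, b, c.
baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
intents = iceberg_intents(baskets, min_support=2)
```

Ordered by set inclusion, the resulting intents form the top part of the full concept lattice, which is what makes the structure a condensed representation of all frequent itemsets.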

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. This survey analyzes the convergence of trends from both areas: Growing numbers of researchers work on improving the results of Web Mining by exploiting semantic structures in the Web, and they use Web Mining techniques for building the Semantic Web. Last but not least, these techniques can be used for mining the Semantic Web itself. The second aim of this paper is to use these concepts to circumscribe what Web space is, what it represents and how it can be represented and analyzed. This is used to sketch the role that Semantic Web Mining and the software agents and human agents involved in it can play in the evolution of Web space.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Social bookmark tools are rapidly emerging on the Web. In such systems, users set up lightweight conceptual structures called folksonomies. These systems currently provide relatively little structure. In this paper we discuss how association rule mining can be adopted to analyze and structure folksonomies, and how the results can be used for ontology learning and supporting emergent semantics. We demonstrate our approach on a large-scale dataset stemming from an online system.
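
Because a folksonomy relates users, tags and resources, it has to be projected onto ordinary transactions before association rules can be mined. The sketch below shows one possible projection (one transaction per user-resource post, containing its tags); this choice, the function names, and the sample data are assumptions for illustration, not necessarily the projection used in the paper.

```python
from collections import defaultdict

# Hypothetical tag assignments: (user, tag, resource).
assignments = [
    ("u1", "python", "r1"), ("u1", "programming", "r1"),
    ("u2", "python", "r2"), ("u2", "programming", "r2"),
    ("u3", "programming", "r3"), ("u3", "java", "r3"),
]

def tags_per_post(triples):
    """Project triples onto (user, resource) posts, each a set of tags."""
    posts = defaultdict(set)
    for user, tag, resource in triples:
        posts[(user, resource)].add(tag)
    return list(posts.values())

def confidence(antecedent, consequent, transactions):
    """conf(A -> B) = |transactions with A and B| / |transactions with A|."""
    with_a = [t for t in transactions if antecedent in t]
    if not with_a:
        return 0.0
    return sum(1 for t in with_a if consequent in t) / len(with_a)

posts = tags_per_post(assignments)
conf_forward = confidence("python", "programming", posts)
conf_backward = confidence("programming", "python", posts)
```

An asymmetric pair of confidences like this one hints that one tag may be more general than the other, which is the kind of signal that could feed ontology learning.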

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. This survey analyzes the convergence of trends from both areas: an increasing number of researchers is working on improving the results of Web Mining by exploiting semantic structures in the Web, and they make use of Web Mining techniques for building the Semantic Web. Last but not least, these techniques can be used for mining the Semantic Web itself. The Semantic Web is the second-generation WWW, enriched by machine-processable information which supports the user in his tasks. Given the enormous size even of today’s Web, it is impossible to manually enrich all of these resources. Therefore, automated schemes for learning the relevant information are increasingly being used. Web Mining aims at discovering insights about the meaning of Web resources and their usage. Given the primarily syntactical nature of the data being mined, the discovery of meaning is impossible based on these data only. Therefore, formalizations of the semantics of Web sites and navigation behavior are becoming more and more common. Furthermore, mining the Semantic Web itself is another upcoming application. We argue that the two areas Web Mining and Semantic Web need each other to fulfill their goals, but that the full potential of this convergence is not yet realized. This paper gives an overview of where the two areas meet today, and sketches ways of how a closer integration could be profitable.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Association rules are a popular knowledge discovery technique for warehouse basket analysis. They indicate which items of the warehouse are frequently bought together. The problem of association rule mining was first stated in 1993. Five years later, several research groups discovered that this problem has a strong connection to Formal Concept Analysis (FCA). In this survey, we first introduce some basic ideas of this connection by means of a specific algorithm, TITANIC, and show how FCA helps in reducing the number of resulting rules without loss of information, before giving a general overview of the history and state of the art of applying FCA to association rule mining.