942 resultados para Knowledge Discovery Database


Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this paper, we propose a new similarity measure to compute the pair-wise similarity of text-based documents based on patterns of the words in the documents. First we develop a kappa measure for pair-wise comparison of documents then we use ordered weighting averaging operator to define a document similarity measure for a set of documents.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We introduce a flexible visual data mining framework which combines advanced projection algorithms from the machine learning domain and visual techniques developed in the information visualization domain. The advantage of such an interface is that the user is directly involved in the data mining process. We integrate principled projection algorithms, such as generative topographic mapping (GTM) and hierarchical GTM (HGTM), with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates and billboarding, to provide a visual data mining framework. Results on a real-life chemoinformatics dataset using GTM are promising and have been analytically compared with the results from the traditional projection methods. It is also shown that the HGTM algorithm provides additional value for large datasets. The computational complexity of these algorithms is discussed to demonstrate their suitability for the visual data mining framework. Copyright 2006 ACM.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Objective - To evaluate behavioural components and strategies associated with increased uptake and effectiveness of screening for coronary heart disease and diabetes with an implementation science focus. Design - Realist review. Data sources - PubMed, Web of Knowledge, Cochrane Database of Systematic Reviews, Cochrane Controlled Trials Register and reference chaining. Searches limited to English language studies published since 1990. Eligibility criteria - Eligible studies evaluated interventions designed to increase the uptake of cardiovascular disease (CVD) and diabetes screening and examined behavioural and/or strategic designs. Studies were excluded if they evaluated changes in risk factors or cost-effectiveness only. Results - In 12 eligible studies, several different intervention designs and evidence-based strategies were evaluated. Salient themes were effects of feedback on behaviour change or benefits of health dialogues over simple feedback. Studies provide mixed evidence about the benefits of these intervention constituents, which are suggested to be situation and design specific, broadly supporting their use, but highlighting concerns about the fidelity of intervention delivery, raising implementation science issues. Three studies examined the effects of informed choice or loss versus gain frame invitations, finding no effect on screening uptake but highlighting opportunistic screening as being more successful for recruiting higher CVD and diabetes risk patients than an invitation letter, with no differences in outcomes once recruited. Two studies examined differences between attenders and non-attenders, finding higher risk factors among non-attenders and higher diagnosed CVD and diabetes among those who later dropped out of longitudinal studies. Conclusions - If the risk and prevalence of these diseases are to be reduced, interventions must take into account what we know about effective health behaviour change mechanisms, monitor delivery by trained professionals and examine the possibility of tailoring programmes according to contexts such as risk level to reach those most in need. Further research is needed to determine the best strategies for lifelong approaches to screening.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Traditional Chinese Medicine (TCM) has been actively researched through various approaches, including computational techniques. A review on basic elements of TCM is provided to illuminate various challenges and progresses in its study using computational methods. Information on various TCM formulations, in particular resources on databases of TCM formulations and their integration to Western medicine, are analyzed in several facets, such as TCM classifications, types of databases, and mining tools. Aspects of computational TCM diagnosis, namely inspection, auscultation, pulse analysis as well as TCM expert systems are reviewed in term of their benefits and drawbacks. Various approaches on exploring relationships among TCM components and finding genes/proteins relating to TCM symptom complex are also studied. This survey provides a summary on the advance of computational approaches for TCM and will be useful for future knowledge discovery in this area. © 2007 Elsevier Ireland Ltd. All rights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

To be competitive in contemporary turbulent environments, firms must be capable of processing huge amounts of information, and effectively convert it into actionable knowledge. This is particularly the case in the marketing context, where problems are also usually highly complex, unstructured and ill-defined. In recent years, the development of marketing management support systems has paralleled this evolution in informational problems faced by managers, leading to a growth in the study (and use) of artificial intelligence and soft computing methodologies. Here, we present and implement a novel intelligent system that incorporates fuzzy logic and genetic algorithms to operate in an unsupervised manner. This approach allows the discovery of interesting association rules, which can be linguistically interpreted, in large scale databases (KDD or Knowledge Discovery in Databases.) We then demonstrate its application to a distribution channel problem. It is shown how the proposed system is able to return a number of novel and potentially-interesting associations among variables. Thus, it is argued that our method has significant potential to improve the analysis of marketing and business databases in practice, especially in non-programmed decisional scenarios, as well as to assist scholarly researchers in their exploratory analysis. © 2013 Elsevier Inc.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The purpose of this paper is to explain the notion of clustering and a concrete clustering method- agglomerative hierarchical clustering algorithm. It shows how a data mining method like clustering can be applied to the analysis of stocks, traded on the Bulgarian Stock Exchange in order to identify similar temporal behavior of the traded stocks. This problem is solved with the aid of a data mining tool that is called XLMiner™ for Microsoft Excel Office.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The principles of design of information-analytical system (IAS) intended for design of new inorganic compounds are considered. IAS includes the integrated system of databases on properties of inorganic substances and materials, the system of the programs of pattern recognition, the knowledge base and managing program. IAS allows a prediction of inorganic compounds not yet synthesized and estimation of their some properties.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Information extraction or knowledge discovery from large data sets should be linked to data aggregation process. Data aggregation process can result in a new data representation with decreased number of objects of a given set. A deterministic approach to separable data aggregation means a lesser number of objects without mixing of objects from different categories. A statistical approach is less restrictive and allows for almost separable data aggregation with a low level of mixing of objects from different categories. Layers of formal neurons can be designed for the purpose of data aggregation both in the case of deterministic and statistical approach. The proposed designing method is based on minimization of the of the convex and piecewise linear (CPL) criterion functions.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The problems of constructing the selfsrtucturized systems of memory of intelligence information processing tools, allowing formation of associative links in the memory, hierarchical organization and classification, generating concepts in the process of the information input, are discussed. The principles and methods for realization of selfstructurized systems on basis of hierarchic network structures of some special class – growing pyramidal network are studied. The algorithms for building, learning and recognition on basis of such type network structures are proposed. The examples of practical application are demonstrated.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper considers the problem of concept generalization in decision-making systems where such features of real-world databases as large size, incompleteness and inconsistence of the stored information are taken into account. The methods of the rough set theory (like lower and upper approximations, positive regions and reducts) are used for the solving of this problem. The new discretization algorithm of the continuous attributes is proposed. It essentially increases an overall performance of generalization algorithms and can be applied to processing of real value attributes in large data tables. Also the search algorithm of the significant attributes combined with a stage of discretization is developed. It allows avoiding splitting of continuous domains of insignificant attributes into intervals.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The paper introduces a method for dependencies discovery during human-machine interaction. It is based on an analysis of numerical data sets in knowledge-poor environments. The driven procedures are independent and they interact on a competitive principle. The research focuses on seven of them. The application is in Number Theory.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Product recommender systems are often deployed by e-commerce websites to improve user experience and increase sales. However, recommendation is limited by the product information hosted in those e-commerce sites and is only triggered when users are performing e-commerce activities. In this paper, we develop a novel product recommender system called METIS, a MErchanT Intelligence recommender System, which detects users' purchase intents from their microblogs in near real-time and makes product recommendation based on matching the users' demographic information extracted from their public profiles with product demographics learned from microblogs and online reviews. METIS distinguishes itself from traditional product recommender systems in the following aspects: 1) METIS was developed based on a microblogging service platform. As such, it is not limited by the information available in any specific e-commerce website. In addition, METIS is able to track users' purchase intents in near real-time and make recommendations accordingly. 2) In METIS, product recommendation is framed as a learning to rank problem. Users' characteristics extracted from their public profiles in microblogs and products' demographics learned from both online product reviews and microblogs are fed into learning to rank algorithms for product recommendation. We have evaluated our system in a large dataset crawled from Sina Weibo. The experimental results have verified the feasibility and effectiveness of our system. We have also made a demo version of our system publicly available and have implemented a live system which allows registered users to receive recommendations in real time. © 2014 ACM.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The purpose of this article is to evaluate the effectiveness of learning by doing as a practical tool for managing the training of students in "Library Management" at the ULSIT, Sofia, Bulgaria, by using the creation of project 'Data Base “Bulgarian Revival Towns” (CD), financed by Bulgarian Ministry of Education, Youth and Science (1/D002/144/13.10.2011) headed by Prof. DSc Ivanka Yankova, which aims to create new information resource for the towns which will serve the needs of scientific researches. By participating in generating the an array in the database through searching, selection and digitization of documents from these period, at the same time students get an opportunity to expand their skills to work effectively in a team, finding the interdisciplinary, a causal connection between the studied items, objects and subjects and foremost – practical experience in the field of digitization, information behavior, strategies for information search, etc. This method achieves good results for the accumulation of sustainable knowledge and it generates motivation to work in the field of library and information professions.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We propose a family of attributed graph kernels based on mutual information measures, i.e., the Jensen-Tsallis (JT) q-differences (for q  ∈ [1,2]) between probability distributions over the graphs. To this end, we first assign a probability to each vertex of the graph through a continuous-time quantum walk (CTQW). We then adopt the tree-index approach [1] to strengthen the original vertex labels, and we show how the CTQW can induce a probability distribution over these strengthened labels. We show that our JT kernel (for q  = 1) overcomes the shortcoming of discarding non-isomorphic substructures arising in the R-convolution kernels. Moreover, we prove that the proposed JT kernels generalize the Jensen-Shannon graph kernel [2] (for q = 1) and the classical subtree kernel [3] (for q = 2), respectively. Experimental evaluations demonstrate the effectiveness and efficiency of the JT kernels.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

School of thought analysis is an important yet not-well-elaborated scientific knowledge discovery task. This paper makes the first attempt at this problem. We focus on one aspect of the problem: do characteristic school-of-thought words exist and whether they are characterizable? To answer these questions, we propose a probabilistic generative School-Of-Thought (SOT) model to simulate the scientific authoring process based on several assumptions. SOT defines a school of thought as a distribution of topics and assumes that authors determine the school of thought for each sentence before choosing words to deliver scientific ideas. SOT distinguishes between two types of school-of-thought words for either the general background of a school of thought or the original ideas each paper contributes to its school of thought. Narrative and quantitative experiments show positive and promising results to the questions raised above © 2013 Association for Computational Linguistics. © 2013 Association for Computational Linguistics.