913 results for Knowledge Discovery Tools


Relevance:

90.00%

Publisher:

Abstract:

Procedural knowledge is the knowledge required to perform certain tasks. It forms an important part of expertise and is crucial for learning new tasks. This paper summarises existing work on procedural knowledge acquisition and identifies two major challenges that remain to be solved in this field: automating the acquisition process to tackle the bottleneck in the formalization of procedural knowledge, and enabling machine understanding and manipulation of procedural knowledge. It is believed that recent advances in information extraction techniques can be applied to compose a comprehensive solution to these challenges. We identify the specific tasks required to achieve this goal and present detailed analyses of new research challenges and opportunities. It is expected that these analyses will interest researchers working on various knowledge management tasks, particularly knowledge acquisition and capture.

Relevance:

90.00%

Publisher:

Abstract:

Traditional Chinese Medicine (TCM) has been actively researched through various approaches, including computational techniques. A review of the basic elements of TCM is provided to illuminate the challenges and progress in its study using computational methods. Information on TCM formulations, in particular resources on databases of TCM formulations and their integration with Western medicine, is analyzed in several facets, such as TCM classifications, types of databases, and mining tools. Aspects of computational TCM diagnosis, namely inspection, auscultation, and pulse analysis, as well as TCM expert systems, are reviewed in terms of their benefits and drawbacks. Various approaches to exploring relationships among TCM components and to finding genes/proteins related to the TCM symptom complex are also studied. This survey summarizes advances in computational approaches for TCM and will be useful for future knowledge discovery in this area. © 2007 Elsevier Ireland Ltd. All rights reserved.

Relevance:

90.00%

Publisher:

Abstract:

This paper discusses the problems of constructing self-structuring memory systems for intelligent information processing tools, that is, systems that allow the formation of associative links in memory, hierarchical organization and classification, and the generation of concepts as information is input. The principles and methods for realizing self-structuring systems on the basis of a special class of hierarchical network structures, growing pyramidal networks, are studied. Algorithms for building, learning and recognition based on this type of network structure are proposed. Examples of practical application are demonstrated.

Relevance:

90.00%

Publisher:

Abstract:

Our approach to knowledge presentation is based on the idea of an expert system shell. First, we build a graph shell of both possible dependencies and possible actions. Then, reasoning by means of log-linear models, we activate some nodes and some directed links. In this way, a Bayesian network and networks representing log-linear models are generated.

Relevance:

90.00%

Publisher:

Abstract:

Competition between Higher Education Institutions is increasing at an alarming rate, while changes in the surrounding environment and in the demands of the labour market are frequent and substantial. Universities must meet the requirements of both the national and the European legislative environment. The Bologna Declaration aims to provide guidelines and solutions for these problems and challenges of European Higher Education. One of its main goals is the introduction of a common framework of transparent and comparable degrees that ensures the recognition of the knowledge and qualifications of citizens all across the European Union. This paper discusses a knowledge management approach that highlights the importance of knowledge representation tools such as ontologies. The ontology-based model discussed supports the creation of transparent curricula content (Educational Ontology) and the promotion of reliable knowledge testing (Adaptive Knowledge Testing System).

Relevance:

90.00%

Publisher:

Abstract:

With the proliferation of multimedia data and ever-growing requests for multimedia applications, there is an increasing need for efficient and effective indexing, storage and retrieval of multimedia data, such as graphics, images, animation, video, audio and text. Because of the special characteristics of multimedia data, Multimedia Database Management Systems (MMDBMSs) have emerged and attracted great research attention in recent years. Although much research effort has been devoted to this area, it is still far from mature and many open issues remain. This dissertation focuses on three of the essential challenges in developing an MMDBMS, namely the semantic gap, perception subjectivity and data organization, and proposes a systematic and integrated framework with a video database and an image database serving as the testbed. In particular, the framework addresses these challenges separately yet coherently from three main aspects of an MMDBMS: multimedia data representation, indexing and retrieval. In terms of multimedia data representation, the key to addressing the semantic gap is to intelligently and automatically model the mid-level representation and/or semi-semantic descriptors, in addition to extracting the low-level media features. The data organization challenge is mainly addressed through media indexing, where various levels of indexing are required to support diverse query requirements. In particular, this study aims to facilitate high-level video indexing by proposing a multimodal event mining framework associated with temporal knowledge discovery approaches. With respect to perception subjectivity, advanced techniques are proposed to support users' interaction and to effectively model users' perception from feedback at both the image level and the object level.

Relevance:

90.00%

Publisher:

Abstract:

Analyzing large-scale gene expression data is a labor-intensive and time-consuming process. To make data analysis easier, we developed a set of pipelines for rapid processing and analysis of poplar gene expression data for knowledge discovery. Of the pipelines developed, the differentially expressed genes (DEGs) pipeline is designed to identify biologically important genes that are differentially expressed at one of multiple time points or conditions. The pathway analysis pipeline was designed to identify the differentially expressed metabolic pathways. The protein domain enrichment pipeline can identify the enriched protein domains present in the DEGs. Finally, the Gene Ontology (GO) enrichment analysis pipeline was developed to identify the enriched GO terms in the DEGs. Our pipeline tools can analyze both microarray gene data and high-throughput gene data. These two types of data are obtained by two different technologies. Microarray technology measures gene expression levels via microarray chips, collections of microscopic DNA spots attached to a solid (glass) surface, whereas high-throughput sequencing, also called next-generation sequencing, is a newer technology that measures gene expression levels by directly sequencing mRNAs and obtaining each mRNA's copy number in cells or tissues. We also developed a web portal (http://sys.bio.mtu.edu/) to make all pipelines available to the public so that users can analyze their own gene expression data. In addition to the analyses mentioned above, the portal can also perform GO hierarchy analysis, i.e. construct GO trees using a list of GO terms as input.
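
The abstract does not say which statistical test the GO enrichment pipeline uses. As a minimal sketch, assuming the commonly used one-sided hypergeometric test, the enrichment of a single GO term among the DEGs could be scored as follows. The function name and its parameters are hypothetical and are not taken from the portal.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    population: total number of genes considered (assumed background set)
    annotated:  genes in the background annotated with the GO term
    sample:     number of differentially expressed genes (DEGs)
    overlap:    DEGs annotated with the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of every outcome at least as extreme as `overlap`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers for illustration only, not data from the paper.
p_value = hypergeom_enrichment_p(population=10, annotated=3, sample=3, overlap=3)
```

A small p-value suggests the term appears among the DEGs more often than chance would predict. In practice many terms are tested at once, so a multiple-testing correction would normally follow.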

Relevance:

90.00%

Publisher:

Abstract:

The modelling of water quality in reservoirs can be approached from several points of view. This work draws on problem-solving methodologies from the scientific area of Artificial Intelligence, as well as on tools used in the search for solutions, such as Decision Trees, Artificial Neural Networks and the Nearest-Neighbour Method. Current methods for assessing water quality are very restrictive because they do not allow water quality to be assessed in real time. The development of forecasting models based on Knowledge Discovery in Databases techniques proved to be an alternative, with a view to pro-active behaviour that can contribute decisively to diagnosing, preserving and rehabilitating reservoirs. In the course of the work, unsupervised learning was used to study the dynamics of the reservoirs, and two distinct behaviours, related to the time of year, are described.

Relevance:

80.00%

Publisher:

Abstract:

Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information from large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured, and can take the form of text, categorical or numerical values. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time-variant, noisy, and high-dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rule mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for data mining applications in engineering fields. Together with regression, classification is mainly used for predictive modelling. So far, a number of classification algorithms have been used in practice. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule-based approaches such as C4.5 (Quinlan, 1996); probability methods such as the Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten, 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); and example-based methods such as k-nearest neighbors (Duda & Hart, 1973) and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al., 1998) and Ensemble Classification (Tumer, 1996).
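
To make the example-based family concrete, here is a minimal sketch of a k-nearest-neighbors classifier. It assumes numeric feature tuples, Euclidean distance and a simple majority vote. The function name, the toy data and the default k are illustrative choices, not details from the cited works.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest
    training examples. `train` is a list of (feature_tuple, label) pairs."""
    # Sort training examples by Euclidean distance to the query point.
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy data for illustration only.
training_set = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"),
]
label_near_origin = knn_predict(training_set, (0.2, 0.1))
label_far_corner = knn_predict(training_set, (5.5, 5.5))
```

No model is fitted in advance: all the work happens at prediction time, which is why such methods are described as example-based.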

Relevance:

80.00%

Publisher:

Abstract:

The Bologna Process sets out guidelines for building a European Higher Education Area. Adopting these guidelines requires an approach that, in practice, favours the mobility of students, who have difficulty understanding the opportunities offered to them. In this context, this dissertation explores the hypothesis of using a social network to support student mobility within the European area. The dissertation proposes a knowledge model to represent the members of a social network geared towards supporting mobility scenarios, referred to as an academic social network. This model was obtained by merging the Academic Ontology to Support the Bologna Mobility Process with the Friend of a Friend Ontology. For experimental evaluation, a demonstrator was created on a social network publicly available on the Internet, using a simplified version of the proposed model. The scenarios used in the experiments represent real situations to which a rudimentary knowledge discovery process was applied.

Relevance:

80.00%

Publisher:

Abstract:

This work consists of the development of a Criminology Support System (Sistema de Apoio à Criminologia, SAC), which is intended to help detectives and analysts with the proactive prevention of crime and with the management of their material and human resources, as well as to stimulate studies on the high incidence of certain types of crime in a given region. Historically, solving crimes has been a prerogative of criminal justice and its specialists. With the growing use of computer systems in the judicial system to record all data concerning crime occurrences, suspects and victims, individuals' criminal records, and other data flowing within the organization, there is a growing need to transform these data into information that is useful in fighting crime. The SAC takes advantage of techniques for extracting knowledge from information and applies them to a data set of crime occurrences in a given region and time span, as well as to a set of variables that influence crime, which were studied and identified in this work. The work comprises a model for extracting knowledge from information and an application that allows the user to supply a suitable data set, ensuring the maximum effectiveness of the model.

Relevance:

80.00%

Publisher:

Abstract:

In recent years there has been huge growth and consolidation in the Data Mining field. Some efforts are being made to establish standards in the area, among them SEMMA and CRISP-DM. Both are growing as industry standards and define a set of sequential steps intended to guide the implementation of data mining applications. The question arises of whether there are substantial differences between them and the traditional KDD process. This paper aims to establish a parallel between these standards and the KDD process, as well as an understanding of the similarities between them.

Relevance:

80.00%

Publisher:

Abstract:

Business Intelligence (BI) is an emerging area of the Decision Support Systems (DSS) discipline. Over recent years, the evolution in this area has been considerable. Similarly, there has been huge growth and consolidation in the Data Mining (DM) field. DM is being used with success in BI systems, but a true integration of DM with BI is lacking. As a result, some BI systems do not make effective use of DM. An architecture that aims to lead to an effective usage of DM in BI is presented.

Relevance:

80.00%

Publisher:

Abstract:

Medium voltage (MV) load diagrams were defined on the basis of the knowledge discovery in databases process. Clustering techniques were used to support agents in electric power retail markets in obtaining specific knowledge of their customers' consumption habits. Each customer class resulting from the clustering operation is represented by its load diagram. The Two-step clustering algorithm and the WEACS approach, based on evidence accumulation (EAC), were applied to electricity consumption data from a utility client's database in order to form the customer classes and to find a set of representative consumption patterns. The WEACS approach is a clustering ensemble combination approach that uses subsampling and weights the partitions differently in the co-association matrix. As a complementary step to the WEACS approach, all the final data partitions produced by the different variations of the method are combined, and the Ward Link algorithm is used to obtain the final data partition. Experimental results showed that the WEACS approach led to better accuracy than many other clustering approaches. In this paper, the WEACS approach separates the customer population better than the Two-step clustering algorithm.
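
The idea of accumulating evidence in a co-association matrix can be sketched as follows. This is a simplified illustration and not the WEACS method itself: the abstract does not give the weighting scheme, so the weights here are hypothetical inputs, and subsampling is omitted.

```python
def co_association(partitions, weights=None):
    """Build a weighted co-association matrix from several clusterings.

    partitions: list of label lists, one label per object in each partition.
    weights:    optional per-partition weights (hypothetical; equal if omitted).
    Entry [i][j] is the weighted fraction of partitions that place
    objects i and j in the same cluster.
    """
    n = len(partitions[0])
    if weights is None:
        weights = [1.0] * len(partitions)
    total = sum(weights)
    matrix = [[0.0] * n for _ in range(n)]
    for labels, weight in zip(partitions, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += weight / total
    return matrix


# Three toy partitions of four objects, for illustration only.
toy_partitions = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]
equal_vote = co_association(toy_partitions)
weighted_vote = co_association(toy_partitions, weights=[2.0, 1.0, 1.0])
```

The resulting matrix can be treated as a similarity matrix and passed to a hierarchical clustering step to extract a final partition, which is the role the abstract assigns to the Ward Link algorithm.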

Relevance:

80.00%

Publisher:

Abstract:

This paper deals with the establishment of a methodology for characterizing the electric power profiles of medium voltage (MV) consumers. The characterization is supported by the knowledge discovery in databases (KDD) process. Data Mining techniques are used to obtain typical load profiles of MV customers and specific knowledge of their consumption habits. In order to form the different customer classes and to find a set of representative consumption patterns, a hierarchical clustering algorithm and a clustering ensemble combination approach (WEACS) are used. Taking into account the typical consumption profile of the class to which the customers belong, new tariff options were defined and new energy coefficient prices were proposed. Finally, based on the results obtained, the consequences that these will have for the interaction between customers and electric power suppliers are analyzed.