873 results for classification of knowledge
Abstract:
"First edition of this rare little treatise."-Goldschmidt's cat. 24.
Abstract:
Mode of access: Internet.
Abstract:
"Appendix. Bibliography. A select catalogue of books on all the branches of human knowledge": p. [541]-563.
Abstract:
Retrospective clinical data presents many challenges for data mining and machine learning. The transcription of patient records from paper charts and subsequent manipulation of data often results in high volumes of noise as well as a loss of other important information. In addition, such datasets often fail to represent expert medical knowledge and reasoning in any explicit manner. In this research we describe applying data mining methods to retrospective clinical data to build a prediction model for asthma exacerbation severity for pediatric patients in the emergency department. Difficulties in building such a model forced us to investigate alternative strategies for analyzing and processing retrospective data. This paper describes this process together with an approach to mining retrospective clinical data by incorporating formalized external expert knowledge (secondary knowledge sources) into the classification task. This knowledge is used to partition the data into a number of coherent sets, where each set is explicitly described in terms of the secondary knowledge source. Instances from each set are then classified in a manner appropriate for the characteristics of the particular set. We present our methodology and outline a set of experimental results that demonstrate some advantages and some limitations of our approach. © 2008 Springer-Verlag Berlin Heidelberg.
Abstract:
A major drawback of artificial neural networks is their black-box character. Rule extraction algorithms are therefore becoming increasingly important for explaining what a trained network has learned. In this paper, we use a method for symbolic knowledge extraction from neural networks once they have been trained on the desired function. The method is based on the weights of the trained network. It allows knowledge extraction, as well as rule extraction, from neural networks with continuous inputs and output. An example application is shown, based on extracting the average load demand of a power plant.
Abstract:
Short text messages, a.k.a. Microposts (e.g. Tweets), have proven to be an effective channel for revealing information about trends and events, ranging from those related to Disaster (e.g. hurricane Sandy) to those related to Violence (e.g. Egyptian revolution). Being informed about such events as they occur could be extremely important to authorities and emergency professionals by allowing such parties to respond immediately. In this work we study the problem of topic classification (TC) of Microposts, which aims to automatically classify short messages based on the subject(s) discussed in them. The accurate TC of Microposts, however, is a challenging task, since the limited number of tokens in a post often implies a lack of sufficient contextual information. In order to provide contextual information to Microposts, we present and evaluate several graph structures surrounding concepts present in linked knowledge sources (KSs). Traditional TC techniques enrich the content of Microposts with features extracted only from the Microposts content. In contrast, our approach relies on the generation of different weighted semantic meta-graphs extracted from linked KSs. We introduce a new semantic graph, called category meta-graph. This novel meta-graph provides a more fine-grained categorisation of concepts, providing a set of novel semantic features. Our findings show that such category meta-graph features effectively improve the performance of a topic classifier of Microposts. Furthermore, our goal is also to understand which semantic feature contributes to the performance of a topic classifier. For this reason we propose an approach for automatic estimation of accuracy loss of a topic classifier on new, unseen Microposts. We introduce and evaluate novel topic similarity measures, which capture the similarity between the KS documents and Microposts at a conceptual level, considering the enriched representation of these documents.
Extensive evaluation in the context of Emergency Response (ER) and Violence Detection (VD) revealed that our approach outperforms previous approaches using a single KS without linked data and Twitter data only by up to 31.4% in terms of F1 measure. Our main findings indicate that the new category graph contains useful information for TC and achieves comparable results to previously used semantic graphs. Furthermore, our results also indicate that the accuracy of a topic classifier can be accurately predicted using the enhanced text representation, outperforming previous approaches considering content-based similarity measures. © 2014 Elsevier B.V. All rights reserved.
Abstract:
The XML Document Mining track was launched for exploring two main ideas: (1) identifying key problems and new challenges of the emerging field of mining semi-structured documents, and (2) studying and assessing the potential of Machine Learning (ML) techniques for dealing with generic ML tasks in the structured domain, i.e., classification and clustering of semi-structured documents. This track has run for six editions during INEX 2005, 2006, 2007, 2008, 2009 and 2010. The first five editions have been summarized in previous editions and we focus here on the 2010 edition. INEX 2010 included two tasks in the XML Mining track: (1) unsupervised clustering task and (2) semi-supervised classification task where documents are organized in a graph. The clustering task requires the participants to group the documents into clusters without any knowledge of category labels using an unsupervised learning algorithm. On the other hand, the classification task requires the participants to label the documents in the dataset into known categories using a supervised learning algorithm and a training set. This report gives the details of clustering and classification tasks.
Abstract:
This paper, which is abstracted from a larger study into the acquisition and exercise of nephrology nursing expertise, aims to explore the role of knowledge in expert practice. Using grounded theory methodology, the study involved 17 registered nurses who were practicing in a metropolitan renal unit in New South Wales, Australia. Concurrent data collection and analysis was undertaken, incorporating participants' observations and interviews. Having extensive nephrology nursing knowledge was a striking characteristic of a nursing expert. Expert nurses clearly relied on and utilized extensive nephrology nursing knowledge to practice. Of importance for nursing, the results of this study indicate that domain-specific knowledge is a crucial feature of expert practice.
Abstract:
Gliomagenesis is driven by a complex network of genetic alterations, and while the glioma genome has been a focus of investigation for many years, critical gaps in our knowledge of this disease remain. The identification of novel molecular biomarkers remains a focus of the greater cancer community as a method to improve the consistency and accuracy of pathological diagnosis. In addition, novel molecular biomarkers are urgently needed for the identification of targets that may ultimately result in novel therapeutics aimed at improving glioma treatment. Through the identification of new biomarkers, laboratories will focus future studies on the molecular mechanisms that underlie glioma development. Here, we report a series of genomic analyses identifying novel molecular biomarkers in multiple histopathological subtypes of glioma and refine the classification of malignant gliomas. We have completed a large scale analysis of the WHO grade II-III astrocytoma exome and report frequent mutations in the chromatin modifier, alpha thalassemia mental retardation x-linked (
Abstract:
Previous studies have revealed considerable interobserver and intraobserver variation in the histological classification of preinvasive cervical squamous lesions. The aim of the present study was to develop a decision support system (DSS) for the histological interpretation of these lesions. Knowledge and uncertainty were represented in the form of a Bayesian belief network that permitted the storage of diagnostic knowledge and, for a given case, the collection of evidence in a cumulative manner that provided a final probability for the possible diagnostic outcomes. The network comprised 8 diagnostic histological features (evidence nodes) that were each independently linked to the diagnosis (decision node) by a conditional probability matrix. Diagnostic outcomes comprised normal; koilocytosis; and cervical intraepithelial neoplasia (CIN) I, CIN II, and CIN III. For each evidence feature, a set of images was recorded that represented the full spectrum of change for that feature. The system was designed to be interactive in that the histopathologist was prompted to enter evidence into the network via a specifically designed graphical user interface (i-Path Diagnostics, Belfast, Northern Ireland). Membership functions were used to derive the relative likelihoods for the alternative feature outcomes, the likelihood vector was entered into the network, and the updated diagnostic belief was computed for the diagnostic outcomes and displayed. A cumulative probability graph was generated throughout the diagnostic process and presented on screen. The network was tested on 50 cervical colposcopic biopsy specimens, comprising 10 cases each of normal, koilocytosis, CIN I, CIN II, and CIN III. These had been preselected by a consultant gynecological pathologist. Using conventional morphological assessment, the cases were classified on 2 separate occasions by 2 consultant and 2 junior pathologists.
The cases were also then classified using the DSS on 2 occasions by the 4 pathologists and by 2 medical students with no experience in cervical histology. Interobserver and intraobserver agreement using morphology and using the DSS was calculated with kappa statistics. Intraobserver reproducibility using conventional unaided diagnosis was reasonably good (kappa range, 0.688 to 0.861), but interobserver agreement was poor (kappa range, 0.347 to 0.747). Using the DSS improved overall reproducibility between individuals. Using the DSS, however, did not enhance the diagnostic performance of junior pathologists when comparing their DSS-based diagnosis against an experienced consultant. However, the generation of a cumulative probability graph also allowed a comparison of individual performance, how individual features were assessed in the same case, and how this contributed to diagnostic disagreement between individuals. Diagnostic features such as nuclear pleomorphism were shown to be particularly problematic and poorly reproducible. DSSs such as this therefore not only have a role to play in enhancing decision making but also in the study of diagnostic protocol, education, self-assessment, and quality control. © 2003 Elsevier Inc. All rights reserved.
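The belief update described in the abstract above (evidence nodes each linked independently to one decision node, with a likelihood vector entered per feature) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the function name `update_belief`, the two-outcome toy network and all probabilities are hypothetical, and the list of intermediate beliefs stands in for the cumulative probability graph.

```python
def update_belief(prior, cpts, likelihoods):
    """Sequentially update belief over diagnoses from soft evidence.

    prior:       dict mapping diagnosis -> prior probability
    cpts:        one dict per feature, mapping diagnosis -> list of
                 P(feature state | diagnosis) over that feature's states
    likelihoods: one likelihood vector per feature (same state order)
    Returns the belief after each feature, in the order entered.
    """
    belief = dict(prior)
    history = []
    for cpt, lik in zip(cpts, likelihoods):
        unnormalised = {}
        for diagnosis, p in belief.items():
            # Evidence message: sum over states of P(state | d) * L(state)
            message = sum(ps * l for ps, l in zip(cpt[diagnosis], lik))
            unnormalised[diagnosis] = p * message
        total = sum(unnormalised.values())
        if total == 0:
            raise ValueError("evidence is impossible under every diagnosis")
        belief = {d: v / total for d, v in unnormalised.items()}
        history.append(belief)
    return history


# Hypothetical toy network: two outcomes, two binary features.
prior = {"normal": 0.5, "CIN": 0.5}
cpts = [
    {"normal": [0.9, 0.1], "CIN": [0.2, 0.8]},
    {"normal": [0.7, 0.3], "CIN": [0.4, 0.6]},
]
# First feature observed in its second state; second feature uninformative.
likelihoods = [[0.0, 1.0], [1.0, 1.0]]
history = update_belief(prior, cpts, likelihoods)
```

Because the features are assumed conditionally independent given the diagnosis, entering them one at a time gives the same final belief as entering them together, which is what makes a running probability graph meaningful.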
Molecular classification of non-invasive breast lesions for personalised therapy and chemoprevention
Abstract:
Breast cancer screening has led to a dramatic increase in the detection of pre-invasive breast lesions. While mastectomy is almost guaranteed to treat the disease, more conservative approaches could be as effective if patients can be stratified based on risk of co-existing or recurrent invasive disease. Here we use a range of biomarkers to interrogate and classify purely non-invasive lesions (PNL) and those with co-existing invasive breast cancer (CEIN). Apart from Ductal Carcinoma In Situ (DCIS), relative homogeneity is observed. DCIS contained a greater spread of molecular subtypes. Interestingly, high expression of p-mTOR was observed in all PNL with lower expression in DCIS and invasive carcinoma while the opposite expression pattern was observed for TOP2A. Comparing PNL with CEIN, we have identified p53 and Ki67 as predictors of CEIN with a combined PPV and NPV of 90.48% and 43.3% respectively. Furthermore, HER2 expression showed the best concordance between DCIS and its invasive counterpart. We propose that these biomarkers can be used to improve the management of patients with pre-invasive breast lesions following further validation and clinical trials. p53 and Ki67 could be used to stratify patients into low and high-risk groups for co-existing disease. Knowledge of expression of more actionable targets such as HER2 or TOP2A can be used to design chemoprevention or neo-adjuvant strategies. Increased knowledge of the molecular profile of pre-invasive lesions can only serve to enhance our understanding of the disease and, in the era of personalised medicine, bring us closer to improving breast cancer care.
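For readers unfamiliar with the PPV and NPV figures quoted in the abstract above, the standard definitions can be sketched as below. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from confusion-matrix counts.

    PPV = TP / (TP + FP): share of positive predictions that are correct.
    NPV = TN / (TN + FN): share of negative predictions that are correct.
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one positive and one negative prediction")
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical counts, not taken from the study.
ppv, npv = predictive_values(tp=40, fp=10, tn=30, fn=20)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")  # prints "PPV = 80.0%, NPV = 60.0%"
```

A high PPV paired with a low NPV, as reported in the abstract, means a positive marker result is usually right while a negative result rules out co-existing disease much less reliably.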
Abstract:
This paper describes a methodology developed for the classification of Medium Voltage (MV) electricity customers. Starting from a sample of databases resulting from a monitoring campaign, Data Mining (DM) techniques are used to discover a set of typical MV consumer load profiles and thereby to extract knowledge about electric energy consumption patterns. In the first stage, several hierarchical clustering algorithms were applied and their clustering performance was compared using adequacy measures. In the second stage, a classification model was developed to assign new consumers to one of the clusters obtained in the first stage. Finally, the interpretation of the discovered knowledge is presented and discussed.
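The two-stage process in the abstract above (cluster load profiles hierarchically, then assign new consumers to a cluster) can be sketched as follows. This is only an illustration under assumptions: average linkage and a nearest-centroid rule are stand-ins, since the abstract does not say which clustering algorithms or classification model were used, and the three-point load profiles are invented.

```python
import math


def dist(a, b):
    """Euclidean distance between two load profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(profiles, k):
    """Average-linkage agglomerative clustering down to k clusters.

    Returns a list of clusters, each a list of indices into profiles.
    """
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(dist(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


def centroid(profiles, members):
    """Mean profile of the given cluster members."""
    return [sum(profiles[m][t] for m in members) / len(members)
            for t in range(len(profiles[0]))]


def classify(new_profile, centroids):
    """Index of the cluster whose centroid is nearest to new_profile."""
    return min(range(len(centroids)), key=lambda c: dist(new_profile, centroids[c]))


# Invented normalised profiles (morning, afternoon, night).
profiles = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.1],
    [0.1, 0.2, 1.0],
    [0.2, 0.1, 0.9],
]
clusters = agglomerate(profiles, k=2)
centroids = [centroid(profiles, c) for c in clusters]
new_cluster = classify([0.8, 0.9, 0.3], centroids)
```

The centroids double as the "typical load profiles" for each class, so the same objects serve both the descriptive stage and the classification stage.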
Abstract:
Site-specific management requires accurate knowledge of the spatial variation in a range of soil properties within fields. This involves considerable sampling effort, which is costly. Ancillary data, such as crop yield, elevation and apparent electrical conductivity (ECa) of the soil, can provide insight into the spatial variation of some soil properties. A multivariate classification with spatial constraint imposed by the variogram was used to classify data from two arable crop fields. The yield data comprised 5 years of crop yield, and the ancillary data 3 years of yield data, elevation and ECa. Information on soil chemical and physical properties was provided by intensive surveys of the soil. Multivariate variograms computed from these data were used to constrain sites spatially within classes to increase their contiguity. The constrained classifications resulted in coherent classes, and those based on the ancillary data were similar to those from the soil properties. The ancillary data seemed to identify areas in the field where the soil is reasonably homogeneous. The results of targeted sampling showed that these classes could be used as a basis for management and to guide future sampling of the soil.
Abstract:
Background: Since their inception, Twitter and related microblogging systems have provided a rich source of information for researchers and have attracted interest in their affordances and use. Since 2009 PubMed has included 123 journal articles on medicine and Twitter, but no overview exists as to how the field uses Twitter in research. // Objective: This paper aims to identify published work relating to Twitter indexed by PubMed, and then to classify it. This classification will provide a framework in which future researchers will be able to position their work, and will provide an understanding of the current reach of research using Twitter in medical disciplines. Limiting the study to papers indexed by PubMed ensures the work provides a reproducible benchmark. // Methods: Papers, indexed by PubMed, on Twitter and related topics were identified and reviewed. The papers were then qualitatively classified based on the paper’s title and abstract to determine their focus. The work that was Twitter focused was studied in detail to determine what data, if any, it was based on, and from this a categorization of the data set size used in the studies was developed. Using open coded content analysis, additional important categories were also identified, relating to the primary methodology, domain and aspect. // Results: As of 2012, PubMed comprises more than 21 million citations from biomedical literature, and from these a corpus of 134 potentially Twitter-related papers was identified, eleven of which were subsequently found not to be relevant. There were no papers prior to 2009 relating to microblogging, a term first used in 2006. Of the remaining 123 papers which mentioned Twitter, thirty were focussed on Twitter (the others referring to it tangentially). The early Twitter focussed papers introduced the topic and highlighted the potential, not carrying out any form of data analysis.
The majority of published papers used analytic techniques to sort through thousands, if not millions, of individual tweets, often depending on automated tools to do so. Our analysis demonstrates that researchers are starting to use knowledge discovery methods and data mining techniques to understand vast quantities of tweets: the study of Twitter is becoming quantitative research. // Conclusions: This work is to the best of our knowledge the first overview study of medical related research based on Twitter and related microblogging. We have used five dimensions to categorise published medical related research on Twitter. This classification provides a framework within which researchers studying development and use of Twitter within medical related research, and those undertaking comparative studies of research relating to Twitter in the area of medicine and beyond, can position and ground their work.