942 results for Knowledge Discovery Database
Abstract:
Narcolepsy with cataplexy is a rare disease with an estimated prevalence of 0.02% in European populations. Narcolepsy shares many features of rare disorders, in particular the lack of awareness of the disease, with serious consequences for healthcare provision. As with other rare diseases, only a few European countries have registered narcolepsy cases in databases of the International Classification of Diseases or in registries of the European health authorities. A promising approach to identifying disease-specific adverse health effects and needs in healthcare delivery in the field of rare diseases is to establish a distributed expert network. A first and important step is to create a database that allows collection, storage and dissemination of data on narcolepsy in a comprehensive and systematic way. Here, the first prospective web-based European narcolepsy database, hosted by the European Narcolepsy Network, is introduced. The database structure, standardization of data acquisition and quality-control procedures are described, and an overview is given of the first 1079 patients from 18 European specialized centres. Owing to its standardization, this continuously growing data pool promises better insight into many unsolved aspects of narcolepsy and related disorders, including clear phenotype characterization of narcolepsy subtypes, more precise epidemiological data, knowledge of the natural history of narcolepsy, expectations about treatment effects and identification of post-marketing medication side-effects; it will also contribute to improving clinical trial designs and provide facilities to further develop phase III trials.
Abstract:
Introduction: Online databases can support the implementation of evidence-based practice by providing easy access to research. OTseeker (www.otseeker.com), an electronic evidence database, was introduced in 2003 to assist occupational therapists to locate and interpret research. Objectives: This study explored Australian occupational therapists' use and perceptions of OTseeker and its impact on their knowledge and practice. Methods: A postal survey questionnaire was distributed to two samples: (i) a proportionate random sample of 400 occupational therapists from all states and territories of Australia, and (ii) a random sample of occupational therapists working in 95 facilities in two Australian states (Queensland and New South Wales). Results: The questionnaire was completed by 213 participants. While most participants (85.9%) had heard of OTseeker, only 103 (56.6%) had accessed it, with lack of time being the main reason for non-use. Of the 103 participants who had accessed OTseeker, 68.9% had done so infrequently, 63.1% agreed that it had increased their knowledge and 13.6% had changed their practice after accessing information on OTseeker. Conclusion: Despite OTseeker being developed to provide occupational therapists with easy access to research, lack of time was the main reason why over half of the participants in this study had not accessed it. This exploratory research suggests, however, that there is potential for the database to influence occupational therapists' knowledge and practice about treatment efficacy through access to the research literature.
Abstract:
Knowledge sharing is an essential component of effective knowledge management. However, evaluation apprehension, or the fear that one's work may be critiqued, can inhibit knowledge sharing. Using the general framework of social exchange theory, we examined the effects of evaluation apprehension and perceived benefit of knowledge sharing (such as enhanced reputation) on employees' knowledge sharing intentions in two contexts: interpersonal (i.e., by direct contact between two employees) and database (i.e., via repositories). Evaluation apprehension was negatively associated with knowledge sharing intentions in both contexts, while perceived benefit was positively associated with knowledge sharing intentions only in the database context. Moreover, compared to the interpersonal context, evaluation apprehension was higher and knowledge sharing intentions lower in the database context. Finally, the negative effects of evaluation apprehension on knowledge sharing intentions were worse when perceived benefits were low than when perceived benefits were high.
Abstract:
While others have attempted to determine, by way of mathematical formulae, optimal resource duplication strategies for random walk protocols, this paper is concerned with studying the emergent effects of dynamic resource propagation and replication. In particular, we show, via modelling and experimentation, that under any given decay (purge) rate the number of nodes that have knowledge of a particular resource converges to a fixed point or a limit cycle. We also show that even for high rates of decay - that is, when few nodes have knowledge of a particular resource - the number of hops required to find that resource is small.
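To make the claimed dynamic concrete, the following is a minimal simulation sketch, not the paper's actual model: the network size, seeding, and the replication and purge probabilities are all assumed values chosen for illustration. With replication outpacing decay, the count of nodes holding the resource settles around a fixed point.

```python
# Minimal sketch: dynamic resource replication under a fixed decay (purge)
# rate. All parameters are illustrative assumptions, not the paper's.
import random

N = 1000          # nodes in the network (assumed)
REPLICATE = 0.30  # chance a knowing node copies the resource to a random peer
DECAY = 0.20      # chance a knowing node purges its copy each step

knows = [False] * N
for i in range(50):          # seed the resource at a few nodes
    knows[i] = True

for step in range(200):
    next_knows = knows[:]
    for i in range(N):
        if knows[i]:
            if random.random() < REPLICATE:       # replicate to a random peer
                next_knows[random.randrange(N)] = True
            if random.random() < DECAY:           # purge own copy
                next_knows[i] = False
    knows = next_knows
    if step % 20 == 0:
        print(step, sum(knows))  # count drifts toward a stable level (~N/3 here)
```

With these rates the expected gains from replication balance the losses from decay at roughly one third of the network, so the knowing-node count hovers there rather than saturating or dying out.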
Abstract:
This paper challenges current practices in the use of digital media to communicate Australian Aboriginal knowledge practices in a learning context. It proposes that any digital representation of Aboriginal knowledge practices needs to examine the epistemology and ontology of these practices in order to design digital environments that effectively support and enable existing Aboriginal knowledge practices in the real world. Central to this is the essential task of any new digital representation of Aboriginal knowledge to resolve the conflict between database and narrative views of knowledge (L. Manovich, 2001), in order to provide a tool that complements rather than supplants direct experience of traditional knowledge practices (V. Hart, 2001). The paper concludes by reporting on the recent development of an advanced learning technology that addresses these issues.
Abstract:
Pattern discovery in temporal event sequences is of great importance in many application domains, such as telecommunication network fault analysis. In reality, not every type of event has an accurate timestamp. Some of them, defined as inaccurate events, may have only an interval as their possible time of occurrence. The existence of inaccurate events may cause uncertainty in event ordering. The traditional support model cannot deal with this uncertainty, which would cause some interesting patterns to be missed. A new concept, precise support, is introduced to evaluate the probability of a pattern being contained in a sequence. Based on this new metric, we define the uncertainty model and present an algorithm to discover interesting patterns in a sequence database that has one type of inaccurate event. In our model, the number of types of inaccurate events can readily be extended to k, though at the cost of increased computational complexity.
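As an illustration of how such a probability can be computed, the sketch below assumes (an assumption, not stated in the abstract) that an inaccurate event's occurrence time is uniformly distributed over its interval; the probability that a two-event pattern holds in a sequence then follows from the fraction of the interval consistent with the required ordering.

```python
# Minimal sketch of the "precise support" idea for a two-event pattern
# <A, B> (A before B), where A is an inaccurate event known only to occur
# somewhere in an interval. Uniform occurrence is an illustrative assumption.

def prob_before(a_interval, b_time):
    """P(A occurs before an exactly timed event B), A uniform on [lo, hi]."""
    lo, hi = a_interval
    if hi <= b_time:
        return 1.0    # A certainly precedes B
    if lo >= b_time:
        return 0.0    # A certainly follows B
    return (b_time - lo) / (hi - lo)   # fraction of the interval before B

# Sequence with inaccurate event A somewhere in [2, 8] and exact event B at t=5:
print(prob_before((2.0, 8.0), 5.0))   # 0.5 -> contributes 0.5 to the support

# The precise support of <A, B> over a database would then aggregate these
# per-sequence probabilities instead of binary containment counts.
```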
Abstract:
Plants show a remarkable capacity to adapt to stress conditions, and the study of the molecular components underlying adaptation in cereal crops of dietary importance, such as wheat, is of particular interest for developing varieties that allow good yields with low input even under suboptimal environmental conditions. Exposure of cereal crops to heat stress during certain phases of the life cycle negatively affects yield and quality; it is therefore necessary to clarify the genetic and molecular bases of thermotolerance in order to identify advantageous genes and alleles for use in breeding programmes aimed at genetic improvement. Numerous studies demonstrate the involvement of chloroplast-localized sHSPs (in wheat, sHSP26) in the mechanism of acquired thermotolerance, and their interaction with several components of photosystem II (PSII), which may exert a protective action under heat stress and other types of stress. The aim of the project is to characterize, in durum wheat, new allelic variants associated with heat-stress tolerance using TILLING (Targeting Induced Local Lesions In Genomes), a reverse-genetics approach based on mutagenesis and the identification of induced mutations at sites of interest. During this thesis, three complete gene sequences for smallHsp26 were isolated and characterized, named TdHsp26-A1, TdHsp26-A2 and TdHsp26-B1, together with a putative pseudogene named TdHsp26-A3. The isolated genes were used as targets in TILLING analyses of two durum wheat populations mutagenized with EMS (ethyl methanesulfonate). Two different TILLING approaches were employed in our study: a classical TILLING approach based on screening by High Resolution Melting (HRM), and an innovative approach exploiting a recently developed TILLING database. The mutant population cv. Kronos was analysed for mutations in all three identified genes by online search of the TILLING database, which relies on exome capture of the TILLING population followed by high-throughput sequencing. Using this technique, 36 lines carrying missense mutations were identified in the mutagenized durum wheat population cv. Kronos. In parallel, HRM screening of 960 genotypes from the durum wheat TILLING library cv. Cham1 identified mutations in a 211 bp region of functional interest of the TdHsp26-B1 gene, among them three mutant lines carrying homozygous missense mutations. Some missense mutations identified in the two genes TdHsp26-A1 and TdHsp26-B1 were confirmed in vivo in plants of the respective mutant lines by generating codominant KASP (Kompetitive Allele Specific PCR) markers, which also made it possible to verify the zygosity of these mutations. To reduce the number of unwanted mutations in the most interesting lines, the mutants were backcrossed to their wild-type parents, and some double mutants were generated that will allow a better understanding of the molecular mechanisms governed by this gene class. The F1 individuals from the crosses were then genotyped with the same KASP markers specific for the mutation of interest to verify the success of each cross.
This approach made it possible to identify and develop genetic resources useful for undertaking functional studies on the role of plastidial smallHSPs involved in the acquisition of thermotolerance in durum wheat, and to generate markers potentially useful in future breeding programmes.
Abstract:
Derivational morphology proposes meaningful connections between words and is largely unrepresented in lexical databases. This thesis presents a project to enrich a lexical database with morphological links and to evaluate their contribution to disambiguation. A lexical database with sense distinctions was required. WordNet was chosen because of its free availability and widespread use. Its suitability was assessed through critical evaluation with respect to specifications and criticisms, using a transparent, extensible model. The identification of serious shortcomings suggested a portable enrichment methodology, applicable to alternative resources. Although 40% of the most frequent words are prepositions, they have been largely ignored by computational linguists, so the addition of prepositions was also required. The preferred approach to morphological enrichment was to infer relations from phenomena discovered algorithmically. Both existing databases and existing algorithms can capture regular morphological relations but cannot capture exceptions correctly; neither provides any semantic information. Some morphological analysis algorithms are subject to the fallacy that morphological analysis can be performed simply by segmentation. Morphological rules, grounded in observation and etymology, govern associations between and attachment of suffixes and contribute to defining the meaning of morphological relationships. Specifying character substitutions circumvents the segmentation fallacy. Morphological rules are prone to undergeneration, minimised through a variable lexical-validity requirement, and overgeneration, minimised by rule reformulation and by restricting monosyllabic output. Rules take into account the morphology of ancestor languages through co-occurrences of morphological patterns. Multiple rules applicable to an input suffix need their precedence established. The resistance of prefixations to segmentation has been addressed by identifying linking-vowel exceptions and irregular prefixes. The automatic affix discovery algorithm applies heuristics to identify meaningful affixes and is combined with morphological rules into a hybrid model, fed only with empirical data collected without supervision. Further algorithms apply the rules optimally to automatically pre-identified suffixes and break words into their component morphemes. To handle exceptions, stoplists were created in response to initial errors and fed back into the model through iterative development, leading to 100% precision, contestable only on lexicographic criteria. Stoplist length is minimised by special treatment of monosyllables and by reformulation of rules. 96% of words and phrases are analysed. 218,802 directed derivational links have been encoded in the lexicon rather than in the wordnet component of the model, because the lexicon provides the optimal clustering of word senses. Both the links and the analyser are portable to an alternative lexicon. The evaluation uses the extended gloss overlaps disambiguation algorithm. The enriched model outperformed WordNet in terms of recall without loss of precision. The failure of all experiments to outperform disambiguation by frequency reflects on WordNet's sense distinctions.
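A minimal sketch of the character-substitution idea follows; the rule set and mini-lexicon are illustrative assumptions rather than the thesis's actual rules, but they show how substitution plus a lexical-validity check avoids the segmentation fallacy (recovering "happy" from "happiness" rather than the non-word "happi").

```python
# Minimal sketch: morphological rules as character substitutions with a
# lexical-validity check. Rules and lexicon are illustrative assumptions.

LEXICON = {"happy", "deny", "decide", "happiness", "denial", "decision"}

# (input suffix, replacement) pairs; list order stands in for rule precedence.
RULES = [("iness", "y"), ("ial", "y"), ("sion", "de")]

def derive_base(word):
    """Return (base, rule) for the first rule whose output passes validation."""
    for suffix, repl in RULES:
        if word.endswith(suffix):
            base = word[: -len(suffix)] + repl
            if base in LEXICON:           # the lexical-validity requirement
                return base, (suffix, repl)
    return None

print(derive_base("happiness"))  # ('happy', ('iness', 'y'))
print(derive_base("denial"))     # ('deny', ('ial', 'y'))
print(derive_base("decision"))   # ('decide', ('sion', 'de'))
```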
Abstract:
The continuing threat of infectious disease and future pandemics, coupled with the continual increase in drug-resistant pathogens, makes the discovery of new and better vaccines imperative. For effective vaccine development, antigen discovery and validation are prerequisites. The compilation of information concerning pathogens, virulence factors and antigenic epitopes has resulted in many useful databases. However, most such immunological databases focus almost exclusively on antigens whose epitopes are known and ignore those for which epitope information is unavailable. We have compiled more than 500 antigens into the AntigenDB database, making use of the literature and other immunological resources. These antigens come from 44 important pathogenic species. In AntigenDB, a database entry contains information regarding the sequence, structure, origin, etc. of an antigen, with additional information such as B- and T-cell epitopes, MHC binding, function, gene expression and post-translational modifications, where available. AntigenDB also provides links to major internal and external databases. We shall update AntigenDB on a rolling basis, regularly adding antigens from other organisms and extra data-analysis tools. AntigenDB is available freely at http://www.imtech.res.in/raghava/antigendb and its mirror site http://www.bic.uams.edu/raghava/antigendb.
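As a sketch of what one record might look like programmatically, the field names below merely paraphrase the data the abstract lists (sequence, origin, epitopes, MHC binding, cross-references) and are assumptions, not AntigenDB's actual schema.

```python
# Minimal sketch of an AntigenDB-style entry; all field names hypothetical.
from dataclasses import dataclass, field

@dataclass
class AntigenEntry:
    antigen_id: str                # hypothetical identifier scheme
    pathogen: str                  # one of the 44 source species
    sequence: str
    b_cell_epitopes: list[str] = field(default_factory=list)
    t_cell_epitopes: list[str] = field(default_factory=list)
    mhc_binding: dict[str, float] = field(default_factory=dict)  # allele -> affinity
    cross_references: list[str] = field(default_factory=list)    # links to other DBs

entry = AntigenEntry(
    antigen_id="AG0001",
    pathogen="Mycobacterium tuberculosis",
    sequence="MKTAYIAKQR",         # placeholder sequence for illustration
    b_cell_epitopes=["TAYIAKQR"],  # placeholder epitope
)
print(entry.pathogen, len(entry.b_cell_epitopes))
```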
Abstract:
This thesis introduces a flexible visual data exploration framework which combines advanced projection algorithms from the machine learning domain with visual representation techniques developed in the information visualisation domain to help a user explore and understand large multi-dimensional datasets effectively. The advantage of such a framework over other techniques currently available to domain experts is that the user is directly involved in the data mining process and advanced machine learning algorithms are employed for better projection. A hierarchical visualisation model guided by a domain expert allows the expert to obtain an informed segmentation of the input space. Two other components of this thesis exploit properties of these principled probabilistic projection algorithms: a guided mixture of local experts algorithm which provides robust prediction, and a model to estimate feature saliency simultaneously with the training of a projection algorithm. Local models are useful since a single global model cannot capture the full variability of a heterogeneous data space such as the chemical space. Probabilistic hierarchical visualisation techniques provide an effective soft segmentation of an input space by a visualisation hierarchy whose leaf nodes represent different regions of the input space. We use this soft segmentation to develop a guided mixture of local experts (GME) algorithm which is appropriate for the heterogeneous datasets found in chemoinformatics problems. Moreover, in this approach the domain experts are more involved in the model development process, which suits an intuition- and domain-knowledge-driven task such as drug discovery. We also derive a generative topographic mapping (GTM) based data visualisation approach which estimates feature saliency simultaneously with the training of a visualisation model.
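The toy sketch below illustrates the guided-mixture idea under stated assumptions: Gaussian responsibilities around two fixed centres stand in for the soft segmentation produced by the visualisation hierarchy, and two weighted linear regressors stand in for the local experts; none of this reproduces the thesis's actual GTM-based model.

```python
# Toy sketch: soft segmentation + responsibility-weighted local experts.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.where(X[:, 0] < 0, -2 * X[:, 0], X[:, 0] ** 2) + rng.normal(0, 0.1, 200)

centres = np.array([[-1.5], [1.5]])   # stand-ins for hierarchy leaf nodes
width = 1.0

# Soft segmentation: Gaussian responsibilities of each expert for each point.
d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
resp = np.exp(-d2 / (2 * width ** 2))
resp /= resp.sum(axis=1, keepdims=True)

# One weighted linear expert per region (weighted least squares).
Xb = np.hstack([X, np.ones((len(X), 1))])   # add bias column
experts = []
for k in range(len(centres)):
    W = np.diag(resp[:, k])
    experts.append(np.linalg.solve(Xb.T @ W @ Xb, Xb.T @ W @ y))

# Prediction: responsibility-weighted blend of the local experts.
pred = sum(resp[:, k] * (Xb @ w) for k, w in enumerate(experts))
print("RMSE:", np.sqrt(np.mean((pred - y) ** 2)))
```

The blend fits the kinked synthetic function far better than a single global line would, which is the motivation the abstract gives for local models in heterogeneous spaces such as the chemical space.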
Abstract:
Computer-Based Learning systems of one sort or another have been in existence for almost 20 years, but they have yet to achieve real credibility within Commerce, Industry or Education. A variety of reasons could be postulated for this, typically:
- cost
- complexity
- inefficiency
- inflexibility
- tedium
Obviously different systems deserve different levels and types of criticism, but it remains true that Computer-Based Learning (CBL) is falling significantly short of its potential. Experience of a small but highly successful CBL system within a large, geographically distributed industry (the National Coal Board) prompted an investigation into currently available packages, the original intention being to purchase the most suitable software and run it on existing computer hardware, alongside existing software systems. It became apparent that none of the available CBL packages were suitable, and a decision was taken to develop an in-house Computer-Assisted Instruction system according to the following criteria:
- cheap to run;
- easy to author course material;
- easy to use;
- requires no computing knowledge to use (as either an author or a student);
- efficient in the use of computer resources;
- has a comprehensive range of facilities at all levels.
This thesis describes the initial investigation, the resultant observations and the design, development and implementation of the SCHOOL system. One of the principal characteristics of SCHOOL is that it uses a hierarchical database structure for the storage of course material, thereby inherently providing much of the power, flexibility and efficiency originally required. Trials using the SCHOOL system on IBM 303X series equipment are also detailed, along with proposed and current development work on what is essentially an operational CBL system within a large-scale industrial environment.
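A minimal sketch of the hierarchical idea follows; node names and layout are illustrative assumptions, not SCHOOL's actual record format, but they show why a tree of course material naturally yields flexible authoring and efficient retrieval.

```python
# Minimal sketch: course material stored as a hierarchy (course -> module
# -> lesson/quiz), traversed depth-first for presentation.

class CourseNode:
    def __init__(self, title, content=""):
        self.title = title
        self.content = content          # frame text, question, etc.
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

    def walk(self, depth=0):
        """Depth-first traversal: the natural presentation order."""
        print("  " * depth + self.title)
        for child in self.children:
            child.walk(depth + 1)

course = CourseNode("Mine Safety Course")          # hypothetical content
module = course.add(CourseNode("Module 1: Ventilation"))
module.add(CourseNode("Lesson 1.1: Airflow basics"))
module.add(CourseNode("Quiz 1"))
course.walk()
```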
Abstract:
The work described was carried out as part of a collaborative Alvey software engineering project (project number SE057). The project collaborators were the Inter-Disciplinary Higher Degrees Scheme of the University of Aston in Birmingham, BIS Applied Systems Ltd. (BIS) and the British Steel Corporation. The aim of the project was to investigate the potential application of knowledge-based systems (KBSs) to the design of commercial data processing (DP) systems. The work was primarily concerned with BIS's Structured Systems Design (SSD) methodology for DP systems development and how users of this methodology could be supported using KBS tools. The problems encountered by users of SSD are discussed and potential forms of computer-based support for inexpert designers are identified. An architecture for a support environment for SSD, the Intellipse system, is proposed, based on the integration of KBS and non-KBS tools for individual design tasks within SSD. The Intellipse system has two modes of operation: Advisor and Designer. The design, implementation and user evaluation of Advisor are discussed. The results of a Designer feasibility study, the aim of which was to analyse major design tasks in SSD to assess their suitability for KBS support, are reported. The potential role of KBS tools in the domain of database design is discussed. The project involved extensive knowledge engineering sessions with expert DP systems designers, and some practical lessons in relation to KBS development are derived from this experience. The nature of the expertise possessed by expert designers is discussed. The need for operational KBSs to be built to the same standards as other commercial and industrial software is identified. A comparison between current KBS and conventional DP systems development is made, and on the basis of this analysis a structured development method for KBSs is proposed: the POLITE model. Some initial results of applying this method to KBS development are discussed. Several areas for further research and development are identified.
Abstract:
The World Wide Web provides plentiful content for Web-based learning, but its hyperlink-based architecture connects Web resources for free browsing rather than for effective learning. To support effective learning, an e-learning system should be able to discover and make use of the semantic communities and the emerging semantic relations in a dynamic complex network of learning resources. Previous graph-based community discovery approaches are limited in their ability to discover semantic communities. This paper first suggests the Semantic Link Network (SLN), a loosely coupled semantic data model that can semantically link resources and derive implicit semantic links according to a set of relational reasoning rules. By studying the intrinsic relationship between semantic communities and the semantic space of SLN, approaches to discovering reasoning-constraint, rule-constraint, and classification-constraint semantic communities are proposed. Further, the approaches, principles, and strategies for discovering emerging semantics in dynamic SLNs are studied. The basic laws of semantic link network motion are revealed for the first time. An e-learning environment incorporating the proposed approaches, principles, and strategies to support effective discovery and learning is suggested.
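A minimal sketch of deriving implicit links follows; the link types and the single transitivity rule are illustrative assumptions, not the paper's full set of relational reasoning rules.

```python
# Minimal sketch: deriving implicit semantic links by a reasoning rule.
# Here the rule is that "partOf" composes transitively:
#   (a partOf b) and (b partOf c)  =>  (a partOf c)

links = {("lessonA", "partOf", "courseX"),
         ("courseX", "partOf", "curriculumY"),
         ("lessonB", "similarTo", "lessonA")}

def closure(links):
    """Repeatedly apply the rule until no new links are derivable."""
    links = set(links)
    while True:
        derived = {(a, "partOf", d)
                   for (a, r1, b) in links if r1 == "partOf"
                   for (c, r2, d) in links if r2 == "partOf" and b == c}
        new = derived - links
        if not new:
            return links
        links |= new

for triple in sorted(closure(links)):
    print(triple)  # includes derived ('lessonA', 'partOf', 'curriculumY')
```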
Abstract:
In current organizations, valuable enterprise knowledge is often buried under a rapidly expanding amount of unstructured information in the form of web pages, blogs, and other forms of human text communication. We present a novel unsupervised machine learning method called CORDER (COmmunity Relation Discovery by named Entity Recognition) to turn these unstructured data into structured information for knowledge management in these organizations. CORDER exploits named entity recognition and co-occurrence data to associate individuals in an organization with their expertise and associates. We discuss the problems associated with evaluating unsupervised learners and report our initial evaluation experiments: an expert evaluation, a quantitative benchmark, and an application of CORDER in a social networking tool called BuddyFinder.
Abstract:
Discovering who works with whom, on which projects and with which customers is a key task in knowledge management. Although most organizations keep models of organizational structures, these models do not necessarily accurately reflect the reality on the ground. In this paper we present a text mining method called CORDER which first recognizes named entities (NEs) of various types from Web pages and then discovers relations from a target NE to other NEs which co-occur with it. We evaluated the method on our departmental Website. We used the CORDER method to first find related NEs of four types (organizations, people, projects, and research areas) from Web pages on the Website and then rank them according to their co-occurrence with each of the people in our department. Twenty representative people were selected, and each was presented with ranked lists of each type of NE. Each person specified whether these NEs were related to him or her and changed or confirmed their rankings. Our results indicate that the method can find the NEs with which these people are closely related and provide accurate rankings.
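Both CORDER abstracts describe the same core step: counting how often recognised NEs co-occur with a target NE and ranking them by that signal. The sketch below illustrates it with plain co-occurrence counts; the example pages are invented, and CORDER's actual ranking function is not reproduced here.

```python
# Minimal sketch: rank named entities by co-occurrence with a target entity.
from collections import Counter
from itertools import combinations

# Named entities recognised per Web page (output of an assumed NER step).
pages = [
    {"Alice", "ProjectX", "AcmeCorp"},
    {"Alice", "ProjectX", "Bob"},
    {"Bob", "ProjectY", "AcmeCorp"},
]

cooc = Counter()
for entities in pages:
    for a, b in combinations(sorted(entities), 2):
        cooc[(a, b)] += 1

def related(target):
    """Rank entities by how often they co-occur with the target."""
    scores = Counter()
    for (a, b), n in cooc.items():
        if a == target:
            scores[b] += n
        elif b == target:
            scores[a] += n
    return scores.most_common()

print(related("Alice"))  # [('ProjectX', 2), ('AcmeCorp', 1), ('Bob', 1)]
```

In the departmental-Website setting the abstract describes, `pages` would be the department's Web pages and `target` each of the twenty selected people; the ranked lists are what the reviewers then confirmed or re-ranked.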