805 results for Knowledge Discovery in Databases
Abstract:
The flexibility of information systems (IS) has been studied as a way to improve their adaptability in support of business agility, understood as the set of capabilities that lets an organisation compete more effectively and adapt to rapid changes in market conditions (Glossary of business agility terms, 2003). However, most work on IS flexibility has been limited to systems architecture, ignoring the analysis of interoperability as a part of flexibility from the requirements stage onwards. This paper reports a PhD project that proposes an approach to developing IS with flexibility features, considering challenges that flexibility poses in small and medium enterprises (SMEs), such as the lack of interoperability and the agility of their business. The motivations for this research are the high price of IS in developing countries and the usefulness of organizational semiotics for supporting the analysis of IS requirements (Liu, 2005).
Abstract:
Background: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system used to design and execute scientific workflows and to aid in silico experimentation. The availability of fast unsupervised methods for clustering and visualization on the Taverna platform is important to support data-driven scientific discovery in complex and explorative bioinformatics applications. Results: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that performs clustering of high-dimensional biological data and provides a nonlinear, topology-preserving projection for visualizing the input data and their similarities. The core algorithm in the BioDICE plugin is the Fast Learning Self Organizing Map (FLSOM), an improved variant of the Self Organizing Map (SOM) algorithm. The plugin generates an interactive 2D map that allows the visual exploration of multidimensional data and the identification of groups of similar objects. The effectiveness of the plugin is demonstrated on a case study involving chemical compounds. Conclusions: The number and variety of available tools, together with its extensibility, have made Taverna a popular choice for developing scientific data workflows. This work presents a novel plugin, BioDICE, which adds a data-driven knowledge discovery component to Taverna. BioDICE provides an effective and powerful clustering tool that can be adopted for the explorative analysis of biological datasets.
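FLSOM is a variant of the classic SOM; to illustrate the underlying technique of mapping high-dimensional data onto a 2D grid, here is a minimal SOM sketch in NumPy. The grid size, decay schedules and function names are illustrative assumptions, not BioDICE's actual implementation.

```python
import numpy as np

def train_som(data, grid=(8, 8), epochs=50, lr0=0.5, sigma0=3.0, seed=0):
    """Train a minimal Self-Organizing Map (online updates).

    Illustrative sketch only; returns the trained weight grid of
    shape (rows, cols, n_features).
    """
    rng = np.random.default_rng(seed)
    rows, cols = grid
    n, d = data.shape
    w = rng.random((rows, cols, d))
    # Grid coordinates, used to compute neighbourhood distances on the map.
    yy, xx = np.mgrid[0:rows, 0:cols]
    coords = np.stack([yy, xx], axis=-1).astype(float)
    for t in range(epochs):
        lr = lr0 * np.exp(-t / epochs)        # decaying learning rate
        sigma = sigma0 * np.exp(-t / epochs)  # shrinking neighbourhood radius
        for x in data[rng.permutation(n)]:
            # Best-matching unit: the node whose weights are closest to x.
            dists = np.linalg.norm(w - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), (rows, cols))
            # Gaussian neighbourhood around the BMU on the 2D grid.
            g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1)
                       / (2 * sigma ** 2))
            # Pull the BMU and its neighbours towards the sample.
            w += lr * g[..., None] * (x - w)
    return w

def bmu_of(w, x):
    """Map a sample to its best-matching grid cell."""
    return np.unravel_index(np.argmin(np.linalg.norm(w - x, axis=-1)),
                            w.shape[:2])
```

After training, `bmu_of` places each object on the 2D map, so similar objects land in nearby cells, which is the basis for the kind of interactive map exploration the plugin provides.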
Abstract:
Human brain imaging techniques, such as Magnetic Resonance Imaging (MRI) and Diffusion Tensor Imaging (DTI), are established scientific and diagnostic tools whose adoption is growing in popularity. Statistical methods, machine learning and data mining algorithms have been adopted successfully to extract predictive and descriptive models from neuroimage data. However, the knowledge discovery process typically also requires pre-processing, post-processing and visualisation techniques arranged in complex data workflows. A major obstacle to the integrated pre-processing and mining of MRI data is the lack of comprehensive platforms that avoid the manual invocation of pre-processing and mining tools, which yields an error-prone and inefficient process. In this work we present K-Surfer, a novel plug-in for the Konstanz Information Miner (KNIME) workbench that automates the pre-processing of brain images and leverages the mining capabilities of KNIME in an integrated way. K-Surfer supports the importing, filtering, merging and pre-processing of neuroimage data from FreeSurfer, a tool for feature extraction and interpretation of human brain MRI. By automating the steps for importing FreeSurfer data, K-Surfer reduces time costs, eliminates human errors and enables the design of complex analytics workflows for neuroimage data that leverage the rich functionality available in the KNIME workbench.
Abstract:
We show how multivariate GARCH models can be used to generate a time-varying “information share” (Hasbrouck, 1995) to represent the changing patterns of price discovery in closely related securities. We find that time-varying information shares can improve credit spread predictions.
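For a single covariance snapshot, the Hasbrouck (1995) information share attributes the variance of the efficient-price innovation to each market via a Cholesky factorisation; a time-varying version simply evaluates this at each GARCH-implied covariance matrix. The sketch below is a minimal illustration of the static formula, not the paper's multivariate GARCH estimation, and the function names are assumptions.

```python
import numpy as np

def information_shares(psi, omega):
    """Hasbrouck (1995) information shares for one covariance snapshot.

    psi   : common-factor weight vector (length k)
    omega : k x k covariance matrix of price innovations
    Returns a length-k vector of shares summing to 1. Note the known
    caveat: the result depends on the Cholesky ordering of markets,
    so upper and lower bounds are usually reported.
    """
    f = np.linalg.cholesky(omega)   # lower-triangular factor, omega = f @ f.T
    contrib = (psi @ f) ** 2        # squared contribution of each market
    return contrib / contrib.sum()  # normalise by psi' omega psi
```

A time-varying information share is obtained by calling this on each conditional covariance matrix Omega_t produced by the multivariate GARCH model.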
Abstract:
This book introduces six general procedures for teaching grammar to learners of English as a second language. The procedures are designed to encourage learners to notice, explore and practice grammar in context. Each description and discussion of a procedure is followed by two sample lesson plans, together with sample texts and worksheets, which teachers can either use as is or adapt for their own students. The lessons are suitable for a wide range of students, from beginners to advanced learners. A final chapter provides examples of lessons in which several procedures are combined. In addition, before each sample lesson plan, the grammar focus of the lesson is briefly explained for the teacher. All of these procedures illustrate how grammar can be taught through texts. They are based on an understanding of the latest research on pedagogical grammar and on the role of language awareness and discovery in second language learning, and they provide teachers with principles they can apply in developing their own teaching materials and activities. The grammar explanations preceding each teaching plan provide a fresh look at English grammar, drawing on work in systemic functional linguistics.
Abstract:
This literature review evaluates print knowledge ability in normally hearing pre-readers with and without specific language impairments. It then discusses implications of print knowledge ability in students who are deaf or hard of hearing and early intervention strategies.
Abstract:
Point placement strategies aim at mapping data points represented in higher dimensions to two-dimensional spaces and are frequently used to visualize relationships amongst data instances. They have been valuable tools for the analysis and exploration of data sets of various kinds. Many conventional techniques, however, do not behave well when the number of dimensions is high, as in the case of document collections. Later approaches handle that shortcoming, but may cause too much clutter to allow flexible exploration to take place. In this work we present a novel hierarchical point placement technique that is capable of dealing with both problems. Good grouping and separation of highly similar data are maintained without increasing computational cost, and the technique's hierarchical structure lends itself both to exploration at various levels of detail and to handling data in subsets, improving analysis capability and also allowing the manipulation of larger data sets.
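As an illustration of the basic point placement task (the hierarchical technique of the paper is not reproduced here), classical multidimensional scaling maps high-dimensional points to 2D while preserving large-scale pairwise distances. A minimal sketch, with illustrative names:

```python
import numpy as np

def classical_mds(points, out_dim=2):
    """Project points to out_dim dimensions via classical MDS.

    Double-centres the squared pairwise distance matrix to recover a
    Gram matrix, then keeps the top eigenvectors; this preserves the
    dominant distance structure of the input.
    """
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    n = len(points)
    j = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    b = -0.5 * j @ d2 @ j                 # Gram matrix of centred data
    vals, vecs = np.linalg.eigh(b)        # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:out_dim]
    # Coordinates: eigenvectors scaled by sqrt of (non-negative) eigenvalues.
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))
```

When the data are intrinsically two-dimensional, this recovers the pairwise distances exactly (up to rotation); for genuinely high-dimensional data such as document collections it degrades, which is the shortcoming the hierarchical approach in the paper addresses.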
Abstract:
BACKGROUND: Nurses and allied health care professionals (physiotherapists, occupational therapists, speech and language pathologists, dietitians) form more than half of the clinical health care workforce and play a central role in health service delivery. There is potential to improve the quality of health care if these professionals routinely use research evidence to guide their clinical practice. However, the use of research evidence remains unpredictable and inconsistent. Leadership is consistently described in implementation research as critical to enhancing research use by health care professionals. However, this important literature has not yet been synthesized, and there is a lack of clarity on what constitutes effective leadership for research use, or on what kinds of interventions effectively develop leadership for the purpose of enabling and enhancing research use in clinical practice. We propose to synthesize the evidence on leadership behaviours amongst front-line and senior managers that are associated with research evidence use by nurses and allied health care professionals, and then to determine the effectiveness of interventions that promote these behaviours. METHODS/DESIGN: Using an integrated knowledge translation approach that supports a partnership between researchers and knowledge users throughout the research process, we will follow principles of knowledge synthesis, using a systematic method to synthesize different types of evidence involving: searching the literature, study selection, data extraction and quality assessment, and analysis. A narrative synthesis will be conducted to explore relationships within and across studies, and a meta-analysis will be performed if sufficient homogeneity exists across studies employing experimental randomized controlled trial designs.
DISCUSSION: With the engagement of knowledge users in leadership and practice, we will synthesize research from a broad range of disciplines to understand the key elements of leadership that support and enable research use by health care practitioners, and how to develop leadership for the purpose of enhancing research use in clinical practice.
Abstract:
A system for weed management on railway embankments that is both adapted to the environment and efficient in terms of resources requires knowledge and understanding of the growing conditions of vegetation, so that methods to control its growth can be adapted accordingly. Automated records could complement present-day manual inspections and, over time, come to replace them. One challenge is to devise a method that yields a reasonable breakdown of the gathered information, one that can be managed rationally by the affected parties while serving as a basis for decisions with sufficient precision. The project examined two automated methods that may be useful for the Swedish Transport Administration in the future: 1) a machine vision method, which uses camera sensors to sense the environment in the visible and near-infrared spectrum; and 2) an N-Sensor method, which transmits light over an area and measures the light reflected by the chlorophyll in the plants. The amount of chlorophyll provides a value that can be correlated with the biomass. The choice of technique depends on how the information is to be used. If the purpose is to form a general picture of vegetation growth on railway embankments in order to plan maintenance measures, then the N-Sensor technique may be the right choice. If the plan is to form a general picture as well as to monitor and survey the current and exact vegetation status of the surface over time, so as to fight specific vegetation with the correct means, then the machine vision method is the better of the two. Both techniques involve registering data using GPS positioning. In the future, it will be possible to store this information in databases that are directly accessible to stakeholders online during, or in conjunction with, measures to deal with the vegetation. The two techniques were compared with manual (visual) estimations of the levels of vegetation growth.
The observers' (raters') visual estimations of weed coverage (%) differed statistically from person to person. In terms of estimating the frequency (number) of woody plants (trees and bushes) in the test areas, the observers were generally in agreement. The same person is often consistent in his or her estimation; it is the comparison with the estimations of others that can lead to misleading results. The system for using the information about vegetation growth requires development. The threshold for the amount of weeds that can be tolerated on different track types is an important component of such a system. The classification system must be capable of meeting the demands placed on it, so as to ensure the quality of the track, and of handling other pre-conditions such as traffic levels, conditions pertaining to track location, and the characteristics of the vegetation. The project recommends that the Swedish Transport Administration:
- discusses how threshold values for the growth of vegetation on railway embankments can be determined;
- carries out registration of vegetation growth over longer and a larger number of railway sections, using one or more of the methods studied in the project;
- introduces a system that effectively matches the information about vegetation to its position;
- includes information about vegetation growth in the records currently maintained of the track's technical quality, and links the data material to other maintenance-related databases;
- establishes a number of representative surfaces in which weed inventories (by measuring) are regularly conducted, as a means of developing an overview of the long-term development that can serve as a basis for more precise prognoses of vegetation growth;
- ensures that the necessary opportunities for education are put in place.
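The machine vision method senses the visible and near-infrared spectrum, in which chlorophyll-rich vegetation reflects strongly. A standard index built from those two bands is NDVI; the abstract does not name it, so the following is an illustrative sketch of how per-pixel vegetation cover could be estimated from such imagery, with assumed names and threshold.

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index per pixel.

    Healthy vegetation reflects strongly in near-infrared and absorbs
    red light, so NDVI rises with chlorophyll-rich biomass.
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)  # eps avoids division by zero

def weed_cover_fraction(nir, red, threshold=0.3):
    """Fraction of pixels whose NDVI exceeds a vegetation threshold.

    The threshold value is an illustrative assumption; in practice it
    would be calibrated against reference measurements per track type.
    """
    return float(np.mean(ndvi(nir, red) > threshold))
```

An estimate like `weed_cover_fraction`, georeferenced via GPS, is the kind of per-surface quantity that could be compared against the tolerated-weed thresholds the project recommends establishing.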
Abstract:
Among the most pervasive classes of services needed to support e-Science applications are those responsible for the discovery of resources. We have developed a solution to the problem of service discovery in a Semantic Web/Grid setting. We do this in the context of bioinformatics, the use of computational and mathematical techniques to store, manage, and analyse data from molecular biology in order to answer questions about biological phenomena. Our specific application is myGrid (http://www.mygrid.org.uk), which is developing open source, service-based middleware upon which bioinformatics applications can be built; myGrid is specifically targeted at developing open source, high-level service Grid middleware for bioinformatics.
Abstract:
We take a broad view that ultimately Grid- or Web-services must be located via personalised, semantic-rich discovery processes. We argue that such processes must rely on the storage of arbitrary metadata about services that originates from both service providers and service users. Examples of such metadata are reliability metrics, quality of service data, or semantic service description markup. This paper presents UDDI-MT, an extension to the standard UDDI service directory approach that supports the storage of such metadata via a tunnelling technique that ties the metadata store to the original UDDI directory. We also discuss the use of a rich, graph-based RDF query language for syntactic queries on this data. Finally, we analyse the performance of each of these contributions in our implementation.
Abstract:
The Grid is a large-scale computer system that is capable of coordinating resources that are not subject to centralised control, whilst using standard, open, general-purpose protocols and interfaces, and delivering non-trivial qualities of service. In this chapter, we argue that Grid applications very strongly suggest the use of agent-based computing, and we review key uses of agent technologies in Grids: user agents, able to customize and personalise data; agent communication languages offering a generic and portable communication medium; and negotiation allowing multiple distributed entities to reach service level agreements. In the second part of the chapter, we focus on Grid service discovery, which we have identified as a prime candidate for use of agent technologies: we show that Grid-services need to be located via personalised, semantic-rich discovery processes, which must rely on the storage of arbitrary metadata about services that originates from both service providers and service users. We present UDDI-MT, an extension to the standard UDDI service directory approach that supports the storage of such metadata via a tunnelling technique that ties the metadata store to the original UDDI directory. The outcome is a flexible service registry which is compatible with existing standards and also provides metadata-enhanced service discovery.
Abstract:
We define personalisation as the set of capabilities that enables a user or an organisation to customise their working environment to suit their specific needs, preferences and circumstances. In the context of service discovery on the Grid, the demand for personalisation comes from individual users, who want their preferences to be taken into account during the search and selection of suitable services. These preferences can express, for example, the reliability of a service, the quality of its results, its functionality, and so on. In this paper, we identify the problems related to personalising service discovery and present our solution: a personalised service registry, or View. We describe scenarios in which personalised service discovery would be useful and explain how our technology achieves them.
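The core idea of a View, i.e. a shared registry seen through one user's metadata preferences, can be sketched in a few lines. All class and field names below are illustrative assumptions; the actual View technology builds on UDDI and RDF metadata rather than in-memory Python objects.

```python
from dataclasses import dataclass

@dataclass
class Service:
    name: str
    metadata: dict  # arbitrary metadata from providers and users, e.g. QoS

@dataclass
class PersonalisedView:
    """A user-specific 'View' over a shared registry: the same services,
    but filtered and ranked by this user's metadata preferences."""
    registry: list
    min_reliability: float = 0.0

    def discover(self, keyword):
        # Keep only services matching the query and the user's threshold.
        hits = [s for s in self.registry
                if keyword in s.name
                and s.metadata.get("reliability", 0.0) >= self.min_reliability]
        # Rank so the most reliable services come first for this user.
        return sorted(hits,
                      key=lambda s: s.metadata.get("reliability", 0.0),
                      reverse=True)
```

Two users searching the same registry with different thresholds thus see different, personalised result lists, which is the behaviour the View concept provides over standard service discovery.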