903 results for Data-driven knowledge acquisition
Abstract:
The study and practice of knowledge management has grown rapidly since the 1990s, driven by social, economic, and technological trends. Tourism has been slow in adopting this approach due not only to a lack of gearing between researchers and tourism, but also to a 'hostile' knowledge adoption environment. Its acquisition would close the gap and also provide both insights and potential applications for tourism. Research in Australia supports the assertion that this field is a late adopter of knowledge management. In response, this paper provides a model for tourism. (c) 2005 Elsevier Ltd. All rights reserved.
Abstract:
Visualising data for exploratory analysis is a big challenge in scientific and engineering domains where there is a need to gain insight into the structure and distribution of the data. Typically, visualisation methods like principal component analysis and multi-dimensional scaling are used, but it is difficult to incorporate prior knowledge about structure of the data into the analysis. In this technical report we discuss a complementary approach based on an extension of a well known non-linear probabilistic model, the Generative Topographic Mapping. We show that by including prior information of the covariance structure into the model, we are able to improve both the data visualisation and the model fit.
Abstract:
Overlaying maps using a desktop GIS is often the first step of a multivariate spatial analysis. The potential of this operation has increased considerably as data sources and Web services to manipulate them are becoming widely available via the Internet. Standards from the OGC enable such geospatial ‘mashups’ to be seamless and user driven, involving discovery of thematic data. The user is naturally inclined to look for spatial clusters and ‘correlation’ of outcomes. Using classical cluster detection scan methods to identify multivariate associations can be problematic in this context, because of a lack of control on or knowledge about background populations. For public health and epidemiological mapping, this limiting factor can be critical but often the focus is on spatial identification of risk factors associated with health or clinical status. In this article we point out that this association itself can ensure some control on underlying populations, and develop an exploratory scan statistic framework for multivariate associations. Inference using statistical map methodologies can be used to test the clustered associations. The approach is illustrated with a hypothetical data example and an epidemiological study on community MRSA. Scenarios of potential use for online mashups are introduced but full implementation is left for further research.
Abstract:
Visualising data for exploratory analysis is a major challenge in many applications. Visualisation allows scientists to gain insight into the structure and distribution of the data, for example finding common patterns and relationships between samples as well as variables. Typically, visualisation methods like principal component analysis and multi-dimensional scaling are employed. These methods are favoured because of their simplicity, but they cannot cope with missing data and it is difficult to incorporate prior knowledge about properties of the variable space into the analysis; this is particularly important in the high-dimensional, sparse datasets typical in geochemistry. In this paper we show how to utilise a block-structured correlation matrix using a modification of a well known non-linear probabilistic visualisation model, the Generative Topographic Mapping (GTM), which can cope with missing data. The block structure supports direct modelling of strongly correlated variables. We show that by including prior structural information it is possible to improve both the data visualisation and the model fit. These benefits are demonstrated on artificial data as well as a real geochemical dataset used for oil exploration, where the proposed modifications improved the missing data imputation results by 3 to 13%.
Abstract:
Retrospective clinical data presents many challenges for data mining and machine learning. The transcription of patient records from paper charts and subsequent manipulation of data often results in high volumes of noise as well as a loss of other important information. In addition, such datasets often fail to represent expert medical knowledge and reasoning in any explicit manner. In this research we describe applying data mining methods to retrospective clinical data to build a prediction model for asthma exacerbation severity for pediatric patients in the emergency department. Difficulties in building such a model forced us to investigate alternative strategies for analyzing and processing retrospective data. This paper describes this process together with an approach to mining retrospective clinical data by incorporating formalized external expert knowledge (secondary knowledge sources) into the classification task. This knowledge is used to partition the data into a number of coherent sets, where each set is explicitly described in terms of the secondary knowledge source. Instances from each set are then classified in a manner appropriate for the characteristics of the particular set. We present our methodology and outline a set of experiential results that demonstrate some advantages and some limitations of our approach. © 2008 Springer-Verlag Berlin Heidelberg.
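The partition-then-classify idea described in this abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' method: the grouping rule `by_oxygen_saturation`, its threshold, the field names `spo2` and `severity`, and the toy records are all hypothetical and carry no clinical meaning, and a majority-label predictor stands in for whatever classifier would suit each set.

```python
from collections import Counter, defaultdict


def partition(records, assign_group):
    """Split records into named groups using an external knowledge rule."""
    groups = defaultdict(list)
    for rec in records:
        groups[assign_group(rec)].append(rec)
    return dict(groups)


def fit_majority(records, label_key):
    """Stand-in per-group model: the most common label within the group."""
    counts = Counter(rec[label_key] for rec in records)
    return counts.most_common(1)[0][0]


def fit_per_group(records, assign_group, label_key):
    """Fit one (stand-in) model per partition of the data."""
    return {
        group: fit_majority(members, label_key)
        for group, members in partition(records, assign_group).items()
    }


def predict(models, assign_group, record, default=None):
    """Route a record to its group's model; fall back if the group is unseen."""
    return models.get(assign_group(record), default)


def by_oxygen_saturation(rec):
    """Hypothetical rule standing in for a formalized secondary knowledge source."""
    return "low_sat" if rec["spo2"] < 92 else "normal_sat"


# Invented toy records, for illustration only.
toy = [
    {"spo2": 89, "severity": "severe"},
    {"spo2": 90, "severity": "severe"},
    {"spo2": 91, "severity": "moderate"},
    {"spo2": 96, "severity": "mild"},
    {"spo2": 97, "severity": "mild"},
    {"spo2": 95, "severity": "moderate"},
]

models = fit_per_group(toy, by_oxygen_saturation, "severity")
```

Each group is described explicitly by the rule that produced it, so a per-group model can be swapped for one better suited to that group's characteristics without touching the others.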
Abstract:
In a certain automobile factory, batch-painting of the body types in colours is controlled by an allocation system. This tries to balance production with orders, whilst making optimally-sized batches of colours. Sequences of cars entering painting cannot be optimised for easy selection of colour and batch size. 'Over-production' is not allowed, in order to reduce buffer stocks of unsold vehicles. Paint quality is degraded by random effects. This thesis describes a toolkit which supports IKBS in an object-centred formalism. The intended domain of use for the toolkit is flexible manufacturing. A sizeable application program was developed, using the toolkit, to test the validity of the IKBS approach in solving the real manufacturing problem above, for which an existing conventional program was already being used. A detailed statistical analysis of the operating circumstances of the program was made to evaluate the likely need for the more flexible type of program for which the toolkit was intended. The IKBS program captures the many disparate and conflicting constraints in the scheduling knowledge and emulates the behaviour of the program installed in the factory. In the factory system, many possible, newly-discovered heuristics would be awkward to represent and it would be impossible to make many new extensions. The representation scheme is capable of admitting changes to the knowledge, relying on the inherent encapsulating properties of object-centred programming to protect and isolate data. The object-centred scheme is supported by an enhancement of the 'C' programming language and runs under BSD 4.2 UNIX. The structuring technique, using objects, provides a mechanism for separating control of expression of rule-based knowledge from the knowledge itself and allowing explicit 'contexts', within which appropriate expression of knowledge can be done. Facilities are provided for acquisition of knowledge in a consistent manner.
Abstract:
The origin of hydrodynamic turbulence in rotating shear flows is investigated. The particular emphasis is on flows whose angular velocities decrease but specific angular momenta increase with increasing radial coordinate. Such flows are Rayleigh stable, but must be turbulent in order to explain observed data. Such a mismatch between the linear theory and observations/experiments is more severe when any hydromagnetic/magnetohydrodynamic instability and the corresponding turbulence therein is ruled out. The present work explores the effect of stochastic noise on such hydrodynamic flows. We focus on a small section of such a flow which is essentially a plane shear flow supplemented by the Coriolis effect. This also mimics a small section of an astrophysical accretion disk. It is found that such stochastically driven flows exhibit large temporal and spatial correlations of perturbation velocities, and hence large energy dissipations, that presumably generate instability. A range of angular velocity profiles (for the steady flow), from that of constant angular momentum to that of constant circular velocity, is explored. It is shown that the growth and roughness exponents calculated from the contour (envelope) of the perturbed flows are all identical, revealing a unique universality class for the stochastically forced hydrodynamics of rotating shear flows. This work, to the best of our knowledge, is the first attempt to understand the origin of instability and turbulence in three-dimensional Rayleigh stable rotating shear flows by introducing additive stochastic noise to the underlying linearized governing equations. This has important implications in resolving the turbulence problem in astrophysical hydrodynamic flows such as accretion disks.
Abstract:
It is generally believed that the structural reforms that were introduced in India following the macro-economic crisis of 1991 ushered in competition and forced companies to become more efficient. However, whether the post-1991 growth is an outcome of more efficient use of resources or greater use of factor inputs remains an open empirical question. In this paper, we use plant-level data from 1989–1990 and 2000–2001 to address this question. Our results indicate that while there was an increase in the productivity of factor inputs during the 1990s, most of the growth in value added is explained by growth in the use of factor inputs. We also find that median technical efficiency declined in all but one of the industries between 1989–1990 and 2000–2001, and that change in technical efficiency explains a very small proportion of the change in gross value added.
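The distinction this abstract draws between productivity growth and input growth can be illustrated with a standard growth-accounting identity. This is a sketch only: the abstract does not state the authors' estimation method, and the Cobb-Douglas form with constant returns, along with the symbols $Y$, $A$, $K$, $L$ and $\alpha$, are assumptions introduced here.

```latex
\[
  \Delta \ln Y \;=\; \Delta \ln A \;+\; \alpha \,\Delta \ln K \;+\; (1-\alpha)\,\Delta \ln L
\]
```

Here $Y$ is value added, $K$ and $L$ are capital and labour inputs, $\alpha$ is the capital share, and $\Delta \ln A$ is the residual productivity term. Under these assumptions, the reported finding corresponds to the two input terms accounting for most of $\Delta \ln Y$, with $\Delta \ln A$ contributing comparatively little.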
Abstract:
The management and sharing of complex data, information and knowledge is a fundamental and growing concern in the Water and other Industries for a variety of reasons. For example, risks and uncertainties associated with climate, and other changes require knowledge to prepare for a range of future scenarios and potential extreme events. Formal ways in which knowledge can be established and managed can help deliver efficiencies on acquisition, structuring and filtering to provide only the essential aspects of the knowledge really needed. Ontologies are a key technology for this knowledge management. The construction of ontologies is a considerable overhead on any knowledge management programme. Hence current computer science research is investigating generating ontologies automatically from documents using text mining and natural language techniques. As an example of this, results from application of the Text2Onto tool to stakeholder documents for a project on sustainable water cycle management in new developments are presented. It is concluded that by adopting ontological representations sooner, rather than later in an analytical process, decision makers will be able to make better use of highly knowledgeable systems containing automated services to ensure that sustainability considerations are included.