14 results for data-driven Stochastic Subspace Identification (SSI-data)
in Aston University Research Archive
Abstract:
The evaluation of ontologies is vital for the growth of the Semantic Web. We consider a number of problems in evaluating a knowledge artifact like an ontology. We propose in this paper that one approach to ontology evaluation should be corpus or data driven. A corpus is the most accessible form of knowledge and its use allows a measure to be derived of the ‘fit’ between an ontology and a domain of knowledge. We consider a number of methods for measuring this ‘fit’ and propose a measure to evaluate structural fit, and a probabilistic approach to identifying the best ontology.
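The measure itself is not given in the abstract. As a purely illustrative sketch (not the authors' measure), one simple lexical notion of corpus fit can be written in a few lines of Python:

```python
from collections import Counter

def corpus_fit(ontology_labels, corpus_tokens):
    """Toy corpus-driven 'fit': the fraction of corpus token mass whose
    terms also appear as concept labels in the ontology."""
    counts = Counter(tok.lower() for tok in corpus_tokens)
    labels = {lab.lower() for lab in ontology_labels}
    covered = sum(n for term, n in counts.items() if term in labels)
    total = sum(counts.values())
    return covered / total if total else 0.0

# Choosing the best ontology then amounts to maximising such a score
# over the candidate set:
# best = max(candidates, key=lambda o: corpus_fit(o, tokens))
```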
Abstract:
A word may have many potential meanings, but its actual meaning in any authentic written or spoken text is determined by its context: its collocations, structural patterns, and pragmatic functions. Large language corpora offer access to words in a wide range of natural contexts, which can improve and enrich both language learning and teaching.
Abstract:
The inclusion of high-level scripting functionality in state-of-the-art rendering APIs indicates a movement toward data-driven methodologies for structuring next generation rendering pipelines. A similar theme can be seen in the use of composition languages to deploy component software using selection and configuration of collaborating component implementations. In this paper we introduce the Fluid framework, which places particular emphasis on the use of high-level data manipulations in order to develop component based software that is flexible, extensible, and expressive. We introduce a data-driven, object oriented programming methodology to component based software development, and demonstrate how a rendering system with a similar focus on abstract manipulations can be incorporated, in order to develop a visualization application for geospatial data. In particular we describe a novel SAS script integration layer that provides access to vertex and fragment programs, producing a very controllable, responsive rendering system. The proposed system is very similar to developments speculatively planned for DirectX 10, but uses open standards and has cross platform applicability. © The Eurographics Association 2007.
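The Fluid API is not shown in the abstract; the hypothetical Python sketch below (all names invented) illustrates only the general pattern of data-driven composition, in which a declarative description selects and configures collaborating component implementations:

```python
REGISTRY = {}

def component(name):
    """Register a component implementation under a public name."""
    def decorate(cls):
        REGISTRY[name] = cls
        return cls
    return decorate

@component("renderer.terrain")
class TerrainRenderer:
    def __init__(self, level_of_detail):
        self.level_of_detail = level_of_detail

def assemble(description):
    """Instantiate a component from plain data (e.g. parsed XML/JSON),
    rather than from hard-coded construction logic."""
    return REGISTRY[description["type"]](**description["args"])

app = assemble({"type": "renderer.terrain", "args": {"level_of_detail": 3}})
```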
Abstract:
Most current 3D landscape visualisation systems either use bespoke hardware solutions, or offer a limited amount of interaction and detail when used in realtime mode. We are developing a modular, data driven 3D visualisation system that can be readily customised to specific requirements. By utilising the latest software engineering methods and bringing a dynamic data driven approach to geo-spatial data visualisation we will deliver an unparalleled level of customisation in near-photo realistic, realtime 3D landscape visualisation. In this paper we show the system framework and describe how this employs data driven techniques. In particular we discuss how data driven approaches are applied to the spatiotemporal management aspect of the application framework, and describe the advantages these convey.
Abstract:
Overlaying maps using a desktop GIS is often the first step of a multivariate spatial analysis. The potential of this operation has increased considerably as data sources and Web services to manipulate them are becoming widely available via the Internet. Standards from the OGC enable such geospatial ‘mashups’ to be seamless and user driven, involving discovery of thematic data. The user is naturally inclined to look for spatial clusters and ‘correlation’ of outcomes. Using classical cluster detection scan methods to identify multivariate associations can be problematic in this context, because of a lack of control on or knowledge about background populations. For public health and epidemiological mapping, this limiting factor can be critical, but often the focus is on spatial identification of risk factors associated with health or clinical status. In this article we point out that this association itself can ensure some control on underlying populations, and develop an exploratory scan statistic framework for multivariate associations. Inference using statistical map methodologies can be used to test the clustered associations. The approach is illustrated with a hypothetical data example and an epidemiological study on community MRSA. Scenarios of potential use for online mashups are introduced, but full implementation is left for further research.
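The abstract does not reproduce the statistic itself. For orientation only, a classical spatial scan statistic of the kind such frameworks generalise (Kulldorff's Bernoulli model, not necessarily the authors' exact formulation) evaluates, for each candidate zone $Z$ containing $c$ cases among $n$ units (with $C$ cases among $N$ units overall), the likelihood ratio

\[
\lambda(Z) \;=\; \frac{\left(\frac{c}{n}\right)^{c}\left(\frac{n-c}{n}\right)^{n-c}\left(\frac{C-c}{N-n}\right)^{C-c}\left(\frac{(N-n)-(C-c)}{N-n}\right)^{(N-n)-(C-c)}}{\left(\frac{C}{N}\right)^{C}\left(\frac{N-C}{N}\right)^{N-C}},
\]

reporting the zone that maximises $\lambda(Z)$ subject to $\frac{c}{n} > \frac{C-c}{N-n}$, with significance assessed by Monte Carlo permutation of the case labels.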
Abstract:
Software development methodologies are becoming increasingly abstract, progressing from low level assembly and implementation languages such as C and Ada, to component based approaches that can be used to assemble applications using technologies such as JavaBeans and the .NET framework. Meanwhile, model driven approaches emphasise the role of higher level models and notations, and embody a process of automatically deriving lower level representations and concrete software implementations.

The relationship between data and software is also evolving. Modern data formats are becoming increasingly standardised, open and empowered in order to support a growing need to share data in both academia and industry. Many contemporary data formats, most notably those based on XML, are self-describing, able to specify valid data structure and content, and can also describe data manipulations and transformations. Furthermore, while applications of the past have made extensive use of data, the runtime behaviour of future applications may be driven by data, as demonstrated by the field of dynamic data driven application systems. The combination of empowered data formats and high level software development methodologies forms the basis of modern game development technologies, which drive software capabilities and runtime behaviour using empowered data formats describing game content. While low level libraries provide optimised runtime execution, content data is used to drive a wide variety of interactive and immersive experiences.

This thesis describes the Fluid project, which combines component based software development and game development technologies in order to define novel component technologies for the description of data driven component based applications. The thesis makes explicit contributions to the fields of component based software development and visualisation of spatiotemporal scenes, and also describes potential implications for game development technologies. The thesis also proposes a number of developments in dynamic data driven application systems in order to further empower the role of data in this field.
Abstract:
We address the important bioinformatics problem of predicting protein function from a protein's primary sequence. We consider the functional classification of G-Protein-Coupled Receptors (GPCRs), whose functions are specified in a class hierarchy. We tackle this task using a novel top-down hierarchical classification system where, for each node in the class hierarchy, the predictor attributes to be used in that node and the classifier to be applied to the selected attributes are chosen in a data-driven manner. Compared with a previous hierarchical classification system selecting classifiers only, our new system significantly reduced processing time without significantly sacrificing predictive accuracy.
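The abstract describes the per-node selection only at a high level. A minimal sketch of the idea, using scikit-learn as a stand-in (the paper's own attribute-selection and classifier-selection procedures and candidate sets are not specified here):

```python
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

def fit_node(X, y):
    """For one node of the class hierarchy, choose predictor attributes
    and a classifier in a data-driven way (here: univariate feature
    selection followed by simple cross-validated model selection)."""
    selector = SelectKBest(f_classif, k=min(20, X.shape[1])).fit(X, y)
    Xs = selector.transform(X)
    candidates = [GaussianNB(), DecisionTreeClassifier(max_depth=5)]
    best = max(candidates,
               key=lambda m: cross_val_score(m, Xs, y, cv=3).mean())
    return selector, best.fit(Xs, y)
```

Applied top-down, each node's selector/classifier pair routes a sequence to one child of that node, descending until a leaf of the GPCR class hierarchy is reached.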
Abstract:
It is generally believed that the structural reforms that were introduced in India following the macro-economic crisis of 1991 ushered in competition and forced companies to become more efficient. However, whether the post-1991 growth is an outcome of more efficient use of resources or greater use of factor inputs remains an open empirical question. In this paper, we use plant-level data from 1989–1990 and 2000–2001 to address this question. Our results indicate that while there was an increase in the productivity of factor inputs during the 1990s, most of the growth in value added is explained by growth in the use of factor inputs. We also find that median technical efficiency declined in all but one of the industries between 1989–1990 and 2000–2001, and that change in technical efficiency explains a very small proportion of the change in gross value added.
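For orientation, the question the paper addresses can be framed with the standard growth-accounting decomposition (a generic textbook form, not necessarily the authors' estimating equation):

\[
\Delta \ln \mathit{VA} \;=\; \alpha\,\Delta \ln K \;+\; \beta\,\Delta \ln L \;+\; \Delta \ln \mathit{TFP},
\]

where $\mathit{VA}$ is gross value added, $K$ and $L$ are capital and labour inputs, and $\alpha$, $\beta$ are their output elasticities. The finding reported above is that the input terms dominate: the $\mathit{TFP}$ residual, and the technical-efficiency change within it, explains little of the growth in value added.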
Abstract:
One of the greatest concerns related to the popularity of GPS-enabled devices and applications is the increasing availability of the personal location information generated by them and shared with application and service providers. Moreover, people tend to have regular routines and be characterized by a set of “significant places”, thus making it possible to identify a user from his/her mobility data. In this paper we present a series of techniques for identifying individuals from their GPS movements. More specifically, we study the uniqueness of GPS information for three popular datasets, and we provide a detailed analysis of the discriminatory power of speed, direction and distance of travel. Most importantly, we present a simple yet effective technique for the identification of users from location information that is not included in the original dataset used for training, thus raising important privacy concerns for the management of location datasets.
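The paper's own feature definitions are not given in the abstract; the following minimal Python sketch shows how the three quantities analysed (distance, speed and direction of travel between consecutive GPS fixes) are conventionally derived, using the haversine distance and initial-bearing formulas:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in km."""
    R = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * R * math.asin(math.sqrt(a))

def trip_features(fixes):
    """fixes: list of (timestamp_s, lat, lon) tuples in time order.
    Yields (distance_km, speed_kmh, bearing_deg) per consecutive pair."""
    for (t0, la0, lo0), (t1, la1, lo1) in zip(fixes, fixes[1:]):
        d = haversine_km(la0, lo0, la1, lo1)
        dt_h = max(t1 - t0, 1) / 3600.0  # guard against zero intervals
        bearing = math.degrees(math.atan2(
            math.sin(math.radians(lo1 - lo0)) * math.cos(math.radians(la1)),
            math.cos(math.radians(la0)) * math.sin(math.radians(la1))
            - math.sin(math.radians(la0)) * math.cos(math.radians(la1))
              * math.cos(math.radians(lo1 - lo0))))
        yield d, d / dt_h, bearing % 360
```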
Abstract:
The breadth and depth of available clinico-genomic information present an enormous opportunity for improving our ability to study disease mechanisms and meet the needs of individualised medicine. A difficulty occurs when the results are to be transferred 'from bench to bedside'. Diversity of methods is one of the causes, but the most critical one relates to our inability to share and jointly exploit data and tools. This paper presents a perspective on the current state of the art in the analysis of clinico-genomic data and its relevance to medical decision support. It is an attempt to investigate the issues related to data and knowledge integration. Copyright © 2010 Inderscience Enterprises Ltd.
Abstract:
Purpose: This paper extends the use of Radio Frequency Identification (RFID) data to the accounting of warehouse costs and services. The Time Driven Activity Based Costing (TDABC) methodology is enhanced with RFID data collected in real time about the duration of warehouse activities, allowing warehouse managers to obtain accurate and instant cost calculations. The RFID-enhanced TDABC (RFID-TDABC) is proposed as a novel application of RFID technology.
Research Approach: RFID-TDABC is implemented on the warehouse processes of a case study company, covering receiving, put-away, order picking, and despatching.
Findings and Originality: RFID technology is commonly used for the identification and tracking of items. The use of RFID-generated information with TDABC can be successfully extended to the area of costing. This RFID-TDABC costing model will benefit warehouse managers with accurate and instant calculations of costs.
Research Impact: There are still unexplored benefits to RFID technology in its applications in warehousing and the wider supply chain. A multi-disciplinary research approach led to combining RFID technology and the TDABC accounting method in order to propose RFID-TDABC. Combining methods and theories from different fields with RFID may lead researchers to develop new techniques such as the RFID-TDABC presented in this paper.
Practical Impact: The RFID-TDABC concept will be of value to practitioners by showing how warehouse costs can be accurately measured using this approach. A better understanding of incurred costs may result in further optimisation of warehousing operations, lowering the costs of activities, and thus providing competitive pricing to customers. RFID-TDABC can be applied in the wider supply chain.
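The TDABC arithmetic that the RFID timings feed is straightforward: a capacity cost rate (cost of capacity supplied divided by practical capacity, in time units) is multiplied by the observed duration of each activity. A toy sketch in Python, with invented figures (the case-study numbers are not given in the abstract):

```python
# TDABC with RFID-observed durations (illustrative numbers only).
capacity_cost = 120_000.0         # cost of warehouse capacity per period, GBP
practical_capacity_min = 240_000  # practical capacity, staff-minutes per period
rate_per_min = capacity_cost / practical_capacity_min  # GBP 0.50 per minute

# Durations for one order, as captured by RFID reads at each process step:
durations_min = {"receiving": 4.2, "put-away": 3.1,
                 "order picking": 6.8, "despatching": 2.4}
order_cost = sum(t * rate_per_min for t in durations_min.values())
print(f"Cost of this order: GBP {order_cost:.2f}")  # GBP 8.25
```

With RFID reads timestamping each process step, the duration values are measured rather than estimated, which is the accuracy gain RFID-TDABC claims over survey-based TDABC time estimates.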