37 results for "Data selection" in Aston University Research Archive


Relevance: 70.00%

Abstract:

Indicators which summarise the characteristics of spatiotemporal data coverages significantly simplify quality evaluation, decision making and justification processes by providing a number of quality cues that are easy to manage and avoiding information overflow. Criteria which are commonly prioritised in evaluating spatial data quality and assessing a dataset’s fitness for use include lineage, completeness, logical consistency, positional accuracy, temporal and attribute accuracy. However, user requirements may go far beyond these broadly accepted spatial quality metrics, to incorporate specific and complex factors which are less easily measured. This paper discusses the results of a study of high level user requirements in geospatial data selection and data quality evaluation. It reports on the geospatial data quality indicators which were identified as user priorities, and which can potentially be standardised to enable intercomparison of datasets against user requirements. We briefly describe the implications for tools and standards to support the communication and intercomparison of data quality, and the ways in which these can contribute to the generation of a GEO label.

Relevance: 40.00%

Abstract:

Data visualization algorithms and feature selection techniques are both widely used in bioinformatics but as distinct analytical approaches. Until now there has been no method of measuring feature saliency while training a data visualization model. We derive a generative topographic mapping (GTM) based data visualization approach which estimates feature saliency simultaneously with the training of the visualization model. The approach not only provides a better projection by modeling irrelevant features with a separate noise model but also gives feature saliency values which help the user to assess the significance of each feature. We compare the quality of projection obtained using the new approach with the projections from traditional GTM and self-organizing maps (SOM) algorithms. The results obtained on a synthetic and a real-life chemoinformatics dataset demonstrate that the proposed approach successfully identifies feature significance and provides coherent (compact) projections. © 2006 IEEE.

Relevance: 40.00%

Abstract:

We address the important bioinformatics problem of predicting protein function from a protein's primary sequence. We consider the functional classification of G-Protein-Coupled Receptors (GPCRs), whose functions are specified in a class hierarchy. We tackle this task using a novel top-down hierarchical classification system where, for each node in the class hierarchy, the predictor attributes to be used in that node and the classifier to be applied to the selected attributes are chosen in a data-driven manner. Compared with a previous hierarchical classification system selecting classifiers only, our new system significantly reduced processing time without significantly sacrificing predictive accuracy.

Relevance: 40.00%

Abstract:

The main objective of the project is to enhance the already effective health-monitoring system (HUMS) for helicopters by analysing structural vibrations to recognise different flight conditions directly from sensor information. The goal of this paper is to develop a new method to select those sensors and frequency bands that are best for detecting changes in flight conditions. We projected frequency information to a 2-dimensional space in order to visualise flight-condition transitions using the Generative Topographic Mapping (GTM) and a variant which supports simultaneous feature selection. We created an objective measure of the separation between different flight conditions in the visualisation space by calculating the Kullback-Leibler (KL) divergence between Gaussian mixture models (GMMs) fitted to each class: the higher the KL-divergence, the better the interclass separation. To find the optimal combination of sensors, they were considered in pairs, triples and groups of four sensors. The sensor triples provided the best result in terms of KL-divergence. We also found that the use of a variational training algorithm for the GMMs gave more reliable results.
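The separation measure described in this abstract can be illustrated with a minimal sketch. It assumes one-dimensional Gaussian mixtures for brevity (the abstract works in a 2-dimensional visualisation space) and estimates the KL divergence by Monte Carlo sampling, since no closed form exists for mixtures. The function names, mixture parameters and sample size are hypothetical and are not taken from the paper.

```python
import math
import random


def gmm_pdf(x, weights, means, stds):
    """Density of a 1-D Gaussian mixture evaluated at x."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in zip(weights, means, stds)
    )


def gmm_sample(rng, weights, means, stds):
    """Draw one sample: pick a component by weight, then sample from it."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[k], stds[k])


def kl_monte_carlo(p, q, n_samples=20000, seed=0):
    """Estimate KL(p || q) as the mean of log(p(x) / q(x)) over samples x ~ p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = gmm_sample(rng, *p)
        total += math.log(gmm_pdf(x, *p) / gmm_pdf(x, *q))
    return total / n_samples


# Hypothetical mixtures standing in for three flight-condition classes,
# each given as (weights, means, standard deviations).
class_a = ([0.5, 0.5], [0.0, 1.0], [1.0, 1.0])
class_near = ([0.5, 0.5], [0.5, 1.5], [1.0, 1.0])
class_far = ([0.5, 0.5], [4.0, 5.0], [1.0, 1.0])

kl_near = kl_monte_carlo(class_a, class_near)
kl_far = kl_monte_carlo(class_a, class_far)
```

Under these assumptions, a candidate sensor set whose classes look like `class_a` and `class_far` would score higher than one whose classes look like `class_a` and `class_near`, which matches the rule that a higher KL divergence means better interclass separation.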

Relevance: 40.00%

Abstract:

This paper suggests a data envelopment analysis (DEA) model for selecting the most efficient alternative in advanced manufacturing technology in the presence of both cardinal and ordinal data. The paper explains the problem of using an iterative method for finding the most efficient alternative and proposes a new DEA model without the need of solving a series of LPs. A numerical example illustrates the model, and an application in technology selection with multi-inputs/multi-outputs shows the usefulness of the proposed approach. © 2012 Springer-Verlag London Limited.

Relevance: 40.00%

Abstract:

Developers of interactive software are confronted by an increasing variety of software tools to help engineer the interactive aspects of software applications. Not only do these tools fall into different categories in terms of functionality, but within each category there is a growing number of competing tools with similar, although not identical, features. Choice of user interface development tool (UIDT) is therefore becoming increasingly complex.

Relevance: 30.00%

Abstract:

A practical Bayesian approach for inference in neural network models has been available for ten years, and yet it is not used frequently in medical applications. In this chapter we show how both regularisation and feature selection can bring significant benefits in diagnostic tasks through two case studies: heart arrhythmia classification based on ECG data and the prognosis of lupus. In the first of these, the number of variables was reduced by two thirds without significantly affecting performance, while in the second, only the Bayesian models had an acceptable accuracy. In both tasks, neural networks outperformed other pattern recognition approaches.

Relevance: 30.00%

Abstract:

The survival of organisations, especially SMEs, depends, to the greatest extent, on those who supply them with the required material input. This is because if the supplier fails to deliver the right materials at the right time and place, and at the right price, then the recipient organisation is bound to fail in its obligations to satisfy the needs of its customers, and to stay in business. Hence, the task of choosing a supplier(s) from a list of vendors, that an organisation will trust with its very existence, is not an easy one. This project investigated how purchasing personnel in organisations solve the problem of vendor selection. The investigation went further to ascertain whether an Expert Systems model could be developed and used as a plausible solution to the problem. An extensive literature review indicated that very little research has been conducted in the area of Expert Systems for Vendor Selection, whereas many research theories in expert systems and in purchasing and supply chain management, respectively, had been reported. A survey questionnaire was designed and circulated to people in industry who actually perform the vendor selection tasks. Analysis of the collected data confirmed the various factors which are considered during the selection process, and established the order in which those factors are ranked. Five of the factors, namely Production Methods Used, Vendors Financial Background, Manufacturing Capacity, Size of Vendor Organisations, and Suppliers Position in the Industry, appeared to have similar patterns in the way organisations ranked them. These patterns suggested that the bigger the organisation, the more important it considered the above factors. Further investigations revealed that respondents agreed that the most important factors were Product Quality, Product Price and Delivery Date. The most apparent pattern was observed for the Vendors Financial Background.
This generated curiosity which led to the design and development of a prototype expert system for assessing the financial profile of a potential supplier(s). This prototype was called ESfNS. It determines whether a prospective supplier(s) has a good financial background or not. ESNS was tested by the potential users, who then confirmed that expert systems have great prospects and commercial viability in the domain for solving vendor selection problems.

Relevance: 30.00%

Abstract:

Construction projects are risky. However, the characteristics of the risk highly depend on the type of procurement being adopted for managing the project. A build-operate-transfer (BOT) project is recognized as one of the most risky project schemes. There are instances of project failure where a BOT scheme was employed. Ineffective rts are increasingly being managed using various risk management tools and techniques. However, application of those tools depends on the nature of the project, organization's policy, project management strategy, risk attitude of the project team members, and availability of the resources. Understanding of the contents and contexts of BOT projects, together with a thorough understanding of risk management tools and techniques, helps select processes of risk management for effective project implementation in a BOT scheme. This paper studies application of risk management tools and techniques in BOT projects through a review of the relevant literature and develops a model for selecting risk management processes for BOT projects. The application to BOT projects is considered from the viewpoints of the major project participants. Political risks are also discussed. This study would contribute to the establishment of a framework for systematic risk management in BOT projects.

Relevance: 30.00%

Abstract:

This paper introduces a compact form for the maximum value of the non-Archimedean epsilon in Data Envelopment Analysis (DEA) models applied to technology selection, without the need to solve a linear program (LP). Using this method, the computational performance of the common weight multi-criteria decision-making (MCDM) DEA model proposed by Karsak and Ahiska (International Journal of Production Research, 2005, 43(8), 1537-1554) is improved. This improvement is significant when computational issues and complexity analysis are a concern.

Relevance: 30.00%

Abstract:

This paper extends previous analyses of the choice between internal and external R&D to consider the costs of internal R&D. The Heckman two-stage estimator is used to estimate the determinants of internal R&D unit cost (i.e. cost per product innovation) allowing for sample selection effects. Theory indicates that R&D unit cost will be influenced by scale issues and by the technological opportunities faced by the firm. Transaction costs encountered in research activities are allowed for and, in addition, consideration is given to issues of market structure which influence the choice of R&D mode without affecting the unit cost of internal or external R&D. The model is tested on data from a sample of over 500 UK manufacturing plants which have engaged in product innovation. The key determinants of R&D mode are the scale of plant and R&D input, and market structure conditions. In terms of the R&D cost equation, scale factors are again important and have a non-linear relationship with R&D unit cost. Specificities in physical and human capital also affect unit cost, but have no clear impact on the choice of R&D mode. There is no evidence of technological opportunity affecting either R&D cost or the internal/external decision.

Relevance: 30.00%

Abstract:

The inclusion of high-level scripting functionality in state-of-the-art rendering APIs indicates a movement toward data-driven methodologies for structuring next generation rendering pipelines. A similar theme can be seen in the use of composition languages to deploy component software using selection and configuration of collaborating component implementations. In this paper we introduce the Fluid framework, which places particular emphasis on the use of high-level data manipulations in order to develop component based software that is flexible, extensible, and expressive. We introduce a data-driven, object oriented programming methodology to component based software development, and demonstrate how a rendering system with a similar focus on abstract manipulations can be incorporated, in order to develop a visualization application for geospatial data. In particular we describe a novel SAS script integration layer that provides access to vertex and fragment programs, producing a very controllable, responsive rendering system. The proposed system is very similar to developments speculatively planned for DirectX 10, but uses open standards and has cross platform applicability. © The Eurographics Association 2007.

Relevance: 30.00%

Abstract:

Digital image processing is exploited in many diverse applications but the size of digital images places excessive demands on current storage and transmission technology. Image data compression is required to permit further use of digital image processing. Conventional image compression techniques based on statistical analysis have reached a saturation level so it is necessary to explore more radical methods. This thesis is concerned with novel methods, based on the use of fractals, for achieving significant compression of image data within reasonable processing time without introducing excessive distortion. Images are modelled as fractal data and this model is exploited directly by compression schemes. The validity of this is demonstrated by showing that the fractal complexity measure of fractal dimension is an excellent predictor of image compressibility. A method of fractal waveform coding is developed which has low computational demands and performs better than conventional waveform coding methods such as PCM and DPCM. Fractal techniques based on the use of space-filling curves are developed as a mechanism for hierarchical application of conventional techniques. Two particular applications are highlighted: the re-ordering of data during image scanning and the mapping of multi-dimensional data to one dimension. It is shown that there are many possible space-filling curves which may be used to scan images and that selection of an optimum curve leads to significantly improved data compression. The multi-dimensional mapping property of space-filling curves is used to speed up substantially the lookup process in vector quantisation. Iterated function systems are compared with vector quantisers and the computational complexity of iterated function system encoding is also reduced by using the efficient matching algorithms identified for vector quantisers.
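The idea of scanning an image along a space-filling curve, and of mapping two-dimensional coordinates to one dimension, can be sketched as follows. The Hilbert curve is used here only as one well-known example; the abstract does not say which curves the thesis considers or which is optimal. The function names are hypothetical, and the image side length `n` is assumed to be a power of two.

```python
def _rotate(n, x, y, rx, ry):
    """Rotate or flip a quadrant so the curve segments join up correctly."""
    if ry == 0:
        if rx == 1:
            x = n - 1 - x
            y = n - 1 - y
        x, y = y, x
    return x, y


def hilbert_index(n, x, y):
    """Map cell (x, y) of an n x n grid to its position along the Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rotate(n, x, y, rx, ry)
        s //= 2
    return d


def hilbert_point(n, d):
    """Inverse mapping: position d along the curve back to cell (x, y)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        x, y = _rotate(s, x, y, rx, ry)
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_scan(image):
    """Read a square image (list of rows) in Hilbert-curve order."""
    n = len(image)
    scanned = []
    for d in range(n * n):
        x, y = hilbert_point(n, d)
        scanned.append(image[y][x])
    return scanned


# A small 4 x 4 test image whose pixel value encodes its raster position.
image = [[4 * row + col for col in range(4)] for row in range(4)]
scanned = hilbert_scan(image)
```

Consecutive samples in a Hilbert scan always come from neighbouring pixels, whereas a raster scan jumps across the image at the end of every row. This locality is one plausible reason a curve-based scan can help a one-dimensional coder, although the sketch does not measure compression itself.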

Relevance: 30.00%

Abstract:

The use of digital communication systems is increasing very rapidly. This is due to lower system implementation cost compared to analogue transmission and, at the same time, the ease with which several types of data sources (data, digitised speech and video, etc.) can be mixed. The emergence of packet broadcast techniques as an efficient type of multiplexing, especially with the use of contention random multiple access protocols, has led to a widespread application of these distributed access protocols in local area networks (LANs) and a further extension of them to radio and mobile radio communication applications. In this research, a modified version of the distributed access contention protocol which uses the packet broadcast switching technique has been proposed. The carrier sense multiple access with collision avoidance (CSMA/CA) protocol is found to be the most appropriate protocol, with the ability to satisfy equally the operational requirements for local area networks as well as for radio and mobile radio applications. The suggested version of the protocol is designed in such a way that all desirable features of its precedents are maintained. However, all the shortcomings are eliminated and additional features have been added to strengthen its ability to work with radio and mobile radio channels. Operational performance evaluation of the protocol has been carried out for the two types of non-persistent and slotted non-persistent, through mathematical and simulation modelling of the protocol. The results obtained from the two modelling procedures validate the accuracy of both methods, which compares favourably with its precedent protocol CSMA/CD (with collision detection). A further extension of the protocol operation has been suggested to operate with multichannel systems. Two multichannel systems based on the CSMA/CA protocol for medium access are therefore proposed.
These are the dynamic multichannel system, which is based on two types of channel selection, random choice (RC) and idle choice (IC), and the sequential multichannel system. The latter has been proposed in order to suppress the effect of the hidden terminal, which always represents a major problem with the usage of contention random multiple access protocols with radio and mobile radio channels. Performance evaluation has been carried out using mathematical modelling for the dynamic system, whereas simulation modelling has been chosen for the sequential system. Both systems are found to improve system operation and fault tolerance when compared to single channel operation.
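The two channel-selection rules named in this abstract can be illustrated with a rough sketch. The abstract does not define RC and IC precisely, so the behaviour below is an assumption: RC picks any channel at random and then senses it, while IC senses all channels first and picks at random among the idle ones. All names and the deferral behaviour are hypothetical.

```python
import random


def random_choice(channel_busy, rng):
    """RC (assumed): pick any channel uniformly, sense it, defer if it is busy."""
    ch = rng.randrange(len(channel_busy))
    return None if channel_busy[ch] else ch


def idle_choice(channel_busy, rng):
    """IC (assumed): pick uniformly among channels sensed idle, defer if none."""
    idle = [i for i, busy in enumerate(channel_busy) if not busy]
    return rng.choice(idle) if idle else None


def access_rate(select, channel_busy, attempts=1000, seed=0):
    """Fraction of attempts in which the station obtains a channel."""
    rng = random.Random(seed)
    granted = sum(select(channel_busy, rng) is not None for _ in range(attempts))
    return granted / attempts


# Hypothetical snapshot: four channels, only channel 2 is idle.
snapshot = [True, True, False, True]
rc_rate = access_rate(random_choice, snapshot)
ic_rate = access_rate(idle_choice, snapshot)
```

Under these assumptions IC obtains a channel whenever at least one is idle, while RC succeeds only when its random pick happens to be idle, at the cost of IC having to sense every channel before transmitting.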

Relevance: 30.00%

Abstract:

Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point process methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are second-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.
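A co-occurrence-based entropy of the kind described in this abstract can be sketched at order two. The sketch assumes that a co-occurrence is a pair of marked points lying within a chosen distance of each other, and it uses plain Shannon entropy of the resulting mark-pair frequencies. The paper's own measure may be normalised or defined differently, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(points, radius):
    """Count unordered mark pairs among points closer than `radius`.

    Each point is a tuple (x, y, mark).
    """
    counts = Counter()
    for (x1, y1, m1), (x2, y2, m2) in combinations(points, 2):
        if math.hypot(x1 - x2, y1 - y2) <= radius:
            counts[tuple(sorted((m1, m2)))] += 1
    return counts


def shannon_entropy(counts):
    """Shannon entropy (in nats) of the empirical distribution in `counts`."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Toy patterns: marks 'a' and 'b' either kept apart or mixed together.
segregated = [(0, 0, "a"), (0, 1, "a"), (10, 0, "b"), (10, 1, "b")]
mixed = [(0, 0, "a"), (0, 1, "b"), (1, 0, "a"), (1, 1, "b")]

h_segregated = shannon_entropy(cooccurrence_counts(segregated, radius=2))
h_mixed = shannon_entropy(cooccurrence_counts(mixed, radius=2))
```

In this toy case the mixed pattern produces three kinds of mark pairs and therefore a higher entropy than the segregated pattern, which produces only same-mark pairs. Higher orders would count triples or larger groups of nearby points in the same way.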