15 results for Modelling with data
in Aston University Research Archive
Abstract:
It is generally assumed when using Bayesian inference methods for neural networks that the input data contains no noise. For real-world (errors-in-variables) problems this is clearly an unsafe assumption. This paper presents a Bayesian neural network framework which accounts for input noise provided that a model of the noise process exists. In the limit where the noise process is small and symmetric it is shown, using the Laplace approximation, that this method adds an extra term to the usual Bayesian error bar which depends on the variance of the input noise process. Further, by treating the true (noiseless) input as a hidden variable, and sampling this jointly with the network’s weights using a Markov chain Monte Carlo method, it is demonstrated that it is possible to infer the regression over the noiseless input. This leads to the possibility of training an accurate model of a system using less accurate, or more uncertain, data. This is demonstrated on both the synthetic noisy sine wave problem and a real problem of inferring the forward model for a satellite radar backscatter system used to predict sea surface wind vectors.
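The abstract does not state the extra error-bar term explicitly. As a hedged first-order sketch for a scalar input, assuming input noise of variance \sigma_x^2 that is small and symmetric, output noise variance \sigma_\nu^2, weight-posterior Hessian A, and g = \partial y / \partial w evaluated at the most probable weights (all symbols are assumptions introduced here, not taken from the paper), a Taylor expansion of the network output y(x; w) suggests a predictive variance of the form:

```latex
\sigma_y^2(x) \;\approx\;
  \underbrace{\sigma_\nu^2 + g^{\mathsf{T}} A^{-1} g}_{\text{usual Bayesian error bar}}
  \;+\;
  \underbrace{\sigma_x^2 \left( \frac{\partial y}{\partial x} \right)^{2}}_{\text{input-noise term}}
```

The last term grows where the regression is steep, which is where uncertainty in the input translates most strongly into uncertainty in the output.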
Abstract:
Developers of interactive software are confronted by an increasing variety of software tools to help engineer the interactive aspects of software applications. Not only do these tools fall into different categories in terms of functionality, but within each category there is a growing number of competing tools with similar, although not identical, features. Choice of user interface development tool (UIDT) is therefore becoming increasingly complex.
Abstract:
We address the important bioinformatics problem of predicting protein function from a protein's primary sequence. We consider the functional classification of G-Protein-Coupled Receptors (GPCRs), whose functions are specified in a class hierarchy. We tackle this task using a novel top-down hierarchical classification system where, for each node in the class hierarchy, the predictor attributes to be used in that node and the classifier to be applied to the selected attributes are chosen in a data-driven manner. Compared with a previous hierarchical classification system selecting classifiers only, our new system significantly reduced processing time without significantly sacrificing predictive accuracy.
Abstract:
The educational process is characterised by multiple outcomes such as the achievement of academic results of various standards and non-academic achievements. This paper shows how data envelopment analysis (DEA) can be used to guide secondary schools to improved performance through role-model identification and target setting in a way which recognises the multi-outcome nature of the education process and reflects the relative desirability of improving individual outcomes. The approach presented in the paper draws from a DEA-based assessment of the schools of a local education authority carried out by the authors. Data from that assessment are used to illustrate the approach presented in the paper. (Key words: Data envelopment analysis, education, target setting.)
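As a rough illustration of how DEA yields role models and targets, the sketch below solves the standard input-oriented CCR envelopment model with `scipy.optimize.linprog`. It is not the authors' formulation, which additionally reflects the relative desirability of improving individual outcomes. The function name `dea_ccr_input` and the three-school data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def dea_ccr_input(X, Y, o):
    """Input-oriented CCR efficiency of unit `o` (standard textbook model).

    X: (n_units, n_inputs) array of inputs, Y: (n_units, n_outputs) array
    of outputs. Returns (theta, lambdas): theta is the efficiency score and
    lambdas are the peer weights (units with lambda > 0 act as role models).
    """
    n, m = X.shape
    s = Y.shape[1]
    # Decision variables: [theta, lambda_1, ..., lambda_n]; minimise theta.
    c = np.zeros(n + 1)
    c[0] = 1.0
    # Inputs: sum_j lambda_j * x_ij - theta * x_io <= 0
    A_in = np.hstack([-X[o].reshape(m, 1), X.T])
    # Outputs: -sum_j lambda_j * y_rj <= -y_ro
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.concatenate([np.zeros(m), -Y[o]])
    bounds = [(0, None)] * (n + 1)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[0], res.x[1:]


# Hypothetical data: three schools, one input (e.g. spend per pupil)
# and two outcomes (e.g. an academic and a non-academic measure).
X = np.array([[1.0], [1.0], [1.0]])
Y = np.array([[4.0, 1.0], [1.0, 4.0], [2.0, 2.0]])

theta_a, lam_a = dea_ccr_input(X, Y, 0)
theta_c, lam_c = dea_ccr_input(X, Y, 2)
print("School A efficiency:", round(theta_a, 3))
print("School C efficiency:", round(theta_c, 3), "peer weights:", lam_c.round(3))
```

In this toy data set school C is dominated by a mix of schools A and B, so those two act as its role models, and the peer-weighted combination of their inputs and outputs gives its targets.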
Abstract:
In wireless sensor networks where nodes are powered by batteries, it is critical to prolong the network lifetime by minimizing the energy consumption of each node. In this paper, the cooperative multiple-input-multiple-output (MIMO) and data-aggregation techniques are jointly adopted to reduce the energy consumption per bit in wireless sensor networks by reducing the amount of data for transmission and making better use of network resources through cooperative communication. For this purpose, we derive a new energy model for a cluster-based sensor network employing the combined techniques, which considers the correlation between data generated by nodes and the distance between them. Using this model, the effect of the cluster size on the average energy consumption per node can be analyzed. It is shown that the energy efficiency of the network can be significantly enhanced in cooperative MIMO systems with data aggregation, compared with either cooperative MIMO systems without data aggregation or data-aggregation systems without cooperative MIMO, if sensor nodes are properly clustered. Both centralized and distributed data-aggregation schemes for the cooperating nodes to exchange and compress their data are also proposed and appraised; under these schemes, data correlation affects the energy performance of the integrated cooperative MIMO and data-aggregation systems in different ways.
Abstract:
For analysing financial time series two main opposing viewpoints exist: either capital markets are completely stochastic and therefore prices follow a random walk, or they are deterministic and consequently predictable. For each of these views a great variety of tools exists with which one can attempt to confirm the hypotheses. Unfortunately, these methods are not well suited to data characterised in part by both paradigms. This thesis investigates these two approaches in order to model the behaviour of financial time series. In the deterministic framework, methods are used to characterise the dimensionality of embedded financial data. The stochastic approach here includes an estimation of the unconditional and conditional return distributions using parametric, non-parametric and semi-parametric density estimation techniques. Finally, it will be shown how elements from these two approaches could be combined to achieve a more realistic model for financial time series.
Abstract:
A spatial object consists of data assigned to points in a space. Spatial objects, such as memory states and three dimensional graphical scenes, are diverse and ubiquitous in computing. We develop a general theory of spatial objects by modelling abstract data types of spatial objects as topological algebras of functions. One useful algebra is that of continuous functions, with operations derived from operations on space and data, and equipped with the compact-open topology. Terms are used as abstract syntax for defining spatial objects and conditional equational specifications are used for reasoning. We pose a completeness problem: Given a selection of operations on spatial objects, do the terms approximate all the spatial objects to arbitrary accuracy? We give some general methods for solving the problem and consider their application to spatial objects with real number attributes. © 2011 British Computer Society.
Abstract:
Purpose – The purpose of this paper is to investigate the joint effects of market orientation (MO) and corporate social responsibility (CSR) on firm performance. Design/methodology/approach – Data were collected via a questionnaire survey of star-rated hotels in China and a total of 143 valid responses were received. The hypotheses were tested by employing structural equation modelling with a maximum likelihood estimation option. Findings – It was found that although both MO and CSR could enhance performance, once the effects of CSR are accounted for, the direct effects of MO on performance diminish considerably, becoming almost non-existent. Although this result may be due to the fact that the research was conducted in China, a country where CSR might be crucially important to performance given the country's socialist legacy, it nonetheless provides strong evidence that MO's impact on organizational performance is mediated by CSR. Research limitations/implications – The main limitations include the use of cross-sectional data, the subjective measurement of performance and the uniqueness of the research setting (China). The findings provide an additional important insight into the processes by which a market-oriented culture is transformed into superior organizational performance. Originality/value – This paper is one of the first to examine the joint effects of MO and CSR on business performance. The empirical evidence from China adds to the existing literature on the respective importance of MO and CSR.
Abstract:
Visualization of high-dimensional data has always been a challenging task. Here we discuss and propose variants of non-linear data projection methods (Generative Topographic Mapping (GTM) and GTM with simultaneous feature saliency (GTM-FS)) that are adapted to be effective on very high-dimensional data. The adaptations use log space values at certain steps of the Expectation Maximization (EM) algorithm and during the visualization process. We have tested the proposed algorithms by visualizing electrostatic potential data for Major Histocompatibility Complex (MHC) class-I proteins. The experiments show that the proposed variants of the original GTM and GTM-FS worked successfully with data of more than 2000 dimensions, and we compare the results with other linear/nonlinear projection methods: Principal Component Analysis (PCA), Neuroscale (NSC) and Gaussian Process Latent Variable Model (GPLVM).
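The abstract does not say which EM steps are moved to log space. A common place is the E-step, where responsibilities are normalised: in thousands of dimensions the component likelihoods underflow to zero, so the normalisation is done on log values with the log-sum-exp trick. The sketch below shows that general technique only; the function names and the example log-likelihoods are hypothetical and not taken from the paper.

```python
import math


def log_sum_exp(values):
    """Compute log(sum(exp(v))) stably by factoring out the maximum."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def responsibilities(log_joint):
    """Normalise per-component log p(x, k) into posteriors p(k | x)."""
    log_norm = log_sum_exp(log_joint)
    return [math.exp(v - log_norm) for v in log_joint]


# Hypothetical log-likelihoods of one data point under three latent
# components; values this negative are typical in very high dimensions.
log_joint = [-5000.0, -5001.0, -5010.0]

# Naive exponentiation underflows, so direct normalisation would divide by 0.
naive = [math.exp(v) for v in log_joint]
print("naive likelihoods:", naive)

resp = responsibilities(log_joint)
print("log-space responsibilities:", resp)
```

Working with differences of log values keeps every exponent close to zero, so the responsibilities remain well defined however small the raw likelihoods are.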
Abstract:
Drying is an important unit operation in the process industries. Results have suggested that the energy used for drying increased from 12% of the total energy used in 1978 to 18% in 1990. A literature survey of previous studies regarding overall drying energy consumption has demonstrated that there is little continuity of methods and that energy trends could not be established. In the ceramics, timber and paper industrial sectors, specific energy consumption and energy trends have been investigated by auditing drying equipment. Ceramic products examined have included tableware, tiles, sanitaryware, electrical ceramics, plasterboard, refractories, bricks and abrasives. Data from industry has shown that drying energy has not varied significantly in the ceramics sector over the last decade, representing about 31% of the total energy consumed. Information from the timber industry has established that radical changes have occurred over the last 20 years, both in terms of equipment and energy utilisation. The energy efficiency of hardwood drying has improved by 15% since the 1970s, although no significant savings have been realised for softwood. A survey estimating the energy efficiency and operating characteristics of 192 paper dryer sections has been conducted. Drying energy was found to increase to nearly 60% of the total energy used in the early 1980s, but has fallen over the last decade, representing 23% of the total in 1993. These results have demonstrated that effective energy saving measures, such as improved pressing and heat recovery, have been successfully implemented since the 1970s. Artificial neural networks have successfully been applied to model process characteristics of microwave and convective drying of paper-coated gypsum cove. Parameters modelled have included product moisture loss, core gypsum temperature and quality factors relating to paper burning and bubbling defects.
Evaluation of thermal and dielectric properties has highlighted gypsum's heat-sensitive characteristics in convective and electromagnetic regimes. Modelling experimental data has shown that the networks were capable of simulating drying process characteristics to a high degree of accuracy. Product weight and temperature were predicted to within 0.5% and 5 °C of the target data respectively. Furthermore, it was demonstrated that the underlying properties of the data could be predicted through a high level of input noise.
Abstract:
Fluctuations of liquids at the scales where the hydrodynamic and atomistic descriptions overlap are considered. The importance of these fluctuations for atomistic motions is discussed and examples of their accurate modelling with a multi-space-time-scale fluctuating hydrodynamics scheme are provided. To resolve microscopic details of liquid systems, including biomolecular solutions, together with macroscopic fluctuations in space-time, a novel hybrid atomistic-fluctuating hydrodynamics approach is introduced. For a smooth transition between the atomistic and continuum representations, an analogy with two-phase hydrodynamics is used that leads to a strict preservation of macroscopic mass and momentum conservation laws. Examples of numerical implementation of the new hybrid approach for the multiscale simulation of liquid argon in equilibrium conditions are provided. © 2014 The Author(s) Published by the Royal Society.
Abstract:
Chicken breast from nine products and from the following production regimes: conventional (chilled and frozen), organic and free range, were analysed for fatty acid composition of total lipids, preventative and chain-breaking antioxidant contents and lipid oxidation during 5 days of sub-ambient storage following purchase. Total lipids were extracted with an optimal amount of a cold chloroform-methanol solvent. Lipid compositions varied, but there were differences between conventional and organic products in their contents of total polyunsaturated fatty acids and n-3 and n-6 fatty acids and n-6:n-3 ratio. Of the antioxidants, α-tocopherol content was inversely correlated with lipid oxidation. The antioxidant enzyme activities of catalase, glutathione peroxidase and glutathione reductase varied between products. Modelling with partial least squares regression showed no overall relationship between total antioxidants and lipid data, but certain individual antioxidants showed a relationship with specific lipid fractions.
Abstract:
The principal theme of this thesis is the identification of additional factors affecting, and consequently to better allow, the prediction of soft contact lens fit. Various models have been put forward in an attempt to predict the parameters that influence soft contact lens fit dynamics; however, the factors that influence variation in soft lens fit are still not fully understood. The investigations in this body of work involved the use of a variety of different imaging techniques to both quantify the anterior ocular topography and assess lens fit. The use of Anterior-Segment Optical Coherence Tomography (AS-OCT) allowed for a more complete characterisation of the cornea and corneoscleral profile (CSP) than either conventional keratometry or videokeratoscopy alone, and for the collection of normative data relating to the CSP for a substantial sample size. The scleral face was identified as being rotationally asymmetric, the mean corneoscleral junction (CSJ) angle being sharpest nasally and becoming progressively flatter at the temporal, inferior and superior limbal junctions. Additionally, 77% of all CSJ angles were within ±5° of 180°, demonstrating an almost tangential extension of the cornea to form the paralimbal sclera. Use of AS-OCT allowed for a more robust determination of corneal diameter than that of white-to-white (WTW) measurement, which is highly variable and dependent on changes in peripheral corneal transparency. Significant differences in ocular topography were found between different ethnicities and sexes, most notably for corneal diameter and corneal sagittal height variables. Lens tightness was found to be significantly correlated with the difference between horizontal CSJ angles (r = +0.40, P = 0.0086).
Modelling of the CSP data gained allowed for prediction of up to 24% of the variance in contact lens fit; however, it was likely that stronger associations and an increase in the modelled prediction of variance in fit would have occurred had an objective method of lens fit assessment been used. A subsequent investigation to determine the validity and repeatability of objective contact lens fit assessment using digital video capture showed no significant benefit over subjective evaluation. The technique, however, was employed in the ensuing investigation to show significant changes in lens fit between 8 hours (the longest duration of wear previously examined) and 16 hours, demonstrating that wearing time is an additional factor driving lens fit dynamics. The modelling of data from enhanced videokeratoscopy composite maps alone allowed up to 77% of the variance in soft contact lens fit to be predicted, and up to almost 90% when used in conjunction with OCT. The investigations provided further insight into the ocular topography and factors affecting soft contact lens fit.
Abstract:
Predicting species' potential and future distribution has become a relevant tool in biodiversity monitoring and conservation. In this data article we present the suitability map of a virtual species generated based on two bioclimatic variables, and a dataset containing more than 700,000 random observations at the extent of Europe. The dataset includes spatial attributes such as distance to roads, protected areas, country codes, and the habitat suitability of two spatially clustered species (a grassland species and a forest species) and a widespread species.