52 results for non-trivial data structures
in the Aston University Research Archive
Abstract:
Grape is one of the world's largest fruit crops, with approximately 67.5 million tonnes produced each year, and energy is an important element in modern grape production as it depends heavily on fossil and other energy resources. Efficient use of these energies is a necessary step toward reducing environmental hazards, preventing destruction of natural resources and ensuring agricultural sustainability. Hence, identifying excessive use of energy as well as reducing energy resources is the main focus of this paper to optimize energy consumption in grape production. In this study we use a two-stage methodology to find the association of energy efficiency and performance explained by farmers' specific characteristics. In the first stage a non-parametric Data Envelopment Analysis is used to model efficiencies as an explicit function of human labor, machinery, chemicals, FYM (farmyard manure), diesel fuel, electricity and water-for-irrigation energies. In the second stage, farm-specific variables such as farmers' age, gender, level of education and agricultural experience are used in a Tobit regression framework to explain how these factors influence the efficiency of grape farming. The result of the first stage shows substantial inefficiency among the grape producers in the studied area, while the second stage shows that the main difference between efficient and inefficient farmers lay in the use of chemicals, diesel fuel and water for irrigation. Efficient farmers' use of chemicals such as insecticides, herbicides and fungicides was considerably lower than that of inefficient ones. The results also revealed that more educated farmers are more energy efficient than their less educated counterparts. © 2013.
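For readers unfamiliar with the first-stage model, the following is a minimal sketch (not the authors' code) of an input-oriented, constant-returns-to-scale DEA envelopment program solved farm by farm with a linear-programming solver; the input/output layout and all numbers are illustrative assumptions.

```python
# Hedged sketch: input-oriented CCR DEA efficiency of one decision-making unit.
import numpy as np
from scipy.optimize import linprog

def dea_efficiency(X, Y, o):
    """Technical efficiency of DMU `o`.

    X : (n_dmus, n_inputs)  energy inputs (labour, machinery, diesel, ...)
    Y : (n_dmus, n_outputs) outputs (e.g. grape yield)
    """
    n, m = X.shape
    _, s = Y.shape
    # Decision vector z = [theta, lambda_1, ..., lambda_n]; minimise theta.
    c = np.zeros(n + 1)
    c[0] = 1.0
    A_ub, b_ub = [], []
    # Input constraints: sum_j lambda_j * x_ij - theta * x_io <= 0
    for i in range(m):
        A_ub.append(np.concatenate(([-X[o, i]], X[:, i])))
        b_ub.append(0.0)
    # Output constraints: -sum_j lambda_j * y_rj <= -y_ro
    for r in range(s):
        A_ub.append(np.concatenate(([0.0], -Y[:, r])))
        b_ub.append(-Y[o, r])
    bounds = [(0, None)] * (n + 1)          # theta >= 0, lambda_j >= 0
    res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[0]                          # efficiency score in (0, 1]

# Illustrative data: 5 farms, 3 energy inputs, 1 output.
rng = np.random.default_rng(0)
X = rng.uniform(1, 10, size=(5, 3))
Y = rng.uniform(1, 10, size=(5, 1))
scores = [dea_efficiency(X, Y, o) for o in range(5)]
```

In a second stage of the kind described, these scores would become the dependent variable of a Tobit regression on farmer characteristics.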
Abstract:
Physical systems with co-existence and interplay of processes featuring distinct spatio-temporal scales are found in various research areas ranging from studies of brain activity to astrophysics. The complexity of such systems makes their theoretical and experimental analysis technically and conceptually challenging. Here, we discovered that while the radiation of partially mode-locked fibre lasers is stochastic and intermittent on a short time scale, it exhibits non-trivial periodicity and long-scale correlations over its slow evolution from one round-trip to another. A new technique for evolution mapping of the intensity autocorrelation function has enabled us to reveal a variety of localized spatio-temporal structures and to study experimentally their symbiotic co-existence with stochastic radiation. Real-time characterization of dynamical spatio-temporal regimes of laser operation is set to bring new insights into the rich underlying nonlinear physics of practical active- and passive-cavity photonic systems.
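As an illustration of the kind of processing the abstract describes, here is a minimal sketch (an assumption, not the authors' code) that builds an evolution map of the intensity autocorrelation function by computing the ACF of each cavity round trip and stacking the rows; sampling parameters are placeholders.

```python
# Hedged sketch: "evolution map" of the intensity autocorrelation function.
import numpy as np

def acf_evolution_map(intensity, samples_per_round_trip):
    """intensity : 1-D array of measured intensity, concatenated round trips."""
    n_rt = len(intensity) // samples_per_round_trip
    trace = intensity[:n_rt * samples_per_round_trip]
    trips = trace.reshape(n_rt, samples_per_round_trip)
    rows = []
    for I in trips:
        dI = I - I.mean()
        # Wiener-Khinchin: ACF = inverse FFT of the (zero-padded) power spectrum.
        spec = np.abs(np.fft.fft(dI, n=2 * len(dI))) ** 2
        acf = np.fft.ifft(spec).real[:len(dI)]
        rows.append(acf / acf[0])            # normalise so ACF(0) = 1
    return np.array(rows)                    # shape: (round trip, lag)

# Each row is one round trip; slow vertical structure in the stacked map is
# what reveals long-scale correlations and periodicity of the radiation.
```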
Abstract:
This paper is drawn from the use of data envelopment analysis (DEA) in helping a Portuguese bank to manage the performance of its branches. The bank wanted to set targets for the branches on such variables as growth in number of clients, growth in funds deposited and so on. Such variables can take positive and negative values but apart from some exceptions, traditional DEA models have hitherto been restricted to non-negative data. We report on the development of a model to handle unrestricted data in a DEA framework and illustrate the use of this model on data from the bank concerned.
Abstract:
Recently there has been an outburst of interest in extending topographic maps of vectorial data to more general data structures, such as sequences or trees. However, there is no general consensus as to how best to process sequences using topographic maps, and this topic remains an active focus of neurocomputational research. The representational capabilities and internal representations of the models are not well understood. Here, we rigorously analyze a generalization of the self-organizing map (SOM) for processing sequential data, recursive SOM (RecSOM) (Voegtlin, 2002), as a nonautonomous dynamical system consisting of a set of fixed input maps. We argue that contractive fixed-input maps are likely to produce Markovian organizations of receptive fields on the RecSOM map. We derive bounds on parameter β (weighting the importance of importing past information when processing sequences) under which contractiveness of the fixed-input maps is guaranteed. Some generalizations of SOM contain a dynamic module responsible for processing temporal contexts as an integral part of the model. We show that Markovian topographic maps of sequential data can be produced using a simple fixed (nonadaptable) dynamic module externally feeding a standard topographic model designed to process static vectorial data of fixed dimensionality (e.g., SOM). However, by allowing trainable feedback connections, one can obtain Markovian maps with superior memory depth and topography preservation. We elaborate on the importance of non-Markovian organizations in topographic maps of sequential data. © 2006 Massachusetts Institute of Technology.
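A minimal sketch of the object being analyzed may help: the RecSOM fixed-input map of Voegtlin (2002), together with a crude numerical probe of its contractiveness for a given β. Map sizes, α, β and the sampling scheme are illustrative assumptions, not values from the paper.

```python
# Hedged sketch: RecSOM activation y_i = exp(-alpha*||x - w_i||^2 - beta*||y_prev - c_i||^2),
# viewed for a fixed input x as a map on the space of previous activations.
import numpy as np

def fixed_input_map(y_prev, x, W, C, alpha, beta):
    """One application of F_x: previous activation vector -> new activation vector."""
    d_in = np.sum((W - x) ** 2, axis=1)          # ||x - w_i||^2
    d_ctx = np.sum((C - y_prev) ** 2, axis=1)    # ||y_prev - c_i||^2
    return np.exp(-alpha * d_in - beta * d_ctx)

def estimate_lipschitz(x, W, C, alpha, beta, trials=2000, seed=0):
    """Sample context pairs and estimate sup ||F_x(y) - F_x(y')|| / ||y - y'||."""
    rng = np.random.default_rng(seed)
    n = C.shape[0]
    best = 0.0
    for _ in range(trials):
        y1, y2 = rng.uniform(0, 1, size=(2, n))
        num = np.linalg.norm(fixed_input_map(y1, x, W, C, alpha, beta)
                             - fixed_input_map(y2, x, W, C, alpha, beta))
        best = max(best, num / np.linalg.norm(y1 - y2))
    return best   # an estimate below 1 suggests F_x is contractive for this beta

n_units, dim = 100, 3
rng = np.random.default_rng(1)
W = rng.normal(size=(n_units, dim))              # input weights
C = rng.uniform(0, 1, size=(n_units, n_units))   # context weights
x = rng.normal(size=dim)
print(estimate_lipschitz(x, W, C, alpha=2.0, beta=0.05))
```

The paper's contribution is an analytical bound on β guaranteeing contraction; the sketch only probes it empirically.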
Abstract:
Recently, there has been considerable research activity in extending topographic maps of vectorial data to more general data structures, such as sequences or trees. However, the representational capabilities and internal representations of the models are not well understood. We rigorously analyze a generalization of the Self-Organizing Map (SOM) for processing sequential data, Recursive SOM (RecSOM [1]), as a non-autonomous dynamical system consisting of a set of fixed input maps. We show that contractive fixed input maps are likely to produce Markovian organizations of receptive fields on the RecSOM map. We derive bounds on parameter $\beta$ (weighting the importance of importing past information when processing sequences) under which contractiveness of the fixed input maps is guaranteed.
Abstract:
Linear typing schemes can be used to guarantee non-interference and so the soundness of in-place update with respect to a functional semantics. But linear schemes are restrictive in practice, and more restrictive than necessary to guarantee soundness of in-place update. This limitation has prompted research into static analysis and more sophisticated typing disciplines to determine when in-place update may be safely used, or to combine linear and non-linear schemes. Here we contribute to this direction by defining a new typing scheme that better approximates the semantic property of soundness of in-place update for a functional semantics. We begin from the observation that some data are used only in a read-only context, after which they may safely be re-used before being destroyed. Formalising the in-place update interpretation in a machine model semantics allows us to refine this observation, motivating three usage aspects apparent from the semantics that are used to annotate function argument types. The aspects are (1) used destructively, (2) used read-only but shared with the result, and (3) used read-only and not shared with the result. The main novelty is aspect (2), which allows a linear value to be safely read and even aliased with a result of a function without being consumed. This novelty makes our type system more expressive than previous systems for functional languages in the literature. The system remains simple and intuitive, but it enjoys a strong soundness property whose proof is non-trivial. Moreover, our analysis features principal types and feasible type reconstruction, as shown in M. Konečný (In TYPES 2002 workshop, Nijmegen, Proceedings, Springer-Verlag, 2003).
Abstract:
Visualization of high-dimensional data has always been a challenging task. Here we discuss and propose variants of non-linear data projection methods (Generative Topographic Mapping (GTM) and GTM with simultaneous feature saliency (GTM-FS)) that are adapted to be effective on very high-dimensional data. The adaptations use log-space values at certain steps of the Expectation Maximization (EM) algorithm and during the visualization process. We have tested the proposed algorithms by visualizing electrostatic potential data for Major Histocompatibility Complex (MHC) class-I proteins. The experiments show that the variants of the original GTM and GTM-FS worked successfully with data of more than 2000 dimensions, and we compare the results with other linear/nonlinear projection methods: Principal Component Analysis (PCA), Neuroscale (NSC) and the Gaussian Process Latent Variable Model (GPLVM).
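A minimal sketch of the log-space idea (an assumption about the implementation, not the authors' code): computing GTM responsibilities from log-likelihood terms with logsumexp so that the E-step does not underflow in very high dimensions.

```python
# Hedged sketch: log-space E-step responsibilities for a GTM-style Gaussian mixture.
import numpy as np
from scipy.special import logsumexp

def log_responsibilities(X, Y, beta):
    """X: (N, D) data; Y: (K, D) mixture centres on the GTM grid; beta: inverse noise variance."""
    _, D = X.shape
    # Squared distances between every centre and every data point: shape (K, N).
    d2 = np.sum((Y[:, None, :] - X[None, :, :]) ** 2, axis=2)
    log_p = 0.5 * D * np.log(beta / (2 * np.pi)) - 0.5 * beta * d2   # log p(x_n | k)
    # Normalise over the K components entirely in log space.
    log_R = log_p - logsumexp(log_p, axis=0, keepdims=True)
    return log_R        # exp(log_R) sums to 1 over k for each data point

# Even when D > 2000 and every log p(x_n | k) is hugely negative, log_R stays
# finite, which is the point of working in log space.
```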
Abstract:
The design and implementation of data bases involve, firstly, the formulation of a conceptual data model by systematic analysis of the structure and information requirements of the organisation for which the system is being designed; secondly, the logical mapping of this conceptual model onto the data structure of the target data base management system (DBMS); and thirdly, the physical mapping of this structured model into storage structures of the target DBMS. The accuracy of both the logical and physical mappings determines the performance of the resulting systems. This thesis describes research which develops software tools to facilitate the implementation of data bases. A conceptual model describing the information structure of a hospital is derived using the Entity-Relationship (E-R) approach, and this model forms the basis for mapping onto the logical model. Rules are derived for automatically mapping the conceptual model onto relational and CODASYL types of data structures. Further algorithms are developed for partly automating the implementation of these models on the INGRES, MIMER and VAX-11 DBMSs.
Abstract:
In this paper, we discuss some practical implications for implementing adaptable network algorithms applied to non-stationary time series problems. Two real world data sets, containing electricity load demands and foreign exchange market prices, are used to test several different methods, ranging from linear models with fixed parameters, to non-linear models which adapt both parameters and model order on-line. Training with the extended Kalman filter, we demonstrate that the dynamic model-order increment procedure of the resource allocating RBF network (RAN) is highly sensitive to the parameters of the novelty criterion. We investigate the use of system noise for increasing the plasticity of the Kalman filter training algorithm, and discuss the consequences for on-line model order selection. The results of our experiments show that there are advantages to be gained in tracking real world non-stationary data through the use of more complex adaptive models.
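As a hedged illustration of the training scheme discussed (not the authors' implementation, and simplified to a fixed-centre RBF network whose output weights form the filter state, making the observation model linear), the sketch below shows how a system-noise term Q keeps a Kalman-filter-trained RBF plastic on non-stationary data; all centres, widths and noise levels are placeholders.

```python
# Hedged sketch: on-line Kalman-filter training of RBF output weights with system noise.
import numpy as np

class KalmanRBF:
    def __init__(self, centres, width, q=1e-4, r=1e-2):
        self.centres = centres                     # (M, d) fixed RBF centres
        self.width = width
        self.w = np.zeros(len(centres))            # output weights = filter state
        self.P = np.eye(len(centres))              # state covariance
        self.Q = q * np.eye(len(centres))          # system (process) noise
        self.R = r                                 # observation noise variance

    def _phi(self, x):
        d2 = np.sum((self.centres - x) ** 2, axis=1)
        return np.exp(-d2 / (2 * self.width ** 2))

    def update(self, x, y):
        """One on-line step: random-walk prediction, then correction with (x, y)."""
        phi = self._phi(x)
        self.P = self.P + self.Q                   # adding Q each step -> plasticity
        s = phi @ self.P @ phi + self.R            # innovation variance
        k = self.P @ phi / s                       # Kalman gain
        self.w = self.w + k * (y - phi @ self.w)
        self.P = self.P - np.outer(k, phi) @ self.P
        return phi @ self.w                        # filtered one-step prediction

# Larger q lets the weights track drifting (non-stationary) structure faster,
# at the cost of noisier estimates: the trade-off discussed in the abstract.
```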
Abstract:
Conventional DEA models assume deterministic, precise and non-negative data for input and output observations. However, real applications may be characterized by observations that are given in the form of intervals and include negative numbers. For instance, the consumption of electricity in decentralized energy resources may be either negative or positive, depending on the heat consumption. Likewise, the heat losses in distribution networks may lie within a certain range, depending on, e.g., external temperature and real-time outtake. Complementing earlier work that separately addresses the two problems of interval data and negative data, we propose a comprehensive evaluation process for measuring the relative efficiencies of a set of DMUs in DEA. In our general formulation, the intervals may contain upper or lower bounds with different signs. The proposed method determines upper and lower bounds for the technical efficiency through the limits of the intervals after decomposition. Based on the interval scores, DMUs are then classified into three classes, namely the strictly efficient, the weakly efficient and the inefficient. An intuitive ranking approach is presented for the respective classes. The approach is demonstrated through an application to the evaluation of bank branches. © 2013.
Abstract:
The Vapnik-Chervonenkis (VC) dimension is a combinatorial measure of a certain class of machine learning problems, which may be used to obtain upper and lower bounds on the number of training examples needed to learn to prescribed levels of accuracy. Most of the known bounds apply to the Probably Approximately Correct (PAC) framework, which is the framework within which we work in this paper. For a learning problem with some known VC dimension, much is known about the order of growth of the sample-size requirement of the problem as a function of the PAC parameters. The exact value of the sample-size requirement is, however, less well known, and depends heavily on the particular learning algorithm being used. This is a major obstacle to the practical application of the VC dimension. Hence it is important to know exactly how the sample-size requirement depends on the VC dimension, and with that in mind, we describe a general algorithm for learning problems having VC dimension 1. Its sample-size requirement is minimal (as a function of the PAC parameters), and turns out to be the same for all non-trivial learning problems having VC dimension 1. While the method used cannot be naively generalised to higher VC dimension, it suggests that optimal algorithm-dependent bounds may improve substantially on current upper bounds.
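For background, the classical distribution-free PAC sample-complexity bounds in terms of the VC dimension d are quoted below (standard results given for context, not the paper's algorithm-dependent bound for d = 1):

```latex
% Standard PAC upper and lower bounds on the number of examples m needed to
% learn a class of VC dimension d to accuracy \epsilon with confidence 1-\delta.
\[
  m(\epsilon,\delta) \;=\; O\!\left(\frac{1}{\epsilon}\Big(d\ln\frac{1}{\epsilon} + \ln\frac{1}{\delta}\Big)\right),
  \qquad
  m(\epsilon,\delta) \;=\; \Omega\!\left(\frac{d + \ln(1/\delta)}{\epsilon}\right).
\]
```

The gap between these generic bounds is exactly what motivates the paper's exact, algorithm-dependent analysis for VC dimension 1.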
Abstract:
A formalism for modelling the dynamics of Genetic Algorithms (GAs) using methods from statistical mechanics, originally due to Prugel-Bennett and Shapiro, is reviewed, generalized and improved upon. This formalism can be used to predict the averaged trajectory of macroscopic statistics describing the GA's population. These macroscopics are chosen to average well between runs, so that fluctuations from mean behaviour can often be neglected. Where necessary, non-trivial terms are determined by assuming maximum entropy with constraints on known macroscopics. Problems of realistic size are described in compact form and finite population effects are included, often proving to be of fundamental importance. The macroscopics used here are cumulants of an appropriate quantity within the population and the mean correlation (Hamming distance) within the population. Including the correlation as an explicit macroscopic provides a significant improvement over the original formulation. The formalism is applied to a number of simple optimization problems in order to determine its predictive power and to gain insight into GA dynamics. Problems which are most amenable to analysis come from the class where alleles within the genotype contribute additively to the phenotype. This class can be treated with some generality, including problems with inhomogeneous contributions from each site, non-linear or noisy fitness measures, simple diploid representations and temporally varying fitness. The results can also be applied to a simple learning problem, generalization in a binary perceptron, and a limit is identified for which the optimal training batch size can be determined for this problem. The theory is compared to averaged results from a real GA in each case, showing excellent agreement if the maximum entropy principle holds. Some situations where this approximation breaks down are identified. In order to fully test the formalism, an attempt is made on the strongly NP-hard problem of storing random patterns in a binary perceptron. Here, the relationship between the genotype and phenotype (training error) is strongly non-linear. Mutation is modelled under the assumption that perceptron configurations are typical of perceptrons with a given training error. Unfortunately, this assumption does not provide a good approximation in general. It is conjectured that perceptron configurations would have to be constrained by other statistics in order to accurately model mutation for this problem. Issues arising from this study are discussed in conclusion and some possible areas of further research are outlined.
Abstract:
In this paper we introduce and illustrate non-trivial upper and lower bounds on the learning curves for one-dimensional Gaussian Processes. The analysis is carried out emphasising the effects induced on the bounds by the smoothness of the random process described by the Modified Bessel and the Squared Exponential covariance functions. We present an explanation of the early, linearly-decreasing behavior of the learning curves and the bounds as well as a study of the asymptotic behavior of the curves. The effects of the noise level and the lengthscale on the tightness of the bounds are also discussed.
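For reference, the standard forms of the two covariance functions named above are given below (the paper's exact parametrisation may differ):

```latex
% Squared Exponential and Matern (modified Bessel) covariance functions,
% in their standard textbook forms; K_nu is the modified Bessel function
% of the second kind, and larger nu gives smoother sample paths, with the
% squared exponential recovered in the nu -> infinity limit.
\[
  k_{\mathrm{SE}}(x,x') = \sigma_f^2 \exp\!\left(-\frac{(x-x')^2}{2\ell^2}\right),
  \qquad
  k_{\nu}(r) = \sigma_f^2\,\frac{2^{1-\nu}}{\Gamma(\nu)}
  \left(\frac{\sqrt{2\nu}\,r}{\ell}\right)^{\!\nu}
  K_{\nu}\!\left(\frac{\sqrt{2\nu}\,r}{\ell}\right),
  \quad r = |x - x'|.
\]
```

The lengthscale ℓ and noise level referred to in the abstract enter these expressions (and the likelihood) directly, which is why they control the tightness of the learning-curve bounds.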
Abstract:
Marketing scholars are increasingly recognizing the importance of investigating phenomena at multiple levels. However, the analysis methods that are currently dominant within marketing may not be appropriate for dealing with multilevel or nested data structures. We identify the state of contemporary multilevel marketing research, finding that typical empirical approaches within marketing research may be less effective at explicitly taking account of multilevel data structures than those in other organizational disciplines. A Monte Carlo simulation, based on results from a previously published marketing study, demonstrates that different approaches to analysis of the same data can produce very different results (both in terms of power and effect size). The implication is that marketing scholars should be cautious when analyzing multilevel or other grouped data, and we provide a discussion and introduction to the use of hierarchical linear modeling for this purpose.
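A minimal sketch of the recommended approach (with hypothetical variable names, not the published study's measures): fitting a two-level, random-intercept hierarchical linear model with statsmodels.

```python
# Hedged sketch: random-intercept hierarchical linear model for nested data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n_stores, n_per_store = 30, 20
store = np.repeat(np.arange(n_stores), n_per_store)
store_effect = rng.normal(0, 0.8, n_stores)[store]          # level-2 variation
service_quality = rng.normal(0, 1, n_stores * n_per_store)
satisfaction = (2.0 + 0.5 * service_quality + store_effect
                + rng.normal(0, 1, n_stores * n_per_store))
df = pd.DataFrame({"satisfaction": satisfaction,
                   "service_quality": service_quality,
                   "store": store})

# Random intercept for each store; fixed slope for service quality.
model = smf.mixedlm("satisfaction ~ service_quality", df, groups=df["store"])
result = model.fit()
print(result.summary())

# Fitting the same data with plain OLS, ignoring the store grouping, tends to
# misstate standard errors and power: the kind of distortion the Monte Carlo
# simulation in the paper is designed to expose.
```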
Abstract:
The effects of attentional modulation on activity within the human visual cortex were investigated using magnetoencephalography. Chromatic sinusoidal stimuli were used to evoke activity from the occipital cortex, with attention directed either toward or away from the stimulus using a bar-orientation judgment task. For five observers, global magnetic field power was plotted as a function of time from stimulus onset. The major peak of each function occurred at about 120 ms latency and was well modeled by a current dipole near the calcarine sulcus. Independent component analysis (ICA) on the non-averaged data for each observer also revealed one component of calcarine origin, the location of which matched that of the dipolar source determined from the averaged data. For two observers, ICA revealed a second component near the parieto-occipital sulcus. Although no effects of attention were evident using standard averaging procedures, time-varying spectral analyses of single trials revealed that the main effect of attention was to alter the level of oscillatory activity. Most notably, a sustained increase in alpha-band (7-12 Hz) activity of both calcarine and parieto-occipital origin was evident. In addition, calcarine activity in the range of 13-21 Hz was enhanced, while calcarine activity in the range of 5-6 Hz was reduced. Our results are consistent with the hypothesis that attentional modulation affects neural processing within the calcarine and parieto-occipital cortex by altering the amplitude of alpha-band activity and other natural brain rhythms. © 2003 Elsevier Inc. All rights reserved.
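As a hedged illustration of the single-trial decomposition step (not the authors' pipeline), the sketch below applies FastICA to non-averaged sensor data; array shapes and the number of components are assumptions.

```python
# Hedged sketch: ICA on non-averaged (single-trial) MEG sensor data.
import numpy as np
from sklearn.decomposition import FastICA

# X: sensor data of shape (n_samples, n_channels), concatenated single trials.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 64))          # placeholder data

ica = FastICA(n_components=20, random_state=0)
sources = ica.fit_transform(X)           # (n_samples, n_components) time courses
mixing = ica.mixing_                     # (n_channels, n_components) sensor patterns

# Each column of `mixing` is a component's field pattern over the sensors,
# which is what allows a component of calcarine or parieto-occipital origin
# to be identified and localised before single-trial spectral analysis.
```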