31 results for FORMALISM
in Aston University Research Archive
Abstract:
States or state sequences in neural network models are made to represent concepts from applications. This paper motivates, introduces and discusses a formalism for denoting such representations; a representation for representations. The formalism is illustrated by using it to discuss the representation of variable binding and inference abstractly, and then to present four specific representations. One of these is an apparently novel hybrid of phasic and tensor-product representations which retains the desirable properties of each.
Abstract:
A new approach to optimisation is introduced based on a precise probabilistic statement of what is ideally required of an optimisation method. It is convenient to express the formalism in terms of the control of a stationary environment. This leads to an objective function for the controller which unifies the objectives of exploration and exploitation, thereby providing a quantitative principle for managing this trade-off. This is demonstrated using a variant of the multi-armed bandit problem. This approach opens new possibilities for optimisation algorithms, particularly by using neural network or other adaptive methods for the adaptive controller. It also opens possibilities for deepening understanding of existing methods. The realisation of these possibilities requires research into practical approximations of the exact formalism.
Abstract:
A formalism for describing the dynamics of Genetic Algorithms (GAs) using methods from statistical mechanics is applied to the problem of generalization in a perceptron with binary weights. The dynamics are solved for the case where a new batch of training patterns is presented to each population member each generation, which considerably simplifies the calculation. The theory is shown to agree closely with simulations of a real GA averaged over many runs, accurately predicting the mean best solution found. For weak selection and large problem size the difference equations describing the dynamics can be expressed analytically and we find that the effects of noise due to the finite size of each training batch can be removed by increasing the population size appropriately. If this population resizing is used, one can deduce the most computationally efficient size of training batch each generation. For independent patterns this choice also gives the minimum total number of training patterns used. Although using independent patterns is a very inefficient use of training patterns in general, this work may also prove useful for determining the optimum batch size in the case where patterns are recycled.
Abstract:
A formalism recently introduced by Prugel-Bennett and Shapiro uses the methods of statistical mechanics to model the dynamics of genetic algorithms. To show that the formalism is of more general interest than the test cases they consider, the technique is applied in this paper to the subset sum problem, which is a combinatorial optimization problem with a strongly non-linear energy (fitness) function and many local minima under single spin flip dynamics. It is a problem which exhibits interesting dynamics, reminiscent of stabilizing selection in population biology. The dynamics are solved under certain simplifying assumptions and are reduced to a set of difference equations for a small number of relevant quantities. The quantities used are the population's cumulants, which describe its shape, and the mean correlation within the population, which measures the microscopic similarity of population members. Including the mean correlation allows a better description of the population than the cumulants alone would provide and represents a new and important extension of the technique. The formalism includes finite population effects and describes problems of realistic size. The theory is shown to agree closely with simulations of a real genetic algorithm and the mean best energy is accurately predicted.
Abstract:
Neural networks have often been motivated by superficial analogy with biological nervous systems. Recently, however, it has become widely recognised that the effective application of neural networks requires instead a deeper understanding of the theoretical foundations of these models. Insight into neural networks comes from a number of fields including statistical pattern recognition, computational learning theory, statistics, information geometry and statistical mechanics. As an illustration of the importance of understanding the theoretical basis for neural network models, we consider their application to the solution of multi-valued inverse problems. We show how a naive application of the standard least-squares approach can lead to very poor results, and how an appreciation of the underlying statistical goals of the modelling process allows the development of a more general and more powerful formalism which can tackle the problem of multi-modality.
Abstract:
A formalism for modelling the dynamics of Genetic Algorithms (GAs) using methods from statistical mechanics, originally due to Prugel-Bennett and Shapiro, is reviewed, generalized and improved upon. This formalism can be used to predict the averaged trajectory of macroscopic statistics describing the GA's population. These macroscopics are chosen to average well between runs, so that fluctuations from mean behaviour can often be neglected. Where necessary, non-trivial terms are determined by assuming maximum entropy with constraints on known macroscopics. Problems of realistic size are described in compact form and finite population effects are included, often proving to be of fundamental importance. The macroscopics used here are cumulants of an appropriate quantity within the population and the mean correlation (Hamming distance) within the population. Including the correlation as an explicit macroscopic provides a significant improvement over the original formulation. The formalism is applied to a number of simple optimization problems in order to determine its predictive power and to gain insight into GA dynamics. Problems which are most amenable to analysis come from the class where alleles within the genotype contribute additively to the phenotype. This class can be treated with some generality, including problems with inhomogeneous contributions from each site, non-linear or noisy fitness measures, simple diploid representations and temporally varying fitness. The results can also be applied to a simple learning problem, generalization in a binary perceptron, and a limit is identified for which the optimal training batch size can be determined for this problem. The theory is compared to averaged results from a real GA in each case, showing excellent agreement if the maximum entropy principle holds. Some situations where this approximation breaks down are identified.
In order to fully test the formalism, an attempt is made on the strong NP-hard problem of storing random patterns in a binary perceptron. Here, the relationship between the genotype and phenotype (training error) is strongly non-linear. Mutation is modelled under the assumption that perceptron configurations are typical of perceptrons with a given training error. Unfortunately, this assumption does not provide a good approximation in general. It is conjectured that perceptron configurations would have to be constrained by other statistics in order to accurately model mutation for this problem. Issues arising from this study are discussed in conclusion and some possible areas of further research are outlined.
Abstract:
A theoretical model is presented which describes selection in a genetic algorithm (GA) under a stochastic fitness measure and correctly accounts for finite population effects. Although this model describes a number of selection schemes, we only consider Boltzmann selection in detail here as results for this form of selection are particularly transparent when fitness is corrupted by additive Gaussian noise. Finite population effects are shown to be of fundamental importance in this case, as the noise has no effect in the infinite population limit. In the limit of weak selection we show how the effects of any Gaussian noise can be removed by increasing the population size appropriately. The theory is tested on two closely related problems: the one-max problem corrupted by Gaussian noise and generalization in a perceptron with binary weights. The averaged dynamics can be accurately modelled for both problems using a formalism which describes the dynamics of the GA using methods from statistical mechanics. The second problem is a simple example of a learning problem and by considering this problem we show how the accurate characterization of noise in the fitness evaluation may be relevant in machine learning. The training error (negative fitness) is the number of misclassified training examples in a batch and can be considered as a noisy version of the generalization error if an independent batch is used for each evaluation. The noise is due to the finite batch size and in the limit of large problem size and weak selection we show how the effect of this noise can be removed by increasing the population size. This allows the optimal batch size to be determined, which minimizes computation time as well as the total number of training examples required.
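As an illustration of the selection scheme discussed in this abstract, the following is a minimal sketch of one Boltzmann selection step on the one-max problem with additive Gaussian noise on the fitness. It assumes selection probabilities proportional to exp(beta * fitness); the function names, the parameter names `beta` and `noise_std`, and the population sizes are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def boltzmann_select(fitness, beta, rng):
    """Resample indices with probability proportional to exp(beta * fitness)."""
    # Subtracting the maximum leaves the probabilities unchanged but avoids overflow.
    weights = np.exp(beta * (fitness - fitness.max()))
    probs = weights / weights.sum()
    return rng.choice(len(fitness), size=len(fitness), p=probs)

def noisy_onemax_generation(population, beta, noise_std, rng):
    """One selection step on one-max, with Gaussian noise added to each fitness."""
    true_fitness = population.sum(axis=1)  # one-max: count the ones in each genotype
    noise = rng.normal(0.0, noise_std, size=len(population))
    chosen = boltzmann_select(true_fitness + noise, beta, rng)
    return population[chosen]

# Illustrative values only (assumptions, not the paper's settings).
rng = np.random.default_rng(0)
population = rng.integers(0, 2, size=(200, 50))
selected = noisy_onemax_generation(population, beta=0.1, noise_std=2.0, rng=rng)
```

Selection here acts only on the noisy fitness, so in a finite population the noise perturbs which members are copied; this is the finite population effect the abstract refers to.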
Abstract:
This thesis presents an investigation of synchronisation and causality, motivated by problems in computational neuroscience. The thesis addresses both theoretical and practical signal processing issues regarding the estimation of interdependence from a set of multivariate data generated by a complex underlying dynamical system. This topic is driven by a series of problems in neuroscience, which represents the principal background motive behind the material in this work. The underlying system is the human brain and the generative process of the data is based on modern electromagnetic neuroimaging methods. In this thesis, the underlying functional mechanisms of the brain are derived from the recent mathematical formalism of dynamical systems in complex networks. This is justified principally on the grounds of the complex hierarchical and multiscale nature of the brain and it offers new methods of analysis to model its emergent phenomena. A fundamental approach to studying neural activity is to investigate the connectivity pattern developed by the brain’s complex network. Three types of connectivity are important to study: 1) anatomical connectivity, referring to the physical links forming the topology of the brain network; 2) effective connectivity, concerned with the way the neural elements communicate with each other using the brain’s anatomical structure, through phenomena of synchronisation and information transfer; 3) functional connectivity, an epistemic concept which alludes to the interdependence between data measured from the brain network. The main contribution of this thesis is to present, apply and discuss novel algorithms for functional connectivity, which are designed to extract different specific aspects of interaction between the underlying generators of the data. Firstly, a univariate statistic is developed to allow for indirect assessment of synchronisation in the local network from a single time series.
This approach is useful in inferring the coupling in a local cortical area as observed by a single measurement electrode. Secondly, different existing methods of phase synchronisation are considered from the perspective of experimental data analysis and inference of coupling from observed data. These methods are designed to address the estimation of medium to long range connectivity and their differences are particularly relevant in the context of volume conduction, which is known to produce spurious detections of connectivity. Finally, an asymmetric temporal metric is introduced in order to detect the direction of the coupling between different regions of the brain. The method developed in this thesis is based on a machine learning extension of the well-known concept of Granger causality. The thesis discussion is developed alongside examples of synthetic and experimental real data. The synthetic data are simulations of complex dynamical systems intended to mimic the behaviour of simple cortical neural assemblies. They are helpful for testing the techniques developed in this thesis. The real datasets are provided to illustrate the problem of brain connectivity in the case of important neurological disorders such as epilepsy and Parkinson’s disease. The methods of functional connectivity in this thesis are applied to intracranial EEG recordings in order to extract features which characterize underlying spatiotemporal dynamics before, during and after an epileptic seizure and predict seizure location and onset prior to conventional electrographic signs. The methodology is also applied to a MEG dataset containing healthy, Parkinson’s and dementia subjects with the aim of distinguishing patterns of pathological from physiological connectivity.
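For readers unfamiliar with Granger causality, the sketch below shows the classical linear, bivariate form of the idea: x is said to Granger-cause y if the past of x reduces the prediction error of y beyond what the past of y achieves alone. This is not the machine learning extension developed in the thesis; the function names, the model order and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np

def lagged_matrix(series, order):
    """Rows are [s[t-1], ..., s[t-order]] for t = order .. len(series) - 1."""
    n = len(series)
    return np.column_stack([series[order - k : n - k] for k in range(1, order + 1)])

def residual_variance(target, regressors):
    """Variance of the least-squares residual of target on regressors plus an intercept."""
    design = np.column_stack([np.ones(len(target)), regressors])
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return np.var(target - design @ coeffs)

def granger_index(x, y, order=2):
    """Log ratio of residual variances; clearly positive values suggest x helps predict y."""
    target = y[order:]
    own_past = lagged_matrix(y, order)
    both_past = np.column_stack([own_past, lagged_matrix(x, order)])
    return np.log(residual_variance(target, own_past) / residual_variance(target, both_past))

# Synthetic example (assumed for illustration): y is driven by x with a one-step delay.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
y = np.zeros(n)
y[1:] = 0.8 * x[:-1] + 0.1 * rng.normal(size=n - 1)

x_to_y = granger_index(x, y)  # expected to be clearly positive
y_to_x = granger_index(y, x)  # expected to be close to zero
```

The asymmetry between the two indices is what makes the measure directional, which is the property the abstract describes as an asymmetric temporal metric.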
Abstract:
The field evaporation literature has been carefully analysed and is shown to contain various confusions. After redefining consistent terminology, this thesis investigates the mechanisms of field evaporation, in particular assessing the relevance of the theoretical mechanisms by analysing the available experimental data. A new formalism, the `extended image-hump formalism', is developed and used to devise several tests of whether the image-hump mechanism is operating. The general conclusion is that in most cases the Mueller mechanism is not operating and escape takes place via Gomer-type mechanisms.
Abstract:
Hard real-time systems are a class of computer control systems that must react to demands of their environment by providing `correct' and timely responses. Since these systems are increasingly being used in systems with safety implications, it is crucial that they are designed and developed to operate in a correct manner. This thesis is concerned with developing formal techniques that allow the specification, verification and design of hard real-time systems. Formal techniques for hard real-time systems must be capable of capturing the system's functional and performance requirements, and previous work has proposed a number of techniques which range from the mathematically intensive to those with some mathematical content. This thesis develops formal techniques that contain both an informal and a formal component because it is considered that the informality provides ease of understanding and the formality allows precise specification and verification. Specifically, the combination of Petri nets and temporal logic is considered for the specification and verification of hard real-time systems. Approaches that combine Petri nets and temporal logic by allowing a consistent translation between each formalism are examined. Previously, such techniques have been applied to the formal analysis of concurrent systems. This thesis adapts these techniques for use in the modelling, design and formal analysis of hard real-time systems. The techniques are applied to the problem of specifying a controller for a high-speed manufacturing system. It is shown that they can be used to prove liveness and safety properties, including qualitative aspects of system performance. The problem of verifying quantitative real-time properties is addressed by developing a further technique which combines the formalisms of timed Petri nets and real-time temporal logic. A unifying feature of these techniques is the common temporal description of the Petri net. 
A common problem with Petri net based techniques is the complexity associated with generating the reachability graph. This thesis addresses this problem by using concurrency sets to generate a partial reachability graph pertaining to a particular state. These sets also allow each state to be checked for the presence of inconsistencies and hazards. The problem of designing a controller for the high-speed manufacturing system is also considered. The approach adopted involves the use of a model-based controller. This type of controller uses the Petri net models developed, thus preserving the properties already proven of the controller. It also contains a model of the physical system which is synchronised to the real application to provide timely responses. The various ways of forming the synchronization between these processes are considered and the resulting nets are analysed using concurrency sets.
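To make the reachability graph concrete, here is a minimal sketch that enumerates it by breadth-first search for an ordinary place/transition net. It builds the full graph, whose size is the source of the complexity problem mentioned above; it does not implement the concurrency-set technique of the thesis. The data layout (markings as tuples, transitions as pairs of pre and post weight dictionaries), the `max_states` guard and the two-place example net are assumptions for illustration.

```python
from collections import deque

def enabled(marking, pre):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking[place] >= weight for place, weight in pre.items())

def fire(marking, pre, post):
    """Return the marking reached by consuming pre tokens and producing post tokens."""
    new = list(marking)
    for place, weight in pre.items():
        new[place] -= weight
    for place, weight in post.items():
        new[place] += weight
    return tuple(new)

def reachability_graph(initial, transitions, max_states=10000):
    """Breadth-first enumeration of reachable markings and labelled edges."""
    seen = {initial}
    edges = []
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for name, (pre, post) in transitions.items():
            if enabled(marking, pre):
                nxt = fire(marking, pre, post)
                edges.append((marking, name, nxt))
                if nxt not in seen:
                    # Guard against unbounded nets, whose graph is infinite.
                    if len(seen) >= max_states:
                        raise RuntimeError("state limit reached; net may be unbounded")
                    seen.add(nxt)
                    queue.append(nxt)
    return seen, edges

# Hypothetical two-place net: place 0 = idle, place 1 = busy.
transitions = {
    "start": ({0: 1}, {1: 1}),
    "finish": ({1: 1}, {0: 1}),
}
states, edges = reachability_graph((1, 0), transitions)
```

For nets with many concurrent tokens the number of markings grows combinatorially, which is why a partial graph restricted to a particular state can be attractive.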
Abstract:
A major application of computers has been to control physical processes in which the computer is embedded within some large physical process and is required to control concurrent physical processes. The main difficulty with these systems is their event-driven characteristics, which complicate their modelling and analysis. Although a number of researchers in the process system community have approached the problems of modelling and analysis of such systems, there is still a lack of standardised software development formalisms for the system (controller) development, particularly at the early stages of the system design cycle. This research forms part of a larger research programme which is concerned with the development of real-time process-control systems in which software is used to control concurrent physical processes. The general objective of the research in this thesis is to investigate the use of formal techniques in the analysis of such systems at their early stages of development, with a particular bias towards an application to high speed machinery. Specifically, the research aims to generate a standardised software development formalism for real-time process-control systems, particularly for software controller synthesis. In this research, a graphical modelling formalism called Sequential Function Chart (SFC), a variant of Grafcet, is examined. SFC, which is defined in the international standard IEC1131 as a graphical description language, has been used widely in industry and has achieved an acceptable level of maturity and acceptance. A comparative study between SFC and Petri nets is presented in this thesis. To overcome identified inaccuracies in the SFC, a formal definition of the firing rules for SFC is given. To provide a framework in which SFC models can be analysed formally, an extended time-related Petri net model for SFC is proposed and the transformation method is defined.
The SFC notation lacks a systematic way of synthesising system models from real-world systems. Thus a standardised approach to the development of real-time process control systems is required such that the system (software) functional requirements can be identified, captured and analysed. A rule-based approach and a method called system behaviour driven method (SBDM) are proposed as a development formalism for real-time process-control systems.
Abstract:
Regression problems are concerned with predicting the values of one or more continuous quantities, given the values of a number of input variables. For virtually every application of regression, however, it is also important to have an indication of the uncertainty in the predictions. Such uncertainties are expressed in terms of the error bars, which specify the standard deviation of the distribution of predictions about the mean. Accurate estimation of error bars is of practical importance, especially when safety and reliability are at issue. The Bayesian view of regression leads naturally to two contributions to the error bars. The first arises from the intrinsic noise on the target data, while the second comes from the uncertainty in the values of the model parameters, which manifests itself in the finite width of the posterior distribution over the space of these parameters. The Hessian matrix, which involves the second derivatives of the error function with respect to the weights, is needed for implementing the Bayesian formalism in general and estimating the error bars in particular. A study of different methods for evaluating this matrix is given with special emphasis on the outer product approximation method. The contribution of the uncertainty in model parameters to the error bars is a finite data size effect, which becomes negligible as the number of data points in the training set increases. A study of this contribution is given in relation to the distribution of data in input space. It is shown that the addition of data points to the training set can only reduce the local magnitude of the error bars or leave it unchanged. Using the asymptotic limit of an infinite data set, it is shown that the error bars have an approximate relation to the density of data in input space.
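The two contributions to the error bars can be written out in the standard textbook form of the Bayesian treatment, shown here as a sketch rather than as the thesis's own notation. It assumes Gaussian target noise with inverse variance $\beta$, an isotropic Gaussian prior with hyperparameter $\alpha$, and a Gaussian approximation to the posterior around the most probable weights $\mathbf{w}_{\mathrm{MP}}$; all symbols are introduced for illustration.

```latex
% Predictive variance: intrinsic noise plus parameter uncertainty
\sigma_t^2(\mathbf{x}) = \frac{1}{\beta} + \mathbf{g}(\mathbf{x})^{\mathsf{T}} \mathbf{A}^{-1} \mathbf{g}(\mathbf{x}),
\qquad
\mathbf{g}(\mathbf{x}) = \nabla_{\mathbf{w}}\, y(\mathbf{x}; \mathbf{w}) \Big|_{\mathbf{w} = \mathbf{w}_{\mathrm{MP}}}

% Outer product approximation to the Hessian over the N training inputs
\mathbf{A} \simeq \beta \sum_{n=1}^{N} \mathbf{g}(\mathbf{x}_n)\, \mathbf{g}(\mathbf{x}_n)^{\mathsf{T}} + \alpha \mathbf{I}
```

Under this approximation, and holding $\alpha$, $\beta$ and $\mathbf{w}_{\mathrm{MP}}$ fixed, each additional data point adds a positive semi-definite term to $\mathbf{A}$, so the quadratic form $\mathbf{g}^{\mathsf{T}} \mathbf{A}^{-1} \mathbf{g}$ cannot increase, consistent with the statement that extra data can only reduce the error bars or leave them unchanged.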
Abstract:
In a certain automobile factory, batch-painting of the body types in colours is controlled by an allocation system. This tries to balance production with orders, whilst making optimally-sized batches of colours. Sequences of cars entering painting cannot be optimised for easy selection of colour and batch size. `Over-production' is not allowed, in order to reduce buffer stocks of unsold vehicles. Paint quality is degraded by random effects. This thesis describes a toolkit which supports IKBS in an object-centred formalism. The intended domain of use for the toolkit is flexible manufacturing. A sizeable application program was developed, using the toolkit, to test the validity of the IKBS approach in solving the real manufacturing problem above, for which an existing conventional program was already being used. A detailed statistical analysis of the operating circumstances of the program was made to evaluate the likely need for the more flexible type of program for which the toolkit was intended. The IKBS program captures the many disparate and conflicting constraints in the scheduling knowledge and emulates the behaviour of the program installed in the factory. In the factory system, many possible, newly-discovered heuristics would be awkward to represent and it would be impossible to make many new extensions. The representation scheme is capable of admitting changes to the knowledge, relying on the inherent encapsulating properties of object-centred programming to protect and isolate data. The object-centred scheme is supported by an enhancement of the `C' programming language and runs under BSD 4.2 UNIX. The structuring technique, using objects, provides a mechanism for separating control of expression of rule-based knowledge from the knowledge itself and allowing explicit `contexts', within which appropriate expression of knowledge can be done. Facilities are provided for acquisition of knowledge in a consistent manner.
Abstract:
An implementation of a Lexical Functional Grammar (LFG) natural language front-end to a database is presented, and its capabilities demonstrated by reference to a set of queries used in the Chat-80 system. The potential of LFG for such applications is explored. Other grammars previously used for this purpose are briefly reviewed and contrasted with LFG. The basic LFG formalism is fully described, both as to its syntax and semantics, and the deficiencies of the latter for database access application shown. Other current LFG implementations are reviewed and contrasted with the LFG implementation developed here specifically for database access. The implementation described here allows a natural language interface to a specific Prolog database to be produced from a set of grammar rule and lexical specifications in an LFG-like notation. In addition to this the interface system uses a simple database description to compile metadata about the database for later use in planning the execution of queries. Extensions to LFG's semantic component are shown to be necessary to produce a satisfactory functional analysis and semantic output for querying a database. A diverse set of natural language constructs are analysed using LFG and the derivation of Prolog queries from the F-structure output of LFG is illustrated. The functional description produced from LFG is proposed as sufficient for resolving many problems of quantification and attachment.
Abstract:
This thesis is concerned with exact solutions of Einstein's field equations of general relativity, in particular, when the source of the gravitational field is a perfect fluid with a purely electric Weyl tensor. General relativity, cosmology and computer algebra are discussed briefly. A mathematical introduction to Riemannian geometry and the tetrad formalism is then given. This is followed by a review of some previous results and known solutions concerning purely electric perfect fluids. In addition, some orthonormal and null tetrad equations of the Ricci and Bianchi identities are displayed in a form suitable for investigating these space-times. Conformally flat perfect fluids are characterised by the vanishing of the Weyl tensor and form a sub-class of the purely electric fields in which all solutions are known (Stephani 1967). The number of Killing vectors in these space-times is investigated and results presented for the non-expanding space-times. The existence of stationary fields that may also admit 0, 1 or 3 spacelike Killing vectors is demonstrated. Shear-free fluids in the class under consideration are shown to be either non-expanding or irrotational (Collins 1984) using both orthonormal and null tetrads. A discrepancy between Collins (1984) and Wolf (1986) is resolved by explicitly solving the field equations to prove that the only purely electric, shear-free, geodesic but rotating perfect fluid is the Godel (1949) solution. The irrotational fluids with shear are then studied and solutions due to Szafron (1977) and Allnutt (1982) are characterised. The metric is simplified in several cases where new solutions may be found. The geodesic space-times in this class and all Bianchi type 1 perfect fluid metrics are shown to have a metric expressible in a diagonal form. The position of spherically symmetric and Bianchi type 1 space-times in relation to the general case is also illustrated.