985 resultados para Sequential machine theory
Resumo:
Promiscuous human leukocyte antigen (HLA) binding peptides are ideal targets for vaccine development. Existing computational models for prediction of promiscuous peptides used hidden Markov models and artificial neural networks as prediction algorithms. We report a system based on support vector machines that outperforms previously published methods. Preliminary testing showed that it can predict peptides binding to HLA-A2 and -A3 super-type molecules with excellent accuracy, even for molecules where no binding data are currently available.
Resumo:
We consider a buying-selling problem when two stops of a sequence of independent random variables are required. An optimal stopping rule and the value of a game are obtained.
Resumo:
Machine learning techniques have been recognized as powerful tools for learning from data. One of the most popular learning techniques, the Back-Propagation (BP) Artificial Neural Networks, can be used as a computer model to predict peptides binding to the Human Leukocyte Antigens (HLA). The major advantage of computational screening is that it reduces the number of wet-lab experiments that need to be performed, significantly reducing the cost and time. A recently developed method, Extreme Learning Machine (ELM), which has superior properties over BP has been investigated to accomplish such tasks. In our work, we found that the ELM is as good as, if not better than, the BP in term of time complexity, accuracy deviations across experiments, and most importantly - prevention from over-fitting for prediction of peptide binding to HLA.
Resumo:
This paper describes a logic of progress for concurrent programs. The logic is based on that of UNITY, molded to fit a sequential programming model. Integration of the two is achieved by using auxiliary variables in a systematic way that incorporates program counters into the program text. The rules for progress in UNITY are then modified to suit this new system. This modification is however subtle enough to allow the theory of Owicki and Gries to be used without change.
Resumo:
This paper formulates several mathematical models for determining the optimal sequence of component placements and assignment of component types to feeders simultaneously or the integrated scheduling problem for a type of surface mount technology placement machines, called the sequential pick-andplace (PAP) machine. A PAP machine has multiple stationary feeders storing components, a stationary working table holding a printed circuit board (PCB), and a movable placement head to pick up components from feeders and place them to a board. The objective of integrated problem is to minimize the total distance traveled by the placement head. Two integer nonlinear programming models are formulated first. Then, each of them is equivalently converted into an integer linear type. The models for the integrated problem are verified by two commercial packages. In addition, a hybrid genetic algorithm previously developed by the authors is adopted to solve the models. The algorithm not only generates the optimal solutions quickly for small-sized problems, but also outperforms the genetic algorithms developed by other researchers in terms of total traveling distance.
Resumo:
Recently within the machine learning and spatial statistics communities many papers have explored the potential of reduced rank representations of the covariance matrix, often referred to as projected or fixed rank approaches. In such methods the covariance function of the posterior process is represented by a reduced rank approximation which is chosen such that there is minimal information loss. In this paper a sequential framework for inference in such projected processes is presented, where the observations are considered one at a time. We introduce a C++ library for carrying out such projected, sequential estimation which adds several novel features. In particular we have incorporated the ability to use a generic observation operator, or sensor model, to permit data fusion. We can also cope with a range of observation error characteristics, including non-Gaussian observation errors. Inference for the variogram parameters is based on maximum likelihood estimation. We illustrate the projected sequential method in application to synthetic and real data sets. We discuss the software implementation and suggest possible future extensions.
Resumo:
The work presented in this thesis is concerned with the dynamic behaviour of structural joints which are both loaded, and excited, normal to the joint interface. Since the forces on joints are transmitted through their interface, the surface texture of joints was carefully examined. A computerised surface measuring system was developed and computer programs were written. Surface flatness was functionally defined, measured and quantised into a form suitable for the theoretical calculation of the joint stiffness. Dynamic stiffness and damping were measured at various preloads for a range of joints with different surface textures. Dry clean and lubricated joints were tested and the results indicated an increase in damping for the lubricated joints of between 30 to 100 times. A theoretical model for the computation of the stiffness of dry clean joints was built. The model is based on the theory that the elastic recovery of joints is due to the recovery of the material behind the loaded asperities. It takes into account, in a quantitative manner, the flatness deviations present on the surfaces of the joint. The theoretical results were found to be in good agreement with those measured experimentally. It was also found that theoretical assessment of the joint stiffness could be carried out using a different model based on the recovery of loaded asperities into a spherical form. Stepwise procedures are given in order to design a joint having a particular stiffness. A theoretical model for the loss factor of dry clean joints was built. The theoretical results are in reasonable agreement with those experimentally measured. The theoretical models for the stiffness and loss factor were employed to evaluate the second natural frequency of the test rig. The results are in good agreement with the experimentally measured natural frequencies.
Resumo:
The operation state of photovoltaic Module Integrated Converter (MIC) is subjected to change due to different source and load conditions, while state-swap is usually implemented with flow chart based sequential controller in the past research. In this paper, the signatures for different operational states are evaluated and investigated, which lead to an effective control integrated finite state machine (CIFSM), providing real-time state-swap as fast as the local control loop. The proposed CIFSM is implemented digitally for a boost type MIC prototype and tested under a variety of load and source conditions. The test results prove the effectiveness of the proposed CIFSM design.
Resumo:
Heterogeneous datasets arise naturally in most applications due to the use of a variety of sensors and measuring platforms. Such datasets can be heterogeneous in terms of the error characteristics and sensor models. Treating such data is most naturally accomplished using a Bayesian or model-based geostatistical approach; however, such methods generally scale rather badly with the size of dataset, and require computationally expensive Monte Carlo based inference. Recently within the machine learning and spatial statistics communities many papers have explored the potential of reduced rank representations of the covariance matrix, often referred to as projected or fixed rank approaches. In such methods the covariance function of the posterior process is represented by a reduced rank approximation which is chosen such that there is minimal information loss. In this paper a sequential Bayesian framework for inference in such projected processes is presented. The observations are considered one at a time which avoids the need for high dimensional integrals typically required in a Bayesian approach. A C++ library, gptk, which is part of the INTAMAP web service, is introduced which implements projected, sequential estimation and adds several novel features. In particular the library includes the ability to use a generic observation operator, or sensor model, to permit data fusion. It is also possible to cope with a range of observation error characteristics, including non-Gaussian observation errors. Inference for the covariance parameters is explored, including the impact of the projected process approximation on likelihood profiles. We illustrate the projected sequential method in application to synthetic and real datasets. Limitations and extensions are discussed. © 2010 Elsevier Ltd.
Resumo:
Given the growing number of wrongful convictions involving faulty eyewitness evidence and the strong reliance by jurors on eyewitness testimony, researchers have sought to develop safeguards to decrease erroneous identifications. While decades of eyewitness research have led to numerous recommendations for the collection of eyewitness evidence, less is known regarding the psychological processes that govern identification responses. The purpose of the current research was to expand the theoretical knowledge of eyewitness identification decisions by exploring two separate memory theories: signal detection theory and dual-process theory. This was accomplished by examining both system and estimator variables in the context of a novel lineup recognition paradigm. Both theories were also examined in conjunction with confidence to determine whether it might add significantly to the understanding of eyewitness memory. ^ In two separate experiments, both an encoding and a retrieval-based manipulation were chosen to examine the application of theory to eyewitness identification decisions. Dual-process estimates were measured through the use of remember-know judgments (Gardiner & Richardson-Klavehn, 2000). In Experiment 1, the effects of divided attention and lineup presentation format (simultaneous vs. sequential) were examined. In Experiment 2, perceptual distance and lineup response deadline were examined. Overall, the results indicated that discrimination and remember judgments (recollection) were generally affected by variations in encoding quality and response criterion and know judgments (familiarity) were generally affected by variations in retrieval options. Specifically, as encoding quality improved, discrimination ability and judgments of recollection increased; and as the retrieval task became more difficult there was a shift toward lenient choosing and more reliance on familiarity. ^ The application of signal detection theory and dual-process theory in the current experiments produced predictable results on both system and estimator variables. These theories were also compared to measures of general confidence, calibration, and diagnosticity. The application of the additional confidence measures in conjunction with signal detection theory and dual-process theory gave a more in-depth explanation than either theory alone. Therefore, the general conclusion is that eyewitness identifications can be understood in a more complete manor by applying theory and examining confidence. Future directions and policy implications are discussed. ^
Resumo:
Virtual machines (VMs) are powerful platforms for building agile datacenters and emerging cloud systems. However, resource management for a VM-based system is still a challenging task. First, the complexity of application workloads as well as the interference among competing workloads makes it difficult to understand their VMs’ resource demands for meeting their Quality of Service (QoS) targets; Second, the dynamics in the applications and system makes it also difficult to maintain the desired QoS target while the environment changes; Third, the transparency of virtualization presents a hurdle for guest-layer application and host-layer VM scheduler to cooperate and improve application QoS and system efficiency. This dissertation proposes to address the above challenges through fuzzy modeling and control theory based VM resource management. First, a fuzzy-logic-based nonlinear modeling approach is proposed to accurately capture a VM’s complex demands of multiple types of resources automatically online based on the observed workload and resource usages. Second, to enable fast adaption for resource management, the fuzzy modeling approach is integrated with a predictive-control-based controller to form a new Fuzzy Modeling Predictive Control (FMPC) approach which can quickly track the applications’ QoS targets and optimize the resource allocations under dynamic changes in the system. Finally, to address the limitations of black-box-based resource management solutions, a cross-layer optimization approach is proposed to enable cooperation between a VM’s host and guest layers and further improve the application QoS and resource usage efficiency. The above proposed approaches are prototyped and evaluated on a Xen-based virtualized system and evaluated with representative benchmarks including TPC-H, RUBiS, and TerraFly. The results demonstrate that the fuzzy-modeling-based approach improves the accuracy in resource prediction by up to 31.4% compared to conventional regression approaches. The FMPC approach substantially outperforms the traditional linear-model-based predictive control approach in meeting application QoS targets for an oversubscribed system. It is able to manage dynamic VM resource allocations and migrations for over 100 concurrent VMs across multiple hosts with good efficiency. Finally, the cross-layer optimization approach further improves the performance of a virtualized application by up to 40% when the resources are contended by dynamic workloads.
Resumo:
This paper introduces the stochastic version of the Geometric Machine Model for the modelling of sequential, alternative, parallel (synchronous) and nondeterministic computations with stochastic numbers stored in a (possibly infinite) shared memory. The programming language L(D! 1), induced by the Coherence Space of Processes D! 1, can be applied to sequential and parallel products in order to provide recursive definitions for such processes, together with a domain-theoretic semantics of the Stochastic Arithmetic. We analyze both the spacial (ordinal) recursion, related to spacial modelling of the stochastic memory, and the temporal (structural) recursion, given by the inclusion relation modelling partial objects in the ordered structure of process construction.
Resumo:
The aim of this thesis is to review and augment the theory and methods of optimal experimental design. In Chapter I the scene is set by considering the possible aims of an experimenter prior to an experiment, the statistical methods one might use to achieve those aims and how experimental design might aid this procedure. It is indicated that, given a criterion for design, a priori optimal design will only be possible in certain instances and, otherwise, some form of sequential procedure would seem to be indicated. In Chapter 2 an exact experimental design problem is formulated mathematically and is compared with its continuous analogue. Motivation is provided for the solution of this continuous problem, and the remainder of the chapter concerns this problem. A necessary and sufficient condition for optimality of a design measure is given. Problems which might arise in testing this condition are discussed, in particular with respect to possible non-differentiability of the criterion function at the design being tested. Several examples are given of optimal designs which may be found analytically and which illustrate the points discussed earlier in the chapter. In Chapter 3 numerical methods of solution of the continuous optimal design problem are reviewed. A new algorithm is presented with illustrations of how it should be used in practice. It is shown that, for reasonably large sample size, continuously optimal designs may be approximated to well by an exact design. In situations where this is not satisfactory algorithms for improvement of this design are reviewed. Chapter 4 consists of a discussion of sequentially designed experiments, with regard to both the philosophies underlying, and the application of the methods of, statistical inference. In Chapter 5 we criticise constructively previous suggestions for fully sequential design procedures. Alternative suggestions are made along with conjectures as to how these might improve performance. Chapter 6 presents a simulation study, the aim of which is to investigate the conjectures of Chapter 5. The results of this study provide empirical support for these conjectures. In Chapter 7 examples are analysed. These suggest aids to sequential experimentation by means of reduction of the dimension of the design space and the possibility of experimenting semi-sequentially. Further examples are considered which stress the importance of the use of prior information in situations of this type. Finally we consider the design of experiments when semi-sequential experimentation is mandatory because of the necessity of taking batches of observations at the same time. In Chapter 8 we look at some of the assumptions which have been made and indicate what may go wrong where these assumptions no longer hold.
Resumo:
Natural language processing has achieved great success in a wide range of ap- plications, producing both commercial language services and open-source language tools. However, most methods take a static or batch approach, assuming that the model has all information it needs and makes a one-time prediction. In this disser- tation, we study dynamic problems where the input comes in a sequence instead of all at once, and the output must be produced while the input is arriving. In these problems, predictions are often made based only on partial information. We see this dynamic setting in many real-time, interactive applications. These problems usually involve a trade-off between the amount of input received (cost) and the quality of the output prediction (accuracy). Therefore, the evaluation considers both objectives (e.g., plotting a Pareto curve). Our goal is to develop a formal understanding of sequential prediction and decision-making problems in natural language processing and to propose efficient solutions. Toward this end, we present meta-algorithms that take an existent batch model and produce a dynamic model to handle sequential inputs and outputs. Webuild our framework upon theories of Markov Decision Process (MDP), which allows learning to trade off competing objectives in a principled way. The main machine learning techniques we use are from imitation learning and reinforcement learning, and we advance current techniques to tackle problems arising in our settings. We evaluate our algorithm on a variety of applications, including dependency parsing, machine translation, and question answering. We show that our approach achieves a better cost-accuracy trade-off than the batch approach and heuristic-based decision- making approaches. We first propose a general framework for cost-sensitive prediction, where dif- ferent parts of the input come at different costs. We formulate a decision-making process that selects pieces of the input sequentially, and the selection is adaptive to each instance. Our approach is evaluated on both standard classification tasks and a structured prediction task (dependency parsing). We show that it achieves similar prediction quality to methods that use all input, while inducing a much smaller cost. Next, we extend the framework to problems where the input is revealed incremen- tally in a fixed order. We study two applications: simultaneous machine translation and quiz bowl (incremental text classification). We discuss challenges in this set- ting and show that adding domain knowledge eases the decision-making problem. A central theme throughout the chapters is an MDP formulation of a challenging problem with sequential input/output and trade-off decisions, accompanied by a learning algorithm that solves the MDP.
Resumo:
Data sources are often dispersed geographically in real life applications. Finding a knowledge model may require to join all the data sources and to run a machine learning algorithm on the joint set. We present an alternative based on a Multi Agent System (MAS): an agent mines one data source in order to extract a local theory (knowledge model) and then merges it with the previous MAS theory using a knowledge fusion technique. This way, we obtain a global theory that summarizes the distributed knowledge without spending resources and time in joining data sources. New experiments have been executed including statistical significance analysis. The results show that, as a result of knowledge fusion, the accuracy of initial theories is significantly improved as well as the accuracy of the monolithic solution.