994 resultados para Stochastic adding machine
Resumo:
Establishing metrics to assess machine translation (MT) systems automatically is now crucial owing to the widespread use of MT over the web. In this study we show that such evaluation can be done by modeling text as complex networks. Specifically, we extend our previous work by employing additional metrics of complex networks, whose results were used as input for machine learning methods and allowed MT texts of distinct qualities to be distinguished. Also shown is that the node-to-node mapping between source and target texts (English-Portuguese and Spanish-Portuguese pairs) can be improved by adding further hierarchical levels for the metrics out-degree, in-degree, hierarchical common degree, cluster coefficient, inter-ring degree, intra-ring degree and convergence ratio. The results presented here amount to a proof-of-principle that the possible capturing of a wider context with the hierarchical levels may be combined with machine learning methods to yield an approach for assessing the quality of MT systems. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
The Operations Research (OR) community have defined many deterministic manufacturing control problems mainly focused on scheduling. Well-defined benchmark problems provide a mechanism for communication of the effectiveness of different optimization algorithms. Manufacturing problems within industry are stochastic and complex. Common features of these problems include: variable demand, machine part specific breakdown patterns, part machine specific process durations, continuous production, Finished Goods Inventory (FGI) buffers, bottleneck machines and limited production capacity. Discrete Event Simulation (DES) is a commonly used tool for studying manufacturing systems of realistic complexity. There are few reports of detail-rich benchmark problems for use within the simulation optimization community that are as complex as those faced by production managers. This work details an algorithm that can be used to create single and multistage production control problems. The reported software implementation of the algorithm generates text files in eXtensible Markup Language (XML) format that are easily edited and understood as well as being cross-platform compatible. The distribution and acceptance of benchmark problems generated with the algorithm would enable researchers working on simulation and optimization of manufacturing problems to effectively communicate results to benefit the field in general.
Resumo:
This paper is concerned with the problem of finite-time stabilization for some nonlinear stochastic systems. Based on the stochastic Lyapunov theorem on finite-time stability that has been established by the authors in the paper, it is proven that Euler-type stochastic nonlinear systems can be finite-time stabilized via a family of continuous feedback controllers. Using the technique of adding a power integrator, a continuous, global state feedback controller is constructed to stabilize in finite time a large class of two-dimensional lower-triangular stochastic nonlinear systems. Also, for a class of three-dimensional lower-triangular stochastic nonlinear systems, a recursive design scheme of finite-time stabilization is given by developing the technique of adding a power integrator and constructing a continuous feedback controller. Finally, a simulation example is given to illustrate the theoretical results. © 2014 John Wiley & Sons, Ltd.
Resumo:
In this paper, the problem of global finite-time stabilisation by output feedback is considered for a class of stochastic nonlinear systems. First, based on homogeneous systems theory and the adding a power integrator technique, a homogeneous reduced order observer and control law are constructed in a recursive manner for the nominal system. Then, the homogeneous domination approach is used to deal with the nonlinearities in drift and diffusion terms; it is shown that the proposed output-feedback control law can guarantee that the closed-loop system is global finite-time stable in probability. Finally, simulation examples are carried out to demonstrate the effectiveness of the proposed control scheme.
Resumo:
The Adaptive Multiple-hyperplane Machine (AMM) was recently proposed to deal with large-scale datasets. However, it has no principle to tune the complexity and sparsity levels of the solution. Addressing the sparsity is important to improve learning generalization, prediction accuracy and computational speedup. In this paper, we employ the max-margin principle and sparse approach to propose a new Sparse AMM (SAMM). We solve the new optimization objective function with stochastic gradient descent (SGD). Besides inheriting the good features of SGD-based learning method and the original AMM, our proposed Sparse AMM provides machinery and flexibility to tune the complexity and sparsity of the solution, making it possible to avoid overfitting and underfitting. We validate our approach on several large benchmark datasets. We show that with the ability to control sparsity, the proposed Sparse AMM yields superior classification accuracy to the original AMM while simultaneously achieving computational speedup.
Resumo:
Competitive learning is an important machine learning approach which is widely employed in artificial neural networks. In this paper, we present a rigorous definition of a new type of competitive learning scheme realized on large-scale networks. The model consists of several particles walking within the network and competing with each other to occupy as many nodes as possible, while attempting to reject intruder particles. The particle's walking rule is composed of a stochastic combination of random and preferential movements. The model has been applied to solve community detection and data clustering problems. Computer simulations reveal that the proposed technique presents high precision of community and cluster detections, as well as low computational complexity. Moreover, we have developed an efficient method for estimating the most likely number of clusters by using an evaluator index that monitors the information generated by the competition process itself. We hope this paper will provide an alternative way to the study of competitive learning.
Resumo:
Semisupervised learning is a machine learning approach that is able to employ both labeled and unlabeled samples in the training process. In this paper, we propose a semisupervised data classification model based on a combined random-preferential walk of particles in a network (graph) constructed from the input dataset. The particles of the same class cooperate among themselves, while the particles of different classes compete with each other to propagate class labels to the whole network. A rigorous model definition is provided via a nonlinear stochastic dynamical system and a mathematical analysis of its behavior is carried out. A numerical validation presented in this paper confirms the theoretical predictions. An interesting feature brought by the competitive-cooperative mechanism is that the proposed model can achieve good classification rates while exhibiting low computational complexity order in comparison to other network-based semisupervised algorithms. Computer simulations conducted on synthetic and real-world datasets reveal the effectiveness of the model.
Resumo:
The elephant walk model originally proposed by Schutz and Trimper to investigate non-Markovian processes led to the investigation of a series of other random-walk models. Of these, the best known is the Alzheimer walk model, because it was the first model shown to have amnestically induced persistence-i.e. superdiffusion caused by loss of memory. Here we study the robustness of the Alzheimer walk by adding a memoryless stochastic perturbation. Surprisingly, the solution of the perturbed model can be formally reduced to the solutions of the unperturbed model. Specifically, we give an exact solution of the perturbed model by finding a surjective mapping to the unperturbed model. Copyright (C) EPLA, 2012
Resumo:
During the last few years, a great deal of interest has risen concerning the applications of stochastic methods to several biochemical and biological phenomena. Phenomena like gene expression, cellular memory, bet-hedging strategy in bacterial growth and many others, cannot be described by continuous stochastic models due to their intrinsic discreteness and randomness. In this thesis I have used the Chemical Master Equation (CME) technique to modelize some feedback cycles and analyzing their properties, including experimental data. In the first part of this work, the effect of stochastic stability is discussed on a toy model of the genetic switch that triggers the cellular division, which malfunctioning is known to be one of the hallmarks of cancer. The second system I have worked on is the so-called futile cycle, a closed cycle of two enzymatic reactions that adds and removes a chemical compound, called phosphate group, to a specific substrate. I have thus investigated how adding noise to the enzyme (that is usually in the order of few hundred molecules) modifies the probability of observing a specific number of phosphorylated substrate molecules, and confirmed theoretical predictions with numerical simulations. In the third part the results of the study of a chain of multiple phosphorylation-dephosphorylation cycles will be presented. We will discuss an approximation method for the exact solution in the bidimensional case and the relationship that this method has with the thermodynamic properties of the system, which is an open system far from equilibrium.In the last section the agreement between the theoretical prediction of the total protein quantity in a mouse cells population and the observed quantity will be shown, measured via fluorescence microscopy.
Resumo:
This paper discusses a novel hybrid approach for text categorization that combines a machine learning algorithm, which provides a base model trained with a labeled corpus, with a rule-based expert system, which is used to improve the results provided by the previous classifier, by filtering false positives and dealing with false negatives. The main advantage is that the system can be easily fine-tuned by adding specific rules for those noisy or conflicting categories that have not been successfully trained. We also describe an implementation based on k-Nearest Neighbor and a simple rule language to express lists of positive, negative and relevant (multiword) terms appearing in the input text. The system is evaluated in several scenarios, including the popular Reuters-21578 news corpus for comparison to other approaches, and categorization using IPTC metadata, EUROVOC thesaurus and others. Results show that this approach achieves a precision that is comparable to top ranked methods, with the added value that it does not require a demanding human expert workload to train
Resumo:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of other two major ones, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its most well-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools have still some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitations stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own ones. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by other higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should help solve also the problems and limitations of linguistic annotation tools aforementioned.
Thus, to summarise, the main aim of the present work was to combine somehow these separated approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a sort of hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Resumo:
In maritime transportation, decisions are made in a dynamic setting where many aspects of the future are uncertain. However, most academic literature on maritime transportation considers static and deterministic routing and scheduling problems. This work addresses a gap in the literature on dynamic and stochastic maritime routing and scheduling problems, by focusing on the scheduling of departure times. Five simple strategies for setting departure times are considered, as well as a more advanced strategy which involves solving a mixed integer mathematical programming problem. The latter strategy is significantly better than the other methods, while adding only a small computational effort.
Resumo:
Heuristics for stochastic and dynamic vehicle routing problems are often kept relatively simple, in part due to the high computational burden resulting from having to consider stochastic information in some form. In this work, three existing heuristics are extended by three different local search variations: a first improvement descent using stochastic information, a tabu search using stochastic information when updating the incumbent solution, and a tabu search using stochastic information when selecting moves based on a list of moves determined through a proxy evaluation. In particular, the three local search variations are designed to utilize stochastic information in the form of sampled scenarios. The results indicate that adding local search using stochastic information to the existing heuristics can further reduce operating costs for shipping companies by 0.5–2 %. While the existing heuristics could produce structurally different solutions even when using similar stochastic information in the search, the appended local search methods seem able to make the final solutions more similar in structure.
Resumo:
Thesis (M.A.)--University of Illinois.
Resumo:
This thesis is concerned with approximate inference in dynamical systems, from a variational Bayesian perspective. When modelling real world dynamical systems, stochastic differential equations appear as a natural choice, mainly because of their ability to model the noise of the system by adding a variant of some stochastic process to the deterministic dynamics. Hence, inference in such processes has drawn much attention. Here two new extended frameworks are derived and presented that are based on basis function expansions and local polynomial approximations of a recently proposed variational Bayesian algorithm. It is shown that the new extensions converge to the original variational algorithm and can be used for state estimation (smoothing). However, the main focus is on estimating the (hyper-) parameters of these systems (i.e. drift parameters and diffusion coefficients). The new methods are numerically validated on a range of different systems which vary in dimensionality and non-linearity. These are the Ornstein-Uhlenbeck process, for which the exact likelihood can be computed analytically, the univariate and highly non-linear, stochastic double well and the multivariate chaotic stochastic Lorenz '63 (3-dimensional model). The algorithms are also applied to the 40 dimensional stochastic Lorenz '96 system. In this investigation these new approaches are compared with a variety of other well known methods such as the ensemble Kalman filter / smoother, a hybrid Monte Carlo sampler, the dual unscented Kalman filter (for jointly estimating the systems states and model parameters) and full weak-constraint 4D-Var. Empirical analysis of their asymptotic behaviour as a function of observation density or length of time window increases is provided.