161 results for Generalization Problem
Abstract:
Learning from Positive and Unlabelled examples (LPU) has emerged as an important problem in data mining and information retrieval applications. Existing techniques are not ideally suited for real world scenarios where the datasets are linearly inseparable, as they either build linear classifiers or the non-linear classifiers fail to achieve the desired performance. In this work, we propose to extend maximum margin clustering ideas and present an iterative procedure to design a non-linear classifier for LPU. In particular, we build a least squares support vector classifier, suitable for handling this problem due to symmetry of its loss function. Further, we present techniques for appropriately initializing the labels of unlabelled examples and for enforcing the ratio of positive to negative examples while obtaining these labels. Experiments on real-world datasets demonstrate that the non-linear classifier designed using the proposed approach gives significantly better generalization performance than the existing relevant approaches for LPU.
Abstract:
Earlier work on cyclic pursuit systems has shown that, with heterogeneous gains for the agents in linear cyclic pursuit, the point of convergence (rendezvous point) can be chosen arbitrarily, subject to some restrictions on the set of reachable points. The use of deviated cyclic pursuit, as discussed in this paper, expands this set of reachable points to include points that are not reachable by any known linear cyclic pursuit scheme. The limits on the deviations are determined by stability considerations. These limits are obtained analytically in this paper, along with results on the expansion of the reachable set; the expansion has also been verified through simulations.
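As a rough illustration of the heterogeneous-gain result mentioned in the abstract above, the following sketch simulates linear (not deviated) cyclic pursuit. It assumes the common model in which agent i moves as dx_i/dt = k_i (x_{i+1} - x_i) with positive gains k_i; the abstract does not spell this model out. Under that assumption the weighted sum of x_i / k_i is conserved, so the agents should meet at the (1/k_i)-weighted average of their initial positions. The function names, gains, and step size are illustrative choices, not taken from the paper.

```python
import numpy as np

def simulate_cyclic_pursuit(x0, gains, dt=0.01, steps=5000):
    """Forward-Euler simulation of dx_i/dt = k_i * (x_{i+1} - x_i) (assumed model)."""
    x = np.array(x0, dtype=float)                  # shape (n_agents, dim)
    k = np.asarray(gains, dtype=float)[:, None]    # one positive gain per agent
    for _ in range(steps):
        # np.roll(..., -1) places agent i+1 in row i, closing the cycle at the end
        x = x + dt * k * (np.roll(x, -1, axis=0) - x)
    return x

def predicted_rendezvous(x0, gains):
    """Average of the initial positions weighted by 1/k_i (the conserved quantity)."""
    w = 1.0 / np.asarray(gains, dtype=float)
    return (w[:, None] * np.asarray(x0, dtype=float)).sum(axis=0) / w.sum()

x0 = [[0, 0], [4, 0], [4, 4], [0, 4]]   # illustrative initial positions
gains = [1.0, 2.0, 3.0, 4.0]            # illustrative heterogeneous gains
final = simulate_cyclic_pursuit(x0, gains)
print(final[0], predicted_rendezvous(x0, gains))
```

Changing the gains moves the rendezvous point, but with positive gains it stays a positively weighted average of the initial positions, which is one way to read the restriction on reachable points that the abstract mentions.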
Abstract:
Homogenization and error analysis of an optimal interior control problem in the framework of the Stokes system, on a domain with a rapidly oscillating boundary, are the subject of this article. We consider a three-dimensional domain consisting of a parallelepiped with a large number of rectangular cylinders on top of it. An interior control is applied in a proper subdomain of the parallelepiped, away from the oscillating volume. We consider two types of functionals, namely a functional involving the $L^2$-norm of the state variable and another one involving its $H^1$-norm. We carry out the asymptotic analysis of the optimality systems for both cases as the cross-sectional area of the rectangular cylinders tends to zero. Our major contribution is to derive error estimates for the state, the co-state and the associated pressures, in appropriate functional spaces.
Abstract:
In this paper, we consider the setting of the pattern maximum likelihood (PML) problem studied by Orlitsky et al. We present a well-motivated heuristic algorithm for deciding the question of when the PML distribution of a given pattern is uniform. The algorithm is based on the concept of a ``uniform threshold''. This is a threshold at which the uniform distribution exhibits an interesting phase transition in the PML problem, going from being a local maximum to being a local minimum.
Abstract:
This paper attempts to unravel any relations that may exist between turbulent shear flows and statistical mechanics through a detailed numerical investigation in the simplest case where both can be well defined. The flow considered for the purpose is the two-dimensional (2D) temporal free shear layer with a velocity difference $\Delta U$ across it, statistically homogeneous in the streamwise direction ($x$) and evolving from a plane vortex sheet in the direction normal to it ($y$) in a periodic-in-$x$ domain $L \times \pm\infty$. Extensive computer simulations of the flow are carried out through appropriate initial-value problems for a ``vortex gas'' comprising $N$ point vortices of the same strength ($\gamma = L \Delta U / N$) and sign. Such a vortex gas is known to provide weak solutions of the Euler equation. More than ten different initial-condition classes are investigated using simulations involving up to 32 000 vortices, with ensemble averages evaluated over up to $10^3$ realizations and integration over $10^4 L/\Delta U$. The temporal evolution of such a system is found to exhibit three distinct regimes. In Regime I the evolution is strongly influenced by the initial condition, sometimes lasting a significant fraction of $L/\Delta U$. Regime III is a long-time domain-dependent evolution towards a statistically stationary state, via ``violent'' and ``slow'' relaxations [P.-H. Chavanis, Physica A 391, 3657 (2012)], over flow time scales of order $10^2$ and $10^4 L/\Delta U$, respectively (for $N = 400$). The final state involves a single structure that stochastically samples the domain, possibly constituting a ``relative equilibrium.'' The vortex distribution within the structure follows a nonisotropic truncated form of the Lundgren-Pointin (L-P) equilibrium distribution (with negatively high temperatures; L-P parameter $\lambda$ close to $-1$). The central finding is that, in the intermediate Regime II, the spreading rate of the layer is universal over the wide range of cases considered here.
The value (in terms of momentum thickness) is $0.0166 \pm 0.0002$ times $\Delta U$. Regime II, extensively studied in the turbulent shear flow literature as a self-similar ``equilibrium'' state, is, however, a part of the rapid nonequilibrium evolution of the vortex-gas system, which we term ``explosive'' as it lasts less than one $L/\Delta U$. Regime II also exhibits significant values of $N$-independent two-vortex correlations, indicating that current kinetic theories that neglect correlations or consider them as $O(1/N)$ cannot describe this regime. The evolution of the layer thickness in the present simulations in Regimes I and II agrees with the experimental observations of spatially evolving (3D Navier-Stokes) shear layers. Further, the vorticity-stream-function relations in Regime III are close to those computed in 2D Navier-Stokes temporal shear layers [J. Sommeria, C. Staquet, and R. Robert, J. Fluid Mech. 233, 661 (1991)]. These findings suggest the dominance of what may be called the Kelvin-Biot-Savart mechanism in determining the growth of the free shear layer through large-scale momentum and vorticity dispersal.
Abstract:
Elastic Net Regularizers have shown much promise in designing sparse classifiers for linear classification. In this work, we propose an alternating optimization approach to solve the dual problems of elastic net regularized linear classification Support Vector Machines (SVMs) and logistic regression (LR). One of the sub-problems turns out to be a simple projection. The other sub-problem can be solved using dual coordinate descent methods developed for non-sparse L2-regularized linear SVMs and LR, without altering their iteration complexity and convergence properties. Experiments on very large datasets indicate that the proposed dual coordinate descent - projection (DCD-P) methods are fast and achieve comparable generalization performance after the first pass through the data, with extremely sparse models.
Abstract:
The Cubic Sieve Method for solving the Discrete Logarithm Problem in prime fields requires a nontrivial solution to the Cubic Sieve Congruence (CSC) $x^3 \equiv y^2 z \pmod{p}$, where $p$ is a given prime number. A nontrivial solution must also satisfy $x^3 \neq y^2 z$ and $1 \le x, y, z < p^{\alpha}$, where $\alpha$ is a given real number such that $1/3 < \alpha \le 1/2$. The CSC problem is to find an efficient algorithm to obtain a nontrivial solution to CSC. CSC can be parametrized as $x \equiv v^2 z \pmod{p}$ and $y \equiv v^3 z \pmod{p}$. In this paper, we give a deterministic polynomial-time ($O(\ln^3 p)$ bit operations) algorithm to determine, for a given $v$, a nontrivial solution to CSC, if one exists. Previously it took $\tilde{O}(p^{\alpha})$ time in the worst case to determine this. We relate the CSC problem to the gap problem of fractional part sequences, where we need to determine the non-negative integers $N$ satisfying the fractional part inequality $\{\theta N\} < \phi$ ($\theta$ and $\phi$ are given real numbers). The correspondence between the CSC problem and the gap problem is that determining the parameter $z$ in the former problem corresponds to determining $N$ in the latter problem. We also show in the $\alpha = 1/2$ case of CSC that for a certain class of primes the CSC problem can be solved deterministically in $\tilde{O}(p^{1/3})$ time, compared to the previous best of $\tilde{O}(p^{1/2})$. It is empirically observed that about one out of three primes is covered by the above class. © 2013 Elsevier B.V. All rights reserved.
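To make the parametrization in the abstract above concrete, here is a minimal sketch of the naive search for a fixed v: it scans every candidate z below p^alpha, which corresponds to the slower baseline the abstract mentions and is not the paper's polynomial-time algorithm. The function name and the small prime used in the example are illustrative assumptions.

```python
def csc_solutions_for_v(p, v, alpha=0.5):
    """Brute-force scan over z for nontrivial CSC solutions with a fixed v.

    Uses the parametrization x = v^2 * z mod p, y = v^3 * z mod p, which
    guarantees x^3 = y^2 * z (mod p). A solution is kept only if x, y, z are
    all in [1, p^alpha) and x^3 != y^2 * z as integers (nontrivial).
    """
    bound = p ** alpha          # float bound; adequate for small illustrative p
    v2 = (v * v) % p
    v3 = (v2 * v) % p
    solutions = []
    z = 1
    while z < bound:
        x = (v2 * z) % p
        y = (v3 * z) % p
        if 1 <= x < bound and 1 <= y < bound and x**3 != y * y * z:
            solutions.append((x, y, z))
        z += 1
    return solutions

# Small illustrative case: 5^3 - 2^2 * 6 = 101, so (5, 2, 6) solves CSC mod 101.
print(csc_solutions_for_v(101, 61))   # [(5, 2, 6)]
```

Here v = 61 is y/x = 2 * 5^(-1) mod 101, so the scan recovers the solution from z alone, which is the sense in which finding z is the core of the problem for a given v.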
Abstract:
In this article, we analyse several discontinuous Galerkin (DG) methods for the Stokes problem under minimal regularity on the solution. We assume that the velocity $u$ belongs to $[H_0^1(\Omega)]^d$ and the pressure $p \in L_0^2(\Omega)$. First, we analyse standard DG methods assuming that the right-hand side $f$ belongs to $[H^{-1}(\Omega) \cap L^1(\Omega)]^d$. A DG method that is well defined for $f$ belonging to $[H^{-1}(\Omega)]^d$ is then investigated. The methods under study include stabilized DG methods using equal-order spaces and inf-sup stable ones where the pressure space is one polynomial degree less than the velocity space.
Abstract:
We investigate the parameterized complexity of the following edge coloring problem motivated by the problem of channel assignment in wireless networks. For an integer q >= 2 and a graph G, the goal is to find a coloring of the edges of G with the maximum number of colors such that every vertex of the graph sees at most q colors. This problem is NP-hard for q >= 2, and has been well-studied from the point of view of approximation. Our main focus is the case when q = 2, which is already theoretically intricate and practically relevant. We show fixed-parameter tractable algorithms for both the standard and the dual parameter, and for the latter problem, the result is based on a linear vertex kernel.
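The constraint in the abstract above ("every vertex sees at most q colors") can be checked with a few lines of code. The sketch below only verifies a given coloring and counts its colors; it is not one of the paper's fixed-parameter algorithms, and the function names and example graphs are illustrative assumptions.

```python
from collections import defaultdict

def is_valid_q_coloring(edge_colors, q):
    """Return True if every vertex is incident to edges of at most q distinct colors.

    edge_colors maps an edge (u, v) to its color.
    """
    seen = defaultdict(set)
    for (u, v), color in edge_colors.items():
        seen[u].add(color)
        seen[v].add(color)
    return all(len(colors) <= q for colors in seen.values())

def number_of_colors(edge_colors):
    """The objective to maximize: how many distinct colors the coloring uses."""
    return len(set(edge_colors.values()))

# A path a-b-c-d can use three colors with q = 2: each inner vertex sees two.
path = {("a", "b"): 0, ("b", "c"): 1, ("c", "d"): 2}
# A star with three leaves cannot: the center would see three colors.
star = {("c", "x"): 0, ("c", "y"): 1, ("c", "z"): 2}
print(is_valid_q_coloring(path, 2), is_valid_q_coloring(star, 2))   # True False
```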
Abstract:
The correlation clustering problem is a fundamental problem in both theory and practice, and it involves identifying clusters of objects in a data set based on their similarity. A traditional modeling of this question as a graph theoretic problem involves associating vertices with data points and indicating similarity by adjacency. Clusters then correspond to cliques in the graph. The resulting optimization problem, Cluster Editing (and several variants), is very well-studied algorithmically. In many situations, however, translating clusters to cliques can be somewhat restrictive. A more flexible notion would be that of a structure where the vertices are mutually ``not too far apart'', without necessarily being adjacent. One such generalization is realized by structures called s-clubs, which are graphs of diameter at most $s$. In this work, we study the question of finding a set of at most $k$ edges whose removal leaves us with a graph whose components are s-clubs. Recently, it has been shown that unless the Exponential Time Hypothesis (ETH) fails, Cluster Editing (whose components are 1-clubs) does not admit a sub-exponential time algorithm [STACS, 2013]. That is, there is no algorithm solving the problem in time $2^{o(k)} n^{O(1)}$. However, surprisingly, they show that when the number of cliques in the output graph is restricted to $d$, then the problem can be solved in time $O(2^{O(\sqrt{dk})} + m + n)$. We show that this sub-exponential time algorithm for the fixed number of cliques is the exception rather than the rule. Our first result shows that assuming the ETH, there is no algorithm solving the s-Club Cluster Edge Deletion problem in time $2^{o(k)} n^{O(1)}$. We show, further, that even the problem of deleting edges to obtain a graph with $d$ s-clubs cannot be solved in time $2^{o(k)} n^{O(1)}$ for any fixed $s, d \ge 2$. This is in sharp contrast to the situation established for cliques, where sub-exponential algorithms are known.
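The target property in the abstract above, that every connected component is an s-club (has diameter at most s), is easy to test for a given graph. The following sketch runs a breadth-first search from each vertex; it only checks a candidate solution and says nothing about how to choose the k edges to delete. The function name and the example graph are illustrative assumptions.

```python
from collections import deque

def components_are_s_clubs(n, edges, s):
    """Return True if every connected component has diameter at most s.

    Vertices are 0..n-1. A BFS from each vertex gives its eccentricity within
    its own component; all eccentricities <= s means all diameters <= s.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if max(dist.values()) > s:
            return False
    return True

path = [(0, 1), (1, 2), (2, 3)]                      # diameter 3
print(components_are_s_clubs(4, path, 2))            # False
print(components_are_s_clubs(4, path[:-1], 2))       # True after deleting edge (2, 3)
```

With s = 1 the check reduces to asking whether every component is a clique, which matches the abstract's remark that Cluster Editing components are 1-clubs.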
Abstract:
We address the parameterized complexity of MaxColorable Induced Subgraph on perfect graphs. The problem asks for a maximum sized $q$-colorable induced subgraph of an input graph $G$. Yannakakis and Gavril [IPL 1987] showed that this problem is NP-complete even on split graphs if $q$ is part of the input, but gave an $n^{O(q)}$ algorithm on chordal graphs. We first observe that the problem is W[2]-hard parameterized by $q$, even on split graphs. However, when parameterized by $\ell$, the number of vertices in the solution, we give two fixed-parameter tractable algorithms. The first algorithm runs in time $5.44^{\ell} (n + \#\alpha(G))^{O(1)}$, where $\#\alpha(G)$ is the number of maximal independent sets of the input graph. The second algorithm runs in time $q^{\ell + o(\ell)} n^{O(1)} T_{\alpha}$, where $T_{\alpha}$ is the time required to find a maximum independent set in any induced subgraph of $G$. The first algorithm is efficient when the input graph contains only polynomially many maximal independent sets; for example, split graphs and co-chordal graphs. The running time of the second algorithm is FPT in $\ell$ alone (whenever $T_{\alpha}$ is a polynomial in $n$), since $q \le \ell$ for all non-trivial situations. Finally, we show that (under standard complexity-theoretic assumptions) the problem does not admit a polynomial kernel on split and perfect graphs in the following sense: (a) On split graphs, we do not expect a polynomial kernel if $q$ is a part of the input. (b) On perfect graphs, we do not expect a polynomial kernel even for fixed values of $q \ge 2$.
Abstract:
The efficiency of long-distance acoustic signalling of insects in their natural habitat is constrained in several ways. Acoustic signals are not only subjected to changes imposed by the physical structure of the habitat such as attenuation and degradation but also to masking interference from co-occurring signals of other acoustically communicating species. Masking interference is likely to be a ubiquitous problem in multi-species assemblages, but successful communication in natural environments under noisy conditions suggests powerful strategies to deal with the detection and recognition of relevant signals. In this review we present recent work on the role of the habitat as a driving force in shaping insect signal structures. In the context of acoustic masking interference, we discuss the ecological niche concept and examine the role of acoustic resource partitioning in the temporal, spatial and spectral domains as sender strategies to counter masking. We then examine the efficacy of different receiver strategies: physiological mechanisms such as frequency tuning, spatial release from masking and gain control as useful strategies to counteract acoustic masking. We also review recent work on the effects of anthropogenic noise on insect acoustic communication and the importance of insect sounds as indicators of biodiversity and ecosystem health.
Abstract:
This article considers a semi-infinite mathematical programming problem with equilibrium constraints (SIMPEC) defined as a semi-infinite mathematical programming problem with complementarity constraints. We establish necessary and sufficient optimality conditions for the (SIMPEC). We also formulate Wolfe- and Mond-Weir-type dual models for (SIMPEC) and establish weak, strong and strict converse duality theorems for (SIMPEC) and the corresponding dual problems under invexity assumptions.