884 results for Dunkl Kernel
Abstract:
Emerging embedded applications are based on evolving standards (e.g., MPEG2/4, H.264/265, IEEE802.11a/b/g/n). Since most of these applications run on handheld devices, there is an increasing need for a single-chip solution that can dynamically interoperate between different standards and their derivatives. In order to achieve high resource utilization and low power dissipation, we propose REDEFINE, a polymorphic ASIC in which specialized hardware units are replaced with basic hardware units that can create the same functionality by runtime re-composition. It is a "future-proof" custom hardware solution for multiple applications and their derivatives in a domain. In this article, we describe a compiler framework and supporting hardware comprising compute, storage, and communication resources. Applications described in a high-level language (e.g., C) are compiled into application substructures. For each application substructure, a set of compute elements (CEs) on the hardware is interconnected at runtime to form a pattern that closely matches the communication pattern of that particular application. The advantage is that the bounded CEs are neither processor cores nor logic elements as in FPGAs. Hence, REDEFINE offers the power and performance advantage of an ASIC and the hardware reconfigurability and programmability of an FPGA/instruction set processor. In addition, the hardware supports custom instruction pipelining. Existing instruction-set extensible processors determine a sequence of instructions that repeatedly occurs within the application to create custom instructions at design time to speed up the execution of this sequence. We extend this scheme further, where a kernel is compiled into custom instructions that bear a strong producer-consumer relationship (and are not limited to frequently occurring sequences of instructions). 
Custom instructions, realized as hardware compositions effected at runtime, allow several instances of the same to be active in parallel. A key distinguishing factor in the majority of emerging embedded applications is stream processing. To reduce the overheads of data transfer between custom instructions, direct communication paths are employed among custom instructions. In this article, we present an overview of the hardware-aware compiler framework, which determines the NoC-aware schedule of transports of the data exchanged between the custom instructions on the interconnect. The results for the FFT kernel indicate a 25% reduction in the number of loads/stores, and throughput improves by log(n) for an n-point FFT when compared to a sequential implementation. Overall, REDEFINE offers flexibility and runtime reconfigurability at the expense of 1.16x in power and 8x in area when compared to an ASIC. The REDEFINE implementation consumes 0.1x the power of an FPGA implementation. In addition, the configuration overhead of the FPGA implementation is 1,000x more than that of REDEFINE.
Abstract:
The problem of unsupervised anomaly detection arises in a wide variety of practical applications. While one-class support vector machines have demonstrated their effectiveness as an anomaly detection technique, their ability to model large datasets is limited due to their memory and time complexity for training. To address this issue for supervised learning of kernel machines, there has been growing interest in random projection methods as an alternative to the computationally expensive problems of kernel matrix construction and support vector optimisation. In this paper we leverage the theory of nonlinear random projections and propose the Randomised One-class SVM (R1SVM), which is an efficient and scalable anomaly detection technique that can be trained on large-scale datasets. Our empirical analysis on several real-life and synthetic datasets shows that our randomised 1SVM algorithm achieves accuracy comparable to or better than deep autoencoder and traditional kernelised approaches for anomaly detection, while being approximately 100 times faster in training and testing.
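As a rough illustration of what a nonlinear random projection can look like, the sketch below builds random Fourier features whose inner products approximate an RBF kernel, so that a linear model on the projected data can stand in for a kernel machine. This is an assumption made for illustration: the abstract does not say which projection R1SVM uses, and the names `make_rff`, `rbf`, and `dot` are hypothetical.

```python
import math
import random

def make_rff(dim, n_features, gamma, seed=0):
    """Return a map z(x) with z(x).z(y) ~ exp(-gamma * ||x - y||^2).

    Illustrative only: random Fourier features are one kind of nonlinear
    random projection, not necessarily the one used by R1SVM.
    """
    rng = random.Random(seed)
    scale = math.sqrt(2.0 * gamma)  # frequencies drawn from N(0, 2*gamma*I)
    weights = [[rng.gauss(0.0, scale) for _ in range(dim)]
               for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    norm = math.sqrt(2.0 / n_features)

    def z(x):
        # One cosine feature per random frequency and phase offset.
        return [norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return z

def rbf(x, y, gamma):
    """Exact RBF kernel, used here only to check the approximation."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def dot(u, v):
    """Plain inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))
```

With such a map, the kernel matrix never has to be formed: each sample is projected once, and training cost then depends on the number of random features rather than on the number of sample pairs.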
Abstract:
State-of-the-art image-set matching techniques typically implicitly model each image-set with a Gaussian distribution. Here, we propose to go beyond these representations and model image-sets as probability distribution functions (PDFs) using kernel density estimators. To compare and match image-sets, we exploit Csiszár f-divergences, which bear strong connections to the geodesic distance defined on the space of PDFs, i.e., the statistical manifold. Furthermore, we introduce valid positive definite kernels on the statistical manifold, which let us make use of more powerful classification schemes to match image-sets. Finally, we introduce a supervised dimensionality reduction technique that learns a latent space where f-divergences reflect the class labels of the data. Our experiments on diverse problems, such as video-based face recognition and dynamic texture classification, demonstrate the benefits of our approach over state-of-the-art image-set matching methods.
Abstract:
Many conventional statistical machine learning algorithms generalise poorly if distribution bias exists in the datasets. For example, distribution bias arises in the context of domain generalisation, where knowledge acquired from multiple source domains needs to be used in a previously unseen target domain. We propose Elliptical Summary Randomisation (ESRand), an efficient domain generalisation approach that comprises a randomised kernel and elliptical data summarisation. ESRand learns a domain interdependent projection to a latent subspace that minimises the existing biases in the data while maintaining the functional relationship between domains. In the latent subspace, ellipsoidal summaries replace the samples to enhance the generalisation by further removing bias and noise in the data. Moreover, the summarisation enables large-scale data processing by significantly reducing the size of the data. Through comprehensive analysis, we show that our subspace-based approach outperforms state-of-the-art results on several activity recognition benchmark datasets, while keeping the computational complexity significantly low.
Abstract:
We derive a very general expression for the survival probability and the first passage time distribution for a particle executing Brownian motion in full phase space with an absorbing boundary condition at a point in the position space, which is valid irrespective of the statistical nature of the dynamics. The expression, together with Jensen's inequality, naturally leads to a lower bound on the actual survival probability and an approximate first passage time distribution. These are expressed in terms of the position-position, velocity-velocity, and position-velocity variances. Knowledge of these variances enables one to compute a lower bound on the survival probability and consequently the first passage distribution function. As examples, we compute these for a Gaussian Markovian process and, in the case of a non-Markovian process, with an exponentially decaying friction kernel and also with a power law friction kernel. Our analysis shows that the survival probability decays exponentially at long times irrespective of the nature of the dynamics, with an exponent equal to the transition state rate constant.
Abstract:
Dynamic systems involving convolution integrals with decaying kernels, of which fractionally damped systems form a special case, are non-local in time and hence infinite dimensional. Straightforward numerical solution of such systems up to time t needs O(t^2) computations owing to the repeated evaluation of integrals over intervals that grow like t. Finite-dimensional and local approximations are thus desirable. We present here an approximation method which first rewrites the evolution equation as a coupled infinite-dimensional system with no convolution, and then uses Galerkin approximation with finite elements to obtain linear, finite-dimensional, constant coefficient approximations for the convolution. This paper is a broad generalization, based on a new insight, of our prior work with fractional order derivatives (Singh & Chatterjee 2006 Nonlinear Dyn. 45, 183-206). In particular, the decaying kernels we can address are now generalized to the Laplace transforms of known functions; of these, the power law kernel of fractional order differentiation is a special case. The approximation can be refined easily. The local nature of the approximation allows numerical solution up to time t with O(t) computations. Examples with several different kernels show excellent performance. A key feature of our approach is that the dynamic system in which the convolution integral appears is itself approximated using another system, as distinct from numerically approximating just the solution for the given initial values; this allows non-standard uses of the approximation, e.g. in stability analyses.
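The gap between O(t^2) and O(t) can be seen in the simplest special case, a single exponential kernel exp(-a t), where the convolution satisfies a local update rule. The sketch below is only that special case, assuming a left Riemann sum on a uniform grid; it is not the Galerkin construction described in the abstract, and the function names are hypothetical.

```python
import math

def conv_direct(x, a, dt):
    """O(n^2): re-evaluate the whole convolution sum at every time step.

    y_i = sum_{j < i} exp(-a * (i - j) * dt) * x_j * dt  (left Riemann sum)
    """
    y = []
    for i in range(len(x)):
        total = 0.0
        for j in range(i):
            total += math.exp(-a * (i - j) * dt) * x[j] * dt
        y.append(total)
    return y

def conv_recursive(x, a, dt):
    """O(n): the same sum written as a local one-step recursion.

    Factoring exp(-a * dt) out of the sum gives
    y_i = exp(-a * dt) * (y_{i-1} + x_{i-1} * dt), with y_0 = 0.
    """
    decay = math.exp(-a * dt)
    y = [0.0]
    for i in range(1, len(x)):
        y.append(decay * (y[-1] + x[i - 1] * dt))
    return y
```

For general decaying kernels no single recursion like this exists, which is where a finite-dimensional approximation of the kind the abstract describes becomes useful.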
Abstract:
In this paper we propose a novel family of kernels for multivariate time-series classification problems. Each time-series is approximated by a linear combination of piecewise polynomial functions in a Reproducing Kernel Hilbert Space by a novel kernel interpolation technique. Using the associated kernel function, a large margin classification formulation is proposed that can discriminate between two classes. The formulation leads to kernels between two multivariate time-series that can be efficiently computed. The kernels have been successfully applied to writer-independent handwritten character recognition.
Abstract:
Support Vector Machines (SVMs) are hyperplane classifiers defined in a kernel-induced feature space. The data-size-dependent training time complexity of SVMs usually prohibits their use in applications involving more than a few thousand data points. In this paper we propose a novel kernel-based incremental data clustering approach and its use for scaling non-linear Support Vector Machines to handle large data sets. The clustering method introduced can find cluster abstractions of the training data in a kernel-induced feature space. These cluster abstractions are then used for selective-sampling-based training of Support Vector Machines to reduce the training time without compromising the generalization performance. Experiments done with real-world datasets show that this approach gives good generalization performance at reasonable computational expense.
Abstract:
Automatic identification of software faults has enormous practical significance. This requires characterizing program execution behavior and the use of appropriate data mining techniques on the chosen representation. In this paper, we use the sequence of system calls to characterize program execution. The data mining tasks addressed are learning to map system call streams to fault labels and automatic identification of fault causes. Spectrum kernels and SVM are used for the former, while latent semantic analysis is used for the latter. The techniques are demonstrated on the intrusion dataset containing system call traces. The results show that kernel techniques are as accurate as the best available results but are faster by orders of magnitude. We also show that latent semantic indexing is capable of revealing fault-specific features.
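A k-spectrum kernel compares two sequences by counting the length-k subsequences they share. The sketch below applies that standard definition to lists of system call names; the tokenisation, the choice of k, and the function names are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

def kmer_counts(calls, k):
    """Count every contiguous run of k system calls in a trace."""
    return Counter(tuple(calls[i:i + k]) for i in range(len(calls) - k + 1))

def spectrum_kernel(trace_a, trace_b, k):
    """Inner product of the two k-mer count vectors."""
    counts_a = kmer_counts(trace_a, k)
    counts_b = kmer_counts(trace_b, k)
    # Counter returns 0 for k-mers absent from trace_b.
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())
```

Because the kernel is an inner product of count vectors, the resulting Gram matrix can be passed to any SVM implementation that accepts a precomputed kernel.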
Abstract:
Microorganisms exist predominantly as sessile multispecies communities in natural habitats. Most bacterial species can form these matrix-enclosed microbial communities, called biofilms. Biofilms occur in a wide range of environments, on every surface with sufficient moisture and nutrients, including surfaces in industrial settings and engineered water systems. This unwanted biofilm formation on equipment surfaces is called biofouling. Biofouling can significantly decrease equipment performance and lifetime and cause contamination and impaired quality of the industrial product. In this thesis we studied bacterial adherence to abiotic surfaces by using coupons of stainless steel, either uncoated or coated with fluoropolymer or diamond-like carbon (DLC). As model organisms we used bacterial isolates from paper machines (Meiothermus silvanus, Pseudoxanthomonas taiwanensis and Deinococcus geothermalis) and also a well-characterised species isolated from medical implants (Staphylococcus epidermidis). We found that coating the steel surface with these materials reduced its tendency towards biofouling: fluoropolymer and DLC coatings repelled all four biofilm formers on steel. We found great differences between bacterial species in the surfaces they preferred to adhere to, as well as in their ultrastructural details, such as the number and thickness of the adhesion organelles they expressed. These details responded differently to the different surfaces they adhered to. We further found that biofilms of D. geothermalis formed on titanium dioxide coated coupons of glass, steel and titanium were effectively removed by photocatalytic action in response to irradiation at 360 nm. However, on non-coated glass or steel surfaces irradiation had no detectable effect on the amount of bacterial biomass. We showed that the adhesion organelles of bacteria on illuminated TiO2 coated coupons were completely destroyed, whereas on non-coated coupons they looked intact when observed by microscope. 
Stainless steel is the most widely used material for industrial process equipment and surfaces. The results in this thesis showed that stainless steel is prone to biofouling by phylogenetically distant bacterial species and that coating the steel may offer a tool for reducing biofouling of industrial equipment. Photocatalysis, on the other hand, is a potential technique for biofilm removal from surfaces in locations where a high level of hygiene is required. Our study of natural biofilms on barley kernel surfaces showed that there, too, the microbes possessed adhesion organelles visible under an electron microscope both before and after steeping. The microbial community of dry barley kernels turned into a dense biofilm covered with slimy extracellular polymeric substance (EPS) in the kernels after steeping in water. Steeping is the first step in malting. We also presented evidence showing that certain strains of Lactobacillus plantarum and Wickerhamomyces anomalus, when used as starter cultures in the steeping water, could enter the barley kernel and colonise its tissues. By use of a starter culture it was possible to reduce the extensive production of EPS, which resulted in a faster filtration of the mash.
Abstract:
A key trait of Free and Open Source Software (FOSS) development is its distributed nature. Nevertheless, two project-level operations, the fork and the merge of program code, are among the least well understood events in the lifespan of a FOSS project. Some projects have explicitly adopted these operations as the primary means of concurrent development. In this study, we examine the effect of highly distributed software development, as found in the Linux kernel project, on the collection and modelling of software development data. We find that distributed development calls for sophisticated temporal modelling techniques where several versions of the source code tree can exist at once. Attention must be turned towards the methods of quality assurance and peer review that projects employ to manage these parallel source trees. Our analysis indicates that two new metrics, fork rate and merge rate, could be useful for determining the role of distributed version control systems in FOSS projects. The study presents a preliminary data set consisting of version control and mailing list data.
Abstract:
According to certain arguments, computation is observer-relative either in the sense that many physical systems implement many computations (Hilary Putnam), or in the sense that almost all physical systems implement all computations (John Searle). If sound, these arguments have a potentially devastating consequence for the computational theory of mind: if arbitrary physical systems can be seen to implement arbitrary computations, the notion of computation seems to lose all explanatory power as far as brains and minds are concerned. David Chalmers and B. Jack Copeland have attempted to counter these relativist arguments by placing certain constraints on the definition of implementation. In this thesis, I examine their proposals and find both wanting in some respects. During the course of this examination, I give a formal definition of the class of combinatorial-state automata, upon which Chalmers's account of implementation is based. I show that this definition implies two theorems (one an observation due to Curtis Brown) concerning the computational power of combinatorial-state automata, theorems which speak against founding the theory of implementation upon this formalism. Toward the end of the thesis, I sketch a definition of the implementation of Turing machines in dynamical systems, and offer this as an alternative to Chalmers's and Copeland's accounts of implementation. I demonstrate that the definition does not imply Searle's claim for the universal implementation of computations. However, the definition may support claims that are weaker than Searle's, yet still troubling to the computationalist. There remains a kernel of relativity in implementation at any rate, since the interpretation of physical systems seems itself to be an observer-relative matter, to some degree at least. This observation helps clarify the role the notion of computation can play in cognitive science. 
Specifically, I will argue that the notion should be conceived as an instrumental rather than as a fundamental or foundational one.
Abstract:
Statistical learning algorithms provide a viable framework for geotechnical engineering modeling. This paper describes two statistical learning algorithms applied to site characterization modeling based on standard penetration test (SPT) data. More than 2700 field SPT values (N) have been collected from 766 boreholes spread over an area of 220 sq km in Bangalore. To obtain the corrected N value (N_c), the N values have been corrected for different parameters such as overburden stress, size of borehole, type of sampler, length of connecting rod, etc. In the three-dimensional site characterization model, the function N_c = N_c(X, Y, Z), where X, Y and Z are the coordinates of a point corresponding to an N_c value, is to be approximated, so that the N_c value at any half-space point in Bangalore can be determined. The first algorithm uses the least-square support vector machine (LSSVM), which is related to a ridge regression type of support vector machine. The second algorithm uses the relevance vector machine (RVM), which combines the strengths of kernel-based methods and Bayesian theory to establish the relationships between a set of input vectors and a desired output. The paper also presents a comparative study of the developed LSSVM and RVM models for site characterization. Copyright (C) 2009 John Wiley & Sons, Ltd.
First simultaneous measurement of the top quark mass in the lepton+jets and dilepton channels at CDF
Abstract:
We present a measurement of the mass of the top quark using data corresponding to an integrated luminosity of 1.9 fb^-1 of ppbar collisions collected at sqrt{s} = 1.96 TeV with the CDF II detector at Fermilab's Tevatron. This is the first measurement of the top quark mass using top-antitop pair candidate events in the lepton + jets and dilepton decay channels simultaneously. We reconstruct two observables in each channel and use a non-parametric kernel density estimation technique to derive two-dimensional probability density functions from simulated signal and background samples. The observables are the top quark mass and the invariant mass of two jets from the W decay in the lepton + jets channel, and the top quark mass and the scalar sum of transverse energy of the event in the dilepton channel. We perform a simultaneous fit for the top quark mass and the jet energy scale, which is constrained in situ by the hadronic W boson mass. Using 332 lepton + jets candidate events and 144 dilepton candidate events, we measure the top quark mass to be m_top = 171.9 +/- 1.7 (stat. + JES) +/- 1.1 (syst.) GeV/c^2 = 171.9 +/- 2.0 GeV/c^2.
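The quoted total uncertainty is consistent with adding the two components in quadrature. The abstract does not state how they were combined, so the independence of the components is an assumption here, and `combine_in_quadrature` is a hypothetical helper name.

```python
import math

def combine_in_quadrature(*components):
    """Square root of the sum of squares of the given uncertainties.

    Valid only if the components are independent (assumed here).
    """
    return math.sqrt(sum(c * c for c in components))

stat_plus_jes = 1.7  # GeV/c^2, from the abstract
systematic = 1.1     # GeV/c^2, from the abstract
total = combine_in_quadrature(stat_plus_jes, systematic)
print(f"{total:.1f} GeV/c^2")  # prints "2.0 GeV/c^2"
```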