884 results for Dunkl Kernel


Relevance: 10.00%

Abstract:

With the advent of alternative fuels, such as biodiesels and related blends, it is important to develop an understanding of their effects on inter-cycle variability which, in turn, influences engine performance as well as its emissions. Using four methanol trans-esterified biomass fuels of differing carbon chain length and degree of unsaturation, this paper provides insight into the effect that alternative fuels have on inter-cycle variability. The experiments were conducted with a heavy-duty Cummins, turbo-charged, common-rail compression ignition engine. Combustion performance is reported in terms of the following key in-cylinder parameters: indicated mean effective pressure (IMEP), net heat release rate (NHRR), standard deviation of variability (StDev), coefficient of variation (CoV), peak pressure, peak pressure timing and maximum rate of pressure rise. A link is also established between the cyclic variability and oxygen ratio, which is a good indicator of stoichiometry. The results show that the fatty acid structures did not have a significant effect on injection timing, injection duration, injection pressure, StDev of IMEP, or the timing of peak motoring and combustion pressures. However, a significant effect was noted on the premixed and diffusion combustion proportions, combustion peak pressure and maximum rate of pressure rise. Additionally, the boost pressure, IMEP and combustion peak pressure were found to be directly correlated with the oxygen ratio. The emission of particles positively correlates with the oxygen content in the fuel as well as in the air-fuel mixture, resulting in a higher total number of particles per unit of mass.
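The cycle-to-cycle statistics named above (StDev and CoV of IMEP) reduce to simple formulas over per-cycle IMEP values; the sketch below uses made-up numbers rather than the paper's measurements.

```python
import numpy as np

# Hypothetical per-cycle IMEP values (bar) for a run of consecutive cycles;
# the paper's actual measurements are not reproduced here.
imep = np.array([7.92, 8.05, 7.98, 8.11, 7.87, 8.02, 7.95, 8.08])

mean_imep = imep.mean()
stdev_imep = imep.std(ddof=1)               # sample standard deviation across cycles
cov_imep = 100.0 * stdev_imep / mean_imep   # coefficient of variation, in percent

print(f"mean IMEP  = {mean_imep:.3f} bar")
print(f"StDev(IMEP) = {stdev_imep:.3f} bar")
print(f"CoV(IMEP)   = {cov_imep:.2f} %")
```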

Relevance: 10.00%

Abstract:

The continuous growth of XML data poses a great concern in the area of XML data management. The need for processing large amounts of XML data brings complications to many applications, such as information retrieval, data integration and many others. One way of simplifying this problem is to break the massive amount of data into smaller groups by application of clustering techniques. However, XML clustering is an intricate task that may involve the processing of both the structure and the content of XML data in order to identify similar XML data. This research presents four clustering methods, two methods utilizing the structure of XML documents and the other two utilizing both the structure and the content. The two structural clustering methods have different data models: one is based on a path model and the other on a tree model. These methods employ rigid similarity measures which aim to identify corresponding elements between documents with different or similar underlying structure. The two clustering methods that utilize both the structural and content information vary in terms of how the structure and content similarity are combined. One clustering method calculates the document similarity by using a linear weighting combination strategy of structure and content similarities. The content similarity in this clustering method is based on a semantic kernel. The other method calculates the distance between documents by a non-linear combination of the structure and content of XML documents using a semantic kernel. Empirical analysis shows that the structure-only clustering method based on the tree model is more scalable than the structure-only clustering method based on the path model, as the tree similarity measure for the tree model does not need to visit the parents of an element many times. Experimental results also show that the clustering methods perform better with the inclusion of the content information on most test document collections. To further the research, the structural clustering method based on the tree model is extended and employed in XML transformation. The results from the experiments show that the proposed transformation process is faster than the traditional transformation system that translates and converts the source XML documents sequentially. Also, the schema matching process of XML transformation produces a better matching result in a shorter time.
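The linear weighting combination mentioned for the first structure-plus-content method can be illustrated as follows; this is a minimal sketch in which the weighting parameter and the two similarity functions are placeholders, not the thesis's actual path/tree and semantic-kernel measures.

```python
def combined_similarity(doc_a, doc_b, struct_sim, content_sim, alpha=0.5):
    """Linearly combine structural and content similarity scores.

    struct_sim and content_sim are placeholder callables returning values in
    [0, 1]; alpha controls the relative weight of structure versus content.
    """
    return alpha * struct_sim(doc_a, doc_b) + (1.0 - alpha) * content_sim(doc_a, doc_b)

# Toy usage with dummy similarity functions standing in for the path/tree and
# semantic-kernel measures described in the abstract.
struct = lambda a, b: 0.8
content = lambda a, b: 0.4
print(combined_similarity("doc1.xml", "doc2.xml", struct, content, alpha=0.6))
```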

Relevance: 10.00%

Abstract:

This paper presents a new framework for distributed intrusion detection based on taint marking. Our system tracks information flows between applications of multiple hosts gathered in groups (i.e., sets of hosts sharing the same distributed information flow policy) by attaching taint labels to system objects such as files, sockets, Inter Process Communication (IPC) abstractions, and memory mappings. Labels are carried over the network by tainting network packets. A distributed information flow policy is defined for each group at the host level by labeling information and defining how users and applications can legally access, alter or transfer information towards other trusted or untrusted hosts. As opposed to existing approaches, where information is most often represented by two security levels (low/high, public/private, etc.), our model identifies each piece of information within a distributed system, and defines their legal interaction in a fine-grained manner. Hosts store and exchange security labels in a peer-to-peer fashion, and there is no central monitor. Our IDS is implemented in the Linux kernel as a Linux Security Module (LSM) and runs standard software on commodity hardware with no required modification. The only trusted code is our modified operating system kernel. Finally, we present a scenario of intrusion in a web service running on multiple hosts, and show how our distributed IDS is able to report security violations at each host level.
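A minimal sketch of the taint-label propagation idea at the heart of such a system: each object carries a set of labels, a flow from source to sink unions the labels, and a policy check flags forbidden labels. Object names and labels here are illustrative; the actual implementation is a Linux Security Module operating on kernel objects, which this sketch does not reproduce.

```python
# Illustrative taint-propagation sketch; the real system attaches labels to
# files, sockets, IPC objects and memory mappings inside the Linux kernel.
class TaintedObject:
    def __init__(self, name, labels=None):
        self.name = name
        self.labels = set(labels or [])

def propagate(source, sink):
    """Information flows from source to sink: the sink inherits all labels."""
    sink.labels |= source.labels

def check_policy(obj, allowed_labels):
    """Report a violation if the object carries a label the policy forbids."""
    illegal = obj.labels - allowed_labels
    if illegal:
        print(f"violation: {obj.name} carries forbidden labels {illegal}")

# Hypothetical scenario: a process reads a sensitive file, then writes to a socket.
secret_file = TaintedObject("/srv/db/customers", labels={"customer-data"})
outgoing_socket = TaintedObject("tcp:10.0.0.5:80")
propagate(secret_file, outgoing_socket)
check_policy(outgoing_socket, allowed_labels={"public"})
```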

Relevance: 10.00%

Abstract:

The occurrence of extreme movements in the spot price of electricity represents a significant source of risk to retailers. A range of approaches has been considered for modelling electricity prices; these models, however, have relied on time-series techniques that typically use restrictive decay schemes, placing greater weight on more recent observations. This study develops an alternative, semi-parametric method for forecasting, which uses state-dependent weights derived from a kernel function. The forecasts obtained using this method are accurate and therefore potentially useful to electricity retailers in terms of risk management.
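State-dependent kernel weights are commonly realised as a Nadaraya-Watson style weighted average, where the weights decay with the distance between the current state and past states rather than with time. The sketch below is generic, not the paper's exact estimator; the state variable, kernel and bandwidth are placeholders.

```python
import numpy as np

def kernel_forecast(past_states, past_prices, current_state, bandwidth=1.0):
    """Forecast as a kernel-weighted average of past prices.

    Weights depend on how similar past states are to the current state
    (Gaussian kernel), not on how recent the observations are.
    """
    distances = (past_states - current_state) / bandwidth
    weights = np.exp(-0.5 * distances ** 2)
    return np.sum(weights * past_prices) / np.sum(weights)

# Toy example with synthetic spot-price data.
rng = np.random.default_rng(0)
states = rng.normal(size=200)                        # e.g. a demand or temperature proxy
prices = 40 + 5 * states + rng.normal(scale=2, size=200)
print(kernel_forecast(states, prices, current_state=1.2, bandwidth=0.5))
```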

Relevance: 10.00%

Abstract:

Motivation: Gene silencing, also called RNA interference, requires reliable assessment of silencer impacts. A critical task is to find matches between silencer oligomers and sites in the genome, in accordance with one-to-many matching rules (G-U matching, with provision for mismatches). Fast search algorithms are required to support silencer impact assessments in procedures for designing effective silencer sequences. Results: The article presents a matching algorithm and data structures specialized for matching searches, including a kernel procedure that addresses a Boolean version of the database task called the skyline search. Besides exact matches, the algorithm is extended to allow for the location-specific mismatches applicable in plants. Computational tests show that the algorithm is significantly faster than suffix-tree alternatives.
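The one-to-many matching rule can be sketched as a per-position test that accepts identical bases and G-U (G/T in DNA coordinates) wobbles, subject to a mismatch budget. This is only an illustration of the rule; the paper's data structures and skyline-search kernel are not reproduced here.

```python
# Simplified per-position matching between a silencer oligomer and a genomic
# site: identical bases match, a G/T wobble (G-U in RNA) also matches, and up
# to `max_mismatches` other positions may differ. Toy illustration only.
WOBBLE = {("G", "T"), ("T", "G"), ("G", "U"), ("U", "G")}

def matches(oligomer, site, max_mismatches=1):
    if len(oligomer) != len(site):
        return False
    mismatches = 0
    for a, b in zip(oligomer.upper(), site.upper()):
        if a == b or (a, b) in WOBBLE:
            continue
        mismatches += 1
        if mismatches > max_mismatches:
            return False
    return True

print(matches("GGATCGTA", "GGGTCGTA", max_mismatches=1))  # True: one mismatch at position 3
print(matches("GTATCGTA", "GGATCGTA", max_mismatches=0))  # True: T/G at position 2 is a wobble
```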

Relevance: 10.00%

Abstract:

Genomic sequences are fundamentally text documents, admitting various representations according to need and tokenization. Gene expression depends crucially on binding of enzymes to the DNA sequence at small, poorly conserved binding sites, limiting the utility of standard pattern search. However, one may exploit the regular syntactic structure of the enzyme's component proteins and the corresponding binding sites, framing the problem as one of detecting grammatically correct genomic phrases. In this paper we propose new kernels based on weighted tree structures, traversing the paths within them to capture the features which underpin the task. Experimentally, we find that these kernels provide performance comparable with state-of-the-art approaches for this problem, while offering significant computational advantages over earlier methods. The methods proposed may be applied to a broad range of sequence or tree-structured data in molecular biology and other domains.
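A minimal path-based tree kernel in the spirit described: enumerate root-to-node label paths in each tree and sum a decaying weight over the paths the two trees share. The trees, labels and decay factor below are illustrative, not the paper's construction.

```python
from collections import Counter

# Toy tree representation: (label, [children]).
def paths(tree, prefix=()):
    """Yield every root-to-node label path in the tree."""
    label, children = tree
    path = prefix + (label,)
    yield path
    for child in children:
        yield from paths(child, path)

def path_kernel(t1, t2, decay=0.5):
    """Count shared label paths, weighting each by decay**len(path)."""
    c1, c2 = Counter(paths(t1)), Counter(paths(t2))
    return sum(c1[p] * c2[p] * decay ** len(p) for p in c1.keys() & c2.keys())

t1 = ("S", [("NP", [("protein", [])]), ("VP", [("binds", [])])])
t2 = ("S", [("NP", [("protein", [])]), ("VP", [("activates", [])])])
print(path_kernel(t1, t2))
```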

Relevance: 10.00%

Abstract:

An important aspect of decision support systems involves applying sophisticated and flexible statistical models to real datasets and communicating these results to decision makers in interpretable ways. An important class of problem is the modelling of incidence, such as fire or disease. Models of incidence known as point processes or Cox processes are particularly challenging as they are 'doubly stochastic', i.e. obtaining the probability mass function of incidents requires two integrals to be evaluated. Existing approaches either use simple models that obtain predictions from plug-in point estimates and do not distinguish between Cox processes and density estimation, but do use sophisticated 3D visualization for interpretation; or they employ sophisticated non-parametric Bayesian Cox process models but do not use visualization to render interpretable, complex spatio-temporal forecasts. The contribution here is to fill this gap by inferring predictive distributions of log-Gaussian Cox processes and rendering them using state-of-the-art 3D visualization techniques. This requires performing inference on an approximation of the model on a large-scale discretized grid and adapting an existing spatial-diurnal kernel to the log-Gaussian Cox process context.
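The 'doubly stochastic' structure can be sketched on a small discretized grid: draw a latent Gaussian field, exponentiate it to obtain an intensity surface, then draw Poisson counts per cell. This is a generic log-Gaussian Cox process simulation, not the paper's inference or visualization pipeline; the grid size and covariance parameters are placeholders.

```python
import numpy as np

# Simulate a log-Gaussian Cox process on a small grid: a latent Gaussian
# field gives log-intensities; counts per cell are Poisson given that field.
rng = np.random.default_rng(1)
n = 20                                      # grid is n x n cells
xs = np.arange(n)
xx, yy = np.meshgrid(xs, xs)
coords = np.column_stack([xx.ravel(), yy.ravel()])

# Squared-exponential covariance over cell centres (length-scale is illustrative).
d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
cov = np.exp(-d2 / (2 * 3.0 ** 2)) + 1e-8 * np.eye(n * n)

latent = rng.multivariate_normal(np.full(n * n, -1.0), cov)   # first level of randomness
intensity = np.exp(latent).reshape(n, n)
counts = rng.poisson(intensity)                               # second level: Poisson given intensity
print(counts.sum(), "incidents simulated on the grid")
```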

Relevance: 10.00%

Abstract:

Recent advances suggest that encoding images through Symmetric Positive Definite (SPD) matrices and then interpreting such matrices as points on Riemannian manifolds can lead to increased classification performance. Taking into account manifold geometry is typically done via (1) embedding the manifolds in tangent spaces, or (2) embedding into Reproducing Kernel Hilbert Spaces (RKHS). While embedding into tangent spaces allows the use of existing Euclidean-based learning algorithms, manifold shape is only approximated which can cause loss of discriminatory information. The RKHS approach retains more of the manifold structure, but may require non-trivial effort to kernelise Euclidean-based learning algorithms. In contrast to the above approaches, in this paper we offer a novel solution that allows SPD matrices to be used with unmodified Euclidean-based learning algorithms, with the true manifold shape well-preserved. Specifically, we propose to project SPD matrices using a set of random projection hyperplanes over RKHS into a random projection space, which leads to representing each matrix as a vector of projection coefficients. Experiments on face recognition, person re-identification and texture classification show that the proposed approach outperforms several recent methods, such as Tensor Sparse Coding, Histogram Plus Epitome, Riemannian Locality Preserving Projection and Relational Divergence Classification.
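A simplified sketch of mapping an SPD matrix to a vector of projection coefficients: here the matrix logarithm serves as a Euclidean surrogate for the manifold and the result is projected onto random hyperplanes. The paper's construction works in an RKHS and differs in detail; the dimensions and hyperplane count below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_spd(d):
    """Generate a random symmetric positive definite matrix for the demo."""
    a = rng.normal(size=(d, d))
    return a @ a.T + d * np.eye(d)

def spd_log(spd):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(spd)
    return v @ np.diag(np.log(w)) @ v.T

def projection_coefficients(spd, hyperplanes):
    """Flatten the SPD matrix with the matrix log (a simplification of the
    RKHS construction) and project it onto random hyperplanes."""
    return hyperplanes @ spd_log(spd).ravel()

d, n_proj = 5, 16
hyperplanes = rng.normal(size=(n_proj, d * d))
hyperplanes /= np.linalg.norm(hyperplanes, axis=1, keepdims=True)

coeffs = projection_coefficients(random_spd(d), hyperplanes)
print(coeffs.shape)   # (16,): a fixed-length vector usable by Euclidean classifiers
```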

Relevance: 10.00%

Abstract:

This paper describes a novel system for automatic classification of images obtained from Anti-Nuclear Antibody (ANA) pathology tests on Human Epithelial type 2 (HEp-2) cells using the Indirect Immunofluorescence (IIF) protocol. The IIF protocol on HEp-2 cells has been the hallmark method to identify the presence of ANAs, due to its high sensitivity and the large range of antigens that can be detected. However, it suffers from numerous shortcomings, such as being subjective as well as time- and labour-intensive. Computer Aided Diagnostic (CAD) systems have been developed to address these problems; they automatically classify a HEp-2 cell image into one of its known patterns (e.g. speckled, homogeneous). Most of the existing CAD systems use handpicked features to represent a HEp-2 cell image, which may only work in limited scenarios. We propose a novel automatic cell image classification method termed Cell Pyramid Matching (CPM), which comprises regional histograms of visual words coupled with the Multiple Kernel Learning framework. We present a study of several variations of generating histograms and show the efficacy of the system on two publicly available datasets: the ICPR HEp-2 cell classification contest dataset and the SNPHEp-2 dataset.
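The regional-histogram idea underlying CPM can be sketched as follows: assign each local descriptor to a visual word, then build and concatenate word histograms over the whole image and over sub-regions. The codebook size, pyramid levels and descriptor positions below are placeholders, not the CPM configuration.

```python
import numpy as np

def pyramid_histogram(word_ids, positions, image_shape, n_words, levels=(1, 2)):
    """Concatenate visual-word histograms over an l x l grid for each level.

    word_ids: visual-word index per local descriptor (after codebook assignment).
    positions: (row, col) of each descriptor. All sizes here are illustrative.
    """
    h, w = image_shape
    feats = []
    for l in levels:
        for i in range(l):
            for j in range(l):
                in_cell = [
                    wid for wid, (r, c) in zip(word_ids, positions)
                    if i * h / l <= r < (i + 1) * h / l and j * w / l <= c < (j + 1) * w / l
                ]
                hist = np.bincount(np.asarray(in_cell, dtype=int), minlength=n_words).astype(float)
                if hist.sum() > 0:
                    hist /= hist.sum()        # normalise each regional histogram
                feats.append(hist)
    return np.concatenate(feats)

rng = np.random.default_rng(0)
ids = rng.integers(0, 32, size=100)           # toy visual-word assignments
pos = rng.uniform(0, 64, size=(100, 2))       # toy descriptor locations in a 64x64 cell image
print(pyramid_histogram(ids, pos, (64, 64), n_words=32).shape)
```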

Relevance: 10.00%

Abstract:

Time series classification has been extensively explored in many fields of study. Most methods are based on the historical or current information extracted from data. However, if interest is in a specific future time period, methods that directly relate to forecasts of time series are much more appropriate. An approach to time series classification is proposed based on a polarization measure of forecast densities of time series. By fitting autoregressive models, forecast replicates of each time series are obtained via the bias-corrected bootstrap, and a stationarity correction is considered when necessary. Kernel estimators are then employed to approximate forecast densities, and discrepancies of forecast densities of pairs of time series are estimated by a polarization measure, which evaluates the extent to which two densities overlap. Following the distributional properties of the polarization measure, a discriminant rule and a clustering method are proposed to conduct the supervised and unsupervised classification, respectively. The proposed methodology is applied to both simulated and real data sets, and the results show desirable properties.
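The overlap of two forecast densities can be sketched with kernel density estimates evaluated on a common grid; the statistic below integrates the pointwise minimum of the two densities (1 for identical densities, 0 for disjoint ones). It is a generic overlap measure standing in for the polarization measure, not necessarily the paper's exact statistic.

```python
import numpy as np
from scipy.stats import gaussian_kde

def density_overlap(replicates_a, replicates_b, grid_size=512):
    """Integral of min(f_a, f_b): 1 for identical densities, 0 for disjoint ones."""
    kde_a, kde_b = gaussian_kde(replicates_a), gaussian_kde(replicates_b)
    lo = min(replicates_a.min(), replicates_b.min())
    hi = max(replicates_a.max(), replicates_b.max())
    grid = np.linspace(lo, hi, grid_size)
    return np.trapz(np.minimum(kde_a(grid), kde_b(grid)), grid)

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=500)     # e.g. bootstrap forecast replicates of series A
b = rng.normal(0.5, 1.0, size=500)     # forecast replicates of series B
print(density_overlap(a, b))
```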

Relevance: 10.00%

Abstract:

A single plant cell was modeled with smoothed particle hydrodynamics (SPH) and a discrete element method (DEM) to study the basic micromechanics that govern the cellular structural deformations during drying. This two-dimensional particle-based model consists of two components: a cell fluid model and a cell wall model. The cell fluid was approximated as a highly viscous Newtonian fluid and modeled with SPH. The cell wall was treated as a stiff semi-permeable solid membrane with visco-elastic properties and modeled as a neo-Hookean solid material using a DEM. Compared to existing meshfree particle-based plant cell models, we have specifically introduced cell wall–fluid attraction forces and cell wall bending stiffness effects to address the critical shrinkage characteristics of the plant cells during drying. Also, a novel moisture domain-based approach was used to simulate drying mechanisms within the particle scheme. The model performance was found to be mainly influenced by the particle resolution, the initial gap between the outermost fluid particles and the wall particles, and the number of particles in the SPH influence domain. A higher-order smoothing kernel was used with adaptive smoothing length to improve the stability and accuracy of the model. Cell deformations at different states of cell dryness were qualitatively and quantitatively compared with microscopic experimental findings on apple cells, and a fairly good agreement was observed with some exceptions. The wall–fluid attraction forces and cell wall bending stiffness were found to significantly improve the model predictions. A detailed sensitivity analysis was also done to further investigate the influence of wall–fluid attraction forces, cell wall bending stiffness, cell wall stiffness and the particle resolution. This novel meshfree modeling approach is highly applicable to cellular-level deformation studies of plant food materials during drying, which involve large deformations.
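The smoothing kernel at the core of SPH can be illustrated with the textbook 2D cubic-spline kernel; the paper uses a higher-order kernel with adaptive smoothing length, which this sketch does not reproduce.

```python
import numpy as np

def cubic_spline_kernel_2d(r, h):
    """Standard 2D cubic-spline SPH smoothing kernel W(r, h).

    r: particle separation, h: smoothing length. Textbook normalisation
    10 / (7 * pi * h^2); support extends to r = 2h.
    """
    q = r / h
    sigma = 10.0 / (7.0 * np.pi * h ** 2)
    w = np.where(
        q < 1.0, 1.0 - 1.5 * q ** 2 + 0.75 * q ** 3,
        np.where(q < 2.0, 0.25 * (2.0 - q) ** 3, 0.0),
    )
    return sigma * w

# Kernel weight falls smoothly to zero at twice the smoothing length.
for r in (0.0, 0.5, 1.0, 1.9, 2.5):
    print(r, cubic_spline_kernel_2d(r, h=1.0))
```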

Relevance: 10.00%

Abstract:

'Approximate Bayesian Computation' (ABC) represents a powerful methodology for the analysis of complex stochastic systems for which the likelihood of the observed data under an arbitrary set of input parameters may be entirely intractable – the latter condition rendering useless the standard machinery of tractable likelihood-based, Bayesian statistical inference [e.g. conventional Markov chain Monte Carlo (MCMC) simulation]. In this paper, we demonstrate the potential of ABC for astronomical model analysis by application to a case study in the morphological transformation of high-redshift galaxies. To this end, we develop, first, a stochastic model for the competing processes of merging and secular evolution in the early Universe, and secondly, through an ABC-based comparison against the observed demographics of massive (Mgal > 10^11 M⊙) galaxies (at 1.5 < z < 3) in the Cosmic Assembly Near-IR Deep Extragalactic Legacy Survey (CANDELS)/Extended Groth Strip (EGS) data set, we derive posterior probability densities for the key parameters of this model. The 'Sequential Monte Carlo' implementation of ABC exhibited herein, featuring both a self-generating target sequence and a self-refining MCMC kernel, is amongst the most efficient of contemporary approaches to this important statistical algorithm. We also highlight through our chosen case study the value of careful summary statistic selection, and demonstrate two modern strategies for assessment and optimization in this regard. Ultimately, our ABC analysis of the high-redshift morphological mix returns tight constraints on the evolving merger rate in the early Universe and favours major merging (with disc survival or rapid reformation) over secular evolution as the mechanism most responsible for building up the first generation of bulges in early-type discs.
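The core ABC idea, accepting parameter draws whose simulated summary statistics land close to the observed ones, can be sketched with a plain rejection sampler on a toy model; the paper's Sequential Monte Carlo variant with a self-refining MCMC kernel is not reproduced here, and the model, prior and tolerance below are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, n=200):
    """Toy stochastic model (data ~ Normal(theta, 1)), standing in for the galaxy model."""
    return rng.normal(theta, 1.0, size=n)

def summary(data):
    """Summary statistics of a simulated or observed data set."""
    return np.array([data.mean(), data.std()])

observed = simulate(theta=1.5)          # pretend these are the observed demographics
s_obs = summary(observed)

accepted = []
for _ in range(20000):
    theta = rng.uniform(-5, 5)                    # draw from the prior
    s_sim = summary(simulate(theta))
    if np.linalg.norm(s_sim - s_obs) < 0.2:       # tolerance on summary-statistic distance
        accepted.append(theta)

accepted = np.array(accepted)
print(f"{accepted.size} accepted draws, posterior mean ≈ {accepted.mean():.2f}")
```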

Relevance: 10.00%

Abstract:

Description of a patient's injuries is recorded in narrative text form by hospital emergency departments. For statistical reporting, this text data needs to be mapped to pre-defined codes. Existing research in this field uses the Naïve Bayes probabilistic method to build classifiers for mapping. In this paper, we focus on providing guidance on the selection of a classification method. We build a number of classifiers belonging to different classification families, such as decision tree, probabilistic, neural network, instance-based, ensemble-based and kernel-based linear classifiers. Extensive pre-processing is carried out to ensure the quality of the data and, hence, the quality of the classification outcome. Records with a null entry in the injury description are removed. Misspelling correction is carried out by finding and replacing each misspelt word with a sound-alike word. Meaningful phrases are identified and kept, instead of removing parts of phrases as stop words. Abbreviations appearing in many forms of entry are manually identified and only one form is used. Clustering is utilised to discriminate between non-frequent and frequent terms. This process reduced the number of text features dramatically, from about 28,000 to 5,000. The medical narrative text injury dataset under consideration is composed of many short documents. The data can be characterized as high-dimensional and sparse: few features are irrelevant, but features are correlated with one another. Therefore, matrix factorization techniques such as Singular Value Decomposition (SVD) and Non-Negative Matrix Factorization (NNMF) have been used to map the processed feature space to a lower-dimensional feature space, and classifiers have been built on this reduced feature space. In the experiments, a set of tests is conducted to determine which classification method is best for medical text classification. The Non-Negative Matrix Factorization with Support Vector Machine method achieves 93% precision, which is higher than all the tested traditional classifiers. We also found that TF/IDF weighting, which works well for long text classification, is inferior to binary weighting in short document classification. Another finding is that the top-n terms should be removed in consultation with medical experts, as this affects the classification performance.
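The best-performing combination reported (NNMF followed by a Support Vector Machine, with binary term weighting) can be sketched with scikit-learn; the toy narratives, label set and number of components below are placeholders, not the study's data or settings.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import NMF
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy injury narratives and codes standing in for the hospital data.
texts = [
    "fell from ladder fractured left wrist",
    "burn to right hand from hot oil",
    "fell off bicycle abrasion to knee",
    "scald to forearm from boiling water",
]
codes = ["fall", "burn", "fall", "burn"]

model = make_pipeline(
    CountVectorizer(binary=True),                        # binary weighting, favoured over TF/IDF for short texts
    NMF(n_components=2, init="nndsvda", max_iter=500),   # reduce the sparse term space
    LinearSVC(),                                         # linear SVM on the reduced features
)
model.fit(texts, codes)
print(model.predict(["fell down stairs fractured ankle"]))
```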