Digital Library

923 results for Classification errors

Transreal limits expose category errors in IEEE 754 floating-point arithmetic and in mathematics

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The IEEE 754 standard for floating-point arithmetic is widely used in computing. It is based on real arithmetic and is made total by adding both a positive and a negative infinity, a negative zero, and many Not-a-Number (NaN) states. The IEEE infinities are said to have the behaviour of limits. Transreal arithmetic is total. It also has a positive and a negative infinity but no negative zero, and it has a single, unordered number, nullity. We elucidate the transreal tangent and extend real limits to transreal limits. Arguing from this firm foundation, we maintain that there are three category errors in the IEEE 754 standard. Firstly, the claim that IEEE infinities are limits of real arithmetic confuses limiting processes with arithmetic. Secondly, a defence of IEEE negative zero confuses the limit of a function with the value of a function. Thirdly, the definition of IEEE NaNs confuses undefined with unordered. Furthermore, we prove that the tangent function, with the infinities given by geometrical construction, has a period of an entire rotation, not half a rotation as is commonly understood. This illustrates a category error, confusing the limit with the value of a function, in an important area of applied mathematics: trigonometry. We briefly consider the wider implications of this category error. Another paper proposes transreal arithmetic as a basis for floating-point arithmetic; here we take the profound step of proposing transreal arithmetic as a replacement for real arithmetic to remove the possibility of certain category errors in mathematics. Thus we propose both theoretical and practical advantages of transmathematics. In particular, we argue that implementing transreal analysis in trans-floating-point arithmetic would extend the coverage, accuracy and reliability of almost all computer programs that exploit real analysis: essentially all programs in science and engineering and many in finance, medicine and other socially beneficial applications.
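The IEEE 754 features named in this abstract (signed infinities, negative zero, unordered NaNs) can be observed directly in Python floats, which are IEEE 754 doubles on common platforms. The sketch below only illustrates the standard's behaviour; it does not implement transreal arithmetic, and the function name `ieee_facts` is a hypothetical label chosen for this example.

```python
import math

inf = float("inf")
nan = float("nan")
neg_zero = -0.0


def ieee_facts():
    """Collect a few observable IEEE 754 behaviours of Python floats."""
    return {
        # Negative zero compares equal to zero but keeps its sign bit.
        "neg_zero_equals_zero": neg_zero == 0.0,
        "neg_zero_sign": math.copysign(1.0, neg_zero),
        # The sign of zero selects the branch of atan2 on the negative x-axis.
        "atan2_pos_zero": math.atan2(0.0, -1.0),
        "atan2_neg_zero": math.atan2(neg_zero, -1.0),
        # Arithmetic on infinities can produce NaN.
        "inf_minus_inf_is_nan": math.isnan(inf - inf),
        # NaN is unordered: every comparison with it is false, even equality.
        "nan_equals_itself": nan == nan,
        "nan_less_than_one": nan < 1.0,
        "nan_greater_than_one": nan > 1.0,
        # The float nearest to pi/2 is not pi/2, so tan returns a large
        # finite value rather than an infinity.
        "tan_half_pi_is_finite": math.isfinite(math.tan(math.pi / 2)),
    }


if __name__ == "__main__":
    for name, value in ieee_facts().items():
        print(f"{name}: {value}")
```

The `atan2` pair shows the kind of behaviour usually cited in defence of negative zero, and the last entry shows that the value computed at an argument near a pole is distinct from the limit at that pole.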

Novel single trial movement classification based on temporal dynamics of EEG

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Various complex oscillatory processes are involved in the generation of the motor command. The temporal dynamics of these processes were studied for movement detection from single-trial electroencephalogram (EEG) recordings. Autocorrelation analysis was performed on the EEG signals to find robust markers of movement. The evolution of the autocorrelation function was characterised via its relaxation time, estimated by exponential curve fitting. It was observed that the decay constant of the exponential curve increased during movement, indicating that the autocorrelation function decays slowly during motor execution. Significant differences were observed between movement and no-movement tasks. Additionally, a linear discriminant analysis (LDA) classifier was used to identify movement trials with a peak accuracy of 74%.
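The relaxation-time feature described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it models the autocorrelation as exp(-lag / tau), fits tau by least squares on the logarithm of the autocorrelation, and uses a synthetic first-order autoregressive signal in place of EEG. The names `autocorrelation`, `relaxation_time`, `ar1` and the `floor` cut-off are hypothetical.

```python
import math
import random


def autocorrelation(x, max_lag):
    """Normalised (biased) sample autocorrelation for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    var = sum(v * v for v in centered)
    return [
        sum(centered[i] * centered[i + k] for i in range(n - k)) / var
        for k in range(max_lag + 1)
    ]


def relaxation_time(acf, floor=0.1):
    """Fit acf[lag] ~ exp(-lag / tau) and return tau.

    The fit is a least-squares line through the origin on log(acf),
    using lags from 1 up to the first value at or below `floor`
    (an assumed cut-off that keeps noisy small values out of the log).
    """
    num = 0.0
    den = 0.0
    for lag in range(1, len(acf)):
        if acf[lag] <= floor:
            break
        num += lag * math.log(acf[lag])
        den += lag * lag
    if den == 0.0 or num >= 0.0:
        raise ValueError("autocorrelation does not decay within the fitted lags")
    return -den / num  # slope = num / den = -1 / tau


def ar1(phi, n, seed):
    """Synthetic AR(1) signal; larger phi gives slower decorrelation."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


if __name__ == "__main__":
    slow = relaxation_time(autocorrelation(ar1(0.9, 4000, seed=1), 20))
    fast = relaxation_time(autocorrelation(ar1(0.5, 4000, seed=2), 20))
    print(f"tau (slowly decaying signal): {slow:.2f}")
    print(f"tau (quickly decaying signal): {fast:.2f}")
```

In a classification setting, a value such as `tau` computed per trial would be one input feature to a classifier like LDA.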

Classification of seed storage behaviour of 67 Amazonian tree species

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Information was collated on the seed storage behaviour of 67 tree species native to the Amazon rainforest of Brazil; 38 appeared to show orthodox, 23 recalcitrant and six intermediate seed storage behaviour. A double-criteria key based on thousand-seed weight and seed moisture content at shedding to estimate likely seed storage behaviour, developed previously, showed good agreement with the above classifications. The key can aid seed storage behaviour identification considerably.

Smart-phone based electrocardiogram wavelet decomposition and neural network classification

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper discusses ECG classification after parametrizing the ECG waveforms in the wavelet domain. The aim of the work is to develop an accurate classification algorithm that can be used to diagnose cardiac beat abnormalities detected using a mobile platform such as a smartphone. Continuous-time recurrent neural network classifiers are considered for this task. Records from the European ST-T Database are decomposed in the wavelet domain using discrete wavelet transform (DWT) filter banks, and the resulting DWT coefficients are filtered and used as inputs for training the neural network classifier. Advantages of the proposed methodology are the reduced memory requirement for the signals, which is relevant to mobile applications, and an improvement in the generalization ability of the neural network due to the more parsimonious representation of the signal at its inputs.
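A multi-level DWT filter bank of the kind mentioned above can be sketched with the Haar wavelet, the simplest orthogonal case. The abstract does not say which wavelet family is used, so Haar is an assumption made for brevity, and the function names `haar_step` and `haar_dwt` are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, levels):
    """Repeatedly split the approximation band, as in a dyadic filter bank.

    Returns the coarsest approximation and the detail bands,
    ordered from finest to coarsest.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


if __name__ == "__main__":
    beat = [4.0, 4.0, 2.0, 2.0, 0.0, 0.0, 6.0, 6.0]
    approx, details = haar_dwt(beat, levels=2)
    print("approximation:", approx)
    print("details:", details)
```

Each level halves the number of approximation coefficients, which is the source of the reduced memory requirement; a classifier would be trained on a selected subset of these coefficients.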

On the application of optimal wavelet filter banks for ECG signal classification

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper discusses ECG signal classification after parametrizing the ECG waveforms in the wavelet domain. Signal decomposition using perfect reconstruction quadrature mirror filter banks can provide a very parsimonious representation of ECG signals. In the current work, the filter parameters are adjusted by a numerical optimization algorithm in order to minimize a cost function associated with the filter cut-off sharpness. The goal is to achieve a better compromise between frequency selectivity and time resolution at each decomposition level than standard orthogonal filter banks such as those of the Daubechies and Coiflet families. Our aim is to optimally decompose the signals in the wavelet domain so that they can subsequently be used as inputs for training a neural network classifier.

Estimate of the cutoff errors in the Ewald summation for dipolar systems

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Theoretical estimates for the cutoff errors in the Ewald summation method for dipolar systems are derived. Absolute errors in the total energy, forces and torques, both for the real and reciprocal space parts, are considered. The applicability of the estimates is tested and confirmed in several numerical examples. We demonstrate that these estimates can easily be used to determine the optimal parameters of the dipolar Ewald summation, in the sense that they minimize the computation time for a predefined, user-set accuracy.

Representation errors and retrievals in linear and nonlinear data assimilation

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This article shows how one can formulate the representation problem starting from Bayes’ theorem. The purpose of this article is to raise awareness of the formal solutions, so that approximations can be placed in a proper context. The representation errors appear in the likelihood, and the different possibilities for the representation of reality in model and observations are discussed, including nonlinear representation probability density functions. Specifically, the assumptions needed in the usual procedure to add a representation error covariance to the error covariance of the observations are discussed, and it is shown that, when several sub-grid observations are present, their mean still has a representation error; so-called ‘superobbing’ does not resolve the issue. Connection is made to the off-line or on-line retrieval problem, providing a new simple proof of the equivalence of assimilating linear retrievals and original observations. Furthermore, it is shown how nonlinear retrievals can be assimilated without loss of information. Finally, we discuss how errors in the observation operator model can be treated consistently in the Bayesian framework, connecting to previous work in this area.

Computationally efficient rule-based classification for continuous streaming data

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Advances in hardware and software technologies make it possible to capture streaming data. The area of Data Stream Mining (DSM) is concerned with the analysis of these vast amounts of data as they are generated in real time. Data stream classification is one of the most important DSM techniques, allowing previously unseen data instances to be classified. Unlike traditional classifiers for static data, data stream classifiers need to adapt to concept changes (concept drift) in the stream in real time in order to reflect the most recent concept in the data as accurately as possible. A recent addition to the data stream classifier toolbox is eRules, which induces and updates a set of expressive rules that can easily be interpreted by humans. However, like most rule-based data stream classifiers, eRules exhibits poor computational performance when confronted with continuous attributes. In this work, we propose an approach to deal with continuous data effectively and accurately in rule-based classifiers by using the Gaussian distribution as a heuristic for building rule terms on continuous attributes. We show, using eRules as an example, that incorporating our method for continuous attributes indeed speeds up the real-time rule induction process while maintaining a similar level of accuracy compared with the original eRules classifier. We term this new version of eRules with our approach G-eRules.
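One plausible reading of a Gaussian heuristic for continuous rule terms is sketched below. It is not the G-eRules algorithm, whose details the abstract does not give: it assumes that a rule term for a target class is an interval centred on the class mean of the attribute, with a half-width of `width` standard deviations. The names `gaussian_rule_term`, `covers` and `width` are hypothetical.

```python
from statistics import mean, pstdev


def gaussian_rule_term(values, labels, target, width=1.0):
    """Build an interval rule term `low <= x <= high` for one attribute.

    Assumption: the attribute values of the target class are summarised
    by a Gaussian, and the term covers mean +/- width * standard deviation.
    """
    own = [v for v, c in zip(values, labels) if c == target]
    if not own:
        raise ValueError("no instances of the target class")
    mu = mean(own)
    sigma = pstdev(own)
    return (mu - width * sigma, mu + width * sigma)


def covers(term, value):
    """Return True if the rule term matches the attribute value."""
    low, high = term
    return low <= value <= high


if __name__ == "__main__":
    values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
    labels = ["a", "a", "a", "b", "b", "b"]
    term = gaussian_rule_term(values, labels, target="a")
    print("rule term for class a:", term)
    print("covers 2.5:", covers(term, 2.5))
    print("covers 10.0:", covers(term, 10.0))
```

Summarising each class by a mean and a standard deviation avoids evaluating every candidate split point on the attribute, which is the kind of saving a Gaussian heuristic would be expected to provide.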

Towards a parallel computationally efficient approach to scaling up data stream classification

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Advances in hardware technologies make it possible to capture and process data in real time, and the resulting high-throughput data streams require novel data mining approaches. The research area of Data Stream Mining (DSM) is developing data mining algorithms that allow us to analyse these continuous streams of data in real time. The creation and real-time adaptation of classification models from data streams is one of the most challenging DSM tasks. Current classifiers for streaming data address this problem by using incremental learning algorithms. However, even though these algorithms are fast, they are challenged by high-velocity data streams, where data instances arrive at a fast rate. This is problematic if the application requires no delay, or only a very small delay, between changes in the patterns of the stream and absorption of these patterns by the classifier. Problems of scalability to Big Data of traditional data mining algorithms for static (non-streaming) datasets have been addressed through the development of parallel classifiers. However, there is very little work on the parallelisation of data stream classification techniques. In this paper we investigate K-Nearest Neighbours (KNN) as the basis for a real-time adaptive and parallel methodology for scalable data stream classification tasks.
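A KNN classifier can adapt to a changing stream by keeping only the most recent instances. The sketch below assumes a fixed-size sliding window as the adaptation mechanism and runs sequentially; the abstract does not describe the paper's actual adaptation or parallelisation scheme, and the class name `SlidingWindowKNN` is hypothetical.

```python
import math
from collections import Counter, deque


class SlidingWindowKNN:
    """KNN over the most recent `window` labelled instances of a stream."""

    def __init__(self, k=3, window=100):
        self.k = k
        # A bounded deque drops the oldest instance automatically,
        # so outdated concepts are forgotten as new data arrives.
        self.window = deque(maxlen=window)

    def learn(self, features, label):
        """Absorb one labelled instance from the stream."""
        self.window.append((tuple(features), label))

    def predict(self, features):
        """Majority vote among the k nearest stored instances."""
        if not self.window:
            raise ValueError("no training instances seen yet")
        nearest = sorted(
            self.window, key=lambda item: math.dist(item[0], features)
        )[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


if __name__ == "__main__":
    model = SlidingWindowKNN(k=3, window=4)
    for x in (0.0, 0.1, -0.1, 0.2):
        model.learn([x], "a")
    print("before drift:", model.predict([0.0]))
    for x in (0.0, 0.1, -0.1, 0.2):
        model.learn([x], "b")  # the concept near 0 changes from a to b
    print("after drift:", model.predict([0.0]))
```

The distance computations in `predict` are independent per stored instance, which is what makes KNN a natural candidate for parallelisation across partitions of the window.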

Classification in e-procurement

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Three coupled knowledge transfer partnerships used pattern recognition techniques to produce an e-procurement system which, the National Audit Office reports, could save the National Health Service £500 m per annum. An extension to the system, GreenInsight, allows the environmental impact of procurements to be assessed and savings made. Both systems require suitable products to be discovered and equivalent products recognised, for which classification is a key component. This paper describes the innovative work done for product classification, feature selection and reducing the impact of mislabelled data.

Application of complex extreme learning machine to multiclass classification problems with high dimensionality: a THz spectra classification problem

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We extend extreme learning machine (ELM) classifiers to complex Reproducing Kernel Hilbert Spaces (RKHS) where the input/output variables as well as the optimization variables are complex-valued. A new family of classifiers, called complex-valued ELM (CELM), suitable for complex-valued multiple-input–multiple-output processing is introduced. In the proposed method, the associated Lagrangian is computed using induced RKHS kernels, adopting a Wirtinger calculus approach formulated as a constrained optimization problem similarly to the conventional ELM classifier formulation. When training the CELM, the Karush–Kuhn–Tucker (KKT) theorem is used to solve the dual optimization problem, which consists of simultaneously satisfying the smallest training error and smallest norm of output weights criteria. The proposed formulation also addresses aspects of quaternary classification within a Clifford algebra context. For 2D complex-valued inputs, user-defined complex-coupled hyper-planes divide the classifier input space into four partitions. For 3D complex-valued inputs, the formulation generates three pairs of complex-coupled hyper-planes through orthogonal projections. The six hyper-planes then divide the 3D space into eight partitions. It is shown that the CELM problem formulation is equivalent to solving six real-valued ELM tasks, which are induced by projecting the chosen complex kernel across the different user-defined coordinate planes. A classification example of powdered samples on the basis of their terahertz spectral signatures is used to demonstrate the advantages of the CELM classifiers compared to their SVM counterparts. The proposed classifiers retain the advantages of their ELM counterparts, in that they can perform multiclass classification with lower computational complexity than SVM classifiers. Furthermore, because of their ability to perform classification tasks fast, the proposed formulations are of interest to real-time applications.

Finite sample weighting of recursive forecast errors

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper proposes and tests a new framework for weighting recursive out-of-sample prediction errors according to their corresponding levels of in-sample estimation uncertainty. In essence, we show how to use the maximum possible amount of information from the sample in the evaluation of the prediction accuracy, by commencing the forecasts at the earliest opportunity and weighting the prediction errors. Via a Monte Carlo study, we demonstrate that the proposed framework selects the correct model from a set of candidate models considerably more often than the existing standard approach when only a small sample is available. We also show that the proposed weighting approaches result in tests of equal predictive accuracy that have much better sizes than the standard approach. An application to an exchange rate dataset highlights relevant differences in the results of tests of predictive accuracy based on the standard approach versus the framework proposed in this paper.

The classification of involuntary musical imagery: the case for earworms

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Involuntary musical imagery (INMI) is the subject of much recent research interest. INMI covers a number of experience types such as musical obsessions and musical hallucinations. One type of experience has been called earworms, for which the literature provides a number of definitions. In this paper we consider the origins of the term earworm in the German-language literature and compare that usage with the English-language literature. We consider the published literature on earworms and conclude that there is merit in distinguishing between earworms and other types of involuntary musical imagery described in the scientific literature, e.g. musical hallucinations and musical obsessions. We also describe other experiences that can be considered under the term INMI. The aim of future research could be to ascertain similarities and differences between types of INMI with a view to refining the classification scheme proposed here.

An evaluation of the pedestrian classification in a multi-domain multi-modality setup

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The objective of this article is to study the problem of pedestrian classification across different light spectrum domains (visible and far-infrared (FIR)) and modalities (intensity, depth and motion). In recent years, there have been a number of approaches for classifying and detecting pedestrians in both FIR and visible images, but the methods are difficult to compare, because either the datasets are not publicly available or they do not offer a comparison between the two domains. Our two primary contributions are the following: (1) we propose a public dataset, named RIFIR, containing both FIR and visible images collected in an urban environment from a moving vehicle during daytime; and (2) we compare the state-of-the-art features in a multi-modality setup (intensity, depth and flow) in the far-infrared and visible domains. The experiments show that the feature families intensity self-similarity (ISS), local binary patterns (LBP), local gradient patterns (LGP) and histogram of oriented gradients (HOG), computed from the FIR and visible domains, are highly complementary, but their relative performance varies across different modalities. In our experiments, the FIR domain has proven superior to the visible one for the task of pedestrian classification, but the overall best results are obtained by a multi-domain multi-modality multi-feature fusion.

Beef carcass classification from slaughterhouses in Veria, Kilkis and Kavala (KREKA)

Relevance:

20.00% 20.00%

Publisher:
