96 resultados para Naive Bayes Classifier


Relevância:

10.00% 10.00%

Publicador:

Resumo:

We introduce a modified conditional logit model that takes account of uncertainty associated with mis-reporting in revealed preference experiments estimating willingness-to-pay (WTP). Like Hausman et al. [Journal of Econometrics (1988) Vol. 87, pp. 239-269], our model captures the extent and direction of uncertainty by respondents. Using a Bayesian methodology, we apply our model to a choice modelling (CM) data set examining UK consumer preferences for non-pesticide food. We compare the results of our model with the Hausman model. WTP estimates are produced for different groups of consumers and we find that modified estimates of WTP, that take account of mis-reporting, are substantially revised downwards. We find a significant proportion of respondents mis-reporting in favour of the non-pesticide option. Finally, with this data set, Bayes factors suggest that our model is preferred to the Hausman model.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Genetic polymorphisms in deoxyribonucleic acid coding regions may have a phenotypic effect on the carrier, e.g. by influencing susceptibility to disease. Detection of deleterious mutations via association studies is hampered by the large number of candidate sites; therefore methods are needed to narrow down the search to the most promising sites. For this, a possible approach is to use structural and sequence-based information of the encoded protein to predict whether a mutation at a particular site is likely to disrupt the functionality of the protein itself. We propose a hierarchical Bayesian multivariate adaptive regression spline (BMARS) model for supervised learning in this context and assess its predictive performance by using data from mutagenesis experiments on lac repressor and lysozyme proteins. In these experiments, about 12 amino-acid substitutions were performed at each native amino-acid position and the effect on protein functionality was assessed. The training data thus consist of repeated observations at each position, which the hierarchical framework is needed to account for. The model is trained on the lac repressor data and tested on the lysozyme mutations and vice versa. In particular, we show that the hierarchical BMARS model, by allowing for the clustered nature of the data, yields lower out-of-sample misclassification rates compared with both a BMARS and a frequen-tist MARS model, a support vector machine classifier and an optimally pruned classification tree.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon-known as heterotachy-can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The aim of the current study was to investigate expressive affect in children with Williams syndrome ( WS) in comparison to typically developing children in an experimental task and in spontaneous speech. Fourteen children with WS, 14 typically developing children matched to the WS group for receptive language ( LA) and 15 typically developing children matched to the WS groups for chronological age ( CA) were recruited. Affect was investigated using an experimental Output Affect task from the Profiling Elements of Prosodic Systems-Child version ( PEPS-C) battery, and by measuring pitch range and vowel durations from a spontaneous speech task. The children were also rated for level of emotional involvement by phonetically naive listeners. The WS group performed similarly to the LA and CA groups on the Output Affect task. With regard to vowel durations, the WS group was no different from the LA group; however both the WS and the LA groups were found to use significantly longer vowels than the CA group. The WS group differed significantly from both control groups on their range of pitch range and was perceived as being significantly more emotionally involved than the two control groups.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Binocular disparity, blur, and proximal cues drive convergence and accommodation. Disparity is considered to be the main vergence cue and blur the main accommodation cue. We have developed a remote haploscopic photorefractor to measure simultaneous vergence and accommodation objectively in a wide range of participants of all ages while fixating targets at between 0.3 and 2 m. By separating the three main near cues, we can explore their relative weighting in three-, two-, one-, and zero-cue conditions. Disparity can be manipulated by remote occlusion; blur cues manipulated by using either a Gabor patch or a detailed picture target; looming cues by either scaling or not scaling target size with distance. In normal orthophoric, emmetropic, symptom-free, naive visually mature participants, disparity was by far the most significant cue to both vergence and accommodation. Accommodation responses dropped dramatically if disparity was not available. Blur only had a clinically significant effect when disparity was absent. Proximity had very little effect. There was considerable interparticipant variation. We predict that relative weighting of near cue use is likely to vary between clinical groups and present some individual cases as examples. We are using this naturalistic tool to research strabismus, vergence and accommodation development, and emmetropization.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Numerous techniques exist which can be used for the task of behavioural analysis and recognition. Common amongst these are Bayesian networks and Hidden Markov Models. Although these techniques are extremely powerful and well developed, both have important limitations. By fusing these techniques together to form Bayes-Markov chains, the advantages of both techniques can be preserved, while reducing their limitations. The Bayes-Markov technique forms the basis of a common, flexible framework for supplementing Markov chains with additional features. This results in improved user output, and aids in the rapid development of flexible and efficient behaviour recognition systems.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents a new face verification algorithm based on Gabor wavelets and AdaBoost. In the algorithm, faces are represented by Gabor wavelet features generated by Gabor wavelet transform. Gabor wavelets with 5 scales and 8 orientations are chosen to form a family of Gabor wavelets. By convolving face images with these 40 Gabor wavelets, the original images are transformed into magnitude response images of Gabor wavelet features. The AdaBoost algorithm selects a small set of significant features from the pool of the Gabor wavelet features. Each feature is the basis for a weak classifier which is trained with face images taken from the XM2VTS database. The feature with the lowest classification error is selected in each iteration of the AdaBoost operation. We also address issues regarding computational costs in feature selection with AdaBoost. A support vector machine (SVM) is trained with examples of 20 features, and the results have shown a low false positive rate and a low classification error rate in face verification.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper, an improved stochastic discrimination (SD) is introduced to reduce the error rate of the standard SD in the context of multi-class classification problem. The learning procedure of the improved SD consists of two stages. In the first stage, a standard SD, but with shorter learning period is carried out to identify an important space where all the misclassified samples are located. In the second stage, the standard SD is modified by (i) restricting sampling in the important space; and (ii) introducing a new discriminant function for samples in the important space. It is shown by mathematical derivation that the new discriminant function has the same mean, but smaller variance than that of standard SD for samples in the important space. It is also analyzed that the smaller the variance of the discriminant function, the lower the error rate of the classifier. Consequently, the proposed improved SD improves standard SD by its capability of achieving higher classification accuracy. Illustrative examples axe provided to demonstrate the effectiveness of the proposed improved SD.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Utilising the expressive power of S-Expressions in Learning Classifier Systems often prohibitively increases the search space due to increased flexibility of the endcoding. This work shows that selection of appropriate S-Expression functions through domain knowledge improves scaling in problems, as expected. It is also known that simple alphabets perform well on relatively small sized problems in a domain, e.g. ternary alphabet in the 6, 11 and 20 bit MUX domain. Once fit ternary rules have been formed it was investigated whether higher order learning was possible and whether this staged learning facilitated selection of appropriate functions in complex alphabets, e.g. selection of S-Expression functions. This novel methodology is shown to provide compact results (135-MUX) and exhibits potential for scaling well (1034-MUX), but is only a small step towards introducing abstraction to LCS.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The paper describes the implementation of an offline, low-cost Brain Computer Interface (BCI) alternative to more expensive commercial models. Using inexpensive general purpose clinical EEG acquisition hardware (Truscan32, Deymed Diagnostic) as the base unit, a synchronisation module was constructed to allow the EEG hardware to be operated precisely in time to allow for recording of automatically time stamped EEG signals. The synchronising module allows the EEG recordings to be aligned in stimulus time locked fashion for further processing by the classifier to establish the class of the stimulus, sample by sample. This allows for the acquisition of signals from the subject’s brain for the goal oriented BCI application based on the oddball paradigm. An appropriate graphical user interface (GUI) was constructed and implemented as the method to elicit the required responses (in this case Event Related Potentials or ERPs) from the subject.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This work compares classification results of lactose, mandelic acid and dl-mandelic acid, obtained on the basis of their respective THz transients. The performance of three different pre-processing algorithms applied to the time-domain signatures obtained using a THz-transient spectrometer are contrasted by evaluating the classifier performance. A range of amplitudes of zero-mean white Gaussian noise are used to artificially degrade the signal-to-noise ratio of the time-domain signatures to generate the data sets that are presented to the classifier for both learning and validation purposes. This gradual degradation of interferograms by increasing the noise level is equivalent to performing measurements assuming a reduced integration time. Three signal processing algorithms were adopted for the evaluation of the complex insertion loss function of the samples under study; a) standard evaluation by ratioing the sample with the background spectra, b) a subspace identification algorithm and c) a novel wavelet-packet identification procedure. Within class and between class dispersion metrics are adopted for the three data sets. A discrimination metric evaluates how well the three classes can be distinguished within the frequency range 0. 1 - 1.0 THz using the above algorithms.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This work analyzes the use of linear discriminant models, multi-layer perceptron neural networks and wavelet networks for corporate financial distress prediction. Although simple and easy to interpret, linear models require statistical assumptions that may be unrealistic. Neural networks are able to discriminate patterns that are not linearly separable, but the large number of parameters involved in a neural model often causes generalization problems. Wavelet networks are classification models that implement nonlinear discriminant surfaces as the superposition of dilated and translated versions of a single "mother wavelet" function. In this paper, an algorithm is proposed to select dilation and translation parameters that yield a wavelet network classifier with good parsimony characteristics. The models are compared in a case study involving failed and continuing British firms in the period 1997-2000. Problems associated with over-parameterized neural networks are illustrated and the Optimal Brain Damage pruning technique is employed to obtain a parsimonious neural model. The results, supported by a re-sampling study, show that both neural and wavelet networks may be a valid alternative to classical linear discriminant models.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A new distributed spam filter system based on mobile agent is proposed in this paper. We introduce the application of mobile agent technology to the spam filter system. The system architecture, the work process, the pivotal technology of the distributed spam filter system based on mobile agent, and the Naive Bayesian filter method are described in detail. The experiment results indicate that the system can prevent spam emails effectively.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This work compares and contrasts results of classifying time-domain ECG signals with pathological conditions taken from the MITBIH arrhythmia database. Linear discriminant analysis and a multi-layer perceptron were used as classifiers. The neural network was trained by two different methods, namely back-propagation and a genetic algorithm. Converting the time-domain signal into the wavelet domain reduced the dimensionality of the problem at least 10-fold. This was achieved using wavelets from the db6 family as well as using adaptive wavelets generated using two different strategies. The wavelet transforms used in this study were limited to two decomposition levels. A neural network with evolved weights proved to be the best classifier with a maximum of 99.6% accuracy when optimised wavelet-transform ECG data wits presented to its input and 95.9% accuracy when the signals presented to its input were decomposed using db6 wavelets. The linear discriminant analysis achieved a maximum classification accuracy of 95.7% when presented with optimised and 95.5% with db6 wavelet coefficients. It is shown that the much simpler signal representation of a few wavelet coefficients obtained through an optimised discrete wavelet transform facilitates the classification of non-stationary time-variant signals task considerably. In addition, the results indicate that wavelet optimisation may improve the classification ability of a neural network. (c) 2005 Elsevier B.V. All rights reserved.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In rapid scan Fourier transform spectrometry, we show that the noise in the wavelet coefficients resulting from the filter bank decomposition of the complex insertion loss function is linearly related to the noise power in the sample interferogram by a noise amplification factor. By maximizing an objective function composed of the power of the wavelet coefficients divided by the noise amplification factor, optimal feature extraction in the wavelet domain is performed. The performance of a classifier based on the output of a filter bank is shown to be considerably better than that of an Euclidean distance classifier in the original spectral domain. An optimization procedure results in a further improvement of the wavelet classifier. The procedure is suitable for enhancing the contrast or classifying spectra acquired by either continuous wave or THz transient spectrometers as well as for increasing the dynamic range of THz imaging systems. (C) 2003 Optical Society of America.