40 results for Support Vector Machines and Naive Bayes Classifier


Relevance: 100.00%

Abstract:

An efficient data-based modeling algorithm for nonlinear system identification is introduced for radial basis function (RBF) neural networks, with the aim of maximizing generalization capability based on the concept of leave-one-out (LOO) cross validation. Each RBF kernel has its own width parameter, and the basic idea is to optimize the multiple pairs of regularization parameters and kernel widths, each pair associated with one kernel, one at a time within the orthogonal forward regression (OFR) procedure. Each OFR step thus consists of one model term selection based on the LOO mean square error (LOOMSE), followed by optimization of the associated kernel width and regularization parameter, also based on the LOOMSE. Since the same LOOMSE is adopted for model selection as in our previous state-of-the-art local regularization assisted orthogonal least squares (LROLS) algorithm, the proposed new OFR algorithm is likewise capable of producing a very sparse RBF model with excellent generalization performance. Unlike the LROLS algorithm, which requires an additional iterative loop to optimize the regularization parameters and a further procedure to optimize the kernel width, the new OFR algorithm optimizes both the kernel widths and the regularization parameters within a single OFR procedure, so the required computational complexity is dramatically reduced. Nonlinear system identification examples demonstrate the effectiveness of the new approach in comparison with the well-known support vector machine and least absolute shrinkage and selection operator methods, as well as the LROLS algorithm.
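The LOOMSE at the heart of such a procedure can be computed in closed form from a single fit. A minimal numpy sketch (illustrative only, not the authors' OFR implementation) scores candidate kernel widths for a Gaussian RBF model by their leave-one-out error:

```python
import numpy as np

def loomse(X, y, lam=1e-3):
    """Closed-form leave-one-out MSE for a ridge-regularised linear model."""
    n, m = X.shape
    A = X.T @ X + lam * np.eye(m)
    P = X @ np.linalg.solve(A, X.T)            # hat (smoothing) matrix
    resid = y - P @ y                          # ordinary residuals
    h = np.diag(P)                             # leverages, strictly below 1
    return np.mean((resid / (1.0 - h)) ** 2)   # PRESS statistic / n

def rbf_design(x, centres, width):
    """Gaussian RBF design matrix, one kernel per centre."""
    return np.exp(-((x[:, None] - centres[None, :]) ** 2) / (2.0 * width ** 2))

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 40)
y = np.sin(3.0 * x) + 0.05 * rng.standard_normal(40)

# Crude stand-in for the per-kernel width optimisation inside an OFR step:
# score each candidate width by the LOOMSE and keep the best.
widths = [0.05, 0.1, 0.2, 0.5, 1.0]
scores = [loomse(rbf_design(x, x, w), y) for w in widths]
best_width = widths[int(np.argmin(scores))]
```

The point of the closed form is that no explicit n-fold refitting loop is needed, which is what makes LOO-based selection affordable inside a greedy forward procedure.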

Relevance: 100.00%

Abstract:

This paper presents a new face verification algorithm based on Gabor wavelets and AdaBoost. In the algorithm, faces are represented by Gabor wavelet features generated by the Gabor wavelet transform. Gabor wavelets with 5 scales and 8 orientations are chosen to form a family of Gabor wavelets. Convolving face images with these 40 Gabor wavelets transforms the original images into magnitude response images of Gabor wavelet features. The AdaBoost algorithm selects a small set of significant features from the pool of Gabor wavelet features. Each feature is the basis for a weak classifier trained with face images taken from the XM2VTS database. The feature with the lowest classification error is selected in each iteration of the AdaBoost operation. We also address issues regarding computational costs in feature selection with AdaBoost. A support vector machine (SVM) is trained with examples of 20 features, and the results show a low false positive rate and a low classification error rate in face verification.
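One common parameterization of such a 5-scale, 8-orientation Gabor family can be sketched as follows; the kernel size, envelope width and frequency schedule here are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def gabor_kernel(size, scale, theta, sigma=2.0):
    """Complex Gabor kernel: oriented sinusoid under a Gaussian envelope."""
    half = size // 2
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    xr = xx * np.cos(theta) + yy * np.sin(theta)   # coordinate along orientation
    freq = np.pi / (2.0 * np.sqrt(2.0) ** scale)   # frequency shrinks with scale
    envelope = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(1j * freq * xr)       # complex carrier keeps phase

# 5 scales x 8 orientations -> a 40-kernel family, as in the abstract
bank = [gabor_kernel(15, s, o * np.pi / 8.0) for s in range(5) for o in range(8)]

# Magnitude response of one kernel applied to a toy 15x15 image patch
patch = np.random.default_rng(1).random((15, 15))
response = abs(np.vdot(bank[0], patch))            # vdot conjugates and flattens
```

In a full pipeline each image would be convolved with all 40 kernels and the magnitudes at every pixel pooled into the feature set that AdaBoost then prunes.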

Relevance: 100.00%

Abstract:

In Uganda, control of vector-borne diseases mainly takes the form of vector control and chemotherapy. There have been reports that acaricides are being misused in the pastoralist systems in Uganda, a concern stemming from the belief among scientists that intensive acaricide application is uneconomical and unsustainable, particularly in indigenous cattle. The objective of this study was to investigate the strategies, rationale and effectiveness of vector-borne disease control by pastoralists. To carry out these investigations systematically, a combination of qualitative and quantitative research methods was used in both the collection and the analysis of data. Cattle keepers were found to control tick-borne diseases (TBDs) mainly through spraying, in contrast with the control of trypanosomosis, for which the main method was chemotherapy. The majority of herders applied acaricides weekly and used an acaricide of lower strength than recommended by the manufacturers. They used very little acaricide wash, and spraying was preferred to dipping. Furthermore, pastoralists either treated sick animals themselves or did nothing at all, rather than using veterinary personnel. Oxytetracycline (OTC) was the drug commonly used in the treatment of TBDs. Nevertheless, although pastoralists may not have been following recommended practices in their control of ticks and tick-borne diseases, they were neither wasteful nor uneconomical and their methods appeared to be effective. Trypanosomosis was not a problem in either Sembabule or Mbarara district. Those who used trypanocides were found to use more drugs than necessary.

Relevance: 100.00%

Abstract:

Genetic polymorphisms in deoxyribonucleic acid coding regions may have a phenotypic effect on the carrier, e.g. by influencing susceptibility to disease. Detection of deleterious mutations via association studies is hampered by the large number of candidate sites; methods are therefore needed to narrow the search to the most promising sites. One possible approach is to use structural and sequence-based information on the encoded protein to predict whether a mutation at a particular site is likely to disrupt the functionality of the protein itself. We propose a hierarchical Bayesian multivariate adaptive regression spline (BMARS) model for supervised learning in this context and assess its predictive performance using data from mutagenesis experiments on lac repressor and lysozyme proteins. In these experiments, about 12 amino-acid substitutions were performed at each native amino-acid position and the effect on protein functionality was assessed. The training data thus consist of repeated observations at each position, which the hierarchical framework is needed to account for. The model is trained on the lac repressor data and tested on the lysozyme mutations, and vice versa. In particular, we show that the hierarchical BMARS model, by allowing for the clustered nature of the data, yields lower out-of-sample misclassification rates than both a BMARS and a frequentist MARS model, a support vector machine classifier and an optimally pruned classification tree.
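For readers unfamiliar with MARS-type models, the basis consists of mirrored hinge functions max(0, x - t) and max(0, t - x); BMARS places priors over the number and placement of such terms. A toy least-squares fit with a single fixed knot (purely illustrative, with synthetic data) looks like:

```python
import numpy as np

def hinge_basis(x, knot):
    """Design matrix: intercept plus the two mirrored hinge terms at one knot."""
    return np.column_stack([np.ones_like(x),
                            np.maximum(0.0, x - knot),
                            np.maximum(0.0, knot - x)])

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 50)
y = np.abs(x - 0.4) + 0.01 * rng.standard_normal(50)   # target with a kink at 0.4

B = hinge_basis(x, 0.4)                                 # knot placed at the kink
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
mse = np.mean((y - B @ coef) ** 2)
```

A MARS fit searches over knot locations and products of such terms; the Bayesian version averages over them, which is what the hierarchical structure in the paper extends to clustered observations.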

Relevance: 100.00%

Abstract:

This paper presents an efficient construction algorithm for obtaining sparse kernel density estimates based on a regression approach that directly optimizes model generalization capability. Computational efficiency of the density construction is ensured using an orthogonal forward regression, and the algorithm incrementally minimizes the leave-one-out test score. A local regularization method is incorporated naturally into the density construction process to further enforce sparsity. An additional advantage of the proposed algorithm is that it is fully automatic: the user is not required to specify any criterion to terminate the density construction procedure. This is in contrast to an existing state-of-the-art kernel density estimation method using the support vector machine (SVM), where the user is required to specify a critical algorithm parameter. Several examples demonstrate the ability of the proposed algorithm to effectively construct a very sparse kernel density estimate with accuracy comparable to that of the full-sample optimized Parzen window density estimate. Our experimental results also demonstrate that the proposed algorithm compares favorably with the SVM method, in terms of both test accuracy and sparsity, for constructing kernel density estimates.
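The Parzen window baseline referred to above places one Gaussian kernel on every sample; the paper's contribution is selecting a sparse subset of those kernels. A brief numpy sketch of the baseline itself (synthetic data, illustrative bandwidth):

```python
import numpy as np

def parzen(grid, samples, h):
    """Gaussian Parzen-window density estimate evaluated on a grid."""
    z = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z ** 2).mean(axis=1) / (h * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(3)
samples = rng.standard_normal(500)          # draws from N(0, 1)
grid = np.linspace(-4.0, 4.0, 161)
dens = parzen(grid, samples, h=0.3)

mass = np.trapz(dens, grid)                 # should integrate to roughly 1
```

The cost and the model size both scale with the number of samples here, which is exactly what a sparse kernel density estimate avoids by keeping only a few informative kernels.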

Relevance: 100.00%

Abstract:

Purpose – Expectations of future market conditions are acknowledged to be crucial for the development decision and hence for shaping the built environment. The purpose of this paper is to study the central London office market from 1987 to 2009 and test for evidence of rational, adaptive and naive expectations. Design/methodology/approach – Two parallel approaches are applied to test for rational versus adaptive/naive expectations: a vector auto-regressive (VAR) approach with Granger causality tests, and a recursive OLS regression with one-step forecasts. Findings – Applying the VAR models and the recursive OLS regression, the authors do not find evidence of adaptive or naive expectations among developers. Although the magnitude of the errors and the length of the time lags between market signal and construction starts vary over time and across development cycles, the results confirm that developer decisions are explained, to a large extent, by contemporaneous and historic conditions in both the City and the West End; this is more likely to stem from the lengthy design, financing and planning permission processes than from adaptive or naive expectations. Research limitations/implications – More generally, the results suggest that real estate cycles are largely generated endogenously rather than being the result of large demand shocks and/or irrational behaviour. Practical implications – Developers may be able to generate excess profits by exploiting market inefficiencies, but this may be hindered in practice by the long periods necessary for planning and construction of the asset. Originality/value – This paper focuses the scholarly debate on real estate cycles on the role of expectations. It is also one of very few spatially disaggregate studies of the subject matter.
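The recursive-OLS idea can be illustrated on simulated data: re-fit a simple autoregression on an expanding window and record one-step-ahead forecast errors, which should be unbiased under rational expectations. This sketch uses a synthetic AR(1) series, not the London office-market data:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 120
y = np.zeros(n)
for t in range(1, n):                        # simulated AR(1) market signal
    y[t] = 0.8 * y[t - 1] + rng.standard_normal()

errors = []
for t in range(40, n - 1):                   # expanding estimation window
    X = np.column_stack([np.ones(t - 1), y[:t - 1]])   # regress y_s on y_{s-1}
    beta, *_ = np.linalg.lstsq(X, y[1:t], rcond=None)
    forecast = beta[0] + beta[1] * y[t]      # one-step-ahead forecast
    errors.append(y[t + 1] - forecast)

bias = float(np.mean(errors))                # near 0 if forecasts are unbiased
```

A systematic, predictable bias in these errors would instead point toward adaptive or naive expectation formation.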

Relevance: 100.00%

Abstract:

This letter presents an effective approach for selecting appropriate terrain modeling methods when forming a digital elevation model (DEM), achieving a balance between modeling accuracy and modeling speed. A terrain complexity index is defined to represent a terrain's complexity. A support vector machine (SVM) classifies terrain surfaces as either complex or moderate based on this index together with the terrain elevation range. The classification result recommends a terrain modeling method for a given data set in accordance with its required modeling accuracy. Sample terrain data from the lunar surface are used to construct an experimental data set. The results show that the terrain complexity index properly reflects the terrain complexity, and that the SVM classifier derived from both the terrain complexity index and the terrain elevation range is more effective and generic than one designed from either feature alone. The statistical results show that the average classification accuracy of the SVMs is about 84.3% ± 0.9% across terrain types (complex or moderate). For various ratios of complex to moderate terrain in a selected data set, the DEM modeling speed increases by up to 19.5% at a given DEM accuracy.
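As a rough stand-in for the letter's trained classifier, a linear SVM over the two features named above (complexity index and elevation range) can be fitted by hinge-loss subgradient descent; the data, labels and hyperparameters below are synthetic and illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200
complexity = rng.random(n)                   # synthetic terrain complexity index
elev_range = rng.random(n)                   # synthetic normalised elevation range
X = np.column_stack([complexity, elev_range])
y = np.where(complexity + elev_range > 1.0, 1.0, -1.0)   # +1 complex, -1 moderate

# Hinge-loss subgradient descent for a linear SVM:
# minimise lam/2 * ||w||^2 + mean(max(0, 1 - y * (X w + b)))
w, b = np.zeros(2), 0.0
lam, lr = 0.01, 0.1
for _ in range(500):
    margins = y * (X @ w + b)
    viol = margins < 1.0                     # margin violators drive the update
    w -= lr * (lam * w - (y[viol, None] * X[viol]).sum(axis=0) / n)
    b -= lr * (-y[viol].sum() / n)

accuracy = np.mean(np.sign(X @ w + b) == y)  # training accuracy on the toy set
```

In practice the letter uses a kernel SVM trained on real lunar terrain samples; the point here is only how two scalar features can feed a margin classifier that routes each data set to a modeling method.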

Relevance: 100.00%

Abstract:

Background: There are compelling economic and environmental reasons to reduce our reliance on inorganic phosphate (Pi) fertilisers. Better management of Pi fertiliser applications is one option to improve the efficiency of Pi fertiliser use whilst maintaining crop yields. Application rates of Pi fertilisers are traditionally determined from analyses of soil or plant tissues. Alternatively, diagnostic genes with altered expression under Pi-limiting conditions, suggesting a physiological requirement for Pi fertilisation, could be used to manage Pi fertiliser applications, and might be more precise than indirect measurements of soil or tissue samples. Results: We grew potato (Solanum tuberosum L.) plants hydroponically, under glasshouse conditions, to control their nutrient status accurately. Samples of total leaf RNA taken periodically after Pi was removed from the nutrient solution were labelled and hybridised to potato oligonucleotide arrays. A total of 1,659 genes were significantly differentially expressed following Pi withdrawal. These included genes encoding proteins involved in lipid, protein and carbohydrate metabolism, characteristic of Pi-deficient leaves, and suggested potential novel roles for genes encoding patatin-like proteins in potato. The array data were analysed using a support vector machine algorithm to identify groups of genes that could predict the Pi status of the crop. These groups of diagnostic genes were tested using field-grown potatoes that had either been fertilised or left unfertilised. A group of 200 genes could correctly predict the Pi status of field-grown potatoes. Conclusions: This paper provides a proof-of-concept demonstration of using microarrays and class prediction tools to predict the Pi status of a field-grown potato crop. There is potential to develop this technology for other biotic and abiotic stresses in field-grown crops.
Ultimately, a better understanding of crop stresses may improve our management of the crop, improving the sustainability of agriculture.

Relevance: 100.00%

Abstract:

This paper presents a novel approach to the automatic classification of very large data sets composed of terahertz pulse transient signals, highlighting their potential use in biochemical, biomedical, pharmaceutical and security applications. Two different types of THz spectra are considered in the classification process. First, a binary classification study of poly-A and poly-C ribonucleic acid samples is performed. This is then contrasted with a difficult multi-class classification problem of spectra from six different powder samples which, although fairly indistinguishable in the optical spectrum, possess a few discernible spectral features in the terahertz part of the spectrum. Classification is performed using a complex-valued extreme learning machine algorithm that takes into account features in both the amplitude and the phase of the recorded spectra. Classification speed and accuracy are contrasted with those achieved using a support vector machine classifier. The study systematically compares the classifier performance achieved after adopting different Gaussian kernels when separating amplitude and phase signatures. The two signatures are presented as feature vectors for both training and testing purposes. The study confirms the utility of complex-valued extreme learning machine algorithms for classification of the very large data sets generated by current terahertz imaging spectrometers. The classifier can take into consideration heterogeneous layers within an object, as would be required in a tomographic setting, and is sufficiently robust to detect patterns hidden inside noisy terahertz data sets. The proposed study opens up the opportunity to establish complex-valued extreme learning machine algorithms as new chemometric tools that will assist the wider proliferation of terahertz sensing technology for chemical sensing, quality control, security screening and clinical diagnosis.
Furthermore, the proposed algorithm should also be very useful in other applications requiring the classification of very large datasets.
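The extreme learning machine recipe draws the hidden-layer weights at random and solves only the output layer by least squares; in a complex-valued variant the activation can act on real and imaginary parts separately so that phase information survives. A toy sketch with synthetic "spectra" (the sizes, split-complex activation and labels are illustrative assumptions, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(6)
n, d, hidden = 200, 8, 50

# Toy complex features: real part stands in for amplitude structure,
# imaginary part for phase structure; labels follow a simple linear rule.
X = rng.standard_normal((n, d)) + 1j * rng.standard_normal((n, d))
y = np.where(X[:, 0].real + X[:, 1].imag > 0.0, 1.0, -1.0)

# ELM: random fixed hidden layer, output weights by complex least squares.
W = (rng.standard_normal((d, hidden)) + 1j * rng.standard_normal((d, hidden))) / np.sqrt(d)
Z = X @ W
H = np.tanh(Z.real) + 1j * np.tanh(Z.imag)   # split-complex activation keeps phase
beta, *_ = np.linalg.lstsq(H, y.astype(complex), rcond=None)
train_acc = np.mean(np.sign((H @ beta).real) == y)
```

Because nothing is iterated, training reduces to a single linear solve, which is why the approach scales to the very large spectral data sets discussed above.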

Relevance: 100.00%

Abstract:

As part of the broader prevention and social inclusion agenda, concepts of risk, resilience, and protective factors inform a range of U.K. Government initiatives targeted towards children and young people in England, including Sure Start, the Children's Fund, On Track, and Connexions. This paper is based on findings from a large qualitative dataset of interviews conducted with children and their parents or caregivers who accessed Children's Fund services as part of the National Evaluation of the Children's Fund research.1 Drawing on the notion of young people's trajectories, the paper discusses how Children's Fund services support children's and young people's pathways towards greater social inclusion. While many services help to build resilience and protective factors for individual children, the paper considers the extent to which services also promote resilience within the domains of the family, school, and wider community and, hence, attempt to tackle the complex, multi-dimensional aspects of social exclusion affecting children, young people, and their families.