905 resultados para Support vector machine.


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The recent upsurge in microbial genome data has revealed that hemoglobin-like (HbL) proteins may be widely distributed among bacteria and that some organisms may carry more than one HbL encoding gene. However, the discovery of HbL proteins has been limited to a small number of bacteria only. This study describes the prediction of HbL proteins and their domain classification using a machine learning approach. Support vector machine (SVM) models were developed for predicting HbL proteins based upon amino acid composition (AC), dipeptide composition (DC), hybrid method (AC + DC), and position specific scoring matrix (PSSM). In addition, we introduce for the first time a new prediction method based on max to min amino acid residue (MM) profiles. The average accuracy, standard deviation (SD), false positive rate (FPR), confusion matrix, and receiver operating characteristic (ROC) were analyzed. We also compared the performance of our proposed models in homology detection databases. The performance of the different approaches was estimated using fivefold cross-validation techniques. Prediction accuracy was further investigated through confusion matrix and ROC curve analysis. All experimental results indicate that the proposed BacHbpred can be a perspective predictor for determination of HbL related proteins. BacHbpred, a web tool, has been developed for HbL prediction.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

During the petroleum well drilling operation many mechanical and hydraulic parameters are monitored by an instrumentation system installed in the rig called a mud-logging system. These sensors, distributed in the rig, monitor different operation parameters such as weight on the hook and drillstring rotation. These measurements are known as mud-logging records and allow the online following of all the drilling process with well monitoring purposes. However, in most of the cases, these data are stored without taking advantage of all their potential. On the other hand, to make use of the mud-logging data, an analysis and interpretationt is required. That is not an easy task because of the large volume of information involved. This paper presents a Support Vector Machine (SVM) used to automatically classify the drilling operation stages through the analysis of some mud-logging parameters. In order to validate the results of SVM technique, it was compared to a classification elaborated by a Petroleum Engineering expert. © 2006 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Due to the increased incidence of skin cancer, computational methods based on intelligent approaches have been developed to aid dermatologists in the diagnosis of skin lesions. This paper proposes a method to classify texture in images, since it is an important feature for the successfully identification of skin lesions. For this is defined a feature vector, with the fractal dimension of images through the box-counting method (BCM), which is used with a SVM to classify the texture of the lesions in to non-irregular or irregular. With the proposed solution, we could obtain an accuracy of 72.84%. © 2012 AISTI.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The use of mobile robots turns out to be interesting in activities where the action of human specialist is difficult or dangerous. Mobile robots are often used for the exploration in areas of difficult access, such as rescue operations and space missions, to avoid human experts exposition to risky situations. Mobile robots are also used in agriculture for planting tasks as well as for keeping the application of pesticides within minimal amounts to mitigate environmental pollution. In this paper we present the development of a system to control the navigation of an autonomous mobile robot through tracks in plantations. Track images are used to control robot direction by preprocessing them to extract image features. Such features are then submitted to a support vector machine in order to find out the most appropriate route. The overall goal of the project to which this work is connected is to develop a real time robot control system to be embedded into a hardware platform. In this paper we report the software implementation of a support vector machine, which so far presented around 93% accuracy in predicting the appropriate route. © 2012 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Delineating brain tumor boundaries from magnetic resonance images is an essential task for the analysis of brain cancer. We propose a fully automatic method for brain tissue segmentation, which combines Support Vector Machine classification using multispectral intensities and textures with subsequent hierarchical regularization based on Conditional Random Fields. The CRF regularization introduces spatial constraints to the powerful SVM classification, which assumes voxels to be independent from their neighbors. The approach first separates healthy and tumor tissue before both regions are subclassified into cerebrospinal fluid, white matter, gray matter and necrotic, active, edema region respectively in a novel hierarchical way. The hierarchical approach adds robustness and speed by allowing to apply different levels of regularization at different stages. The method is fast and tailored to standard clinical acquisition protocols. It was assessed on 10 multispectral patient datasets with results outperforming previous methods in terms of segmentation detail and computation times.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

El análisis de imágenes hiperespectrales permite obtener información con una gran resolución espectral: cientos de bandas repartidas desde el espectro infrarrojo hasta el ultravioleta. El uso de dichas imágenes está teniendo un gran impacto en el campo de la medicina y, en concreto, destaca su utilización en la detección de distintos tipos de cáncer. Dentro de este campo, uno de los principales problemas que existen actualmente es el análisis de dichas imágenes en tiempo real ya que, debido al gran volumen de datos que componen estas imágenes, la capacidad de cómputo requerida es muy elevada. Una de las principales líneas de investigación acerca de la reducción de dicho tiempo de procesado se basa en la idea de repartir su análisis en diversos núcleos trabajando en paralelo. En relación a esta línea de investigación, en el presente trabajo se desarrolla una librería para el lenguaje RVC – CAL – lenguaje que está especialmente pensado para aplicaciones multimedia y que permite realizar la paralelización de una manera intuitiva – donde se recogen las funciones necesarias para implementar el clasificador conocido como Support Vector Machine – SVM. Cabe mencionar que este trabajo complementa el realizado en [1] y [2] donde se desarrollaron las funciones necesarias para implementar una cadena de procesado que utiliza el método unmixing para procesar la imagen hiperespectral. En concreto, este trabajo se encuentra dividido en varias partes. La primera de ellas expone razonadamente los motivos que han llevado a comenzar este Trabajo de Investigación y los objetivos que se pretenden conseguir con él. Tras esto, se hace un amplio estudio del estado del arte actual y, en él, se explican tanto las imágenes hiperespectrales como sus métodos de procesado y, en concreto, se detallará el método que utiliza el clasificador SVM. Una vez expuesta la base teórica, nos centraremos en la explicación del método seguido para convertir una versión en Matlab del clasificador SVM optimizado para analizar imágenes hiperespectrales; un punto importante en este apartado es que se desarrolla la versión secuencial del algoritmo y se asientan las bases para una futura paralelización del clasificador. Tras explicar el método utilizado, se exponen los resultados obtenidos primero comparando ambas versiones y, posteriormente, analizando por etapas la versión adaptada al lenguaje RVC – CAL. Por último, se aportan una serie de conclusiones obtenidas tras analizar las dos versiones del clasificador SVM en cuanto a bondad de resultados y tiempos de procesado y se proponen una serie de posibles líneas de actuación futuras relacionadas con dichos resultados. ABSTRACT. Hyperspectral imaging allows us to collect high resolution spectral information: hundred of bands covering from infrared to ultraviolet spectrum. These images have had strong repercussions in the medical field; in particular, we must highlight its use in cancer detection. In this field, the main problem we have to deal with is the real time analysis, because these images have a great data volume and they require a high computational power. One of the main research lines that deals with this problem is related with the analysis of these images using several cores working at the same time. According to this investigation line, this document describes the development of a RVC – CAL library – this language has been widely used for working with multimedia applications and allows an optimized system parallelization –, which joins all the functions needed to implement the Support Vector Machine – SVM - classifier. This research complements the research conducted in [1] and [2] where the necessary functions to implement the unmixing method to analyze hyperspectral images were developed. The document is divided in several chapters. The first of them introduces the motivation of the Master Thesis and the main objectives to achieve. After that, we study the state of the art of some technologies related with this work, like hyperspectral images, their processing methods and, concretely, the SVM classifier. Once we have exposed the theoretical bases, we will explain the followed methodology to translate a Matlab version of the SVM classifier optimized to process an hyperspectral image to RVC – CAL language; one of the most important issues in this chapter is that a sequential implementation is developed and the bases of a future parallelization of the SVM classifier are set. At this point, we will expose the results obtained in the comparative between versions and then, the results of the different steps that compose the SVM in its RVC – CAL version. Finally, we will extract some conclusions related with algorithm behavior and time processing. In the same way, we propose some future research lines according to the results obtained in this document.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Promiscuous human leukocyte antigen (HLA) binding peptides are ideal targets for vaccine development. Existing computational models for prediction of promiscuous peptides used hidden Markov models and artificial neural networks as prediction algorithms. We report a system based on support vector machines that outperforms previously published methods. Preliminary testing showed that it can predict peptides binding to HLA-A2 and -A3 super-type molecules with excellent accuracy, even for molecules where no binding data are currently available.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background - The binding between peptide epitopes and major histocompatibility complex proteins (MHCs) is an important event in the cellular immune response. Accurate prediction of the binding between short peptides and the MHC molecules has long been a principal challenge for immunoinformatics. Recently, the modeling of MHC-peptide binding has come to emphasize quantitative predictions: instead of categorizing peptides as "binders" or "non-binders" or as "strong binders" and "weak binders", recent methods seek to make predictions about precise binding affinities. Results - We developed a quantitative support vector machine regression (SVR) approach, called SVRMHC, to model peptide-MHC binding affinities. As a non-linear method, SVRMHC was able to generate models that out-performed existing linear models, such as the "additive method". By adopting a new "11-factor encoding" scheme, SVRMHC takes into account similarities in the physicochemical properties of the amino acids constituting the input peptides. When applied to MHC-peptide binding data for three mouse class I MHC alleles, the SVRMHC models produced more accurate predictions than those produced previously. Furthermore, comparisons based on Receiver Operating Characteristic (ROC) analysis indicated that SVRMHC was able to out-perform several prominent methods in identifying strongly binding peptides. Conclusion - As a method with demonstrated performance in the quantitative modeling of MHC-peptide binding and in identifying strong binders, SVRMHC is a promising immunoinformatics tool with not inconsiderable future potential.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Formal grammars can used for describing complex repeatable structures such as DNA sequences. In this paper, we describe the structural composition of DNA sequences using a context-free stochastic L-grammar. L-grammars are a special class of parallel grammars that can model the growth of living organisms, e.g. plant development, and model the morphology of a variety of organisms. We believe that parallel grammars also can be used for modeling genetic mechanisms and sequences such as promoters. Promoters are short regulatory DNA sequences located upstream of a gene. Detection of promoters in DNA sequences is important for successful gene prediction. Promoters can be recognized by certain patterns that are conserved within a species, but there are many exceptions which makes the promoter recognition a complex problem. We replace the problem of promoter recognition by induction of context-free stochastic L-grammar rules, which are later used for the structural analysis of promoter sequences. L-grammar rules are derived automatically from the drosophila and vertebrate promoter datasets using a genetic programming technique and their fitness is evaluated using a Support Vector Machine (SVM) classifier. The artificial promoter sequences generated using the derived L- grammar rules are analyzed and compared with natural promoter sequences.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data fluctuation in multiple measurements of Laser Induced Breakdown Spectroscopy (LIBS) greatly affects the accuracy of quantitative analysis. A new LIBS quantitative analysis method based on the Robust Least Squares Support Vector Machine (RLS-SVM) regression model is proposed. The usual way to enhance the analysis accuracy is to improve the quality and consistency of the emission signal, such as by averaging the spectral signals or spectrum standardization over a number of laser shots. The proposed method focuses more on how to enhance the robustness of the quantitative analysis regression model. The proposed RLS-SVM regression model originates from the Weighted Least Squares Support Vector Machine (WLS-SVM) but has an improved segmented weighting function and residual error calculation according to the statistical distribution of measured spectral data. Through the improved segmented weighting function, the information on the spectral data in the normal distribution will be retained in the regression model while the information on the outliers will be restrained or removed. Copper elemental concentration analysis experiments of 16 certified standard brass samples were carried out. The average value of relative standard deviation obtained from the RLS-SVM model was 3.06% and the root mean square error was 1.537%. The experimental results showed that the proposed method achieved better prediction accuracy and better modeling robustness compared with the quantitative analysis methods based on Partial Least Squares (PLS) regression, standard Support Vector Machine (SVM) and WLS-SVM. It was also demonstrated that the improved weighting function had better comprehensive performance in model robustness and convergence speed, compared with the four known weighting functions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: DNA-binding proteins play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. Identification of DNA-binding proteins is one of the major challenges in the field of genome annotation. There have been several computational methods proposed in the literature to deal with the DNA-binding protein identification. However, most of them can't provide an invaluable knowledge base for our understanding of DNA-protein interactions. Results: We firstly presented a new protein sequence encoding method called PSSM Distance Transformation, and then constructed a DNA-binding protein identification method (SVM-PSSM-DT) by combining PSSM Distance Transformation with support vector machine (SVM). First, the PSSM profiles are generated by using the PSI-BLAST program to search the non-redundant (NR) database. Next, the PSSM profiles are transformed into uniform numeric representations appropriately by distance transformation scheme. Lastly, the resulting uniform numeric representations are inputted into a SVM classifier for prediction. Thus whether a sequence can bind to DNA or not can be determined. In benchmark test on 525 DNA-binding and 550 non DNA-binding proteins using jackknife validation, the present model achieved an ACC of 79.96%, MCC of 0.622 and AUC of 86.50%. This performance is considerably better than most of the existing state-of-the-art predictive methods. When tested on a recently constructed independent dataset PDB186, SVM-PSSM-DT also achieved the best performance with ACC of 80.00%, MCC of 0.647 and AUC of 87.40%, and outperformed some existing state-of-the-art methods. Conclusions: The experiment results demonstrate that PSSM Distance Transformation is an available protein sequence encoding method and SVM-PSSM-DT is a useful tool for identifying the DNA-binding proteins. A user-friendly web-server of SVM-PSSM-DT was constructed, which is freely accessible to the public at the web-site on http://bioinformatics.hitsz.edu.cn/PSSM-DT/.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We experimentally demonstrate 7-dB reduction of nonlinearity penalty in 40-Gb/s CO-OFDM at 2000-km using support vector machine regression-based equalization. Simulation in WDM-CO-OFDM shows up to 12-dB enhancement in Q-factor compared to linear equalization.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Due to the prevalence of unlabeled data, semi-supervised learning has drawn significant attention and has been found applicable in many real-world applications. In this paper, we present the so-called Budgeted Semi-supervised Sup-port Vector Machine (BS3VM), a method that leverages the excellent generalization capacity of kernel-based method with the adjacent and distributive information carried in a spectral graph for semi-supervised learning purpose. The fact that the optimization problem of BS3VM can be solved directly in the primal form makes it fast and efficient in memory usage. We validate the proposed method on several benchmark datasets to demonstrate its accuracy and efficiency. The experimental results show that BS3VM can scale up efficiently to the large-scale datasets where it yields a comparable classification accuracy while simultaneously achieving a significant computational speed-up compared with the baselines.