46 results for Pattern recognition multivariate SIMCA
Abstract:
PTS1 proteins are peroxisomal matrix proteins that have a well-conserved targeting motif at the C-terminal end. However, this motif is also present in many non-peroxisomal proteins, so predicting peroxisomal proteins involves differentiating fake PTS1 signals from actual ones. In this paper we report on the development of an SVM classifier with a separately trained logistic output function. The model uses an input window containing 12 consecutive residues at the C-terminus and the amino acid composition of the full sequence. The final model gives a Matthews Correlation Coefficient of 0.77, representing an increase of 54% compared with the well-known PeroxiP predictor. We test the model by applying it to several proteomes of eukaryotes for which there is no evidence of a peroxisome, producing a false positive rate of 0.088%.
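For readers unfamiliar with the Matthews Correlation Coefficient quoted above, the following is a minimal sketch of its standard definition from confusion-matrix counts. The function name `matthews_corrcoef_from_counts` and the example counts are illustrative assumptions, not taken from the paper.

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # A whole row or column of the confusion matrix is empty;
        # returning 0.0 here is a common convention, not the only one.
        return 0.0
    return numerator / denominator


# Hypothetical counts, chosen only to exercise the function.
print(matthews_corrcoef_from_counts(tp=5, tn=5, fp=0, fn=0))
print(matthews_corrcoef_from_counts(tp=1, tn=1, fp=1, fn=1))
```

Unlike plain accuracy, this measure uses all four cells of the confusion matrix, which is why it is often preferred when true PTS1 proteins are far rarer than the non-peroxisomal proteins that carry a similar motif.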
Abstract:
Traditionally, machine learning algorithms have been evaluated in applications where assumptions can be reliably made about class priors and/or misclassification costs. In this paper, we consider the case of imprecise environments, where little may be known about these factors and they may well vary significantly when the system is applied. Specifically, the use of precision-recall analysis is investigated and compared with better-known performance measures such as error rate and receiver operating characteristic (ROC) analysis. We argue that while ROC analysis is invariant to variations in class priors, this invariance in fact hides an important factor of the evaluation in imprecise environments. Therefore, we develop a generalised precision-recall analysis methodology in which variation due to prior class probabilities is incorporated into a multi-way analysis of variance (ANOVA). The increased sensitivity and reliability of this approach are demonstrated in a remote sensing application.
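The contrast between ROC and precision-recall analysis can be illustrated with a small sketch. It assumes a classifier whose true positive rate and false positive rate stay fixed, so its ROC operating point does not move, and applies Bayes' rule to show how precision changes with the positive-class prior. The function name `precision_at_prior` and the numeric rates are illustrative assumptions, not values from the paper.

```python
def precision_at_prior(tpr, fpr, positive_prior):
    """Precision implied by a fixed ROC point (tpr, fpr) at a given class prior.

    By Bayes' rule:
        precision = pi * TPR / (pi * TPR + (1 - pi) * FPR)
    where pi is the prior probability of the positive class.
    """
    predicted_positive = positive_prior * tpr + (1.0 - positive_prior) * fpr
    if predicted_positive == 0:
        # Nothing is ever predicted positive, so precision is undefined;
        # 0.0 is returned here purely as a convention for this sketch.
        return 0.0
    return positive_prior * tpr / predicted_positive


# The same ROC operating point evaluated under different class priors.
for prior in (0.5, 0.1, 0.01):
    print(prior, round(precision_at_prior(tpr=0.8, fpr=0.1, positive_prior=prior), 4))
```

Recall equals the true positive rate and is unaffected by the prior, whereas precision falls sharply as positives become rare. This dependence is what an ROC curve alone does not show.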
Abstract:
This paper presents an innovative approach for signature verification and forgery detection based on fuzzy modeling. The signature image is binarized, resized to a fixed-size window, and then thinned. The thinned image is partitioned into eight sub-images, called boxes, using the horizontal density approximation approach. Each box is then resized again and partitioned into twelve further sub-images using the uniform partitioning approach. The feature considered is the normalized vector angle (α) from each box. Each feature extracted from sample signatures gives rise to a fuzzy set. Since the choice of a proper fuzzification function is crucial for verification, we have devised a new fuzzification function with structural parameters, which is able to adapt to the variations in fuzzy sets. This function is employed to develop a complete forgery detection and verification system.
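The uniform partitioning and angle feature described above might be sketched as follows. The abstract does not give the grid layout, the reference point for the angle, or the normalization, so this sketch assumes a 3 x 4 grid for the twelve sub-images, measures each foreground pixel's angle from the bottom-left corner of its cell, and divides the mean angle by π/2. The function names `uniform_partition` and `mean_pixel_angle` are hypothetical, and the horizontal density approximation and the fuzzification function are not reproduced here.

```python
import math


def uniform_partition(image, rows, cols):
    """Split a binary image (list of equal-length rows) into rows * cols cells.

    Cells are returned in row-major order. Any remainder pixels that do not
    divide evenly are ignored in this simplified sketch.
    """
    height, width = len(image), len(image[0])
    cell_h, cell_w = height // rows, width // cols
    cells = []
    for r in range(rows):
        for c in range(cols):
            cell = [
                row[c * cell_w:(c + 1) * cell_w]
                for row in image[r * cell_h:(r + 1) * cell_h]
            ]
            cells.append(cell)
    return cells


def mean_pixel_angle(cell):
    """Mean angle of foreground pixels, scaled to the range [0, 1].

    Assumption: angles are measured from the cell's bottom-left corner to the
    centre of each foreground pixel, then divided by pi / 2.
    Returns 0.0 for a cell with no foreground pixels.
    """
    height = len(cell)
    angles = []
    for y, row in enumerate(cell):
        for x, pixel in enumerate(row):
            if pixel:
                dx = x + 0.5             # horizontal offset of the pixel centre
                dy = (height - y) - 0.5  # vertical offset, with y flipped upward
                angles.append(math.atan2(dy, dx))
    if not angles:
        return 0.0
    return (sum(angles) / len(angles)) / (math.pi / 2)


# Hypothetical 6 x 8 thinned box with one diagonal stroke in the top-left cell.
box = [[0] * 8 for _ in range(6)]
box[1][0] = 1
box[0][1] = 1
features = [mean_pixel_angle(cell) for cell in uniform_partition(box, rows=3, cols=4)]
print(len(features), features[0])
```

Each box would contribute twelve such values under these assumptions. The paper then maps each feature onto a fuzzy set, a step this sketch leaves out.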