3 results for Stable Vector-Bundles

in Deakin Research Online - Australia


Relevance:

40.00%

Publisher:

Abstract:

The support vector machine (SVM) is a popular classification method, well known for finding the maximum-margin hyperplane. Combining SVM with an l1-norm penalty enables it to perform feature selection and margin maximization simultaneously within a single framework. However, the l1-norm SVM is unstable in selecting features in the presence of correlated features. We propose a new method that increases the stability of the l1-norm SVM by encouraging similarity between feature weights based on feature correlations, which are captured via a feature covariance matrix. Our proposed method can capture both positive and negative correlations between features. We formulate the model as a convex optimization problem and propose a solution based on alternating minimization. Using both synthetic and real-world datasets, we show that our model achieves better stability and classification accuracy than several state-of-the-art regularized classification methods.
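To make the instability concrete, here is a minimal sketch of the standard l1-norm SVM objective (hinge loss plus an l1 penalty). It is not the authors' model: the abstract does not give the form of the covariance-based term, so that term is omitted. The function name `l1_svm_objective` and the parameter `lam` are hypothetical names chosen for illustration.

```python
def l1_svm_objective(w, b, X, y, lam):
    """Hinge loss plus l1 penalty for a linear classifier.

    w: list of feature weights, b: bias, X: list of feature rows,
    y: labels in {-1, +1}, lam: penalty strength (hypothetical name).
    """
    hinge = 0.0
    for row, label in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, row)) + b
        hinge += max(0.0, 1.0 - label * score)
    penalty = lam * sum(abs(wj) for wj in w)
    return hinge + penalty


# Toy data with two perfectly correlated (identical) features.
X = [[1.0, 1.0], [-1.0, -1.0]]
y = [1, -1]

# Splitting the weight across both features, or putting it all on one,
# yields the same objective value, so the objective cannot prefer either.
split = l1_svm_objective([0.5, 0.5], 0.0, X, y, lam=0.1)
single = l1_svm_objective([1.0, 0.0], 0.0, X, y, lam=0.1)
```

In this toy case `split` and `single` are equal, which is one way to see why a plain l1 penalty may pick among correlated features arbitrarily.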

Relevance:

30.00%

Publisher:

Abstract:

The performance of different information criteria, namely Akaike, corrected Akaike (AICC), Schwarz-Bayesian (SBC), and Hannan-Quinn, is investigated for choosing the optimal lag length in stable and unstable vector autoregressive (VAR) models, both when autoregressive conditional heteroscedasticity (ARCH) is present and when it is not. The investigation covers both large and small sample sizes. The Monte Carlo simulation results show that SBC has relatively better lag-choice accuracy in many situations. It is also generally the least sensitive to ARCH, regardless of the stability or instability of the VAR model, especially in large samples. These appealing properties make SBC the optimal criterion for choosing lag length in many situations, especially for financial data, which are usually characterized by occasional periods of high volatility. SBC also has the best forecasting ability in the majority of situations in which we vary sample size, stability, variance structure (ARCH or not), and forecast horizon (one period or five). Frequently, AICC also has good lag-choosing and forecasting properties. However, when ARCH is present, the five-period forecast performance of all criteria worsens in all situations.
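As a rough sketch of how lag selection by information criterion works, the code below uses the common textbook multivariate forms of the Akaike, Schwarz-Bayesian, and Hannan-Quinn criteria. The exact definitions used in the study, including AICC, are not given in the abstract, so AICC is left out. The function names and the log-determinant values are made up for illustration and are not from the study.

```python
import math


def lag_criteria(log_det_sigma, n_vars, lag, n_obs):
    """Return AIC, SBC and HQ for a VAR with the given lag order.

    log_det_sigma: log determinant of the estimated residual covariance
    matrix; n_vars: number of variables; n_obs: effective sample size.
    """
    n_params = lag * n_vars ** 2  # autoregressive coefficients only
    return {
        "AIC": log_det_sigma + 2.0 * n_params / n_obs,
        "SBC": log_det_sigma + math.log(n_obs) * n_params / n_obs,
        "HQ": log_det_sigma + 2.0 * math.log(math.log(n_obs)) * n_params / n_obs,
    }


def choose_lag(log_dets, n_vars, n_obs, criterion="SBC"):
    """Pick the lag order that minimises the chosen criterion.

    log_dets maps each candidate lag order to its log determinant.
    """
    return min(
        log_dets,
        key=lambda p: lag_criteria(log_dets[p], n_vars, p, n_obs)[criterion],
    )


# Made-up log determinants for a 2-variable VAR with 100 observations.
example_log_dets = {1: -2.00, 2: -2.10, 3: -2.12}
chosen = {
    c: choose_lag(example_log_dets, n_vars=2, n_obs=100, criterion=c)
    for c in ("AIC", "SBC", "HQ")
}
```

With these made-up numbers, AIC selects two lags while SBC and HQ select one, reflecting the heavier penalty that SBC places on additional parameters.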

Relevance:

30.00%

Publisher:

Abstract:

Modern healthcare is being reshaped by the growth of Electronic Medical Records (EMR). Recently, these records have been shown to be of great value for building clinical prediction models. In EMR data, patients' diseases and hospital interventions are captured through a set of diagnosis and procedure codes. These codes are usually represented in a tree form (e.g. the ICD-10 tree), and the codes within a tree branch may be highly correlated. These codes can be used as features to build a prediction model, and appropriate feature selection can inform a clinician about important risk factors for a disease. Traditional feature selection methods (e.g. Information Gain, T-test) consider each variable independently and usually end up with a long feature list. Recently, Lasso and related l1-penalty based feature selection methods have become popular due to their joint feature selection property. However, Lasso is known to select one feature at random from a group of correlated features. This hinders clinicians from arriving at a stable feature set, which is crucial for the clinical decision-making process. In this paper, we solve this problem by using a recently proposed Tree-Lasso model. Since the stability behavior of Tree-Lasso is not well understood, we study it and compare it with that of other feature selection methods. Using one synthetic and two real-world datasets (Cancer and Acute Myocardial Infarction), we show that Tree-Lasso based feature selection is significantly more stable than Lasso and comparable to other methods, e.g. Information Gain, ReliefF and T-test. We further show that, using different types of classifiers such as logistic regression, naive Bayes, support vector machines, decision trees and Random Forest, the classification performance of Tree-Lasso is comparable to Lasso and better than other methods. Our results have implications for identifying stable risk factors for many healthcare problems and can therefore potentially assist clinical decision making for accurate medical prognosis.
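One simple way to quantify feature-selection stability is the average pairwise Jaccard similarity between the feature sets selected on different resamples of the training data. The abstract does not say which stability measure the authors use, so the choice of Jaccard similarity is an assumption, and the function names and the example selections below are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature collections (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def selection_stability(feature_sets):
    """Average pairwise Jaccard similarity across selected feature sets."""
    pairs = list(combinations(feature_sets, 2))
    if not pairs:
        raise ValueError("need at least two feature sets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical selections from three resampled training sets, using
# ICD-10-style codes from a single tree branch purely as labels.
unstable = [{"I21.0"}, {"I21.1"}, {"I21.2"}]
stable = [
    {"I21.0", "I21.1"},
    {"I21.0", "I21.1"},
    {"I21.0", "I21.1", "I21.2"},
]

unstable_score = selection_stability(unstable)
stable_score = selection_stability(stable)
```

A method that picks a different sibling code on each resample scores 0, while one that keeps returning nearly the same group of codes scores close to 1.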