Biblioteca Digital

3 resultados para Latent class analysis

em Massachusetts Institute of Technology

Theories of Comparative Analysis

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Comparative analysis is the problem of predicting how a system will react to perturbations in its parameters, and why. For example, comparative analysis could be asked to explain why the period of an oscillating spring/block system would increase if the mass of the block were larger. This thesis formalizes the task of comparative analysis and presents two solution techniques: differential qualitative (DQ) analysis and exaggeration. Both techniques solve many comparative analysis problems, providing explanations suitable for use by design systems, automated diagnosis, intelligent tutoring systems, and explanation based generalization. This thesis explains the theoretical basis for each technique, describes how they are implemented, and discusses the difference between the two. DQ analysis is sound; it never generates an incorrect answer to a comparative analysis question. Although exaggeration does occasionally produce misleading answers, it solves a larger class of problems than DQ analysis and frequently results in simpler explanations.

Veja mais

Improving Multi-class Text Classification with Naive Bayes

Relevância:

30.00% 30.00%

Publicador:

Resumo:

There are numerous text documents available in electronic form. More and more are becoming available every day. Such documents represent a massive amount of information that is easily accessible. Seeking value in this huge collection requires organization; much of the work of organizing documents can be automated through text classification. The accuracy and our understanding of such systems greatly influences their usefulness. In this paper, we seek 1) to advance the understanding of commonly used text classification techniques, and 2) through that understanding, improve the tools that are available for text classification. We begin by clarifying the assumptions made in the derivation of Naive Bayes, noting basic properties and proposing ways for its extension and improvement. Next, we investigate the quality of Naive Bayes parameter estimates and their impact on classification. Our analysis leads to a theorem which gives an explanation for the improvements that can be found in multiclass classification with Naive Bayes using Error-Correcting Output Codes. We use experimental evidence on two commonly-used data sets to exhibit an application of the theorem. Finally, we show fundamental flaws in a commonly-used feature selection algorithm and develop a statistics-based framework for text feature selection. Greater understanding of Naive Bayes and the properties of text allows us to make better use of it in text classification.

Veja mais

Sparse Correlation Kernel Analysis and Reconstruction

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a new paradigm for signal reconstruction and superresolution, Correlation Kernel Analysis (CKA), that is based on the selection of a sparse set of bases from a large dictionary of class- specific basis functions. The basis functions that we use are the correlation functions of the class of signals we are analyzing. To choose the appropriate features from this large dictionary, we use Support Vector Machine (SVM) regression and compare this to traditional Principal Component Analysis (PCA) for the tasks of signal reconstruction, superresolution, and compression. The testbed we use in this paper is a set of images of pedestrians. This paper also presents results of experiments in which we use a dictionary of multiscale basis functions and then use Basis Pursuit De-Noising to obtain a sparse, multiscale approximation of a signal. The results are analyzed and we conclude that 1) when used with a sparse representation technique, the correlation function is an effective kernel for image reconstruction and superresolution, 2) for image compression, PCA and SVM have different tradeoffs, depending on the particular metric that is used to evaluate the results, 3) in sparse representation techniques, L_1 is not a good proxy for the true measure of sparsity, L_0, and 4) the L_epsilon norm may be a better error metric for image reconstruction and compression than the L_2 norm, though the exact psychophysical metric should take into account high order structure in images.

Veja mais

3 resultados para Latent class analysis

em Massachusetts Institute of Technology

Filtro por publicador

Theories of Comparative Analysis

Improving Multi-class Text Classification with Naive Bayes

Sparse Correlation Kernel Analysis and Reconstruction