621 resultados para Dimensionality
Resumo:
We introduce and explore an approach to estimating statistical significance of classification accuracy, which is particularly useful in scientific applications of machine learning where high dimensionality of the data and the small number of training examples render most standard convergence bounds too loose to yield a meaningful guarantee of the generalization ability of the classifier. Instead, we estimate statistical significance of the observed classification accuracy, or the likelihood of observing such accuracy by chance due to spurious correlations of the high-dimensional data patterns with the class labels in the given training set. We adopt permutation testing, a non-parametric technique previously developed in classical statistics for hypothesis testing in the generative setting (i.e., comparing two probability distributions). We demonstrate the method on real examples from neuroimaging studies and DNA microarray analysis and suggest a theoretical analysis of the procedure that relates the asymptotic behavior of the test to the existing convergence bounds.
Resumo:
This report shows how knowledge about the visual world can be built into a shape representation in the form of a descriptive vocabulary making explicit the important geometrical relationships comprising objects' shapes. Two computational tools are offered: (1) Shapestokens are placed on a Scale-Space Blackboard, (2) Dimensionality-reduction captures deformation classes in configurations of tokens. Knowledge lies in the token types and deformation classes tailored to the constraints and regularities ofparticular shape worlds. A hierarchical shape vocabulary has been implemented supporting several later visual tasks in the two-dimensional shape domain of the dorsal fins of fishes.
Resumo:
R. Jensen and Q. Shen. Fuzzy-Rough Sets Assisted Attribute Selection. IEEE Transactions on Fuzzy Systems, vol. 15, no. 1, pp. 73-89, 2007.
Resumo:
Q. Shen. Rough feature selection for intelligent classifiers. LNCS Transactions on Rough Sets, 7:244-255, 2007.
Resumo:
R. Jensen and Q. Shen, 'Fuzzy-Rough Attribute Reduction with Application to Web Categorization,' Fuzzy Sets and Systems, vol. 141, no. 3, pp. 469-485, 2004.
Resumo:
Q. Shen and R. Jensen, 'Selecting Informative Features with Fuzzy-Rough Sets and its Application for Complex Systems Monitoring,' Pattern Recognition, vol. 37, no. 7, pp. 1351-1363, 2004.
Resumo:
R. Jensen and Q. Shen, 'Tolerance-based and Fuzzy-Rough Feature Selection,' Proceedings of the 16th International Conference on Fuzzy Systems (FUZZ-IEEE'07), pp. 877-882, 2007.
Resumo:
R. Jensen and Q. Shen, 'Webpage Classification with ACO-enhanced Fuzzy-Rough Feature Selection,' Proceedings of the Fifth International Conference on Rough Sets and Current Trends in Computing (RSCTC 2006), LNAI 4259, pp. 147-156, 2006.
Resumo:
R. Marti, C. Rubin, E. Denton and R. Zwiggelaar, '2D-3D correspondence in mammography', Cybernetics and Systems 35 (1), 85-105 (2004)
Resumo:
R. Jensen, Q. Shen, Data Reduction with Rough Sets, In: Encyclopedia of Data Warehousing and Mining - 2nd Edition, Vol. II, 2008.
Resumo:
Enot, D. P., Beckmann, M., Overy, D., Draper, J. (2006). Predicting interpretability of metabolome models based on behavior, putative identity, and biological relevance of explanatory signals. Proceedings of the National Academy of Sciences of the USA, 103(40), 14865-14870. Sponsorship: BBSRC RAE2008
Resumo:
We describe our work on shape-based image database search using the technique of modal matching. Modal matching employs a deformable shape decomposition that allows users to select example objects and have the computer efficiently sort the set of objects based on the similarity of their shape. Shapes are compared in terms of the types of nonrigid deformations (differences) that relate them. The modal decomposition provides deformation "control knobs" for flexible matching and thus allows for selecting weighted subsets of shape parameters that are deemed significant for a particular category or context. We demonstrate the utility of this approach for shape comparison in 2-D image databases; however, the general formulation is applicable to signals of any dimensionality.
Resumo:
Modal matching is a new method for establishing correspondences and computing canonical descriptions. The method is based on the idea of describing objects in terms of generalized symmetries, as defined by each object's eigenmodes. The resulting modal description is used for object recognition and categorization, where shape similarities are expressed as the amounts of modal deformation energy needed to align the two objects. In general, modes provide a global-to-local ordering of shape deformation and thus allow for selecting which types of deformations are used in object alignment and comparison. In contrast to previous techniques, which required correspondence to be computed with an initial or prototype shape, modal matching utilizes a new type of finite element formulation that allows for an object's eigenmodes to be computed directly from available image information. This improved formulation provides greater generality and accuracy, and is applicable to data of any dimensionality. Correspondence results with 2-D contour and point feature data are shown, and recognition experiments with 2-D images of hand tools and airplanes are described.
Resumo:
We consider the problem of performing topological optimizations of distributed hash tables. Such hash tables include Chord and Tapestry and are a popular building block for distributed applications. Optimizing topologies over one dimensional hash spaces is particularly difficult as the higher dimensionality of the underlying network makes close fits unlikely. Instead, current schemes are limited to heuristically performing local optimizations finding the best of small random set of peers. We propose a new class of topology optimizations based on the existence of clusters of close overlay members within the underlying network. By constructing additional overlays for each cluster, a significant portion of the search procedure can be performed within the local cluster with a corresponding reduction in the search time. Finally, we discuss the effects of these additional overlays on spatial locality and other load balancing scheme.
Resumo:
The goal of this work is to learn a parsimonious and informative representation for high-dimensional time series. Conceptually, this comprises two distinct yet tightly coupled tasks: learning a low-dimensional manifold and modeling the dynamical process. These two tasks have a complementary relationship as the temporal constraints provide valuable neighborhood information for dimensionality reduction and conversely, the low-dimensional space allows dynamics to be learnt efficiently. Solving these two tasks simultaneously allows important information to be exchanged mutually. If nonlinear models are required to capture the rich complexity of time series, then the learning problem becomes harder as the nonlinearities in both tasks are coupled. The proposed solution approximates the nonlinear manifold and dynamics using piecewise linear models. The interactions among the linear models are captured in a graphical model. By exploiting the model structure, efficient inference and learning algorithms are obtained without oversimplifying the model of the underlying dynamical process. Evaluation of the proposed framework with competing approaches is conducted in three sets of experiments: dimensionality reduction and reconstruction using synthetic time series, video synthesis using a dynamic texture database, and human motion synthesis, classification and tracking on a benchmark data set. In all experiments, the proposed approach provides superior performance.