962 results for Classification Tree Pruning
Abstract:
Binary image classification is a problem that has received much attention in recent years. In this paper we evaluate a selection of popular techniques in an effort to find a feature set/classifier combination which generalizes well to full resolution image data. We then apply that system to images at one-half through one-sixteenth resolution, and consider the corresponding error rates. In addition, we observe how generalization performance depends on the number of training images, and lastly, compare the system's best error rates to those of a human performing an identical classification task given the same set of test images.
Abstract:
Trees are a common way of organizing large amounts of information by placing items with similar characteristics near one another in the tree. We introduce a classification problem where a given tree structure gives us information on the best way to label nearby elements. We suggest there are many practical problems that fall under this domain. We propose a way to map the classification problem onto a standard Bayesian inference problem. We also give a fast, specialized inference algorithm that incrementally updates relevant probabilities. We apply this algorithm to web-classification problems and show that our algorithm empirically works well.
Abstract:
We introduce and explore an approach to estimating statistical significance of classification accuracy, which is particularly useful in scientific applications of machine learning where high dimensionality of the data and the small number of training examples render most standard convergence bounds too loose to yield a meaningful guarantee of the generalization ability of the classifier. Instead, we estimate statistical significance of the observed classification accuracy, or the likelihood of observing such accuracy by chance due to spurious correlations of the high-dimensional data patterns with the class labels in the given training set. We adopt permutation testing, a non-parametric technique previously developed in classical statistics for hypothesis testing in the generative setting (i.e., comparing two probability distributions). We demonstrate the method on real examples from neuroimaging studies and DNA microarray analysis and suggest a theoretical analysis of the procedure that relates the asymptotic behavior of the test to the existing convergence bounds.
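The permutation-testing idea described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the nearest-centroid classifier, the leave-one-out accuracy estimate, and the function names `loo_accuracy` and `permutation_p_value` are all assumptions chosen to keep the example self-contained.

```python
import random


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    The classifier is a stand-in (an assumption); any classifier and any
    cross-validation scheme could be substituted here.
    """
    n, dim = len(X), len(X[0])
    correct = 0
    for i in range(n):
        # Accumulate per-class sums over all samples except the held-out one.
        sums, counts = {}, {}
        for j in range(n):
            if j == i:
                continue
            s = sums.setdefault(y[j], [0.0] * dim)
            for d in range(dim):
                s[d] += X[j][d]
            counts[y[j]] = counts.get(y[j], 0) + 1
        # Predict the class whose centroid is closest (squared Euclidean distance).
        best_label, best_dist = None, float("inf")
        for label, s in sums.items():
            dist = sum((X[i][d] - s[d] / counts[label]) ** 2 for d in range(dim))
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / n


def permutation_p_value(X, y, n_permutations=200, seed=0):
    """Estimate how likely the observed accuracy is under shuffled labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(X, y)
    labels = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # breaks any real link between data and labels
        if loo_accuracy(X, labels) >= observed:
            hits += 1
    # The +1 terms count the observed labeling itself, so p is never exactly 0.
    return observed, (hits + 1) / (n_permutations + 1)
```

A small p-value suggests that the observed accuracy is unlikely to arise from chance correlations between the features and the labels. The smallest p-value this sketch can report is 1 / (n_permutations + 1), so the number of permutations bounds the resolution of the test.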
Abstract:
Terms such as “Engineering Systems”, “System of systems” and others have come into greater use over the past decade to denote systems of importance but with implied higher complexity than the term “systems” alone conveys. This paper searches for a useful taxonomy or classification scheme for complex systems. There are two aspects to this problem: 1) distinguishing between Engineering Systems (the term we use) and other systems, and 2) differentiating among Engineering Systems. Engineering Systems are found to be differentiated from other complex systems by being human-designed and having both significant human complexity and significant technical complexity. For differentiating among Engineering Systems, functional type is suggested as the most useful attribute for classification. Information, energy, value and mass acted upon by various processes are the foundation concepts underlying the technical types.
Abstract:
The identification of subject-specific traits extracted from patterns of brain activity still represents an important challenge. The need to detect distinctive brain features, which is relevant for biometric and brain computer interface systems, has also been emphasized in monitoring the effect of clinical treatments and in evaluating the progression of brain disorders. Graph theory and network science tools have revealed fundamental mechanisms of functional brain organization in resting-state M/EEG analysis. Nevertheless, it is still not clearly understood how several methodological aspects may bias the topology of the reconstructed functional networks. In this context, the literature shows inconsistency in the chosen length of the selected epochs, impeding a meaningful comparison between results from different studies. In this study we propose an approach which aims to investigate the existence of a distinctive functional core (sub-network) using an unbiased reconstruction of network topology. Brain signals from a public and freely available EEG dataset were analyzed using a phase synchronization based measure, minimum spanning tree and k-core decomposition. The analysis was performed for each classical brain rhythm separately. Furthermore, we aim to provide a network approach insensitive to the effects that epoch length has on functional connectivity (FC) and network reconstruction. Two different measures, the phase lag index (PLI) and the amplitude envelope correlation (AEC), were applied to EEG resting-state recordings for a group of eighteen healthy volunteers. Weighted clustering coefficient (CCw), weighted characteristic path length (Lw) and minimum spanning tree (MST) parameters were computed to evaluate the network topology. The analysis was performed on both scalp and source-space data.
Results for the distinctive functional core show the highest classification rates from k-core decomposition in the gamma (EER=0.130, AUC=0.943) and high beta (EER=0.172, AUC=0.905) frequency bands. Results from the scalp analysis of the influence of epoch length show a decrease in both mean PLI and AEC values with an increase in epoch length, with a tendency to stabilize at a length of 12 seconds for PLI and 6 seconds for AEC. Moreover, CCw and Lw show very similar behaviour, with metrics based on AEC being more reliable in terms of stability. In general, MST parameters stabilize at short epoch lengths, particularly for MSTs based on PLI (1-6 seconds versus 4-8 seconds for AEC). At the source level the results were even more reliable, with stability already at 1 second duration for PLI-based MSTs. Our results confirm that EEG analysis may represent an effective tool to identify subject-specific characteristics that may be of great impact for several bioengineering applications. Regarding epoch length, the present work suggests that both PLI and AEC depend on epoch length and that this has an impact on the reconstructed network topology, particularly at the scalp level. Source-level MST topology is less sensitive to differences in epoch length, therefore enabling the comparison of brain network topology between different studies.
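Two of the building blocks named in this abstract, the phase lag index and the minimum spanning tree, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the study's pipeline: the instantaneous phases are assumed to be already available (for example from a Hilbert transform, which is not shown), the PLI is taken in its common form as the absolute mean sign of the sine of the phase difference, and the function names are hypothetical.

```python
import math


def phase_lag_index(phase_a, phase_b):
    """PLI = |mean(sign(sin(phase_a - phase_b)))| over time samples.

    Inputs are sequences of instantaneous phases in radians (assumed given).
    """
    total = 0
    for a, b in zip(phase_a, phase_b):
        s = math.sin(a - b)
        total += (s > 0) - (s < 0)  # sign of s: -1, 0 or +1
    return abs(total / len(phase_a))


def minimum_spanning_tree(weights):
    """Kruskal's algorithm on a symmetric weight matrix.

    Returns a list of (i, j, weight) edges. Lower weight is treated as a
    shorter edge, so a connectivity matrix must first be converted to
    distances (for example 1 - PLI, an assumption made for this sketch).
    """
    n = len(weights)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted(
        (weights[i][j], i, j) for i in range(n) for j in range(i + 1, n)
    )
    tree = []
    for w, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two components, so no cycle
            parent[root_i] = root_j
            tree.append((i, j, w))
            if len(tree) == n - 1:
                break
    return tree
```

A possible use is to compute the PLI for every channel pair, convert the resulting matrix to distances, and pass it to `minimum_spanning_tree`. Because the tree always has exactly n - 1 edges for n channels, trees built from different recordings share the same number of edges, which is what makes their topology directly comparable.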
Abstract:
The importance of mangrove forests in carbon sequestration and coastal protection has been widely acknowledged. Large-scale damage of these forests, caused by hurricanes or clear felling, can enhance vulnerability to erosion, subsidence and rapid carbon losses. However, it is unclear how small-scale logging might impact on mangrove functions and services. We experimentally investigated the impact of small-scale tree removal on surface elevation and carbon dynamics in a mangrove forest at Gazi bay, Kenya. The trees in five plots of a Rhizophora mucronata (Lam.) forest were first girdled and then cut. Another set of five plots at the same site served as controls. Treatment induced significant, rapid subsidence (−32.1±8.4 mm yr−1 compared with surface elevation changes of +4.2±1.4 mm yr−1 in controls). Subsidence in treated plots was likely due to collapse and decomposition of dying roots and sediment compaction as evidenced from increased sediment bulk density. Sediment effluxes of CO2 and CH4 increased significantly, especially their heterotrophic component, suggesting enhanced organic matter decomposition. Estimates of total excess fluxes from treated compared with control plots were 25.3±7.4 tCO2 ha−1 yr−1 (using surface carbon efflux) and 35.6±76.9 tCO2 ha−1 yr−1 (using surface elevation losses and sediment properties). Whilst such losses might not be permanent (provided cut areas recover), the observed rapid subsidence and enhanced decomposition of soil sediment organic matter caused by small-scale harvesting offer important lessons for mangrove management. In particular, mangrove managers need to carefully consider the trade-offs between extracting mangrove wood and losing other mangrove services, particularly shoreline stabilization, coastal protection and carbon storage.
Abstract:
Struyf, J., Dzeroski, S., Blockeel, H. and Clare, A. (2005) Hierarchical Multi-classification with Predictive Clustering Trees in Functional Genomics. In Proceedings of the EPIA 2005 CMB Workshop.
Abstract:
R. Jensen and Q. Shen, 'Webpage Classification with ACO-enhanced Fuzzy-Rough Feature Selection,' Proceedings of the Fifth International Conference on Rough Sets and Current Trends in Computing (RSCTC 2006), LNAI 4259, pp. 147-156, 2006.
Abstract:
C. Shang and Q. Shen. Aiding classification of gene expression data with feature selection: a comparative study. Computational Intelligence Research, 1(1):68-76.
Abstract:
M. Galea, Q. Shen and J. Levine. Evolutionary approaches to fuzzy modelling. Knowledge Engineering Review, 19(1):27-59, 2004.
Abstract:
M. Galea and Q. Shen. Fuzzy rules from ant-inspired computation. Proceedings of the 13th International Conference on Fuzzy Systems, pages 1691-1696, 2004.
Abstract:
K. Rasmani and Q. Shen. Subsethood-based fuzzy modelling and classification. Proceedings of the 2004 UK Workshop on Computational Intelligence, pages 181-188.
Abstract:
M. Galea and Q. Shen. Simultaneous ant colony optimisation algorithms for learning linguistic fuzzy rules. A. Abraham, C. Grosan and V. Ramos (Eds.), Swarm Intelligence in Data Mining, pages 75-99.