9 resultados para TREE METHOD
em University of Queensland eSpace - Australia
Resumo:
In this paper, we propose a novel high-dimensional index method, the BM+-tree, to support efficient processing of similarity search queries in high-dimensional spaces. The main idea of the proposed index is to improve data partitioning efficiency in a high-dimensional space by using a rotary binary hyperplane, which further partitions a subspace and can also take advantage of the twin node concept used in the M+-tree. Compared with the key dimension concept in the M+-tree, the binary hyperplane is more effective in data filtering. High space utilization is achieved by dynamically performing data reallocation between twin nodes. In addition, a post processing step is used after index building to ensure effective filtration. Experimental results using two types of real data sets illustrate a significantly improved filtering efficiency.
Resumo:
The notorious "dimensionality curse" is a well-known phenomenon for any multi-dimensional indexes attempting to scale up to high dimensions. One well-known approach to overcome degradation in performance with respect to increasing dimensions is to reduce the dimensionality of the original dataset before constructing the index. However, identifying the correlation among the dimensions and effectively reducing them are challenging tasks. In this paper, we present an adaptive Multi-level Mahalanobis-based Dimensionality Reduction (MMDR) technique for high-dimensional indexing. Our MMDR technique has four notable features compared to existing methods. First, it discovers elliptical clusters for more effective dimensionality reduction by using only the low-dimensional subspaces. Second, data points in the different axis systems are indexed using a single B+-tree. Third, our technique is highly scalable in terms of data size and dimension. Finally, it is also dynamic and adaptive to insertions. An extensive performance study was conducted using both real and synthetic datasets, and the results show that our technique not only achieves higher precision, but also enables queries to be processed efficiently. Copyright Springer-Verlag 2005
Resumo:
Reforestation in tropical areas is usually attempted by planting seedlings but, direct seeding (the artificial addition or sowing of seed) may be an alternative way of accelerating forest recovery and successional processes. This study investigated the effects of various sowing treatments (designed to create different microsite conditions for seed germination) and seed sizes on the early establishment and growth of directly sown rainforest tree species in a variety of experimental plots at three sites in the wet tropical region of north-cast Queensland, Australia. The different sowing treatments were found to have significant effects on seedling establishment. Broadcast sowing treatments were ineffective and resulted in very poor seedling establishment and high seed wastage. Higher establishment rates occurred when seeds were buried. Seed size was found to be an important factor affecting establishment in relation to micro-site condition. In general, larger seeded species had higher establishment rates at all three sites than species of small and intermediate seed size, but only in sowing treatments where seeds were buried. Overall these results suggest that direct sowing of seed can be used as a too] to accelerate recolonisation of certain rainforest tree species on degraded tropical lands, but initial success will be dependent on the choice of sowing method and its suitability for the seed types selected. The results also indicate that the recruitment of naturally dispersed tree species at degraded sites is likely to be severely limited by the availability of suitable microsites for seed germination. Consequently the natural recovery of degraded sites via seed rain can be expected to be slow and unpredictable, particularly in areas where soil compaction has occurred. (c) 2006 Elsevier B.V. All rights reserved.
Resumo:
Results from the humid tropics of Australia demonstrate that diverse plantations can achieve greater productivity than monocultures. We found that increases in both the observed species number and the effective species richness were significantly related to increased levels of productivity as measured by stand basal area or mean individual tree basal area. Four of five plantation species were more productive in mixtures with other species than in monocultures, offering on average, a 55% increase in mean tree basal area. A general linear model suggests that species richness had a significant effect on mean individual tree basal area when environmental variables were included in the model. As monoculture plantations are currently the preferred reforestation method throughout the tropics these results suggest that significant productivity and ecological gains could be made if multi-species plantations are more broadly pursued. (c) 2006 Elsevier B.V. All rights reserved.
Resumo:
We have developed an alignment-free method that calculates phylogenetic distances using a maximum-likelihood approach for a model of sequence change on patterns that are discovered in unaligned sequences. To evaluate the phylogenetic accuracy of our method, and to conduct a comprehensive comparison of existing alignment-free methods (freely available as Python package decaf+py at http://www.bioinformatics.org.au), we have created a data set of reference trees covering a wide range of phylogenetic distances. Amino acid sequences were evolved along the trees and input to the tested methods; from their calculated distances we infered trees whose topologies we compared to the reference trees. We find our pattern-based method statistically superior to all other tested alignment-free methods. We also demonstrate the general advantage of alignment-free methods over an approach based on automated alignments when sequences violate the assumption of collinearity. Similarly, we compare methods on empirical data from an existing alignment benchmark set that we used to derive reference distances and trees. Our pattern-based approach yields distances that show a linear relationship to reference distances over a substantially longer range than other alignment-free methods. The pattern-based approach outperforms alignment-free methods and its phylogenetic accuracy is statistically indistinguishable from alignment-based distances.
Resumo:
The Tree Augmented Naïve Bayes (TAN) classifier relaxes the sweeping independence assumptions of the Naïve Bayes approach by taking account of conditional probabilities. It does this in a limited sense, by incorporating the conditional probability of each attribute given the class and (at most) one other attribute. The method of boosting has previously proven very effective in improving the performance of Naïve Bayes classifiers and in this paper, we investigate its effectiveness on application to the TAN classifier.
Resumo:
Indexing high dimensional datasets has attracted extensive attention from many researchers in the last decade. Since R-tree type of index structures are known as suffering curse of dimensionality problems, Pyramid-tree type of index structures, which are based on the B-tree, have been proposed to break the curse of dimensionality. However, for high dimensional data, the number of pyramids is often insufficient to discriminate data points when the number of dimensions is high. Its effectiveness degrades dramatically with the increase of dimensionality. In this paper, we focus on one particular issue of curse of dimensionality; that is, the surface of a hypercube in a high dimensional space approaches 100% of the total hypercube volume when the number of dimensions approaches infinite. We propose a new indexing method based on the surface of dimensionality. We prove that the Pyramid tree technology is a special case of our method. The results of our experiments demonstrate clear priority of our novel method.
Resumo:
In this paper we present an efficient k-Means clustering algorithm for two dimensional data. The proposed algorithm re-organizes dataset into a form of nested binary tree*. Data items are compared at each node with only two nearest means with respect to each dimension and assigned to the one that has the closer mean. The main intuition of our research is as follows: We build the nested binary tree. Then we scan the data in raster order by in-order traversal of the tree. Lastly we compare data item at each node to the only two nearest means to assign the value to the intendant cluster. In this way we are able to save the computational cost significantly by reducing the number of comparisons with means and also by the least use to Euclidian distance formula. Our results showed that our method can perform clustering operation much faster than the classical ones. © Springer-Verlag Berlin Heidelberg 2005
Resumo:
Government agencies responsible for riparian environments are assessing the combined utility of field survey and remote sensing for mapping and monitoring indicators of riparian zone condition. The objective of this work was to compare the Tropical Rapid Appraisal of Riparian Condition (TRARC) method to a satellite image based approach. TRARC was developed for rapid assessment of the environmental condition of savanna riparian zones. The comparison assessed mapping accuracy, representativeness of TRARC assessment, cost-effectiveness, and suitability for multi-temporal analysis. Two multi-spectral QuickBird images captured in 2004 and 2005 and coincident field data covering sections of the Daly River in the Northern Territory, Australia were used in this work. Both field and image data were processed to map riparian health indicators (RHIs) including percentage canopy cover, organic litter, canopy continuity, stream bank stability, and extent of tree clearing. Spectral vegetation indices, image segmentation and supervised classification were used to produce RHI maps. QuickBird image data were used to examine if the spatial distribution of TRARC transects provided a representative sample of ground based RHI measurements. Results showed that TRARC transects were required to cover at least 3% of the study area to obtain a representative sample. The mapping accuracy and costs of the image based approach were compared to those of the ground based TRARC approach. Results proved that TRARC was more cost-effective at smaller scales (1-100km), while image based assessment becomes more feasible at regional scales (100-1000km). Finally, the ability to use both the image and field based approaches for multi-temporal analysis of RHIs was assessed. Change detection analysis demonstrated that image data can provide detailed information on gradual change, while the TRARC method was only able to identify more gross scale changes. In conclusion, results from both methods were considered to complement each other if used at appropriate spatial scales.