74 results for Volatility clustering
Abstract:
In this paper we present an efficient k-means clustering algorithm for two-dimensional data. The proposed algorithm reorganizes the dataset into a nested binary tree. Data items are compared at each node with only the two nearest means with respect to each dimension and assigned to the cluster with the closer mean. The main idea is as follows: we build the nested binary tree, then scan the data in raster order by an in-order traversal of the tree, and finally compare the data item at each node with only the two nearest means to assign it to the intended cluster. In this way we reduce the computational cost significantly, both by cutting the number of comparisons with means and by minimizing use of the Euclidean distance formula. Our results show that the method performs clustering much faster than the classical methods. © Springer-Verlag Berlin Heidelberg 2005
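The "two nearest means" idea can be illustrated in one dimension, where it is exact: once the means are sorted, the closest mean to a value is always one of the two means that bracket it. The sketch below is only that illustration. It does not reproduce the nested binary tree or the raster-order scan described in the abstract, and the function names `nearest_of_two` and `assign_1d` are hypothetical.

```python
from bisect import bisect_left


def nearest_of_two(sorted_means, x):
    """Return the index (into sorted_means) of the mean closest to x.

    Only the two means bracketing x are compared, instead of all k means.
    """
    i = bisect_left(sorted_means, x)
    if i == 0:                      # x lies at or below the smallest mean
        return 0
    if i == len(sorted_means):      # x lies above the largest mean
        return len(sorted_means) - 1
    left, right = sorted_means[i - 1], sorted_means[i]
    # Ties go to the left (smaller) mean.
    return i - 1 if x - left <= right - x else i


def assign_1d(values, means):
    """Assignment step of 1-D k-means using the two-neighbour comparison."""
    sorted_means = sorted(means)
    return [nearest_of_two(sorted_means, v) for v in values]


if __name__ == "__main__":
    print(assign_1d([0.0, 2.9, 3.1, 7.5, 100.0], [1.0, 5.0, 9.0]))
```

Each lookup costs a binary search plus one comparison rather than k distance evaluations. In two dimensions, bracketing along each coordinate only yields candidate means, so a real implementation would still need the tree structure the authors describe to decide among them.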
Abstract:
Rectangular dropshafts, commonly used in sewers and storm water systems, are characterised by significant flow aeration. New detailed air-water flow measurements were conducted in a near-full-scale dropshaft at large discharges. In the shaft pool and outflow channel, the results demonstrated the complexity of different competing air entrainment mechanisms. Bubble size measurements showed a broad range of entrained bubble sizes. Analysis of the streamwise distributions of bubbles further suggested some clustering process in the bubbly flow although, in the outflow channel, bubble chords were on average smaller than in the shaft pool. A robust hydrophone was tested to measure bubble acoustic spectra and to assess its potential for field application. The acoustic results accurately characterised the order of magnitude of entrained bubble sizes, but the transformation from acoustic frequencies to bubble radii did not correctly predict the probability distribution functions of bubble sizes.
Abstract:
In an open channel, a hydraulic jump is the rapid transition from super- to sub-critical flow associated with strong turbulence and air bubble entrainment in the mixing layer. New experiments were performed at relatively large Reynolds numbers using phase-detection probes. New signal analysis provided characteristic air-water time and length scales of the vortical structures advecting the air bubbles in the developing shear flow. An analysis of the longitudinal air-water flow structure suggested little bubble clustering in the mixing layer, although an interparticle arrival time analysis showed some preferential bubble clustering for small bubbles with chord times below 3 ms. Correlation analyses yielded longitudinal air-water time scales Txx*V1/d1 of about 0.8 on average. The transverse integral length scale Z/d1 of the eddies advecting entrained bubbles was typically between 0.25 and 0.4, irrespective of the inflow conditions within the range of the investigations. Overall the findings highlighted the complicated nature of the air-water flow
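An interparticle arrival time analysis is commonly set up by comparing the observed gaps between successive bubble arrivals with what a Poisson process (randomly spaced, non-interacting particles) would give. For such a process with rate λ, the expected fraction of gaps between t1 and t2 is exp(-λ·t1) - exp(-λ·t2). The sketch below assumes that setup; the abstract does not give the authors' procedure, and all function names here are hypothetical.

```python
import math


def interarrival_times(arrival_times):
    """Gaps between successive arrivals (input is sorted first)."""
    t = sorted(arrival_times)
    return [b - a for a, b in zip(t, t[1:])]


def poisson_bin_fractions(rate, edges):
    """Expected fraction of gaps in each [edges[i], edges[i+1]) bin
    for a Poisson process with the given arrival rate."""
    return [math.exp(-rate * lo) - math.exp(-rate * hi)
            for lo, hi in zip(edges, edges[1:])]


def observed_bin_fractions(gaps, edges):
    """Measured fraction of gaps falling in each bin."""
    counts = [0] * (len(edges) - 1)
    for g in gaps:
        for i, (lo, hi) in enumerate(zip(edges, edges[1:])):
            if lo <= g < hi:
                counts[i] += 1
                break
    return [c / len(gaps) for c in counts]


if __name__ == "__main__":
    # Hypothetical arrival times in seconds, for illustration only.
    arrivals = [0.000, 0.001, 0.002, 0.030, 0.031, 0.080, 0.081, 0.082]
    gaps = interarrival_times(arrivals)
    rate = len(gaps) / sum(gaps)          # arrivals per second
    edges = [0.0, 0.005, 0.02, math.inf]  # bin edges in seconds
    print(observed_bin_fractions(gaps, edges))
    print(poisson_bin_fractions(rate, edges))
```

An excess of observed short gaps relative to the Poisson expectation would point to clustering. To examine a size class such as chord times below 3 ms, the arrivals would first be filtered to bubbles in that class.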
Abstract:
A combination of deductive reasoning, clustering, and inductive learning is given as an example of a hybrid system for exploratory data analysis. Visualization is replaced by a dialogue with the data.
Abstract:
In the context of cancer diagnosis and treatment, we consider the problem of constructing an accurate prediction rule on the basis of a relatively small number of tumor tissue samples of known type containing the expression data on very many (possibly thousands) genes. Recently, results have been presented in the literature suggesting that it is possible to construct a prediction rule from only a few genes such that it has a negligible prediction error rate. However, in these results the test error or the leave-one-out cross-validated error is calculated without allowance for the selection bias. There is no allowance either because the rule is tested on tissue samples that were used in the first instance to select the genes being used in the rule, or because the cross-validation of the rule is not external to the selection process; that is, gene selection is not performed in training the rule at each stage of the cross-validation process. We describe how in practice the selection bias can be assessed and corrected for by either performing a cross-validation or applying the bootstrap external to the selection process. We recommend using 10-fold rather than leave-one-out cross-validation, and concerning the bootstrap, we suggest using the so-called .632+ bootstrap error estimate designed to handle overfitted prediction rules. Using two published data sets, we demonstrate that when correction is made for the selection bias, the cross-validated error is no longer zero for a subset of only a few genes.
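The difference between internal and external cross-validation can be sketched in a few lines. In the sketch below, `external=True` repeats gene selection inside every training fold, while `external=False` selects genes once on all samples before cross-validating, which is the biased procedure the abstract warns about. The mean-difference gene ranking and the nearest-centroid rule are stand-ins chosen for brevity, not the selection method or classifiers used in the paper, and the data are synthetic noise.

```python
import random


def select_genes(X, y, k):
    """Rank genes by absolute difference of class means; return top-k indices."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return sorted(range(len(scores)), key=lambda g: -scores[g])[:k]


def fit_centroids(X, y, genes):
    """Per-class mean expression over the selected genes."""
    cents = {}
    for lab in (0, 1):
        rows = [row for row, l in zip(X, y) if l == lab]
        cents[lab] = [sum(r[g] for r in rows) / len(rows) for g in genes]
    return cents


def predict(cents, row, genes):
    """Assign the class whose centroid is closer (squared Euclidean)."""
    def d2(lab):
        return sum((row[g] - c) ** 2 for g, c in zip(genes, cents[lab]))
    return 0 if d2(0) <= d2(1) else 1


def cv_error(X, y, k_genes, n_folds, external):
    """Cross-validated error rate; gene selection is inside the loop
    only when external=True."""
    n = len(X)
    fixed = None if external else select_genes(X, y, k_genes)
    errors = 0
    for f in range(n_folds):
        test = [i for i in range(n) if i % n_folds == f]
        train = [i for i in range(n) if i % n_folds != f]
        Xtr = [X[i] for i in train]
        ytr = [y[i] for i in train]
        genes = select_genes(Xtr, ytr, k_genes) if external else fixed
        cents = fit_centroids(Xtr, ytr, genes)
        errors += sum(predict(cents, X[i], genes) != y[i] for i in test)
    return errors / n


def demo(seed=0, n=40, n_genes=500):
    """Pure-noise data: labels carry no information about expression."""
    rng = random.Random(seed)
    y = [0, 1] * (n // 2)
    rng.shuffle(y)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n)]
    return (cv_error(X, y, 10, 10, external=False),
            cv_error(X, y, 10, 10, external=True))


if __name__ == "__main__":
    internal, external = demo()
    print("internal selection:", internal, "external selection:", external)
```

Because the labels here are unrelated to the expression values, an honest estimate should sit near 50%. Selecting genes on all samples first tends to report a much lower error, which is the selection bias in miniature.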