2 results for Universal tree
in Plymouth Marine Science Electronic Archive (PlyMSEA)
Abstract:
Agglomerative cluster analysis encompasses many techniques that have been widely used in various fields of science. In biology, and specifically ecology, datasets are generally highly variable and may contain outliers, which make it harder to identify the number of clusters. Here we present a new criterion to determine statistically the optimal level of partition in a classification tree. The robustness of the criterion is tested against perturbed data (outliers) using an observation or variable with randomly generated values. The technique, called the Random Simulation Test (RST), is tested on (1) the well-known Iris dataset [Fisher, R.A., 1936. The use of multiple measurements in taxonomic problems. Ann. Eugenic. 7, 179–188], (2) simulated data with predetermined numbers of clusters following Milligan and Cooper [Milligan, G.W., Cooper, M.C., 1985. An examination of procedures for determining the number of clusters in a data set. Psychometrika 50, 159–179], and finally (3) applied to real copepod community data previously analyzed in Beaugrand et al. [Beaugrand, G., Ibanez, F., Lindley, J.A., Reid, P.C., 2002. Diversity of calanoid copepods in the North Atlantic and adjacent seas: species associations and biogeography. Mar. Ecol. Prog. Ser. 232, 179–195]. The technique is compared to several standard techniques. RST generally performed better than existing algorithms on simulated data and proved to be especially efficient with highly variable datasets.
Abstract:
Characterizing the structural heterogeneity of chlorophyll and sea surface temperature (SST) through their scaling properties can provide a useful tool for estimating the relative importance of key physical and biological drivers. Seasonal, annual, and instantaneous spatial distributions of chlorophyll and SST, determined from satellite measurements, have been studied in seven coastal and shelf-sea regions around the UK. It is shown that multifractals provide a very good approximation to the scaling properties of the data: in fact, the multifractal scaling function is well approximated by universal multifractal theory. Consequently, all of the statistical information about the data structure can be described by two parameters. It is further shown that bathymetry in the studied regions also scales as a multifractal. The SST and chlorophyll multifractal structures are then explained as an effect of bathymetry and turbulence.
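For orientation, the two-parameter description mentioned in this abstract can be written out. The abstract does not state the formula itself, so the following is a sketch under an assumption: that the study uses the standard universal multifractal form of the moment scaling function, in which the two parameters are commonly denoted \alpha (the multifractality, or Lévy, index) and C_1 (the codimension of the mean). The symbols \varepsilon_\lambda (the field at scale ratio \lambda), q (moment order), and K(q) are notation chosen here for illustration.

```latex
% Scaling of statistical moments of the field at scale ratio \lambda
\langle \varepsilon_\lambda^{\,q} \rangle \propto \lambda^{K(q)}

% Universal multifractal form of the scaling function (assumed notation)
K(q) =
\begin{cases}
  \dfrac{C_1}{\alpha - 1}\,\bigl(q^{\alpha} - q\bigr), & \alpha \neq 1,\\[1.5ex]
  C_1\, q \ln q, & \alpha = 1,
\end{cases}
\qquad 0 \le \alpha \le 2 .
```

In this form K(1) = 0 for any choice of the two parameters, so the mean of the field is conserved across scales, and fitting \alpha and C_1 fixes the scaling of every moment order q.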