775 results for mining data streams
Abstract:
This paper presents the results of a Secchi depth data mining study for the North Sea - Baltic Sea region. A total of 40,829 Secchi depth measurements were compiled from the area as a result of this study. Only 4.3% of the observations were found in the international data centers [ICES Oceanographic Data Center in Denmark and the World Ocean Data Center A (WDC-A) in the USA], while 95.7% of the data were provided by individuals and ocean research institutions from the countries surrounding the North Sea and Baltic Sea. Inquiries made at the World Ocean Data Center B (WDC-B) in Russia suggested that there could be significant additional holdings in that archive but, unfortunately, no data could be made available. The earliest Secchi depth measurement retrieved in this study dates back to 1902 for the Baltic Sea, while the bulk of the measurements were gathered after 1970. The spatial distribution of Secchi depth measurements in the North Sea is very uneven, with surprisingly large sampling gaps in the western North Sea. Quarterly and annual Secchi depth maps with a 0.5° × 0.5° spatial resolution are provided for the transition area between the North Sea and the Baltic Sea (4°E-16°E, 53°N-60°N).
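As an illustration of the 0.5° × 0.5° mapping mentioned above, the following is a minimal sketch of binning point observations into grid cells and averaging them. The region bounds and cell size come from the abstract; the function name, the input tuple layout, and the choice of a plain per-cell mean are assumptions for illustration, and the study's actual mapping procedure is not described here.

```python
import math
from collections import defaultdict

# Bounds of the transition area and grid resolution, as stated in the abstract.
LON_MIN, LON_MAX = 4.0, 16.0
LAT_MIN, LAT_MAX = 53.0, 60.0
CELL_DEG = 0.5


def grid_mean_secchi(observations):
    """Average Secchi depth per 0.5° x 0.5° cell (hypothetical helper).

    observations: iterable of (lon_deg_east, lat_deg_north, secchi_depth_m).
    Returns {(lon_sw, lat_sw): mean_depth_m}, keyed by the south-west
    corner of each cell.
    """
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [sum of depths, count]
    for lon, lat, depth in observations:
        # Skip points outside the mapped region (upper edges excluded).
        if not (LON_MIN <= lon < LON_MAX and LAT_MIN <= lat < LAT_MAX):
            continue
        col = math.floor((lon - LON_MIN) / CELL_DEG)
        row = math.floor((lat - LAT_MIN) / CELL_DEG)
        key = (LON_MIN + col * CELL_DEG, LAT_MIN + row * CELL_DEG)
        sums[key][0] += depth
        sums[key][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}
```

Quarterly or annual maps could be produced by filtering the observations by date before calling the function; cells with no observations are simply absent from the result, which makes sampling gaps easy to spot.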
Abstract:
The planktonic haptophyte Phaeocystis has been suggested to play a fundamental role in the global biogeochemical cycling of carbon and sulphur, but little is known about its global biomass distribution. We have collected global microscopy data of the genus Phaeocystis and converted abundance data to carbon biomass using species-specific carbon conversion factors. Microscopic counts of single-celled and colonial Phaeocystis were obtained both through the mining of online databases and by accepting direct submissions (both published and unpublished) from Phaeocystis specialists. We recorded abundance data from a total of 1595 depth-resolved stations sampled between 1955 and 2009. The quality-controlled dataset includes 5057 counts of individual Phaeocystis cells resolved to species level and information regarding life-stages from 3526 samples. 83% of stations were located in the Northern Hemisphere while 17% were located in the Southern Hemisphere. Most data were located in the latitude range of 50-70° N. While the seasonal distribution of Northern Hemisphere data was well balanced, Southern Hemisphere data were biased towards summer months. Mean species- and form-specific cell diameters were determined from previously published studies. Cell diameters were used to calculate the cellular biovolume of Phaeocystis cells, assuming spherical geometry. Cell biomass was calculated using a carbon conversion factor for Prymnesiophytes (Menden-Deuer and Lessard, 2000). For colonies, the number of cells per colony was derived from the colony volume. Cell numbers were then converted to carbon concentrations. An estimation of colonial mucus carbon was included a posteriori, assuming a mean colony size for each species. Carbon content per cell ranged from 9 pg (single-celled Phaeocystis antarctica) to 29 pg (colonial Phaeocystis globosa).
Non-zero Phaeocystis cell biomasses (without mucus carbon) range from 2.9 × 10⁻⁵ µg l⁻¹ to 5.4 × 10³ µg l⁻¹, with a mean of 45.7 µg l⁻¹ and a median of 3.0 µg l⁻¹. Highest biomasses occur in the Southern Ocean below 70° S (up to 783.9 µg l⁻¹), and in the North Atlantic around 50° N (up to 5.4 × 10³ µg l⁻¹).
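The conversion chain described in this abstract (cell diameter to spherical biovolume, biovolume to carbon per cell, cell counts to carbon concentration) can be sketched as follows. This is a minimal sketch under stated assumptions: the power-law form pg C per cell = a · V^b and the parameters `a` and `b` are left as caller-supplied inputs, the function names are hypothetical, and no coefficient values from Menden-Deuer and Lessard (2000) are reproduced here. Colony handling and mucus carbon are omitted.

```python
import math


def sphere_volume_um3(diameter_um):
    """Biovolume (µm^3) of a cell assumed to be a sphere of the given diameter (µm)."""
    return math.pi / 6.0 * diameter_um ** 3


def carbon_per_cell_pg(volume_um3, a, b):
    """Carbon per cell (pg) from biovolume, assuming a power law a * V**b.

    a and b are placeholders for a published carbon-to-volume relationship;
    they must be supplied by the caller.
    """
    return a * volume_um3 ** b


def carbon_conc_ug_per_l(cells_per_l, diameter_um, a, b):
    """Carbon concentration (µg per litre) from a cell count (cells per litre)."""
    pg_per_cell = carbon_per_cell_pg(sphere_volume_um3(diameter_um), a, b)
    return cells_per_l * pg_per_cell * 1e-6  # 1 µg = 1e6 pg
```

Keeping the coefficients as arguments lets the same functions serve different species and life forms, each with its own mean diameter and conversion relationship.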
Abstract:
The manipulation and handling of an ever-increasing volume of data by current data-intensive applications require novel techniques for efficient data management. Despite recent advances in every aspect of data management (storage, access, querying, analysis, mining), future applications are expected to scale to even higher degrees, not only in terms of volumes of data handled but also in terms of users and resources, often making use of multiple pre-existing, autonomous, distributed or heterogeneous resources.