868 results for data-types
Abstract:
Unlabelled data are commonly analysed with unsupervised machine learning techniques. Such techniques provide representations that are more useful for understanding the problem at hand than the raw data alone. Although abundant expert knowledge exists in many areas where unlabelled data are examined, such knowledge is rarely incorporated into automatic analysis. Incorporation of expert knowledge is frequently a matter of combining multiple data sources from disparate hypothetical spaces. In cases where such spaces belong to different data types, this task becomes even more challenging. In this paper we present a novel immune-inspired method that enables the fusion of such disparate types of data for a specific set of problems. We show that our method provides a better visual understanding of one hypothetical space with the help of data from another hypothetical space. We believe that our model has implications for the field of exploratory data analysis and knowledge discovery.
Abstract:
Background: The expansion of cell colonies is driven by a delicate balance of several mechanisms including cell motility, cell-to-cell adhesion and cell proliferation. New approaches that can be used to independently identify and quantify the role of each mechanism will help us understand how each mechanism contributes to the expansion process. Standard mathematical modelling approaches to describe such cell colony expansion typically neglect cell-to-cell adhesion, despite the fact that cell-to-cell adhesion is thought to play an important role.
Results: We use a combined experimental and mathematical modelling approach to determine the cell diffusivity, D, cell-to-cell adhesion strength, q, and cell proliferation rate, λ, in an expanding colony of MM127 melanoma cells. Using a circular barrier assay, we extract several types of experimental data and use a mathematical model to independently estimate D, q and λ. In our first set of experiments, we suppress cell proliferation and analyse three different types of data to estimate D and q. We find that standard types of data, such as the area enclosed by the leading edge of the expanding colony and more detailed cell density profiles throughout the expanding colony, do not provide sufficient information to uniquely identify D and q. We find that additional data relating to the degree of cell-to-cell clustering is required to provide independent estimates of q, and in turn D. In our second set of experiments, where proliferation is not suppressed, we use data describing temporal changes in cell density to determine the cell proliferation rate. In summary, we find that our experiments are best described using the range D = 161 - 243 µm² hour⁻¹, q = 0.3 - 0.5 (low to moderate strength) and λ = 0.0305 - 0.0398 hour⁻¹, and with these parameters we can accurately predict the temporal variations in the spatial extent and cell density profile throughout the expanding melanoma cell colony.
Conclusions: Our systematic approach to identify the cell diffusivity, cell-to-cell adhesion strength and cell proliferation rate highlights the importance of integrating multiple types of data to accurately quantify the factors influencing the spatial expansion of melanoma cell colonies.
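For intuition about the proliferation rate range of 0.0305 to 0.0398 per hour reported above, the short Python sketch below converts a per-hour rate into a population doubling time. It assumes simple unconstrained exponential growth, which is an illustrative simplification and not necessarily the growth law used in the paper's model; the function name `doubling_time_hours` is hypothetical.

```python
import math


def doubling_time_hours(rate_per_hour: float) -> float:
    """Doubling time under exponential growth: ln(2) / rate.

    Assumption (not from the abstract): growth is unconstrained,
    i.e. no crowding or carrying-capacity effects.
    """
    return math.log(2) / rate_per_hour


# Proliferation-rate range quoted in the abstract (per hour).
rate_low, rate_high = 0.0305, 0.0398

slowest = doubling_time_hours(rate_low)   # lower rate, longer doubling time
fastest = doubling_time_hours(rate_high)  # higher rate, shorter doubling time

print(f"doubling time: {fastest:.1f} to {slowest:.1f} hours")
```

Under that assumption the reported range corresponds to a doubling time of roughly 17 to 23 hours.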
Abstract:
Marine species generally have large population sizes, continuous distributions and high dispersal capacity. Despite this, they are often subdivided into separate populations, which are the basic units of fisheries management. For example, populations of some fisheries species across the deep water of the Timor Trench are genetically different, implying minimal movement and interbreeding. When connectivity is higher than in the Timor Trench example, but not so high that the populations become one, connectivity between populations is crinkled. Crinkled connectivity occurs when migration is above the threshold required to link populations genetically, but below the threshold for demographic links. In future, genetic estimates of connectivity over crinkled links could be combined with other data, such as estimates of population size and tagging and tracking data, to quantify demographic connectedness between these types of populations. Elasmobranch species may be ideal targets for this research because connectivity between populations is more likely to be crinkled than for finfish species. Fisheries stock-assessment models could be strengthened with estimates of connectivity to improve the strategic and sustainable harvesting of biological resources.
Abstract:
Pteropods are a group of holoplanktonic gastropods for which global biomass distribution patterns remain poorly resolved. The aim of this study was to collect and synthesize existing pteropod (Gymnosomata, Thecosomata and Pseudothecosomata) abundance and biomass data, in order to evaluate the global distribution of pteropod carbon biomass, with a particular emphasis on its seasonal, temporal and vertical patterns. We collected 25 902 data points from several online databases and a number of scientific articles. The biomass data were gridded onto a 360 x 180° grid, with a vertical resolution of 33 WOA depth levels, and converted to NetCDF format. Data were collected between 1951 and 2010, with sampling depths ranging from 0 to 1000 m. Pteropod biomass data were either extracted directly or derived by converting abundance to biomass with pteropod-specific length-to-weight conversions. In the Northern Hemisphere (NH) the data were distributed evenly throughout the year, whereas sampling in the Southern Hemisphere (SH) was biased towards the austral summer months. 86% of all biomass values were located in the NH, most (42%) within the latitudinal band of 30-50° N. The range of global biomass values spanned over three orders of magnitude, with a mean and median biomass concentration of 8.2 mg C l⁻¹ (SD = 61.4) and 0.25 mg C l⁻¹, respectively, for all data points, and with a mean of 9.1 mg C l⁻¹ (SD = 64.8) and a median of 0.25 mg C l⁻¹ for non-zero biomass values. The highest mean and median biomass concentrations were located in the NH between 40-50° N (mean biomass: 68.8 mg C l⁻¹ (SD = 213.4), median biomass: 2.5 mg C l⁻¹) while, in the SH, they were within the 70-80° S latitudinal band (mean: 10.5 mg C l⁻¹ (SD = 38.8) and median: 0.2 mg C l⁻¹). Biomass values were lowest in the equatorial regions.
A broad range of biomass concentrations was observed at all depths, with the biomass peak located in the surface layer (0-25 m) and values generally decreasing with depth. However, biomass peaks were located at different depths in different ocean basins: 0-25 m depth in the N Atlantic, 50-100 m in the Pacific, 100-200 m in the Arctic, 200-500 m in the Brazilian region and >500 m in the Indo-Pacific region. Biomass in the NH was relatively invariant over the seasonal cycle, but more seasonally variable in the SH. The collected database provides a valuable tool for modellers for the study of ecosystem processes and global biogeochemical cycles.
Abstract:
The smallest marine phytoplankton, collectively termed picophytoplankton, have been routinely enumerated by flow cytometry since the late 1980s, during cruises throughout most of the world ocean. We compiled a database of 40,946 data points, with separate abundance entries for Prochlorococcus, Synechococcus and picoeukaryotes. We use average conversion factors for each of the three groups to convert the abundance data to carbon biomass. After gridding with 1° spacing, the database covers 2.4% of the ocean surface area, with the best data coverage in the North Atlantic, the South Pacific and North Indian basins. The average picophytoplankton biomass is 12 ± 22 µg C L⁻¹ or 1.9 g C m⁻². We estimate a total global picophytoplankton biomass, excluding N₂-fixers, of 0.53 - 0.74 Pg C (17 - 39 % Prochlorococcus, 12 - 15 % Synechococcus and 49 - 69 % picoeukaryotes). Future efforts in this area of research should focus on reporting calibrated cell size, and collecting data in undersampled regions.
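The "gridding with 1° spacing" and surface-coverage figure mentioned above can be illustrated with a minimal Python sketch. It bins (latitude, longitude) points into 1° cells and sums the area-weighted fraction of the sphere that the occupied cells represent. All function names are hypothetical, and the sketch measures coverage of the whole sphere: reproducing a percentage of the ocean surface, as the abstract reports, would additionally require a land mask, which is not included here.

```python
import math


def occupied_cells(points):
    """Map (lat, lon) pairs in degrees to the set of 1-degree cells they fall in."""
    cells = set()
    for lat, lon in points:
        row = min(int(math.floor(lat + 90.0)), 179)  # 0..179, south to north
        col = int(math.floor(lon + 180.0)) % 360     # 0..359, west to east
        cells.add((row, col))
    return cells


def cell_area_fraction(row):
    """Fraction of the sphere's surface taken up by one 1-degree cell in this row.

    A latitude band between phi1 and phi2 covers (sin(phi2) - sin(phi1)) / 2
    of the sphere; each band is split into 360 equal cells.
    """
    south = math.radians(row - 90.0)
    north = math.radians(row - 89.0)
    return (math.sin(north) - math.sin(south)) / 2.0 / 360.0


def coverage_fraction(points):
    """Area-weighted fraction of the sphere covered by cells that contain data."""
    return sum(cell_area_fraction(row) for row, _ in occupied_cells(points))


# Illustrative points only (not from the database): two share a cell.
sample = [(0.2, 10.1), (0.9, 10.8), (60.5, -30.5)]
print(len(occupied_cells(sample)), coverage_fraction(sample))
```

Weighting by cell area matters because a 1° cell near the poles is much smaller than one at the equator, so a simple count of occupied cells would overstate high-latitude coverage.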
Abstract:
This study is a first effort to compile the largest possible body of data available from different plankton databases as well as from individual published or unpublished datasets regarding diatom distribution in the world ocean. The data obtained originate from time series studies as well as spatial studies. This effort is supported by the Marine Ecosystem Data (MAREDAT) project, which aims at building consistent data sets for the main PFTs (Plankton Functional Types) in order to help validate biogeochemical ocean models by using C biomass converted from abundance data. Diatom abundance data were obtained from various research programs with the associated geolocation and date of collection, as well as with taxonomic information ranging from group down to species. Minimum, maximum and average cell sizes were mined from the literature for each taxonomic entry, and all abundance data were subsequently converted to biovolume and C biomass using the same methodology.
Abstract:
Microzooplankton database. Originally published in: Buitenhuis, Erik, Richard Rivkin, Sévrine Sailley, Corinne Le Quéré (2010) Biogeochemical fluxes through microzooplankton. Global Biogeochemical Cycles Vol. 24, GB4015, doi:10.1029/2009GB003601. This new version has had some mistakes corrected.
Abstract:
We compiled a database of bacterial abundance of 39 766 data points. After gridding with 1° spacing, the database covers 1.3% of the ocean surface. There are data covering all ocean basins and depths except the Southern Hemisphere below 350 m or from April until June. The average bacterial biomass is 3.9 ± 3.6 µg l⁻¹ with a 20-fold decrease between the surface and the deep sea. We estimate a total ocean inventory of about 1.3 × 10²⁹ bacteria. Using an average of published open ocean measurements for the conversion from abundance to carbon biomass of 9.1 fg cell⁻¹, we calculate a bacterial carbon inventory of about 1.2 Pg C. The main source of uncertainty in this inventory is the conversion factor from abundance to biomass.
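The carbon inventory quoted above follows from a single unit conversion, sketched below in Python as a cross-check. The sketch reads the total cell count as 1.3 × 10^29 and uses the stated conversion factor of 9.1 fg of carbon per cell; the function name `carbon_inventory_pg` is hypothetical.

```python
FEMTOGRAMS_TO_GRAMS = 1e-15   # 1 fg = 1e-15 g
GRAMS_PER_PETAGRAM = 1e15     # 1 Pg = 1e15 g


def carbon_inventory_pg(n_cells: float, fg_carbon_per_cell: float) -> float:
    """Total carbon in petagrams for a cell count and a per-cell carbon content in fg."""
    grams = n_cells * fg_carbon_per_cell * FEMTOGRAMS_TO_GRAMS
    return grams / GRAMS_PER_PETAGRAM


# Values taken from the abstract: about 1.3e29 cells, 9.1 fg C per cell.
inventory = carbon_inventory_pg(1.3e29, 9.1)
print(f"{inventory:.2f} Pg C")
```

The result, about 1.18 Pg C, is consistent with the "about 1.2 Pg C" in the abstract, and it scales linearly with the conversion factor, which is why that factor dominates the uncertainty.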
Abstract:
Macrozooplankton are an important link between higher and lower trophic levels in the oceans. They serve as the primary food for fish, reptiles, birds and mammals in some regions, and play a role in the export of carbon from the surface to the intermediate and deep ocean. Little, however, is known of their global distribution and biomass. Here we compiled a dataset of macrozooplankton abundance and biomass observations for the global ocean from a collection of four datasets. We harmonise the data to common units, calculate additional carbon biomass where possible, and bin the dataset in a global 1 x 1 degree grid. This dataset is part of a wider effort to provide a global picture of carbon biomass data for key plankton functional types, in particular to support the development of marine ecosystem models. Over 387 700 abundance data and 1330 carbon biomass data have been collected from pre-existing datasets. A further 34 938 abundance data were converted to carbon biomass data using species-specific length frequencies or using species-specific abundance to carbon biomass data. Depth-integrated values are used to calculate known epipelagic macrozooplankton biomass concentrations and global biomass. Global macrozooplankton biomass has a mean of 8.4 µg C l⁻¹, median of 0.15 µg C l⁻¹ and a standard deviation of 63.46 µg C l⁻¹. The global annual average estimate of epipelagic macrozooplankton, based on the median value, is 0.02 Pg C. Biomass is highest in the tropics, decreasing in the sub-tropics and increasing slightly towards the poles. There are, however, limitations on the dataset; abundance observations have good coverage except in the South Pacific mid latitudes, but biomass observation coverage is only good at high latitudes. Biomass is restricted to data that were originally given in carbon or to data that can be converted from abundance to carbon.
Carbon conversions from abundance are restricted for the most part by the lack of information on the size of the organism and/or the absence of taxonomic information. Distribution patterns of global macrozooplankton biomass and statistical information about biomass concentrations may be used to validate biogeochemical models and Plankton Functional Type models.