935 results for Complex data
                                
Abstract:
The past few decades have seen a considerable increase in the number of parallel and distributed systems. With the development of more complex applications, the need for more powerful systems has emerged and various parallel and distributed environments have been designed and implemented. Each of the environments, including hardware and software, has unique strengths and weaknesses. There is no single parallel environment that can be identified as the best environment for all applications with respect to hardware and software properties. The main goal of this thesis is to provide a novel way of performing data-parallel computation in parallel and distributed environments by utilizing the best characteristics of different aspects of parallel computing. For the purpose of this thesis, three aspects of parallel computing were identified and studied. First, three parallel environments (shared memory, distributed memory, and a network of workstations) are evaluated to quantify their suitability for different parallel applications. Due to the parallel and distributed nature of the environments, networks connecting the processors in these environments were investigated with respect to their performance characteristics. Second, scheduling algorithms are studied in order to make them more efficient and effective. A concept of application-specific information scheduling is introduced. The application-specific information is data about the workload extracted from an application, which is provided to a scheduling algorithm. Three scheduling algorithms are enhanced to utilize the application-specific information to further refine their scheduling properties. A more accurate description of the workload is especially important in cases where the work units are heterogeneous and the parallel environment is heterogeneous and/or non-dedicated. The results obtained show that the additional information regarding the workload has a positive impact on the performance of applications.
Third, a programming paradigm for networks of symmetric multiprocessor (SMP) workstations is introduced. The MPIT programming paradigm incorporates the Message Passing Interface (MPI) with threads to provide a methodology to write parallel applications that efficiently utilize the available resources and minimize the overhead. The MPIT allows for communication and computation to overlap by deploying a dedicated thread for communication. Furthermore, the programming paradigm implements an application-specific scheduling algorithm. The scheduling algorithm is executed by the communication thread. Thus, the scheduling does not affect the execution of the parallel application. Performance results achieved from the MPIT show that considerable improvements over conventional MPI applications are achieved.
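The overlap of communication and computation described above can be illustrated with a minimal sketch. It uses only Python's standard `threading` and `queue` modules rather than MPI, so it is an analogy for the pattern and not the MPIT implementation; the function name `run_with_comm_thread`, the "costliest unit first" policy, and the sum-of-squares workload are all assumptions made for illustration.

```python
import queue
import threading


def run_with_comm_thread(work_units, estimated_costs):
    """Process work units while a dedicated thread does scheduling and hand-off."""
    todo = queue.Queue(maxsize=2)  # small buffer: the next unit is staged during computation
    done = queue.Queue()
    results = [None] * len(work_units)

    def communicate():
        # Assumed application-specific policy: hand out the costliest units first.
        order = sorted(range(len(work_units)),
                       key=lambda i: estimated_costs[i], reverse=True)
        for i in order:
            todo.put((i, work_units[i]))
        todo.put(None)  # sentinel: no more work
        for _ in order:
            i, value = done.get()
            results[i] = value

    def compute():
        while True:
            item = todo.get()
            if item is None:
                break
            i, unit = item
            done.put((i, sum(x * x for x in unit)))  # stand-in for the real computation

    comm = threading.Thread(target=communicate)
    worker = threading.Thread(target=compute)
    comm.start()
    worker.start()
    worker.join()
    comm.join()
    return results
```

Because the scheduling decision lives entirely in `communicate`, the compute loop never pauses to decide what to do next, which is the property the abstract attributes to running the scheduler on the communication thread.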
A priori parameterisation of the CERES soil-crop models and tests against several European data sets
                                
Abstract:
Mechanistic soil-crop models have become indispensable tools to investigate the effect of management practices on the productivity or environmental impacts of arable crops. Ideally these models may claim to be universally applicable because they simulate the major processes governing the fate of inputs such as fertiliser nitrogen or pesticides. However, because they deal with complex systems and uncertain phenomena, site-specific calibration is usually a prerequisite to ensure their predictions are realistic. This statement implies that some experimental knowledge of the system to be simulated should be available prior to any modelling attempt, which severely limits practical applications of models. Because the demand for more general simulation results is high, modellers have nevertheless taken the bold step of extrapolating a model tested within a limited sample of real conditions to a much larger domain. While methodological questions are often disregarded in this extrapolation process, they are specifically addressed in this paper, in particular the issue of a priori model parameterisation. We thus implemented and tested a standard procedure to parameterise the soil components of a modified version of the CERES models. The procedure converts routinely available soil properties into functional characteristics by means of pedo-transfer functions. The resulting predictions of soil water and nitrogen dynamics, as well as crop biomass, nitrogen content and leaf area index, were compared to observations from trials conducted in five locations across Europe (southern Italy, northern Spain, northern France and northern Germany). In three cases, the model’s performance was judged acceptable when compared to experimental errors on the measurements, based on a test of the model’s root mean squared error (RMSE). Significant deviations between observations and model outputs were however noted in all sites, and could be ascribed to various model routines.
In decreasing importance, these were: water balance, the turnover of soil organic matter, and crop N uptake. A better match to field observations could therefore be achieved by visually adjusting related parameters, such as field-capacity water content or the size of soil microbial biomass. As a result, model predictions fell within the measurement errors in all sites for most variables, and the model’s RMSE was within the range of published values for similar tests. We conclude that the proposed a priori method yields acceptable simulations with only a 50% probability, a figure which may be greatly increased through a posteriori calibration. Modellers should thus exercise caution when extrapolating their models to a large sample of pedo-climatic conditions for which they have only limited information.
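As a minimal sketch of the kind of check described above, the following computes the RMSE between observations and simulations and compares it with a measurement error. The abstract does not specify the statistical test used, so the simple threshold comparison in `within_measurement_error` and the parameter name `measurement_sd` are assumptions for illustration only.

```python
import math


def rmse(observed, simulated):
    """Root mean squared error between paired observations and model outputs."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def within_measurement_error(observed, simulated, measurement_sd):
    # Assumed criterion: the model is "acceptable" when its RMSE does not
    # exceed the standard deviation of the measurements.
    return rmse(observed, simulated) <= measurement_sd
```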
                                
Abstract:
Microphthalmia with linear skin defects (MLS) syndrome is an X-linked male-lethal disorder also known as MIDAS (microphthalmia, dermal aplasia, and sclerocornea). Additional clinical features include neurological and cardiac abnormalities. MLS syndrome is genetically heterogeneous given that heterozygous mutations in HCCS or COX7B have been identified in MLS-affected females. Both genes encode proteins involved in the structure and function of complexes III and IV, which form the terminal segment of the mitochondrial respiratory chain (MRC). However, not all individuals with MLS syndrome carry a mutation in either HCCS or COX7B. The majority of MLS-affected females have severe skewing of X chromosome inactivation, suggesting that mutations in HCCS, COX7B, and other as-yet-unidentified X-linked gene(s) cause selective loss of cells in which the mutated X chromosome is active. By applying whole-exome sequencing and filtering for X-chromosomal variants, we identified a de novo nonsense mutation in NDUFB11 (Xp11.23) in one female individual and a heterozygous 1-bp deletion in a second individual, her asymptomatic mother, and an affected aborted fetus of the subject's mother. NDUFB11 encodes one of 30 poorly characterized supernumerary subunits of NADH:ubiquinone oxidoreductase, known as complex I (cI), the first and largest enzyme of the MRC. By shRNA-mediated NDUFB11 knockdown in HeLa cells, we demonstrate that NDUFB11 is essential for cI assembly and activity as well as cell growth and survival. These results demonstrate that X-linked genetic defects leading to the complete inactivation of complex I, III, or IV underlie MLS syndrome. Our data reveal an unexpected role of cI dysfunction in a developmental phenotype, further underscoring the existence of a group of mitochondrial diseases associated with neurocutaneous manifestations.
                                
Abstract:
The current challenge in a context of major environmental changes is to anticipate the responses of species to future landscape and climate scenarios. In the Mediterranean basin, climate change is one of the most powerful driving forces of fire dynamics, with fire frequency and impact having markedly increased in recent years. Species distribution modelling plays a fundamental role in this challenge, but better integration of available ecological knowledge is needed to adequately guide conservation efforts. Here, we quantified changes in habitat suitability of an early-succession bird in Catalonia, the Dartford Warbler (Sylvia undata), globally evaluated as Near Threatened on the IUCN Red List. We assessed potential changes in species distributions between 2000 and 2050 under different fire management and climate change scenarios and described landscape dynamics using a spatially explicit fire-succession model that simulates fire impacts in the landscape and post-fire regeneration (MEDFIRE model). Dartford Warbler occurrence data were acquired at two different spatial scales from: 1) the Atlas of European Breeding Birds (EBCC) and 2) the Catalan Breeding Bird Atlas (CBBA). Habitat suitability was modelled using five widely used modelling techniques in an ensemble forecasting framework. Our results indicated considerable habitat suitability losses (ranging between 47% and 57% in baseline scenarios), which were modulated to a large extent by fire regime changes derived from fire management policies and climate changes. This result highlights the need to take the spatial interaction between climate changes, fire-mediated landscape dynamics and fire management policies into account in order to coherently anticipate habitat suitability changes of early-succession bird species.
We conclude that fire management programs need to be integrated into conservation plans to effectively preserve sparsely forested and early succession habitats and their associated species in the face of global environmental change.
                                
Abstract:
The linking of North and South America by the Isthmus of Panama had major impacts on global climate, oceanic and atmospheric currents, and biodiversity, yet the timing of this critical event remains contentious. The Isthmus is traditionally understood to have fully closed by ca. 3.5 million years ago (Ma), and this date has been used as a benchmark for oceanographic, climatic, and evolutionary research, but recent evidence suggests a more complex geological formation. Here, we analyze both molecular and fossil data to evaluate the tempo of biotic exchange across the Americas in light of geological evidence. We demonstrate significant waves of dispersal of terrestrial organisms at ca. 20 and 6 Ma and corresponding events separating marine organisms in the Atlantic and Pacific oceans at ca. 23 and 7 Ma. The direction and rates of dispersal were symmetrical until the last ca. 6 Ma, when northern migration of South American lineages increased significantly. Variability among taxa in their timing of dispersal or vicariance across the Isthmus is not explained by the ecological factors tested in these analyses, including biome type, dispersal ability, and elevation preference. Migration was therefore not generally regulated by intrinsic traits but more likely reflects the presence of emergent terrain several millions of years earlier than commonly assumed. These results indicate that the dramatic biotic turnover associated with the Great American Biotic Interchange was a long and complex process that began as early as the Oligocene-Miocene transition.
                                
Abstract:
Maximum entropy modeling (Maxent) is a widely used algorithm for predicting species distributions across space and time. Properly assessing the uncertainty in such predictions is non-trivial and requires validation with independent datasets. Notably, model complexity (number of model parameters) remains a major concern in relation to overfitting and, hence, transferability of Maxent models. An emerging approach is to validate the cross-temporal transferability of model predictions using paleoecological data. In this study, we assess the effect of model complexity on the performance of Maxent projections across time using two European plant species (Alnus glutinosa (L.) Gaertn. and Corylus avellana L.) with an extensive late Quaternary fossil record in Spain as a study case. We fit 110 models with different levels of complexity under present time and tested model performance using AUC (area under the receiver operating characteristic curve) and AICc (corrected Akaike Information Criterion) through the standard procedure of randomly partitioning current occurrence data. We then compared these results to an independent validation by projecting the models to mid-Holocene (6000 years before present) climatic conditions in Spain to assess their ability to predict fossil pollen presence-absence and abundance. We find that calibrating Maxent models with default settings results in the generation of overly complex models. While model performance increased with model complexity when predicting current distributions, it was higher with intermediate complexity when predicting mid-Holocene distributions. Hence, models of intermediate complexity resulted in the best trade-off to predict species distributions across time. Reliable temporal model transferability is especially relevant for forecasting species distributions under future climate change.
Consequently, species-specific model tuning should be used to find the best modeling settings to control for complexity, notably with paleoecological data to independently validate model projections. For cross-temporal projections of species distributions for which paleoecological data is not available, models of intermediate complexity should be selected.
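To make the complexity penalty concrete, the sketch below implements the standard small-sample corrected Akaike Information Criterion, AICc = 2k - 2 ln L + 2k(k + 1)/(n - k - 1), and picks the candidate with the lowest value. How the log-likelihood of a Maxent model is obtained is not covered here, and the candidate names and numbers in the usage note are invented for illustration; this is not the study's own tuning procedure.

```python
def aicc(log_likelihood, n_params, n_samples):
    """Corrected Akaike Information Criterion (lower is better)."""
    if n_samples - n_params - 1 <= 0:
        raise ValueError("AICc is undefined unless n_samples > n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    return aic + 2 * n_params * (n_params + 1) / (n_samples - n_params - 1)


def best_by_aicc(candidates, n_samples):
    """Return the name of the candidate with the lowest AICc.

    `candidates` maps a model name to a (log_likelihood, n_params) pair.
    """
    return min(candidates, key=lambda name: aicc(*candidates[name], n_samples))
```

With hypothetical candidates `{"simple": (-120.0, 3), "intermediate": (-100.0, 8), "complex": (-98.0, 30)}` and 60 occurrence records, the small likelihood gain of the 30-parameter model does not offset its penalty, so the intermediate model is selected.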
                                
Abstract:
Dissolved organic matter (DOM) is a complex mixture of organic compounds, ubiquitous in marine and freshwater systems. Fluorescence spectroscopy, by means of Excitation-Emission Matrices (EEM), has become an indispensable tool to study DOM sources, transport and fate in aquatic ecosystems. However, the statistical treatment of large and heterogeneous EEM data sets still represents an important challenge for biogeochemists. Recently, the Self-Organising Map (SOM) has been proposed as a tool to explore patterns in large EEM data sets. SOM is a pattern recognition method which clusters and reduces the dimensionality of input EEMs without relying on any assumption about the data structure. In this paper, we show how SOM, coupled with a correlation analysis of the component planes, can be used both to explore patterns among samples and to identify individual fluorescence components. We analysed a large and heterogeneous EEM data set, including samples from a river catchment collected under a range of hydrological conditions, along a 60-km downstream gradient, and under the influence of different degrees of anthropogenic impact. According to our results, chemical industry effluents appeared to have unique and distinctive spectral characteristics. On the other hand, river samples collected under flash flood conditions showed homogeneous EEM shapes. The correlation analysis of the component planes suggested the presence of four fluorescence components, consistent with DOM components previously described in the literature. A remarkable strength of this methodology was that outlier samples appeared naturally integrated in the analysis. We conclude that SOM coupled with a correlation analysis procedure is a promising tool for studying large and heterogeneous EEM data sets.
                                
Abstract:
CREB-binding protein (CBP) and p300 are transcriptional coactivators involved in numerous biological processes that affect cell growth, transformation, differentiation, and development. In this study, we provide evidence of the involvement of homeodomain-interacting protein kinase 2 (HIPK2) in the regulation of CBP activity. We show that HIPK2 interacts with and phosphorylates several regions of CBP. We demonstrate that serines 2361, 2363, 2371, 2376, and 2381 are responsible for the HIPK2-induced mobility shift of CBP C-terminal activation domain. Moreover, we show that HIPK2 strongly potentiates the transcriptional activity of CBP. However, our data suggest that HIPK2 activates CBP mainly by counteracting the repressive action of cell cycle regulatory domain 1 (CRD1), located between amino acids 977 and 1076, independently of CBP phosphorylation. Our findings thus highlight a complex regulation of CBP activity by HIPK2, which might be relevant for the control of specific sets of target genes involved in cellular proliferation, differentiation and apoptosis.
                                
Abstract:
The second scientific meeting of the European systems genetics network for the study of complex genetic human disease using genetic reference populations (SYSGENET) took place at the Center for Cooperative Research in Biosciences in Bilbao, Spain, December 10-12, 2012. SYSGENET is funded by the European Cooperation in the Field of Scientific and Technological Research (COST) and represents a network of scientists in Europe that use mouse genetic reference populations (GRPs) to identify complex genetic factors influencing disease phenotypes (Schughart, Mamm Genome 21:331-336, 2010). About 50 researchers working in the field of systems genetics attended the meeting, which consisted of 27 oral presentations, a poster session, and a management committee meeting. Participants exchanged results, set up future collaborations, and shared phenotyping and data analysis methodologies. This meeting was particularly instrumental for conveying the current status of the US, Israeli, and Australian Collaborative Cross (CC) mouse GRP. The CC is an open source project initiated nearly a decade ago by members of the Complex Trait Consortium to aid the mapping of multigenetic traits (Threadgill, Mamm Genome 13:175-178, 2002). In addition, representatives of the International Mouse Phenotyping Consortium were invited to exchange ongoing activities between the knockout and complex genetics communities and to discuss and explore potential fields for future interactions.
                                
Abstract:
Extensive defects of the pelvis and genitoperineal region are a reconstructive challenge. We discuss a consecutive series of 25 reconstructions with the pedicled anterolateral thigh (ALT) flap including a muscle part of the vastus lateralis (VL) in 23 patients from October 1999 to September 2012. Only surface defects larger than 100 cm² and reconstructions by composite ALT + VL were included in this retrospective analysis. Of the 23 patients, 19 underwent oncologic resection, whereas 4 cases presented Fournier gangrene. Three patients did not reach 6 months of follow-up and were excluded from further data analysis. Among the remaining 20 patients (22 reconstructions), the average follow-up period was 14 months (range, 10-18 months). The patients' average age was 60 years. The average size of the defect was 182 cm². Postoperative complications included 1 (4.5%) flap necrosis out of 22 raised flaps, 1 partial flap necrosis after venous congestion, and 2 cases where a complementary reconstructive procedure was performed due to remaining defect or partial flap failure. In 6 cases, peripheral wound dehiscence (27%) was treated by debridement followed by split-thickness skin graft or advancement local flaps. Defect size was significantly related to postoperative complications and increased hospital stay, especially in those patients who underwent preoperative radiotherapy. At the end of the follow-up period, a long-term and satisfactory coverage was obtained in all patients without functional deficits. This consecutive series of composite ALT + VL flaps shows that, in case of extended defects, the flap provides an excellent and adjustable muscle mass, is reliable with minimal donor-site morbidity, and can even be designed as a sensate flap.
                                
Abstract:
Nowadays, Species Distribution Models (SDMs) are a widely used tool. Using different statistical approaches, these models reconstruct the realized niche of a species using presence data and a set of variables, often topoclimatic. Their range of uses is quite large, from understanding single species requirements to the creation of nature reserves based on species hotspots or the modeling of climate change impacts. Most of the time these models use variables at a resolution of 50 km x 50 km or 1 km x 1 km. However, in some cases these models are used with resolutions below the kilometer scale and are thus called high resolution models (100 m x 100 m or 25 m x 25 m). Quite recently a new kind of data has emerged enabling precision up to 1 m x 1 m and thus allowing very high resolution modeling. However, these new variables are very costly and need a considerable amount of time to be processed. This is especially the case when these variables are used in complex calculations such as model projections over large areas. Moreover, the importance of very high resolution data in SDMs has not been assessed yet and is not well understood. Some basic knowledge of what drives species presence-absences is still missing. Indeed, it is not clear whether in mountain areas like the Alps coarse topoclimatic gradients are driving species distributions, or if fine scale temperature or topography are more important, or if their importance can be neglected when balanced against competition or stochasticity. In this thesis I investigated the importance of very high resolution data (2-5 m) in species distribution models using very high resolution topographic, climatic or edaphic variables over a 2000 m elevation gradient in the Western Swiss Alps. I also investigated more local responses of these variables for a subset of species living in this area at two precise elevation belts.
During this thesis I showed that high resolution data require very good datasets (species and variables for the models) to produce satisfactory results. Indeed, in mountain areas, temperature is the most important factor driving species distribution and needs to be modeled at very fine resolution, instead of being interpolated over large surfaces, to produce satisfactory results. Despite the intuitive idea that topography should be very important at high resolution, results are mixed. However, looking at the importance of variables over a large gradient buffers the importance of the variables. Indeed, topographic factors have been shown to be highly important at the subalpine level, but their importance decreases at lower elevations. Whereas edaphic and land use factors are more important at the montane level, high resolution topographic data are more important at the subalpine level. Finally, the biggest improvement in the models happens when edaphic variables are added. Indeed, adding soil variables is of high importance, and variables like pH surpass the usual topographic variables in SDMs in terms of importance in the models. To conclude, high resolution is very important in modeling but requires very good datasets. Only increasing the resolution of the usual topoclimatic predictors is not sufficient, and the use of edaphic predictors has been highlighted as fundamental to produce significantly better models. This is of primary importance, especially if these models are used to reconstruct communities or as a basis for biodiversity assessments. -- In recent years, the use of species distribution models (SDMs) has increased steadily. These models use different statistical tools to reconstruct the realized niche of a species from variables, notably climatic or topographic ones, and from presence data collected in the field.
Their use covers many fields, from the study of a species' ecology to the reconstruction of communities or the impact of global warming. Most of the time, these models use occurrences from global databases at a rather coarse resolution (1 km or even 50 km). Some databases nevertheless make it possible to work at high resolution, that is, to go below the kilometre scale and work at resolutions of 100 m x 100 m or 25 m x 25 m. Recently, a new generation of very high resolution data has appeared that makes it possible to work at the metre scale. The variables that can be generated from these new data are, however, very costly and take considerable time to process. Indeed, any complex statistical calculation, such as projections of species distributions over large areas, requires powerful computers and a lot of time. Moreover, the factors governing species distributions at fine scale are still poorly known, and the importance of high resolution variables such as microtopography or temperature in the models is not certain. Other factors such as competition or natural stochasticity could have an equally strong influence. This is the context of my thesis work. I sought to understand the importance of high resolution in species distribution models, whether for temperature, microtopography or edaphic variables, along a large elevation gradient in the Vaud Prealps. I also sought to understand the local impact of certain variables potentially overlooked because of confounding effects along the elevation gradient.
During this thesis, I was able to show that high resolution variables, whether related to temperature or to microtopography, bring only a limited improvement to the models. To detect a sizeable improvement, it is necessary to work with larger datasets, in terms of both the species and the variables used. For example, the usually interpolated climatic layers must be replaced by temperature layers modelled at high resolution from field data. Working along a 2000 m temperature gradient naturally makes temperature very important in the models. The importance of microtopography is negligible compared with topography at a resolution of 25 m. However, at a more local scale, high resolution is an extremely important variable in the subalpine environment. At the montane level, by contrast, variables related to soils and land use are very important. Finally, species distribution models were particularly improved by the addition of edaphic variables, mainly pH, whose importance exceeds or equals that of the topographic variables when they are added to the usual species distribution models.
                                
Abstract:
The management and conservation of coastal waters in the Baltic is challenged by a number of complex environmental problems, including eutrophication and habitat degradation. Demands for a more holistic, integrated and adaptive framework of ecosystem-based management emphasize the importance of appropriate information on the status and changes of the aquatic ecosystems. The thesis focuses on the spatiotemporal aspects of environmental monitoring in the extensive and geomorphologically complex coastal region of SW Finland, where the acquisition of spatially and temporally representative monitoring data is inherently challenging. Furthermore, the region is subject to multiple human interests and uses. A holistic geographical approach is emphasized, as it is ultimately the physical conditions that set the frame for any human activity. Characteristics of the coastal environment were examined using water quality data from the database of the Finnish environmental administration and Landsat TM/ETM+ images. A basic feature of the complex aquatic environment in the Archipelago Sea is its high spatial and temporal variability; this foregrounds the importance of geographical information as a basis of environmental assessments. While evidence of a consistent water turbidity pattern was observed, the coastal hydrodynamic realm is also characterized by high spatial and temporal variability. It is therefore also crucial to consider the spatial and temporal representativeness of field monitoring data. Remote sensing may facilitate evaluation of hydrodynamic conditions in the coastal region and the spatial extrapolation of in situ data despite their restrictions. Additionally, remotely sensed images can be used in the mapping of many of those coastal habitats that need to be considered in environmental management. 
With regard to surface water monitoring, only a small fraction of the currently available data stored in the Hertta-PIVET register can be used effectively in scientific studies and environmental assessments. Long-term consistent data collection from established sampling stations should be emphasized, but research-type seasonal assessments producing abundant data should also be encouraged. Thus a more comprehensive coordination of field work efforts is called for. The integration of remote sensing and various field measurement techniques would be especially useful in the complex coastal waters. The integration and development of monitoring systems in Finnish coastal areas also requires further scientific assessment of monitoring practices. A holistic approach to the gathering and management of environmental monitoring data could be a cost-effective way of serving a multitude of information needs, and would fit the holistic, ecosystem-based management regimes that are currently being strongly promoted in Europe.
                                
Abstract:
About 50% of living species are holometabolan insects. Therefore, unraveling the origin of insect metamorphosis from the hemimetabolan (gradual metamorphosis) to the holometabolan (sudden metamorphosis at the end of the life cycle) mode is equivalent to explaining how all this biodiversity originated. One of the problems with studying the evolution from hemimetaboly to holometaboly is that most information is available only in holometabolan species. Within the hemimetabolan group, our model, the cockroach Blattella germanica, is the most studied species. However, given that the study of adult morphogenesis at organismic level is still complex, we focused on the study of the tergal gland (TG) as a minimal model of metamorphosis. The TG is formed in tergites 7 and 8 (T7-8) in the last days of the last nymphal instar (nymph 6). The comparative study of four T7-T8 transcriptomes provided us with crucial keys of TG formation, but also essential information about the mechanisms and circuitry that allow the shift from nymphal to adult morphogenesis.
                                
Abstract:
Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR (the European infrastructure for biological information), that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools.
                                
Abstract:
The most suitable method for estimation of size diversity is investigated. Size diversity is computed on the basis of the Shannon diversity expression adapted for continuous variables, such as size. It takes the form of an integral involving the probability density function (pdf) of the size of the individuals. Different approaches for the estimation of the pdf are compared: parametric methods, assuming that data come from a determinate family of pdfs, and nonparametric methods, where the pdf is estimated using some kind of local evaluation. Exponential, generalized Pareto, normal, and log-normal distributions have been used to generate simulated samples using estimated parameters from real samples. Nonparametric methods include discrete computation of data histograms based on size intervals and continuous kernel estimation of the pdf. The kernel approach gives accurate estimation of size diversity, whilst parametric methods are only useful when the reference distribution has a similar shape to the real one. Special attention is given to data standardization. The division of data by the sample geometric mean is proposed as the most suitable standardization method, which shows additional advantages: the same size diversity value is obtained when using original size or log-transformed data, and size measurements with different dimensionality (lengths, areas, volumes or biomasses) may be immediately compared with the simple addition of ln k, where k is the dimensionality (1, 2, or 3, respectively). Thus, the kernel estimation, after data standardization by division by the sample geometric mean, arises as the most reliable and generalizable method of size diversity evaluation.
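The approach summarised above can be sketched as follows: sizes are divided by their geometric mean, a kernel density estimate of the pdf p(x) is built, and the continuous Shannon expression, minus the integral of p(x) ln p(x) dx, is evaluated numerically. The Gaussian kernel, the normal-reference bandwidth (1.06 · sd · n^(-1/5)) and the trapezoidal integration are assumptions chosen for brevity, not necessarily the estimator used by the authors; note also that a Gaussian kernel places some density on negative sizes, which a more careful implementation would avoid.

```python
import math


def geometric_mean(sizes):
    return math.exp(sum(math.log(s) for s in sizes) / len(sizes))


def gaussian_kde(data, bandwidth):
    """Return a function evaluating a Gaussian kernel density estimate."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        return norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)

    return pdf


def size_diversity(sizes, grid_points=400):
    """Shannon size diversity of positive sizes (needs at least two distinct values)."""
    gm = geometric_mean(sizes)
    data = [s / gm for s in sizes]  # standardization by the geometric mean
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((d - mean) ** 2 for d in data) / (n - 1))
    h = 1.06 * sd * n ** (-0.2)  # assumed rule-of-thumb bandwidth
    pdf = gaussian_kde(data, h)

    lo, hi = min(data) - 4 * h, max(data) + 4 * h
    step = (hi - lo) / (grid_points - 1)
    total = 0.0
    for i in range(grid_points):
        p = pdf(lo + i * step)
        term = -p * math.log(p) if p > 0 else 0.0
        weight = 0.5 if i in (0, grid_points - 1) else 1.0  # trapezoidal rule
        total += weight * term * step
    return total
```

One consequence of the standardization is easy to check: multiplying every size by a constant (for example, changing measurement units) leaves the computed diversity unchanged, because the standardized data are the same.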
 
                    