982 results for Orion DBMS, Database, Uncertainty, Uncertain values, Benchmark
Abstract:
Given the importance of Guzera breeding programs for milk production in the tropics, the objective of this study was to compare alternative random regression models for estimation of genetic parameters and prediction of breeding values. Test-day milk yield records (TDR) were collected monthly, up to a maximum of 10 measurements. The database included 20,524 first-lactation records from 2,816 Guzera cows. TDR data were analyzed by random regression models (RRM) considering additive genetic, permanent environmental and residual effects as random and the effects of contemporary group (CG), calving age as a covariate (linear and quadratic effects) and mean lactation curve as fixed. The additive genetic and permanent environmental effects were modeled by RRM using Wilmink, Ali and Schaeffer and cubic B-spline functions as well as Legendre polynomials. Residual variances were considered as heterogeneous classes, grouped differently according to the model used. Multi-trait analysis using finite-dimensional models (FDM) for test-day milk records (TDR) and a single-trait model for 305-day milk yield (P305) using the restricted maximum likelihood method were also carried out as further comparisons. According to the statistical criteria adopted, the best RRM was the one that used the cubic B-spline function with five random regression coefficients for the additive genetic and permanent environmental effects. However, the models using the Ali and Schaeffer function or Legendre polynomials of second and fifth order for, respectively, the additive genetic and permanent environmental effects can be adopted, as little variation was observed in the genetic parameter estimates compared to those estimated by models using the B-spline function. Therefore, due to the lower complexity in the (co)variance estimations, the model using Legendre polynomials represented the best option for the genetic evaluation of the Guzera lactation records.
An increase of 3.6% in the accuracy of the estimated breeding values was verified when using RRM. The ranks of animals were very close whatever the RRM for the data set used to predict breeding values. Considering P305, results indicated only small to medium difference in the animals' ranking based on breeding values predicted by the conventional model or by RRM. Therefore, the sum of all the RRM-predicted breeding values along the lactation period (RRM305) can be used as a selection criterion for 305-day milk production. (c) 2014 Elsevier B.V. All rights reserved.
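As a minimal illustration of the Legendre-polynomial covariates used in random regression models of this kind, the sketch below evaluates normalized Legendre polynomials on standardized days in milk and sums a breeding-value curve over the lactation, in the spirit of RRM305. The 5 to 305 day range, the function names and any coefficient values passed in are assumptions for illustration and are not taken from the study.

```python
import math

def standardize(day, first_day=5, last_day=305):
    """Map a day in milk onto [-1, 1], the domain of Legendre polynomials."""
    return -1.0 + 2.0 * (day - first_day) / (last_day - first_day)

def legendre_covariates(x, order):
    """Normalized Legendre polynomials phi_0..phi_order evaluated at x."""
    # Bonnet recursion for the unnormalized polynomials P_k(x)
    p = [1.0, x]
    for k in range(1, order):
        p.append(((2 * k + 1) * x * p[k] - k * p[k - 1]) / (k + 1))
    # Scale each polynomial so that it has unit norm on [-1, 1]
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(order + 1)]

def breeding_value(day, coefficients, first_day=5, last_day=305):
    """Breeding value on one day from an animal's random regression coefficients."""
    x = standardize(day, first_day, last_day)
    phi = legendre_covariates(x, len(coefficients) - 1)
    return sum(c * f for c, f in zip(coefficients, phi))

def rrm305(coefficients, first_day=5, last_day=305):
    """Sum of daily breeding values over the lactation period."""
    return sum(
        breeding_value(day, coefficients, first_day, last_day)
        for day in range(first_day, last_day + 1)
    )
```

With three coefficients the curve is a second-order Legendre polynomial, the order the abstract reports for the additive genetic effect.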
Abstract:
Landscape fires show large variability in the amount of biomass or fuel consumed per unit area burned. Fuel consumption (FC) depends on the biomass available to burn and the fraction of the biomass that is actually combusted, and can be combined with estimates of area burned to assess emissions. While burned area can be detected from space and estimates are becoming more reliable due to improved algorithms and sensors, FC is usually modeled or taken selectively from the literature. We compiled the peer-reviewed literature on FC for various biomes and fuel categories to understand FC and its variability better, and to provide a database that can be used to constrain biogeochemical models with fire modules. We compiled in total 77 studies covering 11 biomes including savanna (15 studies, average FC of 4.6 t DM (dry matter) ha⁻¹ with a standard deviation of 2.2), tropical forest (n = 19, FC = 126 ± 77), temperate forest (n = 12, FC = 58 ± 72), boreal forest (n = 16, FC = 35 ± 24), pasture (n = 4, FC = 28 ± 9.3), shifting cultivation (n = 2, FC = 23, with a range of 4.0-43), crop residue (n = 4, FC = 6.5 ± 9.0), chaparral (n = 3, FC = 27 ± 19), tropical peatland (n = 4, FC = 314 ± 196), boreal peatland (n = 2, FC = 42 [42-43]), and tundra (n = 1, FC = 40). Within biomes the regional variability in the number of measurements was sometimes large, with e.g. only three measurement locations in boreal Russia and 35 sites in North America. Substantial regional differences in FC were found within the defined biomes: for example, FC of temperate pine forests in the USA was 37% lower than that of Australian forests dominated by eucalypt trees. Besides showing the differences between biomes, FC estimates were also grouped into different fuel classes. Our results highlight the large variability in FC, not only between biomes but also within biomes and fuel classes.
This implies that substantial uncertainties are associated with using biome-averaged values to represent FC for whole biomes. Comparing the compiled FC values with co-located Global Fire Emissions Database version 3 (GFED3) FC indicates that modeling studies that aim to represent variability in FC within biomes as well still require improvements, as they have difficulty in representing the dynamics governing FC.
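The combination of burned area and FC mentioned in this abstract can be sketched as a simple product. The example below uses a few of the biome means and standard deviations quoted above; the function name, the table layout and the one-standard-deviation range are assumptions made for illustration, not the authors' method.

```python
# Biome-averaged fuel consumption (mean, standard deviation) in t DM per ha,
# copied from the figures quoted in the abstract above.
FC_BY_BIOME = {
    "savanna": (4.6, 2.2),
    "tropical forest": (126.0, 77.0),
    "temperate forest": (58.0, 72.0),
    "boreal forest": (35.0, 24.0),
}

def dry_matter_burned(burned_area_ha, biome, table=FC_BY_BIOME):
    """Return (low, central, high) dry matter burned in tonnes.

    low and high use FC minus/plus one standard deviation, with the lower
    bound clipped at zero because fuel consumption cannot be negative.
    """
    mean_fc, sd_fc = table[biome]
    low = max(mean_fc - sd_fc, 0.0) * burned_area_ha
    central = mean_fc * burned_area_ha
    high = (mean_fc + sd_fc) * burned_area_ha
    return low, central, high
```

For a hypothetical 1,000 ha savanna fire this gives a central estimate of about 4,600 t of dry matter, with a range spanning roughly a factor of three, which illustrates why a single biome-averaged value carries substantial uncertainty.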
Abstract:
Categorical data cannot be interpolated directly because they are outcomes of discrete random variables. Thus, types of categorical variables are transformed into indicator functions that can be handled by interpolation methods. Interpolated indicator values are then back-transformed to the original types of categorical variables. However, aspects such as variability and uncertainty of interpolated values of categorical data have never been considered. In this paper we show that the interpolation variance can be used to map an uncertainty zone around boundaries between types of categorical variables. Moreover, it is shown that the interpolation variance is a component of the total variance of the categorical variables, as measured by the coefficient of unalikeability. (C) 2011 Elsevier Ltd. All rights reserved.
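The indicator transform, back-transform and interpolation variance described in this abstract can be sketched as follows. Inverse-distance weights are used here as a simple stand-in for whatever interpolation weights the paper uses, and the function names and sample layout are hypothetical. The interpolation variance is computed as the weighted squared deviation of each indicator from its interpolated value.

```python
import math

def idw_weights(samples, target, power=2.0):
    """Inverse-distance weights that sum to 1 (a stand-in for kriging weights)."""
    dists = [math.dist(xy, target) for xy, _ in samples]
    if 0.0 in dists:
        # The target coincides with a sample: give that sample all the weight
        hit = dists.index(0.0)
        return [1.0 if i == hit else 0.0 for i in range(len(samples))]
    raw = [1.0 / d ** power for d in dists]
    total = sum(raw)
    return [r / total for r in raw]

def interpolate_category(samples, target, power=2.0):
    """Interpolate a categorical variable through indicator functions.

    samples is a list of ((x, y), category) pairs. Returns the most likely
    category and a dict mapping each category to (estimate, variance).
    """
    weights = idw_weights(samples, target, power)
    categories = sorted({cat for _, cat in samples})
    result = {}
    for cat in categories:
        # Indicator transform: 1 where the sample belongs to this category
        indicators = [1.0 if c == cat else 0.0 for _, c in samples]
        estimate = sum(w * i for w, i in zip(weights, indicators))
        # Interpolation variance: weighted squared deviation from the estimate
        variance = sum(w * (i - estimate) ** 2 for w, i in zip(weights, indicators))
        result[cat] = (estimate, variance)
    # Back-transform: pick the category with the largest interpolated indicator
    best = max(categories, key=lambda c: result[c][0])
    return best, result
```

With non-negative weights that sum to one, the variance of each indicator reduces to estimate × (1 − estimate), so it peaks where two categories are equally likely, which is the boundary zone the abstract refers to.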
Abstract:
Background: The biorhythm of serum uric acid, and the influence of gender and age on uric acid variability, were evaluated by spectral analysis in a large sample from a clinical laboratory database. Methods: Serum uric acid values were extracted from a large database of a clinical laboratory from May 2000 to August 2006. Outlier values were excluded from the analysis and the remaining data (n = 73,925) were grouped by gender and age ranges. Rhythm components were obtained by the Lomb-Scargle method and Cosinor analysis. Results: Serum uric acid was higher in men than in women older than 13 years (p<0.05). Compared with the 0-12 year group, uric acid increased in men but not in women older than 13 years (p<0.05). Circannual (12 months) and transyear (17 months) rhythm components were detected, but they were significant only in adult individuals (>26 years, p<0.05). Cosinor analysis showed that midline estimating statistic of rhythm (MESOR) values were higher in men (range: 353-368 µmol/L) than in women (range: 240-278 µmol/L; p<0.05), independent of the age and rhythm component. The extent of predictable change within a cycle, approximated by the double amplitude, represented up to 20% of the corresponding MESOR. Conclusions: Serum uric acid biorhythm is dependent on gender and age and it may have relevant influence on preanalytical variability of clinical laboratory results.
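For readers unfamiliar with Cosinor analysis, the sketch below fits a single-component cosinor model y(t) = MESOR + A·cos(2πt/τ + φ) by ordinary least squares, after rewriting it as a linear model in cos and sin terms. It is a generic textbook formulation under an assumed known period τ, not the authors' implementation, and all function names are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def cosinor_fit(times, values, period):
    """Return (mesor, amplitude, acrophase in radians) for one rhythm component."""
    omega = 2.0 * math.pi / period
    # Linear form: y = M + beta*cos(omega*t) + gamma*sin(omega*t)
    rows = [(1.0, math.cos(omega * t), math.sin(omega * t)) for t in times]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, beta, gamma = solve_linear(xtx, xty)
    amplitude = math.hypot(beta, gamma)
    acrophase = math.atan2(-gamma, beta)
    return mesor, amplitude, acrophase

def double_amplitude_percent(mesor, amplitude):
    """Predictable change within a cycle as a percentage of the MESOR."""
    return 100.0 * 2.0 * amplitude / mesor
```

Calling `cosinor_fit` with monthly time stamps and `period=12` would correspond to the circannual component; `double_amplitude_percent` expresses the double amplitude relative to the MESOR, the quantity the abstract reports as reaching up to 20%.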
Abstract:
Marine N2 fixing microorganisms, termed diazotrophs, are a key functional group in marine pelagic ecosystems. The biological fixation of dinitrogen (N2) to bioavailable nitrogen provides an important new source of nitrogen for pelagic marine ecosystems and influences primary productivity and organic matter export to the deep ocean. As one of a series of efforts to collect biomass and rates specific to different phytoplankton functional groups, we have constructed a database on diazotrophic organisms in the global pelagic upper ocean by compiling about 12 000 direct field measurements of cyanobacterial diazotroph abundances (based on microscopic cell counts or qPCR assays targeting the nifH genes) and N2 fixation rates. Biomass conversion factors are estimated based on cell sizes to convert abundance data to diazotrophic biomass. The database is limited spatially, lacking large regions of the ocean especially in the Indian Ocean. The data are approximately log-normally distributed, and large variances exist in most sub-databases with non-zero values differing 5 to 8 orders of magnitude. A lower mean N2 fixation rate was found in the North Atlantic Ocean than in the Pacific Ocean. Reporting the geometric mean and the range of one geometric standard error below and above the geometric mean, the pelagic N2 fixation rate in the global ocean is estimated to be 62 (53–73) Tg N yr⁻¹ and the pelagic diazotrophic biomass in the global ocean is estimated to be 4.7 (2.3–9.6) Tg C from cell counts and 89 (40–200) Tg C from nifH-based abundances. Uncertainties related to biomass conversion factors can change the estimate of geometric mean pelagic diazotrophic biomass in the global ocean by about ±70 %. This evolving database can be used to study spatial and temporal distributions and variations of marine N2 fixation, to validate geochemical estimates and to parameterize and validate biogeochemical models.
The database is stored in PANGAEA (http://doi.pangaea.de/10.1594/PANGAEA.774851).
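The reporting convention in this abstract (geometric mean with a range of one geometric standard error below and above it) can be sketched as below. The definition of the geometric standard error as the exponential of the standard error of the log-transformed values, and the exclusion of zero values, are assumptions made for this illustration; the function name is hypothetical.

```python
import math
import statistics

def geometric_summary(values):
    """Return (geometric mean, lower bound, upper bound) for positive data.

    The bounds are the geometric mean divided and multiplied by the
    geometric standard error, taken here as exp(sd(log x) / sqrt(n)).
    Zero and negative values are dropped because their logarithm is undefined.
    """
    logs = [math.log(v) for v in values if v > 0]
    mean_log = statistics.fmean(logs)
    se_log = statistics.stdev(logs) / math.sqrt(len(logs))
    gm = math.exp(mean_log)
    gse = math.exp(se_log)
    return gm, gm / gse, gm * gse
```

Because the bounds are obtained by dividing and multiplying, the interval is asymmetric around the geometric mean on the original scale, as in the 62 (53–73) figure quoted above.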
Abstract:
Analysis of the NewSQL system and experimentation with the VoltDB software on a defined benchmark.
Abstract:
Application based on the non-relational database MongoDB, integrated into an online tourist booking system.
Abstract:
The thesis first introduces the concept of Big Data, describing its main characteristics, uses, sources and the opportunities it can bring. It then explains the reasons that led to the birth of the NoSQL movement, such as the need to manage Big Data while keeping a structure that remains flexible over time. After a comparison with traditional systems, these DBMSs are classified into different families, with a brief account of the structural concepts on which they are based and an explanation of how they work. The document-oriented database MongoDB is then described: its structural details, the concepts on which it is based and the goals it sets itself are examined in depth, followed by a specific analysis of important functions such as insert and delete operations and the way the database is queried. Thanks to characteristics that make it highly performant, MongoDB was used as the underlying database for a web application that displays a map of urban connectivity.
Abstract:
The thesis gives a general overview of time series databases and their management systems. It then focuses on the InfluxDB DBMS. Finally, it presents a project that uses InfluxDB.
Abstract:
BACKGROUND: Detecting a benefit from closure of patent foramen ovale in patients with cryptogenic stroke is hampered by low rates of stroke recurrence and uncertainty about the causal role of patent foramen ovale in the index event. A method to predict patent foramen ovale-attributable recurrence risk is needed. However, individual databases generally have too few stroke recurrences to support risk modeling. Prior studies of this population have been limited by low statistical power for examining factors related to recurrence. AIMS: The aim of this study was to develop a database to support modeling of patent foramen ovale-attributable recurrence risk by combining extant data sets. METHODS: We identified investigators with extant databases including subjects with cryptogenic stroke investigated for patent foramen ovale, determined the availability and characteristics of data in each database, collaboratively specified the variables to be included in the Risk of Paradoxical Embolism database, harmonized the variables across databases, and collected new primary data when necessary and feasible. RESULTS: The Risk of Paradoxical Embolism database has individual clinical, radiologic, and echocardiographic data from 12 component databases, including subjects with cryptogenic stroke both with (n = 1925) and without (n = 1749) patent foramen ovale. In the patent foramen ovale subjects, a total of 381 outcomes (stroke, transient ischemic attack, death) occurred (median follow-up 2.2 years). While there were substantial variations in data collection between studies, there was sufficient overlap to define a common set of variables suitable for risk modeling.
CONCLUSION: While individual studies are inadequate for modeling patent foramen ovale-attributable recurrence risk, collaboration between investigators has yielded a database with sufficient power to identify those patients at highest risk for a patent foramen ovale-related stroke recurrence who may have the greatest potential benefit from patent foramen ovale closure.
Abstract:
This project examined the change in values in the still unfinished transitional period in Serbia during the 1990s and compared it with Greece in the same period. During this period the social and political transition affected the ruling value system primarily through changes in the modes of the production and representation of reality. The most remarkable trait of this period in Serbia is the parallel and interweaving existence of different value systems. The very perception of reality has been blurred by the emergence of a very complex technical and ideological structure. Reality is presented by and through extensions and additions, the models of which are language and media. With the development of media technology and global communication and information systems, representation has become the only available reality. This enables the media to overtly and unlimitedly intervene in reality, to manage and change it without constraint and thus have a direct impact on values. The difference between public and private is abolished, so the media start promoting exclusive collective values. However, since the collective thus loses its counterpart, it itself needs to be redefined. This confusion of values makes the possible results of their change uncertain. It will either open up a space for multiculturalism and social pluralism and thus completely replace the old systems of values, or result in an indefinite survival of different, often contradictory, value systems and conceptions of reality, which often lead to all forms of exclusivity and intolerance.
Abstract:
We analyze three sets of doubly-censored cohort data on incubation times, estimating incubation distributions using semi-parametric methods and assessing the comparability of the estimates. Weibull models appear to be inappropriate for at least one of the cohorts, and the estimates for the different cohorts are substantially different. We use these estimates as inputs for backcalculation, using a nonparametric method based on maximum penalized likelihood. The different incubations all produce fits to the reported AIDS counts that are as good as the fit from a nonstationary incubation distribution that models treatment effects, but the estimated infection curves are very different. We also develop a method for estimating nonstationarity as part of the backcalculation procedure and find that such estimates also depend very heavily on the assumed incubation distribution. We conclude that incubation distributions are so uncertain that meaningful error bounds are difficult to place on backcalculated estimates and that backcalculation may be too unreliable to be used without being supplemented by other sources of information on HIV prevalence and incidence.