952 results for Dynamic data set visualization


Relevance:

100.00%

Publisher:

Abstract:

In Information Visualization, adding and removing data elements can strongly impact the underlying visual space. We have developed an inherently incremental technique (incBoard) that maintains a coherent disposition of elements from a dynamic multidimensional data set on a 2D grid as the set changes. Here, we introduce a novel layout that uses pairwise similarity between grid neighbors, as defined in incBoard, to reposition elements on the visual space, free from the constraints imposed by the grid. The board continues to be updated and can be displayed alongside the new space. As similar items are placed together while dissimilar neighbors are moved apart, the layout supports users in identifying clusters and subsets of related elements. Densely populated areas identified in the incSpace can be efficiently explored with the corresponding incBoard visualization, which is not susceptible to occlusion. The solution remains inherently incremental and maintains a coherent disposition of elements, even for fully renewed sets. The algorithm considers relative positions for the initial placement of elements, and raw dissimilarity to fine-tune the visualization. It has low computational cost, with complexity depending only on the size of the currently viewed subset, V. Thus, a data set of size N can be sequentially displayed in O(N) time, reaching O(N²) only if the complete set is displayed simultaneously.
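
The incremental placement idea described above can be sketched as follows. This is a hypothetical toy version, not the published incBoard/incSpace algorithm: each arriving element is seeded next to its most similar already-placed element, then relaxed so that its distances track dissimilarity. Since only the new element moves, per-insertion cost depends only on the number of currently placed elements.

```python
import math
import random

def incremental_layout(items, similarity, steps=50, lr=0.1):
    """Place items on the plane one at a time (illustrative sketch, not the
    paper's algorithm): each new item starts next to its most similar
    already-placed item, then a few refinement steps pull similar pairs
    together and push dissimilar pairs apart."""
    rng = random.Random(0)
    pos = {}
    for it in items:
        if not pos:
            pos[it] = (0.0, 0.0)
            continue
        # initial placement: jitter around the most similar placed item
        anchor = max(pos, key=lambda p: similarity(it, p))
        ax, ay = pos[anchor]
        pos[it] = (ax + rng.uniform(-0.5, 0.5), ay + rng.uniform(-0.5, 0.5))
        # refinement: only the new item moves, so cost is O(|placed|)
        for _ in range(steps):
            x, y = pos[it]
            fx = fy = 0.0
            for other, (ox, oy) in pos.items():
                if other == it:
                    continue
                dx, dy = ox - x, oy - y
                d = math.hypot(dx, dy) or 1e-9
                target = 1.0 - similarity(it, other)  # similar => close
                f = (d - target) * lr
                fx += f * dx / d
                fy += f * dy / d
            pos[it] = (x + fx, y + fy)
    return pos
```

Inserting all N elements this way costs O(N) per step in the size of the placed set, matching the abstract's observation that sequential display is linear while displaying the complete set at once is quadratic.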

Relevance:

100.00%

Publisher:

Abstract:

Measured process data normally contain inaccuracies because the measurements are obtained using imperfect instruments. As well as random errors, one can expect systematic bias caused by miscalibrated instruments, or outliers caused by process peaks such as sudden power fluctuations. Data reconciliation is the adjustment of a set of process data, based on a model of the process, so that the derived estimates conform to natural laws. In this paper, techniques for the detection and identification of both systematic bias and outliers in dynamic process data are presented. A novel technique for the detection and identification of systematic bias is formulated. The problem of detecting, identifying and eliminating outliers is also treated, using a modified version of a previously available clustering technique. These techniques are then combined into a global dynamic data reconciliation (DDR) strategy. The algorithms presented are tested in isolation and in combination using dynamic simulations of two continuous stirred tank reactors (CSTRs).
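
As a minimal illustration of what data reconciliation means, the classical weighted least-squares adjustment under a linear balance constraint has a closed form. This is a steady-state sketch, a simplification of the paper's dynamic setting:

```python
import numpy as np

def reconcile(y, sigma, A):
    """Weighted least-squares data reconciliation (steady-state sketch, not
    the paper's dynamic DDR algorithm): adjust measurements y so that the
    estimates satisfy the linear balance A @ x = 0, minimizing the
    variance-weighted sum of squared adjustments."""
    Sigma = np.diag(np.asarray(sigma, dtype=float) ** 2)   # measurement covariance
    r = A @ y                                              # balance residual
    # Lagrange-multiplier solution of min (x-y)' Sigma^-1 (x-y)  s.t.  A x = 0
    K = Sigma @ A.T @ np.linalg.inv(A @ Sigma @ A.T)
    return y - K @ r

# Example: two inlet flows and one outlet, mass balance f1 + f2 - f3 = 0
A = np.array([[1.0, 1.0, -1.0]])
y = np.array([10.2, 5.1, 14.8])        # raw measurements (do not balance)
sigma = np.array([0.2, 0.1, 0.3])      # measurement standard deviations
x = reconcile(y, sigma, A)
```

The least precise instruments absorb the largest share of the residual, which is the intended behavior of variance weighting.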

Relevance:

100.00%

Publisher:

Abstract:

Sharing sensor data between multiple devices and users can be challenging for naive users: it requires knowledge of programming and the use of different communication channels and/or development tools, leading to non-uniform solutions. This thesis proposes a system that allows users to access sensors, share sensor data and manage sensors. With this system we intend to manage devices, share sensor data, compare sensor data, and set policies to act based on rules. This thesis presents the design and implementation of the system, as well as three case studies of its use.

Relevance:

100.00%

Publisher:

Abstract:

The taxonomy of the N₂-fixing bacteria belonging to the genus Bradyrhizobium is still poorly refined, mainly due to conflicting results obtained by the analysis of phenotypic and genotypic properties. This paper presents an application of a method aimed at identifying possible new clusters within a Brazilian collection of 119 Bradyrhizobium strains showing phenotypic characteristics of B. japonicum and B. elkanii. Stability was studied as a function of the number of restriction enzymes used in the RFLP-PCR analysis of three ribosomal regions, with three restriction enzymes per region. The method proposed here uses clustering algorithms with distances calculated by average-linkage clustering; the stability analysis is performed by introducing perturbations using sub-sampling techniques. The method proved effective in grouping the species B. japonicum and B. elkanii. Furthermore, two new clusters were clearly defined, indicating possible new species, as well as sub-clusters within each detected cluster. (C) 2008 Elsevier B.V. All rights reserved.
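
The stability-under-sub-sampling idea can be sketched as follows. The clustering, distance and agreement measure here are simplified stand-ins for illustration, not the exact procedure of the paper:

```python
import itertools
import random

def average_linkage(points, dist, k):
    """Plain agglomerative clustering with average linkage: repeatedly merge
    the pair of clusters with the smallest mean pairwise distance."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i, j in itertools.combinations(range(len(clusters)), 2):
            d = sum(dist(a, b) for a in clusters[i] for b in clusters[j])
            d /= len(clusters[i]) * len(clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters

def stability(points, dist, k, trials=20, frac=0.8, seed=0):
    """Sub-sampling stability (simplified sketch of the paper's idea):
    cluster random subsamples and measure how often pairs that share a
    cluster on the full data also share one in each subsample."""
    full = {p: c for c, cl in enumerate(average_linkage(points, dist, k)) for p in cl}
    rng = random.Random(seed)
    agree = total = 0
    for _ in range(trials):
        sub = rng.sample(points, int(frac * len(points)))
        lab = {p: c for c, cl in enumerate(average_linkage(sub, dist, k)) for p in cl}
        for a, b in itertools.combinations(sub, 2):
            total += 1
            agree += (full[a] == full[b]) == (lab[a] == lab[b])
    return agree / total
```

Well-separated groups keep their co-assignments under perturbation (stability near 1); clusters that dissolve under sub-sampling score lower, flagging them as unreliable.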

Relevance:

100.00%

Publisher:

Abstract:

To identify novel cytokine-related genes, we searched the set of 60,770 annotated RIKEN mouse cDNA clones (FANTOM2 clones) using keywords such as cytokine itself or individual cytokine names (such as interferon, interleukin, epidermal growth factor, fibroblast growth factor, and transforming growth factor). This search produced 108 known cytokines and cytokine-related products such as cytokine receptors, cytokine-associated genes, or their products (enhancers, accessory proteins, cytokine-induced genes). We found 15 clusters of FANTOM2 clones that are candidates for novel cytokine-related genes. These encoded products with strong sequence similarity to guanylate-binding protein (GBP-5), interleukin-1 receptor-associated kinase 2 (IRAK-2), interleukin 20 receptor alpha isoform 3, a member of the interferon-inducible proteins of the Ifi 200 cluster, four members of the membrane-associated family 1-8 of interferon-inducible proteins, one p27-like protein, and a hypothetical protein containing a Toll/Interleukin receptor domain. All four clones representing novel candidate gene products from this family contain a novel, highly conserved cross-species domain. Clones similar to growth factor-related products included transforming growth factor beta-inducible early growth response protein 2 (TIEG-2), TGFbeta-induced factor 2, integrin beta-like 1, latent TGF-binding protein 4S, and FGF receptor 4B. We performed a detailed sequence analysis of the candidate novel genes to elucidate their likely functional properties.

Relevance:

100.00%

Publisher:

Abstract:

The majority of common diseases such as cancer, allergy, diabetes, or heart disease are characterized by complex genetic traits, in which genetic and environmental components contribute to disease susceptibility. Our knowledge of the genetic factors underlying most of such diseases is limited. A major goal in the post-genomic era is to identify and characterize disease susceptibility genes and to use this knowledge for disease treatment and prevention. More than 500 genes are conserved across the invertebrate and vertebrate genomes. Because of gene conservation, various organisms including yeast, fruitfly, zebrafish, rat, and mouse have been used as genetic models.

Relevance:

100.00%

Publisher:

Abstract:

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Relevance:

100.00%

Publisher:

Abstract:

In this paper we construct a data set on EU cohesion aid to Spain during the planning period 2000-06. The data are disaggregated by region, year and function and attempt to approximate the timing of actual executed expenditure on assisted projects.

Relevance:

100.00%

Publisher:

Abstract:

Traditionally, compositional data have been identified with closed data, and the simplex has been considered the natural sample space for this kind of data. In our opinion, the emphasis on the constrained nature of compositional data has contributed to masking its real nature. More crucial than the constrained character of compositional data is its scale-invariant property. Indeed, when we consider only a few parts of a full composition we are not working with constrained data, but our data are still compositional. We believe that it is necessary to give a more precise definition of composition. This is the aim of this oral contribution.
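
The closure operation and the scale invariance the authors emphasize can be shown in a few lines (illustrative values only):

```python
def closure(x, total=1.0):
    """Closure operation: rescale a vector of positive parts so that its
    components sum to a constant (e.g. 1 for proportions, 100 for %)."""
    s = sum(x)
    return [total * v / s for v in x]

# Scale invariance: a composition carries only relative information, so
# multiplying every part by a positive constant changes nothing after closure.
raw = [12.0, 30.0, 18.0]          # e.g. grams of three components
scaled = [5.0 * v for v in raw]   # the same sample reported in other units
```

The same property holds for any subcomposition: taking only the first two parts and closing them gives the same proportions whether one starts from `raw` or from `scaled`, which is the point the abstract makes about working with only a few parts of a full composition.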

Relevance:

100.00%

Publisher:

Abstract:

In this paper we construct a data set on EU cohesion aid to Spain during the planning period 1994-99. The data are disaggregated by region, year and function and attempt to approximate the timing of actual executed expenditure on assisted projects.

Relevance:

100.00%

Publisher:

Abstract:

We evaluate conditional predictive densities for U.S. output growth and inflation using a number of commonly used forecasting models that rely on a large number of macroeconomic predictors. More specifically, we evaluate how well conditional predictive densities based on the commonly used normality assumption fit actual realizations out-of-sample. Our focus on predictive densities acknowledges the possibility that, although some predictors can improve or deteriorate point forecasts, they might have the opposite effect on higher moments. We find that normality is rejected for most models in some dimension according to at least one of the tests we use. Interestingly, however, combinations of predictive densities appear to be correctly approximated by a normal density: the simple, equal average when predicting output growth and the Bayesian model average when predicting inflation.
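
A standard tool for checking whether a predictive density "fits actual realizations" is the probability integral transform (PIT). The following is a generic sketch under a normality assumption, not the paper's specific battery of tests:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of the normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def pit_uniformity_stat(realizations, means, sds):
    """PIT check (generic sketch): under a correctly specified normal
    predictive density, the PITs are i.i.d. uniform on [0, 1]; return the
    KS-type maximum gap between their empirical CDF and the uniform CDF."""
    pits = sorted(normal_cdf(y, m, s)
                  for y, m, s in zip(realizations, means, sds))
    n = len(pits)
    return max(max(abs((i + 1) / n - u), abs(u - i / n))
               for i, u in enumerate(pits))
```

A well-calibrated density yields a small statistic; a density whose variance is misspecified piles the PITs near 0 and 1 (or near 0.5) and yields a large one, which is the kind of higher-moment failure the abstract refers to.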

Relevance:

100.00%

Publisher:

Abstract:

This paper estimates a translog stochastic frontier production function for a panel of 150 mixed Catalan farms over the period 1989-1993, in order to measure and explain variation in technical inefficiency scores with a one-stage approach. The model uses gross value added as the aggregate output measure. Total employment, fixed capital, current assets, specific costs and overhead costs are introduced into the model as inputs. Stochastic frontier estimates are compared with those obtained using a linear programming method with a two-stage approach. The translog stochastic frontier specification appears to be an appropriate representation of the data; technical change was rejected and the technical inefficiency effects were statistically significant. The mean technical efficiency in the period analyzed was estimated at 64.0%. Farm inefficiency levels were found to be significantly (at the 5% level) and positively correlated with the number of economic size units.
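
The translog specification named in the abstract has a standard functional form. The sketch below uses made-up coefficients purely to illustrate it, together with the usual mapping from an inefficiency term u to a technical-efficiency score:

```python
import math

def translog_output(inputs, alpha0, alpha, beta):
    """Deterministic part of a translog production frontier (illustrative
    coefficients only, not the paper's estimates):
      ln y = a0 + sum_i a_i ln x_i + 1/2 sum_i sum_j b_ij ln x_i ln x_j
    A stochastic frontier adds symmetric noise v and a one-sided
    inefficiency term u, so observed ln y = frontier + v - u."""
    lx = [math.log(x) for x in inputs]
    ln_y = alpha0 + sum(a * l for a, l in zip(alpha, lx))
    ln_y += 0.5 * sum(beta[i][j] * lx[i] * lx[j]
                      for i in range(len(lx)) for j in range(len(lx)))
    return math.exp(ln_y)

def technical_efficiency(u):
    """Technical efficiency implied by inefficiency u >= 0: TE = exp(-u)."""
    return math.exp(-u)
```

With all second-order terms `beta[i][j]` set to zero the translog collapses to Cobb-Douglas, which is why it is a popular flexible generalization; and a mean technical efficiency of 64% corresponds to a mean inefficiency of roughly u ≈ 0.446.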

Relevance:

100.00%

Publisher:

Abstract:

Digital information generates the possibility of a high degree of redundancy in the data available for fitting the predictive models used in Digital Soil Mapping (DSM). Among these models, the Decision Tree (DT) technique has been increasingly applied due to its capacity to deal with large datasets. The purpose of this study was to evaluate the impact of the data volume used to generate the DT models on the quality of soil maps. An area of 889.33 km² in the Northern region of the State of Rio Grande do Sul was chosen. The soil-landscape relationship was obtained from reambulation of the studied area and the alignment of the units in the 1:50,000 scale topographic mapping. Six predictive covariates linked to the soil-forming factors relief and organisms, together with data sets of 1, 3, 5, 10, 15, 20 and 25 % of the total data volume, were used to generate the predictive DT models in the data mining program Waikato Environment for Knowledge Analysis (WEKA). In this study, sample densities below 5 % resulted in models with lower power to capture the complexity of the spatial distribution of the soil in the study area. The trade-off between the data volume to be handled and the predictive capacity of the models was best for samples between 5 and 15 %. For the models based on these sample densities, the collected field data indicated a predictive mapping accuracy close to 70 %.
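
The study's design question, how model quality varies with the fraction of data used for training, can be mimicked with a toy one-dimensional decision stump (hypothetical data, not the WEKA soil covariates):

```python
import random

def train_stump(train):
    """Fit a 1-level decision tree (stump) on (x, label) pairs: split at the
    midpoint between the two classes' boundary examples."""
    xs0 = [x for x, y in train if not y]
    xs1 = [x for x, y in train if y]
    if not xs0 or not xs1:              # degenerate subsample: one class only
        return float("inf") if not xs1 else float("-inf")
    return (max(xs0) + min(xs1)) / 2.0

def accuracy_by_fraction(data, fractions, seed=0):
    """Train on growing random fractions of the data and score each stump
    on the full set (toy analogue of the study's sample-density sweep)."""
    rng = random.Random(seed)
    out = {}
    for f in fractions:
        train = rng.sample(data, max(1, int(f * len(data))))
        t = train_stump(train)
        out[f] = sum((x >= t) == y for x, y in data) / len(data)
    return out

# Toy data: class 1 iff x >= 50
data = [(float(x), x >= 50) for x in range(100)]
accs = accuracy_by_fraction(data, [0.01, 0.05, 0.25])
```

As in the study, very small training fractions produce degenerate models (here, the 1 % stump sees a single class and falls to chance level), while accuracy saturates well before the full data volume is used.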

Relevance:

100.00%

Publisher: