25 results for High-dimensional data visualization
in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
In Information Visualization, adding and removing data elements can strongly impact the underlying visual space. We have developed an inherently incremental technique (incBoard) that maintains a coherent disposition of elements from a dynamic multidimensional data set on a 2D grid as the set changes. Here, we introduce a novel layout that uses pairwise similarity from grid neighbors, as defined in incBoard, to reposition elements in the visual space, free from constraints imposed by the grid. The board continues to be updated and can be displayed alongside the new space. Because similar items are placed together while dissimilar neighbors are moved apart, the layout supports users in the identification of clusters and subsets of related elements. Densely populated areas identified in the incSpace can be efficiently explored with the corresponding incBoard visualization, which is not susceptible to occlusion. The solution remains inherently incremental and maintains a coherent disposition of elements, even for fully renewed sets. The algorithm considers relative positions for the initial placement of elements, and raw dissimilarity to fine-tune the visualization. It has low computational cost, with complexity depending only on the size of the currently viewed subset, V. Thus, a data set of size N can be sequentially displayed in O(N) time, reaching O(N^2) only if the complete set is simultaneously displayed.
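As a minimal sketch of the incremental idea described above (not the published incBoard/incSpace algorithms), the code below places each arriving element in a free grid cell next to its most similar already-placed element, then applies a grid-free, force-style refinement driven by raw dissimilarities. The function names, the spiral search, and the refinement step size are assumptions made for illustration.

```python
import numpy as np

def place_on_grid(items, dissimilarity, grid_size=32):
    """Incremental grid placement: each new item goes into a free cell as
    close as possible to the already-placed item it is most similar to."""
    grid = {}       # (row, col) -> item
    position = {}   # item -> (row, col)
    for item in items:
        if not position:
            cell = (grid_size // 2, grid_size // 2)   # first item: grid centre
        else:
            nearest = min(position, key=lambda other: dissimilarity(item, other))
            cell = _free_cell_near(position[nearest], grid, grid_size)
        grid[cell] = item
        position[item] = cell
    return position

def _free_cell_near(cell, grid, grid_size):
    """Search outwards from `cell` until an unoccupied grid cell is found."""
    r0, c0 = cell
    for radius in range(grid_size):
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                r, c = r0 + dr, c0 + dc
                if 0 <= r < grid_size and 0 <= c < grid_size and (r, c) not in grid:
                    return (r, c)
    raise RuntimeError("grid is full")

def refine_positions(position, dissimilarity, iterations=10, step=0.05):
    """Grid-free refinement: move each item so its visual distances approach
    the raw dissimilarities (a generic force step, not the paper's update)."""
    coords = {i: np.array(cell, dtype=float) for i, cell in position.items()}
    items = list(coords)
    for _ in range(iterations):
        for i in items:
            for j in items:
                if i == j:
                    continue
                delta = coords[j] - coords[i]
                d_vis = np.linalg.norm(delta) + 1e-9
                coords[i] += step * (d_vis - dissimilarity(i, j)) * delta / d_vis
    return coords
```

Both steps touch only the currently displayed items, which is what keeps the cost proportional to the viewed subset rather than the full data set.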
Abstract:
Visualization of high-dimensional data requires a mapping to a visual space. Whenever the goal is to preserve similarity relations, a frequent strategy is to use 2D projections, which afford intuitive interactive exploration, e.g., by users locating and selecting groups and gradually drilling down to individual objects. In this paper, we propose a framework for projecting high-dimensional data to 3D visual spaces, based on a generalization of the Least-Square Projection (LSP). We compare projections to 2D and 3D visual spaces both quantitatively and through a user study considering certain exploration tasks. The quantitative analysis confirms that 3D projections outperform 2D projections in terms of precision. The user study indicates that certain tasks can be more reliably and confidently answered with 3D projections. Nonetheless, as 3D projections are displayed on 2D screens, interaction is more difficult. Therefore, we incorporate suitable interaction functionalities into a framework that supports 3D transformations, predefined optimal 2D views, coordinated 2D and 3D views, and hierarchical 3D cluster definition and exploration. For visually encoding data clusters in a 3D setup, we employ color coding of projected data points as well as four types of surface renderings. A second user study evaluates the suitability of these visual encodings. Several examples illustrate the framework's applicability to both visual exploration of multidimensional abstract (non-spatial) data and exploration of the feature space of multivariate spatial data.
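A simplified sketch of the least-squares mapping idea generalized to a 3D visual space is given below, under stated assumptions: control points are chosen at random and placed with classical MDS, every other point is constrained to lie near the centroid of its k nearest neighbors, and the resulting system is solved by least squares. The helper names, the uniform neighborhood weights, and the control-point weight are illustrative choices, not the paper's implementation.

```python
import numpy as np

def classical_mds(D, dim=3):
    """Embed a distance matrix D into `dim` dimensions (classical MDS)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:dim]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

def lsp_like_projection(X, dim=3, n_controls=20, k=8, seed=0):
    """LSP-style projection of X (n x m) into `dim` dimensions."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    controls = rng.choice(n, size=min(n_controls, n), replace=False)
    Xc = X[controls]
    Yc = classical_mds(np.linalg.norm(Xc[:, None] - Xc[None, :], axis=-1), dim)

    # One row per point: x_i minus the mean of its k nearest neighbours = 0.
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    A = np.zeros((n + len(controls), n))
    b = np.zeros((n + len(controls), dim))
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]
        A[i, i] = 1.0
        A[i, nbrs] = -1.0 / k
    # Extra rows pinning the control points to their MDS positions.
    w = 10.0
    for row, (ci, yc) in enumerate(zip(controls, Yc), start=n):
        A[row, ci] = w
        b[row] = w * yc
    Y, *_ = np.linalg.lstsq(A, b, rcond=None)
    return Y   # n x dim coordinates in the visual space
```

The same routine yields a 2D layout with `dim=2`, which is one way to produce coordinated 2D and 3D views of the kind mentioned above.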
Abstract:
Most multidimensional projection techniques rely on distance (dissimilarity) information between data instances to embed high-dimensional data into a visual space. When data are endowed with Cartesian coordinates, an extra computational effort is necessary to compute the needed distances, making multidimensional projection prohibitive in applications dealing with interactivity and massive data. The novel multidimensional projection technique proposed in this work, called Part-Linear Multidimensional Projection (PLMP), has been tailored to handle multivariate data represented in Cartesian high-dimensional spaces, requiring only distance information between pairs of representative samples. This characteristic renders PLMP faster than previous methods when processing large data sets while still being competitive in terms of precision. Moreover, knowing the range of variation for data instances in the high-dimensional space, we can make PLMP a truly streaming data projection technique, a trait absent in previous methods.
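A minimal sketch of the part-linear idea follows, assuming the representative samples have already been laid out by any distance-based projection: a linear map from the Cartesian high-dimensional space to the visual space is fitted on those samples by least squares and then applied to every remaining (possibly streaming) instance. The function names are illustrative, not PLMP's API.

```python
import numpy as np

def fit_linear_map(X_samples, Y_samples):
    """Least-squares fit of the m x p matrix Phi minimising
    ||X_samples @ Phi - Y_samples||, given s representative samples (s x m)
    and their visual-space positions (s x p)."""
    Phi, *_ = np.linalg.lstsq(X_samples, Y_samples, rcond=None)
    return Phi

def apply_linear_map(X, Phi):
    """Project arbitrary (or streaming) instances with the fitted map."""
    return X @ Phi

# Hypothetical usage: lay out 50 samples with classical MDS, then push the
# remaining rows through the fixed linear map.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 20))
samples = X[:50]
D = np.linalg.norm(samples[:, None] - samples[None, :], axis=-1)
J = np.eye(50) - np.ones((50, 50)) / 50
vals, vecs = np.linalg.eigh(-0.5 * J @ (D ** 2) @ J)
top = np.argsort(vals)[::-1][:2]
Y_samples = vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

Phi = fit_linear_map(samples, Y_samples)   # 20 x 2 linear map
Y = apply_linear_map(X, Phi)               # all 1000 rows projected at once
```

Because the mapping is a fixed matrix once fitted, new instances can be projected as they arrive, which is what makes this style of approach amenable to streaming data.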
Abstract:
Point placement strategies aim at mapping data points represented in higher dimensions to bi-dimensional spaces and are frequently used to visualize relationships amongst data instances. They have been valuable tools for analysis and exploration of data sets of various kinds. Many conventional techniques, however, do not behave well when the number of dimensions is high, such as in the case of document collections. Later approaches handle that shortcoming, but may cause too much clutter to allow flexible exploration to take place. In this work we present a novel hierarchical point placement technique that is capable of dealing with these problems. While good grouping and separation of data with high similarity are maintained without increasing computational cost, its hierarchical structure lends itself both to exploration at various levels of detail and to handling data in subsets, improving analysis capability and also allowing manipulation of larger data sets.
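The sketch below illustrates a generic hierarchical point placement of the kind described: the data are clustered, the cluster representatives are laid out, and each cluster is then laid out recursively inside a shrunken region around its representative. Plain k-means and PCA are used as stand-ins, and the function names and shrink factor are assumptions; this is not the paper's technique.

```python
import numpy as np

def pca_2d(X):
    """Centre the rows of X and keep the two leading principal directions,
    rescaled to roughly the unit square."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Y = np.zeros((len(X), 2))
    take = min(2, Vt.shape[0])
    Y[:, :take] = Xc @ Vt[:take].T
    return Y / (np.abs(Y).max() + 1e-9)

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means, used here only to build the hierarchy."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

def hierarchical_layout(X, indices=None, radius=1.0, min_size=10, k=4):
    """Recursively place points: leaves are projected directly; inner nodes
    lay out cluster centres, then nest each cluster around its centre."""
    if indices is None:
        indices = np.arange(len(X))
    pos = {}
    if len(indices) <= max(min_size, k):
        for i, p in zip(indices, pca_2d(X[indices])):
            pos[i] = radius * p
        return pos
    labels, centers = kmeans(X[indices], k)
    center_pos = pca_2d(centers)
    for j in range(k):
        members = indices[labels == j]
        if len(members) == 0:
            continue
        if len(members) == len(indices):     # degenerate split: stop recursing
            for i, p in zip(members, pca_2d(X[members])):
                pos[i] = radius * p
            return pos
        sub = hierarchical_layout(X, members, radius=0.35 * radius,
                                  min_size=min_size, k=k)
        for i, p in sub.items():
            pos[i] = radius * center_pos[j] + p
    return pos
```

Each level only projects a small set of representatives or a small subset of the data, which is what lets this style of layout handle larger data sets in manageable pieces.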
Abstract:
We introduce a flexible technique for interactive exploration of vector field data through classification derived from user-specified feature templates. Our method is founded on the observation that, while similar features within the vector field may be spatially disparate, they share similar neighborhood characteristics. Users generate feature-based visualizations by interactively highlighting well-accepted and domain-specific representative feature points. Feature exploration begins with the computation of attributes that describe the neighborhood of each sample within the input vector field. Compilation of these attributes forms a representation of the vector field samples in the attribute space. We project the attribute points onto the canonical 2D plane to enable interactive exploration of the vector field using a painting interface. The projection encodes the similarities between vector field points in the distances computed between their associated attribute points. The proposed method runs at interactive rates for an enhanced user experience and is completely flexible, as showcased by the simultaneous identification of diverse feature types.
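As a rough sketch of that pipeline, the code below computes simple neighborhood attributes for every sample of a 2D vector field and projects the attribute vectors to the plane, so that samples with similar neighborhoods land close together and can be selected with a brush. The particular attributes (mean vector, mean magnitude, directional spread) and the use of PCA in place of the projection used in the paper are assumptions for illustration.

```python
import numpy as np

def neighbourhood_attributes(field, radius=1):
    """Describe each sample of a 2D vector field (h x w x 2) by statistics of
    its neighbourhood: mean vector, mean magnitude, directional spread."""
    h, w, _ = field.shape
    attrs = np.zeros((h, w, 4))
    for y in range(h):
        for x in range(w):
            patch = field[max(0, y - radius):y + radius + 1,
                          max(0, x - radius):x + radius + 1].reshape(-1, 2)
            mags = np.linalg.norm(patch, axis=1) + 1e-12
            dirs = patch / mags[:, None]
            attrs[y, x] = [patch[:, 0].mean(), patch[:, 1].mean(),
                           mags.mean(), dirs.var()]
    return attrs.reshape(-1, 4)

def project_attributes(attrs):
    """Map attribute vectors to 2D (PCA stands in for the projection), so that
    samples with similar neighbourhoods are placed near each other."""
    A = attrs - attrs.mean(axis=0)
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return A @ Vt[:2].T

def brush(points_2d, x_range, y_range):
    """Painting-style selection: indices of samples whose projected points
    fall inside the brushed rectangle."""
    inside = ((points_2d[:, 0] >= x_range[0]) & (points_2d[:, 0] <= x_range[1]) &
              (points_2d[:, 1] >= y_range[0]) & (points_2d[:, 1] <= y_range[1]))
    return np.flatnonzero(inside)
```

Selecting a region in the projected view then maps back to all field samples whose neighborhoods resemble the highlighted template, which is the essence of the feature-template classification described above.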
Abstract:
Background: High-density tiling arrays and new sequencing technologies are generating rapidly increasing volumes of transcriptome and protein-DNA interaction data. Visualization and exploration of this data is critical to understanding the regulatory logic encoded in the genome by which the cell dynamically affects its physiology and interacts with its environment. Results: The Gaggle Genome Browser is a cross-platform desktop program for interactively visualizing high-throughput data in the context of the genome. Important features include dynamic panning and zooming, keyword search and open interoperability through the Gaggle framework. Users may bookmark locations on the genome with descriptive annotations and share these bookmarks with other users. The program handles large sets of user-generated data using an in-process database and leverages the facilities of SQL and the R environment for importing and manipulating data. A key aspect of the Gaggle Genome Browser is interoperability. By connecting to the Gaggle framework, the genome browser joins a suite of interconnected bioinformatics tools for analysis and visualization with connectivity to major public repositories of sequences, interactions and pathways. To this flexible environment for exploring and combining data, the Gaggle Genome Browser adds the ability to visualize diverse types of data in relation to its coordinates on the genome. Conclusions: Genomic coordinates function as a common key by which disparate biological data types can be related to one another. In the Gaggle Genome Browser, heterogeneous data are joined by their location on the genome to create information-rich visualizations yielding insight into genome organization, transcription and its regulation and, ultimately, a better understanding of the mechanisms that enable the cell to dynamically respond to its environment.
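As a toy illustration of genomic coordinates serving as the common join key, the snippet below stores two heterogeneous tracks in an in-process SQLite database and relates them by coordinate overlap. The table and column names are hypothetical and do not reflect the Gaggle Genome Browser's actual schema.

```python
import sqlite3

# Hypothetical tables: a transcriptome signal track and a protein-DNA
# interaction track, both keyed by chromosome and coordinate interval.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE transcript_signal (chrom TEXT, start_pos INTEGER, end_pos INTEGER, value REAL);
CREATE TABLE binding_peaks     (chrom TEXT, start_pos INTEGER, end_pos INTEGER, protein TEXT);
""")
con.executemany("INSERT INTO transcript_signal VALUES (?, ?, ?, ?)",
                [("chr1", 100, 200, 5.2), ("chr1", 300, 400, 1.1)])
con.executemany("INSERT INTO binding_peaks VALUES (?, ?, ?, ?)",
                [("chr1", 150, 180, "TF_A"), ("chr1", 500, 520, "TF_B")])

# Join the two data types wherever their intervals overlap on the genome.
rows = con.execute("""
    SELECT s.chrom, s.start_pos, s.end_pos, s.value, p.protein
    FROM transcript_signal AS s
    JOIN binding_peaks AS p
      ON p.chrom = s.chrom AND p.start_pos < s.end_pos AND p.end_pos > s.start_pos
""").fetchall()
print(rows)   # [('chr1', 100, 200, 5.2, 'TF_A')]
```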
Abstract:
The integration of nanostructured films containing biomolecules and silicon-based technologies is a promising direction for reaching miniaturized biosensors that exhibit high sensitivity and selectivity. A challenge, however, is to avoid cross talk among sensing units in an array with multiple sensors located on a small area. In this letter, we describe an array of 16 sensing units of a light-addressable potentiometric sensor (LAPS), made with layer-by-layer (LbL) films of a poly(amidoamine) dendrimer (PAMAM) and single-walled carbon nanotubes (SWNTs), coated with a layer of the enzyme penicillinase. A visual inspection of the data from constant-current measurements with liquid samples containing distinct concentrations of penicillin, glucose, or a buffer indicated a possible cross talk between units that contained penicillinase and those that did not. With the use of multidimensional data projection techniques, normally employed in information visualization methods, we managed to distinguish the results from the modified LAPS, even in cases where the units were adjacent to each other. Furthermore, the plots generated with the interactive document map (IDMAP) projection technique enabled the distinction of different concentrations of penicillin, from 5 mmol L⁻¹ down to 0.5 mmol L⁻¹. Data visualization also confirmed the enhanced performance of the sensing units containing carbon nanotubes, consistent with the analysis of results for LAPS sensors. The use of visual analytics, as with projection methods, may be essential to handle the large amount of data generated in multiple sensor arrays to achieve high performance in miniaturized systems.
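A small sketch of how such measurements might be projected: each measured liquid sample becomes a vector of responses from the 16 sensing units, and a multidimensional projection maps those vectors to 2D so that different analyte concentrations separate visually. Classical MDS stands in for IDMAP here, and the synthetic readings are purely hypothetical.

```python
import numpy as np

def mds_2d(measurements):
    """Project one row per measured sample (columns = responses of the sensing
    units) to 2D with classical MDS; IDMAP or another multidimensional
    projection technique would be used in the same role."""
    D = np.linalg.norm(measurements[:, None] - measurements[None, :], axis=-1)
    n = len(D)
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:2]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

# Hypothetical data: 3 penicillin concentrations x 5 replicates, 16 units each.
rng = np.random.default_rng(0)
readings = np.vstack([rng.normal(loc=c, scale=0.1, size=(5, 16))
                      for c in (0.5, 2.0, 5.0)])
xy = mds_2d(readings)   # replicates of the same concentration cluster together
```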
Abstract:
The utilization of wood from reforested species by the furniture industry is a recent trend. Thus, the present study determined the specific gravity and shrinkage of wood of 18-year-old Eucalyptus grandis, Eucalyptus dunnii and Eucalyptus urophylla, for use as components in solid wood furniture making. The tests to evaluate the specific gravity and shrinkage of wood in the radial and axial variation of the eucalyptus trees were performed according to NBR 7190/96. The results of the analysis of wood from eucalypt species were subjected to the Homogeneity Test, ANOVA, Tukey and Pearson correlation and compared to the performance of sucupira wood (Bowdichia nitida) and cumaru wood (Dipteryx odorata), often used in the furniture industry. The following results were found: Eucalyptus grandis had a lower shrinkage value, making it more suitable for furniture components that require high dimensional stability, as well as for parts with larger surfaces. The wood of this species showed a rate of dimensional variation compatible with the native species used in the furniture industry. The radial variation of the wood was also verified, and a high correlation between specific gravity and shrinkage was found. Longitudinally, the base of the trunk of the eucalyptus trees was shown to be the region of greatest dimensional stability.
Abstract:
Purpose: The aim of this research was to assess the dimensional accuracy of orbital prostheses based on reversed images generated by computer-aided design/computer-assisted manufacturing (CAD/CAM) using computed tomography (CT) scans. Materials and Methods: CT scans of the faces of 15 adults, men and women older than 25 years of age not bearing any congenital or acquired craniofacial defects, were processed using CAD software to produce 30 reversed three-dimensional models of the orbital region. These models were then processed using the CAM system by means of selective laser sintering to generate surface prototypes of the volunteers` orbital regions. Two moulage impressions of the faces of each volunteer were taken to manufacture 15 pairs of casts. Orbital defects were created on the right or left side of each cast. The surface prototypes were adapted to the casts and then flasked to fabricate silicone prostheses. The establishment of anthropometric landmarks on the orbital region and facial midline allowed for the data collection of 31 linear measurements, used to assess the dimensional accuracy of the orbital prostheses and their location on the face. Results: The comparative analyses of the linear measurements taken from the orbital prostheses and the opposite sides that originated the surface prototypes demonstrated that the orbital prostheses presented similar vertical, transversal, and oblique dimensions, as well as similar depth. There was no transverse or oblique displacement of the prostheses. Conclusion: From a clinical perspective, the small differences observed after analyzing all 31 linear measurements did not indicate facial asymmetry. The dimensional accuracy of the orbital prostheses suggested that the CAD/CAM system assessed herein may be applicable for clinical purposes. Int J Prosthodont 2010;23:271-276.
Abstract:
We present the first results of a study investigating the processes that control concentrations and sources of Pb and particulate matter in the atmosphere of Sao Paulo City, Brazil. Aerosols were collected with high temporal resolution (3 hours) during a four-day period in July 2005. The highest Pb concentrations measured coincided with large fireworks during celebration events and were associated with heavy traffic. Our high-resolution data highlight the impact that a singular transient event can have on air quality, even in a megacity. Under meteorological conditions non-conducive to pollutant dispersion, Pb and particulate matter concentrations accumulated during the night, leading to the highest concentrations in aerosols collected early in the morning of the following day. The stable isotopes of Pb suggest that emissions from traffic remain an important source of Pb in Sao Paulo City due to the large traffic fleet, despite low Pb concentrations in fuels. © 2010 Elsevier B.V. All rights reserved.
Abstract:
Context. Our understanding of the chemical evolution (CE) of the Galactic bulge requires the determination of abundances in large samples of giant stars and planetary nebulae (PNe). Studies based on high-resolution spectroscopy of giant stars in several fields of the Galactic bulge obtained with very large telescopes have allowed important progress. Aims. We discuss PNe abundances in the Galactic bulge and compare these results with those presented in the literature for giant stars. Methods. We present the largest, high-quality data set available for PNe in the direction of the Galactic bulge (inner-disk/bulge). For comparison purposes, we also consider a sample of PNe in the Large Magellanic Cloud (LMC). We derive the element abundances in a consistent way for all the PNe studied. By comparing the abundances for the bulge, inner disk, and LMC, we identify elements that have not been modified during the evolution of the PN progenitor and can be used to trace the bulge chemical enrichment history. We then compare the PN abundances with abundances of bulge field giants. Results. At the metallicity of the bulge, we find that the abundances of O and Ne are close to the values for the interstellar medium at the time of the PN progenitor formation, and hence these elements can be used as tracers of the bulge CE, in the same way as S and Ar, which are not expected to be affected by nucleosynthetic processes during the evolution of the PN progenitors. The PN oxygen abundance distribution is shifted to lower values by 0.3 dex with respect to the distribution given by giants. A similar shift appears to occur for Ne and S. We discuss possible reasons for this PNe-giant discrepancy and conclude that this is probably due to systematic errors in the abundance derivations in either giants or PNe (or both). We issue an important warning concerning the use of absolute abundances in CE studies.
Abstract:
High-precision data of backward-angle elastic and quasielastic scattering for the weakly bound ⁶Li projectile on a ¹⁴⁴Sm target at deep-sub-barrier, near-, and above-barrier energies were measured. From the deep-sub-barrier data, the surface diffuseness of the nuclear interacting potential was studied. Barrier distributions were extracted from the first derivatives of the elastic and quasielastic excitation functions. It is shown that sequential breakup through the first resonant state of ⁶Li is an important channel to be included in coupled-channels calculations, even at deep-sub-barrier energies.
Abstract:
We employ the recently installed near-infrared Multi-Conjugate Adaptive Optics demonstrator (MAD) to determine the basic properties of a newly identified, old and distant, Galactic open cluster (FSR 1415). The MAD facility remarkably approaches the diffraction limit, reaching a resolution of 0.07 arcsec (in K) that is also uniform over a field of ~1.8 arcmin in diameter. The MAD facility provides photometry that is 50 per cent complete at K ~ 19. This corresponds to about 2.5 mag below the cluster main-sequence turn-off. This high-quality data set allows us to derive an accurate heliocentric distance of 8.6 kpc, a metallicity close to solar and an age of ~2.5 Gyr. On the other hand, the depth of the data allows us to reconstruct (completeness-corrected) mass functions (MFs) indicating a relatively massive cluster, with a flat core MF. The Very Large Telescope/MAD capabilities will therefore provide fundamental data for identifying/analysing other faint and distant open clusters in the Galaxy III and IV quadrants.
Abstract:
We present a new climatology of atmospheric aerosols (primarily pyrogenic and biogenic) for the Brazilian tropics on the basis of a high-quality data set of spectral aerosol optical depth and directional sky radiance measurements from Aerosol Robotic Network (AERONET) Cimel Sun-sky radiometers at more than 15 sites distributed across the Amazon basin and adjacent Cerrado region. This network is the only long-term project (with a record including observations from more than 11 years at some locations) ever to have provided ground-based remotely-sensed column aerosol properties for this critical region. Distinctive features of the Amazonian area aerosol are presented by partitioning the region into three aerosol regimes: southern Amazonian forest, Cerrado, and northern Amazonian forest. The monitoring sites generally include measurements from the interval 1999-2006, but some sites have measurement records that date back to the initial days of the AERONET program in 1993. Seasonal time series of aerosol optical depth (AOD), Ångström exponent, and columnar-averaged microphysical properties of the aerosol derived from sky radiance inversion techniques (single-scattering albedo, volume size distribution, fine mode fraction of AOD, etc.) are described and contrasted for the defined regions. During the wet season, occurrences of mineral dust penetrating deep into the interior were observed.
Abstract:
Hemoglobinopathies were included in the Brazilian Neonatal Screening Program on June 6, 2001. Automated high-performance liquid chromatography (HPLC) was indicated as one of the diagnostic methods. The amount of information generated by these systems is immense, and the behavior of groups cannot always be observed in individual analyses. Three-dimensional (3-D) visualization techniques can be applied to extract patterns, trends, or relations from the results stored in databases. We applied a 3-D visualization tool to analyze patterns in the results of neonatal hemoglobinopathy diagnosis by HPLC. The laboratory results of 2520 newborn analyses carried out in 2001 and 2002 were used. The "Fast", "F1", "F" and "A" peaks, which were detected by the analytical system, were chosen as attributes for mapping. To establish a behavior pattern, the results were classified into groups according to hemoglobin phenotype: normal (N = 2169), variant (N = 73) and thalassemia (N = 279). 3-D visualization was performed with the FastMap DB tool; there were two distribution patterns in the normal group, due to variation in the amplitude of the values obtained by HPLC for the F1 window. It allowed separation of the samples with normal Hb from those with alpha thalassemia, based on a significant difference (P < 0.05) between the mean values of the "Fast" and "A" peaks, demonstrating the need for better evaluation of chromatograms; this method could be used to help diagnose alpha thalassemia in newborns.
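For reference, a textbook-style FastMap sketch is shown below: it embeds one peak vector per newborn (for example, the Fast, F1, F and A peak values) into three dimensions by repeatedly choosing two far-apart pivots, projecting every object onto the line through them, and recursing on the residual distances. This is a generic illustration, not the FastMap DB tool used in the study.

```python
import numpy as np

def fastmap(X, dims=3, seed=0):
    """FastMap-style embedding of the rows of X into `dims` dimensions."""
    rng = np.random.default_rng(seed)
    n = len(X)
    D2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)   # squared distances
    Y = np.zeros((n, dims))
    for k in range(dims):
        # Pivot heuristic: start anywhere, jump to the farthest point twice.
        a = int(rng.integers(n))
        b = int(np.argmax(D2[a]))
        a = int(np.argmax(D2[b]))
        dab2 = D2[a, b]
        if dab2 <= 1e-12:            # remaining residual distances are zero
            break
        # Coordinate along the line through the pivots (cosine law).
        Y[:, k] = (D2[a] + dab2 - D2[b]) / (2.0 * np.sqrt(dab2))
        # Residual squared distances in the orthogonal hyperplane.
        D2 = np.maximum(D2 - (Y[:, k][:, None] - Y[:, k][None, :]) ** 2, 0.0)
    return Y
```

Plotting the three returned coordinates, colored by phenotype group, yields the kind of 3-D scatter view described in the abstract.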