870 results for hierarchical multidimensional visualization


Relevance:

40.00% 40.00%

Publisher:

Abstract:

Multidimensional compound optimization is a new paradigm in the drug discovery process, yielding efficiencies during early stages and reducing attrition in the later stages of drug development. The success of this strategy relies heavily on understanding this multidimensional data and extracting useful information from it. This paper demonstrates how principled visualization algorithms can be used to understand and explore a large data set created in the early stages of drug discovery. The experiments presented are performed on a real-world data set comprising biological activity data and some whole-molecular physicochemical properties. Data visualization is a popular way of presenting complex data in a simpler form. We have applied powerful principled visualization methods, such as generative topographic mapping (GTM) and hierarchical GTM (HGTM), to help the domain experts (screening scientists, chemists, biologists, etc.) understand the data and draw meaningful decisions. We also benchmark these principled methods against better-known visualization approaches, namely principal component analysis (PCA), Sammon's mapping, and self-organizing maps (SOMs), to demonstrate their enhanced power to help the user visualize the large multidimensional data sets one has to deal with during the early stages of the drug discovery process. The results reported clearly show that the GTM and HGTM algorithms allow the user to cluster active compounds for different targets and understand them better than the benchmarks. An interactive software tool supporting these visualization algorithms was provided to the domain experts. The tool helps the domain experts explore the projections obtained from the visualization algorithms by providing facilities such as parallel coordinate plots, magnification factors, directional curvatures, and integration with industry standard software. © 2006 American Chemical Society.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Hierarchical visualization systems are desirable because a single two-dimensional visualization plot may not be sufficient to capture all of the interesting aspects of complex high-dimensional data sets. We extend an existing locally linear hierarchical visualization system, PhiVis [1], in several directions: (1) we allow for non-linear projection manifolds (the basic building block is the Generative Topographic Mapping, GTM), (2) we introduce a general formulation of hierarchical probabilistic models consisting of local probabilistic models organized in a hierarchical tree, (3) we describe folding patterns of the low-dimensional projection manifold in the high-dimensional data space by computing and visualizing the manifold's local directional curvatures. Quantities such as magnification factors [3] and directional curvatures are helpful for understanding the layout of the nonlinear projection manifold in the data space and for further refinement of the hierarchical visualization plot. Like PhiVis, our system is statistically principled and is built interactively in a top-down fashion using the EM algorithm. We demonstrate the principle of the approach on a complex 12-dimensional data set and mention possible applications in the pharmaceutical industry.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Recently, we have developed the hierarchical Generative Topographic Mapping (HGTM), an interactive method for visualization of large high-dimensional real-valued data sets. In this paper, we propose a more general visualization system by extending HGTM in three ways, which allow the user to visualize a wider range of data sets and better support the model development process. 1) We integrate HGTM with noise models from the exponential family of distributions. The basic building block is the Latent Trait Model (LTM). This enables us to visualize data of an inherently discrete nature, e.g., collections of documents, in a hierarchical manner. 2) We give the user a choice of initializing the child plots of the current plot in either interactive or automatic mode. In the interactive mode, the user selects "regions of interest," whereas in the automatic mode, an unsupervised minimum message length (MML)-inspired construction of a mixture of LTMs is employed. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. 3) We derive general formulas for magnification factors in latent trait models. Magnification factors are a useful tool to improve our understanding of the visualization plots, since they can highlight the boundaries between data clusters. We illustrate our approach on a toy example and evaluate it on three more complex real data sets. © 2005 IEEE.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Recent advances in the control of molecular engineering architectures have allowed unprecedented ability of molecular recognition in biosensing, with a promising impact for clinical diagnosis and environmental control. The availability of large amounts of data from electrical, optical, or electrochemical measurements requires, however, sophisticated data treatment in order to optimize sensing performance. In this study, we show how an information visualization system based on projections, referred to as Projection Explorer (PEx), can be used to achieve high performance for biosensors made with nanostructured films containing immobilized antigens. As a proof of concept, various visualizations were obtained with impedance spectroscopy data from an array of sensors whose electrical response could be specific toward a given antibody (analyte) owing to molecular recognition processes. In addition to discussing the distinct methods for projection and normalization of the data, we demonstrate that an excellent distinction can be made between real samples tested positive for Chagas disease and Leishmaniasis, which could not be achieved with conventional statistical methods. Such high performance probably arose from the possibility of treating the data in the whole frequency range. Through a systematic analysis, it was inferred that Sammon's mapping with standardization to normalize the data gives the best results, where distinction could be made of blood serum samples containing 10⁻⁷ mg/mL of the antibody. The method inherent in PEx and the procedures for analyzing the impedance data are entirely generic and can be extended to optimize any type of sensor or biosensor.
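
The combination named in the abstract above, standardization followed by Sammon's mapping, can be illustrated with a minimal sketch. It is not the PEx implementation: the function names are hypothetical, and the code only standardizes each variable (z-scores) and evaluates Sammon's stress for a candidate low-dimensional layout. An actual mapping would minimize this stress iteratively.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column (variable) of a list of equal-length rows."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # Guard against constant columns, whose standard deviation is zero.
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def pairwise(points):
    """Euclidean distance for every pair i < j."""
    n = len(points)
    return {(i, j): math.dist(points[i], points[j])
            for i in range(n) for j in range(i + 1, n)}


def sammon_stress(high, low):
    """Sammon's stress between original points and their projection.

    E = (1 / sum d*_ij) * sum (d*_ij - d_ij)^2 / d*_ij, over pairs i < j,
    where d* are distances in the original space and d in the projection.
    """
    d_high = pairwise(high)
    d_low = pairwise(low)
    total = sum(d_high.values())
    return sum((d_high[k] - d_low[k]) ** 2 / d_high[k]
               for k in d_high if d_high[k] > 0) / total
```

A stress of zero means the layout reproduces every pairwise distance exactly; larger values indicate more distortion, with errors on small distances penalized most heavily.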

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper analyzes the DNA code of several species from the perspective of information content. For that purpose several concepts and mathematical tools are selected with the aim of establishing a quantitative method without a priori distorting the alphabet represented by the sequence of DNA bases. The synergies of associating Gray code, histogram characterization and multidimensional scaling visualization lead to a collection of plots with a categorical representation of species and chromosomes.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper analyses earthquake data from the perspective of dynamical systems and fractional calculus (FC). This new standpoint uses Multidimensional Scaling (MDS) as a powerful clustering and visualization tool. FC extends the concepts of integrals and derivatives to non-integer and complex orders. MDS is a technique that produces spatial or geometric representations of complex objects, such that objects perceived to be similar in some sense are placed close together on the MDS maps, forming clusters. In this study, over three million seismic occurrences, covering the period from January 1, 1904 up to March 14, 2012, are analysed. The events are characterized by their magnitude and spatiotemporal distributions and are divided into fifty groups, according to the Flinn–Engdahl (F–E) seismic regions of Earth. Several correlation indices are proposed to quantify the similarities among regions. MDS maps prove to be an intuitive and useful visual representation of the complex relationships that are present among seismic events, which may not be perceived on traditional geographic maps. Therefore, MDS constitutes a valid alternative to classic visualization tools for understanding the global behaviour of earthquakes.
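
As a rough illustration of how MDS turns a matrix of dissimilarities into map coordinates, the following sketch implements classical (Torgerson) MDS in plain Python. This is an assumption made for illustration: the abstract does not say which MDS variant or which correlation indices were used, and the power-iteration eigen-solver is only a dependency-free stand-in for a library routine.

```python
import math
import random


def double_center(d):
    """B = -1/2 * J * D^2 * J for a symmetric distance matrix d."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    tot = sum(row) / n
    return [[-0.5 * (sq[i][j] - row[i] - row[j] + tot) for j in range(n)]
            for i in range(n)]


def top_eigenpair(b, iters=500):
    """Dominant eigenpair by power iteration (adequate for small matrices)."""
    n = len(b)
    rng = random.Random(0)  # fixed seed keeps the sketch deterministic
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v


def classical_mds(d, dims=2):
    """Embed n objects in `dims` dimensions from their pairwise distances."""
    b = double_center(d)
    n = len(b)
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        lam, v = top_eigenpair(b)
        if lam <= 1e-12:  # no positive variance left to represent
            break
        scale = math.sqrt(lam)
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate so the next pass finds the following eigenpair.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords
```

For genuinely Euclidean distances the embedding reproduces them exactly, up to rotation and reflection. For general dissimilarities, such as those derived from correlation indices, the map is only an approximation, and this simple solver may be misled by large negative eigenvalues.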

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The paper investigates the risk factors for the severity of orthodontic root resorption. The multidimensional scaling (MDS) visualization method is used to investigate the experimental data from patients who received orthodontic treatment at the Department of Orthodontics and Dentofacial Orthopedics, Faculty of Dentistry, “Carol Davila” University of Medicine and Pharmacy, during a period of 4 years. The clusters emerging in the MDS plots reveal features and properties not easily captured by classical statistical tools. The results support the adoption of MDS for handling dentistry data and overcoming the noise embedded in it. The method introduced in this paper is rapid, efficient, and very useful for studying the risk factors for the severity of orthodontic root resorption.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This study describes the change of the ultraviolet spectral bands as the slit width is varied from 0.1 to 5.0 nm in the spectral range of 200–400 nm. The analysis of the spectral bands is carried out by using the multidimensional scaling (MDS) approach to reach the latent spectral background. This approach indicates that a 0.1 nm slit width gives higher-order noise together with better spectral details. In contrast, a 5.0 nm slit width gives higher peak amplitude and lower-order noise together with poor spectral details. Under these conditions, the main problem is to find the relationship between the spectral band properties and the slit width. To this end, the MDS tool is used to recognize the hidden information of the ultraviolet spectra of sildenafil citrate by using a Shimadzu UV–VIS 2550, which is the best double-monochromator instrument in the world. In this study, the proposed mathematical approach gives rich findings for the efficient use of the spectrophotometer in qualitative and quantitative studies.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper investigates the use of multidimensional scaling in the evaluation of fractional systems. Several algorithms are analysed based on the time response of the closed loop system under the action of a reference step input signal. Two alternative performance indices, based on the time and frequency domains, are tested. The numerical experiments demonstrate the feasibility of the proposed visualization method.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Forest fire dynamics is often characterized by the absence of a characteristic length-scale, long range correlations in space and time, and long memory, which are features also associated with fractional order systems. In this paper a public domain forest fire catalogue, containing information on events in Portugal and covering the period from 1980 up to 2012, is analysed. The events are modelled as time series of Dirac impulses with amplitude proportional to the burnt area. The time series are viewed as the system output and are interpreted as a manifestation of the system dynamics. In the first phase we use the pseudo phase plane (PPP) technique to describe forest fire dynamics. In the second phase we use multidimensional scaling (MDS) visualization tools. The PPP allows the representation of forest fire dynamics in two-dimensional space, by taking time series representative of the phenomena. The MDS approach generates maps where objects that are perceived to be similar to each other are placed close together on the map, forming clusters. The results are analysed in order to extract relationships among the data and to better understand forest fire behaviour.
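
The two modelling steps described above, an impulse time series and a pseudo phase plane, can be sketched as follows. The data layout (day index, burnt area) and the function names are assumptions made for illustration. The PPP is taken here in its usual delay-embedding form, pairing the signal with a lagged copy of itself; the paper may apply further processing that the abstract does not describe.

```python
def impulse_series(events, length):
    """Build a discrete time series with one impulse per fire event.

    `events` is a list of (day_index, burnt_area) pairs (hypothetical layout);
    the amplitude at each day is the total burnt area recorded on that day.
    """
    series = [0.0] * length
    for day, area in events:
        series[day] += area
    return series


def pseudo_phase_plane(series, lag):
    """Pair each sample with the sample `lag` steps later: (x[t], x[t + lag])."""
    return [(series[t], series[t + lag]) for t in range(len(series) - lag)]
```

Plotting the returned pairs as points in the plane gives the two-dimensional trajectory. The choice of lag matters in practice and is often made with a criterion such as the first minimum of the mutual information between the signal and its delayed copy.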

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Scientific article currently available in Early View (Online Version of Record published before inclusion in an issue).

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This study describes the change of the ultraviolet spectral bands as the slit width is varied from 0.1 to 5.0 nm in the spectral range of 200–400 nm. The analysis of the spectral bands is carried out by using the multidimensional scaling (MDS) approach to reach the latent spectral background. This approach indicates that a 0.1 nm slit width gives higher-order noise together with better spectral details. In contrast, a 5.0 nm slit width gives higher peak amplitude and lower-order noise together with poor spectral details. Under these conditions, the main problem is to find the relationship between the spectral band properties and the slit width. To this end, the MDS tool is used to recognize the hidden information of the ultraviolet spectra of sildenafil citrate by using a Shimadzu UV–VIS 2550, which is the best double-monochromator instrument in the world. In this study, the proposed mathematical approach gives rich findings for the efficient use of the spectrophotometer in qualitative and quantitative studies.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper studies forest fires from the perspective of dynamical systems. Burnt area, precipitation and atmospheric temperatures are interpreted as state variables of a complex system and the correlations between them are investigated by means of different mathematical tools. First, we use mutual information to reveal potential relationships in the data. Second, we adopt the state space portrait to characterize the system’s behavior. Third, we compare the annual state space curves and we apply clustering and visualization tools to unveil long-range patterns. We use forest fire data for Portugal, covering the years 1980–2003. The territory is divided into two regions (North and South), characterized by different climates and vegetation. The adopted methodology represents a new viewpoint in the context of forest fires, shedding light on a complex phenomenon that needs to be better understood in order to mitigate its devastating consequences, at both economic and environmental levels.
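
The first step above, using mutual information to reveal relationships between variables, can be illustrated with a simple histogram-based estimator. This is a minimal sketch under stated assumptions: equal-width binning and the bin count are illustrative choices, not the estimator used in the paper, and the function names are hypothetical.

```python
import math
from collections import Counter


def discretize(values, bins):
    """Assign each value to one of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant series fall into bin 0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Estimate I(X; Y) in bits from two equally long samples."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(bx)
    pxy = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # I = sum over occupied cells of p(a,b) * log2( p(a,b) / (p(a) p(b)) ).
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())
```

The estimate is zero when the binned variables are independent and equals the entropy of the binned variable when one series determines the other. With short records, histogram estimators are biased upward, so values should be compared across variable pairs rather than read as absolute quantities.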

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this paper we study several natural and man-made complex phenomena from the perspective of dynamical systems. For each class of phenomena, the system outputs are time-series records obtained in identical conditions. The time-series are viewed as manifestations of the system behavior and are processed for analyzing the system dynamics. First, we use the Fourier transform to process the data and we approximate the amplitude spectra by means of power law (PL) functions. We interpret the PL parameters as a phenomenological signature of the system dynamics. Second, we adopt the techniques of non-hierarchical clustering and multidimensional scaling to visualize hidden relationships between the complex phenomena. Third, we propose a vector field based analogy to interpret the patterns unveiled by the PL parameters.
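
The first step above, approximating an amplitude spectrum by a power law, can be sketched as a straight-line fit in log-log coordinates. The model |X(f)| ≈ a·f^b and the least-squares fit are assumptions made for illustration, since the abstract does not give the exact fitting procedure. The direct DFT keeps the example dependency-free; a library FFT would be used on real records.

```python
import cmath
import math


def amplitude_spectrum(x):
    """Amplitudes |X(k)| for k = 1 .. n//2 - 1 via a direct DFT (O(n^2))."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(1, n // 2)]


def fit_power_law(freqs, amps):
    """Least-squares fit of amps ~ a * freqs**b in log-log coordinates.

    All frequencies and amplitudes must be strictly positive.
    """
    lx = [math.log(f) for f in freqs]
    ly = [math.log(a) for a in amps]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b
```

The pair (a, b) returned for each record plays the role of the power law parameters mentioned in the abstract, and a set of such pairs could then be passed to a clustering or MDS routine.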

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper establishes a general framework for metric scaling of any distance measure between individuals based on a rectangular individuals-by-variables data matrix. The method allows visualization of both individuals and variables while preserving all the good properties of principal axis methods such as principal components and correspondence analysis, which are based on the singular-value decomposition, including the decomposition of variance into components along principal axes, which provides the numerical diagnostics known as contributions. The idea is inspired by the chi-square distance in correspondence analysis, which weights each coordinate by an amount calculated from the margins of the data table. In weighted metric multidimensional scaling (WMDS) we allow these weights to be unknown parameters which are estimated from the data to maximize the fit to the original distances. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing a matrix and displaying its rows and columns in biplots.
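
The chi-square distance that motivates the method can be written out concretely. The sketch below computes it between the row profiles of a small contingency table, with each coordinate weighted by the inverse of the corresponding column mass. It does not implement the weight-estimation step of WMDS, and the function name is hypothetical.

```python
import math


def chi_square_distances(table):
    """Chi-square distances between the row profiles of a contingency table.

    d(i, j)^2 = sum_k (p_ik - p_jk)^2 / c_k, where p_i is row i divided by its
    total and c_k is the mass (relative total) of column k. Rows and columns
    are assumed to have nonzero totals.
    """
    total = sum(sum(row) for row in table)
    col_mass = [sum(col) / total for col in zip(*table)]
    profiles = [[v / sum(row) for v in row] for row in table]
    n = len(profiles)
    return [[math.sqrt(sum((p - q) ** 2 / c
                           for p, q, c in zip(profiles[i], profiles[j], col_mass)))
             for j in range(n)]
            for i in range(n)]
```

Here the weights 1/c_k are fixed by the table margins. In WMDS, as the abstract describes, the analogous weights are instead treated as unknown parameters and estimated so that the weighted distances fit a given set of original distances as closely as possible.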