Biblioteca Digital

Multidimensional compound optimization is a new paradigm in the drug discovery process, yielding efficiencies during early stages and reducing attrition in the later stages of drug development. The success of this strategy relies heavily on understanding this multidimensional data and extracting useful information from it. This paper demonstrates how principled visualization algorithms can be used to understand and explore a large data set created in the early stages of drug discovery. The experiments presented are performed on a real-world data set comprising biological activity data and some whole-molecular physicochemical properties. Data visualization is a popular way of presenting complex data in a simpler form. We have applied powerful principled visualization methods, such as generative topographic mapping (GTM) and hierarchical GTM (HGTM), to help the domain experts (screening scientists, chemists, biologists, etc.) understand and draw meaningful decisions. We also benchmark these principled methods against relatively better known visualization approaches, principal component analysis (PCA), Sammon's mapping, and self-organizing maps (SOMs), to demonstrate their enhanced power to help the user visualize the large multidimensional data sets one has to deal with during the early stages of the drug discovery process. The results reported clearly show that the GTM and HGTM algorithms allow the user to cluster active compounds for different targets and understand them better than the benchmarks. An interactive software tool supporting these visualization algorithms was provided to the domain experts. The tool facilitates the domain experts by exploration of the projection obtained from the visualization algorithms providing facilities such as parallel coordinate plots, magnification factors, directional curvatures, and integration with industry standard software. © 2006 American Chemical Society.

Veja mais

Parallel geostatistics for sparse and dense datasets

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Very large spatially-referenced datasets, for example, those derived from satellite-based sensors which sample across the globe or large monitoring networks of individual sensors, are becoming increasingly common and more widely available for use in environmental decision making. In large or dense sensor networks, huge quantities of data can be collected over small time periods. In many applications the generation of maps, or predictions at specific locations, from the data in (near) real-time is crucial. Geostatistical operations such as interpolation are vital in this map-generation process and in emergency situations, the resulting predictions need to be available almost instantly, so that decision makers can make informed decisions and define risk and evacuation zones. It is also helpful when analysing data in less time critical applications, for example when interacting directly with the data for exploratory analysis, that the algorithms are responsive within a reasonable time frame. Performing geostatistical analysis on such large spatial datasets can present a number of problems, particularly in the case where maximum likelihood. Although the storage requirements only scale linearly with the number of observations in the dataset, the computational complexity in terms of memory and speed, scale quadratically and cubically respectively. Most modern commodity hardware has at least 2 processor cores if not more. Other mechanisms for allowing parallel computation such as Grid based systems are also becoming increasingly commonly available. However, currently there seems to be little interest in exploiting this extra processing power within the context of geostatistics. In this paper we review the existing parallel approaches for geostatistics. By recognising that diffeerent natural parallelisms exist and can be exploited depending on whether the dataset is sparsely or densely sampled with respect to the range of variation, we introduce two contrasting novel implementations of parallel algorithms based on approximating the data likelihood extending the methods of Vecchia [1988] and Tresp [2000]. Using parallel maximum likelihood variogram estimation and parallel prediction algorithms we show that computational time can be significantly reduced. We demonstrate this with both sparsely sampled data and densely sampled data on a variety of architectures ranging from the common dual core processor, found in many modern desktop computers, to large multi-node super computers. To highlight the strengths and weaknesses of the diffeerent methods we employ synthetic data sets and go on to show how the methods allow maximum likelihood based inference on the exhaustive Walker Lake data set.

Veja mais

Transition in homogeneously heated inclined plane parallel shear flows

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The transition of internally heated inclined plane parallel shear flows is examined numerically for the case of finite values of the Prandtl number Pr. We show that as the strength of the homogeneously distributed heat source is increased the basic flow loses stability to two-dimensional perturbations of the transverse roll type in a Hopf bifurcation for the vertical orientation of the fluid layer, whereas perturbations of the longitudinal roll type are most dangerous for a wide range of the value of the angle of inclination. In the case of the horizontal inclination transverse roll and longitudinal roll perturbations share the responsibility for the prime instability. Following the linear stability analysis for the general inclination of the fluid layer our attention is focused on a numerical study of the finite amplitude secondary travelling-wave solutions (TW) that develop from the perturbations of the transverse roll type for the vertical inclination of the fluid layer. The stability of the secondary TW against three-dimensional perturbations is also examined and our study shows that for Pr=0.71 the secondary instability sets in as a quasi-periodic mode, while for Pr=7 it is phase-locked to the secondary TW. The present study complements and extends the recent study by Nagata and Generalis (2002) in the case of vertical inclination for Pr=0.

Veja mais

848 resultados para parallel benchmarks

Filtro por publicador