857 results for Cheminformatics, (Q)SAR, Cross-Validation, Visualization, 3D-Space
Abstract:
Analyzing and modeling relationships between the structure of chemical compounds, their physico-chemical properties, and biological or toxic effects in chemical datasets is a challenging task for scientific researchers in the field of cheminformatics. Therefore, (Q)SAR model validation is essential to ensure future model predictivity on unseen compounds. Proper validation is also required by regulatory authorities before a model can be approved for real-world use as an alternative testing method. At the same time, however, the question of how to validate a (Q)SAR model is still under discussion. In this work, we empirically compare k-fold cross-validation with external test set validation. The introduced workflow allows the built and validated models to be applied to large amounts of unseen data and the performance of the different validation approaches to be compared. Our experimental results indicate that cross-validation produces (Q)SAR models with higher predictivity than external test set validation and reduces the variance of the results. Statistical validation is important for evaluating the performance of (Q)SAR models, but it does not help the user better understand the properties of the model or the underlying correlations. We present the 3D molecular viewer CheS-Mapper (Chemical Space Mapper), which arranges compounds in 3D space such that their spatial proximity reflects their similarity. The user can indirectly determine similarity by selecting which features to employ in the process. The tool can use and calculate different kinds of features, such as structural fragments and quantitative chemical descriptors. Comprehensive functionalities, including clustering, alignment of compounds according to their 3D structure, and feature highlighting, help the chemist to better understand patterns and regularities and to relate the observations to established scientific knowledge. Even though visualization tools for analyzing (Q)SAR information in small-molecule datasets exist, integrated visualization methods that allow for the investigation of model validation results are still lacking. We propose visual validation as an approach for the graphical inspection of (Q)SAR model validation results. New functionalities in CheS-Mapper 2.0 facilitate the analysis of (Q)SAR information and allow the visual validation of (Q)SAR models. The tool enables the comparison of model predictions to the actual activity in feature space. Our approach reveals whether the endpoint is modeled too specifically or too generically and highlights common properties of misclassified compounds. Moreover, the researcher can use CheS-Mapper to inspect how the (Q)SAR model predicts activity cliffs. The CheS-Mapper software is freely available at http://ches-mapper.org.
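A minimal sketch of the two validation schemes compared above, assuming scikit-learn and a synthetic descriptor matrix; the random-forest learner, descriptors, and activity labels are placeholders, not the (Q)SAR models built in this work:

```python
# Illustrative sketch (not the exact workflow above): comparing 10-fold
# cross-validation with a single external test-set split on a hypothetical
# descriptor matrix X and activity vector y.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))                          # placeholder chemical descriptors
y = (X[:, 0] + rng.normal(size=500) > 0).astype(int)    # placeholder activity labels

model = RandomForestClassifier(n_estimators=200, random_state=0)

# k-fold cross-validation: every compound is used for testing exactly once.
cv_scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
print(f"10-fold CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")

# External test-set validation: a single random hold-out split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.33, random_state=0)
ext_score = model.fit(X_tr, y_tr).score(X_te, y_te)
print(f"External test set accuracy: {ext_score:.3f}")
```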
Abstract:
Virtual environments can provide, through digital games and online social interfaces, extremely exciting forms of interactive entertainment. Because of their capability to display and manipulate information in natural and intuitive ways, such environments have found extensive applications in decision support, education and training in the health and science domains, amongst others. Currently, the burden of validating both the interactive functionality and the visual consistency of virtual environment content falls entirely on developers and play-testers. While considerable research has been conducted into assisting the design of virtual world content and mechanics, to date only limited contributions have been made regarding the automatic testing of the underpinning graphics software and hardware. The aim of this thesis is to determine whether the correctness of the images generated by a virtual environment can be quantitatively defined, and automatically measured, in order to facilitate the validation of the content. In an attempt to provide an environment-independent definition of visual consistency, a number of classification approaches were developed. First, a novel model-based object description was proposed in order to enable reasoning about the color and geometry changes of virtual entities during a play session. From such an analysis, two view-based connectionist approaches were developed to map from geometry and color spaces to a single, environment-independent, geometric transformation space; this mapping was used to predict the correct visualization of the scene. Finally, an appearance-based aliasing detector was developed to show how incorrectness, too, can be quantified for debugging purposes. Since computer games rely heavily on highly complex and interactive virtual worlds, they provide an excellent test bed against which to develop, calibrate and validate our techniques. Experiments were conducted on a game engine and other virtual world prototypes to determine the applicability and effectiveness of our algorithms. The results show that quantifying visual correctness in virtual scenes is a feasible enterprise, and that effective automatic bug detection can be performed through the techniques we have developed. We expect these techniques to find application in large 3D games and virtual world studios that require a scalable solution for testing their virtual world software and digital content.
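One simple way visual consistency could be quantified is a pixel-wise comparison of a rendered frame against a reference frame; the sketch below is an illustrative stand-in (an RMSE metric, synthetic frames, and an arbitrary threshold), not the classifiers or aliasing detector developed in the thesis:

```python
# Minimal sketch of one way to quantify visual (in)consistency between a
# rendered frame and a reference frame; metric and threshold are assumptions.
import numpy as np

def frame_rmse(rendered: np.ndarray, reference: np.ndarray) -> float:
    """Root-mean-square pixel error between two RGB frames in [0, 255]."""
    diff = rendered.astype(np.float64) - reference.astype(np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

def is_visually_consistent(rendered, reference, threshold=2.0):
    """Flag a frame as inconsistent when its RMSE exceeds a tuned threshold."""
    return frame_rmse(rendered, reference) <= threshold

# Example with synthetic frames: a clean copy passes, a corrupted one fails.
ref = np.random.default_rng(1).integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
noisy = ref.copy()
noisy[100:120, 200:240] = 0   # simulate a localised rendering defect
print(is_visually_consistent(ref, ref), is_visually_consistent(noisy, ref))
```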
Abstract:
The validation of Computed Tomography (CT) based 3D models is an integral part of studies involving 3D models of bones. This is of particular importance when such models are used for Finite Element studies. The validation of 3D models typically involves the generation of a reference model representing the bone's outer surface. Several different devices have been utilised for digitising a bone's outer surface, such as mechanical 3D digitising arms, mechanical 3D contact scanners, electro-magnetic tracking devices and 3D laser scanners. However, none of these devices is capable of digitising a bone's internal surfaces, such as the medullary canal of a long bone. Therefore, this study investigated the use of a 3D contact scanner, in conjunction with a microCT scanner, for generating a reference standard for validating the internal and external surfaces of a CT-based 3D model of an ovine femur. One fresh ovine limb was scanned using a clinical CT scanner (Philips Brilliance 64) with a pixel size of 0.4 mm² and a slice spacing of 0.5 mm. The limb was then dissected to obtain the soft-tissue-free bone, with care taken to protect the bone's surface. A desktop mechanical 3D contact scanner (Roland DG Corporation, MDX-20, Japan) was used to digitise the surface of the denuded bone at a resolution of 0.3 × 0.3 × 0.025 mm. The digitised surfaces were reconstructed into a 3D model using reverse engineering techniques in Rapidform (INUS Technology, Korea). After digitisation, the distal and proximal parts of the bone were removed so that the shaft could be scanned with a microCT scanner (µCT40, Scanco Medical, Switzerland). The shaft, with the bone marrow removed, was immersed in water and scanned with a voxel size of 0.03 mm³. The bone contours were extracted from the image data using the Canny edge filter in Matlab (The MathWorks). The extracted bone contours were reconstructed into 3D models using Amira 5.1 (Visage Imaging, Germany). The 3D models of the bone's outer surface reconstructed from CT and microCT data were compared against the 3D model generated using the contact scanner. The 3D model of the inner canal reconstructed from the microCT data was compared against the 3D models reconstructed from the clinical CT scanner data. The disparity between the surface geometries of two models was calculated in Rapidform and recorded as the average distance with its standard deviation. The comparison of the 3D model of the whole bone generated from the clinical CT data with the reference model yielded a mean error of 0.19 ± 0.16 mm, while the shaft was more accurate (0.08 ± 0.06 mm) than the proximal (0.26 ± 0.18 mm) and distal (0.22 ± 0.16 mm) parts. The comparison between the outer 3D model generated from the microCT data and the contact scanner model yielded a mean error of 0.10 ± 0.03 mm, indicating that the microCT-generated models are sufficiently accurate for validating 3D models generated from other methods. The comparison of the inner models generated from microCT data with that of the clinical CT data yielded an error of 0.09 ± 0.07 mm. Utilising a mechanical contact scanner in conjunction with a microCT scanner thus enabled validation of both the outer surface of a CT-based 3D model of an ovine femur and the surface of the model's medullary canal.
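The surface comparison itself was performed in Rapidform; for illustration only, a nearest-neighbour distance between two vertex clouds can be sketched as follows, where the vertex arrays and variable names are hypothetical:

```python
# Sketch of the kind of surface comparison reported above: the mean and
# standard deviation of point-to-nearest-point distances between a test
# surface and a reference surface, both given as (N, 3) vertex arrays.
import numpy as np
from scipy.spatial import cKDTree

def surface_deviation(test_vertices: np.ndarray, reference_vertices: np.ndarray):
    """Return (mean, std) of distances from each test vertex to the closest
    reference vertex, in the same units as the input coordinates (e.g. mm)."""
    tree = cKDTree(reference_vertices)
    distances, _ = tree.query(test_vertices)
    return distances.mean(), distances.std()

# Hypothetical usage with vertices exported from the reconstructed models:
# mean_mm, std_mm = surface_deviation(ct_model_vertices, contact_scan_vertices)
# print(f"CT model vs. reference: {mean_mm:.2f} +/- {std_mm:.2f} mm")
```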
Abstract:
This paper addresses the problem of determining an optimal (shortest) path in three-dimensional space for a constant-speed, turn-rate-constrained aerial vehicle that would enable the vehicle to converge to a rectilinear path, starting from any arbitrary initial position and orientation. Based on 3D geometry, we propose both an optimal and a suboptimal path planning approach. Unlike existing numerical methods, which are computationally intensive, the optimal geometrical method generates an optimal solution in less time. The suboptimal approach is comparatively more efficient and gives a solution that is very close to the optimal one. Due to its simplicity and low computational requirements, this approach can be implemented on an aerial vehicle with a constrained turn radius to reach a straight line with a prescribed orientation, as required in several applications. However, if the vertical distance between the initial point and the straight line to be followed is large, the generated path may not be flyable for an aerial vehicle with a limited range of flight path angles, and we resort to a numerical method to obtain the optimal solution. The numerical method used here for simulation is based on multiple shooting and is found to be comparatively more efficient than other methods for solving such two-point boundary value problems.
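For illustration only, a simplified 2D analogue of the geometric idea is the classical Dubins right-straight-right construction sketched below; the poses, turn radius, and the restriction to a single 2D path type are assumptions and do not reproduce the paper's 3D method:

```python
# Simplified 2D illustration (a classical Dubins right-straight-right path),
# not the 3D construction proposed in the paper: length of a turn-constrained
# path between two poses (x, y, heading) with minimum turn radius r.
import math

def dubins_rsr_length(start, goal, r):
    x0, y0, h0 = start
    x1, y1, h1 = goal
    # Centres of the right-turn circles at the start and goal poses.
    c0 = (x0 + r * math.sin(h0), y0 - r * math.cos(h0))
    c1 = (x1 + r * math.sin(h1), y1 - r * math.cos(h1))
    dx, dy = c1[0] - c0[0], c1[1] - c0[1]
    straight = math.hypot(dx, dy)          # length of the common tangent segment
    theta = math.atan2(dy, dx)             # heading along the straight segment
    arc0 = (h0 - theta) % (2 * math.pi)    # clockwise sweep on the first circle
    arc1 = (theta - h1) % (2 * math.pi)    # clockwise sweep on the second circle
    return r * (arc0 + arc1) + straight

# Example: reach a rectilinear path along the x-axis (heading 0) at the origin.
print(dubins_rsr_length((0.0, 50.0, math.pi / 2), (0.0, 0.0, 0.0), r=10.0))
```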
Abstract:
This paper deals with an experimental study of pressure-swirl hydraulic injector nozzles using non-intrusive optical techniques. Experiments were conducted to study atomization characteristics using two nozzles with different orifice diameters, 0.3 mm and 0.5 mm, and injection pressures of 0.3-3.5 MPa, which correspond to Reynolds numbers (Re_p) of 7,000-45,000, depending on the nozzle utilized. Three laser diagnostic techniques were utilized: shadowgraphy, PIV (Particle Image Velocimetry), and PDPA (Phase Doppler Particle Anemometry). Measurements made in the spray in both the axial and radial directions indicate that the velocity and average droplet diameter profiles, as well as the spray dynamics, are highly dependent on the nozzle characteristics and injection pressure. Limitations of these techniques in the different flow regimes, related to primary and secondary breakup as well as coalescence, are provided. Results indicate that all three techniques provide similar results throughout the different regimes. Shadowgraph and PDPA measurements were possible in the secondary atomization and coalescence regimes, while PIV measurements could be made only at the end of secondary atomization and coalescence.
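As a rough, purely illustrative check of the quoted operating range, a nozzle Reynolds number can be sketched from the orifice diameter and injection pressure via a Bernoulli-type velocity estimate; the water-like fluid properties and the discharge coefficient below are assumptions:

```python
# Back-of-the-envelope sketch of a nozzle Reynolds number Re_p, assuming a
# Bernoulli-type injection velocity U = Cd * sqrt(2*dP/rho); fluid properties
# and the discharge coefficient are illustrative assumptions.
import math

def nozzle_reynolds(d_orifice_m, dp_pa, rho=1000.0, mu=1.0e-3, cd=0.7):
    u = cd * math.sqrt(2.0 * dp_pa / rho)    # injection velocity estimate, m/s
    return rho * u * d_orifice_m / mu        # Re = rho * U * d / mu

# 0.3 mm orifice at 0.3 MPa and 0.5 mm orifice at 3.5 MPa:
print(f"{nozzle_reynolds(0.3e-3, 0.3e6):.0f}")
print(f"{nozzle_reynolds(0.5e-3, 3.5e6):.0f}")
```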
Abstract:
The standard, ad hoc stopping criteria used in decision tree-based context clustering are known to be sub-optimal and require parameters to be tuned. This paper proposes a new approach to decision tree-based context clustering based on cross-validation and hierarchical priors. Combining cross-validation and hierarchical priors within decision tree-based context clustering offers better model selection and more robust parameter estimation than conventional approaches, with no tuning parameters. Experimental results on HMM-based speech synthesis show that the proposed approach achieved significant improvements in the naturalness of synthesized speech over the conventional approaches. © 2011 IEEE.
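The flavour of cross-validated splitting can be sketched with a toy example: a node is split by a question only if the held-out log-likelihood improves. The 1-D Gaussian node model, fold count, and data below are illustrative stand-ins for the HMM state distributions, not the paper's implementation:

```python
# Toy sketch of cross-validated decision-tree splitting: split a node by a
# binary question only if the k-fold held-out log-likelihood improves.
import numpy as np

def fit_gaussian(x):
    return x.mean(), x.var() + 1e-8

def loglik(x, mu, var):
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def cv_split_gain(data, answers, k=5, seed=0):
    """Held-out log-likelihood gain of splitting `data` by the boolean
    `answers` vector, accumulated over k folds.  Positive gain => split."""
    idx = np.random.default_rng(seed).permutation(len(data))
    gain = 0.0
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        held, held_ans = data[fold], answers[fold]
        # Parent model: one Gaussian fitted on the training part.
        mu_p, var_p = fit_gaussian(data[train])
        parent_ll = loglik(held, mu_p, var_p)
        # Child models: one Gaussian per answer, fitted on the training part.
        mu_y, var_y = fit_gaussian(data[train][answers[train]])
        mu_n, var_n = fit_gaussian(data[train][~answers[train]])
        child_ll = loglik(held[held_ans], mu_y, var_y) + loglik(held[~held_ans], mu_n, var_n)
        gain += child_ll - parent_ll
    return gain

rng = np.random.default_rng(1)
answers = rng.random(400) < 0.5
data = np.where(answers, rng.normal(1.0, 1.0, 400), rng.normal(-1.0, 1.0, 400))
print("split" if cv_split_gain(data, answers) > 0 else "do not split")
```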
Abstract:
In this letter, a new wind-vector algorithm is presented that uses radar backscatter (σ0) measurements at two adjacent subscenes of RADARSAT-1 synthetic aperture radar (SAR) images, with each subscene having a slightly different geometry. The resulting wind vectors are validated using in situ buoy measurements and compared with wind vectors determined from a hybrid wind-retrieval model using wind directions determined by spectral analysis of wind-induced image streaks and observed by collocated QuikSCAT measurements. The hybrid wind-retrieval model consists of CMOD-IFR2 [applicable to C-band vertical-vertical (VV) polarization] and a C-band copolarization ratio according to Kirchhoff scattering. The new algorithm displays improved skill in wind-vector estimation for RADARSAT-1 SAR data when compared to conventional wind-retrieval methodology. In addition, unlike conventional methods, the present method is applicable to RADARSAT-1 images both with and without visible streaks. However, this method requires ancillary data, such as buoy measurements, to resolve the ambiguity in the retrieved wind direction.
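Schematically, a dual-geometry retrieval can be sketched as a least-squares search over wind speed and direction against a geophysical model function; the `gmf` used below is a crude placeholder and not CMOD-IFR2, and the σ0 values and geometries are hypothetical:

```python
# Schematic sketch of dual-geometry wind retrieval: search over wind speed
# and direction for the pair that best reproduces sigma0 observed at two
# adjacent subscenes with different viewing geometries.
import numpy as np

def gmf(speed, rel_dir_rad, incidence_rad):
    """Crude placeholder C-band VV model function (illustrative only, NOT CMOD-IFR2)."""
    b0 = 0.01 * speed ** 1.5 * np.cos(incidence_rad) ** 2
    return b0 * (1.0 + 0.4 * np.cos(rel_dir_rad) + 0.3 * np.cos(2.0 * rel_dir_rad))

def retrieve_wind(sigma0_obs, look_dirs_rad, incidences_rad):
    """Brute-force least-squares fit of (speed, direction) to the two sigma0 values."""
    best, best_cost = None, np.inf
    for u in np.arange(1.0, 25.0, 0.25):
        for phi in np.radians(np.arange(0.0, 360.0, 2.0)):
            model = [gmf(u, phi - look, inc)
                     for look, inc in zip(look_dirs_rad, incidences_rad)]
            cost = sum((m - o) ** 2 for m, o in zip(model, sigma0_obs))
            if cost < best_cost:
                best, best_cost = (u, np.degrees(phi)), cost
    return best   # direction ambiguities remain and need ancillary data (e.g. buoys)

# Hypothetical sigma0 values from two adjacent subscenes with slightly different geometry:
speed, direction = retrieve_wind([0.12, 0.10],
                                 look_dirs_rad=np.radians([30.0, 33.0]),
                                 incidences_rad=np.radians([35.0, 36.0]))
print(f"wind speed ~ {speed:.1f} m/s, direction ~ {direction:.0f} deg")
```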
Abstract:
The validation of variable-density flow models simulating seawater intrusion in coastal aquifers requires information about the concentration distribution in groundwater. Electrical resistivity tomography (ERT) provides relevant data for this purpose. However, inverse modeling alone is not accurate because of the non-uniqueness of its solutions. Such difficulties in evaluating seawater intrusion can be overcome by coupling geophysical data and groundwater modeling. First, the resistivity distribution obtained by inverse geo-electrical modeling is established. Second, a 3-D variable-density flow hydrogeological model is developed. Third, using Archie's Law, the electrical resistivity model deduced from the simulated salt concentrations is compared to the previously interpreted electrical model. Finally, aside from this usual comparison-validation, the theoretical geophysical response of the concentrations simulated with the groundwater model can be compared to field-measured resistivity data. This constitutes a cross-validation of both the inverse geo-electrical model and the groundwater model.
[Comte, J.-C., and O. Banton (2007), Cross-validation of geo-electrical and hydrogeological models to evaluate seawater intrusion in coastal aquifers, Geophys. Res. Lett., 34, L10402, doi:10.1029/2007GL029981.]
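The Archie's-Law step above can be sketched as follows; the porosity, the Archie exponents, and the linear salinity-to-conductivity mapping are illustrative assumptions, not the calibrated values of the study:

```python
# Sketch of the Archie's-Law step: converting simulated salt concentrations
# into bulk electrical resistivity for comparison with the ERT model.
import numpy as np

def fluid_resistivity_ohm_m(concentration_g_per_l, conductivity_per_g_l=0.18):
    """Very rough linear salinity-to-conductivity mapping (S/m per g/L), assumed."""
    return 1.0 / (concentration_g_per_l * conductivity_per_g_l + 1e-9)

def bulk_resistivity_archie(rho_w, porosity=0.30, a=1.0, m=2.0, s_w=1.0, n=2.0):
    """Archie's Law: rho_bulk = a * rho_w * porosity**(-m) * S_w**(-n)."""
    return a * rho_w * porosity ** (-m) * s_w ** (-n)

# Hypothetical concentrations (g/L) from the variable-density flow model:
conc = np.array([0.5, 5.0, 20.0, 35.0])
rho_bulk = bulk_resistivity_archie(fluid_resistivity_ohm_m(conc))
print(np.round(rho_bulk, 1))   # resistivity decreases toward seawater salinity
```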
Abstract:
The paper addresses the choice of bandwidth in the semiparametric estimation of the long memory parameter of a univariate time series process. The focus is on the properties of forecasts from the long memory model. A variety of cross-validation methods based on out-of-sample forecasting properties are proposed. These procedures are used for the choice of bandwidth and subsequent model selection. Simulation evidence is presented that demonstrates the advantage of the proposed new methodology.
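One purely illustrative instance of the idea: choose the bandwidth of a Geweke-Porter-Hudak (GPH) log-periodogram estimator by its rolling one-step out-of-sample forecast error under a truncated ARFIMA(0, d, 0) predictor. The GPH/ARFIMA choices, simulated series, and candidate bandwidths below are assumptions, not the estimators examined in the paper:

```python
# Sketch of bandwidth choice by out-of-sample forecasting: estimate d with a
# GPH regression at bandwidth m, forecast one step ahead, keep the m with the
# smallest forecast MSE.
import numpy as np

def gph_estimate(x, m):
    """Geweke-Porter-Hudak log-periodogram regression using the first m frequencies."""
    n = len(x)
    freqs = 2.0 * np.pi * np.arange(1, m + 1) / n
    periodogram = np.abs(np.fft.fft(x - x.mean())[1:m + 1]) ** 2 / (2.0 * np.pi * n)
    regressor = np.log(4.0 * np.sin(freqs / 2.0) ** 2)
    slope = np.polyfit(regressor, np.log(periodogram), 1)[0]
    return -slope   # estimate of the long memory parameter d

def arfima_forecast(history, d, max_lag=200):
    """One-step forecast from the truncated AR(infinity) form of (1-B)^d x_t = e_t."""
    pi = np.ones(max_lag + 1)
    for k in range(1, max_lag + 1):
        pi[k] = pi[k - 1] * (k - 1 - d) / k
    lags = history[::-1][:max_lag]              # x_{t-1}, x_{t-2}, ...
    return -np.dot(pi[1:len(lags) + 1], lags)

def cv_bandwidth(x, bandwidths, holdout=100):
    """Pick the GPH bandwidth with the lowest rolling one-step forecast MSE."""
    errors = {}
    for m in bandwidths:
        sq = []
        for t in range(len(x) - holdout, len(x)):
            d = gph_estimate(x[:t], m)
            sq.append((x[t] - arfima_forecast(x[:t], d)) ** 2)
        errors[m] = np.mean(sq)
    return min(errors, key=errors.get), errors

# Illustrative use on a simulated ARFIMA(0, 0.3, 0) series.
rng = np.random.default_rng(0)
e = rng.normal(size=1200)
psi = np.ones(len(e))
for k in range(1, len(e)):
    psi[k] = psi[k - 1] * (k - 1 + 0.3) / k     # MA(infinity) weights of (1-B)^(-0.3)
x = np.array([np.dot(psi[:t + 1], e[t::-1]) for t in range(len(e))])
best_m, mse = cv_bandwidth(x, bandwidths=[20, 40, 80, 160])
print("selected bandwidth:", best_m)
```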
Abstract:
Typically, algorithms for generating stereo disparity maps have been developed to minimise an energy function defined on a single image. This paper proposes a method for implementing cross-validation within a belief propagation optimisation. When tested on the Middlebury online stereo evaluation, the cross-validation improves upon the results of standard belief propagation. Furthermore, it is shown that regions of homogeneous colour within the images can be used to enforce the so-called "Segment Constraint". Building on this, Segment Support is introduced to boost belief between pixels of the same image region and to improve propagation into textureless regions.
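A common form of disparity cross-checking (not necessarily the paper's exact formulation) is the left-right consistency test sketched below, which flags pixels whose left-image disparity is not confirmed by the right-image disparity map:

```python
# Sketch of a left-right consistency check on a pair of disparity maps.
import numpy as np

def cross_check(disp_left, disp_right, max_diff=1):
    """Return a boolean mask of left-image pixels whose disparity is confirmed
    by the right-image disparity map within `max_diff` pixels."""
    h, w = disp_left.shape
    xs = np.arange(w)[None, :].repeat(h, axis=0)
    ys = np.arange(h)[:, None].repeat(w, axis=1)
    x_right = np.clip(xs - disp_left.astype(int), 0, w - 1)
    return np.abs(disp_left - disp_right[ys, x_right]) <= max_diff

# Tiny example: two perfectly consistent constant-disparity maps.
dl = np.full((4, 6), 2)
dr = np.full((4, 6), 2)
print(cross_check(dl, dr).all())   # True

# Invalidated pixels (occlusions, mismatches) can then be re-estimated, e.g.
# by propagating beliefs from consistent neighbours in the same colour segment.
```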
Abstract:
We present a cross-validation of remote sensing measurements of methane profiles in the Canadian high Arctic. Accurate and precise measurements of methane are essential to understand quantitatively its role in the climate system and in global change. Here, we show a cross-validation between three datasets: two from spaceborne instruments and one from a ground-based instrument. All are Fourier Transform Spectrometers (FTSs). We consider the Canadian SCISAT Atmospheric Chemistry Experiment (ACE)-FTS, a solar occultation infrared spectrometer operating since 2004, and the thermal infrared band of the Japanese Greenhouse Gases Observing Satellite (GOSAT) Thermal And Near infrared Sensor for carbon Observation (TANSO)-FTS, a nadir/off-nadir scanning FTS instrument operating at solar and terrestrial infrared wavelengths since 2009. The ground-based instrument is a Bruker 125HR Fourier Transform Infrared (FTIR) spectrometer, measuring mid-infrared solar absorption spectra at the Polar Environment Atmospheric Research Laboratory (PEARL) Ridge Lab at Eureka, Nunavut (80° N, 86° W) since 2006. For each pair of instruments, measurements are collocated within 500 km and 24 h. An additional criterion based on potential vorticity values was found not to significantly affect the differences between measurements. Profiles are regridded to a common vertical grid for each comparison set. To account for differing vertical resolutions, ACE-FTS measurements are smoothed to the resolution of either PEARL-FTS or TANSO-FTS, and PEARL-FTS measurements are smoothed to the TANSO-FTS resolution. Differences for each pair are examined in terms of profiles and partial columns. During the period considered, the number of collocations for each pair is large enough to obtain a good sample size (from several hundred to tens of thousands, depending on the pair and configuration). Considering full profiles, the degrees of freedom for signal (DOFS) are between 0.2 and 0.7 for TANSO-FTS and between 1.5 and 3 for PEARL-FTS, while ACE-FTS has considerably more information (roughly one degree of freedom per altitude level). We take partial columns between roughly 5 and 30 km for the ACE-FTS–PEARL-FTS comparison, and between 5 and 10 km for the other pairs. The DOFS for the partial columns are between 1.2 and 2 for PEARL-FTS collocated with ACE-FTS, and between 0.1 and 0.5 for PEARL-FTS collocated with TANSO-FTS or for TANSO-FTS collocated with either other instrument, while ACE-FTS has much higher information content. For all pairs, the partial column differences are within ±3 × 10^22 molecules cm−2. Expressed as median ± median absolute deviation (in absolute and relative terms), these differences are 0.11 ± 9.60 × 10^20 molecules cm−2 (0.012 ± 1.018 %) for TANSO-FTS–PEARL-FTS, −2.6 ± 2.6 × 10^21 molecules cm−2 (−1.6 ± 1.6 %) for ACE-FTS–PEARL-FTS, and 7.4 ± 6.0 × 10^20 molecules cm−2 (0.78 ± 0.64 %) for TANSO-FTS–ACE-FTS. The differences for the ACE-FTS–PEARL-FTS and TANSO-FTS–PEARL-FTS partial columns decrease significantly as a function of the PEARL-FTS partial columns, whereas the range of partial column values for the TANSO-FTS–ACE-FTS collocations is too small to draw any conclusion on their dependence on the ACE-FTS partial columns.
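The collocation criteria can be sketched directly; the record layout and the example coordinates below are illustrative assumptions:

```python
# Sketch of the collocation criteria described above: two measurements are
# paired when they lie within 500 km and 24 h of each other.
import math
from datetime import datetime, timedelta

def haversine_km(lat1, lon1, lat2, lon2, r_earth_km=6371.0):
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r_earth_km * math.asin(math.sqrt(a))

def collocated(m1, m2, max_km=500.0, max_dt=timedelta(hours=24)):
    """m1, m2: (lat, lon, datetime) tuples for two methane measurements."""
    close_in_space = haversine_km(m1[0], m1[1], m2[0], m2[1]) <= max_km
    close_in_time = abs(m1[2] - m2[2]) <= max_dt
    return close_in_space and close_in_time

# Example: a PEARL measurement at Eureka vs. a hypothetical satellite overpass.
pearl = (80.05, -86.42, datetime(2010, 3, 1, 18, 0))
sat = (77.5, -90.0, datetime(2010, 3, 2, 6, 0))
print(collocated(pearl, sat))
```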