986 resultados para Parallel methods
Resumo:
In this correspondence new robust nonlinear model construction algorithms for a large class of linear-in-the-parameters models are introduced to enhance model robustness via combined parameter regularization and new robust structural selective criteria. In parallel to parameter regularization, we use two classes of robust model selection criteria based on either experimental design criteria that optimizes model adequacy, or the predicted residual sums of squares (PRESS) statistic that optimizes model generalization capability, respectively. Three robust identification algorithms are introduced, i.e., combined A- and D-optimality with regularized orthogonal least squares algorithm, respectively; and combined PRESS statistic with regularized orthogonal least squares algorithm. A common characteristic of these algorithms is that the inherent computation efficiency associated with the orthogonalization scheme in orthogonal least squares or regularized orthogonal least squares has been extended such that the new algorithms are computationally efficient. Numerical examples are included to demonstrate effectiveness of the algorithms.
Resumo:
A connection between a fuzzy neural network model with the mixture of experts network (MEN) modelling approach is established. Based on this linkage, two new neuro-fuzzy MEN construction algorithms are proposed to overcome the curse of dimensionality that is inherent in the majority of associative memory networks and/or other rule based systems. The first construction algorithm employs a function selection manager module in an MEN system. The second construction algorithm is based on a new parallel learning algorithm in which each model rule is trained independently, for which the parameter convergence property of the new learning method is established. As with the first approach, an expert selection criterion is utilised in this algorithm. These two construction methods are equivalent in their effectiveness in overcoming the curse of dimensionality by reducing the dimensionality of the regression vector, but the latter has the additional computational advantage of parallel processing. The proposed algorithms are analysed for effectiveness followed by numerical examples to illustrate their efficacy for some difficult data based modelling problems.
Resumo:
Modelling the interaction of terahertz(THz) radiation with biological tissueposes many interesting problems. THzradiation is neither obviously described byan electric field distribution or anensemble of photons and biological tissueis an inhomogeneous medium with anelectronic permittivity that is bothspatially and frequency dependent making ita complex system to model.A three-layer system of parallel-sidedslabs has been used as the system throughwhich the passage of THz radiation has beensimulated. Two modelling approaches havebeen developed a thin film matrix model anda Monte Carlo model. The source data foreach of these methods, taken at the sametime as the data recorded to experimentallyverify them, was a THz spectrum that hadpassed though air only.Experimental verification of these twomodels was carried out using athree-layered in vitro phantom. Simulatedtransmission spectrum data was compared toexperimental transmission spectrum datafirst to determine and then to compare theaccuracy of the two methods. Goodagreement was found, with typical resultshaving a correlation coefficient of 0.90for the thin film matrix model and 0.78 forthe Monte Carlo model over the full THzspectrum. Further work is underway toimprove the models above 1 THz.
Resumo:
The adsorption of gases on microporous carbons is still poorly understood, partly because the structure of these carbons is not well known. Here, a model of microporous carbons based on fullerene- like fragments is used as the basis for a theoretical study of Ar adsorption on carbon. First, a simulation box was constructed, containing a plausible arrangement of carbon fragments. Next, using a new Monte Carlo simulation algorithm, two types of carbon fragments were gradually placed into the initial structure to increase its microporosity. Thirty six different microporous carbon structures were generated in this way. Using the method proposed recently by Bhattacharya and Gubbins ( BG), the micropore size distributions of the obtained carbon models and the average micropore diameters were calculated. For ten chosen structures, Ar adsorption isotherms ( 87 K) were simulated via the hyper- parallel tempering Monte Carlo simulation method. The isotherms obtained in this way were described by widely applied methods of microporous carbon characterisation, i. e. Nguyen and Do, Horvath - Kawazoe, high- resolution alpha(a)s plots, adsorption potential distributions and the Dubinin - Astakhov ( DA) equation. From simulated isotherms described by the DA equation, the average micropore diameters were calculated using empirical relationships proposed by different authors and they were compared with those from the BG method.
Resumo:
The DNA G-qadruplexes are one of the targets being actively explored for anti-cancer therapy by inhibiting them through small molecules. This computational study was conducted to predict the binding strengths and orientations of a set of novel dimethyl-amino-ethyl-acridine (DACA) analogues that are designed and synthesized in our laboratory, but did not diffract in Synchrotron light.Thecrystal structure of DNA G-Quadruplex(TGGGGT)4(PDB: 1O0K) was used as target for their binding properties in our studies.We used both the force field (FF) and QM/MM derived atomic charge schemes simultaneously for comparing the predictions of drug binding modes and their energetics. This study evaluates the comparative performance of fixed point charge based Glide XP docking and the quantum polarized ligand docking schemes. These results will provide insights on the effects of including or ignoring the drug-receptor interfacial polarization events in molecular docking simulations, which in turn, will aid the rational selection of computational methods at different levels of theory in future drug design programs. Plenty of molecular modelling tools and methods currently exist for modelling drug-receptor or protein-protein, or DNA-protein interactionssat different levels of complexities.Yet, the capasity of such tools to describevarious physico-chemical propertiesmore accuratelyis the next step ahead in currentresearch.Especially, the usage of most accurate methods in quantum mechanics(QM) is severely restricted by theirtedious nature. Though the usage of massively parallel super computing environments resulted in a tremendous improvement in molecular mechanics (MM) calculations like molecular dynamics,they are still capable of dealing with only a couple of tens to hundreds of atoms for QM methods. One such efficient strategy that utilizes thepowers of both MM and QM are the QM/MM hybrid methods. Lately, attempts have been directed towards the goal of deploying several different QM methods for betterment of force field based simulations, but with practical restrictions in place. One of such methods utilizes the inclusion of charge polarization events at the drug-receptor interface, that is not explicitly present in the MM FF.
Resumo:
In this paper, we extend to the time-harmonic Maxwell equations the p-version analysis technique developed in [R. Hiptmair, A. Moiola and I. Perugia, Plane wave discontinuous Galerkin methods for the 2D Helmholtz equation: analysis of the p-version, SIAM J. Numer. Anal., 49 (2011), 264-284] for Trefftz-discontinuous Galerkin approximations of the Helmholtz problem. While error estimates in a mesh-skeleton norm are derived parallel to the Helmholtz case, the derivation of estimates in a mesh-independent norm requires new twists in the duality argument. The particular case where the local Trefftz approximation spaces are built of vector-valued plane wave functions is considered, and convergence rates are derived.
Resumo:
Generally classifiers tend to overfit if there is noise in the training data or there are missing values. Ensemble learning methods are often used to improve a classifier's classification accuracy. Most ensemble learning approaches aim to improve the classification accuracy of decision trees. However, alternative classifiers to decision trees exist. The recently developed Random Prism ensemble learner for classification aims to improve an alternative classification rule induction approach, the Prism family of algorithms, which addresses some of the limitations of decision trees. However, Random Prism suffers like any ensemble learner from a high computational overhead due to replication of the data and the induction of multiple base classifiers. Hence even modest sized datasets may impose a computational challenge to ensemble learners such as Random Prism. Parallelism is often used to scale up algorithms to deal with large datasets. This paper investigates parallelisation for Random Prism, implements a prototype and evaluates it empirically using a Hadoop computing cluster.
Resumo:
Global communicationrequirements andloadimbalanceof someparalleldataminingalgorithms arethe major obstacles to exploitthe computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication costin parallel data mining algorithms and, in particular, in the k-means algorithm for cluster analysis. In the straightforward parallel formulation of the k-means algorithm, data and computation loads are uniformly distributed over the processing nodes. This approach has excellent load balancing characteristics that may suggest it could scale up to large and extreme-scale parallel computing systems. However, at each iteration step the algorithm requires a global reduction operationwhichhinders thescalabilityoftheapproach.Thisworkstudiesadifferentparallelformulation of the algorithm where the requirement of global communication is removed, while maintaining the same deterministic nature ofthe centralised algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or can be induced by means ofmulti-dimensional binary searchtrees. The approachcanalso be extended to accommodate an approximation error which allows a further reduction ofthe communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing element
Resumo:
Exascale systems are the next frontier in high-performance computing and are expected to deliver a performance of the order of 10^18 operations per second using massive multicore processors. Very large- and extreme-scale parallel systems pose critical algorithmic challenges, especially related to concurrency, locality and the need to avoid global communication patterns. This work investigates a novel protocol for dynamic group communication that can be used to remove the global communication requirement and to reduce the communication cost in parallel formulations of iterative data mining algorithms. The protocol is used to provide a communication-efficient parallel formulation of the k-means algorithm for cluster analysis. The approach is based on a collective communication operation for dynamic groups of processes and exploits non-uniform data distributions. Non-uniform data distributions can be either found in real-world distributed applications or induced by means of multidimensional binary search trees. The analysis of the proposed dynamic group communication protocol has shown that it does not introduce significant communication overhead. The parallel clustering algorithm has also been extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.
Resumo:
Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploit the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in iterative parallel data mining algorithms. In particular, the analysis focuses on one of the most influential and popular data mining methods, the k-means algorithm for cluster analysis. The straightforward parallel formulation of the k-means algorithm requires a global reduction operation at each iteration step, which hinders its scalability. This work studies a different parallel formulation of the algorithm where the requirement of global communication can be relaxed while still providing the exact solution of the centralised k-means algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real world distributed applications or can be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs.
Resumo:
Pure N,N`-di(methoxycarbonylsulfenyl)urea, [CH(3)OC(O)SNH](2)CO, is quantitatively prepared by the hydrolysis reaction of CH(3)OC(O)SNCO and characterized by (1)H NMR, GC-MS and FTIR spectroscopy techniques. Structural and conformational properties are analyzed using a combined approach with data obtained from X-ray diffraction, vibrational spectra and theoretical calculation methods. The IR and Raman spectra for normal and deuterated species are reported. The crystal structure of [CH(3)OC(O)SNH](2)CO was determined by X-ray diffraction methods. The substance crystallizes in the orthorhombic P2(1)2(1)2 space group with a = 9.524(2), b = 12.003(1), c = 4.481 (1) angstrom, and Z = 2 moieties in the unit cell. The molecule is sited on a twofold crystallographic axis (C(2)) parallel to c and shows the anti-anti conformation (S-N single bonds antiperiplanar with respect to the opposite C-N single bonds in sulfenyl-urea-sic group). Neighboring molecules are arranged in a chain motif that extends along the C(2)-axis and is held by bifurcated NH center dot center dot center dot O center dot center dot center dot HN intermolecular bonds. A local planar symmetry is observed in the crystal for the central -SN(H)C(O)N(H)S- skeleton. Experimental and calculated data allow to trace this structural feature to the occurrence of N-H center dot center dot center dot O=C hydrogen bonding interactions. Calculated vibrational and structural properties are in good agreement with the experimentally determined features. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
The InteGrade middleware intends to exploit the idle time of computing resources in computer laboratories. In this work we investigate the performance of running parallel applications with communication among processors on the InteGrade grid. As costly communication on a grid can be prohibitive, we explore the so-called systolic or wavefront paradigm to design the parallel algorithms in which no global communication is used. To evaluate the InteGrade middleware we considered three parallel algorithms that solve the matrix chain product problem, the 0-1 Knapsack Problem, and the local sequence alignment problem, respectively. We show that these three applications running under the InteGrade middleware and MPI take slightly more time than the same applications running on a cluster with only LAM-MPI support. The results can be considered promising and the time difference between the two is not substantial. The overhead of the InteGrade middleware is acceptable, in view of the benefits obtained to facilitate the use of grid computing by the user. These benefits include job submission, checkpointing, security, job migration, etc. Copyright (C) 2009 John Wiley & Sons, Ltd.
Resumo:
Approximate Lie symmetries of the Navier-Stokes equations are used for the applications to scaling phenomenon arising in turbulence. In particular, we show that the Lie symmetries of the Euler equations are inherited by the Navier-Stokes equations in the form of approximate symmetries that allows to involve the Reynolds number dependence into scaling laws. Moreover, the optimal systems of all finite-dimensional Lie subalgebras of the approximate symmetry transformations of the Navier-Stokes are constructed. We show how the scaling groups obtained can be used to introduce the Reynolds number dependence into scaling laws explicitly for stationary parallel turbulent shear flows. This is demonstrated in the framework of a new approach to derive scaling laws based on symmetry analysis [11]-[13].