956 results for Computationally efficient
Abstract:
In order to gain knowledge from large databases, scalable data mining technologies are needed. Data are captured on a large scale and thus databases are increasing at a fast pace. This leads to the utilisation of parallel computing technologies in order to cope with large amounts of data. In the area of classification rule induction, parallelisation has focused on the divide and conquer approach, also known as Top Down Induction of Decision Trees (TDIDT). An alternative approach to classification rule induction is separate and conquer, which has only recently become a focus of parallelisation. This work introduces and empirically evaluates a framework for the parallel induction of classification rules generated by members of the Prism family of algorithms. All members of the Prism family of algorithms follow the separate and conquer approach.
Abstract:
Classifiers generally tend to overfit if there is noise in the training data or there are missing values. Ensemble learning methods are often used to improve a classifier's classification accuracy. Most ensemble learning approaches aim to improve the classification accuracy of decision trees. However, alternative classifiers to decision trees exist. The recently developed Random Prism ensemble learner for classification aims to improve an alternative classification rule induction approach, the Prism family of algorithms, which addresses some of the limitations of decision trees. However, like any ensemble learner, Random Prism suffers from a high computational overhead due to replication of the data and the induction of multiple base classifiers. Hence even modest-sized datasets may pose a computational challenge to ensemble learners such as Random Prism. Parallelism is often used to scale up algorithms to deal with large datasets. This paper investigates parallelisation for Random Prism, implements a prototype and evaluates it empirically using a Hadoop computing cluster.
Abstract:
Advances in hardware and software technologies allow streaming data to be captured. The area of Data Stream Mining (DSM) is concerned with the analysis of these vast amounts of data as they are generated in real-time. Data stream classification is one of the most important DSM techniques, allowing previously unseen data instances to be classified. Unlike traditional classifiers for static data, data stream classifiers need to adapt to concept changes (concept drift) in the stream in real-time in order to reflect the most recent concept in the data as accurately as possible. A recent addition to the data stream classifier toolbox is eRules, which induces and updates a set of expressive rules that can easily be interpreted by humans. However, like most rule-based data stream classifiers, eRules exhibits poor computational performance when confronted with continuous attributes. In this work, we propose an approach to deal with continuous data effectively and accurately in rule-based classifiers by using the Gaussian distribution as a heuristic for building rule terms on continuous attributes. Using eRules as an example, we show that incorporating our method for continuous attributes indeed speeds up the real-time rule induction process while maintaining a similar level of accuracy compared with the original eRules classifier. We term this new version of eRules with our approach G-eRules.
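To make the idea of a Gaussian heuristic for continuous rule terms more concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the G-eRules procedure itself: the function names `gaussian_rule_term` and `covers`, and the default interval width of 1.96 standard deviations, are hypothetical choices that do not come from the abstract.

```python
import statistics


def gaussian_rule_term(values, width=1.96):
    """Build an interval rule term (lower, upper) for one continuous attribute.

    `values` are the attribute values observed for the target class.
    Assumption: the values are roughly Gaussian, so an interval of
    `width` standard deviations around the mean covers most of them.
    """
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    return (mean - width * std, mean + width * std)


def covers(term, x):
    """Return True if value x satisfies the rule term lower < x <= upper."""
    lower, upper = term
    return lower < x <= upper
```

A term built this way needs only the mean and standard deviation of the attribute, so it avoids testing every candidate split point, which is one plausible reason such a heuristic could be cheaper than exhaustive search.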
Abstract:
Advances in hardware technologies allow data to be captured and processed in real-time, and the resulting high-throughput data streams require novel data mining approaches. The research area of Data Stream Mining (DSM) is developing data mining algorithms that allow us to analyse these continuous streams of data in real-time. The creation and real-time adaptation of classification models from data streams is one of the most challenging DSM tasks. Current classifiers for streaming data address this problem by using incremental learning algorithms. However, even though these algorithms are fast, they are challenged by high-velocity data streams, where data instances arrive at a fast rate. This is problematic if an application requires little or no delay between changes in the patterns of the stream and absorption of these patterns by the classifier. Problems of scalability to Big Data of traditional data mining algorithms for static (non-streaming) datasets have been addressed through the development of parallel classifiers. However, there is very little work on the parallelisation of data stream classification techniques. In this paper we investigate K-Nearest Neighbours (KNN) as the basis for a real-time adaptive and parallel methodology for scalable data stream classification tasks.
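As a rough illustration of how KNN can adapt to a changing stream, the sketch below keeps only the most recent labelled instances in a fixed-size window. It is a sequential toy example, not the parallel methodology investigated in the paper; the class name `SlidingWindowKNN` and its parameters `k` and `window_size` are hypothetical.

```python
import math
from collections import Counter, deque


class SlidingWindowKNN:
    """Toy KNN classifier over a sliding window of recent instances."""

    def __init__(self, k=3, window_size=100):
        self.k = k
        # A deque with maxlen discards the oldest instance automatically,
        # so outdated concepts eventually leave the training set.
        self.window = deque(maxlen=window_size)

    def learn(self, features, label):
        """Add one labelled instance from the stream."""
        self.window.append((tuple(features), label))

    def predict(self, features):
        """Majority vote among the k nearest instances in the window."""
        if not self.window:
            return None
        query = tuple(features)
        nearest = sorted(
            self.window, key=lambda item: math.dist(item[0], query)
        )[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]
```

Because every prediction scans the whole window, the cost grows with the window size, which is the kind of bottleneck that parallelisation would aim to relieve.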
Abstract:
The Bergman cyclization of large polycyclic enediyne systems that mimic the cores of the enediyne anticancer antibiotics was studied using the ONIOM hybrid method. Tests on small enediynes show that ONIOM can accurately match experimental data. The effect of the triggering reaction in the natural products is investigated, and we support the argument that it is strain effects that lower the cyclization barrier. The barrier for the triggered molecule is very low, leading to a reasonable half-life at biological temperatures. No evidence is found that would suggest a concerted cyclization/H-atom abstraction mechanism is necessary for DNA cleavage.
Abstract:
Over the past 7 years, the enediyne anticancer antibiotics have been widely studied due to their DNA-cleaving ability. Interest in these antibiotics, represented by kedarcidin chromophore, neocarzinostatin chromophore, calicheamicin, esperamicin A, and dynemicin A, centres on the enediyne moiety contained within each of them. In its inactive form, the moiety is benign to its environment. Upon suitable activation, the system undergoes a Bergman cycloaromatization proceeding through a 1,4-dehydrobenzene diradical intermediate. It is this diradical intermediate that is thought to cleave double-stranded DNA through hydrogen atom abstraction. Semiempirical, semiempirical CI, Hartree–Fock ab initio, and MP2 electron correlation methods have been used to investigate the inactive hex-3-ene-1,5-diyne reactant, the 1,4-dehydrobenzene diradical, and a transition state structure of the Bergman reaction. Geometries calculated with different basis sets and by semiempirical methods have been used for single-point calculations using electron correlation methods. These results are compared with the best experimental and theoretical results reported in the literature. Implications of these results for computational studies of the enediyne anticancer antibiotics are discussed.
Abstract:
A computationally efficient procedure for modeling the alkaline hydrolysis of esters is proposed based on calculations performed on methyl acetate and methyl benzoate systems. Extensive geometry and energy comparisons were performed on the simple ester methyl acetate. The effectiveness of performing high level single point ab initio energy calculations on the geometries obtained from semiempirical and ab initio methods was determined. The AM1 and PM3 semiempirical methods are evaluated for their ability to model the transition states and intermediates for ester hydrolysis. The Cramer/Truhlar SM3 solvation method was used to determine activation energies. The most computationally efficient way to model the transition states of large esters is to use the PM3 method. The PM3 transition structure can then be used as a template for the design of haptens capable of inducing catalytic antibodies.
Abstract:
Phys. Rev. E 85, 026214-026219 (2012). Development of a new and efficient method for constructing scar functions along the unstable periodic orbits of classically chaotic systems.
Abstract:
DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT
Abstract:
Efficient numerical models facilitate the study and design of solid oxide fuel cells (SOFCs), stacks, and systems. Whilst the accuracy and reliability of the computed results are usually sought by researchers, the corresponding modelling complexities could result in practical difficulties regarding the implementation flexibility and computational costs. The main objective of this article is to adapt a simple but viable numerical tool for evaluation of our experimental rig. Accordingly, a model for a multi-layer SOFC surrounded by a constant temperature furnace is presented, trained and validated against experimental data. The model consists of a four-layer structure including stand, two interconnects, and PEN (Positive electrode-Electrolyte-Negative electrode); each being approximated by a lumped parameter model. The heating process through the surrounding chamber is also considered. We used a set of V-I characteristics data for parameter adjustment followed by model verification against two independent sets of data. The model results show a good agreement with practical data, offering a significant improvement compared to reduced models in which the impact of external heat loss is neglected. Furthermore, thermal analysis for adiabatic and non-adiabatic process is carried out to capture the thermal behaviour of a single cell followed by a polarisation loss assessment. Finally, model-based design of experiment is demonstrated for a case study.
Abstract:
Background: Biochemical systems with relatively low numbers of components must be simulated stochastically in order to capture their inherent noise. Although there has recently been considerable work on discrete stochastic solvers, there is still a need for numerical methods that are both fast and accurate. The Bulirsch-Stoer method is an established method for solving ordinary differential equations that possesses both of these qualities. Results: In this paper, we present the Stochastic Bulirsch-Stoer method, a new numerical method for simulating discrete chemical reaction systems, inspired by its deterministic counterpart. It achieves excellent efficiency because it is based on an approach with high deterministic order, allowing for larger stepsizes and leading to fast simulations. We compare it to the Euler τ-leap, as well as two more recent τ-leap methods, on a number of example problems, and find that as well as being very accurate, our method is the most robust, in terms of efficiency, of all the methods considered in this paper. The problems it is most suited for are those with increased populations that would be too slow to simulate using Gillespie’s stochastic simulation algorithm. For such problems, it is likely to achieve higher weak order in the moments. Conclusions: The Stochastic Bulirsch-Stoer method is a novel stochastic solver that can be used for fast and accurate simulations. Crucially, compared to other similar methods, it better retains its high accuracy when the timesteps are increased. Thus the Stochastic Bulirsch-Stoer method is both computationally efficient and robust. These are key properties for any stochastic numerical method, as they must typically run many thousands of simulations.
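For readers unfamiliar with the Euler τ-leap used as a baseline above, the following is a minimal sketch for a single decay reaction X → ∅. It is not the Stochastic Bulirsch-Stoer method; the function names, the choice of reaction, and the clamping of the population at zero are illustrative assumptions.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's method (suitable for small means)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def euler_tau_leap_decay(x0, rate, tau, steps, seed=0):
    """Simulate X -> 0 with propensity rate * x using fixed leaps of size tau.

    In each leap the number of reaction firings is drawn from a Poisson
    distribution with mean propensity * tau, instead of simulating every
    single reaction event as Gillespie's algorithm would.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(steps):
        propensity = rate * x
        firings = poisson(rng, propensity * tau)
        # Crude safeguard (an assumption of this sketch): never go negative.
        x = max(x - firings, 0)
        path.append(x)
    return path
```

Larger values of `tau` mean fewer steps but a coarser approximation, which is the accuracy trade-off that higher-order leap methods try to improve.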
Efficient implementations of a pseudodynamical stochastic filtering strategy for static elastography
Abstract:
A computationally efficient pseudodynamical filtering setup is established for elasticity imaging (i.e., reconstruction of shear modulus distribution) in soft-tissue organs given statically recorded and partially measured displacement data. Unlike a regularized quasi-Newton method (QNM) that needs inversion of ill-conditioned matrices, the authors explore pseudodynamic extended and ensemble Kalman filters (PD-EKF and PD-EnKF) that use a parsimonious representation of states and bypass explicit regularization by recursion over pseudotime. Numerical experiments with QNM and the two filters suggest that the PD-EnKF is the most robust performer as it exhibits no sensitivity to process noise covariance and yields good reconstruction even with small ensemble sizes.
Abstract:
In Universal Mobile Telecommunication Systems (UMTS), the Downlink Shared Channel (DSCH) can be used for providing streaming services. The traffic model for streaming services is different from the commonly used continuously backlogged model. Each connection specifies a required service rate over an interval of time, k, called the "control horizon". In this paper, our objective is to determine how k DSCH frames should be shared among a set of I connections. We need a scheduler that is efficient and fair, and we introduce the notion of discrepancy to balance the conflicting requirements of aggregate throughput and fairness. Our aim is to schedule the mobiles in such a way that the schedule minimizes the discrepancy over the k frames. We propose an optimal and computationally efficient algorithm, called STEM+. The proof of the optimality of STEM+ when applied to the UMTS rate sets is the major contribution of this paper. We also show that STEM+ performs better in terms of both fairness and aggregate throughput compared to other scheduling algorithms. Thus, STEM+ achieves both fairness and efficiency and is therefore an appealing algorithm for scheduling streaming connections.
Abstract:
A computationally efficient approach that computes the optimal regularization parameter for the Tikhonov-minimization scheme is developed for photoacoustic imaging. This approach is based on the least squares-QR decomposition, which is a well-known dimensionality reduction technique for a large system of equations. It is shown that the proposed framework is effective in terms of quantitative and qualitative reconstructions of the initial pressure distribution, enabled by finding an optimal regularization parameter. The computational efficiency and performance of the proposed method are shown using a numerical blood vessel phantom as a test case, where the initial pressure is exactly known for quantitative comparison. © 2013 Society of Photo-Optical Instrumentation Engineers (SPIE)
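For orientation, the standard form of Tikhonov minimization can be written as below. The symbols are assumptions made for this note and are not taken from the abstract: $A$ is the system matrix, $b$ the measured data, $x$ the unknown initial pressure distribution, and $\lambda > 0$ the regularization parameter whose optimal value the approach seeks.

```latex
\hat{x}_{\lambda} = \arg\min_{x} \; \lVert A x - b \rVert_2^2 + \lambda \lVert x \rVert_2^2 ,
\qquad
\hat{x}_{\lambda} = \left( A^{\top} A + \lambda I \right)^{-1} A^{\top} b .
```

A small $\lambda$ fits the data closely but amplifies noise, while a large $\lambda$ oversmooths the reconstruction, which is why the choice of $\lambda$ matters. Some texts write the penalty weight as $\lambda^2$ instead of $\lambda$.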
Abstract:
A receding horizon steering controller is presented, capable of pushing an oversteering nonlinear vehicle model to its handling limit while travelling at constant forward speed. The controller is able to optimise the vehicle path, using a computationally efficient and robust technique, so that the vehicle progression along a track is maximised as a function of time. The resultant method forms part of the solution to the motor racing objective of minimising lap time. © 2011 AACC American Automatic Control Council.