981 results for Computation time
Abstract:
Modelling and optimization of the power draw of large SAG/AG mills is important because of the large power draw that modern mills require (5-10 MW). The cost of grinding is the single biggest cost within the entire process of mineral extraction. Traditionally, modelling of the mill power draw has been done using empirical models. Although these models are reliable, they cannot model mills and operating conditions which are not within the model database boundaries. Also, because these models are static, they cannot determine the impact of changing conditions within the mill on the power draw. Despite advances in computing power, discrete element method (DEM) modelling of large mills with many thousands of particles can be a time-consuming task. The speed of computation is determined principally by two parameters: the number of particles involved and the material properties. The computational time step is determined by the size of the smallest particle present in the model and the material properties (stiffness). For small particles the computational time step will be short, whilst for large particles it will be longer. Hence, from the point of view of the time required for modelling (which usually corresponds to the time required for 3-4 mill revolutions), it is advantageous that the smallest particles in the model are not unnecessarily small. The objective of this work is to compare the net power draw of a mill whose charge is characterised by different size distributions, while keeping the mass of the charge and the mill speed constant. (C) 2004 Elsevier Ltd. All rights reserved.
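The dependence of the time step on particle size and stiffness can be illustrated with a minimal sketch. It assumes a linear spring contact, for which a particle of mass m on a spring of stiffness k oscillates with period 2*pi*sqrt(m/k), and takes the time step as a fraction of that period. This estimate, the function names, the safety fraction and all numerical values are assumptions made for illustration; the abstract does not state which contact model or time-step rule the authors used.

```python
import math


def particle_mass(radius_m, density_kg_m3):
    """Mass of a solid sphere of the given radius and density."""
    return density_kg_m3 * (4.0 / 3.0) * math.pi * radius_m ** 3


def dem_time_step(min_radius_m, density_kg_m3, stiffness_n_per_m, fraction=0.1):
    """Time step taken as a fraction of the contact period 2*pi*sqrt(m/k).

    Assumption: linear spring contact. `fraction` is a hypothetical
    safety factor, not a value taken from the abstract.
    """
    mass = particle_mass(min_radius_m, density_kg_m3)
    return fraction * 2.0 * math.pi * math.sqrt(mass / stiffness_n_per_m)


def steps_for(sim_time_s, min_radius_m, density_kg_m3, stiffness_n_per_m):
    """Number of time steps needed to cover sim_time_s of simulated time."""
    dt = dem_time_step(min_radius_m, density_kg_m3, stiffness_n_per_m)
    return math.ceil(sim_time_s / dt)


if __name__ == "__main__":
    # Hypothetical inputs: density 2700 kg/m^3, stiffness 1e6 N/m, 10 s simulated.
    for radius in (0.005, 0.010, 0.020):
        dt = dem_time_step(radius, 2700.0, 1.0e6)
        print(radius, dt, steps_for(10.0, radius, 2700.0, 1.0e6))
```

Under this assumed rule the time step scales with the smallest radius to the power 3/2, so halving the smallest particle size roughly triples the number of steps before counting the extra particles needed to keep the charge mass constant.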
Abstract:
The adsorption of simple Lennard-Jones fluids in a carbon slit pore of finite length was studied with Canonical Ensemble (NVT) and Gibbs Ensemble Monte Carlo Simulations (GEMC). The Canonical Ensemble was a collection of cubic simulation boxes in which a finite pore resides, while the Gibbs Ensemble was that of the pore space of the finite pore. Argon was used as a model for Lennard-Jones fluids, while the adsorbent was modelled as a finite carbon slit pore whose two walls were composed of three graphene layers with carbon atoms arranged in a hexagonal pattern. The Lennard-Jones (LJ) 12-6 potential model was used to compute the interaction energy between two fluid particles, and also between a fluid particle and a carbon atom. Argon adsorption isotherms were obtained at 87.3 K for pore widths of 1.0, 1.5 and 2.0 nm using both Canonical and Gibbs Ensembles. These results were compared with isotherms obtained with corresponding infinite pores using Grand Canonical Ensembles. The effects of the number of cycles necessary to reach equilibrium, the initial allocation of particles, the displacement step and the simulation box size were particularly investigated in the Monte Carlo simulation with Canonical Ensembles. Of these parameters, the displacement step had the most significant effect on the performance of the Monte Carlo simulation. The simulation box size was also important, especially at low pressures at which the size must be sufficiently large to have a statistically acceptable number of particles in the bulk phase. Finally, it was found that the Canonical Ensemble and the Gibbs Ensemble both yielded the same isotherm (within statistical error); however, the computation time for GEMC was shorter than that for canonical ensemble simulation. However, the latter method described the proper interface between the reservoir and the adsorbed phase (and hence the meniscus).
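As a small illustration of the 12-6 potential named above, the sketch below evaluates U(r) = 4*eps*[(sigma/r)^12 - (sigma/r)^6] for fluid-fluid and fluid-carbon pairs. The parameter values are commonly quoted ones for argon and graphitic carbon, and the Lorentz-Berthelot mixing rule for the cross interaction is a standard choice; both are assumptions here, since the abstract does not give the parameters or the mixing rule used.

```python
import math

# Illustrative parameters (assumptions, not taken from the abstract):
# collision diameter in nm, well depth expressed as epsilon/k_B in kelvin.
ARGON = {"sigma": 0.3405, "eps": 119.8}
CARBON = {"sigma": 0.3400, "eps": 28.0}


def lj_12_6(r, sigma, eps):
    """Lennard-Jones 12-6 pair energy, in the same units as eps."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


def lorentz_berthelot(a, b):
    """Cross parameters: arithmetic mean for sigma, geometric mean for eps."""
    return 0.5 * (a["sigma"] + b["sigma"]), math.sqrt(a["eps"] * b["eps"])


if __name__ == "__main__":
    sigma_fs, eps_fs = lorentz_berthelot(ARGON, CARBON)
    for r in (0.30, 0.35, 0.40, 0.60):
        print(r,
              lj_12_6(r, ARGON["sigma"], ARGON["eps"]),
              lj_12_6(r, sigma_fs, eps_fs))
```

The potential crosses zero at r = sigma and has its minimum, of depth eps, at r = 2^(1/6) * sigma.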
Abstract:
A set of DCT domain properties for shifting and scaling by real amounts, and taking linear operations such as differentiation is described. The DCT coefficients of a sampled signal are subjected to a linear transform, which returns the DCT coefficients of the shifted, scaled and/or differentiated signal. The properties are derived by considering the inverse discrete transform as a cosine series expansion of the original continuous signal, assuming sampling in accordance with the Nyquist criterion. This approach can be applied in the signal domain, to give, for example, DCT based interpolation or derivatives. The same approach can be taken in decoding from the DCT to give, for example, derivatives in the signal domain. The techniques may prove useful in compressed domain processing applications, and are interesting because they allow operations from the continuous domain such as differentiation to be implemented in the discrete domain. An image matching algorithm illustrates the use of the properties, with improvements in computation time and matching quality.
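The idea of treating the inverse DCT as a cosine series of a continuous signal can be sketched in the signal domain as follows. The code uses an unnormalised DCT-II, evaluates the cosine series at a real-valued position t (interpolation or shifting), and differentiates the series term by term. Function names and the normalisation are choices made for this illustration; the paper's own formulation operates on the coefficients through a linear transform, which is not reproduced here.

```python
import math


def dct2(x):
    """Unnormalised DCT-II: X_k = sum_n x_n * cos(pi * (n + 1/2) * k / N)."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def cosine_series(coeffs, t):
    """Evaluate the inverse transform at a real position t (in sample units)."""
    n = len(coeffs)
    value = coeffs[0] / n
    for k in range(1, n):
        value += (2.0 / n) * coeffs[k] * math.cos(math.pi * (t + 0.5) * k / n)
    return value


def cosine_series_derivative(coeffs, t):
    """Term-by-term derivative of the cosine series with respect to t."""
    n = len(coeffs)
    return -sum(
        (2.0 / n) * coeffs[k] * (math.pi * k / n)
        * math.sin(math.pi * (t + 0.5) * k / n)
        for k in range(1, n)
    )


if __name__ == "__main__":
    samples = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
    c = dct2(samples)
    print(cosine_series(c, 3.0))             # reproduces the sample at index 3
    print(cosine_series(c, 3.5))             # value between samples 3 and 4
    print(cosine_series_derivative(c, 3.5))  # slope per sample interval
```

At integer positions the series reproduces the original samples; at non-integer positions it gives the DCT-based interpolant.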
Abstract:
Despite the number of computer-assisted methods described for the derivation of steady-state equations of enzyme systems, most of them are focused on strict steady-state conditions or are not able to solve complex reaction mechanisms. Moreover, many of them are based on computer programs that are either not readily available or have limitations. We present here a computer program called WinStes, which derives equations for both strict steady-state systems and those with the assumption of rapid equilibrium, for branched or unbranched mechanisms, containing both reversible and irreversible conversion steps. It solves reaction mechanisms involving up to 255 enzyme species, connected by up to 255 conversion steps. The program provides all the advantages of the Windows programs, such as a user-friendly graphical interface, and has a short computation time. WinStes is available free of charge on request from the authors. (c) 2006 Elsevier Inc. All rights reserved.
Abstract:
A theoretical model is presented which describes selection in a genetic algorithm (GA) under a stochastic fitness measure and correctly accounts for finite population effects. Although this model describes a number of selection schemes, we only consider Boltzmann selection in detail here as results for this form of selection are particularly transparent when fitness is corrupted by additive Gaussian noise. Finite population effects are shown to be of fundamental importance in this case, as the noise has no effect in the infinite population limit. In the limit of weak selection we show how the effects of any Gaussian noise can be removed by increasing the population size appropriately. The theory is tested on two closely related problems: the one-max problem corrupted by Gaussian noise and generalization in a perceptron with binary weights. The averaged dynamics can be accurately modelled for both problems using a formalism which describes the dynamics of the GA using methods from statistical mechanics. The second problem is a simple example of a learning problem and by considering this problem we show how the accurate characterization of noise in the fitness evaluation may be relevant in machine learning. The training error (negative fitness) is the number of misclassified training examples in a batch and can be considered as a noisy version of the generalization error if an independent batch is used for each evaluation. The noise is due to the finite batch size and in the limit of large problem size and weak selection we show how the effect of this noise can be removed by increasing the population size. This allows the optimal batch size to be determined, which minimizes computation time as well as the total number of training examples required.
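Boltzmann selection under a noisy fitness measure can be sketched as below for the one-max problem. Each individual is selected with probability proportional to exp(beta * f), where f is the true fitness plus additive Gaussian noise. The function names, population size, beta and noise level are hypothetical choices for illustration; the sketch shows only the selection step and not the statistical-mechanics formalism of the paper.

```python
import math
import random


def onemax(bits):
    """One-max fitness: the number of ones in the bit string."""
    return sum(bits)


def boltzmann_weights(fitnesses, beta):
    """Selection probabilities proportional to exp(beta * fitness)."""
    top = max(fitnesses)  # subtract the maximum for numerical stability
    raw = [math.exp(beta * (f - top)) for f in fitnesses]
    total = sum(raw)
    return [w / total for w in raw]


def noisy_boltzmann_select(population, beta, noise_sd, rng):
    """Resample the population using fitness corrupted by Gaussian noise."""
    noisy = [onemax(ind) + rng.gauss(0.0, noise_sd) for ind in population]
    weights = boltzmann_weights(noisy, beta)
    return rng.choices(population, weights=weights, k=len(population))


if __name__ == "__main__":
    rng = random.Random(1)
    pop = [[rng.randint(0, 1) for _ in range(16)] for _ in range(50)]
    selected = noisy_boltzmann_select(pop, beta=0.3, noise_sd=2.0, rng=rng)
    print(sum(map(onemax, pop)) / len(pop),
          sum(map(onemax, selected)) / len(selected))
```

With beta = 0 every individual is equally likely to be selected; larger beta means stronger selection, and the noise only matters because the population is finite.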
Abstract:
The research carried out in this thesis was mainly concerned with the effects of large induction motors and their transient performance in power systems. Computer packages using the three phase co-ordinate frame of reference were developed to simulate the induction motor transient performance. A technique using matrix algebra was developed to allow extension of the three phase co-ordinate method to analyse asymmetrical and symmetrical faults on both sides of the three phase delta-star transformer which is usually required when connecting large induction motors to the supply system. System simulation, applying these two techniques, was used to study the transient stability of a power system. The response of a typical system, loaded with a group of large induction motors, two three-phase delta-star transformers, a synchronous generator and an infinite system was analysed. The computer software developed to study this system has the advantage that different types of fault at different locations can be studied by simple changes in input data. The research also involved investigating the possibility of using different integrating routines such as the Runge-Kutta-Gill, Runge-Kutta-Fehlberg and Predictor-Corrector methods. The investigation enables the reduction of computation time, which is necessary when solving the induction motor equations expressed in terms of the three phase variables. The outcome of this investigation was utilised in analysing an introductory model (containing only minimal control action) of an isolated system having a significant induction motor load compared to the size of the generator energising the system.
Abstract:
This work was partially supported by the Bulgarian National Science Fund under Contract No MM 1405. Part of the results were announced at the Fifth International Workshop on Optimal Codes and Related Topics (OCRT), White Lagoon, June 2007, Bulgaria
Abstract:
Respiratory gating in lung PET imaging to compensate for respiratory motion artifacts is a current research issue with broad potential impact on quantitation, diagnosis and clinical management of lung tumors. However, PET images collected at discrete bins can be significantly affected by noise, as there are lower activity counts in each gated bin unless the total PET acquisition time is prolonged, so gating methods should be combined with imaging-based motion correction and registration methods. The aim of this study was to develop and validate a fast and practical solution to the problem of respiratory motion for the detection and accurate quantitation of lung tumors in PET images. This included: (1) developing a computer-assisted algorithm for PET/CT images that automatically segments lung regions in CT images and identifies and localizes lung tumors in PET images; (2) developing and comparing different registration algorithms which process all the information within the entire respiratory cycle and integrate the tumor from the different gated bins into a single reference bin. Four registration/integration algorithms (Centroid Based, Intensity Based, Rigid Body and Optical Flow registration) were compared, as well as two registration schemes (Direct Scheme and Successive Scheme). Validation was demonstrated by conducting experiments with the computerized 4D NCAT phantom and with a dynamic lung-chest phantom imaged using a GE PET/CT System. Iterations were conducted on simulated tumors of different sizes and at different noise levels. Static tumors without respiratory motion were used as the gold standard; quantitative results were compared with respect to tumor activity concentration, cross-correlation coefficient, relative noise level and computation time. Comparing the results of the tumors before and after correction, the tumor activity values and tumor volumes after correction were closer to those of the static tumors (gold standard). Higher correlation values and lower noise were also achieved after applying the correction algorithms. With this method a compromise between short PET scan time and reduced image noise can be achieved, while quantification and clinical analysis become fast and precise.
Abstract:
This dissertation aims to improve the performance of existing assignment-based dynamic origin-destination (O-D) matrix estimation models to successfully apply Intelligent Transportation Systems (ITS) strategies for the purposes of traffic congestion relief and dynamic traffic assignment (DTA) in transportation network modeling. The methodology framework has two advantages over the existing assignment-based dynamic O-D matrix estimation models. First, it combines an initial O-D estimation model into the estimation process to provide a high confidence level of initial input for the dynamic O-D estimation model, which has the potential to improve the final estimation results and reduce the associated computation time. Second, the proposed methodology framework can automatically convert traffic volume deviation to traffic density deviation in the objective function under congested traffic conditions. Traffic density is a better indicator for traffic demand than traffic volume under congested traffic condition, thus the conversion can contribute to improving the estimation performance. The proposed method indicates a better performance than a typical assignment-based estimation model (Zhou et al., 2003) in several case studies. In the case study for I-95 in Miami-Dade County, Florida, the proposed method produces a good result in seven iterations, with a root mean square percentage error (RMSPE) of 0.010 for traffic volume and a RMSPE of 0.283 for speed. In contrast, Zhou's model requires 50 iterations to obtain a RMSPE of 0.023 for volume and a RMSPE of 0.285 for speed. In the case study for Jacksonville, Florida, the proposed method reaches a convergent solution in 16 iterations with a RMSPE of 0.045 for volume and a RMSPE of 0.110 for speed, while Zhou's model needs 10 iterations to obtain the best solution, with a RMSPE of 0.168 for volume and a RMSPE of 0.179 for speed. 
The successful application of the proposed methodology framework to real road networks demonstrates its ability to provide results both with satisfactory accuracy and within a reasonable time, thus establishing its potential usefulness to support dynamic traffic assignment modeling, ITS systems, and other strategies.
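The accuracy figures above are reported as root mean square percentage errors. A minimal sketch of that metric, assuming the usual definition (the square root of the mean squared relative deviation between estimated and observed values), is given below; the abstract does not spell out the formula, and the link counts in the example are made up for illustration.

```python
import math


def rmspe(estimated, observed):
    """Root mean square percentage error, returned as a fraction (0.1 = 10%).

    Assumed definition: sqrt(mean(((estimated - observed) / observed) ** 2)).
    Observed values must be non-zero.
    """
    if len(estimated) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(((e - o) / o) ** 2 for e, o in zip(estimated, observed))
    return math.sqrt(total / len(observed))


if __name__ == "__main__":
    # Hypothetical link volumes (vehicles per interval), not data from the study.
    observed_volume = [1200.0, 950.0, 1430.0, 800.0]
    estimated_volume = [1185.0, 962.0, 1441.0, 790.0]
    print(rmspe(estimated_volume, observed_volume))
```

Because each deviation is divided by the observed value, the metric is dimensionless, which is what allows the volume and speed errors to be quoted on the same scale.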
Abstract:
In recent years, there has been an enormous growth of location-aware devices, such as GPS embedded cell phones, mobile sensors and radio-frequency identification tags. The age of combining sensing, processing and communication in one device gives rise to a vast number of applications, leading to endless possibilities and a realization of mobile Wireless Sensor Network (mWSN) applications. As computing, sensing and communication become more ubiquitous, trajectory privacy becomes a critical piece of information and an important factor for commercial success. While on the move, sensor nodes continuously transmit data streams of sensed values and spatiotemporal information, known as "trajectory information". If adversaries can intercept this information, they can monitor the trajectory path and capture the location of the source node.
This research stems from the recognition that the wide applicability of mWSNs will remain elusive unless a trajectory privacy preservation mechanism is developed. The outcome seeks to lay a firm foundation in the field of trajectory privacy preservation in mWSNs against external and internal trajectory privacy attacks. First, to prevent external attacks, we investigated a context-based trajectory privacy-aware routing protocol to prevent the eavesdropping attack. Traditional shortest-path oriented routing algorithms give adversaries the possibility to locate the target node in a certain area. We designed a novel privacy-aware routing phase and utilized the trajectory dissimilarity between mobile nodes to mislead adversaries about the location where the message started its journey. Second, to detect internal attacks, we developed a software-based attestation solution to detect compromised nodes. We created a dynamic attestation node chain among neighboring nodes to examine the memory checksum of suspicious nodes. The computation time for memory traversal was improved compared to previous work. Finally, we revisited the trust issue in trajectory privacy preservation mechanism designs. We used Bayesian game theory to model and analyze the behaviors of cooperative, selfish and malicious nodes in trajectory privacy preservation activities.
Abstract:
Speckle is being used as a characterization tool for the analysis of the dynamics of slowly varying phenomena occurring in biological and industrial samples. The retrieved data takes the form of a sequence of speckle images. The analysis of these images should reveal the inner dynamics of the biological or physical process taking place in the sample. Very recently, it has been shown that principal component analysis is able to split the original data set into a collection of classes. These classes can be related to the dynamics of the observed phenomena. At the same time, statistical descriptors of biospeckle images have been used to retrieve information on the characteristics of the sample. These statistical descriptors can be calculated in almost real time and provide fast monitoring of the sample. On the other hand, principal component analysis requires a longer computation time, but the results contain more information related to spatial-temporal patterns that can be identified with physical processes. This contribution merges both descriptions and uses principal component analysis as a pre-processing tool to obtain a collection of filtered images from which a simpler statistical descriptor can be calculated. The method has been applied to slowly varying biological and industrial processes.
Abstract:
Many modern applications fall into the category of "large-scale" statistical problems, in which both the number of observations n and the number of features or parameters p may be large. Many existing methods focus on point estimation, despite the continued relevance of uncertainty quantification in the sciences, where the number of parameters to estimate often exceeds the sample size, despite huge increases in the value of n typically seen in many fields. Thus, the tendency in some areas of industry to dispense with traditional statistical analysis on the basis that "n=all" is of little relevance outside of certain narrow applications. The main result of the Big Data revolution in most fields has instead been to make computation much harder without reducing the importance of uncertainty quantification. Bayesian methods excel at uncertainty quantification, but often scale poorly relative to alternatives. This conflict between the statistical advantages of Bayesian procedures and their substantial computational disadvantages is perhaps the greatest challenge facing modern Bayesian statistics, and is the primary motivation for the work presented here.
Two general strategies for scaling Bayesian inference are considered. The first is the development of methods that lend themselves to faster computation, and the second is design and characterization of computational algorithms that scale better in n or p. In the first instance, the focus is on joint inference outside of the standard problem of multivariate continuous data that has been a major focus of previous theoretical work in this area. In the second area, we pursue strategies for improving the speed of Markov chain Monte Carlo algorithms, and characterizing their performance in large-scale settings. Throughout, the focus is on rigorous theoretical evaluation combined with empirical demonstrations of performance and concordance with the theory.
One topic we consider is modeling the joint distribution of multivariate categorical data, often summarized in a contingency table. Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. In Chapter 2, we derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.
Latent class models for the joint distribution of multivariate categorical data, such as the PARAFAC decomposition, play an important role in the analysis of population structure. In this context, the number of latent classes is interpreted as the number of genetically distinct subpopulations of an organism, an important factor in the analysis of evolutionary processes and conservation status. Existing methods focus on point estimates of the number of subpopulations, and lack robust uncertainty quantification. Moreover, whether the number of latent classes in these models is even an identified parameter is an open question. In Chapter 3, we show that when the model is properly specified, the correct number of subpopulations can be recovered almost surely. We then propose an alternative method for estimating the number of latent subpopulations that provides good quantification of uncertainty, and provide a simple procedure for verifying that the proposed method is consistent for the number of subpopulations. The performance of the model in estimating the number of subpopulations and other common population structure inference problems is assessed in simulations and a real data application.
In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis--Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. In Chapter 4 we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis--Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.
Another challenging and somewhat non-standard joint modeling problem is inference on tail dependence in stochastic processes. In applications where extreme dependence is of interest, data are almost always time-indexed. Existing methods for inference and modeling in this setting often cluster extreme events or choose window sizes with the goal of preserving temporal information. In Chapter 5, we propose an alternative paradigm for inference on tail dependence in stochastic processes with arbitrary temporal dependence structure in the extremes, based on the idea that the information on strength of tail dependence and the temporal structure in this dependence are both encoded in waiting times between exceedances of high thresholds. We construct a class of time-indexed stochastic processes with tail dependence obtained by endowing the support points in de Haan's spectral representation of max-stable processes with velocities and lifetimes. We extend Smith's model to these max-stable velocity processes and obtain the distribution of waiting times between extreme events at multiple locations. Motivated by this result, a new definition of tail dependence is proposed that is a function of the distribution of waiting times between threshold exceedances, and an inferential framework is constructed for estimating the strength of extremal dependence and quantifying uncertainty in this paradigm. The method is applied to climatological, financial, and electrophysiology data.
The remainder of this thesis focuses on posterior computation by Markov chain Monte Carlo. The Markov Chain Monte Carlo method is the dominant paradigm for posterior computation in Bayesian analysis. It has long been common to control computation time by making approximations to the Markov transition kernel. Comparatively little attention has been paid to convergence and estimation error in these approximating Markov Chains. In Chapter 6, we propose a framework for assessing when to use approximations in MCMC algorithms, and how much error in the transition kernel should be tolerated to obtain optimal estimation performance with respect to a specified loss function and computational budget. The results require only ergodicity of the exact kernel and control of the kernel approximation accuracy. The theoretical framework is applied to approximations based on random subsets of data, low-rank approximations of Gaussian processes, and a novel approximating Markov chain for discrete mixture models.
Data augmentation Gibbs samplers are arguably the most popular class of algorithm for approximately sampling from the posterior distribution for the parameters of generalized linear models. The truncated Normal and Polya-Gamma data augmentation samplers are standard examples for probit and logit links, respectively. Motivated by an important problem in quantitative advertising, in Chapter 7 we consider the application of these algorithms to modeling rare events. We show that when the sample size is large but the observed number of successes is small, these data augmentation samplers mix very slowly, with a spectral gap that converges to zero at a rate at least proportional to the reciprocal of the square root of the sample size up to a log factor. In simulation studies, moderate sample sizes result in high autocorrelations and small effective sample sizes. Similar empirical results are observed for related data augmentation samplers for multinomial logit and probit models. When applied to a real quantitative advertising dataset, the data augmentation samplers mix very poorly. Conversely, Hamiltonian Monte Carlo and a type of independence chain Metropolis algorithm show good mixing on the same dataset.
Abstract:
Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach for down-scaling the problem size is to first partition the dataset into subsets and then fit using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space), and the challenge arises in defining an algorithm with low communication, theoretical guarantees and excellent practical performance in general settings. For sample space partitioning, I propose a MEdian Selection Subset AGgregation Estimator (message) algorithm for solving these issues. The algorithm applies feature selection in parallel for each subset using regularized regression or a Bayesian variable selection method, calculates the 'median' feature inclusion index, estimates coefficients for the selected features in parallel for each subset, and then averages these estimates. The algorithm is simple, involves minimal communication, scales efficiently in sample size, and has theoretical guarantees. I provide extensive experiments to show excellent performance in feature selection, estimation, prediction, and computation time relative to usual competitors.
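The four steps of the message algorithm described above (select per subset, take the median inclusion index, fit per subset, average) can be sketched as follows. The sketch covers only the aggregation logic: the `select` and `fit` callables are hypothetical placeholders for whatever variable selection and estimation routines are run on each subset, the tie rule for an even number of subsets is an assumption, and the per-subset calls are shown sequentially although they are independent and could run in parallel.

```python
from statistics import median


def median_inclusion(inclusion_vectors):
    """Feature-wise median of 0/1 inclusion vectors, one vector per subset.

    Ties (possible with an even number of subsets) are resolved toward
    inclusion here; that tie rule is an assumption of this sketch.
    """
    n_features = len(inclusion_vectors[0])
    return [
        int(median([v[j] for v in inclusion_vectors]) >= 0.5)
        for j in range(n_features)
    ]


def average_estimates(estimates):
    """Average the per-subset coefficient vectors element by element."""
    count = len(estimates)
    return [sum(column) / count for column in zip(*estimates)]


def message_aggregate(subsets, select, fit):
    """Select per subset, take the median model, refit per subset, average."""
    inclusion = [select(s) for s in subsets]     # step 1: per-subset selection
    chosen = median_inclusion(inclusion)         # step 2: median inclusion index
    idx = [j for j, keep in enumerate(chosen) if keep]
    estimates = [fit(s, idx) for s in subsets]   # step 3: per-subset estimation
    return idx, average_estimates(estimates)     # step 4: averaging


if __name__ == "__main__":
    # Toy stand-ins: fixed selections and fits, purely to exercise the flow.
    toy_selection = {0: [1, 0, 1], 1: [1, 1, 0], 2: [1, 0, 0]}
    idx, coef = message_aggregate(
        [0, 1, 2],
        select=lambda s: toy_selection[s],
        fit=lambda s, cols: [float(s + 1)] * len(cols),
    )
    print(idx, coef)
```

Only the inclusion vectors and the coefficient vectors cross subset boundaries, which is consistent with the low communication cost claimed for the algorithm.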
While sample space partitioning is useful in handling datasets with large sample size, feature space partitioning is more effective when the data dimension is high. Existing methods for partitioning features, however, are either vulnerable to high correlations or inefficient in reducing the model dimension. In the thesis, I propose a new embarrassingly parallel framework named DECO for distributed variable selection and parameter estimation. In DECO, variables are first partitioned and allocated to m distributed workers. The decorrelated subset data within each worker are then fitted via any algorithm designed for high-dimensional problems. We show that by incorporating the decorrelation step, DECO can achieve consistent variable selection and parameter estimation on each subset with (almost) no assumptions. In addition, the convergence rate is nearly minimax optimal for both sparse and weakly sparse models and does NOT depend on the partition number m. Extensive numerical experiments are provided to illustrate the performance of the new framework.
For datasets with both large sample sizes and high dimensionality, I propose a new "divided-and-conquer" framework DEME (DECO-message) by leveraging both the DECO and the message algorithms. The new framework first partitions the dataset in the sample space into row cubes using message and then partitions the feature space of the cubes using DECO. This procedure is equivalent to partitioning the original data matrix into multiple small blocks, each with a feasible size that can be stored and fitted in a computer in parallel. The results are then synthesized via the DECO and message algorithms in reverse order to produce the final output. The whole framework is extremely scalable.
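The two-way partition described above, rows first and then columns, amounts to cutting the data matrix into a grid of blocks. The sketch below shows only that bookkeeping on a list-of-lists matrix. The function names are hypothetical, and the decorrelation step of DECO and the fitting and recombination steps are deliberately left out.

```python
def split_indices(n, parts):
    """Split range(n) into `parts` contiguous, nearly equal index ranges."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def partition_blocks(matrix, row_parts, col_parts):
    """Return blocks[r][c]: rows of row group r restricted to column group c."""
    row_groups = split_indices(len(matrix), row_parts)
    col_groups = split_indices(len(matrix[0]), col_parts)
    return [
        [[[matrix[i][j] for j in cols] for i in rows] for cols in col_groups]
        for rows in row_groups
    ]


if __name__ == "__main__":
    # A 4 x 6 toy matrix cut into 2 row groups and 3 column groups.
    data = [[r * 6 + c for c in range(6)] for r in range(4)]
    blocks = partition_blocks(data, row_parts=2, col_parts=3)
    for block_row in blocks:
        print(block_row)
```

Each block can then be handed to a separate worker, and every entry of the original matrix belongs to exactly one block.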
Abstract:
The distribution, abundance, behaviour, and morphology of marine species are affected by spatial variability in the wave environment. Maps of wave metrics (e.g. significant wave height Hs, peak energy wave period Tp, and benthic wave orbital velocity URMS) are therefore useful for predictive ecological models of marine species and ecosystems. A number of techniques are available to generate maps of wave metrics, with varying levels of complexity in terms of input data requirements, operator knowledge, and computation time. Relatively simple "fetch-based" models are generated using geographic information system (GIS) layers of bathymetry and dominant wind speed and direction. More complex, but computationally expensive, "process-based" models are generated using numerical models such as the Simulating Waves Nearshore (SWAN) model. We generated maps of wave metrics based on both fetch-based and process-based models and asked whether predictive performance in models of benthic marine habitats differed. Predictive models of seagrass distribution for Moreton Bay, Southeast Queensland, and Lizard Island, Great Barrier Reef, Australia, were generated using maps based on each type of wave model. For Lizard Island, performance of the process-based wave maps was significantly better for describing the presence of seagrass, based on Hs, Tp, and URMS. Conversely, for the predictive model of seagrass in Moreton Bay, based on benthic light availability and Hs, there was no difference in performance between the maps from the two types of wave model. For predictive models where wave metrics are the dominant factor determining ecological processes, it is recommended that process-based models be used. Our results suggest that for models where wave metrics provide secondarily useful information, either fetch- or process-based models may be equally useful.