868 resultados para Bayesian algorithm
Resumo:
This work presents software, developed in the Delphi programming language, that computes a reservoir's annual regulated active storage using the sequent-peak algorithm. Mathematical models used for that purpose generally require extended hydrological series, and the analysis of those series is usually performed with spreadsheets or graphical representations. For that reason, a program for calculating reservoir active capacity was developed. An example calculation is shown using 30 years (from 1977 to 2009) of monthly mean flow historical data from the Corrente River, located in the São Francisco River Basin, Brazil. As an additional tool, an interface was developed to manage water resources, helping to manipulate data and to point out information of interest to the user. Moreover, with that interface, irrigation districts where water consumption is higher can be analyzed as a function of specific seasonal water demand situations. The practical application shows that the program performs the calculation originally proposed. It was designed to keep information organized and retrievable at any time, and to simulate seasonal water demands throughout the year, contributing to the study of reservoir projects. With this functionality, the program is an important tool for decision making in water resources management.
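The sequent-peak calculation named in this abstract can be sketched in a few lines. The sketch below is in Python rather than Delphi and is not the authors' program. The function name, the two-pass treatment of the record, and the inflow and demand figures are illustrative assumptions, not the Corrente River data.

```python
def sequent_peak(inflows, demands):
    """Required active storage by the sequent-peak recurrence.

    K_t = max(0, K_{t-1} + D_t - Q_t), and the required capacity is max K_t.
    The record is traversed twice (an assumption of this sketch) so that a
    critical period wrapping around the end of the series is also captured.
    """
    if len(inflows) != len(demands):
        raise ValueError("inflows and demands must have the same length")
    deficit = 0.0
    capacity = 0.0
    for _ in range(2):
        for q, d in zip(inflows, demands):
            deficit = max(0.0, deficit + d - q)
            capacity = max(capacity, deficit)
    return capacity


# Hypothetical monthly inflows and a constant demand, in the same volume units.
inflows = [1, 5, 6, 2, 1, 1]
demands = [2] * len(inflows)
required = sequent_peak(inflows, demands)
print(required)  # 3.0
```

The recurrence only converges when total demand does not exceed total inflow over the record. Otherwise the deficit keeps growing with every pass.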
Resumo:
Statistical analyses of measurements that can be described by statistical models are essential in astronomy and in scientific inquiry in general. The sensitivity of such analyses, modelling approaches, and the consequent predictions is sometimes highly dependent on the exact techniques applied, and improvements therein can result in a significantly better understanding of the observed system of interest. In particular, optimising the sensitivity of statistical techniques in detecting the faint signatures of low-mass planets orbiting nearby stars is, together with improvements in instrumentation, essential in estimating the properties of the population of such planets and in the race to detect Earth analogs, i.e. planets that could support liquid water and, perhaps, life on their surfaces. We review the developments in Bayesian statistical techniques applicable to detecting planets orbiting nearby stars and to astronomical data analysis problems in general. We also discuss these techniques and demonstrate their usefulness with various examples and detailed descriptions of the mathematics involved. We demonstrate the practical aspects of Bayesian statistical techniques by describing several algorithms and numerical techniques, as well as theoretical constructions, for the estimation of model parameters and for hypothesis testing. We also apply these algorithms to Doppler measurements of nearby stars to show how they can be used in practice to obtain as much information from the noisy data as possible. Bayesian statistical techniques are powerful tools for analysing and interpreting noisy data and should be preferred in practice whenever computational limitations are not too restrictive.
Resumo:
Determining the intersection curve between Bézier surfaces may be seen as the composition of two separate problems: determining initial points and tracing the intersection curve from these points. A Bézier surface is represented by a parametric function (a polynomial in two variables) that maps a point of the two-dimensional parametric space to a point in three-dimensional space. This article proposes an algorithm to determine the initial points of the intersection curve of Bézier surfaces, based on the solution of polynomial systems with the Projected Polyhedral Method, followed by a method for tracing the intersection curves (the Marching Method with differential equations). To allow the use of the Projected Polyhedral Method, the equations of the system must be represented in terms of the Bernstein basis. Toward this goal, a robust and reliable algorithm is proposed to exactly transform a multivariable polynomial written in the power basis into a polynomial written in the Bernstein basis.
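The change of basis mentioned at the end of this abstract can be illustrated for a single variable. The sketch below is not the article's algorithm, which handles multivariable polynomials. It uses the standard identity b_j = sum over i from 0 to j of C(j, i) / C(n, i) * a_i for a degree-n polynomial on [0, 1], with exact rational arithmetic. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def power_to_bernstein(a):
    """Convert power-basis coefficients a[0..n] to Bernstein coefficients.

    Uses b_j = sum_{i=0}^{j} C(j, i) / C(n, i) * a_i, computed with Fractions
    so that integer or rational inputs are converted without rounding.
    """
    n = len(a) - 1
    return [
        sum(Fraction(comb(j, i), comb(n, i)) * a[i] for i in range(j + 1))
        for j in range(n + 1)
    ]


def eval_power(a, t):
    """Evaluate sum a_i * t**i."""
    return sum(c * t**i for i, c in enumerate(a))


def eval_bernstein(b, t):
    """Evaluate sum b_j * C(n, j) * t**j * (1 - t)**(n - j)."""
    n = len(b) - 1
    return sum(c * comb(n, j) * t**j * (1 - t) ** (n - j) for j, c in enumerate(b))


# p(t) = 1 + 2t + 3t^2 has Bernstein coefficients 1, 2, 6 on [0, 1].
coeffs = power_to_bernstein([1, 2, 3])
half = Fraction(1, 2)
assert eval_power([1, 2, 3], half) == eval_bernstein(coeffs, half)
```

Working in rationals is one way to obtain an exact conversion. Whether the article uses the same device is not stated in the abstract.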
Resumo:
In this paper we present an algorithm for the numerical simulation of cavitation in the hydrodynamic lubrication of journal bearings. Although this physical process is usually modelled as a free boundary problem, we adopted the equivalent variational inequality formulation. We propose a two-level iterative algorithm, where the outer iteration is associated with the penalty method, used to transform the variational inequality into a variational equation, and the inner iteration is associated with the conjugate gradient method, used to solve the linear system generated by applying the finite element method to the variational equation. This inner part was implemented using the element-by-element strategy, which is easily parallelized. We analyse the behavior of two physical parameters and discuss some numerical results. We also analyse some results related to the performance of a parallel implementation of the algorithm.
Resumo:
Prediction means estimating the future value of an observable quantity. A hallmark of the Bayesian paradigm is that uncertainty about unknown quantities is expressed in the form of probabilities. A Bayesian predictive model is thus a probability distribution over the possible values that an observable, but not yet observed, quantity can take. The articles included in the thesis develop methods that are applied, among other things, to the analysis of chromatographic data in criminal investigations. With the exception of the first article, all the methods are based on Bayesian predictive modelling. The articles mainly consider three types of problems related to chromatographic data: quantification, pairwise matching, and clustering. The first article develops a non-parametric model for the measurement error of chromatographic analyses of blood alcohol content. The second article develops a predictive inference method for comparing two samples. The third article applies this method to the comparison of oil samples in order to identify the polluting source in connection with oil spills. The fourth article derives a predictive model for clustering data of mixed discrete and continuous type, which is applied, among other things, to classifying amphetamine samples with respect to production batches.
Resumo:
The amount of biological data has grown exponentially in recent decades. Modern biotechnologies, such as microarrays and next-generation sequencing, are capable of producing massive amounts of biomedical data in a single experiment. As the amount of data is growing rapidly, there is an urgent need for reliable computational methods for analyzing and visualizing it. This thesis addresses this need by studying how to efficiently and reliably analyze and visualize high-dimensional data, especially data obtained from gene expression microarray experiments. First, we studied ways to improve the quality of microarray data by replacing (imputing) the missing data entries with estimated values for these entries. Missing value imputation is commonly used to make originally incomplete data complete, and thus easier to analyze with statistical and computational methods. Our novel approach was to use curated external biological information as a guide for the missing value imputation. Second, we studied the effect of missing value imputation on downstream data analysis methods such as clustering. We compared multiple recent imputation algorithms on 8 publicly available microarray data sets. Missing value imputation was observed to be a rational way to improve the quality of biological data. The research revealed differences between the clustering results obtained with different imputation methods. On most data sets, the simple and fast k-NN imputation was good enough, but more advanced imputation methods, such as the Bayesian Principal Component Algorithm (BPCA), were also needed. Finally, we studied the visualization of biological network data. Biological interaction networks are examples of the outcome of multiple biological experiments, such as those using gene microarray techniques. Such networks are typically very large and highly connected, so fast algorithms are needed to produce visually pleasing layouts. A computationally efficient way to produce layouts of large biological interaction networks was developed. The algorithm uses multilevel optimization within the regular force-directed graph layout algorithm.
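The k-NN imputation mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in the thesis. Rows stand for genes and columns for samples, missing entries are marked with `None`, distance is the mean squared difference over columns observed in both rows, and the imputed value is the unweighted mean of the neighbours. The function name and the toy matrix are hypothetical.

```python
def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest rows' values in that column."""

    def distance(a, b):
        # Compare only the columns where both rows have an observed value.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return float("inf")
        return sum((x - y) ** 2 for x, y in shared) / len(shared)

    completed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours must have an observed value in column j.
            candidates = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            nearest = sorted(c for c in candidates if c[0] != float("inf"))[:k]
            if nearest:
                completed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return completed


# Toy expression matrix: the first gene lacks a value in the third sample.
expression = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 5.0],
    [9.0, 9.0, 9.0],
]
filled = knn_impute(expression, k=2)
print(filled[0][2])  # 4.0
```

Here the two rows closest to the incomplete one supply the values 3.0 and 5.0, so the gap is filled with their mean. The distant fourth row does not contribute.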
Resumo:
The growing population in cities increases energy demand and affects the environment by increasing carbon emissions. Information and communications technology solutions that enable energy optimization are needed to address this growing energy demand in cities and to reduce carbon emissions. District heating systems optimize energy production by reusing waste energy with combined heat and power plants. Forecasting the heat load demand in residential buildings assists in optimizing energy production and consumption in a district heating system. However, the presence of a large number of factors, such as the weather forecast, district heating operational parameters, and user behavioural parameters, makes heat load forecasting a challenging task. This thesis proposes a probabilistic machine learning model using a Naive Bayes classifier to forecast the hourly heat load demand for three residential buildings in the city of Skellefteå, Sweden, over a period of winter and spring seasons. The district heating data collected from sensors installed in the residential buildings in Skellefteå is used to build the Bayesian network to forecast the heat load demand for horizons of 1, 2, 3, 6 and 24 hours. The proposed model is validated using four cases to study the influence of various parameters on the heat load forecast, by carrying out trace-driven analysis in Weka and GeNIe. Results show that current heat load consumption and the outdoor temperature forecast are the two parameters with the most influence on the heat load forecast. The proposed model achieves average accuracies of 81.23 % and 76.74 % for a forecast horizon of 1 hour in the three buildings for the winter and spring seasons respectively. The model also achieves an average accuracy of 77.97 % for the three buildings across both seasons for the forecast horizon of 1 hour when using only 10 % of the training data. The results indicate that even a simple model such as a Naive Bayes classifier can forecast the heat load demand with less training data.
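A Naive Bayes classifier of the kind named in this abstract can be sketched over discretized inputs. The thesis work was carried out in Weka and GeNIe, so the Python below is only an illustration. The feature names, the discretization into levels, the Laplace smoothing parameter `alpha`, and the training rows are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from math import log


def train(rows, labels, alpha=1.0):
    """Count class and per-feature value frequencies for categorical Naive Bayes."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    values = defaultdict(set)              # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for idx, value in enumerate(row):
            feature_counts[(idx, label)][value] += 1
            values[idx].add(value)
    return class_counts, feature_counts, values, alpha


def predict(model, row):
    """Return the class with the highest smoothed log posterior."""
    class_counts, feature_counts, values, alpha = model
    total = sum(class_counts.values())
    best = None
    for label, count in class_counts.items():
        score = log(count / total)
        for idx, value in enumerate(row):
            numerator = feature_counts[(idx, label)][value] + alpha
            denominator = count + alpha * len(values[idx])
            score += log(numerator / denominator)
        if best is None or score > best[0]:
            best = (score, label)
    return best[1]


# Hypothetical features: (current heat load level, outdoor temperature forecast).
# Hypothetical target: heat load level one hour ahead.
rows = [
    ("high", "cold"), ("high", "cold"), ("high", "mild"),
    ("low", "mild"), ("low", "mild"), ("low", "cold"),
]
labels = ["high", "high", "high", "low", "low", "low"]
model = train(rows, labels)
print(predict(model, ("high", "cold")))  # high
```

The two features were chosen to mirror the parameters the abstract reports as most influential. The real model uses more inputs and sensor data.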
Resumo:
This thesis concerns the analysis of epidemic models. We adopt the Bayesian paradigm and develop suitable Markov Chain Monte Carlo (MCMC) algorithms. This is done by considering the 1995 Ebola outbreak in the Democratic Republic of Congo (former Zaïre) as a case of SEIR epidemic models. We model the Ebola epidemic deterministically using ODEs and stochastically through SDEs to take into account a possible bias in each compartment. Since the model has unknown parameters, we use different methods to estimate them, such as least squares, maximum likelihood and MCMC. The motivation for choosing MCMC over other existing methods in this thesis is its ability to tackle complicated nonlinear problems with a large number of parameters. First, in a deterministic Ebola model, we compute the likelihood function from the sum of squared residuals and estimate parameters using the LSQ and MCMC methods. We sample parameters and then use them to calculate the basic reproduction number and to study the disease-free equilibrium. For the chain sampled from the posterior, we check the convergence diagnostics and confirm the viability of the model. The results show that the Ebola model fits the observed onset data with high precision, and all the unknown model parameters are well identified. Second, we convert the ODE model into an SDE Ebola model. We compute the likelihood function using the extended Kalman filter (EKF) and estimate the parameters again. The motivation for using the SDE formulation here is to consider the impact of modelling errors. Moreover, the EKF approach allows us to formulate a filtered likelihood for the parameters of such a stochastic model. We use the MCMC procedure to obtain the posterior distributions of the parameters of the drift and diffusion parts of the SDE Ebola model. In this thesis, we analyse two cases. (1) The model error covariance matrix of the dynamic noise is close to zero, i.e. only a small amount of stochasticity is added to the model. The results are then similar to those obtained from the deterministic Ebola model, even though the methods of computing the likelihood function are different. (2) The model error covariance matrix is different from zero, i.e. considerable stochasticity is introduced into the Ebola model. This accounts for the situation where we know that the model is not exact. As a result, we obtain parameter posteriors with larger variances. Consequently, the model predictions show larger uncertainties, in accordance with the assumption of an incomplete model.
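The deterministic part of this abstract can be illustrated with a textbook SEIR system integrated by forward Euler, together with a sum-of-squares log-likelihood of the kind the abstract mentions. This is a sketch under assumptions. The thesis's actual compartment equations, parameter values, data and solver are not given here, so the rates, the population size, the step size and the Gaussian noise model below are all hypothetical. For this simple SEIR form without births or deaths, the basic reproduction number is beta / gamma.

```python
def seir_step(state, beta, sigma, gamma, dt):
    """Advance (S, E, I, R) by one forward-Euler step of length dt."""
    s, e, i, r = state
    n = s + e + i + r
    new_infections = beta * s * i / n
    return (
        s - dt * new_infections,
        e + dt * (new_infections - sigma * e),
        i + dt * (sigma * e - gamma * i),
        r + dt * gamma * i,
    )


def simulate(state, beta, sigma, gamma, days, dt=0.1):
    """Return the list of states from time 0 to `days`, one entry per step."""
    trajectory = [state]
    for _ in range(int(round(days / dt))):
        state = seir_step(state, beta, sigma, gamma, dt)
        trajectory.append(state)
    return trajectory


def basic_reproduction_number(beta, gamma):
    """R0 for the simple SEIR model above."""
    return beta / gamma


def log_likelihood(observed, predicted, noise_sd):
    """Gaussian log-likelihood up to an additive constant: -SS / (2 * sd^2)."""
    ss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return -ss / (2.0 * noise_sd**2)


# Hypothetical rates (per day) and initial state for a population of 1000.
beta, sigma, gamma = 0.3, 0.2, 0.15
trajectory = simulate((999.0, 0.0, 1.0, 0.0), beta, sigma, gamma, days=200)
print(basic_reproduction_number(beta, gamma))
```

In an MCMC scheme such as the one the abstract describes, a log-likelihood of this kind would be evaluated for each proposed parameter set. The sampler itself is omitted here.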
Resumo:
This work presents a synopsis of efficient power management strategies for achieving the most economical power and energy consumption in multicore systems, FPGAs, and NoC platforms. A practical approach was taken in order to validate the significance of the Adaptive Power Management Algorithm (APMA) proposed for the system developed for this thesis project. The system comprises an arithmetic and logic unit, up and down counters, an adder, a state machine, and a multiplexer. The project has three aims. The first is to develop a system to be used for this power management project. The second is to perform an area and power synopsis of the system on several scalable technology platforms, UMC 90 nm at 1.2 V, UMC 90 nm at 1.32 V, and UMC 0.18 μm at 1.80 V, in order to examine the differences in area and power consumption of the system across the platforms. The third is to explore various strategies that can be used to reduce the system's power consumption and to propose an adaptive power management algorithm that can reduce the power consumption of the system. The strategies introduced in this work comprise Dynamic Voltage Frequency Scaling (DVFS) and task parallelism. After development, the system was run on an FPGA board, essentially on NoC platforms, and on the technology platforms UMC 90 nm at 1.2 V, UMC 90 nm at 1.32 V, and UMC 180 nm at 1.80 V. System synthesis was accomplished successfully, and analysis of the simulated results shows that the system meets all functional requirements. The power consumption and area utilization were recorded and are analyzed in chapter 7 of this work. The work also extensively reviews strategies for managing power consumption, drawn from quantitative research by many researchers and companies. It combines study analysis with experimental lab work, and it condenses and presents the basic concepts of power management strategy from quality technical papers.
Resumo:
This thesis introduces the Salmon Algorithm, a search meta-heuristic which can be used for a variety of combinatorial optimization problems. This algorithm is loosely based on the path finding behaviour of salmon swimming upstream to spawn. There are a number of tunable parameters in the algorithm, so experiments were conducted to find the optimum parameter settings for different search spaces. The algorithm was tested on one instance of the Traveling Salesman Problem and found to have superior performance to an Ant Colony Algorithm and a Genetic Algorithm. It was then tested on three coding theory problems - optimal edit codes, optimal Hamming distance codes, and optimal covering codes. The algorithm produced improvements on the best known values for five of six of the test cases using edit codes. It matched the best known results on four out of seven of the Hamming codes as well as three out of three of the covering codes. The results suggest the Salmon Algorithm is competitive with established guided random search techniques, and may be superior in some search spaces.
Resumo:
The purpose of this study is to examine the impact of the choice of cut-off points, sampling procedures, and the business cycle on the accuracy of bankruptcy prediction models. Misclassification can result in erroneous predictions leading to prohibitive costs to firms, investors and the economy. To test the impact of the choice of cut-off points and sampling procedures, three bankruptcy prediction models are assessed: Bayesian, Hazard and Mixed Logit. A salient feature of the study is that the analysis includes both parametric and nonparametric bankruptcy prediction models. A sample of firms from the Lynn M. LoPucki Bankruptcy Research Database in the U.S. was used to evaluate the relative performance of the three models. The choice of cut-off point and the sampling procedures were found to affect the rankings of the various models. In general, the results indicate that the empirical cut-off point estimated from the training sample resulted in the lowest misclassification costs for all three models. Although the Hazard and Mixed Logit models resulted in lower costs of misclassification in the randomly selected samples, the Mixed Logit model did not perform as well across varying business cycles. In general, the Hazard model has the highest predictive power. However, the higher predictive power of the Bayesian model, when the ratio of the cost of Type I errors to the cost of Type II errors is high, is relatively consistent across all sampling methods. Such an advantage of the Bayesian model may make it more attractive in the current economic environment. This study extends recent research comparing the performance of bankruptcy prediction models by identifying the conditions under which a model performs better. It also allays the concerns of a range of user groups, including auditors, shareholders, employees, suppliers, rating agencies, and creditors, with respect to assessing failure risk.
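The idea of an empirical cut-off point estimated from a training sample can be sketched as a search over candidate thresholds for the one with the lowest total misclassification cost. This is an illustration only, since the study's estimation procedure is not described in the abstract. The sketch assumes the common convention that a Type I error is a bankrupt firm classified as healthy and a Type II error is a healthy firm classified as bankrupt. The function name, scores, labels and costs are hypothetical.

```python
def empirical_cutoff(scores, labels, cost_type1, cost_type2):
    """Return (cutoff, cost) minimizing total misclassification cost.

    A firm is classified as bankrupt when its score is >= cutoff.
    labels: 1 for firms that went bankrupt, 0 otherwise.
    Ties are resolved in favour of the lowest cutoff.
    """
    candidates = sorted(set(scores)) + [float("inf")]
    best = None
    for cutoff in candidates:
        type1 = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
        type2 = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
        cost = cost_type1 * type1 + cost_type2 * type2
        if best is None or cost < best[1]:
            best = (cutoff, cost)
    return best


# Hypothetical predicted bankruptcy probabilities on a training sample.
scores = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]

print(empirical_cutoff(scores, labels, cost_type1=10, cost_type2=1))  # (0.4, 1)
print(empirical_cutoff(scores, labels, cost_type1=1, cost_type2=10))  # (0.8, 1)
```

Raising the relative cost of Type I errors pushes the chosen cut-off down, so more firms are flagged as likely to fail. This is the kind of cost-ratio sensitivity the abstract discusses.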
Resumo:
Understanding the machinery of gene regulation to control gene expression has been one of the main focuses of bioinformaticians for years. We use a multi-objective genetic algorithm to evolve a specialized version of side effect machines for degenerate motif discovery. We compare some suggested objectives for the motifs they find, test different multi-objective scoring schemes and probabilistic models for the background sequence models and report our results on a synthetic dataset and some biological benchmarking suites. We conclude with a comparison of our algorithm with some widely used motif discovery algorithms in the literature and suggest future directions for research in this area.
Resumo:
DNA assembly is among the most fundamental and difficult problems in bioinformatics. Near-optimal assembly solutions are available for bacterial and other small genomes. However, assembling large and complex genomes, especially the human genome, using Next-Generation Sequencing (NGS) technologies has been shown to be very difficult because of the highly repetitive and complex nature of the human genome, short read lengths, uneven data coverage, and tools that are not specifically built for human genomes. Moreover, many algorithms do not even scale to human genome datasets containing hundreds of millions of short reads. The DNA assembly problem is usually divided into several subproblems, including DNA data error detection and correction, contig creation, scaffolding, and contig orientation, each of which can be seen as a distinct research area. This thesis focuses specifically on creating contigs from the short reads and combining them with outputs from other tools in order to obtain better results. Three assemblers, SOAPdenovo [Li09], Velvet [ZB08] and Meraculous [CHS+11], are selected for comparative purposes in this thesis. The results show that the work in this thesis produces results comparable to those of other assemblers, and that combining our contigs with outputs from other tools produces the best results, outperforming all other investigated assemblers.
Resumo:
Ordered gene problems are a very common class of optimization problems. Because of their popularity, countless algorithms have been developed in an attempt to find high-quality solutions to them. It is also common to see many different types of problems reduced to ordered gene style problems, since many popular heuristics and metaheuristics exist for them. Multiple ordered gene problems are studied, namely the travelling salesman problem, the bin packing problem, and the graph colouring problem. In addition, two bioinformatics problems not traditionally seen as ordered gene problems are studied: DNA error correction and DNA fragment assembly. These problems are studied with multiple variations and combinations of heuristics and metaheuristics with two distinct types of representations. The majority of the algorithms are built around the Recentering-Restarting Genetic Algorithm. The algorithm variations were successful on all problems studied, and particularly on the two bioinformatics problems. For DNA error correction, multiple cases were found with 100% of the codes being corrected. The algorithm variations were also able to beat all other state-of-the-art DNA fragment assemblers on 13 out of 16 benchmark problem instances.