966 results for Maximum entropy methods
Abstract:
Thesis (Master's)--University of Washington, 2016-06
Abstract:
We consider the problem of estimating P(Y_1 + ... + Y_n > x) by importance sampling when the Y_i are i.i.d. and heavy-tailed. The idea is to exploit the cross-entropy method as a tool for choosing good parameters in the importance sampling distribution; in doing so, we use the asymptotic description that, conditionally on the event {Y_1 + ... + Y_n > x}, n - 1 of the Y_i have distribution F and one has the conditional distribution of Y given Y > x. We show in some specific parametric examples (Pareto and Weibull) how this leads to precise answers which, as demonstrated numerically, are close to being variance minimal within the parametric class under consideration. Related problems for M/G/1 and GI/G/1 queues are also discussed.
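The one-big-jump idea behind this abstract can be illustrated with a small sketch. This is not the paper's estimator: the summand law, the tail level x, and the proposal shape parameter (standing in for one tuned by cross-entropy) are all hypothetical, and a heavier-tailed Pareto proposal is used for every summand for simplicity.

```python
import numpy as np

def pareto_pdf(y, alpha):
    """Pareto(alpha) density on [1, inf): alpha * y^(-alpha-1)."""
    return alpha * y ** (-alpha - 1.0)

def is_tail_estimate(n=4, x=20.0, alpha=2.0, alpha_prop=1.0,
                     num_samples=200_000, seed=0):
    """Importance-sampling estimate of P(Y_1 + ... + Y_n > x) for i.i.d.
    Pareto(alpha) summands, sampling from a heavier-tailed Pareto proposal
    (alpha_prop < alpha; illustrative stand-in for a cross-entropy choice)."""
    rng = np.random.default_rng(seed)
    u = rng.random((num_samples, n))
    y = u ** (-1.0 / alpha_prop)          # inverse-CDF samples from the proposal
    w = np.prod(pareto_pdf(y, alpha) / pareto_pdf(y, alpha_prop), axis=1)
    hit = y.sum(axis=1) > x
    return float(np.mean(w * hit))
```

For these parameters the heavy-tail asymptotics predict roughly n * x^(-alpha) = 4/400 = 0.01, and the estimate lands close to that.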
Abstract:
Many variables that are of interest in social science research are nominal variables with two or more categories, such as employment status, occupation, political preference, or self-reported health status. With longitudinal survey data it is possible to analyse the transitions of individuals between different employment states or occupations (for example). In the statistical literature, models for analysing categorical dependent variables with repeated observations belong to the family of models known as generalized linear mixed models (GLMMs). The specific GLMM for a dependent variable with three or more categories is the multinomial logit random effects model. For these models, the marginal distribution of the response does not have a closed-form solution, and hence numerical integration must be used to obtain maximum likelihood estimates of the model parameters. Techniques for implementing the numerical integration are available, but they are computationally intensive, requiring a large amount of computer processing time that increases with the number of clusters (or individuals) in the data, and they are not always readily accessible to the practitioner in standard software. For the purposes of analysing categorical response data from a longitudinal social survey, there is clearly a need to evaluate the existing procedures for estimating the multinomial logit random effects model in terms of accuracy, efficiency and computing time. Computing time has significant implications for which approach researchers will prefer. In this paper we evaluate statistical software procedures that utilise adaptive Gaussian quadrature and MCMC methods, with specific application to modelling the employment status of women over three waves of the HILDA survey using a GLMM.
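The numerical integration step can be sketched for the simplest member of this model family, a binary (rather than multinomial) logit with a normal random intercept, integrated out with non-adaptive Gauss-Hermite quadrature. This is an illustrative sketch, not the HILDA analysis or the adaptive quadrature the paper evaluates; all data and parameter values are hypothetical.

```python
import numpy as np

def marginal_loglik_cluster(y, eta_fixed, sigma_b, n_quad=15):
    """Marginal log-likelihood contribution of one cluster in a
    random-intercept logit GLMM, integrating the N(0, sigma_b^2) random
    effect out with non-adaptive Gauss-Hermite quadrature.
    y: 0/1 responses; eta_fixed: fixed-effect linear predictors."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_quad)
    b = np.sqrt(2.0) * sigma_b * nodes            # change of variables for N(0, sigma_b^2)
    lik = 0.0
    for bk, wk in zip(b, weights):
        p = 1.0 / (1.0 + np.exp(-(eta_fixed + bk)))   # P(y = 1 | b)
        cond = np.prod(np.where(y == 1, p, 1.0 - p))  # conditional likelihood of the cluster
        lik += wk / np.sqrt(np.pi) * cond             # quadrature weight, normalised
    return float(np.log(lik))
```

As a sanity check, when sigma_b is essentially zero the result collapses to the ordinary Bernoulli log-likelihood of the cluster.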
Abstract:
Using techniques from Statistical Physics, the annealed VC entropy for hyperplanes in high dimensional spaces is calculated as a function of the margin for a spherical Gaussian distribution of inputs.
Abstract:
PCA/FA is a method of analyzing complex data sets in which there are no clearly defined X or Y variables. It has multiple uses, including the study of the pattern of variation between individual entities such as patients with particular disorders, and the detailed study of descriptive variables. In most applications, variables are related to a smaller number of ‘factors’ or PCs that account for the maximum variance in the data and hence may explain important trends among the variables. An increasingly important application of the method is in the ‘validation’ of questionnaires that attempt to relate subjective aspects of a patient's experience with more objective measures of vision.
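The core mechanics described here, relating variables to a smaller number of components that account for the maximum variance, can be sketched as a PCA via eigendecomposition of the sample covariance matrix. The data are synthetic and illustrative only.

```python
import numpy as np

def principal_components(X, n_components=2):
    """PCA by eigendecomposition of the sample covariance matrix.
    Returns (explained-variance ratios, loadings, component scores)."""
    Xc = X - X.mean(axis=0)                   # centre each variable
    cov = np.cov(Xc, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1]          # re-sort descending by variance
    eigval, eigvec = eigval[order], eigvec[:, order]
    ratio = eigval / eigval.sum()             # fraction of variance per PC
    comps = eigvec[:, :n_components]
    return ratio[:n_components], comps, Xc @ comps

# Toy data: two strongly correlated variables, so PC1 captures almost
# all of the variance.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, t]) + 0.01 * rng.normal(size=(200, 2))
ratio, comps, scores = principal_components(X)
```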
Abstract:
Several fermentation methods for the production of the enzyme dextransucrase have been employed. The theoretical aspects of these fermentation techniques are given in the early chapters of this thesis, together with a brief overview of enzyme biotechnology. A literature survey on cell recycle fermentation has been carried out, followed by a survey report on dextransucrase production, purification and the reaction mechanism of dextran biosynthesis. The various experimental apparatus employed in this research are described in detail. In particular, emphasis has been given to the development of continuous cell recycle fermenters. On the laboratory scale, fed-batch fermentations under anaerobic, low-agitation conditions resulted in dextransucrase activities of about 450 DSU/cm³, much higher than the yields reported in the literature and obtained under aerobic conditions. In conventional continuous culture the dilution rate was varied in the range 0.375 h⁻¹ to 0.55 h⁻¹. The general pattern observed was that enzyme activity decreased with increasing dilution rate. In these experiments the maximum enzyme activity was ∼74 DSU/cm³. Sparging the fermentation broth with CO₂ in continuous culture appears to result in a decrease in enzyme activity. In continuous total cell recycle fermentations, high steady-state biomass levels were achieved but the enzyme activity was low, in the range 4-27 DSU/cm³. This fermentation environment affected the physiology of the microorganism. The behaviour of the cell recycle system employed in this work, together with its performance and the factors that affected it, is discussed in the relevant chapters.
By retaining the whole broth leaving a continuous fermenter for 1.5 to 4 h under controlled conditions, the enzyme activity was enhanced with a certain treatment from 86 DSU/cm³ to 180 DSU/cm³, a 109% increase over the enzyme activity achieved by a steady-state conventional chemostat. A novel process for dextran production has been proposed based on the findings of this latter part of the experimental work.
Abstract:
The available literature concerning dextransucrase and dextran production and purification has been reviewed, along with the reaction mechanisms of the enzyme. A discussion of basic fermentation theory is included, together with a brief description of bioreactor hydrodynamics and general biotechnology. The various fermenters used in this research work are described in detail, along with the various experimental techniques employed. The micro-organism Leuconostoc mesenteroides NRRL B512 (F) secretes dextransucrase in the presence of an inducer, sucrose, this being the only known inducer of the enzyme. Dextransucrase is a growth-related product, and a series of fed-batch fermentations was carried out to extend the exponential growth phase of the organism. These experiments were carried out in a number of different-sized vessels, ranging from 2.5 to 1,000 litres. Using a 16 litre vessel, dextransucrase activities in excess of 450 DSU/cm³ (21.67 U/cm³) have been obtained under non-aerated conditions. It has also been possible to achieve 442 DSU/cm³ (21.28 U/cm³) using the 1,000 litre vessel, although this has not been done consistently. A 1 litre and a 2.5 litre vessel were used for the continuous fermentations of dextransucrase. The 2.5 litre vessel was a very sophisticated MBR MiniBioreactor and was used for the majority of the continuous fermentations carried out. An enzyme activity of approximately 108 DSU/cm³ (5.20 U/cm³) was achieved at a dilution rate of 0.50 h⁻¹, which corresponds to the maximum growth rate of the cells under the process conditions. A number of continuous fermentations were operated for prolonged periods, with experimental run-times of up to 389 h recorded without any incidence of contamination.
The phenomenon of enzyme enhancement on hold-up, of up to 100%, was also noted during these fermentations, with dextransucrase activity being boosted from 89.7 DSU/cm³ (4.32 U/cm³) to 155.7 DSU/cm³ (7.50 U/cm³) following 24 hours of hold-up. These findings support the recommendation that a second reactor be placed in series with the existing vessel.
Abstract:
There has been much recent research into extracting useful diagnostic features from the electrocardiogram, with numerous studies claiming impressive results. However, the robustness and consistency of the methods employed in these studies are rarely, if ever, mentioned. Hence, we propose two new methods: a biologically motivated time series derived from consecutive P-wave durations, and a mathematically motivated regularity measure. We investigate the robustness of these two methods compared with current corresponding methods. We find that the new time series performs admirably as a complement to the current method, and the new regularity measure consistently outperforms the current measure in numerous tests on real and synthetic data.
Abstract:
A combination of experimental methods was applied at a clogged, horizontal subsurface flow (HSSF) municipal wastewater tertiary treatment wetland (TW) in the UK to quantify the extent of surface and subsurface clogging, which had resulted in undesirable surface flow. The three-dimensional hydraulic conductivity profile was determined using a purpose-made device which recreates the constant-head permeameter test in situ. The hydrodynamic pathways were investigated by performing dye tracing tests with Rhodamine WT and a novel multi-channel, data-logging, flow-through fluorimeter which allows synchronous measurements to be taken from a matrix of sampling points. Hydraulic conductivity varied in all planes, with the lowest measurement of 0.1 m d⁻¹ corresponding to the surface layer at the inlet, and the maximum measurement of 1550 m d⁻¹ located at 0.4 m depth at the outlet. According to the dye tracing results, the region where the overland flow ceased received five times the average flow, which then vertically short-circuited below the rhizosphere. The tracer breakthrough curve obtained from the outlet showed that this preferential flow-path accounted for approximately 80% of the overall flow and arrived 8 h before a distinctly separate secondary flow-path. The overall volumetric efficiency of the clogged system was 71%, and the hydrology was simulated using a dual-path, dead-zone storage model. It is concluded that uneven inlet distribution, continuous surface loading and high rhizosphere resistance are responsible for the clog formation observed in this system. The average inlet hydraulic conductivity was 2 m d⁻¹, suggesting that current European design guidelines, which predict that the system will reach an equilibrium hydraulic conductivity of 86 m d⁻¹, do not adequately describe the hydrology of mature systems.
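The dual-path idea behind the simulation can be sketched as a breakthrough curve built from a weighted sum of one-dimensional advection-dispersion impulse responses, one per flow path. This is a simplified illustration only: the dead-zone storage term of the model used in the study is omitted, and every parameter value below is hypothetical rather than fitted to the wetland data.

```python
import numpy as np

def breakthrough_curve(t, paths, length=10.0):
    """Relative tracer concentration at distance `length` versus time t,
    as a weighted sum of 1-D advection-dispersion impulse responses.
    `paths` is a list of (flow_fraction, velocity, dispersion) tuples.
    Dead-zone storage is ignored; parameters are illustrative."""
    c = np.zeros_like(t)
    for frac, v, d in paths:
        c += frac * (1.0 / np.sqrt(4.0 * np.pi * d * t)) \
             * np.exp(-(length - v * t) ** 2 / (4.0 * d * t))
    return c

t = np.linspace(0.1, 60.0, 3000)
# 80% of the flow in a fast preferential path, 20% in a slower path,
# echoing the two flow-paths seen in the tracer test.
combined = breakthrough_curve(t, [(0.8, 1.0, 0.5), (0.2, 0.4, 0.5)])
```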
Abstract:
The paper describes possibilities for investigating 43 varieties of file formats (objects), joined in 10 groups; 89 information attacks, joined in 33 groups; and 73 methods of compression, joined in 10 groups. Experimental, expert, possible and real relations between the groups of attacks, methods and objects are determined by means of matrix transformations, and the respective maximum and potential sets are defined. Finally, assessments and conclusions for future investigation are proposed.
Abstract:
Computing the similarity between two protein structures is a crucial task in molecular biology and has been extensively investigated. Many protein structure comparison methods can be modeled as maximum weighted clique problems in specific k-partite graphs, referred to here as alignment graphs. In this paper we present both a new integer programming formulation for solving such clique problems and a dedicated branch and bound algorithm for solving the maximum cardinality clique problem. Both approaches have been integrated into VAST, a software package for aligning protein 3D structures widely used at the National Center for Biotechnology Information, and compared with an original clique solver which uses the well-known Bron-Kerbosch algorithm (BK). Our computational results on real protein alignment instances show that our branch and bound algorithm is up to 116 times faster than BK.
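The Bron-Kerbosch baseline (BK) can be sketched in a few lines. This is the textbook recursion with pivoting on a toy adjacency-set graph, not VAST's solver or the paper's branch and bound algorithm.

```python
def bron_kerbosch_max_clique(graph):
    """Return one maximum clique of an undirected graph, given as a dict
    {vertex: set of neighbours}, via Bron-Kerbosch with pivoting."""
    best = set()

    def expand(r, p, x):
        # r: current clique; p: candidates; x: already-explored vertices
        nonlocal best
        if not p and not x:                 # r is a maximal clique
            if len(r) > len(best):
                best = set(r)
            return
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):    # skip the pivot's neighbours
            expand(r | {v}, p & graph[v], x & graph[v])
            p.discard(v)
            x.add(v)

    expand(set(), set(graph), set())
    return best
```

On a triangle with a pendant vertex, for example, it returns the triangle.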
Abstract:
This paper presents a variable neighbourhood search (VNS) approach for solving the Maximum Set Splitting Problem (MSSP). The algorithm forms a system of neighbourhoods based on changing the component for an increasing number of elements. An efficient local search procedure swaps the components of pairs of elements and yields a relatively short running time. Numerical experiments are performed on instances known in the literature: minimum hitting set and Steiner triple systems. Computational results show that the proposed VNS achieves all optimal or best-known solutions in short times. The experiments indicate that the VNS compares favorably with other methods previously used for solving the MSSP. ACM Computing Classification System (1998): I.2.8.
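The overall VNS scheme (shaking in progressively larger neighbourhoods, local search, move-or-next-neighbourhood) can be sketched for the MSSP as follows. This is a simplified illustration, not the paper's algorithm: it uses a single-element-flip local search rather than the pair-swap move described above, and the tiny instance is hypothetical.

```python
import random

def vns_max_set_splitting(universe, subsets, k_max=3, iters=100, seed=0):
    """Minimal VNS sketch for the Maximum Set Splitting Problem: 2-colour
    the elements so that as many subsets as possible contain both colours.
    Shaking flips k random element colours; local search flips single
    elements while that improves the number of split subsets."""
    rng = random.Random(seed)

    def split_count(colour):
        # a subset is "split" when it contains elements of both colours
        return sum(1 for s in subsets if len({colour[e] for e in s}) == 2)

    def local_search(colour):
        improved = True
        while improved:
            improved = False
            base = split_count(colour)
            for e in universe:
                colour[e] ^= 1                 # try flipping element e
                if split_count(colour) > base:
                    base = split_count(colour)
                    improved = True
                else:
                    colour[e] ^= 1             # undo a non-improving flip
        return colour

    best = local_search({e: rng.randint(0, 1) for e in universe})
    best_val = split_count(best)
    k = 1
    for _ in range(iters):
        trial = dict(best)
        for e in rng.sample(list(universe), min(k, len(universe))):
            trial[e] ^= 1                      # shaking in neighbourhood N_k
        trial = local_search(trial)
        val = split_count(trial)
        if val > best_val:
            best, best_val, k = trial, val, 1  # improvement: restart at N_1
        else:
            k = k % k_max + 1                  # otherwise try the next neighbourhood
    return best, best_val
```

On a 4-cycle of pairwise subsets, an alternating 2-colouring splits all four subsets, and the sketch finds it.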
Abstract:
2000 Mathematics Subject Classification: 62P10, 92D10, 92D30, 94A17, 62L10.