7 resultados para Objective function values

em CaltechTHESIS


Relevância:

80.00% 80.00%

Publicador:

Resumo:

The connections between convexity and submodularity are explored, for purposes of minimizing and learning submodular set functions.

First, we develop a novel method for minimizing a particular class of submodular functions, which can be expressed as a sum of concave functions composed with modular functions. The basic algorithm uses an accelerated first order method applied to a smoothed version of its convex extension. The smoothing algorithm is particularly novel as it allows us to treat general concave potentials without needing to construct a piecewise linear approximation as with graph-based techniques.

Second, we derive the general conditions under which it is possible to find a minimizer of a submodular function via a convex problem. This provides a framework for developing submodular minimization algorithms. The framework is then used to develop several algorithms that can be run in a distributed fashion. This is particularly useful for applications where the submodular objective function consists of a sum of many terms, each term dependent on a small part of a large data set.

Lastly, we approach the problem of learning set functions from an unorthodox perspective---sparse reconstruction. We demonstrate an explicit connection between the problem of learning set functions from random evaluations and that of sparse signals. Based on the observation that the Fourier transform for set functions satisfies exactly the conditions needed for sparse reconstruction algorithms to work, we examine some different function classes under which uniform reconstruction is possible.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Many engineering applications face the problem of bounding the expected value of a quantity of interest (performance, risk, cost, etc.) that depends on stochastic uncertainties whose probability distribution is not known exactly. Optimal uncertainty quantification (OUQ) is a framework that aims at obtaining the best bound in these situations by explicitly incorporating available information about the distribution. Unfortunately, this often leads to non-convex optimization problems that are numerically expensive to solve.

This thesis emphasizes on efficient numerical algorithms for OUQ problems. It begins by investigating several classes of OUQ problems that can be reformulated as convex optimization problems. Conditions on the objective function and information constraints under which a convex formulation exists are presented. Since the size of the optimization problem can become quite large, solutions for scaling up are also discussed. Finally, the capability of analyzing a practical system through such convex formulations is demonstrated by a numerical example of energy storage placement in power grids.

When an equivalent convex formulation is unavailable, it is possible to find a convex problem that provides a meaningful bound for the original problem, also known as a convex relaxation. As an example, the thesis investigates the setting used in Hoeffding's inequality. The naive formulation requires solving a collection of non-convex polynomial optimization problems whose number grows doubly exponentially. After structures such as symmetry are exploited, it is shown that both the number and the size of the polynomial optimization problems can be reduced significantly. Each polynomial optimization problem is then bounded by its convex relaxation using sums-of-squares. These bounds are found to be tight in all the numerical examples tested in the thesis and are significantly better than Hoeffding's bounds.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We examine voting situations in which individuals have incomplete information over each others' true preferences. In many respects, this work is motivated by a desire to provide a more complete understanding of so-called probabilistic voting.

Chapter 2 examines the similarities and differences between the incentives faced by politicians who seek to maximize expected vote share, expected plurality, or probability of victory in single member: single vote, simple plurality electoral systems. We find that, in general, the candidates' optimal policies in such an electoral system vary greatly depending on their objective function. We provide several examples, as well as a genericity result which states that almost all such electoral systems (with respect to the distributions of voter behavior) will exhibit different incentives for candidates who seek to maximize expected vote share and those who seek to maximize probability of victory.

In Chapter 3, we adopt a random utility maximizing framework in which individuals' preferences are subject to action-specific exogenous shocks. We show that Nash equilibria exist in voting games possessing such an information structure and in which voters and candidates are each aware that every voter's preferences are subject to such shocks. A special case of our framework is that in which voters are playing a Quantal Response Equilibrium (McKelvey and Palfrey (1995), (1998)). We then examine candidate competition in such games and show that, for sufficiently large electorates, regardless of the dimensionality of the policy space or the number of candidates, there exists a strict equilibrium at the social welfare optimum (i.e., the point which maximizes the sum of voters' utility functions). In two candidate contests we find that this equilibrium is unique.

Finally, in Chapter 4, we attempt the first steps towards a theory of equilibrium in games possessing both continuous action spaces and action-specific preference shocks. Our notion of equilibrium, Variational Response Equilibrium, is shown to exist in all games with continuous payoff functions. We discuss the similarities and differences between this notion of equilibrium and the notion of Quantal Response Equilibrium and offer possible extensions of our framework.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

An economic air pollution control model, which determines the least cost of reaching various air quality levels, is formulated. The model takes the form of a general, nonlinear, mathematical programming problem. Primary contaminant emission levels are the independent variables. The objective function is the cost of attaining various emission levels and is to be minimized subject to constraints that given air quality levels be attained.

The model is applied to a simplified statement of the photochemical smog problem in Los Angeles County in 1975 with emissions specified by a two-dimensional vector, total reactive hydrocarbon, (RHC), and nitrogen oxide, (NOx), emissions. Air quality, also two-dimensional, is measured by the expected number of days per year that nitrogen dioxide, (NO2), and mid-day ozone, (O3), exceed standards in Central Los Angeles.

The minimum cost of reaching various emission levels is found by a linear programming model. The base or "uncontrolled" emission levels are those that will exist in 1975 with the present new car control program and with the degree of stationary source control existing in 1971. Controls, basically "add-on devices", are considered here for used cars, aircraft, and existing stationary sources. It is found that with these added controls, Los Angeles County emission levels [(1300 tons/day RHC, 1000 tons /day NOx) in 1969] and [(670 tons/day RHC, 790 tons/day NOx) at the base 1975 level], can be reduced to 260 tons/day RHC (minimum RHC program) and 460 tons/day NOx (minimum NOx program).

"Phenomenological" or statistical air quality models provide the relationship between air quality and emissions. These models estimate the relationship by using atmospheric monitoring data taken at one (yearly) emission level and by using certain simple physical assumptions, (e. g., that emissions are reduced proportionately at all points in space and time). For NO2, (concentrations assumed proportional to NOx emissions), it is found that standard violations in Central Los Angeles, (55 in 1969), can be reduced to 25, 5, and 0 days per year by controlling emissions to 800, 550, and 300 tons /day, respectively. A probabilistic model reveals that RHC control is much more effective than NOx control in reducing Central Los Angeles ozone. The 150 days per year ozone violations in 1969 can be reduced to 75, 30, 10, and 0 days per year by abating RHC emissions to 700, 450, 300, and 150 tons/day, respectively, (at the 1969 NOx emission level).

The control cost-emission level and air quality-emission level relationships are combined in a graphical solution of the complete model to find the cost of various air quality levels. Best possible air quality levels with the controls considered here are 8 O3 and 10 NO2 violations per year (minimum ozone program) or 25 O3 and 3 NO2 violations per year (minimum NO2 program) with an annualized cost of $230,000,000 (above the estimated $150,000,000 per year for the new car control program for Los Angeles County motor vehicles in 1975).

Relevância:

80.00% 80.00%

Publicador:

Resumo:

There is a growing interest in taking advantage of possible patterns and structures in data so as to extract the desired information and overcome the curse of dimensionality. In a wide range of applications, including computer vision, machine learning, medical imaging, and social networks, the signal that gives rise to the observations can be modeled to be approximately sparse and exploiting this fact can be very beneficial. This has led to an immense interest in the problem of efficiently reconstructing a sparse signal from limited linear observations. More recently, low-rank approximation techniques have become prominent tools to approach problems arising in machine learning, system identification and quantum tomography.

In sparse and low-rank estimation problems, the challenge is the inherent intractability of the objective function, and one needs efficient methods to capture the low-dimensionality of these models. Convex optimization is often a promising tool to attack such problems. An intractable problem with a combinatorial objective can often be "relaxed" to obtain a tractable but almost as powerful convex optimization problem. This dissertation studies convex optimization techniques that can take advantage of low-dimensional representations of the underlying high-dimensional data. We provide provable guarantees that ensure that the proposed algorithms will succeed under reasonable conditions, and answer questions of the following flavor:

  • For a given number of measurements, can we reliably estimate the true signal?
  • If so, how good is the reconstruction as a function of the model parameters?

More specifically, i) Focusing on linear inverse problems, we generalize the classical error bounds known for the least-squares technique to the lasso formulation, which incorporates the signal model. ii) We show that intuitive convex approaches do not perform as well as expected when it comes to signals that have multiple low-dimensional structures simultaneously. iii) Finally, we propose convex relaxations for the graph clustering problem and give sharp performance guarantees for a family of graphs arising from the so-called stochastic block model. We pay particular attention to the following aspects. For i) and ii), we aim to provide a general geometric framework, in which the results on sparse and low-rank estimation can be obtained as special cases. For i) and iii), we investigate the precise performance characterization, which yields the right constants in our bounds and the true dependence between the problem parameters.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The author has constructed a synthetic gene for ∝-lytic protease. Since the DNA sequence of the protein is not known, the gene was designed by using the reverse translation of ∝-lytic protease's amino acid sequence. Unique restriction sites are carefully sought in the degenerate DNA sequence to aid in future mutagenesis studies. The unique restriction sites are designed approximately 50 base pairs apart and their appropriate codons used in the DNA sequence. The codons used to construct the DNA sequence of ∝-lytic protease are preferred codons in E-coli or used in the production of β-lactamase. Codon usage is also distributed evenly to ensure that one particular codon is not heavily used. The gene is essentially constructed from the outside in. The gene is built in a stepwise fashion using plasmids as the vehicles for the ∝-lytic oligomers. The use of plasmids allows the replication and isolation of large quantities of the intermediates during gene synthesis. The ∝-lytic DNA is a double-stranded oligomer that has sufficient overhang and sticky ends to anneal correctly in the vector. After six steps of incorporating ∝-lytic DNA, the gene is completed and sequenced to ensure that the correct DNA sequence is present and that no mutations occurred in the structural gene.

β-lactamase is the other serine hydrolase studied in this thesis. The author used the class A RTEM-1 β- lactamase encoded on the plasmid pBR322 to investigate the roll of the conserved threonine residue at position 71. Cassette mutagenesis was previously used to generate all possible amino acid substitutions at position 71. The work presented here describes the purification and kinetic characterization of a T71H mutant previously constructed by S. Schultz. The mutated gene was transferred into plasmid pJN for expression and induced with IPTG. The enzyme is purified by column chromatography and FPLC to homogeneity. Kinetic studies reveal that the mutant has lower k_(cat) values on benzylpenicillin, cephalothin and 6-aminopenicillanic acid but no changes in k_m except for cephalothin which is approximately 4 times higher. The mutant did not change siginificantly in its pH profile compared to the wild-type enzyme. Also, the mutant is more sensitive to thermal denaturation as compared to the wild-type enzyme. However, experimental evidence indicates that the probable generation of a positive charge at position 71 thermally stabilized the mutant.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Understanding how transcriptional regulatory sequence maps to regulatory function remains a difficult problem in regulatory biology. Given a particular DNA sequence for a bacterial promoter region, we would like to be able to say which transcription factors bind there, how strongly they bind, and whether they interact with each other and/or RNA polymerase, with the ultimate objective of integrating knowledge of these parameters into a prediction of gene expression levels. The theoretical framework of statistical thermodynamics provides a useful framework for doing so, enabling us to predict how gene expression levels depend on transcription factor binding energies and concentrations. We used thermodynamic models, coupled with models of the sequence-dependent binding energies of transcription factors and RNAP, to construct a genotype to phenotype map for the level of repression exhibited by the lac promoter, and tested it experimentally using a set of promoter variants from E. coli strains isolated from different natural environments. For this work, we sought to ``reverse engineer'' naturally occurring promoter sequences to understand how variations in promoter sequence affects gene expression. The natural inverse of this approach is to ``forward engineer'' promoter sequences to obtain targeted levels of gene expression. We used a high precision model of RNAP-DNA sequence dependent binding energy, coupled with a thermodynamic model relating binding energy to gene expression, to predictively design and verify a suite of synthetic E. coli promoters whose expression varied over nearly three orders of magnitude.

However, although thermodynamic models enable predictions of mean levels of gene expression, it has become evident that cell-to-cell variability or ``noise'' in gene expression can also play a biologically important role. In order to address this aspect of gene regulation, we developed models based on the chemical master equation framework and used them to explore the noise properties of a number of common E. coli regulatory motifs; these properties included the dependence of the noise on parameters such as transcription factor binding strength and copy number. We then performed experiments in which these parameters were systematically varied and measured the level of variability using mRNA FISH. The results showed a clear dependence of the noise on these parameters, in accord with model predictions.

Finally, one shortcoming of the preceding modeling frameworks is that their applicability is largely limited to systems that are already well-characterized, such as the lac promoter. Motivated by this fact, we used a high throughput promoter mutagenesis assay called Sort-Seq to explore the completely uncharacterized transcriptional regulatory DNA of the E. coli mechanosensitive channel of large conductance (MscL). We identified several candidate transcription factor binding sites, and work is continuing to identify the associated proteins.