7 resultados para Hyperbolic smoothing
em CaltechTHESIS
Resumo:
The brain is perhaps the most complex system to have ever been subjected to rigorous scientific investigation. The scale is staggering: over 10^11 neurons, each making an average of 10^3 synapses, with computation occurring on scales ranging from a single dendritic spine, to an entire cortical area. Slowly, we are beginning to acquire experimental tools that can gather the massive amounts of data needed to characterize this system. However, to understand and interpret these data will also require substantial strides in inferential and statistical techniques. This dissertation attempts to meet this need, extending and applying the modern tools of latent variable modeling to problems in neural data analysis.
It is divided into two parts. The first begins with an exposition of the general techniques of latent variable modeling. A new, extremely general, optimization algorithm is proposed - called Relaxation Expectation Maximization (REM) - that may be used to learn the optimal parameter values of arbitrary latent variable models. This algorithm appears to alleviate the common problem of convergence to local, sub-optimal, likelihood maxima. REM leads to a natural framework for model size selection; in combination with standard model selection techniques the quality of fits may be further improved, while the appropriate model size is automatically and efficiently determined. Next, a new latent variable model, the mixture of sparse hidden Markov models, is introduced, and approximate inference and learning algorithms are derived for it. This model is applied in the second part of the thesis.
The second part brings the technology of part I to bear on two important problems in experimental neuroscience. The first is known as spike sorting; this is the problem of separating the spikes from different neurons embedded within an extracellular recording. The dissertation offers the first thorough statistical analysis of this problem, which then yields the first powerful probabilistic solution. The second problem addressed is that of characterizing the distribution of spike trains recorded from the same neuron under identical experimental conditions. A latent variable model is proposed. Inference and learning in this model leads to new principled algorithms for smoothing and clustering of spike data.
Resumo:
The connections between convexity and submodularity are explored, for purposes of minimizing and learning submodular set functions.
First, we develop a novel method for minimizing a particular class of submodular functions, which can be expressed as a sum of concave functions composed with modular functions. The basic algorithm uses an accelerated first order method applied to a smoothed version of its convex extension. The smoothing algorithm is particularly novel as it allows us to treat general concave potentials without needing to construct a piecewise linear approximation as with graph-based techniques.
Second, we derive the general conditions under which it is possible to find a minimizer of a submodular function via a convex problem. This provides a framework for developing submodular minimization algorithms. The framework is then used to develop several algorithms that can be run in a distributed fashion. This is particularly useful for applications where the submodular objective function consists of a sum of many terms, each term dependent on a small part of a large data set.
Lastly, we approach the problem of learning set functions from an unorthodox perspective---sparse reconstruction. We demonstrate an explicit connection between the problem of learning set functions from random evaluations and that of sparse signals. Based on the observation that the Fourier transform for set functions satisfies exactly the conditions needed for sparse reconstruction algorithms to work, we examine some different function classes under which uniform reconstruction is possible.
Resumo:
In the quest for a descriptive theory of decision-making, the rational actor model in economics imposes rather unrealistic expectations and abilities on human decision makers. The further we move from idealized scenarios, such as perfectly competitive markets, and ambitiously extend the reach of the theory to describe everyday decision making situations, the less sense these assumptions make. Behavioural economics has instead proposed models based on assumptions that are more psychologically realistic, with the aim of gaining more precision and descriptive power. Increased psychological realism, however, comes at the cost of a greater number of parameters and model complexity. Now there are a plethora of models, based on different assumptions, applicable in differing contextual settings, and selecting the right model to use tends to be an ad-hoc process. In this thesis, we develop optimal experimental design methods and evaluate different behavioral theories against evidence from lab and field experiments.
We look at evidence from controlled laboratory experiments. Subjects are presented with choices between monetary gambles or lotteries. Different decision-making theories evaluate the choices differently and would make distinct predictions about the subjects' choices. Theories whose predictions are inconsistent with the actual choices can be systematically eliminated. Behavioural theories can have multiple parameters requiring complex experimental designs with a very large number of possible choice tests. This imposes computational and economic constraints on using classical experimental design methods. We develop a methodology of adaptive tests: Bayesian Rapid Optimal Adaptive Designs (BROAD) that sequentially chooses the "most informative" test at each stage, and based on the response updates its posterior beliefs over the theories, which informs the next most informative test to run. BROAD utilizes the Equivalent Class Edge Cutting (EC2) criteria to select tests. We prove that the EC2 criteria is adaptively submodular, which allows us to prove theoretical guarantees against the Bayes-optimal testing sequence even in the presence of noisy responses. In simulated ground-truth experiments, we find that the EC2 criteria recovers the true hypotheses with significantly fewer tests than more widely used criteria such as Information Gain and Generalized Binary Search. We show, theoretically as well as experimentally, that surprisingly these popular criteria can perform poorly in the presence of noise, or subject errors. Furthermore, we use the adaptive submodular property of EC2 to implement an accelerated greedy version of BROAD which leads to orders of magnitude speedup over other methods.
We use BROAD to perform two experiments. First, we compare the main classes of theories for decision-making under risk, namely: expected value, prospect theory, constant relative risk aversion (CRRA) and moments models. Subjects are given an initial endowment, and sequentially presented choices between two lotteries, with the possibility of losses. The lotteries are selected using BROAD, and 57 subjects from Caltech and UCLA are incentivized by randomly realizing one of the lotteries chosen. Aggregate posterior probabilities over the theories show limited evidence in favour of CRRA and moments' models. Classifying the subjects into types showed that most subjects are described by prospect theory, followed by expected value. Adaptive experimental design raises the possibility that subjects could engage in strategic manipulation, i.e. subjects could mask their true preferences and choose differently in order to obtain more favourable tests in later rounds thereby increasing their payoffs. We pay close attention to this problem; strategic manipulation is ruled out since it is infeasible in practice, and also since we do not find any signatures of it in our data.
In the second experiment, we compare the main theories of time preference: exponential discounting, hyperbolic discounting, "present bias" models: quasi-hyperbolic (α, β) discounting and fixed cost discounting, and generalized-hyperbolic discounting. 40 subjects from UCLA were given choices between 2 options: a smaller but more immediate payoff versus a larger but later payoff. We found very limited evidence for present bias models and hyperbolic discounting, and most subjects were classified as generalized hyperbolic discounting types, followed by exponential discounting.
In these models the passage of time is linear. We instead consider a psychological model where the perception of time is subjective. We prove that when the biological (subjective) time is positively dependent, it gives rise to hyperbolic discounting and temporal choice inconsistency.
We also test the predictions of behavioral theories in the "wild". We pay attention to prospect theory, which emerged as the dominant theory in our lab experiments of risky choice. Loss aversion and reference dependence predicts that consumers will behave in a uniquely distinct way than the standard rational model predicts. Specifically, loss aversion predicts that when an item is being offered at a discount, the demand for it will be greater than that explained by its price elasticity. Even more importantly, when the item is no longer discounted, demand for its close substitute would increase excessively. We tested this prediction using a discrete choice model with loss-averse utility function on data from a large eCommerce retailer. Not only did we identify loss aversion, but we also found that the effect decreased with consumers' experience. We outline the policy implications that consumer loss aversion entails, and strategies for competitive pricing.
In future work, BROAD can be widely applicable for testing different behavioural models, e.g. in social preference and game theory, and in different contextual settings. Additional measurements beyond choice data, including biological measurements such as skin conductance, can be used to more rapidly eliminate hypothesis and speed up model comparison. Discrete choice models also provide a framework for testing behavioural models with field data, and encourage combined lab-field experiments.
Resumo:
Kohn-Sham density functional theory (KSDFT) is currently the main work-horse of quantum mechanical calculations in physics, chemistry, and materials science. From a mechanical engineering perspective, we are interested in studying the role of defects in the mechanical properties in materials. In real materials, defects are typically found at very small concentrations e.g., vacancies occur at parts per million, dislocation density in metals ranges from $10^{10} m^{-2}$ to $10^{15} m^{-2}$, and grain sizes vary from nanometers to micrometers in polycrystalline materials, etc. In order to model materials at realistic defect concentrations using DFT, we would need to work with system sizes beyond millions of atoms. Due to the cubic-scaling computational cost with respect to the number of atoms in conventional DFT implementations, such system sizes are unreachable. Since the early 1990s, there has been a huge interest in developing DFT implementations that have linear-scaling computational cost. A promising approach to achieving linear-scaling cost is to approximate the density matrix in KSDFT. The focus of this thesis is to provide a firm mathematical framework to study the convergence of these approximations. We reformulate the Kohn-Sham density functional theory as a nested variational problem in the density matrix, the electrostatic potential, and a field dual to the electron density. The corresponding functional is linear in the density matrix and thus amenable to spectral representation. Based on this reformulation, we introduce a new approximation scheme, called spectral binning, which does not require smoothing of the occupancy function and thus applies at arbitrarily low temperatures. We proof convergence of the approximate solutions with respect to spectral binning and with respect to an additional spatial discretization of the domain. For a standard one-dimensional benchmark problem, we present numerical experiments for which spectral binning exhibits excellent convergence characteristics and outperforms other linear-scaling methods.
Resumo:
A study is made of the accuracy of electronic digital computer calculations of ground displacement and response spectra from strong-motion earthquake accelerograms. This involves an investigation of methods of the preparatory reduction of accelerograms into a form useful for the digital computation and of the accuracy of subsequent digital calculations. Various checks are made for both the ground displacement and response spectra results, and it is concluded that the main errors are those involved in digitizing the original record. Differences resulting from various investigators digitizing the same experimental record may become as large as 100% of the maximum computed ground displacements. The spread of the results of ground displacement calculations is greater than that of the response spectra calculations. Standardized methods of adjustment and calculation are recommended, to minimize such errors.
Studies are made of the spread of response spectral values about their mean. The distribution is investigated experimentally by Monte Carlo techniques using an electric analog system with white noise excitation, and histograms are presented indicating the dependence of the distribution on the damping and period of the structure. Approximate distributions are obtained analytically by confirming and extending existing results with accurate digital computer calculations. A comparison of the experimental and analytical approaches indicates good agreement for low damping values where the approximations are valid. A family of distribution curves to be used in conjunction with existing average spectra is presented. The combination of analog and digital computations used with Monte Carlo techniques is a promising approach to the statistical problems of earthquake engineering.
Methods of analysis of very small earthquake ground motion records obtained simultaneously at different sites are discussed. The advantages of Fourier spectrum analysis for certain types of studies and methods of calculation of Fourier spectra are presented. The digitizing and analysis of several earthquake records is described and checks are made of the dependence of results on digitizing procedure, earthquake duration and integration step length. Possible dangers of a direct ratio comparison of Fourier spectra curves are pointed out and the necessity for some type of smoothing procedure before comparison is established. A standard method of analysis for the study of comparative ground motion at different sites is recommended.
Resumo:
Jet noise reduction is an important goal within both commercial and military aviation. Although large-scale numerical simulations are now able to simultaneously compute turbulent jets and their radiated sound, lost-cost, physically-motivated models are needed to guide noise-reduction efforts. A particularly promising modeling approach centers around certain large-scale coherent structures, called wavepackets, that are observed in jets and their radiated sound. The typical approach to modeling wavepackets is to approximate them as linear modal solutions of the Euler or Navier-Stokes equations linearized about the long-time mean of the turbulent flow field. The near-field wavepackets obtained from these models show compelling agreement with those educed from experimental and simulation data for both subsonic and supersonic jets, but the acoustic radiation is severely under-predicted in the subsonic case. This thesis contributes to two aspects of these models. First, two new solution methods are developed that can be used to efficiently compute wavepackets and their acoustic radiation, reducing the computational cost of the model by more than an order of magnitude. The new techniques are spatial integration methods and constitute a well-posed, convergent alternative to the frequently used parabolized stability equations. Using concepts related to well-posed boundary conditions, the methods are formulated for general hyperbolic equations and thus have potential applications in many fields of physics and engineering. Second, the nonlinear and stochastic forcing of wavepackets is investigated with the goal of identifying and characterizing the missing dynamics responsible for the under-prediction of acoustic radiation by linear wavepacket models for subsonic jets. Specifically, we use ensembles of large-eddy-simulation flow and force data along with two data decomposition techniques to educe the actual nonlinear forcing experienced by wavepackets in a Mach 0.9 turbulent jet. Modes with high energy are extracted using proper orthogonal decomposition, while high gain modes are identified using a novel technique called empirical resolvent-mode decomposition. In contrast to the flow and acoustic fields, the forcing field is characterized by a lack of energetic coherent structures. Furthermore, the structures that do exist are largely uncorrelated with the acoustic field. Instead, the forces that most efficiently excite an acoustic response appear to take the form of random turbulent fluctuations, implying that direct feedback from nonlinear interactions amongst wavepackets is not an essential noise source mechanism. This suggests that the essential ingredients of sound generation in high Reynolds number jets are contained within the linearized Navier-Stokes operator rather than in the nonlinear forcing terms, a conclusion that has important implications for jet noise modeling.
Resumo:
This investigation deals with certain generalizations of the classical uniqueness theorem for the second boundary-initial value problem in the linearized dynamical theory of not necessarily homogeneous nor isotropic elastic solids. First, the regularity assumptions underlying the foregoing theorem are relaxed by admitting stress fields with suitably restricted finite jump discontinuities. Such singularities are familiar from known solutions to dynamical elasticity problems involving discontinuous surface tractions or non-matching boundary and initial conditions. The proof of the appropriate uniqueness theorem given here rests on a generalization of the usual energy identity to the class of singular elastodynamic fields under consideration.
Following this extension of the conventional uniqueness theorem, we turn to a further relaxation of the customary smoothness hypotheses and allow the displacement field to be differentiable merely in a generalized sense, thereby admitting stress fields with square-integrable unbounded local singularities, such as those encountered in the presence of focusing of elastic waves. A statement of the traction problem applicable in these pathological circumstances necessitates the introduction of "weak solutions'' to the field equations that are accompanied by correspondingly weakened boundary and initial conditions. A uniqueness theorem pertaining to this weak formulation is then proved through an adaptation of an argument used by O. Ladyzhenskaya in connection with the first boundary-initial value problem for a second-order hyperbolic equation in a single dependent variable. Moreover, the second uniqueness theorem thus obtained contains, as a special case, a slight modification of the previously established uniqueness theorem covering solutions that exhibit only finite stress-discontinuities.