8 results for Model transformation learning
in CaltechTHESIS
Abstract:
The molecular inputs that drive cell behavior are vital to our understanding of development and disease. Cell behavior underlies processes ranging from the formation of one's face (neural crest migration) to the spread of cancer from one tissue to another (invasive metastatic cancers). Identifying the genes and tissues involved in cell behavior not only increases our understanding of biology but also has the potential to yield targeted therapies for diseases hallmarked by aberrant cell behavior.
A well-characterized model system is key to determining the molecular and spatial inputs necessary for cell behavior. In this work I present the C. elegans uterine seam cell (utse) as an ideal model for studying cell outgrowth and shape change. The utse is an H-shaped cell within the hermaphrodite uterus that functions in attaching the uterus to the body wall. Over the L4 larval stage, the utse grows bidirectionally along the anterior-posterior axis, changing from an ellipsoidal shape to an elongated H-shape. Spatially, the utse requires the presence of the uterine toroid cells, sex muscles, and the anchor cell nucleus in order to grow outward properly. Several gene families are involved in utse development, including Trio, Nav, Rab GTPases, and Arp2/3, as well as 54 other genes identified in a candidate RNAi screen. The utse can also serve as a model system for studying metastatic cancer. Meprin proteases promote invasiveness of metastatic cancers, and the meprin-like genes nas-21, nas-22, and toh-1 act similarly within the utse. Studying nas-21 activity has also led to the discovery of novel upstream inhibitors and activators, as well as targets of nas-21, some of which have been characterized as affecting meprin activity. This illustrates that the utse can be used as an in vivo model for learning more about meprins, as well as various other proteins involved in metastasis.
Abstract:
These studies explore how, where, and when variables critical to decision-making are represented in the brain. In order to produce a decision, humans must first determine the relevant stimuli, actions, and possible outcomes before applying an algorithm that selects an action from those available. When choosing amongst alternative stimuli, the framework of value-based decision-making proposes that values are assigned to the stimuli and that these values are then compared in an abstract “value space” in order to produce a decision. Despite much progress, in particular the pinpointing of ventromedial prefrontal cortex (vmPFC) as a region that encodes value, many basic questions remain. In Chapter 2, I show that distributed BOLD signaling in vmPFC represents the value of stimuli under consideration in a manner that is independent of stimulus type. This confirms a key tenet of value-based decision-making: that value is represented abstractly. However, I also show that stimulus-dependent value representations are present in the brain during decision-making, and I suggest a potential neural pathway for stimulus-to-value transformations that integrates these two results.
More broadly, there is both neural and behavioral evidence that two distinct control systems are at work during action selection: the “goal-directed” system, which selects actions based on an internal model of the environment, and the “habitual” system, which generates responses based only on antecedent stimuli. Computational characterizations of these two systems imply that they have different informational requirements in terms of input stimuli, actions, and possible outcomes. Associative learning theory predicts that the habitual system should utilize stimulus and action information only, while goal-directed behavior requires that outcomes, as well as stimuli and actions, be processed. In Chapter 3, I test whether areas of the brain hypothesized to be involved in habitual versus goal-directed control represent the corresponding theorized variables.
The question of whether one or both of these neural systems drives Pavlovian conditioning is less well studied. Chapter 4 describes an experiment in which subjects were scanned while engaged in a Pavlovian task with a simple but non-trivial structure. After comparing a variety of model-based and model-free learning algorithms (thought to underpin goal-directed and habitual decision-making, respectively), it was found that subjects' reaction times were better explained by a model-based system. In addition, neural signaling of precision, a variable based on a representation of a world model, was found in the amygdala. These data indicate that the influence of model-based representations of the environment can extend even to the most basic learning processes.
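To make the contrast concrete, here is a minimal sketch of the two algorithm families on a toy Pavlovian task; the task, parameter values, and variable names are illustrative inventions, not the models fit in the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Pavlovian task: two cues, each followed by reward with fixed probability.
true_p = {"cue_A": 0.8, "cue_B": 0.2}

# Model-free (habitual-style) learner: a cached value per cue, updated by a
# delta rule with no representation of the task structure.
V = {"cue_A": 0.0, "cue_B": 0.0}
alpha = 0.1  # learning rate

# Model-based (goal-directed-style) learner: explicit beliefs about the world
# model, here a Beta posterior over each cue's reward probability.
beliefs = {"cue_A": [1.0, 1.0], "cue_B": [1.0, 1.0]}  # Beta(a, b) counts

for _ in range(500):
    cue = rng.choice(list(true_p))
    r = float(rng.random() < true_p[cue])
    V[cue] += alpha * (r - V[cue])   # model-free: prediction error only
    beliefs[cue][0] += r             # model-based: update the world model
    beliefs[cue][1] += 1.0 - r

for cue, (a, b) in beliefs.items():
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    # 1/var is a precision-like quantity, loosely analogous to the
    # world-model precision signal reported in the amygdala in Chapter 4.
    print(f"{cue}: model-free V={V[cue]:.2f}, "
          f"model-based mean={a / (a + b):.2f}, precision={1 / var:.0f}")
```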
Knowledge of the state of hidden variables in an environment is required for optimal inference regarding the abstract decision structure of a given environment and can therefore be crucial to decision-making in a wide range of situations. Inferring the state of an abstract variable requires the generation and manipulation of an internal representation of beliefs over the values of the hidden variable. In Chapter 5, I describe behavioral and neural results regarding the learning strategies employed by human subjects in a hierarchical state-estimation task. In particular, a comprehensive model-fitting and comparison process pointed to the use of "belief thresholding": subjects tended to eliminate low-probability hypotheses regarding the state of the environment from their internal model and ceased to update the corresponding variables. Thus, in concert with incremental Bayesian learning, humans explicitly manipulate their internal model of the generative process during hierarchical inference, consistent with a serial hypothesis-testing strategy.
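Belief thresholding has a compact algorithmic reading: update surviving hypotheses by Bayes' rule, and prune any hypothesis whose posterior drops below a cutoff so that it is no longer updated. A hypothetical sketch (the cutoff value and task structure are invented for illustration, not taken from the thesis):

```python
import numpy as np

def threshold_bayes(prior, likelihoods, observations, cutoff=0.05):
    """Incremental Bayesian updating with belief thresholding.

    prior        : (K,) initial probabilities over K hypotheses
    likelihoods  : (K, O) array of P(observation | hypothesis)
    observations : sequence of observation indices
    cutoff       : hypotheses falling below this posterior are pruned
    """
    belief = prior.astype(float).copy()
    active = np.ones(len(belief), dtype=bool)
    for obs in observations:
        # Bayes' rule, applied only to hypotheses still in the model.
        belief[active] *= likelihoods[active, obs]
        belief[active] /= belief[active].sum()
        # Prune low-probability hypotheses: they are no longer updated.
        pruned = active & (belief < cutoff)
        belief[pruned] = 0.0
        active &= ~pruned
        belief[active] /= belief[active].sum()
    return belief

prior = np.full(3, 1.0 / 3.0)
lik = np.array([[0.8, 0.2], [0.5, 0.5], [0.2, 0.8]])  # two possible outcomes
print(threshold_bayes(prior, lik, observations=[0, 0, 0, 1, 0]))
```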
Abstract:
Optical Coherence Tomography (OCT) is a popular, rapidly growing imaging technique with an increasing number of biomedical applications due to its noninvasive nature. However, there are three major challenges in understanding and improving an OCT system: (1) Obtaining an OCT image is not easy. It either takes a real medical experiment or requires days of computer simulation. Without much data, it is difficult to study the physical processes underlying OCT imaging of different objects, simply because there aren't many imaged objects. (2) Interpreting an OCT image is also hard. This challenge is more profound than it appears. For instance, it would require a trained expert to tell from an OCT image of human skin whether there is a lesion or not. This is expensive in its own right, but even the expert cannot be sure about the exact size of the lesion or the width of the various skin layers. The take-away message is that analyzing an OCT image even at a high level usually requires a trained expert, and pixel-level interpretation is simply unrealistic. The reason is simple: we have OCT images but not their underlying ground-truth structure, so there is nothing to learn from. (3) The imaging depth of OCT is very limited (a millimeter or less in human tissue). While OCT uses infrared light for illumination to stay noninvasive, the downside is that photons at such long wavelengths can only penetrate a limited depth into the tissue before being back-scattered. To image a particular region of a tissue, photons first need to reach that region. As a result, OCT signals from deeper regions of the tissue are both weak (since few photons reach there) and distorted (due to multiple scattering of the contributing photons). This alone makes OCT images very hard to interpret.
This thesis addresses the above challenges by developing an advanced Monte Carlo simulation platform that is 10,000 times faster than the state-of-the-art simulator in the literature, bringing the simulation time down from 360 hours to a single minute. This powerful simulation tool not only enables us to efficiently generate as many OCT images of objects with arbitrary structure and shape as we want on a common desktop computer, but also provides the underlying ground truth of the simulated images, because we specify it at the outset of the simulation. This is one of the key contributions of this thesis. Building such a powerful simulation tool required a thorough understanding of the signal formation process, careful implementation of the importance sampling/photon splitting procedure, efficient use of a voxel-based mesh system in determining photon-mesh intersections, and parallel computation of the different A-scans that constitute a full OCT image, among other programming and mathematical techniques, which are explained in detail later in the thesis.
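To convey the importance-sampling idea in isolation, here is a heavily simplified 1-D photon random walk in which scattering is biased toward the detector and each photon carries a likelihood-ratio weight to keep the estimate unbiased. This is only a sketch of the weighting principle; the thesis's platform is a full 3-D simulation with photon splitting and a voxel mesh, and all parameters below are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

MU_S, Z_MAX, N_PHOTONS, Q = 10.0, 1.0, 20000, 0.8
# MU_S: scattering events per mm; Z_MAX: slab thickness (mm);
# Q: probability of drawing a detector-bound direction under biased sampling.

def launch(importance_sampling):
    """Track one photon through a 1-D slab; return its detected weight."""
    z, mu, w = 0.0, 1.0, 1.0              # depth, direction cosine, weight
    for _ in range(1000):                 # scattering budget
        z += mu * rng.exponential(1.0 / MU_S)   # free flight
        if z <= 0.0:
            return w                      # back-scattered to the detector
        if z >= Z_MAX:
            return 0.0                    # transmitted out the back: lost
        if importance_sampling:
            # Biased scatter: head toward the detector with probability Q,
            # then multiply by the likelihood ratio (the true isotropic
            # density is 0.5) so the estimator remains unbiased.
            if rng.random() < Q:
                mu, w = -rng.random(), w * 0.5 / Q
            else:
                mu, w = rng.random(), w * 0.5 / (1.0 - Q)
        else:
            mu = 2.0 * rng.random() - 1.0       # unbiased isotropic scatter
    return 0.0

for flag in (False, True):
    hits = [launch(flag) for _ in range(N_PHOTONS)]
    label = "importance-sampled" if flag else "naive"
    print(f"{label}: back-scattered fraction = {np.mean(hits):.4f}")
```

Both estimators converge to the same backscattered fraction; the biased one concentrates photons on detector-bound paths, which is what makes deep, rarely sampled regions affordable to simulate.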
Next we turn to the inverse problem: given an OCT image, predict/reconstruct its ground-truth structure at the pixel level. Solving this problem would let us interpret an OCT image completely and precisely without help from a trained expert. It turns out that we can do much better: for simple structures we reconstruct the ground truth of an OCT image more than 98% correctly, and for more complicated structures (e.g., a multi-layered brain structure) about 93%. We achieve this through extensive use of Machine Learning. The success of the Monte Carlo simulation already puts us in a strong position by providing a great deal of data (effectively unlimited) in the form of (image, truth) pairs. Through a transformation of the high-dimensional response variable, we convert the learning task into a multi-output multi-class classification problem and a multi-output regression problem. We then build a hierarchical architecture of machine learning models (a committee of experts) and train different parts of the architecture with specifically designed data sets. In prediction, an unseen OCT image first goes through a classification model that determines its structure (e.g., the number and types of layers present in the image); the image is then handed to a regression model, trained specifically for that particular structure, which predicts the thickness of the different layers and thereby reconstructs the ground truth of the image. We also demonstrate that ideas from Deep Learning can further improve performance.
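A minimal sketch of this classify-then-regress pipeline on synthetic stand-in data; the random-forest estimators are chosen here only for concreteness, and the feature layout is invented, not the thesis's:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(2)

# Synthetic stand-ins for simulator-generated (image, truth) pairs:
# X holds flattened image features, y_struct the structure class (e.g. the
# number/types of layers), y_thick the per-layer ground-truth dimensions.
n, d, n_classes = 2000, 64, 3
X = rng.normal(size=(n, d))
y_struct = rng.integers(0, n_classes, size=n)
y_thick = rng.random(size=(n, 4)) + y_struct[:, None]

# Stage 1: a classifier determines the structure of the image.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y_struct)

# Stage 2: one regressor per structure, trained only on matching examples
# (the committee of experts).
experts = {
    c: RandomForestRegressor(n_estimators=100, random_state=0)
       .fit(X[y_struct == c], y_thick[y_struct == c])
    for c in range(n_classes)
}

def reconstruct(x):
    """Classify the structure, then route to that structure's regressor."""
    c = clf.predict(x.reshape(1, -1))[0]
    return c, experts[c].predict(x.reshape(1, -1))[0]

structure, layers = reconstruct(X[0])
print(f"predicted structure {structure}, layer dimensions {layers}")
```

Routing each image to a regressor trained only on its own structure class is what lets each expert specialize, at the cost of needing enough simulated data per class, which the fast simulator supplies.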
It is worth pointing out that solving the inverse problem automatically improves the imaging depth: the lower half of an OCT image (i.e., greater depth), which previously could hardly be seen, now becomes fully resolved. Interestingly, although the OCT signals constituting the lower half of the image are weak, messy, and uninterpretable to human eyes, they still carry enough information that a well-trained machine learning model can recover precisely the true structure of the object being imaged. This is another case where Artificial Intelligence (AI) outperforms humans. To the best of the author's knowledge, this thesis is not only a successful pixel-level reconstruction of OCT images but also the first attempt at one. Even attempting such a task would require a large number of fully annotated OCT images (hundreds or even thousands), which is clearly impossible without a powerful simulation tool like the one developed in this thesis.
Abstract:
This thesis discusses various methods for learning and optimization in adaptive systems. Overall, it emphasizes the relationship between optimization, learning, and adaptive systems; and it illustrates the influence of underlying hardware upon the construction of efficient algorithms for learning and optimization. Chapter 1 provides a summary and an overview.
Chapter 2 discusses a method for using feed-forward neural networks to filter the noise out of noise-corrupted signals. The networks use back-propagation learning, but they use it in a way that qualifies as unsupervised learning. The networks adapt based only on the raw input data; there are no external teachers providing information on correct operation during training. The chapter contains an analysis of the learning and develops a simple expression that, based only on the geometry of the network, predicts performance.
Chapter 3 describes a simple model of the piriform cortex, an area of the brain involved in processing olfactory information. The model was used to explore the possible effect of acetylcholine on learning and on odor classification. According to the model, the piriform cortex can classify odors better when acetylcholine is present during learning but absent during recall. This is interesting because it suggests that learning and recall might be separate neurochemical modes (corresponding to whether or not acetylcholine is present). When acetylcholine is turned off at all times, even during learning, the model exhibits behavior somewhat similar to Alzheimer's disease, which is associated with the degeneration of cells that distribute acetylcholine.
Chapters 4, 5, and 6 discuss algorithms appropriate for adaptive systems implemented entirely in analog hardware. The algorithms inject noise into the systems and correlate the noise with the outputs of the systems. This allows them to estimate gradients and to implement noisy versions of gradient descent, without having to calculate gradients explicitly. The methods require only noise generators, adders, multipliers, integrators, and differentiators; and the number of devices needed scales linearly with the number of adjustable parameters in the adaptive systems. With the exception of one global signal, the algorithms require only local information exchange.
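The correlation idea can be sketched in a few lines: perturb the parameters with noise, measure the resulting change in the output error, and multiply that change by the injected noise to obtain a stochastic gradient estimate. Below is a simplified digital rendition of what the thesis implements with analog noise generators, multipliers, and integrators; the loss function and constants are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

TARGET = np.array([1.0, -2.0, 0.5])

def error(w):
    """Stand-in for the system's measured output error (gradient unknown)."""
    return np.sum((w - TARGET) ** 2)

w = np.zeros(3)
sigma, eta = 0.1, 0.01   # noise amplitude and learning rate (illustrative)

for _ in range(5000):
    noise = rng.normal(0.0, sigma, size=w.shape)   # injected noise
    # Correlating the output change with the injected noise estimates the
    # gradient: E[(error(w + n) - error(w)) * n] / sigma^2 -> grad error(w).
    grad_est = (error(w + noise) - error(w)) / sigma**2 * noise
    w -= eta * grad_est                            # noisy gradient descent

print("adapted parameters:", np.round(w, 2), "target:", TARGET)
```

Note how the estimate uses only the scalar error and the noise itself, matching the abstract's point that no explicit gradient calculation is needed and that the hardware cost scales linearly with the number of adjustable parameters.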
Abstract:
We investigate the 2d O(3) model with the standard action by Monte Carlo simulation at couplings β up to 2.05. We measure the energy density, mass gap and susceptibility of the model, and gather high statistics on lattices of size L ≤ 1024 using the Floating Point Systems T-series vector hypercube and the Thinking Machines Corp.'s Connection Machine 2. Asymptotic scaling does not appear to set in for this action, even at β = 2.10, where the correlation length is 420. We observe a 20% difference between our estimate $m/\Lambda_{\overline{\rm MS}} = 3.52(6)$ at this β and the recent exact analytical result. We use the overrelaxation algorithm interleaved with Metropolis updates and show that decorrelation time scales with the correlation length and the number of overrelaxation steps per sweep. We determine its effective dynamical critical exponent to be z' = 1.079(10); thus critical slowing down is reduced significantly for this local algorithm, which is vectorizable and parallelizable.
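The overrelaxation move itself has a simple closed form: each spin is reflected about its local field, which leaves the energy unchanged (hence the interleaved Metropolis updates for ergodicity). A minimal sketch, assuming a square periodic lattice of unit spins:

```python
import numpy as np

def overrelax_site(spins, i, j):
    """Overrelaxation update for the 2-D O(3) model: reflect the spin about
    its local field h, which preserves the energy (s'.h = s.h) and so must
    be interleaved with Metropolis updates for ergodicity."""
    L = spins.shape[0]
    h = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j] +
         spins[i, (j + 1) % L] + spins[i, (j - 1) % L])  # neighbour sum
    s = spins[i, j]
    spins[i, j] = 2.0 * np.dot(s, h) / np.dot(h, h) * h - s  # reflection

# Usage: random unit spins on a periodic L x L lattice.
rng = np.random.default_rng(4)
L = 16
spins = rng.normal(size=(L, L, 3))
spins /= np.linalg.norm(spins, axis=-1, keepdims=True)
overrelax_site(spins, 3, 5)
print(np.linalg.norm(spins[3, 5]))  # still a unit spin
```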
We also use cluster Monte Carlo algorithms: non-local update schemes that can greatly increase the efficiency of computer simulations of spin models. The major computational task in these algorithms is connected component labeling, which identifies clusters of connected sites on a lattice. We have devised new SIMD component labeling algorithms and implemented them on the Connection Machine, and we investigate their performance when applied to the cluster update of the two-dimensional Ising spin model.
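The thesis's contribution is SIMD labeling algorithms for the Connection Machine; the serial union-find sketch below shows only the underlying task, identifying clusters from bond variables such as those generated in a Swendsen-Wang sweep:

```python
import numpy as np

def label_clusters(bonds_h, bonds_v):
    """Label connected clusters on an L x L periodic lattice via union-find.

    bonds_h[i, j] : bond between site (i, j) and (i, j+1)
    bonds_v[i, j] : bond between site (i, j) and (i+1, j)
    Returns an L x L array of cluster labels (root site indices).
    """
    L = bonds_h.shape[0]
    parent = list(range(L * L))

    def find(x):                       # find root, halving paths as we go
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for i in range(L):
        for j in range(L):
            if bonds_h[i, j]:
                union(i * L + j, i * L + (j + 1) % L)
            if bonds_v[i, j]:
                union(i * L + j, ((i + 1) % L) * L + j)

    return np.array([find(s) for s in range(L * L)]).reshape(L, L)

# Usage: random bonds, e.g. from a Swendsen-Wang sweep of the Ising model.
rng = np.random.default_rng(5)
L = 8
labels = label_clusters(rng.random((L, L)) < 0.5, rng.random((L, L)) < 0.5)
print(labels)
```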
Finally we use a Monte Carlo Renormalization Group method to directly measure the couplings of block Hamiltonians at different blocking levels. For the usual averaging block transformation we confirm the renormalized trajectory (RT) observed by Okawa. For another improved probabilistic block transformation we find the RT, showing that it is much closer to the Standard Action. We then use this block transformation to obtain the discrete β-function of the model, which we compare to the perturbative result. We do not see convergence, except when using a rescaled coupling $\beta_E$ to effectively resum the series. For the latter case we see agreement for $m/\Lambda_{\overline{\rm MS}}$ at β = 2.14, 2.26, 2.38 and 2.50. To three loops $m/\Lambda_{\overline{\rm MS}} = 3.047(35)$ at β = 2.50, which is very close to the exact value $m/\Lambda_{\overline{\rm MS}} = 2.943$. Our last point, at β = 2.62, however, disagrees with this estimate.
Abstract:
A large number of technologically important materials undergo solid-solid phase transformations. Examples range from ferroelectrics (transducers and memory devices) and zirconia (thermal barrier coatings) to nickel superalloys and (lithium) iron phosphate (Li-ion batteries). These transformations involve a change in crystal structure, either through diffusion of species or through local rearrangement of atoms. This change of crystal structure leads to a macroscopic change of shape or volume, or both, and results in internal stresses during the transformation. In certain situations this stress field gives rise to cracks (tin, iron phosphate, etc.), which continue to propagate as the transformation front traverses the material. In other materials the transformation modifies the stress field around cracks and affects crack growth behavior (zirconia, ferroelectrics). These observations motivate our study of cracks in solids undergoing phase transformations. Understanding these effects will help improve the mechanical reliability of devices employing these materials.
In this thesis we present work on two problems concerning the interplay between cracks and phase transformations. First, we consider the directional growth of a set of parallel edge cracks due to a solid-solid transformation. We conclude from our analysis that phase transformations can lead to the formation of parallel edge cracks when the transformation strain satisfies certain conditions, and that the resulting cracks grow until their tips cross the phase boundary. Moreover, the cracks continue to grow at a uniform spacing, without instabilities, as the phase boundary traverses into the interior of the body, and there exists an optimal value for the spacing between the cracks. We confirm these conclusions with finite element simulations.
Second, we model the effect of the semiconducting nature and of dopants on cracks in ferroelectric perovskite materials, particularly barium titanate. Traditional approaches to modeling fracture in these materials have treated them as insulators. In reality, they are wide-bandgap semiconductors in which oxygen vacancies and trace impurities act as dopants. We incorporate the space charge arising from the semiconducting effect and dopant ionization into a phase field model for the ferroelectric. We derive the governing equations by invoking the dissipation inequality over a ferroelectric domain containing a crack; this approach also yields the driving force acting on the crack. Our phase field simulations of polarization domain evolution around a crack show an accumulation of electronic charge on the crack surface, making the crack more permeable than previously believed, in agreement with recent experiments. We also discuss the effect of the space charge on domain formation and on the crack driving force.
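Schematically, phase-field models of this kind couple a gradient-flow evolution law for the polarization to Gauss's law with a space-charge source. A generic form (not the thesis's specific equations, which are derived from the dissipation inequality), with $\mathcal{F}$ the total free energy, $\phi$ the electrostatic potential, and $p_h$, $n_e$, $N_D^{+}$, $N_A^{-}$ the hole, electron, and ionized donor/acceptor densities, reads:

```latex
% Schematic governing equations for a semiconducting ferroelectric
% (generic form; the thesis derives its own from the dissipation inequality).
\begin{align}
  \mu\,\dot{p}_i &= -\frac{\delta \mathcal{F}}{\delta p_i}
    && \text{(gradient-flow evolution of the polarization $p_i$)} \\
  \nabla\cdot\left(-\epsilon_0\nabla\phi + \mathbf{p}\right)
    &= q\,\left(p_h - n_e + N_D^{+} - N_A^{-}\right)
    && \text{(Gauss's law with semiconductor space charge)}
\end{align}
```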
Abstract:
Inspired by key experimental and analytical results regarding Shape Memory Alloys (SMAs), we propose a modelling framework to explore the interplay between martensitic phase transformations and plastic slip in polycrystalline materials, with an eye towards computational efficiency. The resulting framework uses a convexified potential for the internal energy density to capture the stored energy associated with transformation at the meso-scale, and introduces kinetic potentials to govern the evolution of transformation and plastic slip. The framework is novel in the way it treats plasticity on par with transformation.
We implement the framework in the setting of anti-plane shear, using a staggered implicit/explicit update: we first use a Fast Fourier Transform (FFT) solver based on an Augmented Lagrangian formulation to implicitly solve for the full-field displacements of a simulated polycrystal, then explicitly update the volume fraction of martensite and the plastic slip using their respective stick-slip type kinetic laws. We observe that, even in this simple setting with an idealized material comprising four martensitic variants and four slip systems, the model recovers a rich variety of SMA-type behaviors. We use this model to gain insight into the isothermal behavior of stress-stabilized martensite, examining the effects of the relative plastic yield strength, the memory of deformation history under non-proportional loading, and several other factors.
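A sketch of the explicit half of such a staggered step, assuming a generic stick-slip law in which an internal variable evolves only when its driving force exceeds a threshold; the implicit FFT displacement solve is omitted, and all constants are invented rather than taken from the thesis:

```python
import numpy as np

def stick_slip_update(value, driving_force, threshold, mobility, dt,
                      lower=0.0, upper=1.0):
    """Explicit stick-slip kinetic update: the internal variable evolves
    only where |driving force| exceeds the critical threshold (illustrative
    form; the thesis's kinetic potentials may differ in detail)."""
    excess = np.abs(driving_force) - threshold
    rate = np.where(excess > 0.0,
                    mobility * np.sign(driving_force) * excess,
                    0.0)                    # "stick" below threshold
    return np.clip(value + dt * rate, lower, upper)

# One staggered step on a set of material points: the driving force would
# come from the preceding implicit displacement solve.
rng = np.random.default_rng(6)
n = 64
lam = np.full(n, 0.2)             # martensite volume fraction per point
f_lam = rng.normal(0.0, 2.0, n)   # stand-in driving force field
lam = stick_slip_update(lam, f_lam, threshold=1.0, mobility=0.1, dt=0.01)
print(lam[:8])
```

The same update form would apply to the plastic slip (with its own threshold and without the [0, 1] clipping), which is one sense in which the framework treats plasticity on par with transformation.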
We extend the framework to the generalized 3-D setting, for which the convexified potential is a lower bound on the actual internal energy, and show that the fully implicit discrete time formulation of the framework is governed by a variational principle for mechanical equilibrium. We further propose an extension of the method to finite deformations via an exponential mapping. We implement the generalized framework using an existing Optimal Transport Mesh-free (OTM) solver. We then model the $\alpha$--$\gamma$ and $\alpha$--$\varepsilon$ transformations in pure iron, with an initial attempt in the latter to account for twinning in the parent phase. We demonstrate the scalability of the framework to large scale computing by simulating Taylor impact experiments, observing nearly linear (ideal) speed-up through 256 MPI tasks. Finally, we present preliminary results of a simulated Split-Hopkinson Pressure Bar (SHPB) experiment using the $\alpha$--$\varepsilon$ model.
Abstract:
The olfactory bulb of mammals aids in the discrimination of odors. A mathematical model based on the bulbar anatomy and electrophysiology is described. Simulations of the highly non-linear model produce 35-60 Hz modulated activity that is coherent across the bulb. The decision states (for the odor information) in this system can be thought of as stable cycles, rather than the stable fixed points typical of simpler neuro-computing models. Analysis shows that a group of coupled non-linear oscillators is responsible for the oscillatory activity. The output oscillation pattern of the bulb is determined by the odor input. The model provides a framework in which to understand the transformation between odor input and the bulbar output to the olfactory cortex. It can also be extended to other brain areas that show oscillatory neural activity, such as the hippocampus, thalamus, and neocortex. There is significant correspondence between the model's behavior and the observed electrophysiology.
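As an illustration of how a coupled excitatory-inhibitory pair can generate such oscillations, here is a rate-model sketch using the classic Wilson-Cowan limit-cycle parameter set rather than the bulb model's own anatomy-based equations; the drive P stands in loosely for odor input:

```python
import numpy as np

def S(x, a, theta):
    """Shifted logistic response used in the Wilson-Cowan equations."""
    return (1.0 / (1.0 + np.exp(-a * (x - theta)))
            - 1.0 / (1.0 + np.exp(a * theta)))

def ei_pair(P=1.25, T=100.0, dt=0.01):
    """One excitatory-inhibitory pair with the classic Wilson-Cowan
    limit-cycle parameters (a stand-in for a mitral-granule oscillator)."""
    c1, c2, c3, c4 = 16.0, 12.0, 15.0, 3.0   # E-E, I-E, E-I, I-I couplings
    e, i = 0.1, 0.05
    trace = []
    for _ in range(int(T / dt)):
        de = -e + (1.0 - e) * S(c1 * e - c2 * i + P, 1.3, 4.0)
        di = -i + (1.0 - i) * S(c3 * e - c4 * i, 2.0, 3.7)
        e, i = e + dt * de, i + dt * di
        trace.append(e)
    return np.array(trace)

tail = ei_pair()[5000:]            # discard the initial transient
print(f"E activity cycles between {tail.min():.2f} and {tail.max():.2f}")
```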
It has also been suggested that the olfactory bulb, the first processing center after the sensory cells in the olfactory pathway, plays a role in olfactory adaptation, odor sensitivity enhancement by motivation, and other olfactory psychophysical phenomena. The input from the higher olfactory centers to the inhibitory cells in the bulb is shown to modulate the response, and thus the sensitivity, of the bulb to odor input. It follows that the bulb can decrease its sensitivity to a pre-existing, already detected odor (adaptation) while remaining sensitive to new odors, or can increase its sensitivity to discover interesting new odors. Other olfactory psychophysical phenomena, such as cross-adaptation, are also discussed.