114 results for reinforcement
Abstract:
The response of buildings to tunnelling induced ground movements is an area of great importance for many urban tunnelling projects. This paper presents the response of two buildings to the construction of a 12 m diameter sprayed concrete lining (SCL) tunnel with face reinforcement, in Italy. Soil and structure displacements were monitored through extensive instrumentation. The settlement response of the two buildings was found to differ significantly, demonstrating both flexible and rigid response mechanisms. Comparison of the building settlement profiles with greenfield settlements enables the soil structure interaction to be quantified. Encouraging agreement between the modification to the greenfield settlement profile displayed by buildings and estimates made from existing predictive tools is observed. Potential issues for infrastructure connected to buildings, arising from the embedment of rigid buildings into the soil, are also highlighted. © 2012 Taylor & Francis Group.
Abstract:
The current procedures in post-earthquake safety and structural assessment are performed manually by a skilled triage team of structural engineers/certified inspectors. These procedures, and particularly the physical measurement of the damage properties, are time-consuming and qualitative in nature. This paper proposes a novel method that automatically detects spalled regions on the surface of reinforced concrete columns and measures their properties in image data. Spalling has been accepted as an important indicator of significant damage to structural elements during an earthquake. According to this method, the region of spalling is first isolated by way of a local entropy-based thresholding algorithm. Following this, the exposure of longitudinal reinforcement (depth of spalling into the column) and length of spalling along the column are measured using a novel global adaptive thresholding algorithm in conjunction with image processing methods in template matching and morphological operations. The method was tested on a database of damaged RC column images collected after the 2010 Haiti earthquake, and comparison of the results with manual measurements indicates the validity of the method.
Abstract:
A recent trend in spoken dialogue research is the use of reinforcement learning to train dialogue systems in a simulated environment. Past researchers have shown that the types of errors that are simulated can have a significant effect on simulated dialogue performance. Since modern systems typically receive an N-best list of possible user utterances, it is important to be able to simulate a full N-best list of hypotheses. This paper presents a new method for simulating such errors based on logistic regression, as well as a new method for simulating the structure of N-best lists of semantics and their probabilities, based on the Dirichlet distribution. Off-line evaluations show that the new Dirichlet model results in a much closer match to the receiver operating characteristics (ROC) of the live data. Experiments also show that the logistic model gives confusions that are closer to the type of confusions observed in live situations. The hope is that these new error models will be able to improve the resulting performance of trained dialogue systems. © 2012 IEEE.
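As a rough illustration of the Dirichlet idea in this abstract, the sketch below draws one set of N-best confidence scores from a Dirichlet distribution and attaches them to a list of semantic hypotheses. It is only a minimal sketch and not the paper's model: the hypothesis strings, the concentration parameters `alphas`, and the function names are all hypothetical, and the logistic-regression confusion model is not covered.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) by normalising Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def simulate_nbest(hypotheses, alphas, rng):
    """Pair each hypothesis with a simulated probability and sort best-first."""
    probs = sample_dirichlet(alphas, rng)
    return sorted(zip(hypotheses, probs), key=lambda hp: hp[1], reverse=True)

rng = random.Random(0)
# Hypothetical semantic hypotheses and concentration parameters (assumptions).
hyps = ["inform(food=thai)", "inform(food=chinese)", "request(phone)"]
alphas = [4.0, 1.0, 0.5]
nbest = simulate_nbest(hyps, alphas, rng)
for hypothesis, prob in nbest:
    print(f"{prob:.3f}  {hypothesis}")
```

Larger values in `alphas` concentrate probability mass on the corresponding hypothesis; in practice such parameters would have to be fitted to live recognition data.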
Abstract:
The mesostriatal dopamine system is prominently implicated in model-free reinforcement learning, with fMRI BOLD signals in ventral striatum notably covarying with model-free prediction errors. However, latent learning and devaluation studies show that behavior also shows hallmarks of model-based planning, and the interaction between model-based and model-free values, prediction errors, and preferences is underexplored. We designed a multistep decision task in which model-based and model-free influences on human choice behavior could be distinguished. By showing that choices reflected both influences we could then test the purity of the ventral striatal BOLD signal as a model-free report. Contrary to expectations, the signal reflected both model-free and model-based predictions in proportions matching those that best explained choice behavior. These results challenge the notion of a separate model-free learner and suggest a more integrated computational architecture for high-level human decision-making.
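One common way to express a mixture of model-based and model-free influences is a weighted sum of the two value estimates. The sketch below assumes that form; the weight `w`, the function name, and the example numbers are hypothetical and are not taken from the study.

```python
def hybrid_values(q_model_based, q_model_free, w):
    """Mix two value estimates per action; w=1 is purely model-based, w=0 purely model-free."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return [w * mb + (1.0 - w) * mf for mb, mf in zip(q_model_based, q_model_free)]

# Hypothetical first-stage action values for a two-choice task (assumptions).
q_mb = [0.8, 0.2]
q_mf = [0.3, 0.6]
mixed = hybrid_values(q_mb, q_mf, w=0.5)
preferred = max(range(len(mixed)), key=lambda a: mixed[a])
print(mixed, preferred)
```

Under this assumed form, the two systems disagree about the better action in the example, and the weight decides which preference wins.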
Abstract:
The origin of altruism remains one of the most enduring puzzles of human behaviour. Indeed, true altruism is often thought either not to exist, or to arise merely as a miscalculation of otherwise selfish behaviour. In this paper, we argue that altruism emerges directly from the way in which distinct human decision-making systems learn about rewards. Using insights provided by neurobiological accounts of human decision-making, we suggest that reinforcement learning in game-theoretic social interactions (habitisation over either individuals or games) and observational learning (either imitative or inference-based) lead to altruistic behaviour. This arises not only as a result of computational efficiency in the face of processing complexity, but as a direct consequence of optimal inference in the face of uncertainty. Critically, we argue that the fact that evolutionary pressure acts not over the object of learning ('what' is learned), but over the learning systems themselves ('how' things are learned), enables the evolution of altruism despite the direct threat posed by free-riders.
Abstract:
Most reinforcement learning models of animal conditioning operate under the convenient, though fictive, assumption that Pavlovian conditioning concerns prediction learning whereas instrumental conditioning concerns action learning. However, it is only through Pavlovian responses that Pavlovian prediction learning is evident, and these responses can act against the instrumental interests of the subjects. This can be seen in both experimental and natural circumstances. In this paper we study the consequences of importing this competition into a reinforcement learning context, and demonstrate the resulting effects in an omission schedule and a maze navigation task. The misbehavior created by Pavlovian values can be quite debilitating; we discuss how it may be disciplined.
Abstract:
Decision making in an uncertain environment poses a conflict between the opposing demands of gathering and exploiting information. In a classic illustration of this 'exploration-exploitation' dilemma, a gambler choosing between multiple slot machines balances the desire to select what seems, on the basis of accumulated experience, the richest option, against the desire to choose a less familiar option that might turn out more advantageous (and thereby provide information for improving future decisions). Far from representing idle curiosity, such exploration is often critical for organisms to discover how best to harvest resources such as food and water. In appetitive choice, substantial experimental evidence, underpinned by computational reinforcement learning (RL) theory, indicates that a dopaminergic, striatal and medial prefrontal network mediates learning to exploit. In contrast, although exploration has been well studied from both theoretical and ethological perspectives, its neural substrates are much less clear. Here we show, in a gambling task, that human subjects' choices can be characterized by a computationally well-regarded strategy for addressing the explore/exploit dilemma. Furthermore, using this characterization to classify decisions as exploratory or exploitative, we employ functional magnetic resonance imaging to show that the frontopolar cortex and intraparietal sulcus are preferentially active during exploratory decisions. In contrast, regions of striatum and ventromedial prefrontal cortex exhibit activity characteristic of an involvement in value-based exploitative decision making. The results suggest a model of action selection under uncertainty that involves switching between exploratory and exploitative behavioural modes, and provide a computationally precise characterization of the contribution of key decision-related brain systems to each of these functions.
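The abstract does not name the choice strategy it fits. As an illustration only, the sketch below assumes a softmax rule over estimated slot-machine payoffs and labels a choice as exploitative when it picks the option with the highest current estimate, and exploratory otherwise. The payoff estimates, the inverse temperature `beta`, and all function names are hypothetical.

```python
import math
import random

def softmax_probs(values, beta):
    """Choice probabilities under a softmax rule; larger beta means greedier choices."""
    top = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (v - top)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def choose(values, beta, rng):
    """Sample one option index according to the softmax probabilities."""
    probs = softmax_probs(values, beta)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]

def classify(values, choice):
    """Label a choice by whether it selects the currently best-valued option."""
    return "exploit" if values[choice] == max(values) else "explore"

rng = random.Random(1)
estimates = [2.0, 5.0, 3.5, 1.0]  # hypothetical payoff estimates for four machines
choice = choose(estimates, beta=0.5, rng=rng)
label = classify(estimates, choice)
print(choice, label)
```

A labelling of this kind is what would allow trials to be split into exploratory and exploitative sets for a later comparison of brain activity.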
Abstract:
Termination of a painful or unpleasant event can be rewarding. However, whether the brain treats relief in a similar way as it treats natural reward is unclear, and the neural processes that underlie its representation as a motivational goal remain poorly understood. We used fMRI (functional magnetic resonance imaging) to investigate how humans learn to generate expectations of pain relief. Using a Pavlovian conditioning procedure, we show that subjects experiencing prolonged experimentally induced pain can be conditioned to predict pain relief. This proceeds in a manner consistent with contemporary reward-learning theory (average reward/loss reinforcement learning), reflected by neural activity in the amygdala and midbrain. Furthermore, these reward-like learning signals are mirrored by opposite aversion-like signals in lateral orbitofrontal cortex and anterior cingulate cortex. This dual coding has parallels to 'opponent process' theories in psychology and promotes a formal account of prediction and expectation during pain.
Abstract:
A severe shortage of good quality donor cornea is now an international crisis in public health. Alternatives for donor tissue need to be urgently developed to meet the increasing demand for corneal transplantation. Hydrogels have been widely used as scaffolds for corneal tissue regeneration due to their large water content, similar to that of native tissue. However, these hydrogel scaffolds lack the fibrous structure that functions as a load-bearing component in the native tissue, resulting in poor mechanical performance. This work shows that mechanical properties of compliant hydrogels can be substantially enhanced with electrospun nanofiber reinforcement. Electrospun gelatin nanofibers were infiltrated with alginate hydrogels, yielding transparent fiber-reinforced hydrogels. Without prior crosslinking, electrospun gelatin nanofibers improved the tensile elastic modulus of the hydrogels from 78±19 kPa to 450±100 kPa. Stiffer hydrogels, with elastic modulus of 820±210 kPa, were obtained by crosslinking the gelatin fibers with carbodiimide hydrochloride in ethanol before the infiltration process, but at the expense of transparency. The developed fiber-reinforced hydrogels show great promise as mechanically robust scaffolds for corneal tissue engineering applications. © 2013 Elsevier Ltd.
Abstract:
The past decade has seen a rise of interest in Laplacian eigenmaps (LEMs) for nonlinear dimensionality reduction. LEMs have been used in spectral clustering, in semisupervised learning, and for providing efficient state representations for reinforcement learning. Here, we show that LEMs are closely related to slow feature analysis (SFA), a biologically inspired, unsupervised learning algorithm originally designed for learning invariant visual representations. We show that SFA can be interpreted as a function approximation of LEMs, where the topological neighborhoods required for LEMs are implicitly defined by the temporal structure of the data. Based on this relation, we propose a generalization of SFA to arbitrary neighborhood relations and demonstrate its applicability for spectral clustering. Finally, we review previous work with the goal of providing a unifying view on SFA and LEMs. © 2011 Massachusetts Institute of Technology.
Abstract:
Although it is widely believed that reinforcement learning is a suitable tool for describing behavioral learning, the mechanisms by which it can be implemented in networks of spiking neurons are not fully understood. Here, we show that different learning rules emerge from a policy gradient approach depending on which features of the spike trains are assumed to influence the reward signals, i.e., depending on which neural code is in effect. We use the framework of Williams (1992) to derive learning rules for arbitrary neural codes. For illustration, we present policy-gradient rules for three different example codes - a spike count code, a spike timing code and the most general "full spike train" code - and test them on simple model problems. In addition to classical synaptic learning, we derive learning rules for intrinsic parameters that control the excitability of the neuron. The spike count learning rule has structural similarities with established Bienenstock-Cooper-Munro rules. If the distribution of the relevant spike train features belongs to the natural exponential family, the learning rules have a characteristic shape that raises interesting prediction problems.
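To make the policy-gradient idea concrete, the sketch below applies a REINFORCE-style update in the spirit of Williams (1992) to a single stochastic unit that either spikes or stays silent in one time bin. This is a deliberately reduced stand-in for the spike-train codes discussed in the abstract, not the paper's derivation: the logistic spike probability, the reward function, the learning rate, and all names are assumptions.

```python
import math
import random

def spike_probability(weights, inputs):
    """Probability that the unit spikes, assumed logistic in the summed synaptic drive."""
    drive = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-drive))

def reinforce_step(weights, inputs, reward_fn, rng, lr=0.5, baseline=0.0):
    """One update: sample a spike, observe a reward, move along the log-likelihood gradient."""
    p = spike_probability(weights, inputs)
    spike = 1 if rng.random() < p else 0
    reward = reward_fn(spike)
    # For a Bernoulli-logistic unit, d/dw log P(spike) = (spike - p) * x.
    return [w + lr * (reward - baseline) * (spike - p) * x
            for w, x in zip(weights, inputs)]

rng = random.Random(0)
weights = [0.0]   # one synapse, initially neutral
inputs = [1.0]    # constant presynaptic input (assumption)
for _ in range(200):
    # Hypothetical task: the unit is rewarded whenever it spikes.
    weights = reinforce_step(weights, inputs, lambda s: float(s), rng)
final_p = spike_probability(weights, inputs)
print(round(final_p, 3))
```

Because reward arrives only on spiking trials in this toy task, every rewarded trial nudges the weight upward and the spike probability climbs toward one.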
Abstract:
We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. This step includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.
Abstract:
To determine the load at which FRPs debond from concrete beams using global-energy-balance-based fracture mechanics concepts, the single most important parameter is the fracture energy of the concrete-FRP interface, which is easy to define but difficult to determine. Debonding propagates in the narrow zone of concrete, between the FRP and the (tension) steel reinforcement bars in the beam, and the presence of nearby steel bars prevents the fracture process zone, which in concrete is normally extensive, from developing fully. The paper presents a detailed discussion of the mechanism of the FRP debonding, and shows that the initiation of debonding can be regarded as a Mode I (tensile) fracture in concrete, despite being loaded primarily in shear. It is shown that the incorporation of this fracture energy in the debonding model developed by the authors, details of which are presented elsewhere, gives predictions that match the test results reported in the literature. © 2013 Elsevier Ltd.
Abstract:
Aging concrete infrastructure in developed economies and more recently constructed concrete infrastructure in the developing world are frequently found to be deficient in structural strength relative to current needs. This can be attributed to a variety of factors including deterioration, construction defects, accidental damage, changes in understanding and failure to design for future loading requirements. Strengthening existing concrete structures can be a cost- and carbon-effective alternative to replacement. A competitive option for the strengthening of concrete slab-on-beam structures that are deficient in shear capacity is the U-wrapping of the down-stand beam portion of the shear span with externally bonded FRP fabric. While guidance exists for the strengthening of reinforced concrete by U-wrapping, the interaction between internal steel reinforcement, concrete and external FRP in the presence of a dominant diagonal shear crack is not well understood. An approach adopted in previous work has been to explore this interaction through conventional push-off testing. In conventional push-off testing, unlike in a beam, the shear plane is parallel to the direction of loading and perpendicular to the principal fibre orientation. This paper presents a novel push-off test variation in which the shear plane is inclined at 45° to the direction of loading and the principal fibre orientation. A variety of reinforcement ratios, FRP thicknesses and FRP end conditions are modelled. The implications of inclined cracking on debonding of FRP are investigated. The suitability and relevance of inclined push-off tests for further work in this area are also assessed. © 2013, NetComposite Limited.