955 results for Conditional entropy
Abstract:
Reinforcement Learning (RL) provides a powerful framework for sequential decision-making problems in which the transition dynamics are unknown or too complex to be represented. The RL approach is based on speculating about the best decision to make given sample estimates obtained from previous interactions, a recipe that has led to several breakthroughs in domains ranging from game playing to robotics. Despite their success, current RL methods hardly generalize from one task to another, and achieving the kind of generalization obtained through unsupervised pre-training in non-sequential problems seems unthinkable. Unsupervised RL has recently emerged as a way to improve the generalization of RL methods. Just like its non-sequential counterpart, the unsupervised RL framework comprises two phases: an unsupervised pre-training phase, in which the agent interacts with the environment without external feedback, and a supervised fine-tuning phase, in which the agent aims to efficiently solve a task in the same environment by exploiting the knowledge acquired during pre-training. In this thesis, we study unsupervised RL via state entropy maximization, in which the agent uses the unsupervised interactions to pre-train a policy that maximizes the entropy of its induced state distribution. First, we provide a theoretical characterization of the learning problem by considering a convex RL formulation that subsumes state entropy maximization. Our analysis shows that maximizing the state entropy in finite trials is inherently harder than RL. Then, we study the state entropy maximization problem from an optimization perspective. In particular, we show that the primal formulation of the corresponding optimization problem can be (approximately) addressed through tractable linear programs. Finally, we provide the first practical methodologies for state entropy maximization in complex domains, both when pre-training takes place in a single environment and when it spans multiple environments.
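The quantity being maximized in the abstract above, the entropy of the state distribution induced by a policy, can be illustrated with a minimal sketch. The function name `state_entropy` and the two toy visit sequences are hypothetical and not taken from the thesis; the sketch only evaluates the Shannon entropy of an empirical state-visitation distribution and does not implement any of the thesis's algorithms.

```python
import math
from collections import Counter


def state_entropy(visited_states):
    """Shannon entropy (in nats) of the empirical state-visitation distribution."""
    counts = Counter(visited_states)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical visit sequences over four states labelled 0..3 (illustrative data only).
uniform_visits = [0, 1, 2, 3] * 25      # every state visited equally often
narrow_visits = [0] * 97 + [1, 2, 3]    # the agent stays almost always in state 0

print(state_entropy(uniform_visits))    # the maximum possible for four states, log(4)
print(state_entropy(narrow_visits))     # much lower: coverage of the state space is poor
```

A policy that spreads its visits evenly over the reachable states scores higher under this measure, which is the sense in which entropy maximization rewards exploration without any external reward signal.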
Abstract:
This PhD thesis develops a new firm-level conditional risk measure. It is named Joint Value at Risk (JVaR) and is defined as a quantile of a conditional distribution of interest, where the conditioning event is a latent upper tail event. It addresses the problem of how risk changes under extreme volatility scenarios. The properties of JVaR are studied based on a stochastic volatility representation of the underlying process. We prove that JVaR is leverage consistent, i.e. it is an increasing function of the dependence parameter in the stochastic representation. A feasible class of nonparametric M-estimators is introduced by exploiting the elicitability of quantiles and stochastic ordering theory. Consistency and asymptotic normality of the two-stage M-estimator are derived, and a simulation study is reported to illustrate its finite-sample properties. Parametric estimation methods are also discussed. The relation with the VaR is exploited to introduce a volatility contribution measure, and a tail risk measure is also proposed. The analysis of the dynamic JVaR is presented based on asymmetric stochastic volatility models. Empirical results with S&P500 data show that accounting for extreme volatility levels is relevant to better characterize the evolution of risk. The work is complemented by a review of the literature, where we provide an overview of quantile risk measures, elicitable functionals and several stochastic orderings.
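The idea of a quantile conditioned on an upper-tail volatility event can be sketched on simulated data. Everything below is an assumption made for illustration: the toy log-normal volatility model, the function names, and above all the fact that volatility is observed directly, whereas in the thesis it is latent and a two-stage M-estimator is used. This is not the thesis's estimator.

```python
import math
import random


def empirical_quantile(values, level):
    """Order statistic at rank ceil(level * n), a simple empirical quantile."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, max(0, math.ceil(level * len(ordered)) - 1))
    return ordered[index]


def conditional_tail_quantile(returns, vols, alpha, tail_level):
    """alpha-quantile of returns, restricted to observations whose
    volatility exceeds its own tail_level quantile."""
    threshold = empirical_quantile(vols, tail_level)
    tail_returns = [r for r, v in zip(returns, vols) if v > threshold]
    return empirical_quantile(tail_returns, alpha)


# Toy stochastic-volatility simulation (assumed model, for illustration only).
rng = random.Random(0)
vols = [math.exp(rng.gauss(0.0, 0.5)) for _ in range(20000)]
returns = [v * rng.gauss(0.0, 1.0) for v in vols]

var_5 = empirical_quantile(returns, 0.05)                        # unconditional 5% quantile
jvar_5 = conditional_tail_quantile(returns, vols, 0.05, 0.90)    # same quantile in the top 10% of volatility
print(var_5, jvar_5)
```

In this toy setting the conditional quantile lies well below the unconditional one, which mirrors the abstract's point that risk looks different once extreme volatility levels are taken into account.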
Abstract:
The current climate crisis requires a comprehensive understanding of biodiversity to acknowledge how ecosystems’ responses to anthropogenic disturbances may result in feedback that can either mitigate or exacerbate global warming. Although ecosystems are dynamic and macroecological patterns change drastically in response to disturbance, dynamic macroecology has received insufficient attention and theoretical formalisation. In this context, the maximum entropy principle (MaxEnt) could provide an effective inference procedure to study ecosystems. Since the improper usage of entropy outside its scope often leads to misconceptions, the opening chapter will clarify its meaning by following its evolution from classical thermodynamics to information theory. The second chapter introduces the study of ecosystems from a physicist’s viewpoint. In particular, the MaxEnt Theory of Ecology (METE) will be the cornerstone of the discussion. METE predicts the shapes of macroecological metrics in relatively static ecosystems using constraints imposed by static state variables. However, in disturbed ecosystems with macroscale state variables that change rapidly over time, its predictions tend to fail. In the final chapter, DynaMETE is therefore presented as an extension of METE from static to dynamic. By predicting how macroecological patterns are likely to change in response to perturbations, DynaMETE can contribute to a better understanding of disturbed ecosystems’ fate and the improvement of conservation and management of carbon sinks, like forests. Targeted strategies in ecosystem management are now indispensable to enhance the interdependence of human well-being and the health of ecosystems, thus avoiding climate change tipping points.
Abstract:
This article analyzes food insecurity and hunger in Brazilian families with children under five years of age. This was a nationally representative cross-sectional study using data from the National Demographic and Health Survey on Women and Children (PNDS-2006), in which the outcome variable was moderate to severe food insecurity, measured by the Brazilian Food Insecurity Scale (EBIA). Prevalence estimates and prevalence ratios were generated with 95% confidence intervals. The results showed a high prevalence of moderate to severe food insecurity, concentrated in the North and Northeast regions (30.7%), in economic classes D and E (34%), and in beneficiaries of conditional cash transfer programs (36.5%). Multivariate analysis showed that the socioeconomic relative risks (beneficiaries of conditional cash transfers), regional relative risks (North and Northeast regions), and economic relative risks (classes D and E) were 1.8, 2.0 and 2.4, respectively. Aggregation of the three risks showed 48% of families with moderate to severe food insecurity, meaning that adults and children were going hungry during the three months preceding the survey.
Abstract:
In acquired immunodeficiency syndrome (AIDS) studies it is quite common to observe viral load measurements collected irregularly over time. Moreover, these measurements can be subject to upper and/or lower detection limits depending on the quantification assays. A complication arises when these continuous repeated measures have heavy-tailed behavior. For such data structures, we propose a robust censored linear model based on the multivariate Student's t-distribution. To compensate for the autocorrelation among irregularly observed measures, a damped exponential correlation structure is employed. An efficient expectation-maximization-type algorithm is developed for computing the maximum likelihood estimates, obtaining as a by-product the standard errors of the fixed effects and the log-likelihood function. The proposed algorithm uses closed-form expressions at the E-step that rely on formulas for the mean and variance of a truncated multivariate Student's t-distribution. The methodology is illustrated through an application to a Human Immunodeficiency Virus-AIDS (HIV-AIDS) study and several simulation studies.
Abstract:
Transfer of reaction products formed on the surfaces of two mutually rubbed dielectric solids makes an important, if not dominant, contribution to triboelectricity. New evidence in support of this statement is presented in this report, based on analytical electron microscopy coupled to electrostatic potential mapping techniques. Mechanical action on contacting surface asperities transforms them into hot-spots for free-radical formation, followed by electron transfer producing cationic and anionic polymer fragments, according to their electronegativity. Polymer ions accumulate, creating domains with excess charge, because they are formed at fracture surfaces of pulled-out asperities. Another factor for charge segregation is the low polymer mixing entropy, following Flory and Huggins. The previously described formation of fractal charge patterns is thus the result of fractal scatter of polymer fragments on both contacting surfaces. The present results contribute to the explanation of the centuries-old difficulties in understanding the triboelectric series and triboelectricity in general, as well as the dissipative nature of friction, and they may lead to better control of friction and its consequences.
Abstract:
A complex iridium oxide β-Li_{2}IrO_{3} crystallizes in a hyperhoneycomb structure, a three-dimensional analogue of the honeycomb lattice, and is found to be a spin-orbital Mott insulator with J_{eff}=1/2 moment. Ir ions are connected to the three neighboring Ir ions via Ir-O_{2}-Ir bonding planes, which very likely gives rise to bond-dependent ferromagnetic interactions between the J_{eff}=1/2 moments, an essential ingredient of the Kitaev model with a spin liquid ground state. Dominant ferromagnetic interaction between J_{eff}=1/2 moments is indeed confirmed by the temperature dependence of the magnetic susceptibility χ(T), which shows a positive Curie-Weiss temperature θ_{CW}∼+40 K. A magnetic ordering with a very small entropy change, likely associated with a noncollinear arrangement of J_{eff}=1/2 moments, is observed at T_{c}=38 K. With the application of a magnetic field to the ordered state, a large moment of more than 0.35 μ_{B}/Ir is induced above 3 T, a substantially polarized J_{eff}=1/2 state. We argue that the close proximity to ferromagnetism and the presence of large fluctuations are evidence that the ground state of hyperhoneycomb β-Li_{2}IrO_{3} is located in close proximity to a Kitaev spin liquid.
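For readers outside magnetism, the standard Curie-Weiss form that such a susceptibility analysis usually relies on is recalled below. The Curie constant C is a symbol introduced here for illustration; the abstract quotes only θ_{CW}.

```latex
\chi(T) = \frac{C}{T - \theta_{CW}}, \qquad T \gg |\theta_{CW}|
```

Under this form a positive θ_{CW} indicates predominantly ferromagnetic interactions and a negative one predominantly antiferromagnetic interactions, which is why the reported θ_{CW}∼+40 K is read as evidence of dominant ferromagnetic coupling.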
Abstract:
The copper and cadmium complexation properties of natural sediment suspensions from reservoirs of the Tietê River were studied using solid-membrane copper and cadmium ion-selective electrodes. The complexation and the average conditional stability constants were determined under equilibrium conditions at pH = 6.00 ± 0.05 in a medium of 1.0 mol L⁻¹ sodium nitrate, using the Scatchard method. The copper and cadmium electrodes presented Nernstian behavior from 1×10⁻⁶ to 1×10⁻³ mol L⁻¹ of total metal concentration. Scatchard graphs suggest two classes of binding sites for both metals. A multivariate study was done to correlate the reservoirs and the variables: complexation properties, size, total organic carbon, volatile acid sulfide, E II and pH.
Abstract:
Considering intrinsic characteristics of the system exclusively, both statistical and information theory interpretations of the second law are used to provide more comprehensive meanings for the concepts of entropy, temperature, and Helmholtz and Gibbs energies. The coherence of the Clausius inequality with these concepts is emphasized. The aim of this work is to re-discuss the second law of thermodynamics in accordance with homogeneous processes thermodynamics, a temporal science which is the very special oversimplification of continuum mechanics for spatially constant intensive properties.
Abstract:
This paper studies complex sentences with temporal hypotactic clauses and with conditional hypotactic clauses in order to investigate the degree of grammaticalization shown by these two kinds of utterances. Our hypothesis is that the more the hypotactic clause is integrated into the nuclear clause, the greater the degree of grammaticalization. The degree of integration was measured according to three groups of factors, and the results show that, regarding two of the variables evaluated, the conditional clauses are the most integrated into their nucleus, but, in another rank of evaluation, the temporal clauses are the most integrated ones. Considering that this study is based on a functionalist view, the results may be interpreted according to the principle that there is a competition of motivations in the use of language, so that each utterance reflects the balance of such forces.
Abstract:
Universidade Estadual de Campinas. Faculdade de Educação Física
Abstract:
In this work we analyze the connections between entropy, reversibility, irreversibility, the H-theorem, the Boltzmann transport equation, and the Poincaré recurrence theorem. These topics are studied separately in many articles and books, but they are generally not analyzed together in a way that shows the relations among them, as we have done here. We have sought to write the article didactically, following the path we believe to be the simplest possible, in order to make the content accessible to undergraduate physics students.
Abstract:
The expansion of the three-term contingency into units with four or more elements opened new perspectives for understanding complex behaviors, such as the emergence of responses that derive from the formation of equivalent stimulus classes and that model symbolic and conceptual behaviors. In experimental research, the matching-to-sample procedure has frequently been used to establish conditional discriminations. In particular, obtaining generalized identity matching is considered to demonstrate the acquisition of the concepts of sameness and difference. We argue that seeking to understand these concepts in terms of conditional discriminative processes may have been responsible for the frequent failures to demonstrate them in non-human subjects. The lack of correspondence between the discriminative processes responsible for establishing the reflexivity relation among stimuli that form equivalence classes and generalized identity matching is reviewed here across empirical studies and discussed with respect to its implications.