25 results for Reinforcement Learning, Deep Neural Networks, Python, Stable Baseline, Gym
Abstract:
Minimax lower bounds for concept learning state, for example, that for each sample size $n$ and learning rule $g_n$, there exists a distribution of the observation $X$ and a concept $C$ to be learnt such that the expected error of $g_n$ is at least a constant times $V/n$, where $V$ is the VC dimension of the concept class. However, these bounds say nothing about the rate of decrease of the error for a fixed distribution-concept pair. In this paper we investigate minimax lower bounds in this stronger sense. We show that for several natural $k$-parameter concept classes, including the class of linear halfspaces, the class of balls, the class of polyhedra with a certain number of faces, and a class of neural networks, for any sequence of learning rules $\{g_n\}$, there exists a fixed distribution of $X$ and a fixed concept $C$ such that the expected error is larger than a constant times $k/n$ for infinitely many $n$. We also obtain such strong minimax lower bounds for the tail distribution of the probability of error, which extend the corresponding minimax lower bounds.
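The two kinds of bound contrasted in this abstract can be written side by side. This is only a notational sketch: the symbol $L(g_n)$ for the error probability of the rule $g_n$ and the constants $c_1, c_2$ are assumptions introduced here for illustration and are not fixed by the abstract.

```latex
% Classical minimax bound: the worst-case pair (X, C) may change with n.
\[
  \inf_{g_n} \, \sup_{(X,C)} \, \mathbf{E}\, L(g_n) \;\ge\; c_1 \, \frac{V}{n}
  \qquad \text{for every } n .
\]
% Strong minimax bound: a single fixed pair (X, C) works against the whole sequence of rules.
\[
  \forall \, \{g_n\} \;\; \exists \, (X,C) : \qquad
  \mathbf{E}\, L(g_n) \;\ge\; c_2 \, \frac{k}{n}
  \qquad \text{for infinitely many } n .
\]
```

The difference lies in the order of the quantifiers: in the first statement the hard pair is chosen after $n$ is known, while in the second it is chosen once for all $n$.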
Abstract:
The control and prediction of wastewater treatment plants pursue an important goal: to avoid upsetting the environmental balance by always keeping the system in stable operating conditions. It is known that qualitative information, coming from microscopic examinations and subjective remarks, has a deep influence on the activated sludge process, in particular on the total amount of effluent suspended solids, one of the measures of overall plant performance. The search for an input-output model of this variable and the prediction of sudden increases (bulking episodes) are thus a central concern in ensuring that current discharge limits are met. Unfortunately, the strong interrelation between variables, their heterogeneity and the very high amount of missing information make the use of traditional techniques difficult, or even impossible. Through the combined use of several methods, mainly rough set theory and artificial neural networks, reasonable prediction models are found, which also serve to show the relative importance of the variables and provide insight into the process dynamics.
Abstract:
A range of human behaviors and processes is thought to be driven by rewards, including reinforcement learning, novelty processing, learning, decision making, economic choice, incentive motivation, and addiction. In each case the ventral tegmental area/ventral striatum (nucleus accumbens) (VTAVS) system has been implicated as a key structure by functional imaging studies, mostly on the basis of standard, univariate analyses. Here we propose that standard functional magnetic resonance imaging analysis needs to be complemented by methods that take into account the differential connectivity of the VTAVS system in the different behavioral contexts in order to describe reward-based processes more appropriately. We first consider the wider network for reward processing as it emerged from animal experimentation. Subsequently, an example of a method to assess functional connectivity is given. Finally, we illustrate the usefulness of such analyses with examples regarding reward valuation, reward expectation and the role of reward in addiction.
Abstract:
Utilizing the well-known Ultimatum Game, this note presents the following phenomenon. If we start with simple stimulus-response agents, learning through naive reinforcement, and then grant them some introspective capabilities, we get outcomes that are not closer but farther away from the fully introspective game-theoretic approach. The cause of this is the following: there is an asymmetry in the information that agents can deduce from their experience, and this leads to a bias in their learning process.
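As a rough illustration of what "simple stimulus-response agents, learning through naive reinforcement" could look like, the sketch below uses a Roth-Erev-style propensity update. That update rule, the pie size `PIE`, the initial propensities, and all function names are assumptions made for this example; the abstract does not specify the note's actual learning rule, and the sketch does not model the introspective agents.

```python
import random

PIE = 10  # hypothetical pie size; offers are integers 0..PIE


def draw(propensities, rng):
    """Sample an index with probability proportional to its propensity."""
    r = rng.random() * sum(propensities)
    acc = 0.0
    for i, p in enumerate(propensities):
        acc += p
        if r < acc:
            return i
    return len(propensities) - 1  # guard against floating-point round-off


def play(rounds=5000, seed=0):
    """Repeated Ultimatum Game between two propensity-based learners."""
    rng = random.Random(seed)
    offer_prop = [1.0] * (PIE + 1)                      # proposer: one entry per offer
    reply_prop = [[1.0, 1.0] for _ in range(PIE + 1)]   # responder: [reject, accept] per offer
    for _ in range(rounds):
        offer = draw(offer_prop, rng)
        accept = draw(reply_prop[offer], rng)
        if accept == 1:
            # Each agent reinforces the action it took by the payoff it earned.
            offer_prop[offer] += PIE - offer
            reply_prop[offer][1] += offer
        # A rejection pays zero to both players, so nothing is reinforced.
    return offer_prop, reply_prop


offer_prop, reply_prop = play()
```

In this sketch a rejection never earns a payoff, so the "reject" propensities stay at their initial value while accepted offers accumulate weight on both sides.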
Abstract:
The objective of this paper is to compare the performance of two predictive radiological models, logistic regression (LR) and neural network (NN), with five different resampling methods. One hundred and sixty-seven patients with proven calvarial lesions as the only known disease were enrolled. Clinical and CT data were used for the LR and NN models. Both models were developed with cross-validation, leave-one-out and three different bootstrap algorithms. The final results of each model were compared in terms of error rate and the area under the receiver operating characteristic curve (Az). The neural network obtained a statistically higher Az than LR with cross-validation. The remaining resampling validation methods did not reveal statistically significant differences between the LR and NN rules. The neural network classifier performs better than the one based on logistic regression. This advantage is well detected by three-fold cross-validation, but remains unnoticed when leave-one-out or bootstrap algorithms are used.
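The resampling schemes named in this abstract differ only in how the 167 cases are divided into training and test sets. A minimal sketch of the index splits follows; the function names are hypothetical, and since the abstract does not say which three bootstrap variants were used, the plain bootstrap with out-of-bag testing shown here is an assumption.

```python
import random


def k_fold_splits(n, k, rng):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def leave_one_out_splits(n):
    """Yield (train, test) index lists with a single held-out case each."""
    for i in range(n):
        yield [j for j in range(n) if j != i], [i]


def bootstrap_splits(n, rounds, rng):
    """Train on n draws with replacement; test on the cases never drawn."""
    for _ in range(rounds):
        train = [rng.randrange(n) for _ in range(n)]
        in_bag = set(train)
        test = [j for j in range(n) if j not in in_bag]
        yield train, test


rng = random.Random(0)
n_patients = 167  # sample size reported in the abstract
three_fold = list(k_fold_splits(n_patients, 3, rng))
loo = list(leave_one_out_splits(n_patients))
boot = list(bootstrap_splits(n_patients, 100, rng))
```

Each `(train, test)` pair would then be used to fit a model on `train` and score it on `test`, with the error rate or Az averaged over the splits.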
Abstract:
We study the relationship between topological scales and dynamic time scales in complex networks. The analysis is based on the full dynamics towards synchronization of a system of coupled oscillators. In the synchronization process, modular structures corresponding to well-defined communities of nodes emerge in different time scales, ordered in a hierarchical way. The analysis also provides a useful connection between synchronization dynamics, complex networks topology, and spectral graph analysis.
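To make the idea of communities synchronizing at different time scales concrete, here is a small sketch using identical Kuramoto phase oscillators on a toy graph of two cliques joined by one edge. The choice of the Kuramoto model, the toy graph, the Euler integrator, and every parameter value are assumptions for illustration; the abstract only speaks of "a system of coupled oscillators".

```python
import math
import random


def two_clique_graph(size):
    """Adjacency sets for two cliques of `size` nodes joined by one bridge edge."""
    n = 2 * size
    neighbors = {i: set() for i in range(n)}
    for block in (range(size), range(size, n)):
        for i in block:
            for j in block:
                if i != j:
                    neighbors[i].add(j)
    neighbors[size - 1].add(size)
    neighbors[size].add(size - 1)
    return neighbors


def kuramoto_step(theta, neighbors, dt, coupling=1.0):
    """One explicit Euler step for identical oscillators (zero natural frequency)."""
    return [
        th + dt * coupling * sum(math.sin(theta[j] - th) for j in neighbors[i])
        for i, th in enumerate(theta)
    ]


def order_parameter(theta, nodes):
    """Length of the mean phase vector over `nodes`; 1 means fully in phase."""
    nodes = list(nodes)
    c = sum(math.cos(theta[i]) for i in nodes) / len(nodes)
    s = sum(math.sin(theta[i]) for i in nodes) / len(nodes)
    return math.hypot(c, s)


def simulate(size=4, steps=1000, dt=0.05, seed=0):
    """Return per-step (community A, community B, global) order parameters and final phases."""
    rng = random.Random(seed)
    neighbors = two_clique_graph(size)
    n = 2 * size
    # Initial phases are confined to an arc shorter than a half-circle.
    theta = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    history = []
    for _ in range(steps):
        history.append((
            order_parameter(theta, range(size)),
            order_parameter(theta, range(size, n)),
            order_parameter(theta, range(n)),
        ))
        theta = kuramoto_step(theta, neighbors, dt)
    return history, theta


history, theta = simulate()
```

Plotting the three columns of `history` against time is one way to compare how quickly each clique locks internally with how quickly the whole graph does.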
Abstract:
A recent method used to optimize biased neural networks with low levels of activity is applied to a hierarchical model. As a consequence, the performance of the system is strongly enhanced. The steps to achieve optimization are analyzed in detail.
Abstract:
Motivated by experiments on activity in neuronal cultures [J. Soriano, M. Rodríguez Martínez, T. Tlusty, and E. Moses, Proc. Natl. Acad. Sci. 105, 13758 (2008)], we investigate the percolation transition and critical exponents of spatially embedded Erdős-Rényi networks with degree correlations. In our model networks, nodes are randomly distributed in a two-dimensional spatial domain, and the connection probability depends on Euclidean link length by a power law as well as on the degrees of linked nodes. Generally, spatial constraints lead to higher percolation thresholds in the sense that more links are needed to achieve global connectivity. However, degree correlations favor or do not favor percolation depending on the connectivity rules. We employ two construction methods to introduce degree correlations. In the first one, nodes stay homogeneously distributed and are connected via a distance- and degree-dependent probability. We observe that assortativity in the resulting network leads to a decrease of the percolation threshold. In the second construction method, nodes are first spatially segregated depending on their degree and afterwards connected with a distance-dependent probability. In this segregated model, we find a threshold increase that accompanies the rising assortativity. Additionally, when the network is constructed in a disassortative way, we observe that this property has little effect on the percolation transition.
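The spatial ingredient of this model can be sketched in a few lines: scatter nodes in the unit square, link each pair with a probability that decays as a power of their distance, and measure the largest connected component. The sketch deliberately omits the degree correlations studied in the paper, and the function name and the parameters `scale` and `alpha` are assumptions chosen for illustration.

```python
import math
import random


def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def giant_fraction(n, scale, alpha, seed=0):
    """Fraction of nodes in the largest component of a spatial random graph.

    Nodes are uniform in the unit square; a pair at Euclidean distance d is
    linked with probability min(1, scale * d**(-alpha)).
    """
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    parent = list(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(pts[i], pts[j])
            p = 1.0 if d == 0 else min(1.0, scale * d ** (-alpha))
            if rng.random() < p:
                ri, rj = find(parent, i), find(parent, j)
                if ri != rj:
                    parent[ri] = rj  # merge the two components
    sizes = {}
    for i in range(n):
        root = find(parent, i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n


# Sweep the link-density parameter to see where a large component appears.
sweep = [(s, giant_fraction(200, s, 2.0)) for s in (1e-5, 1e-4, 1e-3, 1e-2)]
```

Locating the percolation threshold would require averaging over many seeds and larger `n`; a single sweep like this only shows the qualitative transition.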
Abstract:
We develop an analytical approach to the susceptible-infected-susceptible epidemic model that allows us to unravel the true origin of the absence of an epidemic threshold in heterogeneous networks. We find that a delicate balance between the number of high degree nodes in the network and the topological distance between them dictates the existence or absence of such a threshold. In particular, small-world random networks with a degree distribution decaying slower than an exponential have a vanishing epidemic threshold in the thermodynamic limit.