
Multi-arm bandit games

University of Illinois at Urbana-Champaign. Hajek, Bruce; Shomorony, Ilan; Srikant, Rayadurgam.

The multi-armed bandit (MAB) and game theory literature is mainly focused on the expected cumulative reward and the expected payoffs in a game, respectively. In contrast, the rewards and the payoffs are often random variables whose expected values only capture a vague idea of the overall distribution. The focus of this dissertation is to study the fundamental limits of the existing bandit and game theory problems in a risk-averse framework and propose new ideas that address the shortcomings. The author believes that human beings are mostly risk-averse, so studying multi-armed bandits and game theory from the point of view of risk aversion, rather than expected reward/payoff, better captures reality. In this manner, a specific class of multi-armed bandits, called explore-then-commit bandits, and stochastic games are studied in this dissertation, which are based on the notion of Risk-Averse Best Action Decision with Incomplete Information (R-ABADI; Abadi is the maiden name of the author's mother).

The goal of the classical multi-armed bandits is to exploit the arm with the maximum score, defined as the expected value of the arm reward. Instead, we propose a new definition of score that is derived from the joint distribution of all arm rewards and captures the reward of an arm relative to those of all other arms. We use a similar idea for games and propose a risk-averse R-ABADI equilibrium in game theory that is possibly different from the Nash equilibrium: the payoff distributions are taken into account to derive the risk-averse equilibrium, while the expected payoffs are used to find the Nash equilibrium. The fundamental properties of games, e.g. pure and mixed risk-averse R-ABADI equilibria and strict dominance, are studied in the new framework, and the results are expanded to finite-time games.

Furthermore, the stochastic congestion games are studied from a risk-averse perspective and three classes of equilibria are proposed for such games. It is shown by examples that the risk-averse behavior of travelers in a stochastic congestion game can improve the price of anarchy in Pigou and Braess networks. Moreover, the Braess paradox does not occur to the extent originally proposed when travelers are risk-averse.
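The contrast between the classical expected-value score and a score defined relative to the other arms can be sketched with a toy two-arm example. The distributions below are invented for illustration and the relative score shown (probability that an arm's reward is the maximum under the joint distribution) is only in the spirit of the abstract, not the dissertation's exact R-ABADI definition:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # exploration pulls per arm

# Two illustrative arms (hypothetical, not from the dissertation):
# arm 0 pays a steady 1.0; arm 1 usually pays 0.9 but occasionally
# pays 3.0, so its expected reward is higher (0.9*0.9 + 0.1*3.0 = 1.11).
arm0 = np.full(n, 1.0)
arm1 = np.where(rng.random(n) < 0.1, 3.0, 0.9)
samples = np.column_stack([arm0, arm1])

# Classical score: the expected reward of each arm.
mean_score = samples.mean(axis=0)               # ~ [1.0, 1.11]

# Relative score: how often each arm's reward is the maximum,
# estimated from the joint samples.
best = samples.argmax(axis=1)
prob_best = np.bincount(best, minlength=2) / n  # ~ [0.9, 0.1]

print(mean_score.argmax())  # 1: expected value favors the risky arm
print(prob_best.argmax())   # 0: the relative score favors the steady arm
```

The two scores rank the arms differently here, which is the kind of discrepancy that motivates basing the commit decision on the joint reward distribution rather than on expectations alone.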

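The Pigou network mentioned in the abstract gives a concrete baseline for the price of anarchy. The sketch below computes the classical risk-neutral value (4/3); the risk-averse variants studied in the dissertation are not reproduced here:

```python
# Pigou's example: unit traffic from s to t over two parallel roads.
# Road A has latency x (grows with congestion); road B has constant latency 1.
def total_cost(x_a):
    """Total travel cost when a fraction x_a of traffic takes road A."""
    return x_a * x_a + (1 - x_a) * 1.0

# Risk-neutral (Wardrop) equilibrium: everyone takes road A, since its
# latency never exceeds road B's, giving total cost 1.0.
eq_cost = total_cost(1.0)

# Social optimum: minimize over the split (the minimum is at x_a = 1/2).
opt_cost = min(total_cost(x / 1000) for x in range(1001))

poa = eq_cost / opt_cost
print(round(poa, 3))  # 1.333, i.e. the classical 4/3
```

The abstract's claim is that risk-averse traveler behavior in the stochastic versions of such networks can improve on this ratio.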

Risk-averse multi-armed bandits and game theory. YEKKEHKHANY-DISSERTATION-2020.pdf (PDF, 7 MB)
