Speaker: Panayotis Mertikopoulos, CNRS Grenoble
The bane of decision-making in an unknown environment is regret: no rational agent would want to realize in hindsight that the decision policy they employed was strictly inferior to a crude policy prescribing the same action throughout. As such, the minimization of regret has become the linchpin of an extensive literature on online learning, with far-reaching applications in machine learning, economics, operations research, network science, and many other fields where online decision-making plays an important role.
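To make the notion concrete, one standard way to formalize regret is sketched below. The notation is assumed here purely for illustration and is not taken from the talk: $\mathcal{X}$ is the agent's set of actions, $x_t \in \mathcal{X}$ is the action chosen at stage $t$, $u_t$ is the reward function at stage $t$, and $T$ is the horizon of play.

```latex
% Regret after T stages: the gap between the best fixed action in hindsight
% and the rewards actually collected (all symbols are illustrative assumptions).
\[
  \mathrm{Reg}(T)
    = \max_{x \in \mathcal{X}} \sum_{t=1}^{T} u_t(x)
    - \sum_{t=1}^{T} u_t(x_t)
\]
```

Under this convention, a policy is usually said to have "no regret" when $\mathrm{Reg}(T)$ grows sublinearly in $T$, that is, $\mathrm{Reg}(T) = o(T)$, so that its average reward is asymptotically at least as good as that of any policy prescribing the same action throughout.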
In this talk, we will examine the ramifications of no-regret learning in multi-agent environments, i.e., when the agents' rewards are determined by a non-cooperative game. In this context, a natural question that arises is whether no-regret learning always leads to rationally admissible states and, in particular, whether it converges to a Nash equilibrium. We will discuss a range of answers to this question, both positive and negative, and we will examine in detail how these answers are affected by the information available to the players.