BOSTON DATA FESTIVAL: MULTI-ARMED BANDITS AND REINFORCEMENT LEARNING IN COMPUTATIONAL ADVERTISING (MICHAEL ELS)

This talk will cover the most common learning strategies for solving the multi-armed bandit problem. It will use a Python simulation environment to illustrate how the system behaves under different assumptions and how prior learning can seed the system. These strategies will also be discussed from the perspective of the computational advertising framework at MaxPoint, where we employ them to algorithmically learn optimal ad serving behavior.
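
The talk's own simulation environment is not reproduced here. As a rough illustration of the kind of experiment described, the sketch below simulates Thompson sampling, one widely used bandit strategy, over a few ads with fixed click-through rates, and shows how a prior can seed the system. The function name, the parameters, and the click rates are all hypothetical and chosen only for illustration; nothing here is taken from MaxPoint's system.

```python
import random


def thompson_sampling(true_rates, n_rounds, priors=None, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling; return pull counts per arm.

    true_rates: assumed click probability of each arm (ad), unknown to the learner.
    priors: optional list of (alpha, beta) pseudo-counts per arm, representing
            prior learning; defaults to the uniform prior Beta(1, 1).
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    if priors is None:
        params = [[1.0, 1.0] for _ in range(n_arms)]
    else:
        params = [[float(a), float(b)] for a, b in priors]
    pulls = [0] * n_arms
    for _ in range(n_rounds):
        # Draw a plausible click rate for each arm from its current posterior.
        samples = [rng.betavariate(a, b) for a, b in params]
        arm = samples.index(max(samples))
        # Serve the chosen ad and observe a simulated click or no click.
        clicked = rng.random() < true_rates[arm]
        # Update the posterior: alpha counts clicks, beta counts non-clicks.
        params[arm][0 if clicked else 1] += 1
        pulls[arm] += 1
    return pulls


# Illustrative values only: three ads with made-up click rates.
rates = [0.02, 0.05, 0.20]
cold_start = thompson_sampling(rates, n_rounds=5000)
# Seeded run: pseudo-counts stand in for knowledge from earlier campaigns.
seeded = thompson_sampling(rates, n_rounds=5000,
                           priors=[(2, 98), (5, 95), (20, 80)])
print("cold start pulls:", cold_start)
print("seeded pulls:    ", seeded)
```

Comparing the two pull counts gives a feel for how an informative prior changes the amount of exploration spent on weaker ads. Other common strategies, such as epsilon-greedy or upper confidence bounds, could be swapped in by changing only the line that selects the arm.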