Multi-armed Bandits

You are repeatedly faced with a choice among $n$ options, or actions. After each choice, you receive a numerical reward drawn from a stationary probability distribution that depends on the action you selected.
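
The setup can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the class name `StationaryBandit`, the method `pull`, and the choice of unit-variance Gaussian rewards with means drawn from a standard normal are all assumptions made here for the example. The only property taken from the description is that each action has its own fixed reward distribution.

```python
import random


class StationaryBandit:
    """Hypothetical n-armed bandit whose reward distributions never change."""

    def __init__(self, n, seed=None):
        self.rng = random.Random(seed)
        # Assumption: each action's true mean is drawn once from N(0, 1)
        # and then held fixed, which is what makes the problem stationary.
        self.means = [self.rng.gauss(0.0, 1.0) for _ in range(n)]

    def pull(self, action):
        # The reward depends only on the selected action:
        # a sample from N(means[action], 1).
        return self.rng.gauss(self.means[action], 1.0)


bandit = StationaryBandit(n=5, seed=0)

# Selecting the same action many times: the average reward
# approaches that action's fixed true mean.
rewards = [bandit.pull(2) for _ in range(10_000)]
sample_mean = sum(rewards) / len(rewards)
print(f"true mean: {bandit.means[2]:.3f}, sample mean: {sample_mean:.3f}")
```

Because the distributions are stationary, repeated pulls of one action give an increasingly accurate estimate of its mean reward, which is the quantity a learner would compare across actions.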
