Application of Reinforcement Learning in Portfolio Management

Last updated on Jan 13, 2023

The Bandit Problem

Determining the optimal portfolio is quite challenging because of the non-deterministic nature of financial markets. Multiarmed bandit serves as a sequential decision-making strategy under uncertainty. Researches reveal that if the bandit algorithm is combined with traditional methods, the portfolio's total performance will increase. Having sufficient market data, the bandit algorithm can improve the Sharpe ratio and the total return.

Finance Reinforcement Learning

Application of Reinforcement Learning in Portfolio Management

The Bandit Problem

Narges Goudarzi

Data Scientist Manager, Pricing