B2 R5209
reinforcement learning, machine learning, combinatorial multi-armed bandits, large action spaces, limited feedback, efficient exploration, submodular optimization, black-box optimization, global optimization
machine learning, deep learning, reinforcement learning, artificial intelligence, combinatorial multi-armed bandits, intelligent systems, black-box optimization