Ask HN: Did bandits go out of fashion?

12 points by dira3 8 months ago

About 10 years ago, multi-armed bandits (and especially contextual bandits) were really important for optimization and business use cases.

But nowadays you hardly hear about them, even though optimization is still a need. The only major open source library is still Vowpal Wabbit, and even that is not very robustly maintained or documented.

What are people doing for bandits or optimization needs nowadays? And what are some active, Python-centric, open-source libraries with a strong user community and stable code?

softwaredoug 8 months ago

I think what happens is that traditional supervised ML comes through and steamrolls everything, thanks to the proliferation of tools for training and operationalizing models. There's so much inertia behind one family of ML that other approaches end up as much more bespoke solutions. Mindshare builds momentum into more mindshare...

hruk 8 months ago

We've used the library linked below for Bayesian contextual bandits in production (we have a critical business use case supported by a ~200K-feature sparse Linear UCB bandit). It's a small community, but it's also a small enough codebase that we've read through all of it and feel fine about maintaining it ourselves in case it goes inactive.

https://github.com/bayesianbandits/bayesianbandits
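For readers who haven't seen Linear UCB before, here is a minimal, dense, pure-Python sketch of the textbook "disjoint" variant: one ridge regression per arm, plus a confidence bonus. It is not the bayesianbandits API and not the sparse production setup described above. The names `LinUCBArm`, `choose_arm`, and the default `alpha=1.0` are hypothetical and exist only for illustration.

```python
import math


class LinUCBArm:
    """One arm of a disjoint LinUCB bandit (illustrative sketch only)."""

    def __init__(self, n_features, alpha=1.0):
        self.alpha = alpha  # exploration strength; 1.0 is an assumed default
        # A_inv starts as the identity, i.e. a unit ridge penalty.
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(n_features)]
                      for i in range(n_features)]
        self.b = [0.0] * n_features  # running sum of reward * context

    def _matvec(self, x):
        # Multiply A_inv by the vector x.
        return [sum(a * xj for a, xj in zip(row, x)) for row in self.A_inv]

    def ucb(self, x):
        """Estimated reward for context x plus an exploration bonus."""
        theta = self._matvec(self.b)  # ridge estimate: A^-1 b
        mean = sum(t * xi for t, xi in zip(theta, x))
        var = sum(xi * v for xi, v in zip(x, self._matvec(x)))  # x^T A^-1 x
        return mean + self.alpha * math.sqrt(max(var, 0.0))

    def update(self, x, reward):
        """Fold one (context, reward) observation into the arm."""
        # Sherman-Morrison rank-one update of A^-1 for A + x x^T.
        # A_inv is symmetric, so the outer product uses A_inv_x twice.
        A_inv_x = self._matvec(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, A_inv_x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.A_inv[i][j] -= A_inv_x[i] * A_inv_x[j] / denom
        for i in range(n):
            self.b[i] += reward * x[i]


def choose_arm(arms, x):
    """Return the index of the arm with the highest upper confidence bound."""
    scores = [arm.ucb(x) for arm in arms]
    return max(range(len(arms)), key=scores.__getitem__)
```

In use, you would call `choose_arm(arms, x)` for each request, observe the reward, and then call `arms[i].update(x, reward)` on the arm that was played. With hundreds of thousands of sparse features a dense inverse like this would not be practical, which is presumably why a dedicated library helps.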

arromatic 8 months ago

I have been looking for resources about contextual bandits too, but I failed to find anything good. The closest thing I found was Vowpal Wabbit, same as you, and it has the same problem: it's unmaintained. I tried searching HN for bandits but didn't find anything useful either.

I was looking for algorithms that can discover users' new interests. I feel like after all this research, all I learned is the ancient technique of showing x percent random items to users.

PaulHoule 8 months ago

Personally, I have an RSS reader that does content-based recommendation. My personal evaluation is that it is "great", so I am in no hurry to improve it. As far as bandits go, I blend in 10% random items to help it learn and calibrate. I am thinking of raising that to 20%, but I have no objective criteria for deciding what that fraction should be.
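The "blend in a fixed percentage of random items" approach mentioned in the last two comments can be sketched roughly as follows. This is an illustration under assumptions, not anyone's actual reader code: `blend_recommendations`, its parameters, and the idea of passing a separate `candidate_pool` are all hypothetical, and items are assumed to be hashable.

```python
import random


def blend_recommendations(ranked_items, candidate_pool, k,
                          explore_fraction=0.10, rng=random):
    """Return k items: mostly top-ranked, plus a fraction picked at random."""
    n_explore = round(k * explore_fraction)
    n_exploit = k - n_explore
    # Exploit: take the best items according to the existing recommender.
    top = list(ranked_items[:n_exploit])
    chosen = set(top)
    # Explore: sample uniformly from items not already chosen,
    # so the feed contains no duplicates.
    remaining = [item for item in candidate_pool if item not in chosen]
    explore = rng.sample(remaining, min(n_explore, len(remaining)))
    feed = top + explore
    rng.shuffle(feed)  # avoid always putting random items at the end
    return feed
```

For example, `blend_recommendations(ranked, pool, k=20, explore_fraction=0.10)` would return 18 top-ranked items and 2 random ones in shuffled order. The fraction stays a hand-tuned knob here, which is exactly the open question raised above.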