We’ve had some interesting discussion recently on the future goals of reinforcement learning. Penetration into industry seems to be a hot topic within the community, with the consensus being that reinforcement learning is currently too difficult to apply to problems without expertise. A related issue is that RL has no standard toolkit, unlike machine learning, which now has Weka among others.
Then, of course, there is the search for the killer app, a reinforcement learning application that demonstrably revolutionizes an industry. The killer app must of course be preceded by a killer problem. Allow me to humbly submit one such potential area of application: Revenue Management.
I am most familiar with airline revenue management, the method by which airlines choose prices for the seats they sell. Most of the interesting dynamics of airline revenue management policies come from two key properties.
- Customer behavior is difficult to predict.
- An empty seat is perishable.
Modern airlines have next to no capital. Everything from the terminal, offices, runways, and baggage carts to the planes themselves is rented or leased. An airline is essentially an orchestration of the revenue that flows through these combined resources, revenue that starts in your wallet.
The optimization problem is to pick prices for seats that maximize profit. Airlines have a number of nifty tricks to do this, like observing that certain kinds of travelers (business travelers with expense budgets) book late whereas recreational flyers book early. This allows for price discrimination.
Of course, this optimization problem has a natural reward signal (profit), a reasonably tractable set of actions (setting prices), and a nice random state transition (customers purchase seats). Enter reinforcement learning.
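To make that formulation concrete, here is a rough sketch of a single-flight pricing problem as a tiny MDP, with tabular Q-learning on top. Everything in it is my own invention for illustration: the class name `SeatPricingEnv`, the three price points, the demand curve, and the assumption that late bookers tolerate higher prices. It also uses ticket revenue as a stand-in for profit, which only makes sense if costs are fixed. Treat it as a toy, not as how any airline actually does this.

```python
import random
from collections import defaultdict


class SeatPricingEnv:
    """Toy single-flight pricing MDP. All numbers are made-up assumptions."""

    def __init__(self, seats=10, days=20, prices=(100, 200, 300), seed=0):
        self.seats = seats
        self.days = days
        self.prices = prices
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # State: (seats still unsold, days left until departure).
        self.seats_left = self.seats
        self.days_left = self.days
        return (self.seats_left, self.days_left)

    def buy_probability(self, price):
        # Assumed demand curve: customers in the final week (think business
        # travelers) have a higher maximum willingness to pay.
        max_wtp = 500 if self.days_left <= 7 else 300
        return max(0.0, 1.0 - price / max_wtp)

    def step(self, action):
        # Action: index into the price list. At most one customer per day.
        price = self.prices[action]
        sold = self.seats_left > 0 and self.rng.random() < self.buy_probability(price)
        reward = price if sold else 0  # revenue as a proxy for profit
        self.seats_left -= int(sold)
        self.days_left -= 1
        # Unsold seats are perishable: the episode ends at departure.
        done = self.days_left == 0 or self.seats_left == 0
        return (self.seats_left, self.days_left), reward, done


def q_learning(env, episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.1):
    """Plain tabular Q-learning with epsilon-greedy exploration."""
    q = defaultdict(float)
    n_actions = len(env.prices)
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            if env.rng.random() < epsilon:
                action = env.rng.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda a: q[(state, a)])
            next_state, reward, done = env.step(action)
            best_next = 0.0 if done else max(
                q[(next_state, a)] for a in range(n_actions)
            )
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


if __name__ == "__main__":
    env = SeatPricingEnv()
    q = q_learning(env)
    start = env.reset()
    best = max(range(len(env.prices)), key=lambda a: q[(start, a)])
    print("Learned opening price:", env.prices[best])
```

The state here is just seats left and days left, which keeps the table small. A real problem would have many fare classes, connecting itineraries, and competitors, and that is exactly where it stops being a toy.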
UPDATE: A quick search of Google Scholar shows that some work has already been done in this area: link. link. It’s nice to know that I’m on the curve, even if I’m behind on it.