Currently I am involved in transitioning a large corporation to more real-time processes. One question keeps recurring: how real-time is real-time? Are microsecond response times really adding to the bottom line? After all, consumers have contracts for a year or so. So why do it?
I have to admit, this question confused me quite a lot for some time. Gradually the mist has cleared and an answer has formed. It goes like this.
Listen, batch is really great. Most algorithms can be optimized in batch mode and put online. It works, and it fulfills quite a few needs. But still, you are going to lose out. These batches are produced once a month. By working hard, for quite a long time, you could probably run the batches every three or even two weeks. But here it comes.
The cost and complexity of speeding up batch processing is going to be really high. Basically, you are going to move more data, faster, between databases. It will break. For every company, there is a breaking point for batch. Beyond this breaking point, the only paradigm that is going to save you is real-time. It might be that your algorithms only need updating once a week. If that is past your breaking point, then once a week is real-time for you. Simple.
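To make the contrast a bit more concrete, here is a minimal sketch of the two paradigms, using a plain running average as a stand-in for a real algorithm. The names `batch_mean` and `StreamingMean` and the event values are made up for illustration; nothing here comes from an actual production setup.

```python
from dataclasses import dataclass


def batch_mean(values):
    """Batch style: recompute the statistic from the full history on every run."""
    return sum(values) / len(values)


@dataclass
class StreamingMean:
    """Streaming style: keep a small running state and update it per event."""
    count: int = 0
    mean: float = 0.0

    def update(self, value):
        self.count += 1
        # Incremental update: earlier events never have to be re-read.
        self.mean += (value - self.mean) / self.count
        return self.mean


# Hypothetical event values, invented purely for illustration.
events = [12.0, 15.0, 9.0, 20.0]

stream = StreamingMean()
for seen, value in enumerate(events, start=1):
    stream.update(value)
    # To stay equally current, the batch job has to touch all `seen` events again.
    assert abs(stream.mean - batch_mean(events[:seen])) < 1e-9
```

Both approaches give the same number. The difference is that the batch version re-reads the whole history each time you want a fresher answer, while the streaming version only looks at the new event. That is, roughly, why pushing batch to run more and more often gets expensive.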
Did I mention that a lot of batch-type algorithms find it really hard to model sequences through time? And that deep learning allows you to combine convolutional and LSTM networks? Trust me, the future is streaming; real-time.