Achilles, the tortoise and batch processing

Batch processing is my bête noire. Getting rid of it is my white whale.

Why? Because it represents one of the major blockers in our progress towards a modern digital enterprise.

My history of working with batches has followed the pattern of this conversation:

“How often do you need this information loaded?” 

“It’s from finance – we only need it monthly…”

[Implements monthly processing]

“Actually, we’re now comparing it with some information from another system which comes in weekly.”

[Processing now done weekly]

“Actually, daily would be helpful to keep in step with our morning meeting …”

[Increase frequency to daily]

“Can we do it more often, say, every six hours?”

“We can’t wait for six hours – we need this for order processing”

[Hourly processing put into play]

“We have people relying on this – they can’t keep checking back hourly …”

[Batch increased to every 15 minutes and then I give up]

These days, everyone wants everything immediately, in real time – because that’s the rate at which digital businesses operate.

And that’s the way we should be building systems from now on. But old habits die hard – lots of us still implement batch processing and then increase the frequency in order to simulate real time information flows. But, like Zeno’s paradox in which Achilles never overtakes the tortoise, you never actually get to real time when you implement it this way. 

You have to reset your thinking and start implementing your information flows as a stream of events. That way you get true real time information flowing out of your system right from the outset. Your programming paradigms and technology will have to change, of course, but on top of the simplicity of event-driven programming itself, you get some serious advantages:

  1. You eliminate waste. There are lots of types of waste, but you eliminate the most important one: the waste of people’s time. Whenever you batch up changes, you force people to keep coming back to the systems they use to see whether the changes have registered. Remember refreshing the COVID vaccination site’s appointment booking form? Every unsuccessful refresh was wasted time. It would have been much simpler if the site had just notified you when a free slot appeared – or, even better, booked it for you. Despite this recent example of queued information sharing, many of us are still implementing systems that refresh overnight, forcing our users to play a guessing game about when what they need will be available.
  2. It spreads your processing out. You’re dealing with events as they happen in the system, and you can match your capacity to relatively predictable patterns, such as when people are busiest. This is ideal for systems designed to scale horizontally.
    People will tell you that some processes just can’t be managed in this way. I used to work a lot with financial systems, for example, and month end was always anxious finger-crossing that everything would work out while we crammed four days of processing into three calendar days – followed by frantic manual fixing when it didn’t. But even with finance, you can do a lot incrementally if you set your mind to it. Lots of downstream systems would appreciate that information as soon as it’s available. And who wouldn’t want a just-in-time profit and loss statement, available at any time of the month?
  3. Breaking up processing like this into lots of incremental events is also very forgiving. If something goes wrong, you’ve lost a few events – not a whole month’s batch processing. It’s possible, with a bit of careful design, to replay history and the event processing that goes with it. You can recover those missing events by rewinding. You may not even have to do that: some event streams are probably quite tolerant of an interruption – does it really matter if you lose 10 minutes of web traffic stats?
  4. I reckon it’s less code overall. Implementing the code to send changes downstream when someone saves a record is pretty trivial on most platforms these days. Batch processing requires you to work out, post hoc, what’s changed and what should be included.
  5. These days it’s often easier to implement these data flows through a point-to-point API. When you really need your marketing leads to get into the hands of your sellers, the native integration between marketing automation platforms and your CRM is going to be the fastest. In the days when we were all afraid of integrations, we’d often send outgoing events from one system to a bus for distribution. The bus, with its store-and-forward architecture, centralised the translation between formats. It also looked cleaner on an architecture chart, but it wasn’t cleaner in real life. “Storing and forwarding” is just queuing, which is waste again. If the integration between systems doesn’t happen in real time, triggered by events, then it’s time to ask your vendor why not.
  6. Event processing enables multi-casting. Lots of downstream systems can listen for the change events and do their own processing in parallel. The sending system only has to make those events available on the event stream once. Batch processing has typically involved making multiple copies of the same information and sending them to multiple places.
  7. Most importantly, real time transmission just matches the way that people expect to use a system. When I send a mail, I expect it to show up in my recipient’s mailbox almost immediately. I don’t expect it to show up 5 or 30 minutes later (as my old Notes replication used to do). When I update a record in my sales system, I expect to see that change as quickly as I can switch to my dashboard.
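To make a couple of those points concrete – emitting a change the moment a record is saved (point 4) and letting many consumers listen in parallel (point 6) – here’s a minimal, hypothetical sketch in Python. The `EventBus`, `save_record`, and handler names are all illustrative, not any particular product’s API:

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """A tiny in-process publish/subscribe hub (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # The sender emits the event once; every subscriber reacts to it.
        for handler in self._subscribers[topic]:
            handler(event)

bus = EventBus()
audit_log: list[dict] = []      # downstream consumer 1
dashboard: dict[str, dict] = {} # downstream consumer 2

# Two independent downstream systems listening to the same stream.
bus.subscribe("record.saved", lambda e: audit_log.append(e))
bus.subscribe("record.saved", lambda e: dashboard.update({e["id"]: e}))

def save_record(record: dict) -> None:
    """Persist a record, then notify downstream systems immediately."""
    # ... write to the primary store here ...
    bus.publish("record.saved", record)

save_record({"id": "42", "status": "won"})
```

The shape is the point: the change travels downstream the moment it happens, and adding another consumer is one `subscribe` call rather than another copy of a batch file.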

Changing the way you implement data flows in your applications from batch to real time requires a change in how you look at things – just like resolving Zeno’s paradox. Even if your source systems aren’t changing – they’re still sending you files – you can change your frame of reference to treat the arrival of the file as an event. Then when the source system’s designers see the light and start sending you data as a stream of events, you’re already set up to receive them.
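One way to sketch that reframing – again hypothetical, with the directory layout and handler made up for illustration: rather than scheduling a nightly job, treat each file’s arrival as an event and hand it straight to a handler.

```python
from pathlib import Path
from typing import Callable

def watch_for_arrivals(inbox: Path, seen: set[str],
                       on_arrival: Callable[[Path], None]) -> None:
    """One polling pass: any file we haven't seen before is an 'arrival' event.

    In production you'd run this on a short timer, or better, use an
    OS-level notification (inotify, FSEvents). The frame of reference is
    the point: the file arriving IS the event, not the nightly schedule.
    """
    for path in sorted(inbox.iterdir()):
        if path.is_file() and path.name not in seen:
            seen.add(path.name)
            on_arrival(path)  # process immediately, not at 2 a.m.

# Example: process a finance extract the moment it lands.
import tempfile

inbox = Path(tempfile.mkdtemp())
(inbox / "finance_2024-05.csv").write_text("account,amount\n100,9.99\n")

processed: list[str] = []
watch_for_arrivals(inbox, seen=set(), on_arrival=lambda p: processed.append(p.name))
```

Once the downstream code is written against arrival events like this, switching the source from dropped files to a true event stream is a change of transport, not a rewrite.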

If you enjoyed this article, keep an eye out for future facile conflations of philosophy and information technology such as:

  • Why systems of engagement are like the shadows in Plato’s cave, and
  • How developer testing is susceptible to the same arguments that Wittgenstein uses to undermine the concept of a private language.