
Queues

29 Mar 2025

Image of people in an airport. AI-generated

We recently visited Vietnam and on the way back, departing from Hanoi, we had to stand in multiple queues: one for ticketing and luggage, one for immigration, one for X-ray screening, and finally one for boarding. We could even count the line down the aisle to our seats as one last queue.

This got me thinking about queues in data processing and reasons to use (or avoid) them. In this post I’d like to explore some of those trade-offs.

Benefits of Queues

Amortizing Irregular Throughput

In a producer-consumer pattern, the producer may emit data at an irregular rate, sometimes all at once, sometimes barely at all. The consumer might not be able to handle a sudden burst and could stall or drop data.

One solution is to introduce a queue as a buffer. If the consumer can’t process everything immediately, the data gets queued and handled later.

Of course, this only works if the consumer’s average throughput exceeds the producer’s in the long run. If the producer is consistently faster, the queue will eventually fill up.
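Here’s a minimal sketch of this pattern in Python (the rates, sizes and names are made up for illustration): a bounded queue absorbs the producer’s bursts, and the consumer drains it during the quiet periods.

```python
import queue
import threading
import time

buf = queue.Queue(maxsize=1000)  # bounded buffer between producer and consumer

def producer():
    # Bursty: emits 100 items at once, then goes quiet for a second.
    for burst in range(5):
        for i in range(100):
            buf.put(burst * 100 + i)
        time.sleep(1.0)
    buf.put(None)  # sentinel: no more data

def consumer():
    # Slower per item, but its average throughput beats the producer's,
    # so it catches up during the producer's idle gaps.
    while (item := buf.get()) is not None:
        time.sleep(0.005)  # simulate per-item work

threading.Thread(target=producer).start()
consumer()
```

If the consumer’s per-item work were slower than the producer’s long-run rate, the queue would eventually hit its `maxsize` and `put()` would block, which brings us back to the caveat above.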

Digression: The first time I heard the term amortization was in the analysis of algorithms. An algorithm might be slow in one iteration, but over the entire sequence of operations the average (amortized) complexity is much better.

The word amortize comes from the Latin ad mortire [1], meaning “to kill”, and was used in the context of extinguishing debt. It’s an apt metaphor in algorithm analysis if you think of computational complexity as a cost. In fact, some amortized analysis proofs explicitly use the concept of “debt”.

Checkpointing

Another reason to use queues is to persist intermediate data and avoid recomputation. Suppose we have a multi-stage data processing pipeline where the first stage is very expensive and the final stage is unreliable.

If we computed everything in a single pass, a failure in a later stage would require recomputing the more costly first stage. Technically, we don’t need a queue here, just persistence, but many queue implementations provide built-in durability.
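As an illustration, here’s a hedged sketch of this idea. The file `stage1_output.jsonl` and both stage functions are hypothetical stand-ins: the file plays the role of a durable queue, and once the expensive stage’s output is persisted, the unreliable stage can be retried without paying for the first stage again.

```python
import json
import os
import random

CHECKPOINT = "stage1_output.jsonl"  # stand-in for a durable queue/topic

def expensive_stage(items):
    # Imagine something costly (minutes of compute per batch).
    return [{"value": x * x} for x in items]

def unreliable_stage(record):
    # Fails sometimes; in a real pipeline this might be a flaky network call.
    if random.random() < 0.1:
        raise RuntimeError("transient failure")
    print("delivered", record)

def run(items):
    # Persist the expensive stage's output once; reruns skip it.
    if not os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "w") as f:
            for rec in expensive_stage(items):
                f.write(json.dumps(rec) + "\n")

    # The unreliable stage reads from the checkpoint, so retrying it
    # never triggers a recomputation of the first stage.
    with open(CHECKPOINT) as f:
        for line in f:
            record = json.loads(line)
            while True:
                try:
                    unreliable_stage(record)
                    break
                except RuntimeError:
                    pass  # retry just this record

run(range(10))
```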

Decoupling Parallelism

Continuing with the multi-stage pipeline example, suppose one of the stages becomes a bottleneck. Then we can increase the number of threads or machines allocated to processing it, while keeping the resources on the cheaper stages the same.

This has a clear analogy with the airport queues: ticketing and luggage tagging are slower processes than X-ray screening, so you’d expect more agents handling the former. Queues make that kind of decoupling possible.
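A sketch of what this decoupling might look like with in-process queues; the stage names, worker counts and sleep times are invented to match the airport analogy, but the point is that each stage is scaled independently:

```python
import queue
import threading
import time

ticketing_q = queue.Queue()   # feeds the slow stage
screening_q = queue.Queue()   # feeds the fast stage

def ticketing_worker():
    # Slow stage (simulated): gets more threads.
    while (passenger := ticketing_q.get()) is not None:
        time.sleep(0.04)
        screening_q.put(passenger)

def screening_worker():
    # Fast stage (simulated): one thread keeps up with four ticketing agents.
    while (passenger := screening_q.get()) is not None:
        time.sleep(0.01)

N_TICKETING, N_SCREENING = 4, 1  # scale each stage independently
ticketing = [threading.Thread(target=ticketing_worker) for _ in range(N_TICKETING)]
screening = [threading.Thread(target=screening_worker) for _ in range(N_SCREENING)]
for t in ticketing + screening:
    t.start()

for p in range(100):
    ticketing_q.put(p)
for _ in ticketing:               # one sentinel per ticketing thread
    ticketing_q.put(None)
for t in ticketing:
    t.join()
for _ in screening:               # then shut down the screening stage
    screening_q.put(None)
for t in screening:
    t.join()
```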

Observability

In a complex data processing pipeline it might be tricky to determine which stage is the bottleneck. We can measure how long a stage takes from receiving the data to sending it downstream, but that measurement might not be accurate if asynchronous computation is involved.

Queues can help. For example, if we design a queue to buffer occasional throughput spikes (see Amortizing Irregular Throughput), we can size it at double that estimate and add monitoring: if the queue is ever more than half-full, it indicates the consumer cannot process data quickly enough and is a bottleneck.
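A minimal sketch of that check, assuming the queue was sized at twice the expected burst (the constant and function names are invented):

```python
import queue

EXPECTED_BURST = 500                            # assumed sizing estimate
buf = queue.Queue(maxsize=2 * EXPECTED_BURST)   # double it, per the idea above

def consumer_is_bottleneck(q: queue.Queue) -> bool:
    # qsize() is approximate, but that's fine for a monitoring signal.
    depth = q.qsize()
    if depth > q.maxsize // 2:
        print(f"warning: queue at {depth}/{q.maxsize}; consumer can't keep up")
        return True
    return False
```

In a real system this check would run on a timer or feed a metrics dashboard rather than print.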

Downsides

Now that we’ve covered the benefits of using queues, let’s consider some downsides.

Processing Lag

Before filling up, queues might mask a bottlenecked consumer by absorbing data. Events will take longer to reach their destination because they stay parked in the queue. A more costly alternative would be to avoid queues and scale the whole processing system to handle peak load.
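One way to make this lag visible is to timestamp items as they enter the queue; a rough sketch:

```python
import queue
import time

buf = queue.Queue()

def enqueue(item):
    buf.put((time.monotonic(), item))   # stamp when the event entered the queue

def dequeue():
    enqueued_at, item = buf.get()
    lag = time.monotonic() - enqueued_at
    # Lag grows while the consumer falls behind, even before the queue fills up.
    return item, lag
```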

Memory

When the queue sits between processing units on the same machine, it uses memory. Memory usage becomes less predictable, and a burst in throughput can cause OOMs and stall processing.

When the machine is restarted there’s an ever-increasing backlog to catch up on, and this can turn into a vicious cycle. Overprovisioning for peak memory helps, but it can be wasteful.

Conclusion

We covered some pros and cons of using queues. In my experience, using queues between machines is often beneficial (e.g. in a MapReduce architecture), while for local, in-memory queues the trade-off is harder to call. I find the observability aspect particularly useful.

I don’t know much about Kafka, but I suspect learning more about it might shed light on additional trade-offs, especially for distributed queues. Queueing theory is a topic on its own, so studying it might also yield other insights.

References