Responsive Retrieval of Consistent States in Pipelined Executions of Dataflows

Ni, Shengquan; Li, Chen

2025.

Many modern data-analytics workflows (i.e., dataflows) run on distributed data-processing systems to support real-time tasks such as fraud detection and product recommendation. During the execution of a workflow, users often want to inspect the internal states of operators or the in-flight tuples between operators (i.e., on the edges of the workflow) to understand runtime behavior for analysis and debugging. Existing methods designed for fault tolerance can support state retrieval, but their response can be slow because they rely on propagating a special message (a.k.a. a “barrier” or “marker”) along the edges. In this paper, we study how to retrieve a consistent state with low latency during a pipelined execution of a workflow. We consider two cases: one where the user does not need to retrieve edge states, and another where the user pauses the execution. For each case, we leverage its unique properties to develop a novel retrieval method that does not require barrier propagation on the edges. We compare these two methods with three existing checkpointing-based methods in experiments on real datasets and workflows, and show that the two novel methods achieve lower state-retrieval latency.
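To illustrate why barrier propagation can be slow, the following is a minimal sketch of a generic marker-based snapshot on a linear pipeline. It is not the method proposed in the paper, and the names (`MARKER`, `tuples_processed_before_marker`) are hypothetical. It assumes each operator emits exactly one output per input tuple and reads its input queue in FIFO order, so the marker must wait behind every tuple already queued ahead of it.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical sentinel standing in for a barrier/marker


def tuples_processed_before_marker(backlogs):
    """Simulate marker propagation through a linear pipeline.

    backlogs[i] is the number of tuples already queued at operator i when
    the marker is injected at the first operator. Assumes one-in/one-out
    operators. Returns, per operator, how many tuples it processes before
    the marker reaches it (the point where it could record its state).
    """
    queues = [deque(range(n)) for n in backlogs]
    queues[0].append(MARKER)  # marker enters behind the first backlog
    processed = []
    for i, queue in enumerate(queues):
        downstream = queues[i + 1] if i + 1 < len(queues) else None
        count = 0
        while True:
            item = queue.popleft()
            if downstream is not None:
                downstream.append(item)  # forward tuple or marker in order
            if item == MARKER:
                break  # operator i would snapshot its state here
            count += 1
        processed.append(count)
    return processed


if __name__ == "__main__":
    print(tuples_processed_before_marker([3, 5, 2]))
```

Under these assumptions the delay accumulates along the pipeline: with backlogs of 3, 5, and 2 tuples, the last operator handles all 10 queued tuples before it sees the marker, which is the kind of latency a retrieval method without barrier propagation aims to avoid.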

ACM
