When one actor in the network fails and shuts down its own channels, any actor that is not connected to those channels has no way to learn that the network is done. It stays blocked on its next channel->receive() or channel->send() waiting for a message or a consumer that will never come.
Reproducer
--- a/cpp/tests/streaming/test_error_handling.cpp
+++ b/cpp/tests/streaming/test_error_handling.cpp
+TEST_F(StreamingErrorHandling, UnconnectedActorCancelledOnFailure) {
+ // Actor A fails immediately.
+ auto ch_b = ctx->create_channel();
+ std::vector<Actor> actors;
+
+ // Actor A: fails immediately.
+ actors.push_back([](Context& ctx) -> Actor {
+ co_await ctx.executor()->schedule();
+ throw std::runtime_error("actor_a_failure");
+ }(*ctx));
+
+ // Actor B is unconnected to A but blocks
+ actors.push_back(
+ [](std::shared_ptr<Channel> ch_b) -> Actor {
+ std::ignore = co_await ch_b->receive();
+ }(ch_b)
+ );
+
+ try {
+ run_actor_network(std::move(actors), ctx);
+ FAIL() << "Expected std::runtime_error";
+ } catch (std::runtime_error const& e) {
+ EXPECT_THAT(e.what(), ::testing::HasSubstr("actor_a_failure"));
+ }
+}
One approach to a fix is to add a method to Context: cancel_network (for example) that broadcasts a shutdown to every channel and memory reservation the context has ever created, and to call it automatically whenever any actor in the network fails.
When one actor in the network fails and shuts down its own channels, any actor that is not connected to those channels has no way to learn that the network is done. It stays blocked on its next
channel->receive()orchannel->send()waiting for a message or a consumer that will never come.Reproducer
One approach to a fix is to add a method to
Context:cancel_network(for example) that broadcasts a shutdown to every channel and memory reservation the context has ever created, and to call it automatically whenever any actor in the network fails.