bugfix: prevent SIGSEGV in event timer rbtree during worker shutdown #2497

Merged
zhuizhuhaomeng merged 1 commit into openresty:master from climagabriel:fix-segv-rbtree-on-worker-shutdown on May 7, 2026

@climagabriel
Contributor

Summary

Worker shutdown can SIGSEGV inside ngx_rbtree_min / ngx_rbtree_delete when there are in-flight cosocket userdata at the time lua_close() runs.

ngx_worker_process_exit() calls ngx_destroy_pool(cycle->pool), which fires the cycle's cleanup handlers. One of these is ngx_http_lua_cleanup_vm, which calls lua_close() on the worker's Lua VM. lua_close() runs LuaJIT GC finalizers (__gc) on every live userdata, including TCP socket cosocket userdata still bound to in-flight requests.
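The call chain above can be modelled in a few lines of plain C. This is only a sketch for readers unfamiliar with the shutdown path: every `model_*` name is hypothetical and stands in for the corresponding nginx / LuaJIT piece (pool cleanup list, Lua VM, userdata with a `__gc` finalizer). It is not code from nginx or from this patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a pool cleanup entry: a handler plus its data. */
typedef void (*model_cleanup_pt)(void *data);

typedef struct model_cleanup_s {
    model_cleanup_pt         handler;
    void                    *data;
    struct model_cleanup_s  *next;
} model_cleanup_t;

/* Hypothetical stand-in for the cycle pool: only its cleanup list matters. */
typedef struct {
    model_cleanup_t  *cleanup;
} model_pool_t;

/* Hypothetical stand-in for a Lua userdata that has a __gc finalizer. */
typedef struct model_udata_s {
    int                    finalized;
    struct model_udata_s  *next;
} model_udata_t;

/* Hypothetical stand-in for the worker's Lua VM. */
typedef struct {
    model_udata_t  *live;     /* userdata still alive, e.g. in-flight cosockets */
    int             closed;
} model_vm_t;

/* Register a cleanup handler on the pool (newest entry goes first). */
static void
model_pool_cleanup_add(model_pool_t *pool, model_cleanup_t *c,
    model_cleanup_pt handler, void *data)
{
    c->handler = handler;
    c->data = data;
    c->next = pool->cleanup;
    pool->cleanup = c;
}

/* Models lua_close(): every live userdata gets its finalizer run. */
static void
model_vm_close(model_vm_t *vm)
{
    model_udata_t  *u;

    for (u = vm->live; u != NULL; u = u->next) {
        u->finalized = 1;     /* this is where the socket __gc would run */
    }

    vm->live = NULL;
    vm->closed = 1;
}

/* Models ngx_http_lua_cleanup_vm: the cleanup handler that closes the VM. */
static void
model_cleanup_vm(void *data)
{
    model_vm_close((model_vm_t *) data);
}

/* Models ngx_destroy_pool(): fire every registered cleanup handler. */
static void
model_destroy_pool(model_pool_t *pool)
{
    model_cleanup_t  *c;

    for (c = pool->cleanup; c != NULL; c = c->next) {
        if (c->handler) {
            c->handler(c->data);
        }
    }

    pool->cleanup = NULL;
}
```

The point of the model is that the finalizers run as a side effect of destroying the pool, not because any request finished, so they execute at a moment when the rest of the worker is already being dismantled.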

The __gc finalizers ngx_http_lua_socket_tcp_upstream_destroy and ngx_http_lua_socket_downstream_destroy reach ngx_http_lua_socket_tcp_finalize_{read,write}_part, which call ngx_del_timer() on connection events. By that point the event timer rbtree may already have been partially torn down by sibling finalizers, so ngx_del_timer() corrupts the tree and the next deletion segfaults inside ngx_rbtree_min / ngx_rbtree_delete.

Fix

Skip the cleanup when ngx_terminate or ngx_exiting is set: the cycle pool is about to be destroyed and the kernel reclaims fds, so unwinding events and timers from the __gc path is unnecessary and unsafe.

The same guard is already used in ngx_http_lua_socket_tcp_setkeepalive() for the analogous case (cosocket put into the pool while the worker is shutting down).
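For reviewers who want the shape of the guard without opening the diff, here is a minimal standalone sketch. It is not the patch itself: `model_terminate` and `model_exiting` are local stand-ins for nginx's `ngx_terminate` and `ngx_exiting` flags, and `model_socket_t` / `model_socket_cleanup` are hypothetical names invented so the example compiles on its own. Where exactly the check sits in the real finalizer path is best read from the commit.

```c
#include <signal.h>
#include <stddef.h>

/*
 * Stand-ins for nginx's process-wide shutdown flags. They are defined
 * locally (an assumption of this sketch) so the file builds without
 * the nginx headers.
 */
static volatile sig_atomic_t  model_terminate = 0;  /* fast shutdown requested */
static volatile sig_atomic_t  model_exiting = 0;    /* graceful shutdown under way */

/* Hypothetical, pared-down cosocket state. */
typedef struct {
    int  timer_set;   /* 1 while a timer is registered for this socket */
    int  finalized;   /* 1 once the socket has been fully torn down */
} model_socket_t;

/*
 * Models the cleanup reached from the __gc finalizer.
 * Returns 1 if the teardown ran, 0 if it was skipped by the guard.
 */
static int
model_socket_cleanup(model_socket_t *s)
{
    if (model_terminate || model_exiting) {
        /*
         * The worker is going away: the pool is about to be destroyed
         * and the kernel reclaims the fds, so leave timers and events
         * untouched instead of unwinding them here.
         */
        return 0;
    }

    if (s->timer_set) {
        s->timer_set = 0;     /* stands in for ngx_del_timer() */
    }

    s->finalized = 1;

    return 1;
}
```

During normal operation the cleanup behaves exactly as before; only once a shutdown flag is set does the finalizer become a no-op, which is the trade-off the Fix section argues for.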

Reproducer

Reproduces under SIGTERM and SIGQUIT shutdown when there are in-flight ngx.socket.tcp() / ngx.req.socket() coroutines — e.g. long-lived ngx.sleep + tcpsock:receive() loops.

Crash signature

#0  ngx_rbtree_min                                src/core/ngx_rbtree.h
#1  ngx_rbtree_delete                             src/core/ngx_rbtree.c
#2  ngx_event_del_timer
#3  ngx_http_lua_socket_tcp_finalize_read_part
#4  ngx_http_lua_socket_tcp_finalize
#5  ngx_http_lua_socket_tcp_cleanup
#6  ngx_http_lua_socket_tcp_upstream_destroy      (Lua __gc)
#7+ LuaJIT GC sweep
    ngx_http_lua_cleanup_vm
    ngx_destroy_pool(cycle->pool)
    ngx_worker_process_exit

ngx_http_lua_socket_downstream_destroy produces the same chain via the request-socket finalizer instead of the upstream finalizer.

Test plan

  • Builds clean against upstream/master (bbace592).
  • No regression in t/ test suite.
  • Smoke-tested nginx -s stop / nginx -s quit with active cosockets: no SIGSEGV.

zhuizhuhaomeng merged commit 86b5f1e into openresty:master on May 7, 2026
7 checks passed