Skip to content

Commit 0ca1a83

Browse files
Darrick J. Wongcmaiolino
authored andcommitted
xfs: fix race between healthmon unmount and read_iter
xfs/1879 on one of my test VMs got stuck due to the xfs_io healthmon subcommand sleeping in wait_event_interruptible at: xfs_healthmon_read_iter+0x558/0x5f8 [xfs] vfs_read+0x248/0x320 ksys_read+0x78/0x120 Looking at xfs_healthmon_read_iter, in !O_NONBLOCK mode it will sleep until the mount cookie == DETACHED_MOUNT_COOKIE, there are events waiting to be formatted, or there are formatted events in the read buffer that could be copied to userspace. Poking into the running kernel, I see that there are zero events in the list, the read buffer is empty, and the mount cookie is indeed in DETACHED state. IOWs, xfs_healthmon_has_eventdata should have returned true, but instead we're asleep waiting for a wakeup. I think what happened here is that xfs_healthmon_read_iter and xfs_healthmon_unmount were racing with each other, and _read_iter lost the race. _unmount queued an unmount event, which woke up _read_iter. It found, formatted, and copied the event out to userspace. That cleared out the pending event list and emptied the read buffer. xfs_io then called read() again, so _has_eventdata decided that we should sleep on the empty event queue. Next, _unmount called xfs_healthmon_detach, which set the mount cookie to DETACHED. Unfortunately, it didn't call wake_up_all on the hm, so the wait_event_interruptible in the _read_iter thread remains asleep. That's why the test stalled. Fix this by moving the wake_up_all call to xfs_healthmon_detach. Fixes: b3a289a ("xfs: create event queuing, formatting, and discovery infrastructure") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
1 parent 6270b8a commit 0ca1a83

1 file changed

Lines changed: 10 additions & 7 deletions

File tree

fs/xfs/xfs_healthmon.c

Lines changed: 10 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -141,6 +141,16 @@ xfs_healthmon_detach(
141141
hm->mount_cookie = DETACHED_MOUNT_COOKIE;
142142
spin_unlock(&xfs_healthmon_lock);
143143

144+
/*
145+
* Wake up any readers that might remain. This can happen if unmount
146+
* races with the healthmon fd owner entering ->read_iter, having
147+
* already emptied the event queue.
148+
*
149+
* In the ->release case there shouldn't be any readers because the
150+
* only users of the waiter are read and poll.
151+
*/
152+
wake_up_all(&hm->wait);
153+
144154
trace_xfs_healthmon_detach(hm);
145155
xfs_healthmon_put(hm);
146156
}
@@ -1027,13 +1037,6 @@ xfs_healthmon_release(
10271037
* process can create another health monitor file.
10281038
*/
10291039
xfs_healthmon_detach(hm);
1030-
1031-
/*
1032-
* Wake up any readers that might be left. There shouldn't be any
1033-
* because the only users of the waiter are read and poll.
1034-
*/
1035-
wake_up_all(&hm->wait);
1036-
10371040
xfs_healthmon_put(hm);
10381041
return 0;
10391042
}

0 commit comments

Comments
 (0)