Commit 4544e9c

selftests/sched_ext: Fix init_enable_count flakiness
The init_enable_count test is flaky. The test forks 1024 children before attaching the scheduler to verify that existing tasks get ops.init_task() called. The children were using sleep(1) before exiting.

7900aa6 ("sched_ext: Fix cgroup exit ordering by moving sched_ext_free() to finish_task_switch()") changed when tasks are removed from scx_tasks - previously when the task_struct was freed, now immediately in finish_task_switch() when the task dies.

Before the commit, pre-forked children would linger on scx_tasks until freed regardless of when they exited, so the scheduler would always see them during iteration. The sleep(1) was unnecessary.

After the commit, children are removed as soon as they die. The sleep(1) masks the problem in most cases but the test becomes flaky depending on timing.

Fix by synchronizing properly using a pipe. All children block on read() and the parent signals them to exit by closing the write end after attaching the scheduler. The children are auto-reaped so there's no need to wait on them.

Reported-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Cc: David Vernet <void@manifault.com>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Changwoo Min <changwoo@igalia.com>
Cc: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
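
The pipe-based synchronization described in the message can be tried outside the selftest harness. The following is a minimal sketch, not the selftest code: `release_children_via_pipe()` is a hypothetical helper name, and unlike the commit it waits on the children so the result can be counted. It relies only on the standard behavior that read() on a pipe returns 0 once every write end has been closed.

```c
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not part of the selftest): fork nr_children children
 * that block on a shared pipe, release them all by closing the write end,
 * and return how many exited cleanly after seeing EOF. Returns -1 on error.
 */
int release_children_via_pipe(int nr_children)
{
	int pipe_fds[2];
	int i, released = 0;

	if (pipe(pipe_fds) < 0)
		return -1;

	for (i = 0; i < nr_children; i++) {
		pid_t pid = fork();

		if (pid < 0) {
			/* Closing both ends lets already-forked children exit. */
			close(pipe_fds[0]);
			close(pipe_fds[1]);
			return -1;
		}
		if (pid == 0) {
			char buf;
			ssize_t n;

			/* Drop the inherited write end so EOF can be delivered. */
			close(pipe_fds[1]);
			/* Blocks until every write end is closed, then returns 0. */
			n = read(pipe_fds[0], &buf, 1);
			_exit(n == 0 ? 0 : 1);
		}
	}
	/* The parent never reads from the pipe. */
	close(pipe_fds[0]);

	/* This is the point where the selftest attaches the scheduler. */

	/* Closing the last write end wakes every blocked child. */
	close(pipe_fds[1]);

	for (i = 0; i < nr_children; i++) {
		int status;

		if (wait(&status) > 0 && WIFEXITED(status) &&
		    WEXITSTATUS(status) == 0)
			released++;
	}
	return released;
}
```

Calling `release_children_via_pipe(8)` from a small `main()` should report all eight children as released. The children close their own copy of the write end first; otherwise each child would hold the pipe open for the others and no EOF would ever arrive.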
1 parent 2e06d54 commit 4544e9c

1 file changed

Lines changed: 23 additions & 11 deletions

tools/testing/selftests/sched_ext/init_enable_count.c

@@ -4,6 +4,7 @@
  * Copyright (c) 2023 David Vernet <dvernet@meta.com>
  * Copyright (c) 2023 Tejun Heo <tj@kernel.org>
  */
+#include <signal.h>
 #include <stdio.h>
 #include <unistd.h>
 #include <sched.h>
@@ -23,6 +24,9 @@ static enum scx_test_status run_test(bool global)
 	int ret, i, status;
 	struct sched_param param = {};
 	pid_t pids[num_pre_forks];
+	int pipe_fds[2];
+
+	SCX_FAIL_IF(pipe(pipe_fds) < 0, "Failed to create pipe");
 
 	skel = init_enable_count__open();
 	SCX_FAIL_IF(!skel, "Failed to open");
@@ -38,26 +42,34 @@ static enum scx_test_status run_test(bool global)
 	 * ensure (at least in practical terms) that there are more tasks that
 	 * transition from SCHED_OTHER -> SCHED_EXT than there are tasks that
 	 * take the fork() path either below or in other processes.
+	 *
+	 * All children will block on read() on the pipe until the parent closes
+	 * the write end after attaching the scheduler, which signals all of
+	 * them to exit simultaneously. Auto-reap so we don't have to wait on
+	 * them.
 	 */
+	signal(SIGCHLD, SIG_IGN);
 	for (i = 0; i < num_pre_forks; i++) {
-		pids[i] = fork();
-		SCX_FAIL_IF(pids[i] < 0, "Failed to fork child");
-		if (pids[i] == 0) {
-			sleep(1);
+		pid_t pid = fork();
+
+		SCX_FAIL_IF(pid < 0, "Failed to fork child");
+		if (pid == 0) {
+			char buf;
+
+			close(pipe_fds[1]);
+			read(pipe_fds[0], &buf, 1);
+			close(pipe_fds[0]);
 			exit(0);
 		}
 	}
+	close(pipe_fds[0]);
 
 	link = bpf_map__attach_struct_ops(skel->maps.init_enable_count_ops);
 	SCX_FAIL_IF(!link, "Failed to attach struct_ops");
 
-	for (i = 0; i < num_pre_forks; i++) {
-		SCX_FAIL_IF(waitpid(pids[i], &status, 0) != pids[i],
-			    "Failed to wait for pre-forked child\n");
-
-		SCX_FAIL_IF(status != 0, "Pre-forked child %d exited with status %d\n", i,
-			    status);
-	}
+	/* Signal all pre-forked children to exit. */
+	close(pipe_fds[1]);
+	signal(SIGCHLD, SIG_DFL);
 
 	bpf_link__destroy(link);
 	SCX_GE(skel->bss->init_task_cnt, num_pre_forks);
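
The change drops the waitpid() loop because the children are auto-reaped while SIGCHLD is set to SIG_IGN. The sketch below illustrates that behavior in isolation, assuming Linux/POSIX.1-2001 semantics: a child that exits while its parent ignores SIGCHLD does not become a zombie, so waitpid() on it blocks until it is gone and then fails with ECHILD. `child_is_auto_reaped()` is a hypothetical helper name and is not part of the selftest.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: returns 1 if a child forked while SIGCHLD is ignored
 * was auto-reaped (waitpid() fails with ECHILD), 0 if it could still be
 * waited on, and -1 on error.
 */
int child_is_auto_reaped(void)
{
	void (*old_handler)(int);
	pid_t pid;
	int ret;

	/* Ask the kernel to reap children automatically. */
	old_handler = signal(SIGCHLD, SIG_IGN);
	if (old_handler == SIG_ERR)
		return -1;

	pid = fork();
	if (pid < 0) {
		signal(SIGCHLD, old_handler);
		return -1;
	}
	if (pid == 0)
		_exit(0);

	/* Blocks until the child has exited; no status is left to collect. */
	errno = 0;
	if (waitpid(pid, NULL, 0) < 0 && errno == ECHILD)
		ret = 1;
	else
		ret = 0;

	/* Restore the previous disposition, as the selftest does with SIG_DFL. */
	signal(SIGCHLD, old_handler);
	return ret;
}
```

On a system with these semantics the helper should return 1, which is why a waitpid()-based status check like the removed loop cannot be kept alongside SIG_IGN.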
