Commit b133207

rv: Add nomiss deadline monitor
Add the deadline monitors collection to validate the deadline scheduler,
both for deadline tasks and servers.

The currently implemented monitors are:
* nomiss:
    validate dl entities run to completion before their deadline

Reviewed-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lore.kernel.org/r/20260330111010.153663-13-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
1 parent c85dbdd commit b133207

13 files changed

Lines changed: 839 additions & 0 deletions


Documentation/trace/rv/index.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -17,3 +17,4 @@ Runtime Verification
1717
monitor_sched.rst
1818
monitor_rtapp.rst
1919
monitor_stall.rst
20+
monitor_deadline.rst

Documentation/trace/rv/monitor_deadline.rst

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
1+
Deadline monitors
2+
=================
3+
4+
- Name: deadline
5+
- Type: container for multiple monitors
6+
- Author: Gabriele Monaco <gmonaco@redhat.com>
7+
8+
Description
9+
-----------
10+
11+
The deadline monitor is a set of specifications to describe the deadline
12+
scheduler behaviour. It includes monitors per scheduling entity (deadline tasks
13+
and servers) that work independently to verify different specifications the
14+
deadline scheduler should follow.
15+
16+
Specifications
17+
--------------
18+
19+
Monitor nomiss
20+
~~~~~~~~~~~~~~
21+
22+
The nomiss monitor ensures dl entities get to run *and* run to completion
23+
before their deadline, although deferrable servers may not run. An entity is
24+
considered done if ``throttled``, either because it yielded or used up its
25+
runtime, or when it voluntarily starts ``sleeping``.
26+
The monitor includes a user configurable deadline threshold. If the total
27+
utilisation of deadline tasks is larger than 1, they are only guaranteed
28+
bounded tardiness. See Documentation/scheduler/sched-deadline.rst for more
29+
details. The threshold (module parameter ``nomiss.deadline_thresh``) can be
30+
configured to prevent the monitor from failing, based on the acceptable tardiness in
31+
the system. Since ``dl_throttle`` is a valid outcome for the entity to be done,
32+
the minimum tardiness needs to be 1 tick to account for the throttle delay, unless
33+
the ``HRTICK_DL`` scheduler feature is active.
34+
35+
Servers also have an intermediate ``idle`` state, occurring as soon as no
36+
runnable task is available from ready or running, where no timing constraint
37+
is applied. A server goes to sleep by stopping; there is no wakeup equivalent
38+
as the order of a server starting and replenishing is not defined, hence a
39+
server can run from sleeping without being ready::
40+
41+
|
42+
sched_wakeup v
43+
dl_replenish;reset(clk) -- #=========================#
44+
| H H dl_replenish;reset(clk)
45+
+-----------> H H <--------------------+
46+
H H |
47+
+- dl_server_stop ---- H ready H |
48+
| +-----------------> H clk < DEADLINE_NS() H dl_throttle; |
49+
| | H H is_defer == 1 |
50+
| | sched_switch_in - H H -----------------+ |
51+
| | | #=========================# | |
52+
| | | | ^ | |
53+
| | | dl_server_idle dl_replenish;reset(clk) | |
54+
| | | v | | |
55+
| | | +--------------+ | |
56+
| | | +------ | | | |
57+
| | | dl_server_idle | | dl_throttle | |
58+
| | | | | idle | -----------------+ | |
59+
| | | +-----> | | | | |
60+
| | | | | | | |
61+
| | | | | | | |
62+
+--+--+---+--- dl_server_stop -- +--------------+ | | |
63+
| | | | | ^ | | |
64+
| | | | sched_switch_in dl_server_idle | | |
65+
| | | | v | | | |
66+
| | | | +---------- +---------------------+ | | |
67+
| | | | sched_switch_in | | | | |
68+
| | | | sched_wakeup | | | | |
69+
| | | | dl_replenish; | running | -------+ | | |
70+
| | | | reset(clk) | clk < DEADLINE_NS() | | | | |
71+
| | | | +---------> | | dl_throttle | | |
72+
| | | +----------------> | | | | | |
73+
| | | +---------------------+ | | | |
74+
| | sched_wakeup ^ sched_switch_suspend | | | |
75+
v v dl_replenish;reset(clk) | dl_server_stop | | | |
76+
+--------------+ | | v v v |
77+
| | - sched_switch_in + | +---------------+
78+
| | <---------------------+ dl_throttle +-- | |
79+
| sleeping | sched_wakeup | | throttled |
80+
| | -- dl_server_stop dl_server_idle +-> | |
81+
| | dl_server_idle sched_switch_suspend +---------------+
82+
+--------------+ <---------+ ^
83+
| |
84+
+------ dl_throttle;is_constr_dl == 1 || is_defer == 1 ------+
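
The timing constraint described for nomiss can be pictured with a small userspace sketch. It is not part of the patch: the struct, the state names, and the way the threshold is added to the relative deadline are assumptions made for illustration, based only on the text and on the `clk < DEADLINE_NS()` invariant shown for the ready and running states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified states named after the nomiss diagram. */
enum nomiss_state { READY, RUNNING, IDLE, THROTTLED, SLEEPING };

/* Hypothetical per-entity data; not the layout used by the kernel monitor. */
struct nomiss_sketch {
	enum nomiss_state state;
	uint64_t replenish_ns;	/* time of the last dl_replenish (clk reset) */
	uint64_t deadline_ns;	/* relative deadline of the entity */
};

/* dl_replenish;reset(clk): restart the clock measured by the invariant. */
static inline void on_replenish(struct nomiss_sketch *m, uint64_t now_ns)
{
	m->replenish_ns = now_ns;
}

/*
 * Only ready and running carry the "clk < DEADLINE_NS()" invariant.
 * Assumption: the user threshold is simply added to the relative deadline.
 */
static inline bool invariant_holds(const struct nomiss_sketch *m,
				   uint64_t now_ns, uint64_t thresh_ns)
{
	uint64_t clk;

	assert(now_ns >= m->replenish_ns);
	clk = now_ns - m->replenish_ns;

	if (m->state != READY && m->state != RUNNING)
		return true;	/* idle, throttled, sleeping: no constraint */
	return clk < m->deadline_ns + thresh_ns;
}
```

With a relative deadline of 1000 ns and a replenish at t = 500 ns, an entity still in ready or running at t = 1500 ns violates the invariant with a zero threshold, but not with a threshold of 100 ns. This is the kind of tolerance the text attributes to ``nomiss.deadline_thresh``.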

kernel/trace/rv/Kconfig

Lines changed: 4 additions & 0 deletions
@@ -79,6 +79,10 @@ source "kernel/trace/rv/monitors/sleep/Kconfig"
7979
# Add new rtapp monitors here
8080

8181
source "kernel/trace/rv/monitors/stall/Kconfig"
82+
source "kernel/trace/rv/monitors/deadline/Kconfig"
83+
source "kernel/trace/rv/monitors/nomiss/Kconfig"
84+
# Add new deadline monitors here
85+
8286
# Add new monitors here
8387

8488
config RV_REACTORS

kernel/trace/rv/Makefile

Lines changed: 2 additions & 0 deletions
@@ -18,6 +18,8 @@ obj-$(CONFIG_RV_MON_NRP) += monitors/nrp/nrp.o
1818
obj-$(CONFIG_RV_MON_SSSW) += monitors/sssw/sssw.o
1919
obj-$(CONFIG_RV_MON_OPID) += monitors/opid/opid.o
2020
obj-$(CONFIG_RV_MON_STALL) += monitors/stall/stall.o
21+
obj-$(CONFIG_RV_MON_DEADLINE) += monitors/deadline/deadline.o
22+
obj-$(CONFIG_RV_MON_NOMISS) += monitors/nomiss/nomiss.o
2123
# Add new monitors here
2224
obj-$(CONFIG_RV_REACTORS) += rv_reactors.o
2325
obj-$(CONFIG_RV_REACT_PRINTK) += reactor_printk.o

kernel/trace/rv/monitors/deadline/Kconfig

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
1+
config RV_MON_DEADLINE
2+
depends on RV
3+
bool "deadline monitor"
4+
help
5+
Collection of monitors to check the deadline scheduler and server
6+
behave according to specifications. Enable this to enable all
7+
scheduler specifications supported by the current kernel.
8+
9+
For further information, see:
10+
Documentation/trace/rv/monitor_deadline.rst

kernel/trace/rv/monitors/deadline/deadline.c

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
1+
// SPDX-License-Identifier: GPL-2.0
2+
#include <linux/kernel.h>
3+
#include <linux/module.h>
4+
#include <linux/init.h>
5+
#include <linux/rv.h>
6+
#include <linux/kallsyms.h>
7+
8+
#define MODULE_NAME "deadline"
9+
10+
#include "deadline.h"
11+
12+
struct rv_monitor rv_deadline = {
13+
.name = "deadline",
14+
.description = "container for several deadline scheduler specifications.",
15+
.enable = NULL,
16+
.disable = NULL,
17+
.reset = NULL,
18+
.enabled = 0,
19+
};
20+
21+
/* Used by other monitors */
22+
struct sched_class *rv_ext_sched_class;
23+
24+
static int __init register_deadline(void)
25+
{
26+
if (IS_ENABLED(CONFIG_SCHED_CLASS_EXT)) {
27+
rv_ext_sched_class = (void *)kallsyms_lookup_name("ext_sched_class");
28+
if (!rv_ext_sched_class)
29+
pr_warn("rv: Missing ext_sched_class, monitors may not work.\n");
30+
}
31+
return rv_register_monitor(&rv_deadline, NULL);
32+
}
33+
34+
static void __exit unregister_deadline(void)
35+
{
36+
rv_unregister_monitor(&rv_deadline);
37+
}
38+
39+
module_init(register_deadline);
40+
module_exit(unregister_deadline);
41+
42+
MODULE_LICENSE("GPL");
43+
MODULE_AUTHOR("Gabriele Monaco <gmonaco@redhat.com>");
44+
MODULE_DESCRIPTION("deadline: container for several deadline scheduler specifications.");

kernel/trace/rv/monitors/deadline/deadline.h

Lines changed: 202 additions & 0 deletions
@@ -0,0 +1,202 @@
1+
/* SPDX-License-Identifier: GPL-2.0 */
2+
3+
#include <linux/kernel.h>
4+
#include <linux/uaccess.h>
5+
#include <linux/sched/deadline.h>
6+
#include <asm/syscall.h>
7+
#include <uapi/linux/sched/types.h>
8+
#include <trace/events/sched.h>
9+
10+
/*
11+
* Dummy values if not available
12+
*/
13+
#ifndef __NR_sched_setscheduler
14+
#define __NR_sched_setscheduler -__COUNTER__
15+
#endif
16+
#ifndef __NR_sched_setattr
17+
#define __NR_sched_setattr -__COUNTER__
18+
#endif
19+
20+
extern struct rv_monitor rv_deadline;
21+
/* Initialised when registering the deadline container */
22+
extern struct sched_class *rv_ext_sched_class;
23+
24+
/*
25+
* If both have dummy values, the syscalls are not supported and we don't even
26+
* need to register the handler.
27+
*/
28+
static inline bool should_skip_syscall_handle(void)
29+
{
30+
return __NR_sched_setattr < 0 && __NR_sched_setscheduler < 0;
31+
}
32+
33+
/*
34+
* is_supported_type - return true if @type is supported by the deadline monitors
35+
*/
36+
static inline bool is_supported_type(u8 type)
37+
{
38+
return type == DL_TASK || type == DL_SERVER_FAIR || type == DL_SERVER_EXT;
39+
}
40+
41+
/*
42+
* is_server_type - return true if @type is a supported server
43+
*/
44+
static inline bool is_server_type(u8 type)
45+
{
46+
return is_supported_type(type) && type != DL_TASK;
47+
}
48+
49+
/*
50+
* Use negative numbers for the server.
51+
* Currently only one fair server per CPU, may change in the future.
52+
*/
53+
#define fair_server_id(cpu) (-(cpu))
54+
#define ext_server_id(cpu) (-(cpu) - num_possible_cpus())
55+
#define NO_SERVER_ID (-2 * num_possible_cpus())
56+
/*
57+
* Get a unique id used for dl entities
58+
*
59+
* The cpu is not required for tasks, as the pid is used there. If this function
60+
* is called on a dl_se that for sure corresponds to a task, DL_TASK can be
61+
* used in place of cpu.
62+
* We need the cpu for servers as it is provided in the tracepoint and we
63+
* cannot easily retrieve it from the dl_se (requires the struct rq definition).
64+
*/
65+
static inline int get_entity_id(struct sched_dl_entity *dl_se, int cpu, u8 type)
66+
{
67+
if (dl_server(dl_se) && type != DL_TASK) {
68+
if (type == DL_SERVER_FAIR)
69+
return fair_server_id(cpu);
70+
if (type == DL_SERVER_EXT)
71+
return ext_server_id(cpu);
72+
return NO_SERVER_ID;
73+
}
74+
return dl_task_of(dl_se)->pid;
75+
}
76+
77+
static inline bool task_is_scx_enabled(struct task_struct *tsk)
78+
{
79+
return IS_ENABLED(CONFIG_SCHED_CLASS_EXT) &&
80+
tsk->sched_class == rv_ext_sched_class;
81+
}
82+
83+
/* Expand id and target as arguments for da functions */
84+
#define EXPAND_ID(dl_se, cpu, type) get_entity_id(dl_se, cpu, type), dl_se
85+
#define EXPAND_ID_TASK(tsk) get_entity_id(&tsk->dl, task_cpu(tsk), DL_TASK), &tsk->dl
86+
87+
static inline u8 get_server_type(struct task_struct *tsk)
88+
{
89+
if (tsk->policy == SCHED_NORMAL || tsk->policy == SCHED_EXT ||
90+
tsk->policy == SCHED_BATCH || tsk->policy == SCHED_IDLE)
91+
return task_is_scx_enabled(tsk) ? DL_SERVER_EXT : DL_SERVER_FAIR;
92+
return DL_OTHER;
93+
}
94+
95+
static inline int extract_params(struct pt_regs *regs, long id, pid_t *pid_out)
96+
{
97+
size_t size = offsetofend(struct sched_attr, sched_flags);
98+
struct sched_attr __user *uattr, attr;
99+
int new_policy = -1, ret;
100+
unsigned long args[6];
101+
102+
switch (id) {
103+
case __NR_sched_setscheduler:
104+
syscall_get_arguments(current, regs, args);
105+
*pid_out = args[0];
106+
new_policy = args[1];
107+
break;
108+
case __NR_sched_setattr:
109+
syscall_get_arguments(current, regs, args);
110+
*pid_out = args[0];
111+
uattr = (struct sched_attr __user *)args[1];
112+
/*
113+
* Just copy up to sched_flags, we are not interested after that
114+
*/
115+
ret = copy_struct_from_user(&attr, size, uattr, size);
116+
if (ret)
117+
return ret;
118+
if (attr.sched_flags & SCHED_FLAG_KEEP_POLICY)
119+
return -EINVAL;
120+
new_policy = attr.sched_policy;
121+
break;
122+
default:
123+
return -EINVAL;
124+
}
125+
126+
return new_policy & ~SCHED_RESET_ON_FORK;
127+
}
128+
129+
/* Helper functions requiring DA/HA utilities */
130+
#ifdef RV_MON_TYPE
131+
132+
/*
133+
* get_server - get the server of type @type associated to a task
134+
*
135+
* If the task is a boosted task, the server is available in the task_struct,
136+
* otherwise grab the dl entity saved for the CPU where the task is enqueued.
137+
* This function assumes the task is enqueued somewhere.
138+
*/
139+
static inline struct sched_dl_entity *get_server(struct task_struct *tsk, u8 type)
140+
{
141+
if (tsk->dl_server && get_server_type(tsk) == type)
142+
return tsk->dl_server;
143+
if (type == DL_SERVER_FAIR)
144+
return da_get_target_by_id(fair_server_id(task_cpu(tsk)));
145+
if (type == DL_SERVER_EXT)
146+
return da_get_target_by_id(ext_server_id(task_cpu(tsk)));
147+
return NULL;
148+
}
149+
150+
/*
151+
* Initialise monitors for all tasks and pre-allocate the storage for servers.
152+
* This is necessary since we don't have access to the servers here and
153+
* allocation can cause deadlocks from their tracepoints. We can only fill
154+
* pre-initialised storage from there.
155+
*/
156+
static inline int init_storage(bool skip_tasks)
157+
{
158+
struct task_struct *g, *p;
159+
int cpu;
160+
161+
for_each_possible_cpu(cpu) {
162+
if (!da_create_empty_storage(fair_server_id(cpu)))
163+
goto fail;
164+
if (IS_ENABLED(CONFIG_SCHED_CLASS_EXT) &&
165+
!da_create_empty_storage(ext_server_id(cpu)))
166+
goto fail;
167+
}
168+
169+
if (skip_tasks)
170+
return 0;
171+
172+
read_lock(&tasklist_lock);
173+
for_each_process_thread(g, p) {
174+
if (p->policy == SCHED_DEADLINE) {
175+
if (!da_create_storage(EXPAND_ID_TASK(p), NULL)) {
176+
read_unlock(&tasklist_lock);
177+
goto fail;
178+
}
179+
}
180+
}
181+
read_unlock(&tasklist_lock);
182+
return 0;
183+
184+
fail:
185+
da_monitor_destroy();
186+
return -ENOMEM;
187+
}
188+
189+
static void __maybe_unused handle_newtask(void *data, struct task_struct *task, u64 flags)
190+
{
191+
/* Might be superfluous as tasks are not started with this policy.. */
192+
if (task->policy == SCHED_DEADLINE)
193+
da_create_storage(EXPAND_ID_TASK(task), NULL);
194+
}
195+
196+
static void __maybe_unused handle_exit(void *data, struct task_struct *p, bool group_dead)
197+
{
198+
if (p->policy == SCHED_DEADLINE)
199+
da_destroy_storage(get_entity_id(&p->dl, DL_TASK, DL_TASK));
200+
}
201+
202+
#endif
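
The id scheme used by `get_entity_id()` in this header can be checked with a small userspace sketch. It is not part of the patch: `NUM_CPUS` is a stand-in for `num_possible_cpus()`, and `classify_id()` is a hypothetical helper added only to show that the id ranges do not overlap.

```c
#include <assert.h>

/* Assumption: stand-in for num_possible_cpus(), fixed for illustration. */
#define NUM_CPUS 4

/* Same arithmetic as fair_server_id(), ext_server_id() and NO_SERVER_ID. */
static inline int fair_id(int cpu)
{
	assert(cpu >= 0 && cpu < NUM_CPUS);
	return -cpu;
}

static inline int ext_id(int cpu)
{
	assert(cpu >= 0 && cpu < NUM_CPUS);
	return -cpu - NUM_CPUS;
}

static inline int no_server_id(void)
{
	return -2 * NUM_CPUS;
}

/* Hypothetical helper: map an id back to the range it belongs to. */
enum id_kind { ID_TASK, ID_FAIR_SERVER, ID_EXT_SERVER, ID_NONE };

static inline enum id_kind classify_id(int id)
{
	if (id > 0)
		return ID_TASK;		/* positive ids are pids */
	if (id > -NUM_CPUS)
		return ID_FAIR_SERVER;	/* 0 .. -(NUM_CPUS - 1) */
	if (id > -2 * NUM_CPUS)
		return ID_EXT_SERVER;	/* -NUM_CPUS .. -(2 * NUM_CPUS - 1) */
	return ID_NONE;			/* NO_SERVER_ID and below */
}
```

Tasks keep their positive pid, fair servers occupy zero and the first `NUM_CPUS - 1` negative values, ext servers the next `NUM_CPUS` values, and `NO_SERVER_ID` sits just past both ranges. The fair server of CPU 0 gets id 0, which in this sketch is treated as a server id rather than a pid.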

kernel/trace/rv/monitors/nomiss/Kconfig

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
1+
# SPDX-License-Identifier: GPL-2.0-only
2+
#
3+
config RV_MON_NOMISS
4+
depends on RV
5+
depends on HAVE_SYSCALL_TRACEPOINTS
6+
depends on RV_MON_DEADLINE
7+
default y
8+
select HA_MON_EVENTS_ID
9+
bool "nomiss monitor"
10+
help
11+
Monitor to ensure dl entities run to completion before their deadline.
12+
This monitor is part of the deadline monitors collection.
13+
14+
For further information, see:
15+
Documentation/trace/rv/monitor_deadline.rst
