Skip to content

Commit a2225b6

Browse files
diandersDanilo Krummrich
authored andcommitted
driver core: Don't let a device probe until it's ready
The moment we link a "struct device" into the list of devices for the bus, it's possible probe can happen. This is because another thread can load the driver at any time and that can cause the device to probe. This has been seen in practice with a stack crawl that looks like this [1]: really_probe() __driver_probe_device() driver_probe_device() __driver_attach() bus_for_each_dev() driver_attach() bus_add_driver() driver_register() __platform_driver_register() init_module() [some module] do_one_initcall() do_init_module() load_module() __arm64_sys_finit_module() invoke_syscall() As a result of the above, it was seen that device_links_driver_bound() could be called for the device before "dev->fwnode->dev" was assigned. This prevented __fw_devlink_pickup_dangling_consumers() from being called which meant that other devices waiting on our driver's sub-nodes were stuck deferring forever. It's believed that this problem is showing up suddenly for two reasons: 1. Android has recently (last ~1 year) implemented an optimization to the order it loads modules [2]. When devices opt-in to this faster loading, modules are loaded one-after-the-other very quickly. This is unlike how other distributions do it. The reproduction of this problem has only been seen on devices that opt-in to Android's "parallel module loading". 2. Android devices typically opt-in to fw_devlink, and the most noticeable issue is the NULL "dev->fwnode->dev" in device_links_driver_bound(). fw_devlink is somewhat new code and also not in use by all Linux devices. Even though the specific symptom where "dev->fwnode->dev" wasn't assigned could be fixed by moving that assignment higher in device_add(), other parts of device_add() (like the call to device_pm_add()) are also important to run before probe. Only moving the "dev->fwnode->dev" assignment would likely fix the current symptoms but lead to difficult-to-debug problems in the future. Fix the problem by preventing probe until device_add() has run far enough that the device is ready to probe. If somehow we end up trying to probe before we're allowed, __driver_probe_device() will return -EPROBE_DEFER which will make certain the device is noticed. In the race condition that was seen with Android's faster module loading, we will temporarily add the device to the deferred list and then take it off immediately when device_add() probes the device. Instead of adding another flag to the bitfields already in "struct device", instead add a new "flags" field and use that. This allows us to freely change the bit from different thread without worrying about corrupting nearby bits (and means threads changing other bit won't corrupt us). [1] Captured on a machine running a downstream 6.6 kernel [2] https://cs.android.com/android/platform/superproject/main/+/main:system/core/libmodprobe/libmodprobe.cpp?q=LoadModulesParallel Cc: stable@vger.kernel.org Fixes: 2023c61 ("Driver core: add new device to bus's list before probing") Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patch.msgid.link/20260406162231.v5.1.Id750b0fbcc94f23ed04b7aecabcead688d0d8c17@changeid Signed-off-by: Danilo Krummrich <dakr@kernel.org>
1 parent c369299 commit a2225b6

3 files changed

Lines changed: 79 additions & 0 deletions

File tree

drivers/base/core.c

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3688,6 +3688,21 @@ int device_add(struct device *dev)
36883688
fw_devlink_link_device(dev);
36893689
}
36903690

3691+
/*
3692+
* The moment the device was linked into the bus's "klist_devices" in
3693+
* bus_add_device() then it's possible that probe could have been
3694+
* attempted in a different thread via userspace loading a driver
3695+
* matching the device. "ready_to_probe" being unset would have
3696+
* blocked those attempts. Now that all of the above initialization has
3697+
* happened, unblock probe. If probe happens through another thread
3698+
* after this point but before bus_probe_device() runs then it's fine.
3699+
* bus_probe_device() -> device_initial_probe() -> __device_attach()
3700+
* will notice (under device_lock) that the device is already bound.
3701+
*/
3702+
device_lock(dev);
3703+
dev_set_ready_to_probe(dev);
3704+
device_unlock(dev);
3705+
36913706
bus_probe_device(dev);
36923707

36933708
/*

drivers/base/dd.c

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -848,6 +848,26 @@ static int __driver_probe_device(const struct device_driver *drv, struct device
848848
if (dev->driver)
849849
return -EBUSY;
850850

851+
/*
852+
* In device_add(), the "struct device" gets linked into the subsystem's
853+
* list of devices and broadcast to userspace (via uevent) before we're
854+
* quite ready to probe. Those open pathways to driver probe before
855+
* we've finished enough of device_add() to reliably support probe.
856+
* Detect this and tell other pathways to try again later. device_add()
857+
* itself will also try to probe immediately after setting
858+
* "ready_to_probe".
859+
*/
860+
if (!dev_ready_to_probe(dev))
861+
return dev_err_probe(dev, -EPROBE_DEFER, "Device not ready to probe\n");
862+
863+
/*
864+
* Set can_match = true after calling dev_ready_to_probe(), so
865+
* driver_deferred_probe_add() won't actually add the device to the
866+
* deferred probe list when dev_ready_to_probe() returns false.
867+
*
868+
* When dev_ready_to_probe() returns false, it means that device_add()
869+
* will do another probe() attempt for us.
870+
*/
851871
dev->can_match = true;
852872
dev_dbg(dev, "bus: '%s': %s: matched device with driver %s\n",
853873
drv->bus->name, __func__, drv->name);

include/linux/device.h

Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -458,6 +458,21 @@ struct device_physical_location {
458458
bool lid;
459459
};
460460

461+
/**
462+
* enum struct_device_flags - Flags in struct device
463+
*
464+
* Each flag should have a set of accessor functions created via
465+
* __create_dev_flag_accessors() for each access.
466+
*
467+
* @DEV_FLAG_READY_TO_PROBE: If set then device_add() has finished enough
468+
* initialization that probe could be called.
469+
*/
470+
enum struct_device_flags {
471+
DEV_FLAG_READY_TO_PROBE = 0,
472+
473+
DEV_FLAG_COUNT
474+
};
475+
461476
/**
462477
* struct device - The basic device structure
463478
* @parent: The device's "parent" device, the device to which it is attached.
@@ -553,6 +568,7 @@ struct device_physical_location {
553568
* @dma_skip_sync: DMA sync operations can be skipped for coherent buffers.
554569
* @dma_iommu: Device is using default IOMMU implementation for DMA and
555570
* doesn't rely on dma_ops structure.
571+
* @flags: DEV_FLAG_XXX flags. Use atomic bitfield operations to modify.
556572
*
557573
* At the lowest level, every device in a Linux system is represented by an
558574
* instance of struct device. The device structure contains the information
@@ -675,8 +691,36 @@ struct device {
675691
#ifdef CONFIG_IOMMU_DMA
676692
bool dma_iommu:1;
677693
#endif
694+
695+
DECLARE_BITMAP(flags, DEV_FLAG_COUNT);
678696
};
679697

698+
#define __create_dev_flag_accessors(accessor_name, flag_name) \
699+
static inline bool dev_##accessor_name(const struct device *dev) \
700+
{ \
701+
return test_bit(flag_name, dev->flags); \
702+
} \
703+
static inline void dev_set_##accessor_name(struct device *dev) \
704+
{ \
705+
set_bit(flag_name, dev->flags); \
706+
} \
707+
static inline void dev_clear_##accessor_name(struct device *dev) \
708+
{ \
709+
clear_bit(flag_name, dev->flags); \
710+
} \
711+
static inline void dev_assign_##accessor_name(struct device *dev, bool value) \
712+
{ \
713+
assign_bit(flag_name, dev->flags, value); \
714+
} \
715+
static inline bool dev_test_and_set_##accessor_name(struct device *dev) \
716+
{ \
717+
return test_and_set_bit(flag_name, dev->flags); \
718+
}
719+
720+
__create_dev_flag_accessors(ready_to_probe, DEV_FLAG_READY_TO_PROBE);
721+
722+
#undef __create_dev_flag_accessors
723+
680724
/**
681725
* struct device_link - Device link representation.
682726
* @supplier: The device on the supplier end of the link.

0 commit comments

Comments
 (0)