linux.git - linus torvalds my love

Age	Commit message (Collapse)	Author
2026-03-17	driver core: generalize driver_override in struct device	Danilo Krummrich
	Currently, there are 12 busses (including platform and PCI) that duplicate the driver_override logic for their individual devices. All of them seem to be prone to the bug described in [1]. While this could be solved for every bus individually using a separate lock, solving this in the driver-core generically results in less (and cleaner) changes overall. Thus, move driver_override to struct device, provide corresponding accessors for busses and handle locking with a separate lock internally. In particular, add device_set_driver_override(), device_has_driver_override(), device_match_driver_override() and generalize the sysfs store() and show() callbacks via a driver_override feature flag in struct bus_type. Until all busses have migrated, keep driver_set_override() in place. Note that we can't use the device lock for the reasons described in [2]. Link: https://bugzilla.kernel.org/show_bug.cgi?id=220789 [1] Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [2] Tested-by: Gui-Dong Han <hanguidong02@gmail.com> Co-developed-by: Gui-Dong Han <hanguidong02@gmail.com> Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20260303115720.48783-2-dakr@kernel.org [ Use dev->bus instead of sp->bus for consistency; fix commit message to refer to the struct bus_type's driver_override feature flag. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-03	Revert "driver core: enforce device_lock for driver_match_device()"	Danilo Krummrich
	This reverts commit dc23806a7c47 ("driver core: enforce device_lock for driver_match_device()") and commit 289b14592cef ("driver core: fix inverted "locked" suffix of driver_match_device()"). While technically correct, there is a major downside to this approach: When a device is already present in the system and a driver is registered on the same bus, we iterate over all devices registered on this bus to see if one of them matches. If we come across an already bound one where the corresponding driver crashed while holding the device lock (e.g. in probe()) we can't make any progress anymore. However, drivers are typically the least tested code in the kernel and hence it is a case that is likely to happen regularly. Besides hurting developer ergonomics, it potentially decreases chances of shutting things down cleanly and obtaining logs in production environments as well [1]. This came up in the context of a firewire bug, which only in combination with the reverted commit, caused the machine to hang [2]. Additionally, it was observed in [3]. Thus, revert commit dc23806a7c47 ("driver core: enforce device_lock for driver_match_device()") and add a brief note clarifying that an implementer of struct bus_type must not expect match() to be called with the device lock held. Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1] Link: https://lore.kernel.org/all/67f655bb-4d81-4609-b008-68d200255dd2@davidgow.net/ [2] Link: https://lore.kernel.org/lkml/CALbr=LZ4v7N=tO1vgOsyj9AS+XuNbn6kG-QcF+PacdMjSo0iyw@mail.gmail.com/ [3] Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Closes: https://lore.kernel.org/driver-core/CAHk-=wgJ_L1C=HjcYJotg_zrZEmiLFJaoic+PWthjuQrutrfJw@mail.gmail.com/ Reviewed-by: Gui-Dong Han <hanguidong02@gmail.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20260302002545.19389-1-dakr@kernel.org [ Add additional Link: reference. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-02-01	driver core: fix inverted "locked" suffix of driver_match_device()	Danilo Krummrich
	In the current implementation driver_match_device() expects the device lock to be held, while driver_match_device_locked() acquires the device lock. By convention it should be the other way around, hence swap the name of both functions. Fixes: dc23806a7c47 ("driver core: enforce device_lock for driver_match_device()") Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Gui-Dong Han <hanguidong02@gmail.com> Link: https://patch.msgid.link/20260131014211.12841-1-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-01-16	driver core: enforce device_lock for driver_match_device()	Gui-Dong Han
	Currently, driver_match_device() is called from three sites. One site (__device_attach_driver) holds device_lock(dev), but the other two (bind_store and __driver_attach) do not. This inconsistency means that bus match() callbacks are not guaranteed to be called with the lock held. Fix this by introducing driver_match_device_locked(), which guarantees holding the device lock using a scoped guard. Replace the unlocked calls in bind_store() and __driver_attach() with this new helper. Also add a lock assertion to driver_match_device() to enforce this guarantee. This consistency also fixes a known race condition. The driver_override implementation relies on the device_lock, so the missing lock led to the use-after-free (UAF) reported in Bugzilla for buses using this field. Stress testing the two newly locked paths for 24 hours with CONFIG_PROVE_LOCKING and CONFIG_LOCKDEP enabled showed no UAF recurrence and no lockdep warnings. Cc: stable@vger.kernel.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789 Suggested-by: Qiu-ji Chen <chenqiuji666@gmail.com> Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com> Fixes: 49b420a13ff9 ("driver core: check bus->match without holding device lock") Reviewed-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Link: https://patch.msgid.link/20260113162843.12712-1-hanguidong02@gmail.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-01-16	rust: driver: drop device private data post unbind	Danilo Krummrich
	Currently, the driver's device private data is allocated and initialized from driver core code called from bus abstractions after the driver's probe() callback returned the corresponding initializer. Similarly, the driver's device private data is dropped within the remove() callback of bus abstractions after calling the remove() callback of the corresponding driver. However, commit 6f61a2637abe ("rust: device: introduce Device::drvdata()") introduced an accessor for the driver's device private data for a Device<Bound>, i.e. a device that is currently bound to a driver. Obviously, this is in conflict with dropping the driver's device private data in remove(), since a device can not be considered to be fully unbound after remove() has finished: We also have to consider registrations guarded by devres - such as IRQ or class device registrations - which are torn down after remove() in devres_release_all(). Thus, it can happen that, for instance, a class device or IRQ callback still calls Device::drvdata(), which then runs concurrently to remove() (which sets dev->driver_data to NULL and drops the driver's device private data), before devres_release_all() started to tear down the corresponding registration. This is because devres guarded registrations can, as expected, access the corresponding Device<Bound> that defines their scope. In C it simply is the driver's responsibility to ensure that its device private data is freed after e.g. an IRQ registration is unregistered. Typically, C drivers achieve this by allocating their device private data with e.g. devm_kzalloc() before doing anything else, i.e. before e.g. registering an IRQ with devm_request_threaded_irq(), relying on the reverse order cleanup of devres. Technically, we could do something similar in Rust. However, the resulting code would be pretty messy: In Rust we have to differentiate between allocated but uninitialized memory and initialized memory in the type system. Thus, we would need to somehow keep track of whether the driver's device private data object has been initialized (i.e. probe() was successful and returned a valid initializer for this memory) and conditionally call the destructor of the corresponding object when it is freed. This is because we'd need to allocate and register the memory of the driver's device private data before it is initialized by the initializer returned by the driver's probe() callback, because the driver could already register devres guarded registrations within probe() outside of the driver's device private data initializer. Luckily there is a much simpler solution: Instead of dropping the driver's device private data at the end of remove(), we just drop it after the device has been fully unbound, i.e. after all devres callbacks have been processed. For this, we introduce a new post_unbind() callback private to the driver-core, i.e. the callback is neither exposed to drivers, nor to bus abstractions. This way, the driver-core code can simply continue to conditionally allocate the memory for the driver's device private data when the driver's initializer is returned from probe() - no change needed - and drop it when the driver-core code receives the post_unbind() callback. Closes: https://lore.kernel.org/all/DEZMS6Y4A7XE.XE7EUBT5SJFJ@kernel.org/ Fixes: 6f61a2637abe ("rust: device: introduce Device::drvdata()") Acked-by: Alice Ryhl <aliceryhl@google.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Igor Korotin <igor.korotin.linux@gmail.com> Link: https://patch.msgid.link/20260107103511.570525-7-dakr@kernel.org [ Remove #ifdef CONFIG_RUST, rename post_unbind() to post_unbind_rust(). - Danilo] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-11-26	driver core: Check drivers_autoprobe for all added devices	Vincent Liu
	When a device is hot-plugged, the drivers_autoprobe sysfs attribute is not checked (at least for PCI devices). This means that drivers_autoprobe is not working as intended, e.g. hot-plugged PCI devices will still be autoprobed and bound to drivers even with drivers_autoprobe disabled. The problem likely started when device_add() was removed from pci_bus_add_device() in commit 4f535093cf8f ("PCI: Put pci_dev in device tree as early as possible") which means that the check for drivers_autoprobe which used to happen in bus_probe_device() is no longer present (previously bus_add_device() calls bus_probe_device()). Conveniently, in commit 91703041697c ("PCI: Allow built-in drivers to use async initial probing") device_attach() was replaced with device_initial_probe() which faciliates this change to push the check for drivers_autoprobe into device_initial_probe(). Make sure all devices check drivers_autoprobe by pushing the drivers_autoprobe check into device_initial_probe(). This will only affect devices on the PCI bus for now as device_initial_probe() is only called by pci_bus_add_device() and bus_probe_device(), but bus_probe_device() already checks for autoprobe, so callers of bus_probe_device() should not observe changes on autoprobing. Note also that pushing this check into device_initial_probe() rather than device_attach() makes it only affect automatic probing of drivers (e.g. when a device is hot-plugged), userspace can still choose to manually bind a driver by writing to drivers_probe sysfs attribute, even with autoprobe disabled. Any future callers of device_initial_probe() will respect the drivers_autoprobe sysfs attribute, which is the intended purpose of drivers_autoprobe. Signed-off-by: Vincent Liu <vincent.liu@nutanix.com> Link: https://patch.msgid.link/20251022120740.2476482-1-vincent.liu@nutanix.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-26	driver core: replace use of system_unbound_wq with system_dfl_wq	Marco Crivellari
	Currently if a user enqueue a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistentcy cannot be addressed without refactoring the API. This continues the effort to refactor workqueue APIs, which began with the introduction of new workqueues and a new alloc_workqueue flag in: commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq") commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag") Switch to using system_dfl_wq because system_unbound_wq is going away as part of a workqueue restructuring. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Link: https://patch.msgid.link/20251114141618.172154-2-marco.crivellari@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-07	PM: domains: Detach on device_unbind_cleanup()	Claudiu Beznea
	The dev_pm_domain_attach() function is typically used in bus code alongside dev_pm_domain_detach(), often following patterns like: static int bus_probe(struct device _dev) { struct bus_driver drv = to_bus_driver(dev->driver); struct bus_device dev = to_bus_device(_dev); int ret; // ... ret = dev_pm_domain_attach(_dev, true); if (ret) return ret; if (drv->probe) ret = drv->probe(dev); // ... } static void bus_remove(struct device _dev) { struct bus_driver drv = to_bus_driver(dev->driver); struct bus_device dev = to_bus_device(_dev); if (drv->remove) drv->remove(dev); dev_pm_domain_detach(_dev); } When the driver's probe function uses devres-managed resources that depend on the power domain state, those resources are released later during device_unbind_cleanup(). Releasing devres-managed resources that depend on the power domain state after detaching the device from its PM domain can cause failures. For example, if the driver uses devm_pm_runtime_enable() in its probe function, and the device's clocks are managed by the PM domain, then during removal the runtime PM is disabled in device_unbind_cleanup() after the clocks have been removed from the PM domain. It may happen that the devm_pm_runtime_enable() action causes the device to be runtime- resumed. If the driver specific runtime PM APIs access registers directly, this will lead to accessing device registers without clocks being enabled. Similar issues may occur with other devres actions that access device registers. Add detach_power_off member to struct dev_pm_info, to be used later in device_unbind_cleanup() as the power_off argument for dev_pm_domain_detach(). This is a preparatory step toward removing dev_pm_domain_detach() calls from bus remove functions. Since the current PM domain detach functions (genpd_dev_pm_detach() and acpi_dev_pm_detach()) already set dev->pm_domain = NULL, there should be no issues with bus drivers that still call dev_pm_domain_detach() in their remove functions. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://patch.msgid.link/20250703112708.1621607-3-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-04-15	driver core: introduce device_set_driver() helper	Dmitry Torokhov
	In preparation to closing a race when reading driver pointer in dev_uevent() code, instead of setting device->driver pointer directly introduce device_set_driver() helper. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20250311052417.1846985-2-dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-27	Merge tag 'driver-core-6.12-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is a small set of patches for the driver core code for 6.12-rc1. This set is the one that caused the most delay on my side, due to lots of last-minute reports of problems in the async shutdown feature that was added. In the end, I've reverted all of the patches in that series so we are back to "normal" and the patch set is being reworked for the next merge window. Other than the async shutdown patches that were reverted, included in here are: - minor driver core cleanups - minor driver core bus and class api cleanups and simplifications for some callbacks - some const markings of structures - other even more minor cleanups All of these, including the last minute reverts, have been in linux-next, but all of the reports of problems in linux-next were before the reverts happened. After the reverts, all is good" * tag 'driver-core-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits) Revert "driver core: don't always lock parent in shutdown" Revert "driver core: separate function to shutdown one device" Revert "driver core: shut down devices asynchronously" Revert "nvme-pci: Make driver prefer asynchronous shutdown" Revert "driver core: fix async device shutdown hang" driver core: fix async device shutdown hang driver core: attribute_container: Remove unused functions driver core: Trivially simplify ((struct device_private *)curr)->device->p to @curr devres: Correclty strip percpu address space of devm_free_percpu() argument driver core: Make parameter check consistent for API cluster device_(for_each\|find)_child() bus: fsl-mc: make fsl_mc_bus_type const nvme-pci: Make driver prefer asynchronous shutdown driver core: shut down devices asynchronously driver core: separate function to shutdown one device driver core: don't always lock parent in shutdown platform: Make platform_bus_type constant driver core: class: Check namespace relevant parameters in class_register() driver:base:core: Adding a "Return:" line in comment for device_link_add() drivers/base: Introduce device_match_t for device finding APIs firmware_loader: Block path traversal ...
2024-09-11	driver core: Trivially simplify ((struct device_private *)curr)->device->p ↵	Zijun Hu
	to @curr Trivially simplify ((struct device_private *)curr)->device->p to @curr in deferred_devs_show() since both are same. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20240908-trivial_simpli-v1-1-53e0f1363299@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-29	platform: Add test managed platform_device/driver APIs	Stephen Boyd
	Introduce KUnit resource wrappers around platform_driver_register(), platform_device_alloc(), and platform_device_add() so that test authors can register platform drivers/devices from their tests and have the drivers/devices automatically be unregistered when the test is done. This makes test setup code simpler when a platform driver or platform device is needed. Add a few test cases at the same time to make sure the APIs work as intended. Cc: Brendan Higgins <brendan.higgins@linux.dev> Reviewed-by: David Gow <davidgow@google.com> Cc: Rae Moar <rmoar@google.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20240718210513.3801024-6-sboyd@kernel.org
2024-06-20	driver core: make [device_]driver_attach take a const *	Greg Kroah-Hartman
	Change device_driver_attach() and driver_attach() to take a const * to struct device driver as neither of them modify the structure at all. Also, for some odd reason, drivers/dma/idxd/compat.c had a duplicate external reference to device_driver_attach(), so remove that to fix up the build, it should never have had that there in the first place. Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vinod Koul <vkoul@kernel.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Petr Tesarik <petr.tesarik.ext@huawei.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: dmaengine@vger.kernel.org Link: https://lore.kernel.org/r/2024061401-rasping-manger-c385@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-13	driver core: mark async_driver as a const *	Greg Kroah-Hartman
	Within struct device_private, mark the async_driver * as const as it is never modified. This requires some internal-to-the-driver-core functions to also have their parameters marked as constant, and there is one place where we cast _back_ from the const pointer to a real one, as the driver core still wants to modify the structure in a number of remaining places. Cc: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20240611130103.3262749-12-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-13	driver core: make driver_detach() take a const *	Greg Kroah-Hartman
	driver_detach() does not modify the driver itself, so make the pointer constant. In doing so, the function driver_allows_async_probing() also needs to be changed so that the pointer type passes through to that function properly. Cc: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20240611130103.3262749-11-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-13	driver core: make device_release_driver_internal() take a const *	Greg Kroah-Hartman
	Change device_release_driver_internal() to take a const struct device_driver * as it is not modifying it at all. Cc: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20240611130103.3262749-10-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-07	device: core: Log warning for devices pending deferred probe on timeout	Nícolas F. R. A. Prado
	Once the deferred probe timeout has elapsed it is very likely that the devices that are still deferring probe won't ever be probed. Therefore log the defer probe pending reason at the warning level instead to bring attention to the issue. Signed-off-by: "Nícolas F. R. A. Prado" <nfraprado@collabora.com> Link: https://lore.kernel.org/r/20240305-device-probe-error-v1-3-a06d8722bf19@collabora.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-07	driver: core: Use dev_* instead of pr_* so device metadata is added	Nícolas F. R. A. Prado
	Use the dev_* instead of the pr_* functions to log the status of device probe so that the log message gets the device metadata attached to it. Signed-off-by: "Nícolas F. R. A. Prado" <nfraprado@collabora.com> Link: https://lore.kernel.org/r/20240305-device-probe-error-v1-2-a06d8722bf19@collabora.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-07	driver: core: Log probe failure as error and with device metadata	Nícolas F. R. A. Prado
	Drivers can return -ENODEV or -ENXIO from their probe to reject a device match, and return -EPROBE_DEFER if probe should be retried. Any other error code is not expected during normal behavior and indicates an issue occurred, so it should be logged at the error level. Also make use of the device variant, dev_err(), so that the device metadata is attached to the log message. Signed-off-by: "Nícolas F. R. A. Prado" <nfraprado@collabora.com> Link: https://lore.kernel.org/r/20240305-device-probe-error-v1-1-a06d8722bf19@collabora.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-07	driver core: Emit reason for pending deferred probe	Uwe Kleine-König
	Ending a boot log with platform 3f202000.mmc: deferred probe pending is already a nice hint about the problem. Sometimes there is a more detailed error indicator available, add that to the output. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20231122093332.274145-2-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-21	driver core: Release all resources during unbind before updating device links	Saravana Kannan
	This commit fixes a bug in commit 9ed9895370ae ("driver core: Functional dependencies tracking support") where the device link status was incorrectly updated in the driver unbind path before all the device's resources were released. Fixes: 9ed9895370ae ("driver core: Functional dependencies tracking support") Cc: stable <stable@kernel.org> Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Closes: https://lore.kernel.org/all/20231014161721.f4iqyroddkcyoefo@pengutronix.de/ Signed-off-by: Saravana Kannan <saravanak@google.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Mark Brown <broonie@kernel.org> Cc: Matti Vaittinen <mazziesaccount@gmail.com> Cc: James Clark <james.clark@arm.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org> Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20231018013851.3303928-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-05	driver core: Call dma_cleanup() on the test_remove path	Jason Gunthorpe
	When test_remove is enabled really_probe() does not properly pair dma_configure() with dma_remove(), it will end up calling dma_configure() twice. This corrupts the owner_cnt and renders the group unusable with VFIO/etc. Add the missing cleanup before going back to re_probe. Fixes: 25f3bcfc54bc ("driver core: Add dma_cleanup callback in bus_type") Reported-by: Zenghui Yu <yuzenghui@huawei.com> Tested-by: Zenghui Yu <yuzenghui@huawei.com> Closes: https://lore.kernel.org/all/6472f254-c3c4-8610-4a37-8d9dfdd54ce8@huawei.com/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/0-v2-4deed94e283e+40948-really_probe_dma_cleanup_jgg@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-05	driver core: return bool from driver_probe_done	Christoph Hellwig
	bool is the most sensible return value for a yes/no return. Also add __init as this funtion is only called from the early boot code. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20230531125535.676098-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-20	driver core: Don't require dynamic_debug for initcall_debug probe timing	Stephen Boyd
	Don't require the use of dynamic debug (or modification of the kernel to add a #define DEBUG to the top of this file) to get the printk message about driver probe timing. This printk is only emitted when initcall_debug is enabled on the kernel commandline, and it isn't immediately obvious that you have to do something else to debug boot timing issues related to driver probe. Add a comment too so it doesn't get converted back to pr_debug(). Fixes: eb7fbc9fb118 ("driver core: Add missing '\n' in log messages") Cc: stable <stable@kernel.org> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Brian Norris <briannorris@chromium.org> Reviewed-by: Brian Norris <briannorris@chromium.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20230412225842.3196599-1-swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10	driver core: Make state_synced device attribute writeable	Saravana Kannan
	If the file is written to and sync_state() hasn't been called for the device yet, then call sync_state() for the device independent of the state of its consumers. This is useful for supplier devices that have one or more consumers that don't have a driver but the consumers are in a state that don't use the resources supplied by the supplier device. This gives finer grained control than using the fw_devlink.sync_state=timeout kernel commandline parameter. Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20230304005355.746421-3-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10	driver core: Add fw_devlink.sync_state command line param	Saravana Kannan
	When all devices that could probe have finished probing (based on deferred_probe_timeout configuration or late_initcall() when !CONFIG_MODULES), this parameter controls what to do with devices that haven't yet received their sync_state() calls. fw_devlink.sync_state=strict is the default and the driver core will continue waiting on all consumers of a device to probe successfully before sync_state() is called for the device. This is the default behavior since calling sync_state() on a device when all its consumers haven't probed could make some systems unusable/unstable. When this option is selected, we also print the list of devices that haven't had sync_state() called on them by the time all devices the could probe have finished probing. fw_devlink.sync_state=timeout will cause the driver core to give up waiting on consumers and call sync_state() on any devices that haven't yet received their sync_state() calls. This option is provided for systems that won't become unusable/unstable as they might be able to save power (depends on state of hardware before kernel starts) if all devices get their sync_state(). Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20230304005355.746421-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-08	drivers: base: dd: fix memory leak with using debugfs_lookup()	Greg Kroah-Hartman
	When calling debugfs_lookup() the result must have dput() called on it, otherwise the memory will leak over time. To make things simpler, just call debugfs_lookup_and_remove() instead which handles all of the logic at once. Cc: "Rafael J. Wysocki" <rafael@kernel.org> Link: https://lore.kernel.org/r/20230202141621.2296458-2-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18	driver core: bus: move bus notifier logic into bus.c	Greg Kroah-Hartman
	The logic to touch the bus notifier was open-coded in numberous places in the driver core. Clean that up by creating a local bus_notify() function and have everyone call this function instead, making the reading of the caller code simpler and easier to maintain over time. Reviewed-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20230111092331.3946745-2-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-11	driver core: Make driver_deferred_probe_timeout a static variable	Javier Martinez Canillas
	It is not used outside of its compilation unit, so there's no need to export this variable. Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Acked-by: John Stultz <jstultz@google.com> Link: https://lore.kernel.org/r/20221227232152.3094584-1-javierm@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-10	driver core: Fix bus_type.match() error handling in __driver_attach()	Isaac J. Manjarres
	When a driver registers with a bus, it will attempt to match with every device on the bus through the __driver_attach() function. Currently, if the bus_type.match() function encounters an error that is not -EPROBE_DEFER, __driver_attach() will return a negative error code, which causes the driver registration logic to stop trying to match with the remaining devices on the bus. This behavior is not correct; a failure while matching a driver to a device does not mean that the driver won't be able to match and bind with other devices on the bus. Update the logic in __driver_attach() to reflect this. Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()") Cc: stable@vger.kernel.org Cc: Saravana Kannan <saravanak@google.com> Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com> Link: https://lore.kernel.org/r/20220921001414.4046492-1-isaacmanjarres@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-10	driver core: mark driver_allows_async_probing static	Christoph Hellwig
	driver_allows_async_probing is only used in drivers/base/dd.c, so mark it static and remove the declaration in drivers/base/base.h. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221030092255.872280-1-hch@lst.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-12	Merge 6.0-rc5 into driver-core-next	Greg Kroah-Hartman
	We need the driver core and debugfs changes in this branch. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-01	driver_core: move from strlcpy with unused retval to strscpy	Wolfram Sang
	Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20220818205956.6528-1-wsa+renesas@sang-engineering.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-01	driver core: Don't probe devices after bus_type.match() probe deferral	Isaac J. Manjarres
	Both __device_attach_driver() and __driver_attach() check the return code of the bus_type.match() function to see if the device needs to be added to the deferred probe list. After adding the device to the list, the logic attempts to bind the device to the driver anyway, as if the device had matched with the driver, which is not correct. If __device_attach_driver() detects that the device in question is not ready to match with a driver on the bus, then it doesn't make sense for the device to attempt to bind with the current driver or continue attempting to match with any of the other drivers on the bus. So, update the logic in __device_attach_driver() to reflect this. If __driver_attach() detects that a driver tried to match with a device that is not ready to match yet, then the driver should not attempt to bind with the device. However, the driver can still attempt to match and bind with other devices on the bus, as drivers can be bound to multiple devices. So, update the logic in __driver_attach() to reflect this. Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()") Cc: stable@vger.kernel.org Cc: Saravana Kannan <saravanak@google.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Saravana Kannan <saravanak@google.com> Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com> Link: https://lore.kernel.org/r/20220817184026.3468620-1-isaacmanjarres@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-23	Revert "driver core: Delete driver_deferred_probe_check_state()"	Saravana Kannan
	This reverts commit 9cbffc7a59561be950ecc675d19a3d2b45202b2b. There are a few more issues to fix that have been reported in the thread for the original series [1]. We'll need to fix those before this will work. So, revert it for now. [1] - https://lore.kernel.org/lkml/20220601070707.3946847-1-saravanak@google.com/ Fixes: 9cbffc7a5956 ("driver core: Delete driver_deferred_probe_check_state()") Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Peng Fan <peng.fan@nxp.com> Tested-by: Douglas Anderson <dianders@chromium.org> Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com> Reviewed-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220819221616.2107893-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-27	driver core: fix potential deadlock in __driver_attach	Zhang Wensheng
	In __driver_attach function, There are also AA deadlock problem, like the commit b232b02bf3c2 ("driver core: fix deadlock in __device_attach"). stack like commit b232b02bf3c2 ("driver core: fix deadlock in __device_attach"). list below: In __driver_attach function, The lock holding logic is as follows: ... __driver_attach if (driver_allows_async_probing(drv)) device_lock(dev) // get lock dev async_schedule_dev(__driver_attach_async_helper, dev); // func async_schedule_node async_schedule_node_domain(func) entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC); /* when fail or work limit, sync to execute func, but __driver_attach_async_helper will get lock dev as will, which will lead to A-A deadlock. */ if (!entry \|\| atomic_read(&entry_count) > MAX_WORK) { func; else queue_work_node(node, system_unbound_wq, &entry->work) device_unlock(dev) As above show, when it is allowed to do async probes, because of out of memory or work limit, async work is not be allowed, to do sync execute instead. it will lead to A-A deadlock because of __driver_attach_async_helper getting lock dev. Reproduce: and it can be reproduce by make the condition (if (!entry \|\| atomic_read(&entry_count) > MAX_WORK)) untenable, like below: [ 370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 370.787154] task:swapper/0 state:D stack: 0 pid: 1 ppid: 0 flags:0x00004000 [ 370.788865] Call Trace: [ 370.789374] <TASK> [ 370.789841] __schedule+0x482/0x1050 [ 370.790613] schedule+0x92/0x1a0 [ 370.791290] schedule_preempt_disabled+0x2c/0x50 [ 370.792256] __mutex_lock.isra.0+0x757/0xec0 [ 370.793158] __mutex_lock_slowpath+0x1f/0x30 [ 370.794079] mutex_lock+0x50/0x60 [ 370.794795] __device_driver_lock+0x2f/0x70 [ 370.795677] ? driver_probe_device+0xd0/0xd0 [ 370.796576] __driver_attach_async_helper+0x1d/0xd0 [ 370.797318] ? driver_probe_device+0xd0/0xd0 [ 370.797957] async_schedule_node_domain+0xa5/0xc0 [ 370.798652] async_schedule_node+0x19/0x30 [ 370.799243] __driver_attach+0x246/0x290 [ 370.799828] ? driver_allows_async_probing+0xa0/0xa0 [ 370.800548] bus_for_each_dev+0x9d/0x130 [ 370.801132] driver_attach+0x22/0x30 [ 370.801666] bus_add_driver+0x290/0x340 [ 370.802246] driver_register+0x88/0x140 [ 370.802817] ? virtio_scsi_init+0x116/0x116 [ 370.803425] scsi_register_driver+0x1a/0x30 [ 370.804057] init_sd+0x184/0x226 [ 370.804533] do_one_initcall+0x71/0x3a0 [ 370.805107] kernel_init_freeable+0x39a/0x43a [ 370.805759] ? rest_init+0x150/0x150 [ 370.806283] kernel_init+0x26/0x230 [ 370.806799] ret_from_fork+0x1f/0x30 To fix the deadlock, move the async_schedule_dev outside device_lock, as we can see, in async_schedule_node_domain, the parameter of queue_work_node is system_unbound_wq, so it can accept concurrent operations. which will also not change the code logic, and will not lead to deadlock. Fixes: ef0ff68351be ("driver core: Probe devices asynchronously instead of the driver") Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com> Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10	driver core: Delete driver_deferred_probe_check_state()	Saravana Kannan
	The function is no longer used. So delete it. Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220601070707.3946847-10-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10	Revert "driver core: Set default deferred_probe_timeout back to 0."	Saravana Kannan
	This reverts commit 11f7e7ef553b6b93ac1aa74a3c2011b9cc8aeb61. Let's take another shot at getting deferred_probe_timeout=10 to work. Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220601070707.3946847-7-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10	driver core: Add wait_for_init_devices_probe helper function	Saravana Kannan
	Some devices might need to be probed and bound successfully before the kernel boot sequence can finish and move on to init/userspace. For example, a network interface might need to be bound to be able to mount a NFS rootfs. With fw_devlink=on by default, some of these devices might be blocked from probing because they are waiting on a optional supplier that doesn't have a driver. While fw_devlink will eventually identify such devices and unblock the probing automatically, it might be too late by the time it unblocks the probing of devices. For example, the IP4 autoconfig might timeout before fw_devlink unblocks probing of the network interface. This function is available to temporarily try and probe all devices that have a driver even if some of their suppliers haven't been added or don't have drivers. The drivers can then decide which of the suppliers are optional vs mandatory and probe the device if possible. By the time this function returns, all such "best effort" probes are guaranteed to be completed. If a device successfully probes in this mode, we delete all fw_devlink discovered dependencies of that device where the supplier hasn't yet probed successfully because they have to be optional dependencies. This also means that some devices that aren't needed for init and could have waited for their optional supplier to probe (when the supplier's module is loaded later on) would end up probing prematurely with limited functionality. So call this function only when boot would fail without it. Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220601070707.3946847-5-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-03	driver core: Set default deferred_probe_timeout back to 0.	Saravana Kannan
	Since we had to effectively reverted commit 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires") in an earlier patch, a non-zero deferred_probe_timeout will break NFS rootfs mounting [1] again. So, set the default back to zero until we can fix that. [1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/ Fixes: 2b28a1a84a0e ("driver core: Extend deferred probe timeout on driver registration") Cc: Mark Brown <broonie@kernel.org> Cc: Rob Herring <robh@kernel.org> Reported-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220526034609.480766-3-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-03	driver core: Fix wait_for_device_probe() & deferred_probe_timeout interaction	Saravana Kannan
	Mounting NFS rootfs was timing out when deferred_probe_timeout was non-zero [1]. This was because ip_auto_config() initcall times out waiting for the network interfaces to show up when deferred_probe_timeout was non-zero. While ip_auto_config() calls wait_for_device_probe() to make sure any currently running deferred probe work or asynchronous probe finishes, that wasn't sufficient to account for devices being deferred until deferred_probe_timeout. Commit 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires") tried to fix that by making sure wait_for_device_probe() waits for deferred_probe_timeout to expire before returning. However, if wait_for_device_probe() is called from the kernel_init() context: - Before deferred_probe_initcall() [2], it causes the boot process to hang due to a deadlock. - After deferred_probe_initcall() [3], it blocks kernel_init() from continuing till deferred_probe_timeout expires and beats the point of deferred_probe_timeout that's trying to wait for userspace to load modules. Neither of this is good. So revert the changes to wait_for_device_probe(). [1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/ [2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/ [3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@linutronix.de/ Fixes: 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires") Cc: John Stultz <jstultz@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: Basil Eljuse <Basil.Eljuse@arm.com> Cc: Ferry Toth <fntoth@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Anders Roxell <anders.roxell@linaro.org> Cc: linux-pm@vger.kernel.org Reported-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220526034609.480766-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-03	Merge tag 'driver-core-5.19-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of driver core changes for 5.19-rc1. Lots of tiny driver core changes and cleanups happened this cycle, but the two major things are: - firmware_loader reorganization and additions including the ability to have XZ compressed firmware images and the ability for userspace to initiate the firmware load when it needs to, instead of being always initiated by the kernel. FPGA devices specifically want this ability to have their firmware changed over the lifetime of the system boot, and this allows them to work without having to come up with yet-another-custom-uapi interface for loading firmware for them. - physical location support added to sysfs so that devices that know this information, can tell userspace where they are located in a common way. Some ACPI devices already support this today, and more bus types should support this in the future. Smaller changes include: - driver_override api cleanups and fixes - error path cleanups and fixes - get_abi script fixes - deferred probe timeout changes. It's that last change that I'm the most worried about. It has been reported to cause boot problems for a number of systems, and I have a tested patch series that resolves this issue. But I didn't get it merged into my tree before 5.18-final came out, so it has not gotten any linux-next testing. I'll send the fixup patches (there are 2) as a follow-on series to this pull request. All have been tested in linux-next for weeks, with no reported issues other than the above-mentioned boot time-outs" * tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits) driver core: fix deadlock in __device_attach kernfs: Separate kernfs_pr_cont_buf and rename_lock. topology: Remove unused cpu_cluster_mask() driver core: Extend deferred probe timeout on driver registration MAINTAINERS: add Russ Weight as a firmware loader maintainer driver: base: fix UAF when driver_attach failed test_firmware: fix end of loop test in upload_read_show() driver core: location: Add "back" as a possible output for panel driver core: location: Free struct acpi_pld_info pld driver core: Add "" wildcard support to driver_async_probe cmdline param driver core: location: Check for allocations failure arch_topology: Trace the update thermal pressure kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file. export: fix string handling of namespace in EXPORT_SYMBOL_NS rpmsg: use local 'dev' variable rpmsg: Fix calling device_lock() on non-initialized device firmware_loader: describe 'module' parameter of firmware_upload_register() firmware_loader: Move definitions from sysfs_upload.h to sysfs.h firmware_loader: Fix configs for sysfs split selftests: firmware: Add firmware upload selftests ...
2022-05-19	driver core: fix deadlock in __device_attach	Zhang Wensheng
	In __device_attach function, The lock holding logic is as follows: ... __device_attach device_lock(dev) // get lock dev async_schedule_dev(__device_attach_async_helper, dev); // func async_schedule_node async_schedule_node_domain(func) entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC); /* when fail or work limit, sync to execute func, but __device_attach_async_helper will get lock dev as well, which will lead to A-A deadlock. */ if (!entry \|\| atomic_read(&entry_count) > MAX_WORK) { func; else queue_work_node(node, system_unbound_wq, &entry->work) device_unlock(dev) As shown above, when it is allowed to do async probes, because of out of memory or work limit, async work is not allowed, to do sync execute instead. it will lead to A-A deadlock because of __device_attach_async_helper getting lock dev. To fix the deadlock, move the async_schedule_dev outside device_lock, as we can see, in async_schedule_node_domain, the parameter of queue_work_node is system_unbound_wq, so it can accept concurrent operations. which will also not change the code logic, and will not lead to deadlock. Fixes: 765230b5f084 ("driver-core: add asynchronous probing support for drivers") Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com> Link: https://lore.kernel.org/r/20220518074516.1225580-1-zhangwensheng5@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-19	driver core: Extend deferred probe timeout on driver registration	Saravana Kannan
	The deferred probe timer that's used for this currently starts at late_initcall and runs for driver_deferred_probe_timeout seconds. The assumption being that all available drivers would be loaded and registered before the timer expires. This means, the driver_deferred_probe_timeout has to be pretty large for it to cover the worst case. But if we set the default value for it to cover the worst case, it would significantly slow down the average case. For this reason, the default value is set to 0. Also, with CONFIG_MODULES=y and the current default values of driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing drivers will cause their consumer devices to always defer their probes. This is because device links created by fw_devlink defer the probe even before the consumer driver's probe() is called. Instead of a fixed timeout, if we extend an unexpired deferred probe timer on every successful driver registration, with the expectation more modules would be loaded in the near future, then the default value of driver_deferred_probe_timeout only needs to be as long as the worst case time difference between two consecutive module loads. So let's implement that and set the default value to 10 seconds when CONFIG_MODULES=y. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Rob Herring <robh@kernel.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Kevin Hilman <khilman@kernel.org> Cc: Thierry Reding <treding@nvidia.com> Cc: Mark Brown <broonie@kernel.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Cc: linux-gpio@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: iommu@lists.linux-foundation.org Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220429220933.1350374-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-19	driver core: Add "*" wildcard support to driver_async_probe cmdline param	Saravana Kannan
	There's currently no way to use driver_async_probe kernel cmdline param to enable default async probe for all drivers. So, add support for "" to match with all driver names. When "" is used, all other drivers listed in driver_async_probe are drivers that will NOT match the "". For example: driver_async_probe=drvA,drvB,drvC drvA, drvB and drvC do asynchronous probing. * driver_async_probe=* All drivers do asynchronous probing except those that have set PROBE_FORCE_SYNCHRONOUS flag. * driver_async_probe=*,drvA,drvB,drvC All drivers do asynchronous probing except drvA, drvB, drvC and those that have set PROBE_FORCE_SYNCHRONOUS flag. Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Feng Tang <feng.tang@intel.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220504005344.117803-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-02	Merge 5.18-rc5 into driver-core-next	Greg Kroah-Hartman
	We need the kernfs/driver core fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-28	driver core: Add dma_cleanup callback in bus_type	Lu Baolu
	The bus_type structure defines dma_configure() callback for bus drivers to configure DMA on the devices. This adds the paired dma_cleanup() callback and calls it during driver unbinding so that bus drivers can do some cleanup work. One use case for this paired DMA callbacks is for the bus driver to check for DMA ownership conflicts during driver binding, where multiple devices belonging to a same IOMMU group (the minimum granularity of isolation and protection) may be assigned to kernel drivers or user space respectively. Without this change, for example, the vfio driver has to listen to a bus BOUND_DRIVER event and then BUG_ON() in case of dma ownership conflict. This leads to bad user experience since careless driver binding operation may crash the system if the admin overlooks the group restriction. Aside from bad design, this leads to a security problem as a root user, even with lockdown=integrity, can force the kernel to BUG. With this change, the bus driver could check and set the DMA ownership in driver binding process and fail on ownership conflicts. The DMA ownership should be released during driver unbinding. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220418005000.897664-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-27	driver core: Prevent overriding async driver of a device before it probe	Mark-PK Tsai
	When there are 2 matched drivers for a device using async probe mechanism, the dev->p->async_driver might be overridden by the last attached driver. So just skip the later one if the previous matched driver was not handled by async thread yet. Below is my use case which having this problem. Make both driver mmcblk and mmc_test allow async probe, the dev->p->async_driver will be overridden by the later driver mmc_test and bind to the device then claim it for testing. When it happen, mmcblk will never do probe again. Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Link: https://lore.kernel.org/r/20220316074328.1801-1-mark-pk.tsai@mediatek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-26	Documentation: dd: Use ReST lists for return values of ↵	Bagas Sanjaya
	driver_deferred_probe_check_state() Sphinx reported build warnings mentioning drivers/base/dd.c: </path/to/linux>/Documentation/driver-api/infrastructure:35: ./drivers/base/dd.c:280: WARNING: Unexpected indentation. </path/to/linux>/Documentation/driver-api/infrastructure:35: ./drivers/base/dd.c:281: WARNING: Block quote ends without a blank line; unexpected unindent. The warnings above is due to syntax error in the "Return" section of driver_deferred_probe_check_state() which messed up with desired line breaks. Fix the issue by using ReST lists syntax. Fixes: c8c43cee29f6ca ("driver core: Fix driver_deferred_probe_check_state() logic") Cc: linux-pm@vger.kernel.org Cc: stable@vger.kernel.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Thierry Reding <treding@nvidia.com> Cc: Mark Brown <broonie@kernel.org> Cc: Liam Girdwood <lgirdwood@gmail.com> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Saravana Kannan <saravanak@google.com> Cc: Todd Kjos <tkjos@google.com> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Kevin Hilman <khilman@kernel.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Rob Herring <robh@kernel.org> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20220416071137.19512-1-bagasdotme@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-08	net: mdio: don't defer probe forever if PHY IRQ provider is missing	Vladimir Oltean
	When a driver for an interrupt controller is missing, of_irq_get() returns -EPROBE_DEFER ad infinitum, causing fwnode_mdiobus_phy_device_register(), and ultimately, the entire of_mdiobus_register() call, to fail. In turn, any phy_connect() call towards a PHY on this MDIO bus will also fail. This is not what is expected to happen, because the PHY library falls back to poll mode when of_irq_get() returns a hard error code, and the MDIO bus, PHY and attached Ethernet controller work fine, albeit suboptimally, when the PHY library polls for link status. However, -EPROBE_DEFER has special handling given the assumption that at some point probe deferral will stop, and the driver for the supplier will kick in and create the IRQ domain. Reasons for which the interrupt controller may be missing: - It is not yet written. This may happen if a more recent DT blob (with an interrupt-parent for the PHY) is used to boot an old kernel where the driver didn't exist, and that kernel worked with the vintage-correct DT blob using poll mode. - It is compiled out. Behavior is the same as above. - It is compiled as a module. The kernel will wait for a number of seconds specified in the "deferred_probe_timeout" boot parameter for user space to load the required module. The current default is 0, which times out at the end of initcalls. It is possible that this might cause regressions unless users adjust this boot parameter. The proposed solution is to use the driver_deferred_probe_check_state() helper function provided by the driver core, which gives up after some -EPROBE_DEFER attempts, taking "deferred_probe_timeout" into consideration. The return code is changed from -EPROBE_DEFER into -ENODEV or -ETIMEDOUT, depending on whether the kernel is compiled with support for modules or not. Fixes: 66bdede495c7 ("of_mdio: Fix broken PHY IRQ in case of probe deferral") Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220407165538.4084809-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>