summaryrefslogtreecommitdiff
path: root/include/linux/pci.h
AgeCommit message (Collapse)Author
2026-02-12Merge tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxlLinus Torvalds
Pull CXL updates from Dave Jiang: - Introduce cxl_memdev_attach and pave way for soft reserved handling, type2 accelerator enabling, and LSA 2.0 enabling. All these series require the endpoint driver to settle before continuing the memdev driver probe. - Address CXL port error protocol handling and reporting. The large patch series was split into three parts. The first two parts are included here with the final part coming later. The first part consists of a series of code refactoring to PCI AER sub-system that addresses CXL and also CXL RAS code to prepare for port error handling. The second part refactors the CXL code to move management of component registers to cxl_port objects to allow all CXL AER errors to be handled through the cxl_port hierarchy. - Provide AMD Zen5 platform address translation for CXL using ACPI PRMT. This includes a conventions document to explain why this is needed and how it's implemented. - Misc CXL patches of fixes, cleanups, and updates. Including CXL address translation for unaligned MOD3 regions. [ TLA service: CXL is "Compute Express Link" ] * tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (59 commits) cxl: Disable HPA/SPA translation handlers for Normalized Addressing cxl/region: Factor out code into cxl_region_setup_poison() cxl/atl: Lock decoders that need address translation cxl: Enable AMD Zen5 address translation using ACPI PRMT cxl/acpi: Prepare use of EFI runtime services cxl: Introduce callback for HPA address ranges translation cxl/region: Use region data to get the root decoder cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos() cxl/region: Separate region parameter setup and region construction cxl: Simplify cxl_root_ops allocation and handling cxl/region: Store HPA range in struct cxl_region cxl/region: Store root decoder in struct cxl_region cxl/region: Rename misleading variable name @hpa to @hpa_range Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement cxl, doc: Moving conventions in separate files cxl, doc: Remove isonum.txt inclusion cxl/port: Unify endpoint and switch port lookup cxl/port: Move endpoint component register management to cxl_port cxl/port: Map Port RAS registers cxl/port: Move dport RAS setup to dport add time ...
2026-02-11Merge tag 'pci-v7.0-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Don't try to enable Extended Tags on VFs since that bit is Reserved and causes misleading log messages (Håkon Bugge) - Initialize Endpoint Read Completion Boundary to match Root Port, regardless of ACPI _HPX (Håkon Bugge) - Apply _HPX PCIe Setting Record only to AER configuration, and only when OS owns PCIe hotplug but not AER, to avoid clobbering Extended Tag and Relaxed Ordering settings (Håkon Bugge) Resource management: - Move CardBus code to setup-cardbus.c and only build it when CONFIG_CARDBUS is set (Ilpo Järvinen) - Fix bridge window alignment with optional resources, where additional alignment requirement was previously lost (Ilpo Järvinen) - Stop over-estimating bridge window size since they are now assigned without any gaps between them (Ilpo Järvinen) - Increase resource MAX_IORES_LEVEL to avoid /proc/iomem flattening for nested bridges and endpoints (Ilpo Järvinen) - Add pbus_mem_size_optional() to handle sizes of optional resources (SR-IOV VF BARs, expansion ROMs, bridge windows) (Ilpo Järvinen) - Don't claim disabled bridge windows to avoid spurious claim failures (Ilpo Järvinen) Driver binding: - Fix device reference leak in pcie_port_remove_service() (Uwe Kleine-König) - Move pcie_port_bus_match() and pcie_port_bus_type to PCIe-specific portdrv.c (Uwe Kleine-König) - Convert portdrv to use pcie_port_bus_type.probe() and .remove() callbacks so .probe() and .remove() can eventually be removed from struct device_driver (Uwe Kleine-König) Error handling: - Clear stale errors on reporting agents upon probe so they don't look like recent errors (Lukas Wunner) - Add generic RAS tracepoint for hotplug events (Shuai Xue) - Add RAS tracepoint for link speed changes (Shuai Xue) Power management: - Avoid redundant delay on transition from D3hot to D3cold if the device was already in D3hot (Brian Norris) - Prevent runtime suspend until devices are fully initialized to avoid saving incompletely configured device state (Brian Norris) Power control: - Add power_on/off callbacks with generic signature to pwrseq, tc9563, and slot drivers so they can be used by pwrctrl core (Manivannan Sadhasivam) - Add PCIe M.2 connector support to the slot pwrctrl driver (Manivannan Sadhasivam) - Switch to pwrctrl interfaces to create, destroy, and power on/off devices, calling them from host controller drivers instead of the PCI core (Manivannan Sadhasivam) - Drop qcom .assert_perst() callbacks since this is now done by the controller driver instead of the pwrctrl driver (Manivannan Sadhasivam) Virtualization: - Remove an incorrect unlock in pci_slot_trylock() error handling (Jinhui Guo) - Lock the bridge device for slot reset (Keith Busch) - Enable ACS after IOMMU configuration on OF platforms so ACS is enabled an all devices; previously the first device enumerated (typically a Root Port) didn't have ACS enabled (Manivannan Sadhasivam) - Disable ACS Source Validation for IDT 0x80b5 and 0x8090 switches to work around hardware erratum; previously ACS SV was only temporarily disabled, which worked for enumeration but not after reset (Manivannan Sadhasivam) Peer-to-peer DMA: - Release per-CPU pgmap ref when vm_insert_page() fails to avoid hang when removing the PCI device (Hou Tao) - Remove incorrect p2pmem_alloc_mmap() warning about page refcount (Hou Tao) Endpoint framework: - Add configfs sub-groups synchronously to avoid NULL pointer dereference when racing with removal (Liu Song) - Fix swapped parameters in pci_{primary/secondary}_epc_epf_unlink() functions (Manikanta Maddireddy) ASPEED PCIe controller driver: - Add ASPEED Root Complex DT binding and driver (Jacky Chou) Freescale i.MX6 PCIe controller driver: - Add DT binding and driver support for an optional external refclock in addition to the refclock from the internal PLL (Richard Zhu) - Fix CLKREQ# control so host asserts it during enumeration and Endpoints can use it afterwards to exit the L1.2 link state (Richard Zhu) NVIDIA Tegra PCIe controller driver: - Export irq_domain_free_irqs() to allow PCI/MSI drivers that tear down MSI domains to be built as modules (Aaron Kling) - Allow pci-tegra to be built as a module (Aaron Kling) NVIDIA Tegra194 PCIe controller driver: - Relax Kconfig so tegra194 can be built for platforms beyond Tegra194 (Vidya Sagar) Qualcomm PCIe controller driver: - Merge SC8180x DT binding into SM8150 (Krzysztof Kozlowski) - Move SDX55, SDM845, QCS404, IPQ5018, IPQ6018, IPQ8074 Gen3, IPQ8074, IPQ4019, IPQ9574, APQ8064, MSM8996, APQ8084 to dedicated schema (Krzysztof Kozlowski) - Add DT binding and driver support for SA8255p Endpoint being configured by firmware (Mrinmay Sarkar) - Parse PERST# from all PCIe bridge nodes for future platforms that will have PERST# in Switch Downstream Ports as well as in Root Ports (Manivannan Sadhasivam) Renesas RZ/G3S PCIe controller driver: - Use pci_generic_config_write() since the writability provided by the custom wrapper is unnecessary (Claudiu Beznea) SOPHGO PCIe controller driver: - Disable ASPM L0s and L1 on Sophgo 2044 PCIe Root Ports (Inochi Amaoto) Synopsys DesignWare PCIe controller driver: - Extend PCI_FIND_NEXT_CAP() and PCI_FIND_NEXT_EXT_CAP() to return a pointer to the preceding Capability, to allow removal of Capabilities that are advertised but not fully implemented (Qiang Yu) - Remove MSI and MSI-X Capabilities in platforms that can't support them, so the PCI core automatically falls back to INTx (Qiang Yu) - Add ASPM L1.1 and L1.2 Substates context to debugfs ltssm_status for drivers that support this (Shawn Lin) - Skip PME_Turn_Off broadcast and L2/L3 transition during suspend if link is not up to avoid an unnecessary timeout (Manivannan Sadhasivam) - Revert dw-rockchip, qcom, and DWC core changes that used link-up IRQs to trigger enumeration instead of waiting for link to be up because the PCI core doesn't allocate bus number space for hierarchies that might be attached (Niklas Cassel) - Make endpoint iATU entry for MSI permanent instead of programming it dynamically, which is slow and racy with respect to other concurrent traffic, e.g., eDMA (Koichiro Den) - Use iMSI-RX MSI target address when possible to fix endpoints using 32-bit MSI (Shawn Lin) - Allow DWC host controller driver probe to continue if device is not found or found but inactive; only fail when there's an error with the link (Manivannan Sadhasivam) - For controllers like NXP i.MX6QP and i.MX7D, where LTSSM registers are not accessible after PME_Turn_Off, simply wait 10ms instead of polling for L2/L3 Ready (Richard Zhu) - Use multiple iATU entries to map large bridge windows and DMA ranges when necessary instead of failing (Samuel Holland) - Add EPC dynamic_inbound_mapping feature bit for Endpoint Controllers that can update BAR inbound address translation without requiring EPF driver to clear/reset the BAR first, and advertise it for DWC-based Endpoints (Koichiro Den) - Add EPC subrange_mapping feature bit for Endpoint Controllers that can map multiple independent inbound regions in a single BAR, implement subrange mapping, advertise it for DWC-based Endpoints, and add Endpoint selftests for it (Koichiro Den) - Make resizable BARs work for Endpoint multi-PF configurations; previously it only worked for PF 0 (Aksh Garg) - Fix Endpoint non-PF 0 support for BAR configuration, ATU mappings, and Address Match Mode (Aksh Garg) - Set up iATU when ECAM is enabled; previously IO and MEM outbound windows weren't programmed, and ECAM-related iATU entries weren't restored after suspend/resume, so config accesses failed (Krishna Chaitanya Chundru) Miscellaneous: - Use system_percpu_wq and WQ_PERCPU to explicitly request per-CPU work so WQ_UNBOUND can eventually be removed (Marco Crivellari)" * tag 'pci-v7.0-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (176 commits) PCI/bwctrl: Disable BW controller on Intel P45 using a quirk PCI: Disable ACS SV for IDT 0x8090 switch PCI: Disable ACS SV for IDT 0x80b5 switch PCI: Cache ACS Capabilities register PCI: Enable ACS after configuring IOMMU for OF platforms PCI: Add ACS quirk for Pericom PI7C9X2G404 switches [12d8:b404] PCI: Add ACS quirk for Qualcomm Hamoa & Glymur PCI: Use device_lock_assert() to verify device lock is held PCI: Use lockdep_assert_held(pci_bus_sem) to verify lock is held PCI: Fix pci_slot_lock () device locking PCI: Fix pci_slot_trylock() error handling PCI: Mark Nvidia GB10 to avoid bus reset PCI: Mark ASM1164 SATA controller to avoid bus reset PCI: host-generic: Avoid reporting incorrect 'missing reg property' error PCI/PME: Replace RMW of Root Status register with direct write PCI/AER: Clear stale errors on reporting agents upon probe PCI: Don't claim disabled bridge windows PCI: rzg3s-host: Fix device node reference leak in rzg3s_pcie_host_parse_port() PCI: dwc: Fix missing iATU setup when ECAM is enabled PCI: dwc: Clean up iATU index usage in dw_pcie_iatu_setup() ...
2026-02-10Merge tag 'irq-msi-2026-02-09' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MSI updates from Thomas Gleixner: "Updates for the [PCI] MSI subsystem: - Add interrupt redirection infrastructure Some PCI controllers use a single demultiplexing interrupt for the MSI interrupts of subordinate devices. This prevents setting the interrupt affinity of device interrupts, which causes device interrupts to be delivered to a single CPU. That obviously is counterproductive for multi-queue devices and interrupt balancing. To work around this limitation the new infrastructure installs a dummy irq_set_affinity() callback which captures the affinity mask and picks a redirection target CPU out of the mask. When the PCI controller demultiplexes the interrupts it invokes a new handling function in the core, which either runs the interrupt handler in the context of the target CPU or delegates it to irq_work on the target CPU. - Utilize the interrupt redirection mechanism in the PCI DWC host controller driver. This allows affinity control for the subordinate device MSI interrupts instead of being randomly executed on the CPU which runs the demultiplex handler. - Replace the binary 64-bit MSI flag with a DMA mask Some PCI devices have PCI_MSI_FLAGS_64BIT in the MSI capability, but implement less than 64 address bits. This breaks on platforms where such a device is assigned an MSI address higher than what's supported. With the binary 64-bit flag there is no other choice than disabling 64-bit MSI support which leaves the device disfunctional. By using a DMA mask the address limit of a device can be described correctly which provides support for the above scenario. - Make use of the DMA mask based address limit in the hda/intel and radeon drivers to enable them on affected platforms - The usual small cleanups and improvements" * tag 'irq-msi-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ALSA: hda/intel: Make MSI address limit based on the device DMA limit drm/radeon: Make MSI address limit based on the device DMA limit PCI/MSI: Check the device specific address mask in msi_verify_entries() PCI/MSI: Convert the boolean no_64bit_msi flag to a DMA address mask genirq/redirect: Prevent writing MSI message on affinity change PCI/MSI: Unmap MSI-X region on error genirq: Update effective affinity for redirected interrupts PCI: dwc: Enable MSI affinity support PCI: dwc: Code cleanup genirq: Add interrupt redirection infrastructure genirq/msi: Correct kernel-doc in <linux/msi.h>
2026-02-09Merge tag 'kthread-for-7.0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks Pull kthread updates from Frederic Weisbecker: "The kthread code provides an infrastructure which manages the preferred affinity of unbound kthreads (node or custom cpumask) against housekeeping (CPU isolation) constraints and CPU hotplug events. One crucial missing piece is the handling of cpuset: when an isolated partition is created, deleted, or its CPUs updated, all the unbound kthreads in the top cpuset become indifferently affine to _all_ the non-isolated CPUs, possibly breaking their preferred affinity along the way. Solve this with performing the kthreads affinity update from cpuset to the kthreads consolidated relevant code instead so that preferred affinities are honoured and applied against the updated cpuset isolated partitions. The dispatch of the new isolated cpumasks to timers, workqueues and kthreads is performed by housekeeping, as per the nice Tejun's suggestion. As a welcome side effect, HK_TYPE_DOMAIN then integrates both the set from boot defined domain isolation (through isolcpus=) and cpuset isolated partitions. Housekeeping cpumasks are now modifiable with a specific RCU based synchronization. A big step toward making nohz_full= also mutable through cpuset in the future" * tag 'kthread-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks: (33 commits) doc: Add housekeeping documentation kthread: Document kthread_affine_preferred() kthread: Comment on the purpose and placement of kthread_affine_node() call kthread: Honour kthreads preferred affinity after cpuset changes sched/arm64: Move fallback task cpumask to HK_TYPE_DOMAIN sched: Switch the fallback task allowed cpumask to HK_TYPE_DOMAIN kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management kthread: Include kthreadd to the managed affinity list kthread: Include unbound kthreads in the managed affinity list kthread: Refine naming of affinity related fields PCI: Remove superfluous HK_TYPE_WQ check sched/isolation: Remove HK_TYPE_TICK test from cpu_is_isolated() cpuset: Remove cpuset_cpu_is_isolated() timers/migration: Remove superfluous cpuset isolation test cpuset: Propagate cpuset isolation update to timers through housekeeping cpuset: Propagate cpuset isolation update to workqueue through housekeeping PCI: Flush PCI probe workqueue on cpuset isolated partition change sched/isolation: Flush vmstat workqueues on cpuset isolated partition change sched/isolation: Flush memcg workqueues on cpuset isolated partition change cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset ...
2026-02-06Merge branch 'pci/virtualization'Bjorn Helgaas
- Mark ASM1164 SATA controller to avoid bus reset since it fails to train the Link after reset (Alex Williamson) - Mark Nvidia GB10 Root Ports to avoid bus reset since they may fail to retrain the link after reset (Johnny-CC Chang) - Add lockdep and other lock assertions (Ilpo Järvinen) - Add ACS quirk for Qualcomm Hamoa & Glymur, which provides ACS-like features but doesn't advertise an ACS Capability (Krishna Chaitanya Chundru) - Add ACS quirk for Pericom PI7C9X2G404 switches, which fail under load when P2P Redirect Request is enabled (Nicolas Cavallari) - Remove an incorrect unlock in pci_slot_trylock() error handling (Jinhui Guo) - Lock the bridge device for slot reset (Keith Busch) - Enable ACS after IOMMU configuration on OF platforms so ACS is enabled an all devices; previously the first device enumeration (typically a Root Port) was omitted (Manivannan Sadhasivam) - Disable ACS Source Validation for IDT 0x80b5 and 0x8090 switches to work around hardware erratum; previously ACS SV was temporarily disabled, which worked for enumeration but not after reset (Manivannan Sadhasivam) * pci/virtualization: PCI: Disable ACS SV for IDT 0x8090 switch PCI: Disable ACS SV for IDT 0x80b5 switch PCI: Cache ACS Capabilities register PCI: Enable ACS after configuring IOMMU for OF platforms PCI: Add ACS quirk for Pericom PI7C9X2G404 switches [12d8:b404] PCI: Add ACS quirk for Qualcomm Hamoa & Glymur PCI: Use device_lock_assert() to verify device lock is held PCI: Use lockdep_assert_held(pci_bus_sem) to verify lock is held PCI: Fix pci_slot_lock () device locking PCI: Fix pci_slot_trylock() error handling PCI: Mark Nvidia GB10 to avoid bus reset PCI: Mark ASM1164 SATA controller to avoid bus reset
2026-02-06Merge branch 'pci/resource'Bjorn Helgaas
- Build zero-sized resources when a BAR is larger than 4G but pci_bus_addr_t or resource_size_t can't represent 64-bit addresses (Ilpo Järvinen) - Fix bridge window alignment with optional resources, where we previously lost the additional alignment requirement (Ilpo Järvinen) - Stop over-estimating bridge window size since we now assign them without any gaps between them (Ilpo Järvinen) - Increase resource MAX_IORES_LEVEL to avoid /proc/iomem flattening for nested bridges and endpoints (Ilpo Järvinen) - Remove old_size limit from bridge window sizing (Ilpo Järvinen) - Push realloc check into pbus_size_mem() to simplify callers (Ilpo Järvinen) - Pass bridge window resource to pbus_size_mem() to avoid looking it up again (Ilpo Järvinen) - Use res_to_dev_res() instead of open-coding the same search (Ilpo Järvinen) - Add pci_resource_is_bridge_win() helper (Ilpo Järvinen) - Add more logging of resource assignment (Ilpo Järvinen) - Add pbus_mem_size_optional() to handle sizes of optional resources (SR-IOV VF BARs, expansion ROMs, bridge windows) (Ilpo Järvinen) - Move CardBus code to setup-cardbus.c and only build it when CONFIG_CARDBUS is set (Ilpo Järvinen) - Use scnprintf() instead of sprintf() (Ilpo Järvinen) - Add pbus_validate_busn() for Bus Number validation (Ilpo Järvinen) - Don't claim disabled bridge windows to avoid spurious claim failures (Ilpo Järvinen) * pci/resource: PCI: Don't claim disabled bridge windows PCI: Move CardBus bridge scanning to setup-cardbus.c PCI: Add pbus_validate_busn() for Bus Number validation PCI: Add dword #defines for Bus Number + Secondary Latency Timer PCI: Use scnprintf() instead of sprintf() PCI: Handle CardBus-specific params in setup-cardbus.c PCI: Separate CardBus setup & build it only with CONFIG_CARDBUS PCI: Add 'pci' prefix to struct pci_dev_resource handling functions PCI: Use resource_assigned() in setup-bus.c algorithm resource: Mark res given to resource_assigned() as const PCI: Add pbus_mem_size_optional() to handle optional sizes PCI: Check invalid align earlier in pbus_size_mem() PCI: Log reset and restore of resources PCI: Add pci_resource_is_bridge_win() PCI: Fetch dev_res to local var in __assign_resources_sorted() PCI: Use res_to_dev_res() in reassign_resources_sorted() PCI: Pass bridge window resource to pbus_size_mem() PCI: Push realloc check into pbus_size_mem() PCI: Remove old_size limit from bridge window sizing resource: Increase MAX_IORES_LEVEL to 8 PCI: Stop over-estimating bridge window size PCI: Rewrite bridge window head alignment function PCI: Fix bridge window alignment with optional resources PCI: Use resource_set_range() that correctly sets ->end
2026-02-06Merge branch 'pci/pwrctrl'Bjorn Helgaas
- Rename pwrseq, tc9563, and slot driver structs, variables, and functions for consistency (Bjorn Helgaas) - Add power_on/off callbacks with generic signature to pwrseq, tc9563, and slot drivers so they can be used by pwrctrl core (Manivannan Sadhasivam) - Add interfaces to create and destroy pwrctrl devices (Krishna Chaitanya Chundru) - Add interfaces to power devices on and off (Manivannan Sadhasivam) - Switch to pwrctrl interfaces to create, destroy, and power on/off devices, calling them from host controller drivers instead of the PCI core (Manivannan Sadhasivam) - Drop qcom .assert_perst() callbacks since this is now done by the controller driver instead of the pwrctrl driver (Manivannan Sadhasivam) - Add PCIe M.2 connector support to the slot pwrctrl driver (Manivannan Sadhasivam) - Create pwrctrl devices for devicetree PCIe M.2 connector nodes (Manivannan Sadhasivam) * pci/pwrctrl: PCI/pwrctrl: Create pwrctrl device if graph port is found PCI/pwrctrl: Add PCIe M.2 connector support PCI: Drop the assert_perst() callback PCI: qcom: Drop the assert_perst() callbacks PCI/pwrctrl: Switch to pwrctrl create, power on/off, destroy APIs PCI/pwrctrl: Add APIs to power on/off pwrctrl devices PCI/pwrctrl: Add APIs to create, destroy pwrctrl devices PCI/pwrctrl: Add 'struct pci_pwrctrl::power_{on/off}' callbacks PCI/pwrctrl: pwrseq: Factor out power on/off code to helpers PCI/pwrctrl: slot: Factor out power on/off code to helpers PCI/pwrctrl: tc9563: Rename private struct and pointers for consistency PCI/pwrctrl: tc9563: Add local variables to reduce repetition PCI/pwrctrl: tc9563: Clean up whitespace PCI/pwrctrl: tc9563: Use put_device() instead of i2c_put_adapter() PCI/pwrctrl: slot: Rename private struct and pointers for consistency PCI/pwrctrl: pwrseq: Rename private struct and pointers for consistency # Conflicts: # drivers/pci/bus.c
2026-02-06Merge branch 'pci/iommu'Bjorn Helgaas
- Add PCI_BRIDGE_NO_ALIAS quirk for ASPEED AST1150, where VGA and USB are behind a PCIe-to-PCI bridge and share the same StreamID (Nirmoy Das) * pci/iommu: PCI: Add PCI_BRIDGE_NO_ALIAS quirk for ASPEED AST1150 PCI: Add ASPEED vendor ID to pci_ids.h
2026-02-06PCI/bwctrl: Disable BW controller on Intel P45 using a quirkIlpo Järvinen
The commit 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller") was found to lead to a boot hang on a Intel P45 system. Testing without setting Link Bandwidth Management Interrupt Enable (LBMIE) and Link Autonomous Bandwidth Interrupt Enable (LABIE) (PCIe r7.0, sec 7.5.3.7) in bwctrl allowed system to come up. P45 is a very old chipset and supports only up to gen2 PCIe, so not having bwctrl does not seem a huge deficiency. Add no_bw_notif in struct pci_dev and quirk Intel P45 Root Port with it. Reported-by: Adam Stylinski <kungfujesus06@gmail.com> Link: https://lore.kernel.org/linux-pci/aUCt1tHhm_-XIVvi@eggsbenedict/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Adam Stylinski <kungfujesus06@gmail.com> Link: https://patch.msgid.link/20260116131513.2359-1-ilpo.jarvinen@linux.intel.com
2026-02-06PCI: Cache ACS Capabilities registerManivannan Sadhasivam
The ACS Capability register is read-only. Cache it to allow quirks to override it and to avoid re-reading it. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Link: https://patch.msgid.link/20260102-pci_acs-v3-2-72280b94d288@oss.qualcomm.com
2026-02-03PCI: Flush PCI probe workqueue on cpuset isolated partition changeFrederic Weisbecker
The HK_TYPE_DOMAIN housekeeping cpumask is now modifiable at runtime. In order to synchronize against PCI probe works and make sure that no asynchronous probing is still pending or executing on a newly isolated CPU, the housekeeping subsystem must flush the PCI probe works. However the PCI probe works can't be flushed easily since they are queued to the main per-CPU workqueue pool. Solve this with creating a PCI probe-specific pool and provide and use the appropriate flushing API. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Marco Crivellari <marco.crivellari@suse.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Waiman Long <longman@redhat.com> Cc: linux-pci@vger.kernel.org
2026-01-31PCI/MSI: Convert the boolean no_64bit_msi flag to a DMA address maskVivian Wang
Some PCI devices have PCI_MSI_FLAGS_64BIT in the MSI capability, but implement less than 64 address bits. This breaks on platforms where such a device is assigned an MSI address higher than what's supported. Currently, no_64bit_msi bit is set for these devices, meaning that only 32-bit MSI addresses are allowed for them. However, on some platforms the MSI doorbell address is above the 32-bit limit but within the addressable range of the device. As a first step to enable MSI on those combinations of devices and platforms, convert the boolean no_64bit_msi flag to a DMA mask and fixup the affected usage sites: - no_64bit_msi = 1 -> msi_addr_mask = DMA_BIT_MASK(32) - no_64bit_msi = 0 -> msi_addr_mask = DMA_BIT_MASK(64) - if (no_64bit_msi) -> if (msi_addr_mask < DMA_BIT_MASK(64)) Since no values other than DMA_BIT_MASK(32) and DMA_BIT_MASK(64) are used, this is functionally equivalent. This prepares for changing the binary decision between 32 and 64 bit to a DMA mask based decision which allows to support systems which have a DMA address space less than 64bit but a MSI doorbell address above the 32-bit limit. [ tglx: Massaged changelog ] Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Brett Creeley <brett.creeley@amd.com> # ionic Reviewed-by: Thomas Gleixner <tglx@kernel.org> Acked-by: Takashi Iwai <tiwai@suse.de> # sound Link: https://patch.msgid.link/20260129-pci-msi-addr-mask-v4-1-70da998f2750@iscas.ac.cn
2026-01-27PCI: Separate CardBus setup & build it only with CONFIG_CARDBUSIlpo Järvinen
PCI bridge window setup code includes special code to handle CardBus bridges. CardBus has long since fallen out of favor and modern systems have no use for it. Move CardBus setup code to its own file and use existing CONFIG_CARDBUS to decide whether it should be built or not. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251219174036.16738-18-ilpo.jarvinen@linux.intel.com
2026-01-22PCI: Introduce pcie_is_cxl()Terry Bowman
CXL is a protocol that runs on top of PCIe electricals. Its error model also runs on top of the PCIe AER error model by standardizing "internal" errors as "CXL" errors. Linux has historically ignored internal errors. CXL protocol error handling is then a task of enhancing the PCIe AER core to understand that PCIe ports (upstream and downstream) and endpoints may throw internal errors that represent standard CXL protocol errors. The proposed method to make that determination is to teach 'struct pci_dev' to cache when its link has trained the CXL.mem and/or CXL.cache protocols and then treat all internal errors as CXL errors. A design goal is to not burden the PCIe AER core with CXL knowledge beyond just enough to forward error notifications to the CXL RAS core. The forwarded notification looks up a 'struct cxl_port' or 'struct cxl_dport' companion device to the PCI device. Introduce set_pcie_cxl() with logic checking for CXL.mem or CXL.cache status in the CXL Flex Bus DVSEC status register. The CXL Flex Bus DVSEC presence is used because it is required for all the CXL PCIe devices.[1] [1] CXL 3.1 Spec, 8.1.1 PCIe Designated Vendor-Specific Extended Capability (DVSEC) ID Assignment, Table 8-2 Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260114182055.46029-4-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-16PCI: Drop the assert_perst() callbackManivannan Sadhasivam
Now since all .assert_callback() implementations have been removed from the controller drivers, drop the .assert_callback callback from pci.h. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Link: https://patch.msgid.link/20260115-pci-pwrctrl-rework-v5-14-9d26da3ce903@oss.qualcomm.com
2026-01-13PCI: Add PCI_BRIDGE_NO_ALIAS quirk for ASPEED AST1150Nirmoy Das
ASPEED BMC controllers have VGA and USB functions behind a PCIe-to-PCI bridge that causes them to share the same StreamID: [e0]---00.0-[e1-e2]----00.0-[e2]--+-00.0 ASPEED Graphics Family \-02.0 ASPEED USB Controller Both devices get StreamID 0x5e200 due to bridge aliasing, causing the USB controller to be rejected with 'Aliasing StreamID unsupported'. Per ASPEED, the AST1150 doesn't use a real PCI bus and always forwards the original Requester ID from downstream devices rather than replacing it with any alias. Add a new PCI_DEV_FLAGS_PCI_BRIDGE_NO_ALIAS flag and apply it to the AST1150. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://patch.msgid.link/20251217154529.377586-2-nirmoyd@nvidia.com
2026-01-12PCI: Provide pci_free_irq_vectors() stubBoqun Feng
473b9f331718 ("rust: pci: fix build failure when CONFIG_PCI_MSI is disabled") fixed a build error by providing Rust helpers when CONFIG_PCI_MSI is not set. However the Rust helpers rely on pci_free_irq_vectors(), which is only available when CONFIG_PCI=y. When CONFIG_PCI is not set, there is already a stub for pci_alloc_irq_vectors(). Add a similar stub for pci_free_irq_vectors(). Fixes: 473b9f331718 ("rust: pci: fix build failure when CONFIG_PCI_MSI is disabled") Reported-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Closes: https://lore.kernel.org/rust-for-linux/20251209014312.575940-1-fujita.tomonori@gmail.com/ Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202512220740.4Kexm4dW-lkp@intel.com/ Reported-by: Liang Jie <liangjie@lixiang.com> Closes: https://lore.kernel.org/rust-for-linux/20251222034415.1384223-1-buaajxlj@163.com/ Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Drew Fustini <fustini@kernel.org> Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20251226113938.52145-1-boqun.feng@gmail.com
2025-12-06Merge tag 'tsm-for-6.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm Pull PCIe Link Encryption and Device Authentication from Dan Williams: "New PCI infrastructure and one architecture implementation for PCIe link encryption establishment via platform firmware services. This work is the result of multiple vendors coming to consensus on some core infrastructure (thanks Alexey, Yilun, and Aneesh!), and three vendor implementations, although only one is included in this pull. The PCI core changes have an ack from Bjorn, the crypto/ccp/ changes have an ack from Tom, and the iommu/amd/ changes have an ack from Joerg. PCIe link encryption is made possible by the soup of acronyms mentioned in the shortlog below. Link Integrity and Data Encryption (IDE) is a protocol for installing keys in the transmitter and receiver at each end of a link. That protocol is transported over Data Object Exchange (DOE) mailboxes using PCI configuration requests. The aspect that makes this a "platform firmware service" is that the key provisioning and protocol is coordinated through a Trusted Execution Envrionment (TEE) Security Manager (TSM). That is either firmware running in a coprocessor (AMD SEV-TIO), or quasi-hypervisor software (Intel TDX Connect / ARM CCA) running in a protected CPU mode. Now, the only reason to ask a TSM to run this protocol and install the keys rather than have a Linux driver do the same is so that later, a confidential VM can ask the TSM directly "can you certify this device?". That precludes host Linux from provisioning its own keys, because host Linux is outside the trust domain for the VM. It also turns out that all architectures, save for one, do not publish a mechanism for an OS to establish keys in the root port. So "TSM-established link encryption" is the only cross-architecture path for this capability for the foreseeable future. This unblocks the other arch implementations to follow in v6.20/v7.0, once they clear some other dependencies, and it unblocks the next phase of work to implement the end-to-end flow of confidential device assignment. The PCIe specification calls this end-to-end flow Trusted Execution Environment (TEE) Device Interface Security Protocol (TDISP). In the meantime, Linux gets a link encryption facility which has practical benefits along the same lines as memory encryption. It authenticates devices via certificates and may protect against interposer attacks trying to capture clear-text PCIe traffic. Summary: - Introduce the PCI/TSM core for the coordination of device authentication, link encryption and establishment (IDE), and later management of the device security operational states (TDISP). Notify the new TSM core layer of PCI device arrival and departure - Add a low level TSM driver for the link encryption establishment capabilities of the AMD SEV-TIO architecture - Add a library of helpers TSM drivers to use for IDE establishment and the DOE transport - Add skeleton support for 'bind' and 'guest_request' operations in support of TDISP" * tag 'tsm-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm: (23 commits) crypto/ccp: Fix CONFIG_PCI=n build virt: Fix Kconfig warning when selecting TSM without VIRT_DRIVERS crypto/ccp: Implement SEV-TIO PCIe IDE (phase1) iommu/amd: Report SEV-TIO support psp-sev: Assign numbers to all status codes and add new ccp: Make snp_reclaim_pages and __sev_do_cmd_locked public PCI/TSM: Add 'dsm' and 'bound' attributes for dependent functions PCI/TSM: Add pci_tsm_guest_req() for managing TDIs PCI/TSM: Add pci_tsm_bind() helper for instantiating TDIs PCI/IDE: Initialize an ID for all IDE streams PCI/IDE: Add Address Association Register setup for downstream MMIO resource: Introduce resource_assigned() for discerning active resources PCI/TSM: Drop stub for pci_tsm_doe_transfer() drivers/virt: Drop VIRT_DRIVERS build dependency PCI/TSM: Report active IDE streams PCI/IDE: Report available IDE streams PCI/IDE: Add IDE establishment helpers PCI: Establish document for PCI host bridge sysfs attributes PCI: Add PCIe Device 3 Extended Capability enumeration PCI/TSM: Establish Secure Sessions and Link Encryption ...
2025-12-04Merge tag 'pci-v6.19-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Enable host bridge emulation for PCI_DOMAINS_GENERIC platforms (Dan Williams) - Switch vmd from custom domain number allocator to the common allocator to prevent a potential race with new non-VMD buses (Dan Williams) - Enable Precision Time Measurement (PTM) only if device advertises support for a relevant role, to prevent invalid PTM Requests that cause ACS violations that are reported as AER Uncorrectable Non-Fatal errors (Mika Westerberg) Resource management: - Prevent resource tree corruption when BAR resize fails (Ilpo Järvinen) - Restore BARs to the original size if a BAR resize fails (Ilpo Järvinen) - Remove BAR release from BAR resize attempts by the xe, i915, and amdgpu drivers so the PCI core can restore BARs if the resize fails (Ilpo Järvinen) - Move Resizable BAR code to rebar.c (Ilpo Järvinen) - Add pci_rebar_size_supported() and use it in i915 and xe (Ilpo Järvinen) - Add pci_rebar_get_max_size() and use it in xe and amdgpu (Ilpo Järvinen) Power management and error handling: - For drivers using PCI legacy suspend, save config state at suspend so that state (not any earlier state from enumeration, probe, or error recovery) will be restored when resuming (Lukas Wunner) - For devices with no driver or a driver that lacks power management, save config state at hibernate so that state (not any earlier state from enumeration, probe, or error recovery) will be restored when resuming (Lukas Wunner) - Save device config space on device addition, before driver binding, so error recovery works more reliably (Lukas Wunner) - Drop pci_save_state() from several drivers that no longer need it since the PCI core always does it and pci_restore_state() no longer invalidates the saved state (Lukas Wunner) - Document use of pci_save_state() by drivers to capture the state they want restored during error recovery (Lukas Wunner) Power control: - Add a struct pci_ops.assert_perst() function pointer to assert/deassert PCIe PERST# and implement it for the qcom driver (Krishna Chaitanya Chundru) - Add DT binding and pwrctrl driver for the Toshiba TC9563 PCIe switch, which must be held in reset after poweron so the pwrctrl driver can configure the switch via I2C before bringing up the links (Krishna Chaitanya Chundru) Endpoint framework: - Convert the endpoint doorbell test to use a threaded IRQ to fix a 'sleeping while atomic' issue (Bhanu Seshu Kumar Valluri) - Add endpoint VNTB MSI doorbell support to reduce latency between host and endpoint (Frank Li) New native PCIe controller drivers: - Add CIX Sky1 host controller DT binding and driver (Hans Zhang) - Add NXP S32G host controller DT binding and driver (Vincent Guittot) - Add Renesas RZ/G3S host controller DT binding and driver (Claudiu Beznea) - Add SpacemiT K1 host controller DT binding and driver (Alex Elder) Amlogic Meson PCIe controller driver: - Update DT binding to name DBI region 'dbi', not 'elbi', and update driver to support both (Manivannan Sadhasivam) Apple PCIe controller driver: - Move struct pci_host_bridge allocation from pci_host_common_init() to callers, which significantly simplifies pcie-apple (Marc Zyngier) Broadcom STB PCIe controller driver: - Disable advertising ASPM L0s support correctly (Jim Quinlan) - Add a panic/die handler to print diagnostic info in case PCIe caused an unrecoverable abort (Jim Quinlan) Cadence PCIe controller driver: - Add module support for Cadence platform host and endpoint controller driver (Manikandan K Pillai) - Split headers into 'legacy' (LGA) and 'high perf' (HPA) to prepare for new CIX Sky1 driver (Manikandan K Pillai) MediaTek PCIe controller driver: - Convert DT binding to YAML schema (Christian Marangi) - Add Airoha AN7583 DT compatible and driver support (Christian Marangi) Qualcomm PCIe controller driver: - Add Qualcomm Kaanapali to SM8550 DT binding (Qiang Yu) - Add required 'power-domains' and 'resets' to qcom sa8775p, sc7280, sc8280xp, sm8150, sm8250, sm8350, sm8450, sm8550, x1e80100 DT schemas (Krzysztof Kozlowski) - Look up OPP using both frequency and data rate (not just frequency) so RPMh votes can account for both (Krishna Chaitanya Chundru) Rockchip DesignWare PCIe controller driver: - Add Rockchip RK3528 compatible strings in DT binding (Yao Zi) STMicroelectronics STM32MP25 PCIe controller driver: - Fix a race between link training and endpoint register initialization (Christian Bruel) - Align endpoint allocations to match the ATU requirements (Christian Bruel) Synopsys DesignWare PCIe controller driver: - Clear L1 PM Substate Capability 'Supported' bits unless glue driver says it's supported, which prevents users from enabling non-working L1SS. Currently only qcom and tegra194 support L1SS (Bjorn Helgaas) - Remove now-superfluous L1SS disable code from tegra194 (Bjorn Helgaas) - Configure L1SS support in dw-rockchip when DT says 'supports-clkreq' (Shawn Lin) TI Keystone PCIe controller driver: - Fail the probe instead of silently succeeding if ks_pcie_of_data didn't specify Root Complex or Endpoint mode (Siddharth Vadapalli) - Make keystone buildable as a loadable module, except on ARM32 where hook_fault_code() is __init (Siddharth Vadapalli)" * tag 'pci-v6.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (100 commits) MAINTAINERS: Add Manivannan Sadhasivam as PCI/pwrctrl maintainer MAINTAINERS: Add CIX Sky1 PCIe controller driver maintainer PCI: sky1: Add PCIe host support for CIX Sky1 dt-bindings: PCI: Add CIX Sky1 PCIe Root Complex bindings PCI: cadence: Add support for High Perf Architecture (HPA) controller MAINTAINERS: Add NXP S32G PCIe controller driver maintainer PCI: s32g: Add NXP S32G PCIe controller driver (RC) PCI: dwc: Add register and bitfield definitions dt-bindings: PCI: s32g: Add NXP S32G PCIe controller PCI: Add Renesas RZ/G3S host controller driver PCI: host-generic: Move bridge allocation outside of pci_host_common_init() dt-bindings: PCI: Add Renesas RZ/G3S PCIe controller binding PCI: Validate pci_rebar_size_supported() input Documentation: PCI: Amend error recovery doc with pci_save_state() rules treewide: Drop pci_save_state() after pci_restore_state() PCI/ERR: Ensure error recoverability at all times PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw PCI/PM: Reinstate clearing state_saved in legacy and !PM codepaths PCI: dw-rockchip: Configure L1SS support PCI: tegra194: Remove unnecessary L1SS disable code ...
2025-12-03Merge branch 'pci/pwrctrl-tc9563'Bjorn Helgaas
- Add a struct pci_ops.assert_perst() function pointer to assert/deassert PCIe PERST# and implement it for the qcom driver (Krishna Chaitanya Chundru) - Add DT binding and pwrctrl driver for the Toshiba TC9563 PCIe switch, which must be held in reset after poweron so the pwrctrl driver can configure the switch via I2C before bringing up the links (Krishna Chaitanya Chundru) * pci/pwrctrl-tc9563: PCI: pwrctrl: Add power control driver for TC9563 PCI: qcom: Implement .assert_perst() PCI: dwc: Implement .assert_perst() for dwc glue drivers PCI: Add .assert_perst() to control PCIe PERST# dt-bindings: PCI: Add binding for Toshiba TC9563 PCIe switch
2025-12-03Merge branch 'pci/controller/keystone'Bjorn Helgaas
- Fail the probe instead of silently succeeding if ks_pcie_of_data didn't specify Root Complex or Endpoint mode (Siddharth Vadapalli) - Make keystone buildable as a loadable module, except on ARM32 where hook_fault_code() is __init (Siddharth Vadapalli) * pci/controller/keystone: PCI: keystone: Add support to build as a loadable module PCI: dwc: Export dw_pcie_allocate_domains() and dw_pcie_ep_raise_msix_irq() PCI: Export pci_get_host_bridge_device() for use by pci-keystone PCI: keystone: Exit ks_pcie_probe() for invalid mode
2025-12-03Merge branch 'pci/resource'Bjorn Helgaas
- Prevent resource tree corruption when BAR resize fails (Ilpo Järvinen) - Restore BARs to the original size if a BAR resize fails (Ilpo Järvinen) - Remove BAR release from BAR resize attempts by the xe, i915, and amdgpu drivers so the PCI core can restore BARs if the resize fails (Ilpo Järvinen) - Move Resizable BAR code to rebar.c (Ilpo Järvinen) - Add pci_rebar_size_supported() and use it in i915 and xe (Ilpo Järvinen) - Add pci_rebar_get_max_size() and use it in xe and amdgpu (Ilpo Järvinen) * pci/resource: PCI: Validate pci_rebar_size_supported() input PCI: Convert BAR sizes bitmasks to u64 drm/amdgpu: Use pci_rebar_get_max_size() drm/xe/vram: Use pci_rebar_get_max_size() PCI: Add pci_rebar_get_max_size() drm/xe/vram: Use PCI rebar helpers in resize_vram_bar() drm/i915/gt: Use pci_rebar_size_supported() PCI: Add pci_rebar_size_supported() helper PCI: Improve Resizable BAR functions kernel doc PCI: Move pci_rebar_size_to_bytes() and export it PCI: Move pci_rebar_bytes_to_size() and clean it up PCI: Move Resizable BAR code to rebar.c PCI: Prevent restoring assigned resources drm/amdgpu: Remove driver side BAR release before resize drm/i915: Remove driver side BAR release before resize drm/xe: Remove driver side BAR release before resize PCI: Add kerneldoc for pci_resize_resource() PCI: Fix restoring BARs on BAR resize rollback path PCI: Free saved list without holding pci_bus_sem PCI: Try BAR resize even when no window was released PCI: Change pci_dev variable from 'bridge' to 'dev' PCI/IOV: Adjust ->barsz[] when changing BAR size PCI: Prevent resource tree corruption when BAR resize fails
2025-12-03Merge branch 'pci/ptm'Bjorn Helgaas
- Enable PTM only if device advertises support for a relevant role, to prevent invalid PTM Requests that cause ACS violations that are reported as AER Uncorrectable Non-Fatal errors (Mika Westerberg) * pci/ptm: PCI/PTM: Enable only if device advertises relevant role
2025-11-18PCI: Add .assert_perst() to control PCIe PERST#Krishna Chaitanya Chundru
Controller driver probes first, enables link training and scans the bus. When the PCI bridge is found, its child DT nodes will be scanned and pwrctrl devices will be created if needed. By the time pwrctrl driver probe gets called, link training is already enabled by controller driver. Certain devices like TC9563, which uses the PCI pwrctl framework, need to configure the device before the PCIe link is up. As the controller driver already enables link training as part of its probe, the moment device is powered on, controller and device participate in link training and link can come up immediately and may not have time to configure the device. So we need to stop the link training by using assert_perst() by asserting PERST# and de-asserting PERST# after device is configured. Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20251101-tc9563-v9-2-de3429f7787a@oss.qualcomm.com
2025-11-14PCI/IDE: Initialize an ID for all IDE streamsDan Williams
The PCIe spec defines two types of streams - selective and link. Each stream has an ID from the same bucket so a stream ID does not tell the type. The spec defines an "enable" bit for every stream and required stream IDs to be unique among all enabled stream but there is no such requirement for disabled streams. However, when IDE_KM is programming keys, an IDE-capable device needs to know the type of stream being programmed to write it directly to the hardware as keys are relatively large, possibly many of them and devices often struggle with keeping around rather big data not being used. Walk through all streams on a device and initialise the IDs to some unique number, both link and selective. The weakest part of this proposal is the host bridge ide_stream_ids_ida. Technically, a Stream ID only needs to be unique within a given partner pair. However, with "anonymous" / unassigned streams there is no convenient place to track the available ids. Proceed with an ida in the host bridge for now, but consider moving this tracking to be an ide_stream_ids_ida per device. Co-developed-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20251113021446.436830-6-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2025-11-14PCI/IDE: Add Address Association Register setup for downstream MMIOXu Yilun
The address ranges for downstream Address Association Registers need to cover memory addresses for all functions (PFs/VFs/downstream devices) managed by a Device Security Manager (DSM). The proposed solution is get the memory (32-bit only) range and prefetchable-memory (64-bit capable) range from the immediate ancestor downstream port (either the direct-attach RP or deepest switch port when switch attached). Similar to RID association, address associations will be set by default if hardware sets 'Number of Address Association Register Blocks' in the 'Selective IDE Stream Capability Register' to a non-zero value. TSM drivers can opt-out of the settings by zero'ing out unwanted / unsupported address ranges. E.g. TDX Connect only supports prefetachable (64-bit capable) memory ranges for the Address Association setting. If the immediate downstream port provides both a memory range and prefetchable-memory range, but the IDE partner port only provides 1 Address Association Register block then the TSM driver can pick which range to associate, or let the PCI core prioritize memory. Note, the Address Association Register setup for upstream requests is still uncertain so is not included. Co-developed-by: Aneesh Kumar K.V <aneesh.kumar@kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@kernel.org> Co-developed-by: Arto Merilainen <amerilainen@nvidia.com> Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com> Co-developed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20251114010227.567693-1-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2025-11-14PCI: Convert BAR sizes bitmasks to u64Ilpo Järvinen
PCIe r7.0, sec 7.8.6, defines resizable BAR sizes beyond the currently supported maximum of 128TB, which will require more than u32 to store the entire bitmask. Convert Resizable BAR related functions to use u64 bitmask for BAR sizes to make the typing more future-proof. The support for the larger BAR sizes themselves is not added at this point. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patch.msgid.link/20251113180053.27944-12-ilpo.jarvinen@linux.intel.com
2025-11-14PCI: Add pci_rebar_get_max_size()Ilpo Järvinen
Add pci_rebar_get_max_size() to allow simplifying code that wants to know the maximum possible size for a Resizable BAR. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patch.msgid.link/20251113180053.27944-9-ilpo.jarvinen@linux.intel.com
2025-11-14PCI: Add pci_rebar_size_supported() helperIlpo Järvinen
Many callers of pci_rebar_get_possible_sizes() are interested in finding out if a particular encoded BAR Size (PCIe r7.0, sec 7.8.6.3) is supported by the particular BAR. Add pci_rebar_size_supported() into PCI core to make it easy for the drivers to determine if the BAR size is supported or not. Use the new function in pci_resize_resource() and in pci_iov_vf_bar_set_size(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patch.msgid.link/20251113180053.27944-6-ilpo.jarvinen@linux.intel.com
2025-11-14PCI: Move pci_rebar_size_to_bytes() and export itIlpo Järvinen
pci_rebar_size_to_bytes() is in drivers/pci/pci.h but would be useful for endpoint drivers as well. Move the function to rebar.c and export it. In addition, convert the literal to where the number comes from (PCI_REBAR_MIN_SIZE). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patch.msgid.link/20251113180053.27944-4-ilpo.jarvinen@linux.intel.com
2025-11-14PCI: Move pci_rebar_bytes_to_size() and clean it upIlpo Järvinen
Move pci_rebar_bytes_to_size() from include/linux/pci.h to rebar.c as it does not look very trivial and is not expected to be performance critical. Convert literals to use a newly added PCI_REBAR_MIN_SIZE define. Also add kernel doc for the function as the function is exported. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michael J. Ruhl <mjruhl@habana.ai> Link: https://patch.msgid.link/20251113180053.27944-3-ilpo.jarvinen@linux.intel.com
2025-11-14PCI: Fix restoring BARs on BAR resize rollback pathIlpo Järvinen
BAR resize operation is implemented in the pci_resize_resource() and pbus_reassign_bridge_resources() functions. pci_resize_resource() can be called either from __resource_resize_store() from sysfs or directly by the driver for the Endpoint Device. The pci_resize_resource() requires that caller has released the device resources that share the bridge window with the BAR to be resized as otherwise the bridge window is pinned in place and cannot be changed. pbus_reassign_bridge_resources() rolls back resources if the resize operation fails, but rollback is performed only for the bridge windows. Because releasing the device resources are done by the caller of the BAR resize interface, these functions performing the BAR resize do not have access to the device resources as they were before the resize. pbus_reassign_bridge_resources() could try __pci_bridge_assign_resources() after rolling back the bridge windows as they were, however, it will not guarantee the resource are assigned due to differences in how FW and the kernel assign the resources (alignment of the start address and tail). To perform rollback robustly, the BAR resize interface has to be altered to also release the device resources that share the bridge window with the BAR to be resized. Also, remove restoring from the entries failed list as saved list should now contain both the bridge windows and device resources so the extra restore is duplicated work. Some drivers (currently only amdgpu) want to prevent releasing some resources. Add exclude_bars param to pci_resize_resource() and make amdgpu pass its register BAR (BAR 2 or 5), which should never be released during resize operation. Normally 64-bit prefetchable resources do not share a bridge window with the 32-bit only register BAR, but there are various fallbacks in the resource assignment logic which may make the resources share the bridge window in rare cases. This change (together with the driver side changes) is to counter the resource releases that had to be done to prevent resource tree corruption in the ("PCI: Release assigned resource before restoring them") change. As such, it likely restores functionality in cases where device resources were released to avoid resource tree conflicts which appeared to be "working" when such conflicts were not correctly detected by the kernel. Reported-by: Simon Richter <Simon.Richter@hogyros.de> Link: https://lore.kernel.org/linux-pci/f9a8c975-f5d3-4dd2-988e-4371a1433a60@hogyros.de/ Reported-by: Alex Bennée <alex.bennee@linaro.org> Link: https://lore.kernel.org/linux-pci/874irqop6b.fsf@draig.linaro.org/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: squash amdgpu BAR selection from https://lore.kernel.org/r/20251114103053.13778-1-ilpo.jarvinen@linux.intel.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patch.msgid.link/20251113162628.5946-7-ilpo.jarvinen@linux.intel.com
2025-11-13PCI: Export pci_get_host_bridge_device() for use by pci-keystoneSiddharth Vadapalli
The pci-keystone.c driver uses the 'pci_get_host_bridge_device()' helper. Export it in preparation for enabling the pci-keystone.c driver to be built as a loadable module. Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251029080547.1253757-2-s-vadapalli@ti.com
2025-11-12PCI/ASPM: Cache L0s/L1 Supported so advertised link states can be overriddenBjorn Helgaas
Defective devices sometimes advertise support for ASPM L0s or L1 states even if they don't work correctly. Cache the L0s Supported and L1 Supported bits early in enumeration so HEADER quirks can override the ASPM states advertised in Link Capabilities before pcie_aspm_cap_init() enables ASPM. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Shawn Lin <shawn.lin@rock-chips.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Link: https://patch.msgid.link/20251110222929.2140564-2-helgaas@kernel.org
2025-11-12PCI/PTM: Enable only if device advertises relevant roleMika Westerberg
We have a Switch Upstream Port (2b:00.0) that has a PTM Capability, but doesn't advertise support for any PTM roles: Capabilities: [220 v1] Precision Time Measurement PTMCap: Requester- Responder- Root- Linux enables PTM without looking into what roles it actually supports, and apparently the Port immediately sends PTM Requests even though it doesn't support the PTM Requester role. The messages include an invalid bus number, so the Root Port detects an ACS Violation (see the PCIe r7.0, sec 6.12.1.1, implementation note): pci 0000:2b:00.0: [8086:5786] type 01 class 0x060400 PCIe Switch Upstream Port pci 0000:2b:00.0: PTM enabled, 4ns granularity pcieport 0000:00:07.1: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:00:07.1 pcieport 0000:00:07.1: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Receiver ID) pcieport 0000:00:07.1: device [8086:e44f] error status/mask=00200000/00000000 pcieport 0000:00:07.1: [21] ACSViol (First) pcieport 0000:00:07.1: AER: TLP Header: 0x34000000 0x00000052 0x00000000 0x00000000 The TLP Header shows a 4 DW header, no data (001b) Msg with Local routing (1 0100b) with Requester ID 0x0000 and PTM Request code (0x52). Fix this by enabling PTM only if the following conditions are true (see sec 6.21.1 figure 6-21): - Endpoint must advertise PTM Requester Capable - Switch Upstream Port must advertise PTM Responder Capable - Root Port must advertise PTM Root Capable Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> [bhelgaas: commit log, comments] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251112074614.1440266-1-mika.westerberg@linux.intel.com
2025-11-03PCI/IDE: Add IDE establishment helpersDan Williams
There are two components to establishing an encrypted link, provisioning the stream in Partner Port config-space, and programming the keys into the link layer via IDE_KM (IDE Key Management). This new library, drivers/pci/ide.c, enables the former. IDE_KM, via a TSM low-level driver, is saved for later. With the platform TSM implementations of SEV-TIO and TDX Connect in mind this library abstracts small differences in those implementations. For example, TDX Connect handles Root Port register setup while SEV-TIO expects System Software to update the Root Port registers. This is the rationale for fine-grained 'setup' + 'enable' verbs. The other design detail for TSM-coordinated IDE establishment is that the TSM may manage allocation of Stream IDs, this is why the Stream ID value is passed in to pci_ide_stream_setup(). The flow is: pci_ide_stream_alloc(): Allocate a Selective IDE Stream Register Block in each Partner Port (Endpoint + Root Port), and reserve a host bridge / platform stream slot. Gather Partner Port specific stream settings like Requester ID. pci_ide_stream_register(): Publish the stream in sysfs after allocating a Stream ID. In the TSM case the TSM allocates the Stream ID for the Partner Port pair. pci_ide_stream_setup(): Program the stream settings to a Partner Port. Caller is responsible for optionally calling this for the Root Port as well if the TSM implementation requires it. pci_ide_stream_enable(): Enable the stream after IDE_KM. In support of system administrators auditing where platform, Root Port, and Endpoint IDE stream resources are being spent, the allocated stream is reflected as a symlink from the host bridge to the endpoint with the name: stream%d.%d.%d Where the tuple of integers reflects the allocated platform, Root Port, and Endpoint stream index (Selective IDE Stream Register Block) values. Thanks to Wu Hao for a draft implementation of this infrastructure. Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Lukas Wunner <lukas@wunner.de> Cc: Samuel Ortiz <sameo@rivosinc.com> Co-developed-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20251031212902.2256310-8-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2025-11-03PCI: Add PCIe Device 3 Extended Capability enumerationDan Williams
PCIe r7.0 Section 7.7.9 Device 3 Extended Capability Structure, defines the canonical location for determining the Flit Mode of a device. This status is a dependency for PCIe IDE enabling. Add a new fm_enabled flag to 'struct pci_dev'. Cc: Lukas Wunner <lukas@wunner.de> Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Samuel Ortiz <sameo@rivosinc.com> Cc: Alexey Kardashevskiy <aik@amd.com> Cc: Xu Yilun <yilun.xu@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20251031212902.2256310-6-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2025-11-03PCI/TSM: Establish Secure Sessions and Link EncryptionDan Williams
The PCIe 7.0 specification, section 11, defines the Trusted Execution Environment (TEE) Device Interface Security Protocol (TDISP). This protocol definition builds upon Component Measurement and Authentication (CMA), and link Integrity and Data Encryption (IDE). It adds support for assigning devices (PCI physical or virtual function) to a confidential VM such that the assigned device is enabled to access guest private memory protected by technologies like Intel TDX, AMD SEV-SNP, RISCV COVE, or ARM CCA. The "TSM" (TEE Security Manager) is a concept in the TDISP specification of an agent that mediates between a "DSM" (Device Security Manager) and system software in both a VMM and a confidential VM. A VMM uses TSM ABIs to setup link security and assign devices. A confidential VM uses TSM ABIs to transition an assigned device into the TDISP "RUN" state and validate its configuration. From a Linux perspective the TSM abstracts many of the details of TDISP, IDE, and CMA. Some of those details leak through at times, but for the most part TDISP is an internal implementation detail of the TSM. CONFIG_PCI_TSM adds an "authenticated" attribute and "tsm/" subdirectory to pci-sysfs. Consider that the TSM driver may itself be a PCI driver. Userspace can watch for the arrival of a "TSM" device, /sys/class/tsm/tsm0/uevent KOBJ_CHANGE, to know when the PCI core has initialized TSM services. The operations that can be executed against a PCI device are split into two mutually exclusive operation sets, "Link" and "Security" (struct pci_tsm_{link,security}_ops). The "Link" operations manage physical link security properties and communication with the device's Device Security Manager firmware. These are the host side operations in TDISP. The "Security" operations coordinate the security state of the assigned virtual device (TDI). These are the guest side operations in TDISP. Only "link" (Secure Session and physical Link Encryption) operations are defined at this stage. There are placeholders for the device security (Trusted Computing Base entry / exit) operations. The locking allows for multiple devices to be executing commands simultaneously, one outstanding command per-device and an rwsem synchronizes the implementation relative to TSM registration/unregistration events. Thanks to Wu Hao for his work on an early draft of this support. Cc: Lukas Wunner <lukas@wunner.de> Cc: Samuel Ortiz <sameo@rivosinc.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alexey Kardashevskiy <aik@amd.com> Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com> Link: https://patch.msgid.link/20251031212902.2256310-5-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2025-11-03PCI: Introduce pci_walk_bus_reverse(), for_each_pci_dev_reverse()Dan Williams
PCI/TSM, the PCI core functionality for the PCIe TEE Device Interface Security Protocol (TDISP), has a need to walk all subordinate functions of a Device Security Manager (DSM) to setup a device security context. A DSM is physical function 0 of multi-function or SR-IOV device endpoint, or it is an upstream switch port. In error scenarios or when a TEE Security Manager (TSM) device is removed it needs to unwind all established DSM contexts. Introduce reverse versions of PCI device iteration helpers to mirror the setup path and ensure that dependent children are handled before parents. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20251031212902.2256310-4-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2025-11-03PCI/IDE: Enumerate Selective Stream IDE capabilitiesDan Williams
Link encryption is a new PCIe feature enumerated by "PCIe r7.0 section 7.9.26 IDE Extended Capability". It is both a standalone port + endpoint capability, and a building block for the security protocol defined by "PCIe r7.0 section 11 TEE Device Interface Security Protocol (TDISP)". That protocol coordinates device security setup between a platform TSM (TEE Security Manager) and a device DSM (Device Security Manager). While the platform TSM can allocate resources like Stream ID and manage keys, it still requires system software to manage the IDE capability register block. Add register definitions and basic enumeration in preparation for Selective IDE Stream establishment. A follow on change selects the new CONFIG_PCI_IDE symbol. Note that while the IDE specification defines both a point-to-point "Link Stream" and a Root Port to endpoint "Selective Stream", only "Selective Stream" is considered for Linux as that is the predominant mode expected by Trusted Execution Environment Security Managers (TSMs), and it is the security model that limits the number of PCI components within the TCB in a PCIe topology with switches. Co-developed-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alexey Kardashevskiy <aik@amd.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@kernel.org> Link: https://patch.msgid.link/20251031212902.2256310-3-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2025-10-28PCI: Enable host bridge emulation for PCI_DOMAINS_GENERIC platformsDan Williams
The ability to emulate a host bridge is useful not only for hardware PCI controllers like CONFIG_VMD, or virtual PCI controllers like CONFIG_PCI_HYPERV, but also for test and development scenarios like CONFIG_SAMPLES_DEVSEC [1]. One stumbling block for defining CONFIG_SAMPLES_DEVSEC, a sample implementation of a platform TSM for PCI Device Security, is the need to accommodate PCI_DOMAINS_GENERIC architectures alongside x86 [2]. In support of supplementing the existing CONFIG_PCI_BRIDGE_EMUL infrastructure for host bridges: * Introduce pci_bus_find_emul_domain_nr() as a common way to find a free PCI domain number whether that is to reuse the existing dynamic allocation code in the !ACPI case, or to assign an unused domain above the last ACPI segment. * Convert pci-hyperv to the new allocator so that the PCI core can unconditionally assume that bridge->domain_nr != PCI_DOMAIN_NR_NOT_SET is the dynamically allocated case. A follow on patch can also convert vmd to the new scheme. Currently vmd is limited to CONFIG_PCI_DOMAINS_GENERIC=n (x86) so, unlike pci-hyperv, it does not immediately conflict with this new pci_bus_find_emul_domain_nr() mechanism. Link: http://lore.kernel.org/174107249038.1288555.12362100502109498455.stgit@dwillia2-xfh.jf.intel.com [1] Reported-by: Suzuki K Poulose <suzuki.poulose@arm.com> Closes: http://lore.kernel.org/20250311144601.145736-3-suzuki.poulose@arm.com [2] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Cc: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Rob Herring <robh@kernel.org> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Link: https://patch.msgid.link/20251024224622.1470555-2-dan.j.williams@intel.com
2025-10-03Merge branch 'pci/resource'Bjorn Helgaas
- Ensure relaxed tail alignment does not increase min_align when computing bridge window size, to fix a regression (Ilpo Järvinen) - Fix bridge window size computation to fix a regression for devices with undefined PCI class, e.g., Samsung [144d:a5a5] (Ilpo Järvinen) - Fix error handling during resource resize to fix a regression in amdgpu (Ilpo Järvinen) - Align m68k pcibios_enable_device() with other arches (Ilpo Järvinen) - Remove several sparc pcibios_enable_device() implementations that don't do anything beyond what pci_enable_resources() does (Ilpo Järvinen) - Remove mips pcibios_enable_resources() and use pci_enable_resources() instead (Ilpo Järvinen) - Refactor and simplify find_bus_resource_of_type() (Ilpo Järvinen) - Claim bridge windows before setting them up (Ilpo Järvinen) - Disable non-claimed bridge windows so the kernel's view matches the hardware configuration (Ilpo Järvinen) - Use pci_release_resource() instead of release_resource() to reduce code duplication and increase consistency (Ilpo Järvinen) - Enable bridges even if bridge window assignment fails (Ilpo Järvinen) - Preserve bridge window resource type flags when assignment fails because we may need it later (Ilpo Järvinen) - Add bridge window selection functions to make the selection consistent across the several places that do this (Ilpo Järvinen) - Warn if bridge window cannot be released when resizing BAR (Ilpo Järvinen) - Set up bridge resources before enumerating children so we can check whether child resources are inside bridge windows (Ilpo Järvinen) * pci/resource: PCI: Set up bridge resources earlier PCI: Don't print stale information about resource PCI: Alter misleading recursion to pci_bus_release_bridge_resources() PCI: Pass bridge window to pci_bus_release_bridge_resources() PCI: Add pci_setup_one_bridge_window() PCI: Refactor remove_dev_resources() to use pbus_select_window() PCI: Refactor distributing available memory to use loops PCI: Use pbus_select_window_for_type() during mem window sizing PCI: Use pbus_select_window() in space available checker PCI: Rename resource variable from r to res PCI: Use pbus_select_window_for_type() during IO window sizing PCI: Use pbus_select_window() during BAR resize PCI: Warn if bridge window cannot be released when resizing BAR PCI: Fix finding bridge window in pci_reassign_bridge_resources() PCI: Add bridge window selection functions PCI: Add defines for bridge window indexing PCI: Preserve bridge window resource type flags PCI: Enable bridge even if bridge window fails to assign PCI: Use pci_release_resource() instead of release_resource() PCI: Disable non-claimed bridge window PCI: Always claim bridge window before its setup PCI: Refactor find_bus_resource_of_type() logic checks PCI: Move find_bus_resource_of_type() earlier MIPS: PCI: Use pci_enable_resources() sparc/PCI: Remove pcibios_enable_device() as they do nothing extra m68k/PCI: Use pci_enable_resources() in pcibios_enable_device() PCI: Fix failure detection during resource resize PCI: Fix pdev_resources_assignable() disparity PCI: Ensure relaxed tail alignment does not increase min_align
2025-09-16PCI: Refactor distributing available memory to use loopsIlpo Järvinen
pci_bus_distribute_available_resources() and pci_bridge_distribute_available_resources() retain bridge window resources and related data needed for distributing the available window in independent variables for io, memory, and prefetchable memory windows. The code is essentially the same for all of them and therefore repeated three times with different variable names. Refactor pci_bus_distribute_available_resources() to take an array. This is complicated slightly by the function taking advantage of passing the struct as value, which cannot be done for arrays in C. Therefore, copy the data into a local array in the stack in the first loop. Variable names are (hopefully) improved slightly as well. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-21-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Use pci_release_resource() instead of release_resource()Ilpo Järvinen
A few places in setup-bus.c call release_resource() directly and end up duplicating functionality from pci_release_resource() such as parent check, logging, and clearing the resource. Worse yet, the way the resource is cleared is inconsistent between different sites. Convert release_resource() calls into pci_release_resource() to remove code duplication. This will also make the resource start, end, and flags behavior consistent, i.e., start address is cleared, and only IORESOURCE_UNSET is asserted for the resource. While at it, eliminate the unnecessary initialization of idx variable in pci_bridge_release_resources(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-9-ilpo.jarvinen@linux.intel.com
2025-08-14s390/pci: Use pci_uevent_ers() in PCI recoveryNiklas Schnelle
Issue uevents on s390 during PCI recovery using pci_uevent_ers() as done by EEH and AER PCIe recovery routines. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Link: https://patch.msgid.link/20250807-add_err_uevents-v5-2-adf85b0620b0@linux.ibm.com
2025-07-31Merge branch 'pci/resources'Bjorn Helgaas
- Restore VF resizable BAR state after reset (Michał Winiarski) - Add pci_resource_num_to_vf_bar() and pci_resource_num_from_vf_bar() to convert between VF BAR number and the dev->resource[] index (Michał Winiarski) - Allow IOV resources (VF BARs) to be resized (Michał Winiarski) - Add pci_iov_vf_bar_set_size() so drivers can control VF BAR size (Michał Winiarski) * pci/resources: PCI/IOV: Allow drivers to control VF BAR size PCI/IOV: Check that VF BAR fits within the reservation PCI/IOV: Allow IOV resources to be resized in pci_resize_resource() PCI/IOV: Add pci_resource_num_to_vf_bar() to convert VF BAR number to/from IOV resource PCI/IOV: Restore VF resizable BAR state after reset
2025-07-31Merge branch 'pci/hotplug'Bjorn Helgaas
- Fix runtime PM ref imbalance on Hot-Plug Capable ports caused by misinterpreting a config read failure after a device has been removed (Lukas Wunner) - Avoid creating a useless PCIe port service device for pciehp if the slot is handled by the ACPI hotplug driver (Lukas Wunner) - Ignore ACPI hotplug slots when calculating depth of pciehp hotplug ports (Lukas Wunner) - Simplify pci_bridge_d3_possible() and clarify comments (Lukas Wunner) * pci/hotplug: PCI: Move is_pciehp check out of pciehp_is_native() PCI: pciehp: Use is_pciehp instead of is_hotplug_bridge PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports
2025-07-29PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable portsLukas Wunner
pci_bridge_d3_possible() is called from both pcie_portdrv_probe() and pcie_portdrv_remove() to determine whether runtime power management shall be enabled (on probe) or disabled (on remove) on a PCIe port. The underlying assumption is that pci_bridge_d3_possible() always returns the same value, else a runtime PM reference imbalance would occur. That assumption is not given if the PCIe port is inaccessible on remove due to hot-unplug: pci_bridge_d3_possible() calls pciehp_is_native(), which accesses Config Space to determine whether the port is Hot-Plug Capable. An inaccessible port returns "all ones", which is converted to "all zeroes" by pcie_capability_read_dword(). Hence the port no longer seems Hot-Plug Capable on remove even though it was on probe. The resulting runtime PM ref imbalance causes warning messages such as: pcieport 0000:02:04.0: Runtime PM usage count underflow! Avoid the Config Space access (and thus the runtime PM ref imbalance) by caching the Hot-Plug Capable bit in struct pci_dev. The struct already contains an "is_hotplug_bridge" flag, which however is not only set on Hot-Plug Capable PCIe ports, but also Conventional PCI Hot-Plug bridges and ACPI slots. The flag identifies bridges which are allocated additional MMIO and bus number resources to allow for hierarchy expansion. The kernel is somewhat sloppily using "is_hotplug_bridge" in a number of places to identify Hot-Plug Capable PCIe ports, even though the flag encompasses other devices. Subsequent commits replace these occurrences with the new flag to clearly delineate Hot-Plug Capable PCIe ports from other kinds of hotplug bridges. Document the existing "is_hotplug_bridge" and the new "is_pciehp" flag and document the (non-obvious) requirement that pci_bridge_d3_possible() always returns the same value across the entire lifetime of a bridge, including its hot-removal. Fixes: 5352a44a561d ("PCI: pciehp: Make pciehp_is_native() stricter") Reported-by: Laurent Bigonville <bigon@bigon.be> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220216 Reported-by: Mario Limonciello <mario.limonciello@amd.com> Closes: https://lore.kernel.org/r/20250609020223.269407-3-superm1@kernel.org/ Link: https://lore.kernel.org/all/20250620025535.3425049-3-superm1@kernel.org/T/#u Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Cc: stable@vger.kernel.org # v4.18+ Link: https://patch.msgid.link/fe5dcc3b2e62ee1df7905d746bde161eb1b3291c.1752390101.git.lukas@wunner.de
2025-07-17PCI: Add pci_is_display() to check if device is a display controllerMario Limonciello
Several places in the kernel do class shifting to match whether a PCI device is display class. Add pci_is_display() for those places to use. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Daniel Dadap <ddadap@nvidia.com> Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patch.msgid.link/20250717173812.3633478-2-superm1@kernel.org
2025-07-14PCI/IOV: Allow drivers to control VF BAR sizeMichał Winiarski
Drivers could leverage the fact that the VF BAR MMIO reservation is created for total number of VFs supported by the device by resizing the BAR to larger size when smaller number of VFs is enabled. Add pci_iov_vf_bar_set_size() to control the size and a pci_iov_vf_bar_get_sizes() helper to get the VF BAR sizes that will allow up to num_vfs to be successfully enabled with the current underlying reservation size. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/20250702093522.518099-6-michal.winiarski@intel.com