| Age | Commit message (Collapse) | Author |
|
Pull CXL updates from Dave Jiang:
- Introduce cxl_memdev_attach and pave way for soft reserved handling,
type2 accelerator enabling, and LSA 2.0 enabling. All these series
require the endpoint driver to settle before continuing the memdev
driver probe.
- Address CXL port error protocol handling and reporting.
The large patch series was split into three parts. The first two
parts are included here with the final part coming later.
The first part consists of a series of code refactoring to PCI AER
sub-system that addresses CXL and also CXL RAS code to prepare for
port error handling.
The second part refactors the CXL code to move management of
component registers to cxl_port objects to allow all CXL AER errors
to be handled through the cxl_port hierarchy.
- Provide AMD Zen5 platform address translation for CXL using ACPI
PRMT. This includes a conventions document to explain why this is
needed and how it's implemented.
- Misc CXL patches of fixes, cleanups, and updates. Including CXL
address translation for unaligned MOD3 regions.
[ TLA service: CXL is "Compute Express Link" ]
* tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (59 commits)
cxl: Disable HPA/SPA translation handlers for Normalized Addressing
cxl/region: Factor out code into cxl_region_setup_poison()
cxl/atl: Lock decoders that need address translation
cxl: Enable AMD Zen5 address translation using ACPI PRMT
cxl/acpi: Prepare use of EFI runtime services
cxl: Introduce callback for HPA address ranges translation
cxl/region: Use region data to get the root decoder
cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()
cxl/region: Separate region parameter setup and region construction
cxl: Simplify cxl_root_ops allocation and handling
cxl/region: Store HPA range in struct cxl_region
cxl/region: Store root decoder in struct cxl_region
cxl/region: Rename misleading variable name @hpa to @hpa_range
Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
cxl, doc: Moving conventions in separate files
cxl, doc: Remove isonum.txt inclusion
cxl/port: Unify endpoint and switch port lookup
cxl/port: Move endpoint component register management to cxl_port
cxl/port: Map Port RAS registers
cxl/port: Move dport RAS setup to dport add time
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Don't try to enable Extended Tags on VFs since that bit is Reserved
and causes misleading log messages (Håkon Bugge)
- Initialize Endpoint Read Completion Boundary to match Root Port,
regardless of ACPI _HPX (Håkon Bugge)
- Apply _HPX PCIe Setting Record only to AER configuration, and only
when OS owns PCIe hotplug but not AER, to avoid clobbering Extended
Tag and Relaxed Ordering settings (Håkon Bugge)
Resource management:
- Move CardBus code to setup-cardbus.c and only build it when
CONFIG_CARDBUS is set (Ilpo Järvinen)
- Fix bridge window alignment with optional resources, where
additional alignment requirement was previously lost (Ilpo
Järvinen)
- Stop over-estimating bridge window size since they are now assigned
without any gaps between them (Ilpo Järvinen)
- Increase resource MAX_IORES_LEVEL to avoid /proc/iomem flattening
for nested bridges and endpoints (Ilpo Järvinen)
- Add pbus_mem_size_optional() to handle sizes of optional resources
(SR-IOV VF BARs, expansion ROMs, bridge windows) (Ilpo Järvinen)
- Don't claim disabled bridge windows to avoid spurious claim
failures (Ilpo Järvinen)
Driver binding:
- Fix device reference leak in pcie_port_remove_service() (Uwe
Kleine-König)
- Move pcie_port_bus_match() and pcie_port_bus_type to PCIe-specific
portdrv.c (Uwe Kleine-König)
- Convert portdrv to use pcie_port_bus_type.probe() and .remove()
callbacks so .probe() and .remove() can eventually be removed from
struct device_driver (Uwe Kleine-König)
Error handling:
- Clear stale errors on reporting agents upon probe so they don't
look like recent errors (Lukas Wunner)
- Add generic RAS tracepoint for hotplug events (Shuai Xue)
- Add RAS tracepoint for link speed changes (Shuai Xue)
Power management:
- Avoid redundant delay on transition from D3hot to D3cold if the
device was already in D3hot (Brian Norris)
- Prevent runtime suspend until devices are fully initialized to
avoid saving incompletely configured device state (Brian Norris)
Power control:
- Add power_on/off callbacks with generic signature to pwrseq,
tc9563, and slot drivers so they can be used by pwrctrl core
(Manivannan Sadhasivam)
- Add PCIe M.2 connector support to the slot pwrctrl driver
(Manivannan Sadhasivam)
- Switch to pwrctrl interfaces to create, destroy, and power on/off
devices, calling them from host controller drivers instead of the
PCI core (Manivannan Sadhasivam)
- Drop qcom .assert_perst() callbacks since this is now done by the
controller driver instead of the pwrctrl driver (Manivannan
Sadhasivam)
Virtualization:
- Remove an incorrect unlock in pci_slot_trylock() error handling
(Jinhui Guo)
- Lock the bridge device for slot reset (Keith Busch)
- Enable ACS after IOMMU configuration on OF platforms so ACS is
enabled an all devices; previously the first device enumerated
(typically a Root Port) didn't have ACS enabled (Manivannan
Sadhasivam)
- Disable ACS Source Validation for IDT 0x80b5 and 0x8090 switches to
work around hardware erratum; previously ACS SV was only
temporarily disabled, which worked for enumeration but not after
reset (Manivannan Sadhasivam)
Peer-to-peer DMA:
- Release per-CPU pgmap ref when vm_insert_page() fails to avoid hang
when removing the PCI device (Hou Tao)
- Remove incorrect p2pmem_alloc_mmap() warning about page refcount
(Hou Tao)
Endpoint framework:
- Add configfs sub-groups synchronously to avoid NULL pointer
dereference when racing with removal (Liu Song)
- Fix swapped parameters in pci_{primary/secondary}_epc_epf_unlink()
functions (Manikanta Maddireddy)
ASPEED PCIe controller driver:
- Add ASPEED Root Complex DT binding and driver (Jacky Chou)
Freescale i.MX6 PCIe controller driver:
- Add DT binding and driver support for an optional external refclock
in addition to the refclock from the internal PLL (Richard Zhu)
- Fix CLKREQ# control so host asserts it during enumeration and
Endpoints can use it afterwards to exit the L1.2 link state
(Richard Zhu)
NVIDIA Tegra PCIe controller driver:
- Export irq_domain_free_irqs() to allow PCI/MSI drivers that tear
down MSI domains to be built as modules (Aaron Kling)
- Allow pci-tegra to be built as a module (Aaron Kling)
NVIDIA Tegra194 PCIe controller driver:
- Relax Kconfig so tegra194 can be built for platforms beyond
Tegra194 (Vidya Sagar)
Qualcomm PCIe controller driver:
- Merge SC8180x DT binding into SM8150 (Krzysztof Kozlowski)
- Move SDX55, SDM845, QCS404, IPQ5018, IPQ6018, IPQ8074 Gen3,
IPQ8074, IPQ4019, IPQ9574, APQ8064, MSM8996, APQ8084 to dedicated
schema (Krzysztof Kozlowski)
- Add DT binding and driver support for SA8255p Endpoint being
configured by firmware (Mrinmay Sarkar)
- Parse PERST# from all PCIe bridge nodes for future platforms that
will have PERST# in Switch Downstream Ports as well as in Root
Ports (Manivannan Sadhasivam)
Renesas RZ/G3S PCIe controller driver:
- Use pci_generic_config_write() since the writability provided by
the custom wrapper is unnecessary (Claudiu Beznea)
SOPHGO PCIe controller driver:
- Disable ASPM L0s and L1 on Sophgo 2044 PCIe Root Ports (Inochi
Amaoto)
Synopsys DesignWare PCIe controller driver:
- Extend PCI_FIND_NEXT_CAP() and PCI_FIND_NEXT_EXT_CAP() to return a
pointer to the preceding Capability, to allow removal of
Capabilities that are advertised but not fully implemented (Qiang
Yu)
- Remove MSI and MSI-X Capabilities in platforms that can't support
them, so the PCI core automatically falls back to INTx (Qiang Yu)
- Add ASPM L1.1 and L1.2 Substates context to debugfs ltssm_status
for drivers that support this (Shawn Lin)
- Skip PME_Turn_Off broadcast and L2/L3 transition during suspend if
link is not up to avoid an unnecessary timeout (Manivannan
Sadhasivam)
- Revert dw-rockchip, qcom, and DWC core changes that used link-up
IRQs to trigger enumeration instead of waiting for link to be up
because the PCI core doesn't allocate bus number space for
hierarchies that might be attached (Niklas Cassel)
- Make endpoint iATU entry for MSI permanent instead of programming
it dynamically, which is slow and racy with respect to other
concurrent traffic, e.g., eDMA (Koichiro Den)
- Use iMSI-RX MSI target address when possible to fix endpoints using
32-bit MSI (Shawn Lin)
- Allow DWC host controller driver probe to continue if device is not
found or found but inactive; only fail when there's an error with
the link (Manivannan Sadhasivam)
- For controllers like NXP i.MX6QP and i.MX7D, where LTSSM registers
are not accessible after PME_Turn_Off, simply wait 10ms instead of
polling for L2/L3 Ready (Richard Zhu)
- Use multiple iATU entries to map large bridge windows and DMA
ranges when necessary instead of failing (Samuel Holland)
- Add EPC dynamic_inbound_mapping feature bit for Endpoint
Controllers that can update BAR inbound address translation without
requiring EPF driver to clear/reset the BAR first, and advertise it
for DWC-based Endpoints (Koichiro Den)
- Add EPC subrange_mapping feature bit for Endpoint Controllers that
can map multiple independent inbound regions in a single BAR,
implement subrange mapping, advertise it for DWC-based Endpoints,
and add Endpoint selftests for it (Koichiro Den)
- Make resizable BARs work for Endpoint multi-PF configurations;
previously it only worked for PF 0 (Aksh Garg)
- Fix Endpoint non-PF 0 support for BAR configuration, ATU mappings,
and Address Match Mode (Aksh Garg)
- Set up iATU when ECAM is enabled; previously IO and MEM outbound
windows weren't programmed, and ECAM-related iATU entries weren't
restored after suspend/resume, so config accesses failed (Krishna
Chaitanya Chundru)
Miscellaneous:
- Use system_percpu_wq and WQ_PERCPU to explicitly request per-CPU
work so WQ_UNBOUND can eventually be removed (Marco Crivellari)"
* tag 'pci-v7.0-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (176 commits)
PCI/bwctrl: Disable BW controller on Intel P45 using a quirk
PCI: Disable ACS SV for IDT 0x8090 switch
PCI: Disable ACS SV for IDT 0x80b5 switch
PCI: Cache ACS Capabilities register
PCI: Enable ACS after configuring IOMMU for OF platforms
PCI: Add ACS quirk for Pericom PI7C9X2G404 switches [12d8:b404]
PCI: Add ACS quirk for Qualcomm Hamoa & Glymur
PCI: Use device_lock_assert() to verify device lock is held
PCI: Use lockdep_assert_held(pci_bus_sem) to verify lock is held
PCI: Fix pci_slot_lock () device locking
PCI: Fix pci_slot_trylock() error handling
PCI: Mark Nvidia GB10 to avoid bus reset
PCI: Mark ASM1164 SATA controller to avoid bus reset
PCI: host-generic: Avoid reporting incorrect 'missing reg property' error
PCI/PME: Replace RMW of Root Status register with direct write
PCI/AER: Clear stale errors on reporting agents upon probe
PCI: Don't claim disabled bridge windows
PCI: rzg3s-host: Fix device node reference leak in rzg3s_pcie_host_parse_port()
PCI: dwc: Fix missing iATU setup when ECAM is enabled
PCI: dwc: Clean up iATU index usage in dw_pcie_iatu_setup()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull MSI updates from Thomas Gleixner:
"Updates for the [PCI] MSI subsystem:
- Add interrupt redirection infrastructure
Some PCI controllers use a single demultiplexing interrupt for the
MSI interrupts of subordinate devices.
This prevents setting the interrupt affinity of device interrupts,
which causes device interrupts to be delivered to a single CPU.
That obviously is counterproductive for multi-queue devices and
interrupt balancing.
To work around this limitation the new infrastructure installs a
dummy irq_set_affinity() callback which captures the affinity mask
and picks a redirection target CPU out of the mask.
When the PCI controller demultiplexes the interrupts it invokes a
new handling function in the core, which either runs the interrupt
handler in the context of the target CPU or delegates it to
irq_work on the target CPU.
- Utilize the interrupt redirection mechanism in the PCI DWC host
controller driver.
This allows affinity control for the subordinate device MSI
interrupts instead of being randomly executed on the CPU which runs
the demultiplex handler.
- Replace the binary 64-bit MSI flag with a DMA mask
Some PCI devices have PCI_MSI_FLAGS_64BIT in the MSI capability,
but implement less than 64 address bits. This breaks on platforms
where such a device is assigned an MSI address higher than what's
supported.
With the binary 64-bit flag there is no other choice than disabling
64-bit MSI support which leaves the device disfunctional.
By using a DMA mask the address limit of a device can be described
correctly which provides support for the above scenario.
- Make use of the DMA mask based address limit in the hda/intel and
radeon drivers to enable them on affected platforms
- The usual small cleanups and improvements"
* tag 'irq-msi-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ALSA: hda/intel: Make MSI address limit based on the device DMA limit
drm/radeon: Make MSI address limit based on the device DMA limit
PCI/MSI: Check the device specific address mask in msi_verify_entries()
PCI/MSI: Convert the boolean no_64bit_msi flag to a DMA address mask
genirq/redirect: Prevent writing MSI message on affinity change
PCI/MSI: Unmap MSI-X region on error
genirq: Update effective affinity for redirected interrupts
PCI: dwc: Enable MSI affinity support
PCI: dwc: Code cleanup
genirq: Add interrupt redirection infrastructure
genirq/msi: Correct kernel-doc in <linux/msi.h>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks
Pull kthread updates from Frederic Weisbecker:
"The kthread code provides an infrastructure which manages the
preferred affinity of unbound kthreads (node or custom cpumask)
against housekeeping (CPU isolation) constraints and CPU hotplug
events.
One crucial missing piece is the handling of cpuset: when an isolated
partition is created, deleted, or its CPUs updated, all the unbound
kthreads in the top cpuset become indifferently affine to _all_ the
non-isolated CPUs, possibly breaking their preferred affinity along
the way.
Solve this with performing the kthreads affinity update from cpuset to
the kthreads consolidated relevant code instead so that preferred
affinities are honoured and applied against the updated cpuset
isolated partitions.
The dispatch of the new isolated cpumasks to timers, workqueues and
kthreads is performed by housekeeping, as per the nice Tejun's
suggestion.
As a welcome side effect, HK_TYPE_DOMAIN then integrates both the set
from boot defined domain isolation (through isolcpus=) and cpuset
isolated partitions. Housekeeping cpumasks are now modifiable with a
specific RCU based synchronization. A big step toward making
nohz_full= also mutable through cpuset in the future"
* tag 'kthread-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks: (33 commits)
doc: Add housekeeping documentation
kthread: Document kthread_affine_preferred()
kthread: Comment on the purpose and placement of kthread_affine_node() call
kthread: Honour kthreads preferred affinity after cpuset changes
sched/arm64: Move fallback task cpumask to HK_TYPE_DOMAIN
sched: Switch the fallback task allowed cpumask to HK_TYPE_DOMAIN
kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management
kthread: Include kthreadd to the managed affinity list
kthread: Include unbound kthreads in the managed affinity list
kthread: Refine naming of affinity related fields
PCI: Remove superfluous HK_TYPE_WQ check
sched/isolation: Remove HK_TYPE_TICK test from cpu_is_isolated()
cpuset: Remove cpuset_cpu_is_isolated()
timers/migration: Remove superfluous cpuset isolation test
cpuset: Propagate cpuset isolation update to timers through housekeeping
cpuset: Propagate cpuset isolation update to workqueue through housekeeping
PCI: Flush PCI probe workqueue on cpuset isolated partition change
sched/isolation: Flush vmstat workqueues on cpuset isolated partition change
sched/isolation: Flush memcg workqueues on cpuset isolated partition change
cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset
...
|
|
- Mark ASM1164 SATA controller to avoid bus reset since it fails to train
the Link after reset (Alex Williamson)
- Mark Nvidia GB10 Root Ports to avoid bus reset since they may fail to
retrain the link after reset (Johnny-CC Chang)
- Add lockdep and other lock assertions (Ilpo Järvinen)
- Add ACS quirk for Qualcomm Hamoa & Glymur, which provides ACS-like
features but doesn't advertise an ACS Capability (Krishna Chaitanya
Chundru)
- Add ACS quirk for Pericom PI7C9X2G404 switches, which fail under load
when P2P Redirect Request is enabled (Nicolas Cavallari)
- Remove an incorrect unlock in pci_slot_trylock() error handling (Jinhui
Guo)
- Lock the bridge device for slot reset (Keith Busch)
- Enable ACS after IOMMU configuration on OF platforms so ACS is enabled an
all devices; previously the first device enumeration (typically a Root
Port) was omitted (Manivannan Sadhasivam)
- Disable ACS Source Validation for IDT 0x80b5 and 0x8090 switches to work
around hardware erratum; previously ACS SV was temporarily disabled,
which worked for enumeration but not after reset (Manivannan Sadhasivam)
* pci/virtualization:
PCI: Disable ACS SV for IDT 0x8090 switch
PCI: Disable ACS SV for IDT 0x80b5 switch
PCI: Cache ACS Capabilities register
PCI: Enable ACS after configuring IOMMU for OF platforms
PCI: Add ACS quirk for Pericom PI7C9X2G404 switches [12d8:b404]
PCI: Add ACS quirk for Qualcomm Hamoa & Glymur
PCI: Use device_lock_assert() to verify device lock is held
PCI: Use lockdep_assert_held(pci_bus_sem) to verify lock is held
PCI: Fix pci_slot_lock () device locking
PCI: Fix pci_slot_trylock() error handling
PCI: Mark Nvidia GB10 to avoid bus reset
PCI: Mark ASM1164 SATA controller to avoid bus reset
|
|
- Build zero-sized resources when a BAR is larger than 4G but
pci_bus_addr_t or resource_size_t can't represent 64-bit addresses (Ilpo
Järvinen)
- Fix bridge window alignment with optional resources, where we previously
lost the additional alignment requirement (Ilpo Järvinen)
- Stop over-estimating bridge window size since we now assign them without
any gaps between them (Ilpo Järvinen)
- Increase resource MAX_IORES_LEVEL to avoid /proc/iomem flattening for
nested bridges and endpoints (Ilpo Järvinen)
- Remove old_size limit from bridge window sizing (Ilpo Järvinen)
- Push realloc check into pbus_size_mem() to simplify callers (Ilpo
Järvinen)
- Pass bridge window resource to pbus_size_mem() to avoid looking it up
again (Ilpo Järvinen)
- Use res_to_dev_res() instead of open-coding the same search (Ilpo
Järvinen)
- Add pci_resource_is_bridge_win() helper (Ilpo Järvinen)
- Add more logging of resource assignment (Ilpo Järvinen)
- Add pbus_mem_size_optional() to handle sizes of optional resources
(SR-IOV VF BARs, expansion ROMs, bridge windows) (Ilpo Järvinen)
- Move CardBus code to setup-cardbus.c and only build it when
CONFIG_CARDBUS is set (Ilpo Järvinen)
- Use scnprintf() instead of sprintf() (Ilpo Järvinen)
- Add pbus_validate_busn() for Bus Number validation (Ilpo Järvinen)
- Don't claim disabled bridge windows to avoid spurious claim failures
(Ilpo Järvinen)
* pci/resource:
PCI: Don't claim disabled bridge windows
PCI: Move CardBus bridge scanning to setup-cardbus.c
PCI: Add pbus_validate_busn() for Bus Number validation
PCI: Add dword #defines for Bus Number + Secondary Latency Timer
PCI: Use scnprintf() instead of sprintf()
PCI: Handle CardBus-specific params in setup-cardbus.c
PCI: Separate CardBus setup & build it only with CONFIG_CARDBUS
PCI: Add 'pci' prefix to struct pci_dev_resource handling functions
PCI: Use resource_assigned() in setup-bus.c algorithm
resource: Mark res given to resource_assigned() as const
PCI: Add pbus_mem_size_optional() to handle optional sizes
PCI: Check invalid align earlier in pbus_size_mem()
PCI: Log reset and restore of resources
PCI: Add pci_resource_is_bridge_win()
PCI: Fetch dev_res to local var in __assign_resources_sorted()
PCI: Use res_to_dev_res() in reassign_resources_sorted()
PCI: Pass bridge window resource to pbus_size_mem()
PCI: Push realloc check into pbus_size_mem()
PCI: Remove old_size limit from bridge window sizing
resource: Increase MAX_IORES_LEVEL to 8
PCI: Stop over-estimating bridge window size
PCI: Rewrite bridge window head alignment function
PCI: Fix bridge window alignment with optional resources
PCI: Use resource_set_range() that correctly sets ->end
|
|
- Rename pwrseq, tc9563, and slot driver structs, variables, and functions
for consistency (Bjorn Helgaas)
- Add power_on/off callbacks with generic signature to pwrseq, tc9563, and
slot drivers so they can be used by pwrctrl core (Manivannan Sadhasivam)
- Add interfaces to create and destroy pwrctrl devices (Krishna Chaitanya
Chundru)
- Add interfaces to power devices on and off (Manivannan Sadhasivam)
- Switch to pwrctrl interfaces to create, destroy, and power on/off
devices, calling them from host controller drivers instead of the PCI
core (Manivannan Sadhasivam)
- Drop qcom .assert_perst() callbacks since this is now done by the
controller driver instead of the pwrctrl driver (Manivannan Sadhasivam)
- Add PCIe M.2 connector support to the slot pwrctrl driver (Manivannan
Sadhasivam)
- Create pwrctrl devices for devicetree PCIe M.2 connector nodes
(Manivannan Sadhasivam)
* pci/pwrctrl:
PCI/pwrctrl: Create pwrctrl device if graph port is found
PCI/pwrctrl: Add PCIe M.2 connector support
PCI: Drop the assert_perst() callback
PCI: qcom: Drop the assert_perst() callbacks
PCI/pwrctrl: Switch to pwrctrl create, power on/off, destroy APIs
PCI/pwrctrl: Add APIs to power on/off pwrctrl devices
PCI/pwrctrl: Add APIs to create, destroy pwrctrl devices
PCI/pwrctrl: Add 'struct pci_pwrctrl::power_{on/off}' callbacks
PCI/pwrctrl: pwrseq: Factor out power on/off code to helpers
PCI/pwrctrl: slot: Factor out power on/off code to helpers
PCI/pwrctrl: tc9563: Rename private struct and pointers for consistency
PCI/pwrctrl: tc9563: Add local variables to reduce repetition
PCI/pwrctrl: tc9563: Clean up whitespace
PCI/pwrctrl: tc9563: Use put_device() instead of i2c_put_adapter()
PCI/pwrctrl: slot: Rename private struct and pointers for consistency
PCI/pwrctrl: pwrseq: Rename private struct and pointers for consistency
# Conflicts:
# drivers/pci/bus.c
|
|
- Add PCI_BRIDGE_NO_ALIAS quirk for ASPEED AST1150, where VGA and USB are
behind a PCIe-to-PCI bridge and share the same StreamID (Nirmoy Das)
* pci/iommu:
PCI: Add PCI_BRIDGE_NO_ALIAS quirk for ASPEED AST1150
PCI: Add ASPEED vendor ID to pci_ids.h
|
|
The commit 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as
PCIe BW controller") was found to lead to a boot hang on a Intel P45
system. Testing without setting Link Bandwidth Management Interrupt Enable
(LBMIE) and Link Autonomous Bandwidth Interrupt Enable (LABIE) (PCIe r7.0,
sec 7.5.3.7) in bwctrl allowed system to come up.
P45 is a very old chipset and supports only up to gen2 PCIe, so not having
bwctrl does not seem a huge deficiency.
Add no_bw_notif in struct pci_dev and quirk Intel P45 Root Port with it.
Reported-by: Adam Stylinski <kungfujesus06@gmail.com>
Link: https://lore.kernel.org/linux-pci/aUCt1tHhm_-XIVvi@eggsbenedict/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Adam Stylinski <kungfujesus06@gmail.com>
Link: https://patch.msgid.link/20260116131513.2359-1-ilpo.jarvinen@linux.intel.com
|
|
The ACS Capability register is read-only. Cache it to allow quirks to
override it and to avoid re-reading it.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Link: https://patch.msgid.link/20260102-pci_acs-v3-2-72280b94d288@oss.qualcomm.com
|
|
The HK_TYPE_DOMAIN housekeeping cpumask is now modifiable at runtime. In
order to synchronize against PCI probe works and make sure that no
asynchronous probing is still pending or executing on a newly isolated
CPU, the housekeeping subsystem must flush the PCI probe works.
However the PCI probe works can't be flushed easily since they are
queued to the main per-CPU workqueue pool.
Solve this with creating a PCI probe-specific pool and provide and use
the appropriate flushing API.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Marco Crivellari <marco.crivellari@suse.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Waiman Long <longman@redhat.com>
Cc: linux-pci@vger.kernel.org
|
|
Some PCI devices have PCI_MSI_FLAGS_64BIT in the MSI capability, but
implement less than 64 address bits. This breaks on platforms where such
a device is assigned an MSI address higher than what's supported.
Currently, no_64bit_msi bit is set for these devices, meaning that only
32-bit MSI addresses are allowed for them. However, on some platforms the
MSI doorbell address is above the 32-bit limit but within the addressable
range of the device.
As a first step to enable MSI on those combinations of devices and
platforms, convert the boolean no_64bit_msi flag to a DMA mask and fixup
the affected usage sites:
- no_64bit_msi = 1 -> msi_addr_mask = DMA_BIT_MASK(32)
- no_64bit_msi = 0 -> msi_addr_mask = DMA_BIT_MASK(64)
- if (no_64bit_msi) -> if (msi_addr_mask < DMA_BIT_MASK(64))
Since no values other than DMA_BIT_MASK(32) and DMA_BIT_MASK(64) are used,
this is functionally equivalent.
This prepares for changing the binary decision between 32 and 64 bit to a
DMA mask based decision which allows to support systems which have a DMA
address space less than 64bit but a MSI doorbell address above the 32-bit
limit.
[ tglx: Massaged changelog ]
Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com> # ionic
Reviewed-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: Takashi Iwai <tiwai@suse.de> # sound
Link: https://patch.msgid.link/20260129-pci-msi-addr-mask-v4-1-70da998f2750@iscas.ac.cn
|
|
PCI bridge window setup code includes special code to handle CardBus
bridges. CardBus has long since fallen out of favor and modern systems have
no use for it.
Move CardBus setup code to its own file and use existing CONFIG_CARDBUS to
decide whether it should be built or not.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-18-ilpo.jarvinen@linux.intel.com
|
|
CXL is a protocol that runs on top of PCIe electricals. Its error model
also runs on top of the PCIe AER error model by standardizing "internal"
errors as "CXL" errors. Linux has historically ignored internal errors.
CXL protocol error handling is then a task of enhancing the PCIe AER
core to understand that PCIe ports (upstream and downstream) and
endpoints may throw internal errors that represent standard CXL protocol
errors.
The proposed method to make that determination is to teach 'struct
pci_dev' to cache when its link has trained the CXL.mem and/or CXL.cache
protocols and then treat all internal errors as CXL errors. A design
goal is to not burden the PCIe AER core with CXL knowledge beyond just
enough to forward error notifications to the CXL RAS core. The forwarded
notification looks up a 'struct cxl_port' or 'struct cxl_dport'
companion device to the PCI device.
Introduce set_pcie_cxl() with logic checking for CXL.mem or CXL.cache
status in the CXL Flex Bus DVSEC status register. The CXL Flex Bus DVSEC
presence is used because it is required for all the CXL PCIe devices.[1]
[1] CXL 3.1 Spec, 8.1.1 PCIe Designated Vendor-Specific Extended
Capability (DVSEC) ID Assignment, Table 8-2
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260114182055.46029-4-terry.bowman@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Now since all .assert_callback() implementations have been removed from the
controller drivers, drop the .assert_callback callback from pci.h.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260115-pci-pwrctrl-rework-v5-14-9d26da3ce903@oss.qualcomm.com
|
|
ASPEED BMC controllers have VGA and USB functions behind a PCIe-to-PCI
bridge that causes them to share the same StreamID:
[e0]---00.0-[e1-e2]----00.0-[e2]--+-00.0 ASPEED Graphics Family
\-02.0 ASPEED USB Controller
Both devices get StreamID 0x5e200 due to bridge aliasing, causing the USB
controller to be rejected with 'Aliasing StreamID unsupported'.
Per ASPEED, the AST1150 doesn't use a real PCI bus and always forwards
the original Requester ID from downstream devices rather than replacing
it with any alias.
Add a new PCI_DEV_FLAGS_PCI_BRIDGE_NO_ALIAS flag and apply it to the
AST1150.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://patch.msgid.link/20251217154529.377586-2-nirmoyd@nvidia.com
|
|
473b9f331718 ("rust: pci: fix build failure when CONFIG_PCI_MSI is
disabled") fixed a build error by providing Rust helpers when
CONFIG_PCI_MSI is not set. However the Rust helpers rely on
pci_free_irq_vectors(), which is only available when CONFIG_PCI=y.
When CONFIG_PCI is not set, there is already a stub for
pci_alloc_irq_vectors(). Add a similar stub for pci_free_irq_vectors().
Fixes: 473b9f331718 ("rust: pci: fix build failure when CONFIG_PCI_MSI is disabled")
Reported-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
Closes: https://lore.kernel.org/rust-for-linux/20251209014312.575940-1-fujita.tomonori@gmail.com/
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512220740.4Kexm4dW-lkp@intel.com/
Reported-by: Liang Jie <liangjie@lixiang.com>
Closes: https://lore.kernel.org/rust-for-linux/20251222034415.1384223-1-buaajxlj@163.com/
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Drew Fustini <fustini@kernel.org>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20251226113938.52145-1-boqun.feng@gmail.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm
Pull PCIe Link Encryption and Device Authentication from Dan Williams:
"New PCI infrastructure and one architecture implementation for PCIe
link encryption establishment via platform firmware services.
This work is the result of multiple vendors coming to consensus on
some core infrastructure (thanks Alexey, Yilun, and Aneesh!), and
three vendor implementations, although only one is included in this
pull. The PCI core changes have an ack from Bjorn, the crypto/ccp/
changes have an ack from Tom, and the iommu/amd/ changes have an ack
from Joerg.
PCIe link encryption is made possible by the soup of acronyms
mentioned in the shortlog below. Link Integrity and Data Encryption
(IDE) is a protocol for installing keys in the transmitter and
receiver at each end of a link. That protocol is transported over Data
Object Exchange (DOE) mailboxes using PCI configuration requests.
The aspect that makes this a "platform firmware service" is that the
key provisioning and protocol is coordinated through a Trusted
Execution Envrionment (TEE) Security Manager (TSM). That is either
firmware running in a coprocessor (AMD SEV-TIO), or quasi-hypervisor
software (Intel TDX Connect / ARM CCA) running in a protected CPU
mode.
Now, the only reason to ask a TSM to run this protocol and install the
keys rather than have a Linux driver do the same is so that later, a
confidential VM can ask the TSM directly "can you certify this
device?".
That precludes host Linux from provisioning its own keys, because host
Linux is outside the trust domain for the VM. It also turns out that
all architectures, save for one, do not publish a mechanism for an OS
to establish keys in the root port. So "TSM-established link
encryption" is the only cross-architecture path for this capability
for the foreseeable future.
This unblocks the other arch implementations to follow in v6.20/v7.0,
once they clear some other dependencies, and it unblocks the next
phase of work to implement the end-to-end flow of confidential device
assignment. The PCIe specification calls this end-to-end flow Trusted
Execution Environment (TEE) Device Interface Security Protocol
(TDISP).
In the meantime, Linux gets a link encryption facility which has
practical benefits along the same lines as memory encryption. It
authenticates devices via certificates and may protect against
interposer attacks trying to capture clear-text PCIe traffic.
Summary:
- Introduce the PCI/TSM core for the coordination of device
authentication, link encryption and establishment (IDE), and later
management of the device security operational states (TDISP).
Notify the new TSM core layer of PCI device arrival and departure
- Add a low level TSM driver for the link encryption establishment
capabilities of the AMD SEV-TIO architecture
- Add a library of helpers TSM drivers to use for IDE establishment
and the DOE transport
- Add skeleton support for 'bind' and 'guest_request' operations in
support of TDISP"
* tag 'tsm-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm: (23 commits)
crypto/ccp: Fix CONFIG_PCI=n build
virt: Fix Kconfig warning when selecting TSM without VIRT_DRIVERS
crypto/ccp: Implement SEV-TIO PCIe IDE (phase1)
iommu/amd: Report SEV-TIO support
psp-sev: Assign numbers to all status codes and add new
ccp: Make snp_reclaim_pages and __sev_do_cmd_locked public
PCI/TSM: Add 'dsm' and 'bound' attributes for dependent functions
PCI/TSM: Add pci_tsm_guest_req() for managing TDIs
PCI/TSM: Add pci_tsm_bind() helper for instantiating TDIs
PCI/IDE: Initialize an ID for all IDE streams
PCI/IDE: Add Address Association Register setup for downstream MMIO
resource: Introduce resource_assigned() for discerning active resources
PCI/TSM: Drop stub for pci_tsm_doe_transfer()
drivers/virt: Drop VIRT_DRIVERS build dependency
PCI/TSM: Report active IDE streams
PCI/IDE: Report available IDE streams
PCI/IDE: Add IDE establishment helpers
PCI: Establish document for PCI host bridge sysfs attributes
PCI: Add PCIe Device 3 Extended Capability enumeration
PCI/TSM: Establish Secure Sessions and Link Encryption
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Enable host bridge emulation for PCI_DOMAINS_GENERIC platforms (Dan
Williams)
- Switch vmd from custom domain number allocator to the common
allocator to prevent a potential race with new non-VMD buses (Dan
Williams)
- Enable Precision Time Measurement (PTM) only if device advertises
support for a relevant role, to prevent invalid PTM Requests that
cause ACS violations that are reported as AER Uncorrectable
Non-Fatal errors (Mika Westerberg)
Resource management:
- Prevent resource tree corruption when BAR resize fails (Ilpo
Järvinen)
- Restore BARs to the original size if a BAR resize fails (Ilpo
Järvinen)
- Remove BAR release from BAR resize attempts by the xe, i915, and
amdgpu drivers so the PCI core can restore BARs if the resize fails
(Ilpo Järvinen)
- Move Resizable BAR code to rebar.c (Ilpo Järvinen)
- Add pci_rebar_size_supported() and use it in i915 and xe (Ilpo
Järvinen)
- Add pci_rebar_get_max_size() and use it in xe and amdgpu (Ilpo
Järvinen)
Power management and error handling:
- For drivers using PCI legacy suspend, save config state at suspend
so that state (not any earlier state from enumeration, probe, or
error recovery) will be restored when resuming (Lukas Wunner)
- For devices with no driver or a driver that lacks power management,
save config state at hibernate so that state (not any earlier state
from enumeration, probe, or error recovery) will be restored when
resuming (Lukas Wunner)
- Save device config space on device addition, before driver binding,
so error recovery works more reliably (Lukas Wunner)
- Drop pci_save_state() from several drivers that no longer need it
since the PCI core always does it and pci_restore_state() no longer
invalidates the saved state (Lukas Wunner)
- Document use of pci_save_state() by drivers to capture the state
they want restored during error recovery (Lukas Wunner)
Power control:
- Add a struct pci_ops.assert_perst() function pointer to
assert/deassert PCIe PERST# and implement it for the qcom driver
(Krishna Chaitanya Chundru)
- Add DT binding and pwrctrl driver for the Toshiba TC9563 PCIe
switch, which must be held in reset after poweron so the pwrctrl
driver can configure the switch via I2C before bringing up the
links (Krishna Chaitanya Chundru)
Endpoint framework:
- Convert the endpoint doorbell test to use a threaded IRQ to fix a
'sleeping while atomic' issue (Bhanu Seshu Kumar Valluri)
- Add endpoint VNTB MSI doorbell support to reduce latency between
host and endpoint (Frank Li)
New native PCIe controller drivers:
- Add CIX Sky1 host controller DT binding and driver (Hans Zhang)
- Add NXP S32G host controller DT binding and driver (Vincent
Guittot)
- Add Renesas RZ/G3S host controller DT binding and driver (Claudiu
Beznea)
- Add SpacemiT K1 host controller DT binding and driver (Alex Elder)
Amlogic Meson PCIe controller driver:
- Update DT binding to name DBI region 'dbi', not 'elbi', and update
driver to support both (Manivannan Sadhasivam)
Apple PCIe controller driver:
- Move struct pci_host_bridge allocation from pci_host_common_init()
to callers, which significantly simplifies pcie-apple (Marc
Zyngier)
Broadcom STB PCIe controller driver:
- Disable advertising ASPM L0s support correctly (Jim Quinlan)
- Add a panic/die handler to print diagnostic info in case PCIe
caused an unrecoverable abort (Jim Quinlan)
Cadence PCIe controller driver:
- Add module support for Cadence platform host and endpoint
controller driver (Manikandan K Pillai)
- Split headers into 'legacy' (LGA) and 'high perf' (HPA) to prepare
for new CIX Sky1 driver (Manikandan K Pillai)
MediaTek PCIe controller driver:
- Convert DT binding to YAML schema (Christian Marangi)
- Add Airoha AN7583 DT compatible and driver support (Christian
Marangi)
Qualcomm PCIe controller driver:
- Add Qualcomm Kaanapali to SM8550 DT binding (Qiang Yu)
- Add required 'power-domains' and 'resets' to qcom sa8775p, sc7280,
sc8280xp, sm8150, sm8250, sm8350, sm8450, sm8550, x1e80100 DT
schemas (Krzysztof Kozlowski)
- Look up OPP using both frequency and data rate (not just frequency)
so RPMh votes can account for both (Krishna Chaitanya Chundru)
Rockchip DesignWare PCIe controller driver:
- Add Rockchip RK3528 compatible strings in DT binding (Yao Zi)
STMicroelectronics STM32MP25 PCIe controller driver:
- Fix a race between link training and endpoint register
initialization (Christian Bruel)
- Align endpoint allocations to match the ATU requirements (Christian
Bruel)
Synopsys DesignWare PCIe controller driver:
- Clear L1 PM Substate Capability 'Supported' bits unless glue driver
says it's supported, which prevents users from enabling non-working
L1SS. Currently only qcom and tegra194 support L1SS (Bjorn Helgaas)
- Remove now-superfluous L1SS disable code from tegra194 (Bjorn
Helgaas)
- Configure L1SS support in dw-rockchip when DT says
'supports-clkreq' (Shawn Lin)
TI Keystone PCIe controller driver:
- Fail the probe instead of silently succeeding if ks_pcie_of_data
didn't specify Root Complex or Endpoint mode (Siddharth Vadapalli)
- Make keystone buildable as a loadable module, except on ARM32 where
hook_fault_code() is __init (Siddharth Vadapalli)"
* tag 'pci-v6.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (100 commits)
MAINTAINERS: Add Manivannan Sadhasivam as PCI/pwrctrl maintainer
MAINTAINERS: Add CIX Sky1 PCIe controller driver maintainer
PCI: sky1: Add PCIe host support for CIX Sky1
dt-bindings: PCI: Add CIX Sky1 PCIe Root Complex bindings
PCI: cadence: Add support for High Perf Architecture (HPA) controller
MAINTAINERS: Add NXP S32G PCIe controller driver maintainer
PCI: s32g: Add NXP S32G PCIe controller driver (RC)
PCI: dwc: Add register and bitfield definitions
dt-bindings: PCI: s32g: Add NXP S32G PCIe controller
PCI: Add Renesas RZ/G3S host controller driver
PCI: host-generic: Move bridge allocation outside of pci_host_common_init()
dt-bindings: PCI: Add Renesas RZ/G3S PCIe controller binding
PCI: Validate pci_rebar_size_supported() input
Documentation: PCI: Amend error recovery doc with pci_save_state() rules
treewide: Drop pci_save_state() after pci_restore_state()
PCI/ERR: Ensure error recoverability at all times
PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw
PCI/PM: Reinstate clearing state_saved in legacy and !PM codepaths
PCI: dw-rockchip: Configure L1SS support
PCI: tegra194: Remove unnecessary L1SS disable code
...
|
|
- Add a struct pci_ops.assert_perst() function pointer to assert/deassert
PCIe PERST# and implement it for the qcom driver (Krishna Chaitanya
Chundru)
- Add DT binding and pwrctrl driver for the Toshiba TC9563 PCIe switch,
which must be held in reset after poweron so the pwrctrl driver can
configure the switch via I2C before bringing up the links (Krishna
Chaitanya Chundru)
* pci/pwrctrl-tc9563:
PCI: pwrctrl: Add power control driver for TC9563
PCI: qcom: Implement .assert_perst()
PCI: dwc: Implement .assert_perst() for dwc glue drivers
PCI: Add .assert_perst() to control PCIe PERST#
dt-bindings: PCI: Add binding for Toshiba TC9563 PCIe switch
|
|
- Fail the probe instead of silently succeeding if ks_pcie_of_data
didn't specify Root Complex or Endpoint mode (Siddharth Vadapalli)
- Make keystone buildable as a loadable module, except on ARM32 where
hook_fault_code() is __init (Siddharth Vadapalli)
* pci/controller/keystone:
PCI: keystone: Add support to build as a loadable module
PCI: dwc: Export dw_pcie_allocate_domains() and dw_pcie_ep_raise_msix_irq()
PCI: Export pci_get_host_bridge_device() for use by pci-keystone
PCI: keystone: Exit ks_pcie_probe() for invalid mode
|
|
- Prevent resource tree corruption when BAR resize fails (Ilpo Järvinen)
- Restore BARs to the original size if a BAR resize fails (Ilpo Järvinen)
- Remove BAR release from BAR resize attempts by the xe, i915, and amdgpu
drivers so the PCI core can restore BARs if the resize fails (Ilpo
Järvinen)
- Move Resizable BAR code to rebar.c (Ilpo Järvinen)
- Add pci_rebar_size_supported() and use it in i915 and xe (Ilpo Järvinen)
- Add pci_rebar_get_max_size() and use it in xe and amdgpu (Ilpo Järvinen)
* pci/resource:
PCI: Validate pci_rebar_size_supported() input
PCI: Convert BAR sizes bitmasks to u64
drm/amdgpu: Use pci_rebar_get_max_size()
drm/xe/vram: Use pci_rebar_get_max_size()
PCI: Add pci_rebar_get_max_size()
drm/xe/vram: Use PCI rebar helpers in resize_vram_bar()
drm/i915/gt: Use pci_rebar_size_supported()
PCI: Add pci_rebar_size_supported() helper
PCI: Improve Resizable BAR functions kernel doc
PCI: Move pci_rebar_size_to_bytes() and export it
PCI: Move pci_rebar_bytes_to_size() and clean it up
PCI: Move Resizable BAR code to rebar.c
PCI: Prevent restoring assigned resources
drm/amdgpu: Remove driver side BAR release before resize
drm/i915: Remove driver side BAR release before resize
drm/xe: Remove driver side BAR release before resize
PCI: Add kerneldoc for pci_resize_resource()
PCI: Fix restoring BARs on BAR resize rollback path
PCI: Free saved list without holding pci_bus_sem
PCI: Try BAR resize even when no window was released
PCI: Change pci_dev variable from 'bridge' to 'dev'
PCI/IOV: Adjust ->barsz[] when changing BAR size
PCI: Prevent resource tree corruption when BAR resize fails
|
|
- Enable PTM only if device advertises support for a relevant role, to
prevent invalid PTM Requests that cause ACS violations that are reported
as AER Uncorrectable Non-Fatal errors (Mika Westerberg)
* pci/ptm:
PCI/PTM: Enable only if device advertises relevant role
|
|
Controller driver probes first, enables link training and scans the bus.
When the PCI bridge is found, its child DT nodes will be scanned and
pwrctrl devices will be created if needed. By the time pwrctrl driver probe
gets called, link training is already enabled by controller driver.
Certain devices like TC9563, which uses the PCI pwrctl framework, need to
configure the device before the PCIe link is up.
As the controller driver already enables link training as part of its
probe, the moment device is powered on, controller and device participate
in link training and link can come up immediately and may not have time to
configure the device.
So we need to stop the link training by using assert_perst() by asserting
PERST# and de-asserting PERST# after device is configured.
Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20251101-tc9563-v9-2-de3429f7787a@oss.qualcomm.com
|
|
The PCIe spec defines two types of streams - selective and link. Each
stream has an ID from the same bucket so a stream ID does not tell the
type. The spec defines an "enable" bit for every stream and required
stream IDs to be unique among all enabled stream but there is no such
requirement for disabled streams.
However, when IDE_KM is programming keys, an IDE-capable device needs
to know the type of stream being programmed to write it directly to
the hardware as keys are relatively large, possibly many of them and
devices often struggle with keeping around rather big data not being
used.
Walk through all streams on a device and initialise the IDs to some
unique number, both link and selective.
The weakest part of this proposal is the host bridge ide_stream_ids_ida.
Technically, a Stream ID only needs to be unique within a given partner
pair. However, with "anonymous" / unassigned streams there is no convenient
place to track the available ids. Proceed with an ida in the host bridge
for now, but consider moving this tracking to be an ide_stream_ids_ida per
device.
Co-developed-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251113021446.436830-6-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The address ranges for downstream Address Association Registers need to
cover memory addresses for all functions (PFs/VFs/downstream devices)
managed by a Device Security Manager (DSM). The proposed solution is get
the memory (32-bit only) range and prefetchable-memory (64-bit capable)
range from the immediate ancestor downstream port (either the direct-attach
RP or deepest switch port when switch attached).
Similar to RID association, address associations will be set by default if
hardware sets 'Number of Address Association Register Blocks' in the
'Selective IDE Stream Capability Register' to a non-zero value. TSM drivers
can opt-out of the settings by zero'ing out unwanted / unsupported address
ranges. E.g. TDX Connect only supports prefetachable (64-bit capable)
memory ranges for the Address Association setting.
If the immediate downstream port provides both a memory range and
prefetchable-memory range, but the IDE partner port only provides 1 Address
Association Register block then the TSM driver can pick which range to
associate, or let the PCI core prioritize memory.
Note, the Address Association Register setup for upstream requests is still
uncertain so is not included.
Co-developed-by: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
Co-developed-by: Arto Merilainen <amerilainen@nvidia.com>
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251114010227.567693-1-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
PCIe r7.0, sec 7.8.6, defines resizable BAR sizes beyond the currently
supported maximum of 128TB, which will require more than u32 to store the
entire bitmask.
Convert Resizable BAR related functions to use u64 bitmask for BAR sizes to
make the typing more future-proof.
The support for the larger BAR sizes themselves is not added at this point.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20251113180053.27944-12-ilpo.jarvinen@linux.intel.com
|
|
Add pci_rebar_get_max_size() to allow simplifying code that wants to know
the maximum possible size for a Resizable BAR.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20251113180053.27944-9-ilpo.jarvinen@linux.intel.com
|
|
Many callers of pci_rebar_get_possible_sizes() are interested in finding
out if a particular encoded BAR Size (PCIe r7.0, sec 7.8.6.3) is supported
by the particular BAR.
Add pci_rebar_size_supported() into PCI core to make it easy for the
drivers to determine if the BAR size is supported or not.
Use the new function in pci_resize_resource() and in
pci_iov_vf_bar_set_size().
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patch.msgid.link/20251113180053.27944-6-ilpo.jarvinen@linux.intel.com
|
|
pci_rebar_size_to_bytes() is in drivers/pci/pci.h but would be useful for
endpoint drivers as well.
Move the function to rebar.c and export it.
In addition, convert the literal to where the number comes from
(PCI_REBAR_MIN_SIZE).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20251113180053.27944-4-ilpo.jarvinen@linux.intel.com
|
|
Move pci_rebar_bytes_to_size() from include/linux/pci.h to rebar.c as it
does not look very trivial and is not expected to be performance critical.
Convert literals to use a newly added PCI_REBAR_MIN_SIZE define.
Also add kernel doc for the function as the function is exported.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <mjruhl@habana.ai>
Link: https://patch.msgid.link/20251113180053.27944-3-ilpo.jarvinen@linux.intel.com
|
|
BAR resize operation is implemented in the pci_resize_resource() and
pbus_reassign_bridge_resources() functions. pci_resize_resource() can be
called either from __resource_resize_store() from sysfs or directly by the
driver for the Endpoint Device.
The pci_resize_resource() requires that caller has released the device
resources that share the bridge window with the BAR to be resized as
otherwise the bridge window is pinned in place and cannot be changed.
pbus_reassign_bridge_resources() rolls back resources if the resize
operation fails, but rollback is performed only for the bridge windows.
Because releasing the device resources are done by the caller of the BAR
resize interface, these functions performing the BAR resize do not have
access to the device resources as they were before the resize.
pbus_reassign_bridge_resources() could try __pci_bridge_assign_resources()
after rolling back the bridge windows as they were, however, it will not
guarantee the resource are assigned due to differences in how FW and the
kernel assign the resources (alignment of the start address and tail).
To perform rollback robustly, the BAR resize interface has to be altered to
also release the device resources that share the bridge window with the BAR
to be resized.
Also, remove restoring from the entries failed list as saved list should
now contain both the bridge windows and device resources so the extra
restore is duplicated work.
Some drivers (currently only amdgpu) want to prevent releasing some
resources. Add exclude_bars param to pci_resize_resource() and make amdgpu
pass its register BAR (BAR 2 or 5), which should never be released during
resize operation. Normally 64-bit prefetchable resources do not share a
bridge window with the 32-bit only register BAR, but there are various
fallbacks in the resource assignment logic which may make the resources
share the bridge window in rare cases.
This change (together with the driver side changes) is to counter the
resource releases that had to be done to prevent resource tree corruption
in the ("PCI: Release assigned resource before restoring them") change. As
such, it likely restores functionality in cases where device resources were
released to avoid resource tree conflicts which appeared to be "working"
when such conflicts were not correctly detected by the kernel.
Reported-by: Simon Richter <Simon.Richter@hogyros.de>
Link: https://lore.kernel.org/linux-pci/f9a8c975-f5d3-4dd2-988e-4371a1433a60@hogyros.de/
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Link: https://lore.kernel.org/linux-pci/874irqop6b.fsf@draig.linaro.org/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: squash amdgpu BAR selection from
https://lore.kernel.org/r/20251114103053.13778-1-ilpo.jarvinen@linux.intel.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20251113162628.5946-7-ilpo.jarvinen@linux.intel.com
|
|
The pci-keystone.c driver uses the 'pci_get_host_bridge_device()' helper.
Export it in preparation for enabling the pci-keystone.c driver to be built
as a loadable module.
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251029080547.1253757-2-s-vadapalli@ti.com
|
|
Defective devices sometimes advertise support for ASPM L0s or L1 states
even if they don't work correctly.
Cache the L0s Supported and L1 Supported bits early in enumeration so
HEADER quirks can override the ASPM states advertised in Link Capabilities
before pcie_aspm_cap_init() enables ASPM.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Link: https://patch.msgid.link/20251110222929.2140564-2-helgaas@kernel.org
|
|
We have a Switch Upstream Port (2b:00.0) that has a PTM Capability, but
doesn't advertise support for any PTM roles:
Capabilities: [220 v1] Precision Time Measurement
PTMCap: Requester- Responder- Root-
Linux enables PTM without looking into what roles it actually supports, and
apparently the Port immediately sends PTM Requests even though it doesn't
support the PTM Requester role. The messages include an invalid bus number,
so the Root Port detects an ACS Violation (see the PCIe r7.0, sec 6.12.1.1,
implementation note):
pci 0000:2b:00.0: [8086:5786] type 01 class 0x060400 PCIe Switch Upstream Port
pci 0000:2b:00.0: PTM enabled, 4ns granularity
pcieport 0000:00:07.1: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:00:07.1
pcieport 0000:00:07.1: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Receiver ID)
pcieport 0000:00:07.1: device [8086:e44f] error status/mask=00200000/00000000
pcieport 0000:00:07.1: [21] ACSViol (First)
pcieport 0000:00:07.1: AER: TLP Header: 0x34000000 0x00000052 0x00000000 0x00000000
The TLP Header shows a 4 DW header, no data (001b) Msg with Local routing
(1 0100b) with Requester ID 0x0000 and PTM Request code (0x52).
Fix this by enabling PTM only if the following conditions are true (see sec
6.21.1 figure 6-21):
- Endpoint must advertise PTM Requester Capable
- Switch Upstream Port must advertise PTM Responder Capable
- Root Port must advertise PTM Root Capable
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[bhelgaas: commit log, comments]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251112074614.1440266-1-mika.westerberg@linux.intel.com
|
|
There are two components to establishing an encrypted link, provisioning
the stream in Partner Port config-space, and programming the keys into
the link layer via IDE_KM (IDE Key Management). This new library,
drivers/pci/ide.c, enables the former. IDE_KM, via a TSM low-level
driver, is saved for later.
With the platform TSM implementations of SEV-TIO and TDX Connect in mind
this library abstracts small differences in those implementations. For
example, TDX Connect handles Root Port register setup while SEV-TIO
expects System Software to update the Root Port registers. This is the
rationale for fine-grained 'setup' + 'enable' verbs.
The other design detail for TSM-coordinated IDE establishment is that
the TSM may manage allocation of Stream IDs, this is why the Stream ID
value is passed in to pci_ide_stream_setup().
The flow is:
pci_ide_stream_alloc():
Allocate a Selective IDE Stream Register Block in each Partner Port
(Endpoint + Root Port), and reserve a host bridge / platform stream
slot. Gather Partner Port specific stream settings like Requester ID.
pci_ide_stream_register():
Publish the stream in sysfs after allocating a Stream ID. In the TSM
case the TSM allocates the Stream ID for the Partner Port pair.
pci_ide_stream_setup():
Program the stream settings to a Partner Port. Caller is responsible
for optionally calling this for the Root Port as well if the TSM
implementation requires it.
pci_ide_stream_enable():
Enable the stream after IDE_KM.
In support of system administrators auditing where platform, Root Port,
and Endpoint IDE stream resources are being spent, the allocated stream
is reflected as a symlink from the host bridge to the endpoint with the
name:
stream%d.%d.%d
Where the tuple of integers reflects the allocated platform, Root Port,
and Endpoint stream index (Selective IDE Stream Register Block) values.
Thanks to Wu Hao for a draft implementation of this infrastructure.
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Samuel Ortiz <sameo@rivosinc.com>
Co-developed-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251031212902.2256310-8-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
PCIe r7.0 Section 7.7.9 Device 3 Extended Capability Structure, defines the
canonical location for determining the Flit Mode of a device. This status
is a dependency for PCIe IDE enabling. Add a new fm_enabled flag to 'struct
pci_dev'.
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Samuel Ortiz <sameo@rivosinc.com>
Cc: Alexey Kardashevskiy <aik@amd.com>
Cc: Xu Yilun <yilun.xu@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251031212902.2256310-6-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The PCIe 7.0 specification, section 11, defines the Trusted Execution
Environment (TEE) Device Interface Security Protocol (TDISP). This
protocol definition builds upon Component Measurement and Authentication
(CMA), and link Integrity and Data Encryption (IDE). It adds support for
assigning devices (PCI physical or virtual function) to a confidential VM
such that the assigned device is enabled to access guest private memory
protected by technologies like Intel TDX, AMD SEV-SNP, RISCV COVE, or ARM
CCA.
The "TSM" (TEE Security Manager) is a concept in the TDISP specification
of an agent that mediates between a "DSM" (Device Security Manager) and
system software in both a VMM and a confidential VM. A VMM uses TSM ABIs
to setup link security and assign devices. A confidential VM uses TSM
ABIs to transition an assigned device into the TDISP "RUN" state and
validate its configuration. From a Linux perspective the TSM abstracts
many of the details of TDISP, IDE, and CMA. Some of those details leak
through at times, but for the most part TDISP is an internal
implementation detail of the TSM.
CONFIG_PCI_TSM adds an "authenticated" attribute and "tsm/" subdirectory
to pci-sysfs. Consider that the TSM driver may itself be a PCI driver.
Userspace can watch for the arrival of a "TSM" device,
/sys/class/tsm/tsm0/uevent KOBJ_CHANGE, to know when the PCI core has
initialized TSM services.
The operations that can be executed against a PCI device are split into
two mutually exclusive operation sets, "Link" and "Security" (struct
pci_tsm_{link,security}_ops). The "Link" operations manage physical link
security properties and communication with the device's Device Security
Manager firmware. These are the host side operations in TDISP. The
"Security" operations coordinate the security state of the assigned
virtual device (TDI). These are the guest side operations in TDISP.
Only "link" (Secure Session and physical Link Encryption) operations are
defined at this stage. There are placeholders for the device security
(Trusted Computing Base entry / exit) operations.
The locking allows for multiple devices to be executing commands
simultaneously, one outstanding command per-device and an rwsem
synchronizes the implementation relative to TSM registration/unregistration
events.
Thanks to Wu Hao for his work on an early draft of this support.
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Samuel Ortiz <sameo@rivosinc.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alexey Kardashevskiy <aik@amd.com>
Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Link: https://patch.msgid.link/20251031212902.2256310-5-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
PCI/TSM, the PCI core functionality for the PCIe TEE Device Interface
Security Protocol (TDISP), has a need to walk all subordinate functions of
a Device Security Manager (DSM) to setup a device security context. A DSM
is physical function 0 of multi-function or SR-IOV device endpoint, or it
is an upstream switch port.
In error scenarios or when a TEE Security Manager (TSM) device is removed
it needs to unwind all established DSM contexts.
Introduce reverse versions of PCI device iteration helpers to mirror the
setup path and ensure that dependent children are handled before parents.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251031212902.2256310-4-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Link encryption is a new PCIe feature enumerated by "PCIe r7.0 section
7.9.26 IDE Extended Capability".
It is both a standalone port + endpoint capability, and a building block
for the security protocol defined by "PCIe r7.0 section 11 TEE Device
Interface Security Protocol (TDISP)". That protocol coordinates device
security setup between a platform TSM (TEE Security Manager) and a
device DSM (Device Security Manager). While the platform TSM can
allocate resources like Stream ID and manage keys, it still requires
system software to manage the IDE capability register block.
Add register definitions and basic enumeration in preparation for
Selective IDE Stream establishment. A follow on change selects the new
CONFIG_PCI_IDE symbol. Note that while the IDE specification defines
both a point-to-point "Link Stream" and a Root Port to endpoint
"Selective Stream", only "Selective Stream" is considered for Linux as
that is the predominant mode expected by Trusted Execution Environment
Security Managers (TSMs), and it is the security model that limits the
number of PCI components within the TCB in a PCIe topology with
switches.
Co-developed-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexey Kardashevskiy <aik@amd.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
Link: https://patch.msgid.link/20251031212902.2256310-3-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The ability to emulate a host bridge is useful not only for hardware PCI
controllers like CONFIG_VMD, or virtual PCI controllers like
CONFIG_PCI_HYPERV, but also for test and development scenarios like
CONFIG_SAMPLES_DEVSEC [1].
One stumbling block for defining CONFIG_SAMPLES_DEVSEC, a sample
implementation of a platform TSM for PCI Device Security, is the need to
accommodate PCI_DOMAINS_GENERIC architectures alongside x86 [2].
In support of supplementing the existing CONFIG_PCI_BRIDGE_EMUL
infrastructure for host bridges:
* Introduce pci_bus_find_emul_domain_nr() as a common way to find a free
PCI domain number whether that is to reuse the existing dynamic
allocation code in the !ACPI case, or to assign an unused domain above
the last ACPI segment.
* Convert pci-hyperv to the new allocator so that the PCI core can
unconditionally assume that bridge->domain_nr != PCI_DOMAIN_NR_NOT_SET
is the dynamically allocated case.
A follow on patch can also convert vmd to the new scheme. Currently vmd is
limited to CONFIG_PCI_DOMAINS_GENERIC=n (x86) so, unlike pci-hyperv, it
does not immediately conflict with this new pci_bus_find_emul_domain_nr()
mechanism.
Link: http://lore.kernel.org/174107249038.1288555.12362100502109498455.stgit@dwillia2-xfh.jf.intel.com [1]
Reported-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Closes: http://lore.kernel.org/20250311144601.145736-3-suzuki.poulose@arm.com [2]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Rob Herring <robh@kernel.org>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Link: https://patch.msgid.link/20251024224622.1470555-2-dan.j.williams@intel.com
|
|
- Ensure relaxed tail alignment does not increase min_align when computing
bridge window size, to fix a regression (Ilpo Järvinen)
- Fix bridge window size computation to fix a regression for devices with
undefined PCI class, e.g., Samsung [144d:a5a5] (Ilpo Järvinen)
- Fix error handling during resource resize to fix a regression in amdgpu
(Ilpo Järvinen)
- Align m68k pcibios_enable_device() with other arches (Ilpo Järvinen)
- Remove several sparc pcibios_enable_device() implementations that don't
do anything beyond what pci_enable_resources() does (Ilpo Järvinen)
- Remove mips pcibios_enable_resources() and use pci_enable_resources()
instead (Ilpo Järvinen)
- Refactor and simplify find_bus_resource_of_type() (Ilpo Järvinen)
- Claim bridge windows before setting them up (Ilpo Järvinen)
- Disable non-claimed bridge windows so the kernel's view matches the
hardware configuration (Ilpo Järvinen)
- Use pci_release_resource() instead of release_resource() to reduce code
duplication and increase consistency (Ilpo Järvinen)
- Enable bridges even if bridge window assignment fails (Ilpo Järvinen)
- Preserve bridge window resource type flags when assignment fails because
we may need it later (Ilpo Järvinen)
- Add bridge window selection functions to make the selection consistent
across the several places that do this (Ilpo Järvinen)
- Warn if bridge window cannot be released when resizing BAR (Ilpo
Järvinen)
- Set up bridge resources before enumerating children so we can check
whether child resources are inside bridge windows (Ilpo Järvinen)
* pci/resource:
PCI: Set up bridge resources earlier
PCI: Don't print stale information about resource
PCI: Alter misleading recursion to pci_bus_release_bridge_resources()
PCI: Pass bridge window to pci_bus_release_bridge_resources()
PCI: Add pci_setup_one_bridge_window()
PCI: Refactor remove_dev_resources() to use pbus_select_window()
PCI: Refactor distributing available memory to use loops
PCI: Use pbus_select_window_for_type() during mem window sizing
PCI: Use pbus_select_window() in space available checker
PCI: Rename resource variable from r to res
PCI: Use pbus_select_window_for_type() during IO window sizing
PCI: Use pbus_select_window() during BAR resize
PCI: Warn if bridge window cannot be released when resizing BAR
PCI: Fix finding bridge window in pci_reassign_bridge_resources()
PCI: Add bridge window selection functions
PCI: Add defines for bridge window indexing
PCI: Preserve bridge window resource type flags
PCI: Enable bridge even if bridge window fails to assign
PCI: Use pci_release_resource() instead of release_resource()
PCI: Disable non-claimed bridge window
PCI: Always claim bridge window before its setup
PCI: Refactor find_bus_resource_of_type() logic checks
PCI: Move find_bus_resource_of_type() earlier
MIPS: PCI: Use pci_enable_resources()
sparc/PCI: Remove pcibios_enable_device() as they do nothing extra
m68k/PCI: Use pci_enable_resources() in pcibios_enable_device()
PCI: Fix failure detection during resource resize
PCI: Fix pdev_resources_assignable() disparity
PCI: Ensure relaxed tail alignment does not increase min_align
|
|
pci_bus_distribute_available_resources() and
pci_bridge_distribute_available_resources() retain bridge window resources
and related data needed for distributing the available window in
independent variables for io, memory, and prefetchable memory windows. The
code is essentially the same for all of them and therefore repeated three
times with different variable names.
Refactor pci_bus_distribute_available_resources() to take an array. This
is complicated slightly by the function taking advantage of passing the
struct as value, which cannot be done for arrays in C. Therefore, copy the
data into a local array in the stack in the first loop.
Variable names are (hopefully) improved slightly as well.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-21-ilpo.jarvinen@linux.intel.com
|
|
A few places in setup-bus.c call release_resource() directly and end up
duplicating functionality from pci_release_resource() such as parent check,
logging, and clearing the resource. Worse yet, the way the resource is
cleared is inconsistent between different sites.
Convert release_resource() calls into pci_release_resource() to remove code
duplication. This will also make the resource start, end, and flags
behavior consistent, i.e., start address is cleared, and only
IORESOURCE_UNSET is asserted for the resource.
While at it, eliminate the unnecessary initialization of idx variable in
pci_bridge_release_resources().
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-9-ilpo.jarvinen@linux.intel.com
|
|
Issue uevents on s390 during PCI recovery using pci_uevent_ers() as done by
EEH and AER PCIe recovery routines.
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Link: https://patch.msgid.link/20250807-add_err_uevents-v5-2-adf85b0620b0@linux.ibm.com
|
|
- Restore VF resizable BAR state after reset (Michał Winiarski)
- Add pci_resource_num_to_vf_bar() and pci_resource_num_from_vf_bar() to
convert between VF BAR number and the dev->resource[] index (Michał
Winiarski)
- Allow IOV resources (VF BARs) to be resized (Michał Winiarski)
- Add pci_iov_vf_bar_set_size() so drivers can control VF BAR size (Michał
Winiarski)
* pci/resources:
PCI/IOV: Allow drivers to control VF BAR size
PCI/IOV: Check that VF BAR fits within the reservation
PCI/IOV: Allow IOV resources to be resized in pci_resize_resource()
PCI/IOV: Add pci_resource_num_to_vf_bar() to convert VF BAR number to/from IOV resource
PCI/IOV: Restore VF resizable BAR state after reset
|
|
- Fix runtime PM ref imbalance on Hot-Plug Capable ports caused by
misinterpreting a config read failure after a device has been removed
(Lukas Wunner)
- Avoid creating a useless PCIe port service device for pciehp if the slot
is handled by the ACPI hotplug driver (Lukas Wunner)
- Ignore ACPI hotplug slots when calculating depth of pciehp hotplug ports
(Lukas Wunner)
- Simplify pci_bridge_d3_possible() and clarify comments (Lukas Wunner)
* pci/hotplug:
PCI: Move is_pciehp check out of pciehp_is_native()
PCI: pciehp: Use is_pciehp instead of is_hotplug_bridge
PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge
PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports
|
|
pci_bridge_d3_possible() is called from both pcie_portdrv_probe() and
pcie_portdrv_remove() to determine whether runtime power management shall
be enabled (on probe) or disabled (on remove) on a PCIe port.
The underlying assumption is that pci_bridge_d3_possible() always returns
the same value, else a runtime PM reference imbalance would occur. That
assumption is not given if the PCIe port is inaccessible on remove due to
hot-unplug: pci_bridge_d3_possible() calls pciehp_is_native(), which
accesses Config Space to determine whether the port is Hot-Plug Capable.
An inaccessible port returns "all ones", which is converted to "all
zeroes" by pcie_capability_read_dword(). Hence the port no longer seems
Hot-Plug Capable on remove even though it was on probe.
The resulting runtime PM ref imbalance causes warning messages such as:
pcieport 0000:02:04.0: Runtime PM usage count underflow!
Avoid the Config Space access (and thus the runtime PM ref imbalance) by
caching the Hot-Plug Capable bit in struct pci_dev.
The struct already contains an "is_hotplug_bridge" flag, which however is
not only set on Hot-Plug Capable PCIe ports, but also Conventional PCI
Hot-Plug bridges and ACPI slots. The flag identifies bridges which are
allocated additional MMIO and bus number resources to allow for hierarchy
expansion.
The kernel is somewhat sloppily using "is_hotplug_bridge" in a number of
places to identify Hot-Plug Capable PCIe ports, even though the flag
encompasses other devices. Subsequent commits replace these occurrences
with the new flag to clearly delineate Hot-Plug Capable PCIe ports from
other kinds of hotplug bridges.
Document the existing "is_hotplug_bridge" and the new "is_pciehp" flag
and document the (non-obvious) requirement that pci_bridge_d3_possible()
always returns the same value across the entire lifetime of a bridge,
including its hot-removal.
Fixes: 5352a44a561d ("PCI: pciehp: Make pciehp_is_native() stricter")
Reported-by: Laurent Bigonville <bigon@bigon.be>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220216
Reported-by: Mario Limonciello <mario.limonciello@amd.com>
Closes: https://lore.kernel.org/r/20250609020223.269407-3-superm1@kernel.org/
Link: https://lore.kernel.org/all/20250620025535.3425049-3-superm1@kernel.org/T/#u
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Cc: stable@vger.kernel.org # v4.18+
Link: https://patch.msgid.link/fe5dcc3b2e62ee1df7905d746bde161eb1b3291c.1752390101.git.lukas@wunner.de
|
|
Several places in the kernel do class shifting to match whether a PCI
device is display class. Add pci_is_display() for those places to use.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Daniel Dadap <ddadap@nvidia.com>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Link: https://patch.msgid.link/20250717173812.3633478-2-superm1@kernel.org
|
|
Drivers could leverage the fact that the VF BAR MMIO reservation is created
for total number of VFs supported by the device by resizing the BAR to
larger size when smaller number of VFs is enabled.
Add pci_iov_vf_bar_set_size() to control the size and a
pci_iov_vf_bar_get_sizes() helper to get the VF BAR sizes that will allow
up to num_vfs to be successfully enabled with the current underlying
reservation size.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20250702093522.518099-6-michal.winiarski@intel.com
|