| Age | Commit message (Collapse) | Author |
|
The DmaGspMem pointer accessor methods (gsp_write_ptr, gsp_read_ptr,
cpu_read_ptr, cpu_write_ptr, advance_cpu_read_ptr,
advance_cpu_write_ptr) dereference a raw pointer to DMA memory, creating
an intermediate reference before calling volatile read/write methods.
This is undefined behavior since DMA memory can be concurrently modified
by the device.
Fix this by moving the implementations into a gsp_mem module in fw.rs
that uses the dma_read!() / dma_write!() macros, making the original
methods on DmaGspMem thin forwarding wrappers.
An alternative approach would have been to wrap the shared memory in
Opaque, but that would have required even more unsafe code.
Since the gsp_mem module lives in fw.rs (to access firmware-specific
binding field names), GspMem, Msgq and their relevant fields are
temporarily widened to pub(super). This will be reverted once IoView
projections are available.
Cc: Gary Guo <gary@garyguo.net>
Closes: https://lore.kernel.org/nouveau/DGUT14ILG35P.1UMNRKU93JUM1@kernel.org/
Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260309225408.27714-1-dakr@kernel.org
[ Use pub(super) where possible; replace bitwise-and with modulo
operator analogous to [1]. - Danilo ]
Link: https://lore.kernel.org/all/20260129-nova-core-cmdq1-v3-1-2ede85493a27@nvidia.com/ [1]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
The `Cmdq::new` function was allocating a `PteArray` struct on the stack
and was causing a stack overflow with 8216 bytes.
Modify the `PteArray` to calculate and write the Page Table Entries
directly into the coherent DMA buffer one-by-one. This reduces the stack
usage quite a lot.
Reported-by: Gary Guo <gary@garyguo.net>
Closes: https://rust-for-linux.zulipchat.com/#narrow/channel/509436-Nova/topic/.60Cmdq.3A.3Anew.60.20uses.20excessive.20stack.20size/near/570375549
Link: https://lore.kernel.org/rust-for-linux/CANiq72mAQxbRJZDnik3Qmd4phvFwPA01O2jwaaXRh_T+2=L-qA@mail.gmail.com/
Fixes: f38b4f105cfc ("gpu: nova-core: Create initial Gsp")
Acked-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Tim Kovalenko <tim.kovalenko@proton.me>
Link: https://patch.msgid.link/20260309-drm-rust-next-v4-4-4ef485b19a4c@proton.me
[ * Use PteArray::entry() in LogBuffer::new(),
* Add TODO comment to use IoView projections once available,
* Add PTE_ARRAY_SIZE constant to avoid duplication.
- Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
Current `dma_read!`, `dma_write!` macros also use a custom
`addr_of!()`-based implementation for projecting pointers, which has
soundness issue as it relies on absence of `Deref` implementation on types.
It also has a soundness issue where it does not protect against unaligned
fields (when `#[repr(packed)]` is used) so it can generate misaligned
accesses.
This commit migrates them to use the general pointer projection
infrastructure, which handles these cases correctly.
As part of migration, the macro is updated to have an improved surface
syntax. The current macro have
dma_read!(a.b.c[d].e.f)
to mean `a.b.c` is a DMA coherent allocation and it should project into it
with `[d].e.f` and do a read, which is confusing as it makes the indexing
operator integral to the macro (so it will break if you have an array of
`CoherentAllocation`, for example).
This also is problematic as we would like to generalize
`CoherentAllocation` from just slices to arbitrary types.
Make the macro expects `dma_read!(path.to.dma, .path.inside.dma)` as the
canonical syntax. The index operator is no longer special and is just one
type of projection (in additional to field projection). Similarly, make
`dma_write!(path.to.dma, .path.inside.dma, value)` become the canonical
syntax for writing.
Another issue of the current macro is that it is always fallible. This
makes sense with existing design of `CoherentAllocation`, but once we
support fixed size arrays with `CoherentAllocation`, it is desirable to
have the ability to perform infallible indexing as well, e.g. doing a `[0]`
index of `[Foo; 2]` is okay and can be checked at build-time, so forcing
falliblity is non-ideal. To capture this, the macro is changed to use
`[idx]` as infallible projection and `[idx]?` as fallible index projection
(those syntax are part of the general projection infra). A benefit of this
is that while individual indexing operation may fail, the overall
read/write operation is not fallible.
Fixes: ad2907b4e308 ("rust: add dma coherent allocator abstraction")
Reviewed-by: Benno Lossin <lossin@kernel.org>
Signed-off-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260302164239.284084-4-gary@kernel.org
[ Capitalize safety comments; slightly improve wording in doc-comments.
- Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core updates from Danilo Krummrich:
"Bus:
- Ensure bus->match() is consistently called with the device lock
held
- Improve type safety of bus_find_device_by_acpi_dev()
Devtmpfs:
- Parse 'devtmpfs.mount=' boot parameter with kstrtoint() instead of
simple_strtoul()
- Avoid sparse warning by making devtmpfs_context_ops static
IOMMU:
- Do not register the qcom_smmu_tbu_driver in arm_smmu_device_probe()
MAINTAINERS:
- Add the new driver-core mailing list (driver-core@lists.linux.dev)
to all relevant entries
- Add missing tree location for "FIRMWARE LOADER (request_firmware)"
- Add driver-model documentation to the "DRIVER CORE" entry
- Add missing driver-core maintainers to the "AUXILIARY BUS" entry
Misc:
- Change return type of attribute_container_register() to void; it
has always been infallible
- Do not export sysfs_change_owner(), sysfs_file_change_owner() and
device_change_owner()
- Move devres_for_each_res() from the public devres header to
drivers/base/base.h
- Do not use a static struct device for the faux bus; allocate it
dynamically
Revocable:
- Patches for the revocable synchronization primitive have been
scheduled for v7.0-rc1, but have been reverted as they need some
more refinement
Rust:
- Device:
- Support dev_printk on all device types, not just the core Device
struct; remove now-redundant .as_ref() calls in dev_* print
calls
- Devres:
- Introduce an internal reference count in Devres<T> to avoid a
deadlock condition in case of (indirect) nesting
- DMA:
- Allow drivers to tune the maximum DMA segment size via
dma_set_max_seg_size()
- I/O:
- Introduce the concept of generic I/O backends to handle
different kinds of device shared memory through a common
interface.
This enables higher-level concepts such as register
abstractions, I/O slices, and field projections to be built
generically on top.
In a first step, introduce the Io, IoCapable<T>, and IoKnownSize
trait hierarchy for sharing a common interface supporting offset
validation and bound-checking logic between I/O backends.
- Refactor MMIO to use the common I/O backend infrastructure
- Misc:
- Add __rust_helper annotations to C helpers for inlining into
Rust code
- Use "kernel vertical" style for imports
- Replace kernel::c_str! with C string literals
- Update ARef imports to use sync::aref
- Use pin_init::zeroed() for struct auxiliary_device_id and
debugfs file_operations initialization
- Use LKMM atomic types in debugfs doc-tests
- Various minor comment and documentation fixes
- PCI:
- Implement PCI configuration space accessors using the common I/O
backend infrastructure
- Document pci::Bar device endianness assumptions
- SoC:
- Abstractions for struct soc_device and struct soc_device_attribute
- Sample driver for soc::Device"
* tag 'driver-core-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (79 commits)
rust: devres: fix race condition due to nesting
rust: dma: add missing __rust_helper annotations
samples: rust: pci: Remove some additional `.as_ref()` for `dev_*` print
Revert "revocable: Revocable resource management"
Revert "revocable: Add Kunit test cases"
Revert "selftests: revocable: Add kselftest cases"
driver core: remove device_change_owner() export
sysfs: remove exports of sysfs_*change_owner()
driver core: disable revocable code from build
revocable: Add KUnit test for concurrent access
revocable: fix SRCU index corruption by requiring caller-provided storage
revocable: Add KUnit test for provider lifetime races
revocable: Fix races in revocable_alloc() using RCU
driver core: fix inverted "locked" suffix of driver_match_device()
rust: io: move MIN_SIZE and io_addr_assert to IoKnownSize
rust: pci: re-export ConfigSpace
rust: dma: allow drivers to tune max segment size
gpu: tyr: remove redundant `.as_ref()` for `dev_*` print
rust: auxiliary: use `pin_init::zeroed()` for device ID
rust: debugfs: use pin_init::zeroed() for file_operations
...
|
|
Pull drm updates from Dave Airlie:
"Highlights:
- amdgpu support for lots of new IP blocks which means newer GPUs
- xe has a lot of SR-IOV and SVM improvements
- lots of intel display refactoring across i915/xe
- msm has more support for gen8 platforms
- Given up on kgdb/kms integration, it's too hard on modern hw
core:
- drop kgdb support
- replace system workqueue with percpu
- account for property blobs in memcg
- MAINTAINERS updates for xe + buddy
rust:
- Fix documentation for Registration constructors
- Use pin_init::zeroed() for fops initialization
- Annotate DRM helpers with __rust_helper
- Improve safety documentation for gem::Object::new()
- Update AlwaysRefCounted imports
- mm: Prevent integer overflow in page_align()
atomic:
- add drm_device pointer to drm_private_obj
- introduce gamma/degamma LUT size check
buddy:
- fix free_trees memory leak
- prevent BUG_ON
bridge:
- introduce drm_bridge_unplug/enter/exit
- add connector argument to .hpd_notify
- lots of recounting conversions
- convert rockchip inno hdmi to bridge
- lontium-lt9611uxc: switch to HDMI audio helpers
- dw-hdmi-qp: add support for HPD-less setups
- Algoltek AG6311 support
panels:
- edp: CSW MNE007QB3-1, AUO B140HAN06.4, AUO B140QAX01.H
- st75751: add SPI support
- Sitronix ST7920, Samsung LTL106HL02
- LG LH546WF1-ED01, HannStar HSD156J
- BOE NV130WUM-T08
- Innolux G150XGE-L05
- Anbernic RG-DS
dma-buf:
- improve sg_table debugging
- add tracepoints
- call clear_page instead of memset
- start to introduce cgroup memory accounting in heaps
- remove sysfs stats
dma-fence:
- add new helpers
dp:
- mst: avoid oob access with vcpi=0
hdmi:
- limit infoframes exposure to userspace
gem:
- reduce page table overhead with THP
- fix leak in drm_gem_get_unmapped_area
gpuvm:
- API sanitation for rust bindings
sched:
- introduce new helpers
panic:
- report invalid panic modes
- add kunit tests
i915/xe display:
- Expose sharpness only if num_scalers is >= 2
- Add initial Xe3P_LPD for NVL
- BMG FBC support
- Add MTL+ platforms to support dpll framework
_ fix DIMM_S DRM decoding on ICL
- Return to using AUX interrupts
- PSR/Panel replay refactoring
- use consolidation HDMI tables
- Xe3_LPD CD2X dividier changes
xe:
- vfio: add vfio_pci for intel GPU
- multi queue support
- dynamic pagemaps and multi-device SVM
- expose temp attribs in hwmon
- NO_COMPRESSION bo flag
- expose MERT OA unit
- sysfs survivability refactor
- SRIOV PF: add MERT support
- enable SR-IOV VF migration
- Enable I2C/NVM on Crescent Island
- Xe3p page reclaimation support
- introduce SRIOV scheduler groups
- add SoC remappt support in system controller
- insert compiler barriers in GuC code
- define NVL GuC firmware
- handle GT resume failure
- fix drm scheduler layering violations
- enable GSC loading and PXP for PTL
- disable GuC Power DCC strategy on PTL
- unregister drm device on probe error
i915:
- move to kernel standard fault injection
- bump recommended GuC version for DG2 and MTL
amdgpu:
- SMUIO 15.x, PSP 15.x support
- IH 6.1.1/7.1 support
- MMHUB 3.4/4.2 support
- GC 11.5.4/12.1 support
- SDMA 6.1.4/7.1/7.11.4 support
- JPEG 5.3 support
- UserQ updates
- GC 9 gfx queue reset support
- TTM memory ops parallelization
- convert legacy logging to new helpers
- DC analog fixes
amdkfd:
- GC 11.5.4/12.1 suppport
- SDMA 6.1.4/7.1 support
- per context support
- increase kfd process hash table
- Reserved SDMA rework
radeon:
- convert legacy logging to new helpers
- use devm for i2c adapters
msm:
- GPU
- Document a612/RGMU dt bindings
- UBWC 6.0 support (for A840 / Kaanapali)
- a225 support
- DPU:
- Switch to use virtual planes by default
- Fix DSI CMD panels on DPU 3.x
- Rewrite format handling to remove intermediate representation
- Fix watchdog on DPU 8.x+
- Fix TE / Vsync source setting on DPU 8.x+
- Add 3D_Mux on SC7280
- Kaanapali platform support
- Fix UBWC register programming
- Make RM reserve DSPP-enabled mixers for CRTCs with LMs
- Gamma correction support
- DP:
- Enable support for eDP 1.4+ link rate tables
- Fix MDSS1 DP indices on SA8775P, making them to work
- Fix msm_dp_ctrl_config_msa() to work with LLVM 20
- DSI:
- Document QCS8300 as compatible with SA8775P
- Kaanapali platform support
- DSI PHY:
- switch to divider_determine_rate()
- MDP5:
- Drop support for MSM8998, SDM660 and SDM630 (switch over to DPU)
- MDSS:
- Kaanapali platform support
- Fixed UBWC register programming
nova-core:
- Prepare for Turing support. This includes parsing and handling
Turing-specific firmware headers and sections as well as a Turing
Falcon HAL implementation
- Get rid of the Result<impl PinInit<T, E>> anti-pattern
- Relocate initializer-specific code into the appropriate initializer
- Use CStr::from_bytes_until_nul() to remove custom helpers
- Improve handling of unexpected firmware values
- Clean up redundant debug prints
- Replace c_str!() with native Rust C-string literals
- Update nova-core task list
nova:
- Align GEM object size to system page size
tyr:
- Use generated uAPI bindings for GpuInfo
- Replace manual sleeps with read_poll_timeout()
- Replace c_str!() with native Rust C-string literals
- Suppress warnings for unread fields
- Fix incorrect register name in print statement
nouveau:
- fix big page table support races in PTE management
- improve reclocking on tegra 186+
amdxdna:
- fix suspend race conditions
- improve handling of zero tail pointers
- fix cu_idx overwritten during command setup
- enable hardware context priority
- remove NPU2 support
- update message buffer allocation requirements
- update firmware version check
ast:
- support imported cursor buffers
- big endian fixes
etnaviv:
- add PPU flop reset support
imagination:
- add AM62P support
- introduce hw version checks
ivpu:
- implement warm boot flow
panfrost:
- add bo sync ioctl
- add GPU_PM_RT support for RZ/G3E SoC
panthor:
- add bo sync ioctl
- enable timestamp propagation
- scheduler robustness improvements
- VM termination fixes
- huge page support
rockchip:
- RK3368 HDMI Support
- get rid of atomic_check fixups
- RK3506 support
- RK3576/RK3588 improved HPD handling
rz-du:
- RZ/V2H(P) MIPI-DSI Support
v3d:
- fix DMA segment size
- convert to new logging helpers
mediatek:
- move DP training to hotplug thread
- convert logging to new helpers
- add support for HS speed DSI
- Genio 510/700/1200-EVK, Radxa NIO-12L HDMI support
atmel-hlcdc:
- switch to drmm resource
- support nomodeset
- use newer helpers
hisilicon:
- fix various DP bugs
renesas:
- fix kernel panic on reboot
exynos:
- fix vidi_connection_ioctl using wrong device
- fix vidi_connection deref user ptr
- fix concurrency regression with vidi_context
vkms:
- add configfs support for display configuration
* tag 'drm-next-2026-02-11' of https://gitlab.freedesktop.org/drm/kernel: (1610 commits)
drm/xe/pm: Disable D3Cold for BMG only on specific platforms
drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep
drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early
drm/xe: Fix kerneldoc for xe_migrate_exec_queue
drm/xe/query: Fix topology query pointer advance
drm/xe/guc: Fix kernel-doc warning in GuC scheduler ABI header
drm/xe/guc: Fix CFI violation in debugfs access.
accel/amdxdna: Move RPM resume into job run function
accel/amdxdna: Fix incorrect DPM level after suspend/resume
nouveau/vmm: start tracking if the LPT PTE is valid. (v6)
nouveau/vmm: increase size of vmm pte tracker struct to u32 (v2)
nouveau/vmm: rewrite pte tracker using a struct and bitfields.
accel/amdxdna: Fix incorrect error code returned for failed chain command
accel/amdxdna: Remove hardware context status
drm/bridge: imx8qxp-pixel-combiner: Fix bailout for imx8qxp_pc_bridge_probe()
drm/panel: ilitek-ili9882t: Remove duplicate initializers in tianma_il79900a_dsc
drm/i915/display: fix the pixel normalization handling for xe3p_lpd
drm/exynos: vidi: use ctx->lock to protect struct vidi_context member variables related to memory alloc/free
drm/exynos: vidi: fix to avoid directly dereferencing user pointer
drm/exynos: vidi: use priv->vidi_dev for ctx lookup in vidi_connection_ioctl()
...
|
|
These imports are already in scope by importing `kernel::prelude::*` and
does not need to be imported separately.
Signed-off-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260123172007.136873-2-gary@garyguo.net
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
`GspInitDone` has no payload whatsoever, so the unit type `()` is the
correct way to represent its message content. We can use it now that
`()` implements `FromBytes`.
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20251215-transmute_unit-v4-2-477d71ec7c23@nvidia.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
On Turing and GA100 (i.e. the versions that use Libos v2), GSP-RM insists
that the 'size' parameter of the LibosMemoryRegionInitArgument struct be
aligned to 4KB. The logging buffers are already aligned to that size, so
only the GSP_ARGUMENTS_CACHED struct needs to be adjusted. Make that
adjustment by adding padding to the end of the struct.
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260122222848.2555890-12-ttabi@nvidia.com
[acourbot@nvidia.com: GspArgumentsAligned -> GspArgumentsPadded]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Some GPUs do not support using DMA to transfer code/data from system
memory to Falcon memory, and instead must use programmed I/O (PIO).
Add a function to the Falcon HAL to indicate whether a given GPU's
Falcons support DMA for this purpose.
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260122222848.2555890-10-ttabi@nvidia.com
[acourbot@nvidia.com: add short code to call into the HAL.]
[acourbot@nvidia.com: make `dma_load` private as per feedback.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
The previous Io<SIZE> type combined both the generic I/O access helpers
and MMIO implementation details in a single struct. This coupling prevented
reusing the I/O helpers for other backends, such as PCI configuration
space.
Establish a clean separation between the I/O interface and concrete
backends by separating generic I/O helpers from MMIO implementation.
Introduce a new trait hierarchy to handle different access capabilities:
- IoCapable<T>: A marker trait indicating that a backend supports I/O
operations of a certain type (u8, u16, u32, or u64).
- Io trait: Defines fallible (try_read8, try_write8, etc.) and infallibile
(read8, write8, etc.) I/O methods with runtime bounds checking and
compile-time bounds checking.
- IoKnownSize trait: The marker trait for types support infallible I/O
methods.
Move the MMIO-specific logic into a dedicated Mmio<SIZE> type that
implements the Io traits. Rename IoRaw to MmioRaw and update consumers to
use the new types.
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Zhi Wang <zhiw@nvidia.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260121202212.4438-3-zhiw@nvidia.com
[ Add #[expect(unused)] to define_{read,write}!(). - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
We need the drm-rust fixes from -rc5 in here for nova-core to build on
top of.
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
Although the dev_xx!() macro calls do not technically require terminating
newlines for the format strings, they should be added anyway to maintain
consistency, both within Rust code and with the C versions.
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Link: https://patch.msgid.link/20260107201647.2490140-2-ttabi@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
Change gpu_name() to return a Result instead of an Option. This avoids
silently discarding error information when parsing the GPU name string
from the GSP.
Update the callsite to log a warning with the error details on failure,
rather than just displaying "invalid GPU name".
Suggested-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Link: https://patch.msgid.link/20260108005811.86014-2-jhubbard@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
The util.rs module contained a single helper function,
str_from_null_terminated(), which duplicated functionality that is now
available in core::ffi::CStr.
Specifically, CStr::from_bytes_until_nul() is available in the kernel's
minimum supported Rust version (1.78.0), so it time to stop using this
custom workaround.
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Link: https://patch.msgid.link/20260106035226.48853-2-jhubbard@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
In GspFirmware::new(), utilize pin_init_scope() to get rid of the Result
in the returned
Result<impl PinInit<T, Error>>
which is unnecessarily redundant.
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Link: https://patch.msgid.link/20251218155239.25243-2-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
Update call sites to import `ARef` from `sync::aref`
instead of `types`.
This aligns with the ongoing effort to move `ARef` and
`AlwaysRefCounted` to sync.
Suggested-by: Benno Lossin <lossin@kernel.org>
Link: https://github.com/Rust-for-Linux/linux/issues/1173
Signed-off-by: Shankari Anand <shankari.ak0208@gmail.com>
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Acked-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20251123092438.182251-3-shankari.ak0208@gmail.com
[aliceryhl: keep trailing // at last import]
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
|
|
We have an "bindings" alias to avoid having to mention the firmware
version again and again, and limit the diff when upgrading the firmware.
Use it where we neglected to.
Fixes: eaf0989c77e4 ("gpu: nova-core: Add bindings required by GSP sequencer")
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20251216-nova-fixes-v3-4-c7469a71f7c4@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Commit 4846300ba8f9 ("rust: derive `Zeroable` for all structs & unions
generated by bindgen where possible") automatically derives
`MaybeZeroable` for all bindings. This is better than selectively
deriving `Zeroable` as it ensures all types that can implement
`Zeroable` do.
Regenerate the nova-core bindings so they benefit from this, and remove
a now unneeded implementation of `Zeroable`.
Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20251216-nova-fixes-v3-3-c7469a71f7c4@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
The size of messages' payload is miscalculated, leading to extra data
passed to the message handler. While this is not a problem with our
current set of commands, others with a variable-length payload may
misbehave. Fix this by introducing a method returning the payload size
and using it.
Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20251216-nova-fixes-v3-2-c7469a71f7c4@nvidia.com
[acourbot@nvidia.com: update `PANIC:` comments as pointed out by Joel.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Explicit padding is needed in order to avoid uninitialized bytes and
safely implement `AsBytes`. The `--explicit-padding` of bindgen was
omitted by mistake when these bindings were generated.
Fixes: 13f85988d4fa ("gpu: nova-core: gsp: Retrieve GSP static info to gather GPU information")
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20251216-nova-fixes-v3-1-c7469a71f7c4@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Commit 38b7cc448a5b ("gpu: nova-core: implement Display for Spec") in
drm-rust-next introduced some usage of the Display trait, but the
Display trait is being modified in the rust tree this cycle. Thus, to
avoid conflicts with the Rust tree, tweak how the formatting machinery
is used in a way where it works both with and without the changes in the
Rust tree.
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Tested-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20251117-nova-fmt-rust-v1-1-651ca28cd98f@google.com
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
|
|
After GSP initialization is complete, retrieve the static configuration
information from GSP-RM. This information includes GPU name, capabilities,
memory configuration, and other properties. On some GPU variants, it is
also required to do this for initialization to complete.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
[acourbot@nvidia.com: properly abstract the command's bindings, add
relevant methods, make str_from_null_terminated return an Option, fix
size of GPU name array.]
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-14-joelagnelf@nvidia.com>
|
|
This adds the GSP init done command to wait for GSP initialization
to complete. Once this command has been received the GSP is fully
operational and will respond properly to normal RPC commands.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
[acourbot@nvidia.com: move new definitions to end of commands.rs, rename
to `wait_gsp_init_done` and remove timeout argument.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-13-joelagnelf@nvidia.com>
|
|
Implement core resume operation. This is the last step of the sequencer
resulting in resume of the GSP and proceeding to INIT_DONE stage of GSP
boot.
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-12-joelagnelf@nvidia.com>
|
|
These opcodes implement various falcon-related boot operations: reset,
start, wait-for-halt.
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-11-joelagnelf@nvidia.com>
|
|
Implement a sequencer opcode for delay operations.
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-10-joelagnelf@nvidia.com>
|
|
These opcodes are used for register write, modify, poll and store (save)
sequencer operations.
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
[acourbot@nvidia.com: apply Lyude's suggested fixes.]
Message-ID: <20251114195552.739371-9-joelagnelf@nvidia.com>
|
|
Implement the GSP sequencer which culminates in INIT_DONE message being
received from the GSP indicating that the GSP has successfully booted.
This is just initial sequencer support, the actual commands will be
added in the next patches.
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
[acourbot@nvidia.com: move GspSequencerInfo definition before its impl
blocks and rename it to GspSequence, adapt imports in sequencer.rs to
new formatting rules, remove `timeout` argument to harmonize with other
commands.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-8-joelagnelf@nvidia.com>
|
|
Add several firmware bindings required by GSP sequencer code.
Co-developed-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
[acourbot@nvidia.com: remove a couple stray lines/unwanted comment
changes.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-7-joelagnelf@nvidia.com>
|
|
Boot the GSP to the RISC-V active state. Completing the boot requires
running the CPU sequencer which will be added in a future commit.
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-15-8ae4058e3c0e@nvidia.com>
|
|
Add support for sending the SetRegistry command, which is critical to
GSP initialization.
The RM registry is serialized into a packed format and sent via the
command queue. For now only three parameters which are required to boot
GSP are hardcoded. In the future a kernel module parameter will be added
to enable other parameters to be added.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
[acourbot@nvidia.com: split into its own patch.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-12-8ae4058e3c0e@nvidia.com>
|
|
Add support for sending the SetSystemInfo command, which provides
required hardware information to the GSP and is critical to its
initialization.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-11-8ae4058e3c0e@nvidia.com>
|
|
Initialise the GSP resource manager arguments (rmargs) which provides
initialisation parameters to the GSP firmware during boot. The rmargs
structure contains arguments to configure the GSP message/command queue
location.
These are mapped for coherent DMA and added to the libos data structure
for access when booting GSP.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-10-8ae4058e3c0e@nvidia.com>
|
|
This commit introduces core infrastructure for handling GSP command and
message queues in the nova-core driver. The command queue system enables
bidirectional communication between the host driver and GSP firmware
through a remote message passing interface.
The interface is based on passing serialised data structures over a ring
buffer with separate transmit and receive queues. Commands are sent by
writing to the CPU transmit queue and waiting for completion via the
receive queue.
To ensure safety mutable or immutable (depending on whether it is a send
or receive operation) references are taken on the command queue when
allocating the message to write/read to. This ensures message memory
remains valid and the command queue can't be mutated whilst an operation
is in progress.
Currently this is only used by the probe() routine and therefore can
only used by a single thread of execution. Locking to enable safe access
from multiple threads will be introduced in a future series when that
becomes necessary.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-9-8ae4058e3c0e@nvidia.com>
|
|
Derive the Zeroable trait for existing bindgen generated bindings. This
is safe because all bindgen generated types are simple integer types for
which any bit pattern, including all zeros, is valid.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-7-8ae4058e3c0e@nvidia.com>
|
|
The GSP requires some pieces of metadata to boot. These are passed in a
struct which the GSP transfers via DMA. Create this struct and get a
handle to it for future use when booting the GSP.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-5-8ae4058e3c0e@nvidia.com>
|
|
The GSP requires several areas of memory to operate. Each of these have
their own simple embedded page tables. Set these up and map them for DMA
to/from GSP using CoherentAllocation's. Return the DMA handle describing
where each of these regions are for future use when booting GSP.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-4-8ae4058e3c0e@nvidia.com>
|
|
Compute more of the required FB layout information to boot the GSP
firmware.
This information is dependent on the firmware itself, so first we need
to import and abstract the required firmware bindings in the `nvfw`
module.
Then, a new FB HAL method is introduced in `fb::hal` that uses these
bindings and hardware information to compute the correct layout
information.
This information is then used in `fb` and the result made visible in
`FbLayout`.
These 3 things are grouped into the same patch to avoid lots of unused
warnings that would be tedious to work around. As they happen in
different files, they should not be too difficult to track separately.
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-1-8ae4058e3c0e@nvidia.com>
|
|
As per [1], we need one "use" item per line, in order to reduce merge
conflicts. Furthermore, we need a trailing ", //" in order to tell
rustfmt(1) to leave it alone.
This does that for the entire nova-core driver.
[1] https://docs.kernel.org/rust/coding-guidelines.html#imports
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
[acourbot@nvidia.com: remove imports already in prelude as pointed out
by Danilo.]
[acourbot@nvidia.com: remove a few unneeded trailing `//`.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251107021006.434109-1-jhubbard@nvidia.com>
|
|
Interacting with the GSP currently requires using definitions from C
header files. Rust definitions for the types needed for Nova core will
be generated using the Rust bindgen tool. This patch adds the base
module to allow inclusion of the generated bindings. The generated
bindings themselves are added by subsequent patches when they are first
used.
Currently we only intend to support a single firmware version, 570.144,
with these bindings. Longer term we intend to move to a more stable GSP
interface that isn't tied to specific firmware versions.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
[acourbot@nvidia.com: adapt the bindings module comment a bit]
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250913-nova_firmware-v6-10-9007079548b0@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
The GSP firmware is a binary blob that is verified, loaded, and run by
the GSP bootloader. Its presentation is a bit peculiar as the GSP
bootloader expects to be given a DMA address to a 3-levels page table
mapping the GSP firmware at address 0 of its own address space.
Prepare such a structure containing the DMA-mapped firmware as well as
the DMA-mapped page tables, and a way to obtain the DMA handle of the
level 0 page table.
Then, move the GSP firmware instance from the `Firmware` struct to the
`start_gsp` method since it doesn't need to be kept after the GSP is
booted.
As we are performing the required ELF section parsing and radix3 page
table building, remove these items from the TODO file.
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250913-nova_firmware-v6-7-9007079548b0@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
The Booter signed firmware is an essential part of bringing up the GSP
on Turing and Ampere. It is loaded on the sec2 falcon core and is
responsible for loading and running the RISC-V GSP bootloader into the
GSP core.
Add support for parsing the Booter firmware loaded from userspace, patch
its signatures, and store it into a form that is ready to be loaded and
executed on the sec2 falcon.
Then, move the Booter instance from the `Firmware` struct to the
`start_gsp` method since it doesn't need to be kept after the GSP is
booted.
We do not run Booter yet, as its own payload (the GSP bootloader and
firmware image) still need to be prepared.
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250913-nova_firmware-v6-6-9007079548b0@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Right now the GSP boot code is very incomplete and limited to running
FRTS, so having it in `Gpu::new` is not a big constraint.
However, this will change as we add more steps of the GSP boot process,
and not all GPU families follow the same procedure, so having these
steps in a dedicated method is the logical construct.
There is also the fact the GSP will require its own runtime data, and
while it won't immediately need to be pinned, we want to be ready for
the time where it will - most likely when it starts using mutexes.
Thus, add an empty `Gsp` type that is pinned inside `Gpu` and
initialized using a pin initializer. This sets the constraint we need to
observe from the start, and could spare us some costly refactoring down
the road.
Then, move the code related to GSP boot to the `gsp::boot` module, as
part of the `Gsp` implementation.
Doing so allows us to make `Gpu::new` return a fallible `impl PinInit`
instead of a `Result.` This is more idiomatic when working with pinned
objects, and sets up the pinned initialization pattern we want to
preserve as the code grows more complex.
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250913-nova_firmware-v6-2-9007079548b0@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|