summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/xe/xe_pcode_api.h
AgeCommit message (Collapse)Author
2026-01-12drm/xe/hwmon: Expose GPU PCIe temperatureKarthik Poosa
Expose GPU PCIe average temperature and its limits via hwmon sysfs entry temp5_xxx. Update Xe hwmon sysfs documentation for this. v2: Update kernel version in Xe hwmon documentation. (Raag) v3: - Address review comments from Raag. - Remove redundant debug log. - Update kernel version in Xe hwmon documentation. (Raag) v4: - Address review comments from Raag. - Group new temperature attributes with existing temperature attributes as per channel index in Xe hwmon documentation. - Use TEMP_MASK instead of TEMP_MASK_MAILBOX. - Add PCIE_SENSOR_MASK which uses REG_FIELD_GET as replacement of PCIE_SENSOR_SHIFT. v5: - Address review comments from Raag. - Use REG_FIELD_GET to get PCIe temperature. - Move PCIE_SENSOR_GROUP_ID and PCIE_SENSOR_MASK to xe_pcode_api.h - Cosmetic change. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://patch.msgid.link/20260112203521.1014388-4-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-01-12drm/xe/hwmon: Expose memory controller temperatureKarthik Poosa
Expose GPU memory controller average temperature and its limits under temp4_xxx. Update Xe hwmon documentation for this. v2: - Rephrase commit message. (Badal) - Update kernel version in Xe hwmon documentation. (Raag) v3: - Update kernel version in Xe hwmon documentation. - Address review comments from Raag. - Remove obvious comments. - Remove redundant debug logs. - Remove unnecessary checks. - Avoid magic numbers. - Add new comments. - Use temperature sensors count to make memory controller visible. - Use temperature limits of package for memory controller. v4: - Address review comments from Raag. - Group new temperature attributes with existing temperature attributes as per channel index in Xe hwmon documentation. - Use DIV_ROUND_UP to calculate dwords needed for temperature limits. - Minor aesthetic refinements. - Remove unused TEMP_MASK_MAILBOX. v5: - Use REG_FIELD_GET to get count from READ_THERMAL_DATA output. (Raag) - Change count print from decimal to hexadecimal. - Cosmetic changes. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://patch.msgid.link/20260112203521.1014388-3-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-01-12drm/xe/hwmon: Expose temperature limitsKarthik Poosa
Read temperature limits using pcode mailbox and expose shutdown temperature limit as tempX_emergency, critical temperature limit as tempX_crit and GPU max temperature limit as temp2_max. Update Xe hwmon documentation with above entries. v2: - Resolve a documentation warning. - Address below review comments from Raag. - Update date and kernel version in Xe hwmon documentation. - Remove explicit disable of has_mbx_thermal_info for unsupported platforms. - Remove unnecessary default case in switches. - Remove obvious comments. - Use TEMP_LIMIT_MAX to compute number of dwords needed in xe_hwmon_thermal_info. - Remove THERMAL_LIMITS_DWORDS macro. - Use has_mbx_thermal_info for checking thermal mailbox support. v3: - Address below minor comments. (Raag) - Group new temperature attributes with existing temperature attributes as per channel index in Xe hwmon documentation. - Rename enums of xe_temp_limit to improve clarity. - Use DIV_ROUND_UP to calculate dwords needed for temperature limits. - Use return instead of breaks in xe_hwmon_temp_read. - Minor aesthetic refinements. v4: - Remove a redundant break. (Raag) - Update drm_dbg to drm_warn to inform user of unavailability for thermal mailbox on expected platforms. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://patch.msgid.link/20260112203521.1014388-2-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-12-09drm/xe/xe_survivability: Add support for survivability mode v2Riana Tauro
v2 survivability breadcrumbs introduces a new mode called SPI Flash Descriptor Override mode (FDO). This is enabled by PCODE when MEI itself fails and firmware cannot be updated via MEI using igsc. This mode provides the ability to update the firmware directly via SPI driver. Xe KMD initializes the nvm aux driver if FDO mode is enabled. Userspace should check FDO mode entry in survivability info sysfs before using the SPI driver to update firmware. /sys/bus/pci/devices/<device>/survivability_info/fdo_mode v2 also supports survivability mode for critical boot errors. Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20251208084539.3652902-6-riana.tauro@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-11-12drm/xe/pcode: Rework error mappingLucas De Marchi
The sparse array used for error decoding from is unnecessarily big. It should be better handled by a switch statement that will also allow us to more easily improve this code. Add a CASE_ERR() macro to keep the table compact and use it instead of the 256-entries array, which saves some space: $ bloat-o-meter xe_pcode.o.old xe_pcode.o add/remove: 0/1 grow/shrink: 2/0 up/down: 190/-4096 (-3906) Function old new delta __pcode_mailbox_rw 363 465 +102 __pcode_mailbox_rw.cold 58 146 +88 err_decode 4096 - -4096 Total: Before=7890, After=3984, chg -49.51% Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://patch.msgid.link/20251110-pcode-errmap-v2-1-cb18c8f54238@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-09drm/xe: Expose fan control and voltage regulator versionRaag Jadav
Add sysfs attributes for late binding features which expose bound version to the user. v2: Rework attribute and macro naming (Badal) v3: Drop fancy formatting (Rodrigo) v4: Form version string using local variables (Rodrigo) Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250709164224.2676086-1-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-05-30drm/xe/hwmon: Add support to manage power limits though mailboxKarthik Poosa
Add support to manage power limits using pcode mailbox commands for supported platforms. v2: - Address review comments. (Badal) - Use mailbox commands instead of registers to manage power limits for BMG. - Clamp the maximum power limit to GPU firmware default value. v3: - Clamp power limit in write also for platforms with mailbox support. v4: - Remove unnecessary debug prints. (Badal) v5: - Update description of variable pl1_on_boot to fix kernel-doc error. v6: - Improve commit message, refer to BIOS as GPU firmware. - Change macro READ_PL_FROM_BIOS to READ_PL_FROM_FW. - Rectify drm_warn to drm_info. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: e90f7a58e659 ("drm/xe/hwmon: Add HWMON support for BMG") Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://lore.kernel.org/r/20250529163458.2354509-2-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-05-07drm/xe: Expose PCIe link downgrade attributesRaag Jadav
Expose sysfs attributes for PCIe link downgrade capability and status. v2: Move from debugfs to sysfs (Lucas, Rodrigo, Badal) Rework macros and their naming (Rodrigo) v3: Use sysfs_create_files() (Riana) Fix checkpatch warning (Riana) v4: s/downspeed/downgrade (Lucas, Rodrigo, Riana) v5: Use PCIe Gen agnostic naming (Rodrigo) v6: s/pcie_gen/auto_link (Lucas) Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Link: https://lore.kernel.org/r/20250506054835.3395220-3-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-14drm/xe/hwmon: expose fan speedRaag Jadav
Add hwmon support for fan1_input, fan2_input and fan3_input attributes, which will expose fan speed of respective channels in RPM when supported by hardware. With this in place we can monitor fan speed using lm-sensors tool. v2: Rely on platform checks instead of mailbox error (Aravind, Rodrigo) v3: Introduce has_fan_control flag (Rodrigo) Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250312085909.755073-1-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-01-28drm/xe: Add functions and sysfs for boot survivabilityRiana Tauro
Boot Survivability is a software based workflow for recovering a system in a failed boot state. Here system recoverability is concerned with recovering the firmware responsible for boot. This is implemented by loading the driver with bare minimum (no drm card) to allow the firmware to be flashed through mei-gsc and collect telemetry. The driver's probe flow is modified such that it enters survivability mode when pcode initialization is incomplete and boot status denotes a failure. In this mode, drm card is not exposed and presence of survivability_mode entry in PCI sysfs is used to indicate survivability mode and provide additional information required for debug This patch adds initialization functions and exposes admin readable sysfs entries The new sysfs will have the below layout /sys/bus/.../bdf ├── survivability_mode v2: reorder headers fix doc remove survivability info and use mode to display information use separate function for logging survivability information for critical error (Rodrigo) v3: use for loop use dev logs instead of drm use helper function for aux history(Rodrigo) remove unnecessary error check of greater than max_scratch as we are reading only 3 bit v4: fix checkpatch warnings fix space (Rodrigo) rename register Signed-off-by: Riana Tauro <riana.tauro@intel.com> Acked-by: Ashwin Kumar Kulkarni <ashwin.kumar.kulkarni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250128095632.1294722-2-riana.tauro@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-01-09drm/xe: Add vram frequency sysfs attributesSujaritha Sundaresan
Add vram frequency sysfs attributes under the below hierarchy; /device/tile#/memory/freq0 |-max_freq |-min_freq v2: Drop "vram" from attribute names (Rodrigo) v3: Add documentation for new sysfs (Riana) Drop prefix from XEHP_PCODE_FREQUENCY_CONFIG (Riana) v4: Create sysfs under tile#/freq0 after removal of physical_memsize attrbute v5: Revert back to creating sysfs under tile#/memory/freq0 Remove definition of GT_FREQUENCY_MULTIPLIER (Rodrigo) v6: Rename attributes to max/min_freq (Anshuman) Fix review comments (Rodrigo) v7: Make documentation more verbose Move sysfs to separate file (Anshuman) v8: Fix platform specific conditions and add kernel doc (Anshuman) Fix typos and remove redundant headers (Riana) v9: Fix typo (Riana) Change function name to include "sysfs" (Lucas) Signed-off-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://lore.kernel.org/r/20240109110418.2065101-1-sujaritha.sundaresan@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/hwmon: Expose card reactive critical powerBadal Nilawar
Expose the card reactive critical (I1) power. I1 is exposed as power1_crit in microwatts (typically for client products) or as curr1_crit in milliamperes (typically for server). v2: Move PCODE_MBOX macro to pcode file (Riana) v3: s/IS_DG2/(gt_to_xe(gt)->info.platform == XE_DG2) v4: Fix review comments (Andi) Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20230925081842.3566834-3-badal.nilawar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Use XE_REG/XE_REG_MCRLucas De Marchi
These should replace the _MMIO() and MCR_REG() from i915, with the goal of being more extensible, allowing to pass the additional fields for struct xe_reg and struct xe_reg_mcr. Replace all uses of _MMIO() and MCR_REG() in xe. Since the RTP, reg-save-restore and WA infra are not ready to use the new type, just undef the macro like was done for the i915 types previously. That conversion will come later. v2: Remove MEDIA_SOFT_SCRATCH_COUNT/MEDIA_SOFT_SCRATCH re-added by mistake (Matt Roper) Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230427223256.1432787-8-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Do not spread i915_reg_defs.h includeLucas De Marchi
Reduce the use of i915_reg_defs.h so it can be encapsulated in a single place. 1) If it was being included by mistake, remove 2) If it was included for FIELD_GET()/FIELD_PREP()/GENMASK() and the like, just include <linux/bitfield.h> 3) If it was included to be able to define additional registers, move the registers to the relavant headers (regs/xe_regs.h or regs/xe_gt_regs.h) v2: - Squash commit fixing i915_reg_defs.h include and with the one introducing regs/xe_reg_defs.h - Remove more cases of i915_reg_defs.h being used when all it was needed was linux/bitfield.h (Matt Roper) - Move some registers to the corresponding regs/*.h file (Matt Roper) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rodrigo squashed here the removal of the i915 include]
2023-12-12drm/xe: Introduce a new DRM driver for Intel GPUsMatthew Brost
Xe, is a new driver for Intel GPUs that supports both integrated and discrete platforms starting with Tiger Lake (first Intel Xe Architecture). The code is at a stage where it is already functional and has experimental support for multiple platforms starting from Tiger Lake, with initial support implemented in Mesa (for Iris and Anv, our OpenGL and Vulkan drivers), as well as in NEO (for OpenCL and Level0). The new Xe driver leverages a lot from i915. As for display, the intent is to share the display code with the i915 driver so that there is maximum reuse there. But it is not added in this patch. This initial work is a collaboration of many people and unfortunately the big squashed patch won't fully honor the proper credits. But let's get some git quick stats so we can at least try to preserve some of the credits: Co-developed-by: Matthew Brost <matthew.brost@intel.com> Co-developed-by: Matthew Auld <matthew.auld@intel.com> Co-developed-by: Matt Roper <matthew.d.roper@intel.com> Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Co-developed-by: Francois Dugast <francois.dugast@intel.com> Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com> Co-developed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Co-developed-by: Philippe Lecluse <philippe.lecluse@intel.com> Co-developed-by: Nirmoy Das <nirmoy.das@intel.com> Co-developed-by: Jani Nikula <jani.nikula@intel.com> Co-developed-by: José Roberto de Souza <jose.souza@intel.com> Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Co-developed-by: Dave Airlie <airlied@redhat.com> Co-developed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Co-developed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com>