summaryrefslogtreecommitdiff
path: root/arch/arm64/kernel/fpsimd.c
AgeCommit message (Collapse)Author
2025-12-15arm64/efi: Remove unneeded SVE/SME fallback preserve/store handlingArd Biesheuvel
Since commit 7137a203b251 ("arm64/fpsimd: Permit kernel mode NEON with IRQs off"), the only condition under which the fallback path is taken for FP/SIMD preserve/restore across a EFI runtime call is when it is called from hardirq or NMI context. In practice, this only happens when the EFI pstore driver is called to dump the kernel log buffer into a EFI variable under a panic, oops or emergency_restart() condition, and none of these can be expected to result in a return to user space for the task in question. This means that the existing EFI-specific logic for preserving and restoring SVE/SME state is pointless, and can be removed. Instead, kill the task, so that an exceedingly unlikely inadvertent return to user space does not proceed with a corrupted FP/SIMD state. Also, retain the preserve and restore of the base FP/SIMD state, as that might belong to kernel mode use of FP/SIMD. (Note that EFI runtime calls are never invoked reentrantly, even in this case, and so any interrupted kernel mode FP/SIMD usage will be unrelated to EFI) Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-12-02Merge tag 'fpsimd-on-stack-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull arm64 FPSIMD on-stack buffer updates from Eric Biggers: "This is a core arm64 change. However, I was asked to take this because most uses of kernel-mode FPSIMD are in crypto or CRC code. In v6.8, the size of task_struct on arm64 increased by 528 bytes due to the new 'kernel_fpsimd_state' field. This field was added to allow kernel-mode FPSIMD code to be preempted. Unfortunately, 528 bytes is kind of a lot for task_struct. This regression in the task_struct size was noticed and reported. Recover that space by making this state be allocated on the stack at the beginning of each kernel-mode FPSIMD section. To make it easier for all the users of kernel-mode FPSIMD to do that correctly, introduce and use a 'scoped_ksimd' abstraction" * tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (23 commits) lib/crypto: arm64: Move remaining algorithms to scoped ksimd API lib/crypto: arm/blake2b: Move to scoped ksimd API arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack arm64/fpu: Enforce task-context only for generic kernel mode FPU net/mlx5: Switch to more abstract scoped ksimd guard API on arm64 arm64/xorblocks: Switch to 'ksimd' scoped guard API crypto/arm64: sm4 - Switch to 'ksimd' scoped guard API crypto/arm64: sm3 - Switch to 'ksimd' scoped guard API crypto/arm64: sha3 - Switch to 'ksimd' scoped guard API crypto/arm64: polyval - Switch to 'ksimd' scoped guard API crypto/arm64: nhpoly1305 - Switch to 'ksimd' scoped guard API crypto/arm64: aes-gcm - Switch to 'ksimd' scoped guard API crypto/arm64: aes-blk - Switch to 'ksimd' scoped guard API crypto/arm64: aes-ccm - Switch to 'ksimd' scoped guard API raid6: Move to more abstract 'ksimd' guard API crypto: aegis128-neon - Move to more abstract 'ksimd' guard API crypto/arm64: sm4-ce-gcm - Avoid pointless yield of the NEON unit crypto/arm64: sm4-ce-ccm - Avoid pointless yield of the NEON unit crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API ...
2025-11-12arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stackArd Biesheuvel
Commit aefbab8e77eb16b5 ("arm64: fpsimd: Preserve/restore kernel mode NEON at context switch") added a 'kernel_fpsimd_state' field to struct thread_struct, which is the arch-specific portion of struct task_struct, and is allocated for each task in the system. The size of this field is 528 bytes, resulting in non-negligible bloat of task_struct, and the resulting memory overhead may impact performance on systems with many processes. This allocation is only used if the task is scheduled out or interrupted by a softirq while using the FP/SIMD unit in kernel mode, and so it is possible to transparently allocate this buffer on the caller's stack instead. So tweak the 'ksimd' scoped guard implementation so that a stack buffer is allocated and passed to both kernel_neon_begin() and kernel_neon_end(), and either record it in the task struct, or use it directly to preserve the task mode kernel FP/SIMD when running in softirq context. Passing the address to both functions, and checking the addresses for consistency ensures that callers of the updated bare begin/end API use it in a manner that is consistent with the new context switch semantics. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-11arm64/fpsimd: Permit kernel mode NEON with IRQs offArd Biesheuvel
Currently, may_use_simd() will return false when called from a context where IRQs are disabled. One notable case where this happens is when calling the ResetSystem() EFI runtime service from the reboot/poweroff code path. For this case alone, there is a substantial amount of FP/SIMD support code to handle the corner case where a EFI runtime service is invoked with IRQs disabled. The only reason kernel mode SIMD is not allowed when IRQs are disabled is that re-enabling softirqs in this case produces a noisy diagnostic when lockdep is enabled. The warning is valid, in the sense that delivering pending softirqs over the back of the call to local_bh_enable() is problematic when IRQs are disabled. While the API lacks a facility to simply mask and unmask softirqs without triggering their delivery, disabling softirqs is not needed to begin with when IRQs are disabled, given that softirqs are only every taken asynchronously over the back of a hard IRQ. So dis/enable softirq processing conditionally, based on whether IRQs are enabled, and relax the check in may_use_simd(). Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-11-11arm64/fpsimd: Don't warn when EFI execution context is preemptibleArd Biesheuvel
Kernel mode FP/SIMD no longer requires preemption to be disabled, so only warn on uses of FP/SIMD from preemptible context if the fallback path is taken for cases where kernel mode NEON would not be allowed otherwise. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-09-16arm64/fpsimd: simplify sme_setup()Yury Norov (NVIDIA)
The function checks info->vq_map for emptiness right before calling find_last_bit(). We can use the find_last_bit() output and save on bitmap_empty() call, which is O(N). Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-05-27Merge branch 'for-next/sme-fixes' into for-next/coreWill Deacon
* for-next/sme-fixes: (35 commits) arm64/fpsimd: Allow CONFIG_ARM64_SME to be selected arm64/fpsimd: ptrace: Gracefully handle errors arm64/fpsimd: ptrace: Mandate SVE payload for streaming-mode state arm64/fpsimd: ptrace: Do not present register data for inactive mode arm64/fpsimd: ptrace: Save task state before generating SVE header arm64/fpsimd: ptrace/prctl: Ensure VL changes leave task in a valid state arm64/fpsimd: ptrace/prctl: Ensure VL changes do not resurrect stale data arm64/fpsimd: Make clone() compatible with ZA lazy saving arm64/fpsimd: Clear PSTATE.SM during clone() arm64/fpsimd: Consistently preserve FPSIMD state during clone() arm64/fpsimd: Remove redundant task->mm check arm64/fpsimd: signal: Use SMSTOP behaviour in setup_return() arm64/fpsimd: Add task_smstop_sm() arm64/fpsimd: Factor out {sve,sme}_state_size() helpers arm64/fpsimd: Clarify sve_sync_*() functions arm64/fpsimd: ptrace: Consistently handle partial writes to NT_ARM_(S)SVE arm64/fpsimd: signal: Consistently read FPSIMD context arm64/fpsimd: signal: Mandate SVE payload for streaming-mode state arm64/fpsimd: signal: Clear PSTATE.SM when restoring FPSIMD frame only arm64/fpsimd: Do not discard modified SVE state ...
2025-05-08arm64/fpsimd: ptrace/prctl: Ensure VL changes leave task in a valid stateMark Rutland
Currently, vec_set_vector_length() can manipulate a task into an invalid state as a result of a prctl/ptrace syscall which changes the SVE/SME vector length, resulting in several problems: (1) When changing the SVE vector length, if the task initially has PSTATE.ZA==1, and sve_alloc() fails to allocate memory, the task will be left with PSTATE.ZA==1 and sve_state==NULL. This is not a legitimate state, and could result in a subsequent null pointer dereference. (2) When changing the SVE vector length, if the task initially has PSTATE.SM==1, the task will be left with PSTATE.SM==1 and fp_type==FP_STATE_FPSIMD. Streaming mode state always needs to be saved in SVE format, so this is not a legitimate state. Attempting to restore this state may cause a task to erroneously inherit stale streaming mode predicate registers and FFR contents, behaving non-deterministically and potentially leaving information from another task. While in this state, reads of the NT_ARM_SSVE regset will indicate that the registers are not stored in SVE format. For the NT_ARM_SSVE regset specifically, debuggers interpret this as meaning that PSTATE.SM==0. (3) When changing the SME vector length, if the task initially has PSTATE.SM==1, the lower 128 bits of task's streaming mode vector state will be migrated to non-streaming mode, rather than these bits being zeroed as is usually the case for changes to PSTATE.SM. To fix the first issue, we can eagerly allocate the new sve_state and sme_state before modifying the task. This makes it possible to handle memory allocation failure without modifying the task state at all, and removes the need to clear TIF_SVE and TIF_SME. To fix the second issue, we either need to clear PSTATE.SM or not change the saved fp_type. Given we're going to eagerly allocate sve_state and sme_state, the simplest option is to preserve PSTATE.SM and the saves fp_type, and consistently truncate the SVE state. This ensures that the task always stays in a valid state, and by virtue of not exiting streaming mode, this also sidesteps the third issue. I believe these changes should not be problematic for realistic usage: * When the SVE/SME vector length is changed via prctl(), syscall entry will have cleared PSTATE.SM. Unless the task's state has been manipulated via ptrace after entry, the task will have PSTATE.SM==0. * When the SVE/SME vector length is changed via a write to the NT_ARM_SVE or NT_ARM_SSVE regsets, PSTATE.SM will be forced immediately after the length change, and new vector state will be copied from userspace. * When the SME vector length is changed via a write to the NT_ARM_ZA regset, the (S)SVE state is clobbered today, so anyone who cares about the specific state would need to install this after writing to the NT_ARM_ZA regset. As we need to free the old SVE state while TIF_SVE may still be set, we cannot use sve_free(), and using kfree() directly makes it clear that the free pairs with the subsequent assignment. As this leaves sve_free() unused, I've removed the existing sve_free() and renamed __sve_free() to mirror sme_free(). Fixes: 8bd7f91c03d8 ("arm64/sme: Implement traps and syscall handling for SME") Fixes: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Spickett <david.spickett@arm.com> Cc: Luis Machado <luis.machado@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-16-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08arm64/fpsimd: ptrace/prctl: Ensure VL changes do not resurrect stale dataMark Rutland
The SVE/SME vector lengths can be changed via prctl/ptrace syscalls. Changes to the SVE/SME vector lengths are documented as preserving the lower 128 bits of the Z registers (i.e. the bits shared with the FPSIMD V registers). To ensure this, vec_set_vector_length() explicitly copies register values from a task's saved SVE state to its saved FPSIMD state when dropping the task to FPSIMD-only. The logic for this was not updated when when FPSIMD/SVE state tracking was changed across commits: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE") a0136be443d5 (arm64/fpsimd: Load FP state based on recorded data type") bbc6172eefdb ("arm64/fpsimd: SME no longer requires SVE register state") 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch") Since the last commit above, a task's FPSIMD/SVE state may be stored in FPSIMD format while TIF_SVE is set, and the stored SVE state is stale. When vec_set_vector_length() encounters this case, it will erroneously clobber the live FPSIMD state with stale SVE state by using sve_to_fpsimd(). Fix this by using fpsimd_sync_from_effective_state() instead. Related issues with streaming mode state will be addressed in subsequent patches. Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Spickett <david.spickett@arm.com> Cc: Luis Machado <luis.machado@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-15-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08arm64/fpsimd: Add task_smstop_sm()Mark Rutland
In a few places we want to transition a task from streaming mode to non-streaming mode, e.g. signal delivery where we historically tried to use an SMSTOP SM instruction. Add a new helper to manipulate a task's state in the same way as an SMSTOP SM instruction. I have not added a corresponding helper to simulate the effects of SMSTART SM. Only ptrace transitions a task into streaming mode, and ptrace has distinct semantics for such transitions. Per ARM DDI 0487 L.a, section B1.4.6: | RRSWFQ | When the Effective value of PSTATE.SM is changed by any method from 0 | to 1, an entry to Streaming SVE mode is performed, and all implemented | bits of Streaming SVE register state are set to zero. | RKFRQZ | When the Effective value of PSTATE.SM is changed by any method from 1 | to 0, an exit from Streaming SVE mode is performed, and in the | newly-entered mode, all implemented bits of the SVE scalable vector | registers, SVE predicate registers, and FFR, are set to zero. Per ARM DDI 0487 L.a, section C5.2.9: | On entry to or exit from Streaming SVE mode, FPMR is set to 0 Per ARM DDI 0487 L.a, section C5.2.10: | On entry to or exit from Streaming SVE mode, FPSR.{IOC, DZC, OFC, UFC, | IXC, IDC, QC} are set to 1 and the remaining bits are set to 0. This means bits 0, 1, 2, 3, 4, 7, and 27 respectively, i.e. 0x0800009f Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-9-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08arm64/fpsimd: Factor out {sve,sme}_state_size() helpersMark Rutland
In subsequent patches we'll need to determine the SVE/SME state size for a given SVE VL and SME VL regardless of whether a task is currently configured with those VLs. Split the sizing logic out of sve_state_size() and sme_state_size() so that we don't need to open-code this logic elsewhere. At the same time, apply minor cleanups: * Move sve_state_size() into fpsimd.h, matching the placement of sme_state_size(). * Remove the feature checks from sve_state_size(). We only call sve_state_size() when at least one of SVE and SME are supported, and when either of the two is not supported, the task's corresponding SVE/SME vector length will be zero. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-8-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08arm64/fpsimd: Clarify sve_sync_*() functionsMark Rutland
The sve_sync_{to,from}_fpsimd*() functions are intended to extract/insert the currently effective FPSIMD state of a task regardless of whether the task's state is saved in FPSIMD format or SVE format. Historically they were only used by ptrace, but sve_sync_to_fpsimd() is now used more widely, and sve_sync_from_fpsimd_zeropad() may be used more widely in future. When FPSIMD/SVE state tracking was changed across commits: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE") a0136be443d5 (arm64/fpsimd: Load FP state based on recorded data type") bbc6172eefdb ("arm64/fpsimd: SME no longer requires SVE register state") 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch") ... sve_sync_to_fpsimd() was updated to consider task->thread.fp_type rather than the task's TIF_SVE and PSTATE.SM, but (apparently due to an oversight) sve_sync_from_fpsimd_zeropad() was left as-is, leaving the two inconsistent. Due to this, sve_sync_from_fpsimd_zeropad() may copy state from task->thread.uw.fpsimd_state into task->thread.sve_state when task->thread.fp_type == FP_STATE_FPSIMD. This is redundant (but benign) as task->thread.uw.fpsimd_state is the effective state that will be restored, and task->thread.sve_state will not be consumed. For consistency, and to avoid the redundant work, it better for sve_sync_from_fpsimd_zeropad() to consider task->thread.fp_type alone, matching sve_sync_to_fpsimd(). The naming of both functions is somehat unfortunate, as it is unclear when and why they copy state. It would be better to describe them in terms of the effective state. Considering all of the above, clean this up: * Adjust sve_sync_from_fpsimd_zeropad() to consider task->thread.fp_type. * Update comments to clarify the intended semantics/usage. I've removed the description that task->thread.sve_state must have been allocated, as this is only necessary when task->thread.fp_type == FP_STATE_SVE, which itself implies that task->thread.sve_state must have been allocated. * Rename the functions to more clearly indicate when/why they copy state: - sve_sync_to_fpsimd() => fpsimd_sync_from_effective_state() - sve_sync_from_fpsimd_zeropad => fpsimd_sync_to_effective_state_zeropad() Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-7-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08arm64/fpsimd: ptrace: Consistently handle partial writes to NT_ARM_(S)SVEMark Rutland
Partial writes to the NT_ARM_SVE and NT_ARM_SSVE regsets using an payload are handled inconsistently and non-deterministically. A comment within sve_set_common() indicates that we intended that a partial write would preserve any effective FPSIMD/SVE state which was not overwritten, but this has never worked consistently, and during syscalls the FPSIMD vector state may be non-deterministically preserved and may be erroneously migrated between streaming and non-streaming SVE modes. The simplest fix is to handle a partial write by consistently zeroing the remaining state. As detailed below I do not believe this will adversely affect any real usage. Neither GDB nor LLDB attempt partial writes to these regsets, and the documentation (in Documentation/arch/arm64/sve.rst) has always indicated that state preservation was not guaranteed, as is says: | The effect of writing a partial, incomplete payload is unspecified. When the logic was originally introduced in commit: 43d4da2c45b2 ("arm64/sve: ptrace and ELF coredump support") ... there were two potential behaviours, depending on TIF_SVE: * When TIF_SVE was clear, all SVE state would be zeroed, excluding the low 128 bits of vectors shared with FPSIMD, FPSR, and FPCR. * When TIF_SVE was set, all SVE state would be zeroed, including the low 128 bits of vectors shared with FPSIMD, but excluding FPSR and FPCR. Note that as writing to NT_ARM_SVE would set TIF_SVE, partial writes to NT_ARM_SVE would not be idempotent, and if a first write preserved the low 128 bits, a subsequent (potentially identical) partial write would discard the low 128 bits. When support for the NT_ARM_SSVE regset was added in commit: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") ... the above behaviour was retained for writes to the NT_ARM_SVE regset, though writes to the NT_ARM_SSVE would always zero the SVE registers and would not inherit FPSIMD register state. This happened as fpsimd_sync_to_sve() only copied the FPSIMD regs when TIF_SVE was clear and PSTATE.SM==0. Subsequently, when FPSIMD/SVE state tracking was changed across commits: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE") a0136be443d5 (arm64/fpsimd: Load FP state based on recorded data type") bbc6172eefdb ("arm64/fpsimd: SME no longer requires SVE register state") 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch") ... there was no corresponding update to the ptrace code, nor to fpsimd_sync_to_sve(), which stil considers TIF_SVE and PSTATE.SM rather than the saved fp_type. The saved state can be in the FPSIMD format regardless of whether TIF_SVE is set or clear, and the saved type can change non-deterministically during syscalls. Consequently a subsequent partial write to the NT_ARM_SVE or NT_ARM_SSVE regsets may non-deterministically preserve the FPSIMD state, and may migrate this state between streaming and non-streaming modes. Clean this up by never attempting to preserve ANY state when writing an SVE payload to the NT_ARM_SVE/NT_ARM_SSVE regsets, zeroing all relevant state including FPSR and FPCR. This simplifies the code, makes the behaviour deterministic, and avoids migrating state between streaming and non-streaming modes. As above, I do not believe this should adversely affect existing userspace applications. At the same time, remove fpsimd_sync_to_sve(). It is no longer used, doesn't do what its documentation implies, and gets in the way of other cleanups and fixes. Fixes: 43d4da2c45b2 ("arm64/sve: ptrace and ELF coredump support") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Spickett <david.spickett@arm.com> Cc: Luis Machado <luis.machado@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-6-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08arm64/fpsimd: Do not discard modified SVE stateMark Rutland
Historically SVE state was discarded deterministically early in the syscall entry path, before ptrace is notified of syscall entry. This permitted ptrace to modify SVE state before and after the "real" syscall logic was executed, with the modified state being retained. This behaviour was changed by commit: 8c845e2731041f0f ("arm64/sve: Leave SVE enabled on syscall if we don't context switch") That commit was intended to speed up workloads that used SVE by opportunistically leaving SVE enabled when returning from a syscall. The syscall entry logic was modified to truncate the SVE state without disabling userspace access to SVE, and fpsimd_save_user_state() was modified to discard userspace SVE state whenever in_syscall(current_pt_regs()) is true, i.e. when current_pt_regs()->syscallno != NO_SYSCALL. Leaving SVE enabled opportunistically resulted in a couple of changes to userspace visible behaviour which weren't described at the time, but are logical consequences of opportunistically leaving SVE enabled: * Signal handlers can observe the type of saved state in the signal's sve_context record. When the kernel only tracks FPSIMD state, the 'vq' field is 0 and there is no space allocated for register contents. When the kernel tracks SVE state, the 'vq' field is non-zero and the register contents are saved into the record. As a result of the above commit, 'vq' (and the presence of SVE register state) is non-deterministically zero or non-zero for a period of time after a syscall. The effective register state is still deterministic. Hopefully no-one relies on this being deterministic. In general, handlers for asynchronous events cannot expect a deterministic state. * Similarly to signal handlers, ptrace requests can observe the type of saved state in the NT_ARM_SVE and NT_ARM_SSVE regsets, as this is exposed in the header flags. As a result of the above commit, this is now in a non-deterministic state after a syscall. The effective register state is still deterministic. Hopefully no-one relies on this being deterministic. In general, debuggers would have to handle this changing at arbitrary points during program flow. Discarding the SVE state within fpsimd_save_user_state() resulted in other changes to userspace visible behaviour which are not desirable: * A ptrace tracer can modify (or create) a tracee's SVE state at syscall entry or syscall exit. As a result of the above commit, the tracee's SVE state can be discarded non-deterministically after modification, rather than being retained as it previously was. Note that for co-operative tracer/tracee pairs, the tracer may (re)initialise the tracee's state arbitrarily after the tracee sends itself an initial SIGSTOP via a syscall, so this affects realistic design patterns. * The current_pt_regs()->syscallno field can be modified via ptrace, and can be altered even when the tracee is not really in a syscall, causing non-deterministic discarding to occur in situations where this was not previously possible. Further, using current_pt_regs()->syscallno in this way is unsound: * There are data races between readers and writers of the current_pt_regs()->syscallno field. The current_pt_regs()->syscallno field is written in interruptible task context using plain C accesses, and is read in irq/softirq context using plain C accesses. These accesses are subject to data races, with the usual concerns with tearing, etc. * Writes to current_pt_regs()->syscallno are subject to compiler reordering. As current_pt_regs()->syscallno is written with plain C accesses, the compiler is free to move those writes arbitrarily relative to anything which doesn't access the same memory location. In theory this could break signal return, where prior to restoring the SVE state, restore_sigframe() calls forget_syscall(). If the write were hoisted after restore of some SVE state, that state could be discarded unexpectedly. In practice that reordering cannot happen in the absence of LTO (as cross compilation-unit function calls happen prevent this reordering), and that reordering appears to be unlikely in the presence of LTO. Additionally, since commit: f130ac0ae4412dbe ("arm64: syscall: unmask DAIF earlier for SVCs") ... DAIF is unmasked before el0_svc_common() sets regs->syscallno to the real syscall number. Consequently state may be saved in SVE format prior to this point. Considering all of the above, current_pt_regs()->syscallno should not be used to infer whether the SVE state can be discarded. Luckily we can instead use cpu_fp_state::to_save to track when it is safe to discard the SVE state: * At syscall entry, after the live SVE register state is truncated, set cpu_fp_state::to_save to FP_STATE_FPSIMD to indicate that only the FPSIMD portion is live and needs to be saved. * At syscall exit, once the task's state is guaranteed to be live, set cpu_fp_state::to_save to FP_STATE_CURRENT to indicate that TIF_SVE must be considered to determine which state needs to be saved. * Whenever state is modified, it must be saved+flushed prior to manipulation. The state will be truncated if necessary when it is saved, and reloading the state will set fp_state::to_save to FP_STATE_CURRENT, preventing subsequent discarding. This permits SVE state to be discarded *only* when it is known to have been truncated (and the non-FPSIMD portions must be zero), and ensures that SVE state is retained after it is explicitly modified. For backporting, note that this fix depends on the following commits: * b2482807fbd4 ("arm64/sme: Optimise SME exit on syscall entry") * f130ac0ae441 ("arm64: syscall: unmask DAIF earlier for SVCs") * 929fa99b1215 ("arm64/fpsimd: signal: Always save+flush state early") Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch") Fixes: f130ac0ae441 ("arm64: syscall: unmask DAIF earlier for SVCs") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-04-30arm64/fpsimd: Avoid warning when sve_to_fpsimd() is unusedMark Rutland
Historically fpsimd_to_sve() and sve_to_fpsimd() were (conditionally) called by functions which were defined regardless of CONFIG_ARM64_SVE. Hence it was necessary that both fpsimd_to_sve() and sve_to_fpsimd() were always defined and not guarded by ifdeffery. As a result of the removal of fpsimd_signal_preserve_current_state() in commit: 929fa99b1215966f ("arm64/fpsimd: signal: Always save+flush state early") ... sve_to_fpsimd() has no callers when CONFIG_ARM64_SVE=n, resulting in a build-time warnign that it is unused: | arch/arm64/kernel/fpsimd.c:676:13: warning: unused function 'sve_to_fpsimd' [-Wunused-function] | 676 | static void sve_to_fpsimd(struct task_struct *task) | | ^~~~~~~~~~~~~ | 1 warning generated. In contrast, fpsimd_to_sve() still has callers which are defined when CONFIG_ARM64_SVE=n, and it would be awkward to hide this behind ifdeffery and/or to use stub functions. For now, suppress the warning by marking both fpsimd_to_sve() and sve_to_fpsimd() as 'static inline', as we usually do for stub functions. The compiler will no longer warn if either function is unused. Aside from suppressing the warning, there should be no functional change as a result of this patch. Link: https://lore.kernel.org/linux-arm-kernel/20250429194600.GA26883@willie-the-truck/ Reported-by: Will Deacon <will@kernel.org> Fixes: 929fa99b1215 ("arm64/fpsimd: signal: Always save+flush state early") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250430173240.4023627-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-29arm64/fpsimd: Avoid unnecessary per-CPU buffers for EFI runtime callsArd Biesheuvel
The EFI specification has some elaborate rules about which runtime services may be called while another runtime service call is already in progress. In Linux, however, for simplicity, all EFI runtime service invocations are serialized via the efi_runtime_lock semaphore. This implies that calls to the helper pair arch_efi_call_virt_setup() and arch_efi_call_virt_teardown() are serialized too, and are guaranteed not to nest. Furthermore, the arm64 arch code has its own spinlock to serialize use of the EFI runtime stack, of which only a single instance exists. This all means that the FP/SIMD and SVE state preserve/restore logic in __efi_fpsimd_begin() and __efi_fpsimd_end() are also serialized, and only a single instance of the associated per-CPU variables can ever be in use at the same time. There is therefore no need at all for per-CPU variables here, and they can all be replaced with singleton instances. This saves a non-trivial amount of memory on systems with many CPUs. To be more robust against potential future changes in the core EFI code that may invalidate the reasoning above, move the invocations of __efi_fpsimd_begin() and __efi_fpsimd_end() into the critical section covered by the efi_rt_lock spinlock. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250318132421.3155799-2-ardb+git@google.com Signed-off-by: Will Deacon <will@kernel.org>
2025-04-09arm64/fpsimd: signal: Always save+flush state earlyMark Rutland
There are several issues with the way the native signal handling code manipulates FPSIMD/SVE/SME state, described in detail below. These issues largely result from races with preemption and inconsistent handling of live state vs saved state. Known issues with native FPSIMD/SVE/SME state management include: * On systems with FPMR, the code to save/restore the FPMR accesses the register while it is not owned by the current task. Consequently, this may corrupt the FPMR of the current task and/or may corrupt the FPMR of an unrelated task. The FPMR save/restore has been broken since it was introduced in commit: 8c46def44409fc91 ("arm64/signal: Add FPMR signal handling") * On systems with SME, setup_return() modifies both the live register state and the saved state register state regardless of whether the task's state is live, and without holding the cpu fpsimd context. Consequently: - This may corrupt the state an unrelated task which has PSTATE.SM set and/or PSTATE.ZA set. - The task may enter the signal handler in streaming mode, and or with ZA storage enabled unexpectedly. - The task may enter the signal handler in non-streaming SVE mode with stale SVE register state, which may have been inherited from streaming SVE mode unexpectedly. Where the streaming and non-streaming vector lengths differ, this may be packed into registers arbitrarily. This logic has been broken since it was introduced in commit: 40a8e87bb32855b3 ("arm64/sme: Disable ZA and streaming mode when handling signals") Further incorrect manipulation of state was added in commits: ea64baacbc36a0d5 ("arm64/signal: Flush FPSIMD register state when disabling streaming mode") baa8515281b30861 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE") * Several restoration functions use fpsimd_flush_task_state() to discard the live FPSIMD/SVE/SME while the in-memory copy is stale. When a subset of the FPSIMD/SVE/SME state is restored, the remainder may be non-deterministically reset to a stale snapshot from some arbitrary point in the past. This non-deterministic discarding was introduced in commit: 8cd969d28fd2848d ("arm64/sve: Signal handling support") As of that commit, when TIF_SVE was initially clear, failure to restore the SVE signal frame could reset the FPSIMD registers to a stale snapshot. The pattern of discarding unsaved state was subsequently copied into restoration functions for some new state in commits: 39782210eb7e8763 ("arm64/sme: Implement ZA signal handling") ee072cf708048c0d ("arm64/sme: Implement signal handling for ZT") * On systems with SME/SME2, the entire FPSIMD/SVE/SME state may be loaded onto the CPU redundantly. Either restore_fpsimd_context() or restore_sve_fpsimd_context() will load the entire FPSIMD/SVE/SME state via fpsimd_update_current_state() before restore_za_context() and restore_zt_context() each discard the state via fpsimd_flush_task_state(). This is purely redundant work, and not a functional bug. To fix these issues, rework the native signal handling code to always save+flush the current task's FPSIMD/SVE/SME state before manipulating that state. This avoids races with preemption and ensures that state is manipulated consistently regardless of whether it happened to be live prior to manipulation. This largely involes: * Using fpsimd_save_and_flush_current_state() to save+flush the state for both signal delivery and signal return, before the state is manipulated in any way. * Removing fpsimd_signal_preserve_current_state() and updating preserve_fpsimd_context() to explicitly ensure that the FPSIMD state is up-to-date, as preserve_fpsimd_context() is the only consumer of the FPSIMD state during signal delivery. * Modifying fpsimd_update_current_state() to not reload the FPSIMD state onto the CPU. Ideally we'd remove fpsimd_update_current_state() entirely, but I've left that for subsequent patches as there are a number of of other problems with the FPSIMD<->SVE conversion helpers that should be addressed at the same time. For now I've removed the misleading comment. For setup_return(), we need to decide (for ABI reasons) whether signal delivery should have all the side-effects of an SMSTOP. For now I've left a TODO comment, as there are other questions in this area that I'll address with subsequent patches. Fixes: 8c46def44409 ("arm64/signal: Add FPMR signal handling") Fixes: 40a8e87bb328 ("arm64/sme: Disable ZA and streaming mode when handling signals") Fixes: ea64baacbc36 ("arm64/signal: Flush FPSIMD register state when disabling streaming mode") Fixes: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE") Fixes: 8cd969d28fd2 ("arm64/sve: Signal handling support") Fixes: 39782210eb7e ("arm64/sme: Implement ZA signal handling") Fixes: ee072cf70804 ("arm64/sme: Implement signal handling for ZT") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250409164010.3480271-13-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09arm64/fpsimd: Add fpsimd_save_and_flush_current_state()Mark Rutland
When the current task's FPSIMD/SVE/SME state may be live on *any* CPU in the system, special care must be taken when manipulating that state, as this manipulation can race with preemption and/or asynchronous usage of FPSIMD/SVE/SME (e.g. kernel-mode NEON in softirq handlers). Even when manipulation is is protected with get_cpu_fpsimd_context() and get_cpu_fpsimd_context(), the logic necessary when the state is live on the current CPU can be wildly different from the logic necessary when the state is not live on the current CPU. A number of historical and extant issues result from failing to handle these cases consistetntly and/or correctly. To make it easier to get such manipulation correct, add a new fpsimd_save_and_flush_current_state() helper function, which ensures that the current task's state has been saved to memory and any stale state on any CPU has been "flushed" such that is not live on any CPU in the system. This will allow code to safely manipulate the saved state without risk of races. Subsequent patches will use the new function. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250409164010.3480271-11-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09arm64/fpsimd: Fix merging of FPSIMD state during signal returnMark Rutland
For backwards compatibility reasons, when a signal return occurs which restores SVE state, the effective lower 128 bits of each of the SVE vector registers are restored from the corresponding FPSIMD vector register in the FPSIMD signal frame, overriding the values in the SVE signal frame. This is intended to be the case regardless of streaming mode. To make this happen, restore_sve_fpsimd_context() uses fpsimd_update_current_state() to merge the lower 128 bits from the FPSIMD signal frame into the SVE register state. Unfortunately, fpsimd_update_current_state() performs this merging dependent upon TIF_SVE, which is not always correct for streaming SVE register state: * When restoring non-streaming SVE register state there is no observable problem, as the signal return code configures TIF_SVE and the saved fp_type to match before calling fpsimd_update_current_state(), which observes either: - TIF_SVE set AND fp_type == FP_STATE_SVE - TIF_SVE clear AND fp_type == FP_STATE_FPSIMD * On systems which have SME but not SVE, TIF_SVE cannot be set. Thus the merging will never happen for the streaming SVE register state. * On systems which have SVE and SME, TIF_SVE can be set and cleared independently of PSTATE.SM. Thus the merging may or may not happen for streaming SVE register state. As TIF_SVE can be cleared non-deterministically during syscalls (including at the start of sigreturn()), the merging may occur non-deterministically from the perspective of userspace. This logic has been broken since its introduction in commit: 85ed24dad2904f7c ("arm64/sme: Implement streaming SVE signal handling") ... at which point both fpsimd_signal_preserve_current_state() and fpsimd_update_current_state() only checked TIF SVE. When PSTATE.SM==1 and TIF_SVE was clear, signal delivery would place stale FPSIMD state into the FPSIMD signal frame, and signal return would not merge this into the restored register state. Subsequently, signal delivery was fixed as part of commit: 61da7c8e2a602f66 ("arm64/signal: Don't assume that TIF_SVE means we saved SVE state") ... but signal restore was not given a corresponding fix, and when TIF_SVE was clear, signal restore would still fail to merge the FPSIMD state into the restored SVE register state. The 'Fixes' tag did not indicate that this had been broken since its introduction. Fix this by merging the FPSIMD state dependent upon the saved fp_type, matching what we (currently) do during signal delivery. As described above, when backporting this commit, it will also be necessary to backport commit: 61da7c8e2a602f66 ("arm64/signal: Don't assume that TIF_SVE means we saved SVE state") ... and prior to commit: baa8515281b30861 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE") ... it will be necessary for fpsimd_signal_preserve_current_state() and fpsimd_update_current_state() to consider both TIF_SVE and thread_sm_enabled(&current->thread), in place of the saved fp_type. Fixes: 85ed24dad290 ("arm64/sme: Implement streaming SVE signal handling") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250409164010.3480271-10-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09arm64/fpsimd: Reset FPMR upon exec()Mark Rutland
An exec() is expected to reset all FPSIMD/SVE/SME state, and barring special handling of the vector lengths, the state is expected to reset to zero. This reset is handled in fpsimd_flush_thread(), which the core exec() code calls via flush_thread(). When support was added for FPMR, no logic was added to fpsimd_flush_thread() to reset the FPMR value, and thus it is erroneously inherited across an exec(). Add the missing reset of FPMR. Fixes: 203f2b95a882 ("arm64/fpsimd: Support FEAT_FPMR") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250409164010.3480271-9-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09arm64/fpsimd: Avoid clobbering kernel FPSIMD state with SMSTOPMark Rutland
On system with SME, a thread's kernel FPSIMD state may be erroneously clobbered during a context switch immediately after that state is restored. Systems without SME are unaffected. If the CPU happens to be in streaming SVE mode before a context switch to a thread with kernel FPSIMD state, fpsimd_thread_switch() will restore the kernel FPSIMD state using fpsimd_load_kernel_state() while the CPU is still in streaming SVE mode. When fpsimd_thread_switch() subsequently calls fpsimd_flush_cpu_state(), this will execute an SMSTOP, causing an exit from streaming SVE mode. The exit from streaming SVE mode will cause the hardware to reset a number of FPSIMD/SVE/SME registers, clobbering the FPSIMD state. Fix this by calling fpsimd_flush_cpu_state() before restoring the kernel FPSIMD state. Fixes: e92bee9f861b ("arm64/fpsimd: Avoid erroneous elide of user state reload") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250409164010.3480271-8-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09arm64/fpsimd: Don't corrupt FPMR when streaming mode changesMark Brown
When the effective value of PSTATE.SM is changed from 0 to 1 or from 1 to 0 by any method, an entry or exit to/from streaming SVE mode is performed, and hardware automatically resets a number of registers. As of ARM DDI 0487 L.a, this means: * All implemented bits of the SVE vector registers are set to zero. * All implemented bits of the SVE predicate registers are set to zero. * All implemented bits of FFR are set to zero, if FFR is implemented in the new mode. * FPSR is set to 0x0000_0000_0800_009f. * FPMR is set to 0, if FPMR is implemented. Currently task_fpsimd_load() restores FPMR before restoring SVCR (which is an accessor for PSTATE.{SM,ZA}), and so the restored value of FPMR may be clobbered if the restored value of PSTATE.SM happens to differ from the initial value of PSTATE.SM. Fix this by moving the restore of FPMR later. Note: this was originally posted as [1]. Fixes: 203f2b95a882 ("arm64/fpsimd: Support FEAT_FPMR") Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/linux-arm-kernel/20241204-arm64-sme-reenable-v2-2-bae87728251d@kernel.org/ [ Rutland: rewrite commit message ] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20250409164010.3480271-7-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09arm64/fpsimd: Discard stale CPU state when handling SME trapsMark Brown
The logic for handling SME traps manipulates saved FPSIMD/SVE/SME state incorrectly, and a race with preemption can result in a task having TIF_SME set and TIF_FOREIGN_FPSTATE clear even though the live CPU state is stale (e.g. with SME traps enabled). This can result in warnings from do_sme_acc() where SME traps are not expected while TIF_SME is set: | /* With TIF_SME userspace shouldn't generate any traps */ | if (test_and_set_thread_flag(TIF_SME)) | WARN_ON(1); This is very similar to the SVE issue we fixed in commit: 751ecf6afd6568ad ("arm64/sve: Discard stale CPU state when handling SVE traps") The race can occur when the SME trap handler is preempted before and after manipulating the saved FPSIMD/SVE/SME state, starting and ending on the same CPU, e.g. | void do_sme_acc(unsigned long esr, struct pt_regs *regs) | { | // Trap on CPU 0 with TIF_SME clear, SME traps enabled | // task->fpsimd_cpu is 0. | // per_cpu_ptr(&fpsimd_last_state, 0) is task. | | ... | | // Preempted; migrated from CPU 0 to CPU 1. | // TIF_FOREIGN_FPSTATE is set. | | get_cpu_fpsimd_context(); | | /* With TIF_SME userspace shouldn't generate any traps */ | if (test_and_set_thread_flag(TIF_SME)) | WARN_ON(1); | | if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) { | unsigned long vq_minus_one = | sve_vq_from_vl(task_get_sme_vl(current)) - 1; | sme_set_vq(vq_minus_one); | | fpsimd_bind_task_to_cpu(); | } | | put_cpu_fpsimd_context(); | | // Preempted; migrated from CPU 1 to CPU 0. | // task->fpsimd_cpu is still 0 | // If per_cpu_ptr(&fpsimd_last_state, 0) is still task then: | // - Stale HW state is reused (with SME traps enabled) | // - TIF_FOREIGN_FPSTATE is cleared | // - A return to userspace skips HW state restore | } Fix the case where the state is not live and TIF_FOREIGN_FPSTATE is set by calling fpsimd_flush_task_state() to detach from the saved CPU state. This ensures that a subsequent context switch will not reuse the stale CPU state, and will instead set TIF_FOREIGN_FPSTATE, forcing the new state to be reloaded from memory prior to a return to userspace. Note: this was originallly posted as [1]. Fixes: 8bd7f91c03d8 ("arm64/sme: Implement traps and syscall handling for SME") Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/linux-arm-kernel/20241204-arm64-sme-reenable-v2-1-bae87728251d@kernel.org/ [ Rutland: rewrite commit message ] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250409164010.3480271-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09arm64/fpsimd: Remove opportunistic freeing of SME stateMark Rutland
When a task's SVE vector length (NSVL) is changed, and the task happens to have SVCR.{SM,ZA}=={0,0}, vec_set_vector_length() opportunistically frees the task's sme_state and clears TIF_SME. The opportunistic freeing was added with no rationale in commit: d4d5be94a8787242 ("arm64/fpsimd: Ensure SME storage is allocated after SVE VL changes") That commit fixed an unrelated problem where the task's sve_state was freed while it could be used to store streaming mode register state, where the fix was to re-allocate the task's sve_state. There is no need to free and/or reallocate the task's sme_state when the SVE vector length changes, and there is no need to clear TIF_SME. Given the SME vector length (SVL) doesn't change, the task's sme_state remains correctly sized. Remove the unnecessary opportunistic freeing of the task's sme_state when the task's SVE vector length is changed. The task's sme_state is still freed when the SME vector length is changed. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250409164010.3480271-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09arm64/fpsimd: Remove redundant SVE trap manipulationMark Rutland
When task_fpsimd_load() loads the saved FPSIMD/SVE/SME state, it configures EL0 SVE traps by calling sve_user_{enable,disable}(). This is unnecessary, and this is suspicious/confusing as task_fpsimd_load() does not configure EL0 SME traps. All calls to task_fpsimd_load() are followed by a call to fpsimd_bind_task_to_cpu(), where the latter configures traps for SVE and SME dependent upon the current values of TIF_SVE and TIF_SME, overriding any trap configuration performed by task_fpsimd_load(). The calls to sve_user_{enable,disable}() calls in task_fpsimd_load() have been redundant (though benign) since they were introduced in commit: a0136be443d51803 ("arm64/fpsimd: Load FP state based on recorded data type") Remove the unnecessary and confusing SVE trap manipulation. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250409164010.3480271-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09arm64/fpsimd: Remove unused fpsimd_force_sync_to_sve()Mark Rutland
There have been no users of fpsimd_force_sync_to_sve() since commit: bbc6172eefdb276b ("arm64/fpsimd: SME no longer requires SVE register state") Remove fpsimd_force_sync_to_sve(). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250409164010.3480271-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09arm64/fpsimd: Avoid RES0 bits in the SME trap handlerMark Rutland
The SME trap handler consumes RES0 bits from the ESR when determining the reason for the trap, and depends upon those bits reading as zero. This may break in future when those RES0 bits are allocated a meaning and stop reading as zero. For SME traps taken with ESR_ELx.EC == 0b011101, the specific reason for the trap is indicated by ESR_ELx.ISS.SMTC ("SME Trap Code"). This field occupies bits [2:0] of ESR_ELx.ISS, and as of ARM DDI 0487 L.a, bits [24:3] of ESR_ELx.ISS are RES0. ESR_ELx.ISS itself occupies bits [24:0] of ESR_ELx. Extract the SMTC field specifically, matching the way we handle ESR_ELx fields elsewhere, and ensuring that the handler is future-proof. Fixes: 8bd7f91c03d8 ("arm64/sme: Implement traps and syscall handling for SME") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250409164010.3480271-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-02-14Merge tag 'kvmarm-fixes-6.14-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.14, take #2 - Large set of fixes for vector handling, specially in the interactions between host and guest state. This fixes a number of bugs affecting actual deployments, and greatly simplifies the FP/SIMD/SVE handling. Thanks to Mark Rutland for dealing with this thankless task. - Fix an ugly race between vcpu and vgic creation/init, resulting in unexpected behaviours. - Fix use of kernel VAs at EL2 when emulating timers with nVHE. - Small set of pKVM improvements and cleanups.
2025-02-13KVM: arm64: Unconditionally save+flush host FPSIMD/SVE/SME stateMark Rutland
There are several problems with the way hyp code lazily saves the host's FPSIMD/SVE state, including: * Host SVE being discarded unexpectedly due to inconsistent configuration of TIF_SVE and CPACR_ELx.ZEN. This has been seen to result in QEMU crashes where SVE is used by memmove(), as reported by Eric Auger: https://issues.redhat.com/browse/RHEL-68997 * Host SVE state is discarded *after* modification by ptrace, which was an unintentional ptrace ABI change introduced with lazy discarding of SVE state. * The host FPMR value can be discarded when running a non-protected VM, where FPMR support is not exposed to a VM, and that VM uses FPSIMD/SVE. In these cases the hyp code does not save the host's FPMR before unbinding the host's FPSIMD/SVE/SME state, leaving a stale value in memory. Avoid these by eagerly saving and "flushing" the host's FPSIMD/SVE/SME state when loading a vCPU such that KVM does not need to save any of the host's FPSIMD/SVE/SME state. For clarity, fpsimd_kvm_prepare() is removed and the necessary call to fpsimd_save_and_flush_cpu_state() is placed in kvm_arch_vcpu_load_fp(). As 'fpsimd_state' and 'fpmr_ptr' should not be used, they are set to NULL; all uses of these will be removed in subsequent patches. Historical problems go back at least as far as v5.17, e.g. erroneous assumptions about TIF_SVE being clear in commit: 8383741ab2e773a9 ("KVM: arm64: Get rid of host SVE tracking/saving") ... and so this eager save+flush probably needs to be backported to ALL stable trees. Fixes: 93ae6b01bafee8fa ("KVM: arm64: Discard any SVE state when entering KVM guests") Fixes: 8c845e2731041f0f ("arm64/sve: Leave SVE enabled on syscall if we don't context switch") Fixes: ef3be86021c3bdf3 ("KVM: arm64: Add save/restore support for FPMR") Reported-by: Eric Auger <eauger@redhat.com> Reported-by: Wilco Dijkstra <wilco.dijkstra@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Tested-by: Eric Auger <eric.auger@redhat.com> Acked-by: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Fuad Tabba <tabba@google.com> Cc: Jeremy Linton <jeremy.linton@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250210195226.1215254-2-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-01-28treewide: const qualify ctl_tables where applicableJoel Granados
Add the const qualifier to all the ctl_tables in the tree except for watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls, loadpin_sysctl_table and the ones calling register_net_sysctl (./net, drivers/inifiniband dirs). These are special cases as they use a registration function with a non-const qualified ctl_table argument or modify the arrays before passing them on to the registration function. Constifying ctl_table structs will prevent the modification of proc_handler function pointers as the arrays would reside in .rodata. This is made possible after commit 78eb4ea25cd5 ("sysctl: treewide: constify the ctl_table argument of proc_handlers") constified all the proc_handlers. Created this by running an spatch followed by a sed command: Spatch: virtual patch @ depends on !(file in "net") disable optional_qualifier @ identifier table_name != { watchdog_hardlockup_sysctl, iwcm_ctl_table, ucma_ctl_table, memory_allocation_profiling_sysctls, loadpin_sysctl_table }; @@ + const struct ctl_table table_name [] = { ... }; sed: sed --in-place \ -e "s/struct ctl_table .table = &uts_kern/const struct ctl_table *table = \&uts_kern/" \ kernel/utsname_sysctl.c Reviewed-by: Song Liu <song@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> # for kernel/trace/ Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> # SCSI Reviewed-by: Darrick J. Wong <djwong@kernel.org> # xfs Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Acked-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Acked-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2024-11-18Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - Support for running Linux in a protected VM under the Arm Confidential Compute Architecture (CCA) - Guarded Control Stack user-space support. Current patches follow the x86 ABI of implicitly creating a shadow stack on clone(). Subsequent patches (already on the list) will add support for clone3() allowing finer-grained control of the shadow stack size and placement from libc - AT_HWCAP3 support (not running out of HWCAP2 bits yet but we are getting close with the upcoming dpISA support) - Other arch features: - In-kernel use of the memcpy instructions, FEAT_MOPS (previously only exposed to user; uaccess support not merged yet) - MTE: hugetlbfs support and the corresponding kselftests - Optimise CRC32 using the PMULL instructions - Support for FEAT_HAFT enabling ARCH_HAS_NONLEAF_PMD_YOUNG - Optimise the kernel TLB flushing to use the range operations - POE/pkey (permission overlays): further cleanups after bringing the signal handler in line with the x86 behaviour for 6.12 - arm64 perf updates: - Support for the NXP i.MX91 PMU in the existing IMX driver - Support for Ampere SoCs in the Designware PCIe PMU driver - Support for Marvell's 'PEM' PCIe PMU present in the 'Odyssey' SoC - Support for Samsung's 'Mongoose' CPU PMU - Support for PMUv3.9 finer-grained userspace counter access control - Switch back to platform_driver::remove() now that it returns 'void' - Add some missing events for the CXL PMU driver - Miscellaneous arm64 fixes/cleanups: - Page table accessors cleanup: type updates, drop unused macros, reorganise arch_make_huge_pte() and clean up pte_mkcont(), sanity check addresses before runtime P4D/PUD folding - Command line override for ID_AA64MMFR0_EL1.ECV (advertising the FEAT_ECV for the generic timers) allowing Linux to boot with firmware deployments that don't set SCTLR_EL3.ECVEn - ACPI/arm64: tighten the check for the array of platform timer structures and adjust the error handling procedure in gtdt_parse_timer_block() - Optimise the cache flush for the uprobes xol slot (skip if no change) and other uprobes/kprobes cleanups - Fix the context switching of tpidrro_el0 when kpti is enabled - Dynamic shadow call stack fixes - Sysreg updates - Various arm64 kselftest improvements * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (168 commits) arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled kselftest/arm64: Try harder to generate different keys during PAC tests kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all() arm64/ptrace: Clarify documentation of VL configuration via ptrace kselftest/arm64: Corrupt P0 in the irritator when testing SSVE acpi/arm64: remove unnecessary cast arm64/mm: Change protval as 'pteval_t' in map_range() kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.c kselftest/arm64: Add FPMR coverage to fp-ptrace kselftest/arm64: Expand the set of ZA writes fp-ptrace does kselftets/arm64: Use flag bits for features in fp-ptrace assembler code kselftest/arm64: Enable build of PAC tests with LLVM=1 kselftest/arm64: Check that SVCR is 0 in signal handlers selftests/mm: Fix unused function warning for aarch64_write_signal_pkey() kselftest/arm64: Fix printf() compiler warnings in the arm64 syscall-abi.c tests kselftest/arm64: Fix printf() warning in the arm64 MTE prctl() test kselftest/arm64: Fix printf() compiler warnings in the arm64 fp tests kselftest/arm64: Fix build with stricter assemblers arm64/scs: Drop unused prototype __pi_scs_patch_vmlinux() arm64/scs: Deal with 64-bit relative offsets in FDE frames ...
2024-11-06arm64/sve: Discard stale CPU state when handling SVE trapsMark Brown
The logic for handling SVE traps manipulates saved FPSIMD/SVE state incorrectly, and a race with preemption can result in a task having TIF_SVE set and TIF_FOREIGN_FPSTATE clear even though the live CPU state is stale (e.g. with SVE traps enabled). This has been observed to result in warnings from do_sve_acc() where SVE traps are not expected while TIF_SVE is set: | if (test_and_set_thread_flag(TIF_SVE)) | WARN_ON(1); /* SVE access shouldn't have trapped */ Warnings of this form have been reported intermittently, e.g. https://lore.kernel.org/linux-arm-kernel/CA+G9fYtEGe_DhY2Ms7+L7NKsLYUomGsgqpdBj+QwDLeSg=JhGg@mail.gmail.com/ https://lore.kernel.org/linux-arm-kernel/000000000000511e9a060ce5a45c@google.com/ The race can occur when the SVE trap handler is preempted before and after manipulating the saved FPSIMD/SVE state, starting and ending on the same CPU, e.g. | void do_sve_acc(unsigned long esr, struct pt_regs *regs) | { | // Trap on CPU 0 with TIF_SVE clear, SVE traps enabled | // task->fpsimd_cpu is 0. | // per_cpu_ptr(&fpsimd_last_state, 0) is task. | | ... | | // Preempted; migrated from CPU 0 to CPU 1. | // TIF_FOREIGN_FPSTATE is set. | | get_cpu_fpsimd_context(); | | if (test_and_set_thread_flag(TIF_SVE)) | WARN_ON(1); /* SVE access shouldn't have trapped */ | | sve_init_regs() { | if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) { | ... | } else { | fpsimd_to_sve(current); | current->thread.fp_type = FP_STATE_SVE; | } | } | | put_cpu_fpsimd_context(); | | // Preempted; migrated from CPU 1 to CPU 0. | // task->fpsimd_cpu is still 0 | // If per_cpu_ptr(&fpsimd_last_state, 0) is still task then: | // - Stale HW state is reused (with SVE traps enabled) | // - TIF_FOREIGN_FPSTATE is cleared | // - A return to userspace skips HW state restore | } Fix the case where the state is not live and TIF_FOREIGN_FPSTATE is set by calling fpsimd_flush_task_state() to detach from the saved CPU state. This ensures that a subsequent context switch will not reuse the stale CPU state, and will instead set TIF_FOREIGN_FPSTATE, forcing the new state to be reloaded from memory prior to a return to userspace. Fixes: cccb78ce89c4 ("arm64/sve: Rework SVE access trap to convert state in registers") Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20241030-arm64-fpsimd-foreign-flush-v1-1-bd7bd66905a2@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-11-04arm64/fpsimd: Fix a typoChristophe JAILLET
s/FPSMID/FPSIMD/ M and I swapped. Fix it. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/2cbcb42615e9265bccc9b746465d7998382e605d.1730539907.git.christophe.jaillet@wanadoo.fr Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-07-24sysctl: treewide: constify the ctl_table argument of proc_handlersJoel Granados
const qualify the struct ctl_table argument in the proc_handler function signatures. This is a prerequisite to moving the static ctl_table structs into .rodata data which will ensure that proc_handler function pointers cannot be modified. This patch has been generated by the following coccinelle script: ``` virtual patch @r1@ identifier ctl, write, buffer, lenp, ppos; identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)"; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int write, void *buffer, size_t *lenp, loff_t *ppos); @r2@ identifier func, ctl, write, buffer, lenp, ppos; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int write, void *buffer, size_t *lenp, loff_t *ppos) { ... } @r3@ identifier func; @@ int func( - struct ctl_table * + const struct ctl_table * ,int , void *, size_t *, loff_t *); @r4@ identifier func, ctl; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int , void *, size_t *, loff_t *); @r5@ identifier func, write, buffer, lenp, ppos; @@ int func( - struct ctl_table * + const struct ctl_table * ,int write, void *buffer, size_t *lenp, loff_t *ppos); ``` * Code formatting was adjusted in xfs_sysctl.c to comply with code conventions. The xfs_stats_clear_proc_handler, xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax where adjusted. * The ctl_table argument in proc_watchdog_common was const qualified. This is called from a proc_handler itself and is calling back into another proc_handler, making it necessary to change it as part of the proc_handler migration. Co-developed-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Co-developed-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-05-22arm64/fpsimd: Avoid erroneous elide of user state reloadArd Biesheuvel
TIF_FOREIGN_FPSTATE is a 'convenience' flag that should reflect whether the current CPU holds the most recent user mode FP/SIMD state of the current task. It combines two conditions: - whether the current CPU's FP/SIMD state belongs to the task; - whether that state is the most recent associated with the task (as a task may have executed on other CPUs as well). When a task is scheduled in and TIF_KERNEL_FPSTATE is set, it means the task was in a kernel mode NEON section when it was scheduled out, and so the kernel mode FP/SIMD state is restored. Since this implies that the current CPU is *not* holding the most recent user mode FP/SIMD state of the current task, the TIF_FOREIGN_FPSTATE flag is set too, so that the user mode FP/SIMD state is reloaded from memory when returning to userland. However, the task may be scheduled out after completing the kernel mode NEON section, but before returning to userland. When this happens, the TIF_FOREIGN_FPSTATE flag will not be preserved, but will be set as usual the next time the task is scheduled in, and will be based on the above conditions. This means that, rather than setting TIF_FOREIGN_FPSTATE when scheduling in a task with TIF_KERNEL_FPSTATE set, the underlying state should be updated so that TIF_FOREIGN_FPSTATE will assume the expected value as a result. So instead, call fpsimd_flush_cpu_state(), which takes care of this. Closes: https://lore.kernel.org/all/cb8822182231850108fa43e0446a4c7f@kernel.org Reported-by: Johannes Nixdorf <mixi@shadowice.org> Fixes: aefbab8e77eb ("arm64: fpsimd: Preserve/restore kernel mode NEON at context switch") Cc: Mark Brown <broonie@kernel.org> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Janne Grunau <j@jannau.net> Cc: stable@vger.kernel.org Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Janne Grunau <j@jannau.net> Tested-by: Johannes Nixdorf <mixi@shadowice.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240522091335.335346-2-ardb+git@google.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-22Reapply "arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD"Will Deacon
This reverts commit b8995a18417088bb53f87c49d200ec72a9dd4ec1. Ard managed to reproduce the dm-crypt corruption problem and got to the bottom of it, so re-apply the problematic patch in preparation for fixing things properly. Cc: stable@vger.kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-05-17Revert "arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD"Will Deacon
This reverts commit 2632e25217696712681dd1f3ecc0d71624ea3b23. Johannes (and others) report data corruption with dm-crypt on Apple M1 which has been bisected to this change. Revert the offending commit while we figure out what's going on. Cc: stable@vger.kernel.org Reported-by: Johannes Nixdorf <mixi@shadowice.org> Link: https://lore.kernel.org/all/D1B7GPIR9K1E.5JFV37G0YTIF@shadowice.org/ Signed-off-by: Will Deacon <will@kernel.org>
2024-03-14Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "The major features are support for LPA2 (52-bit VA/PA with 4K and 16K pages), the dpISA extension and Rust enabled on arm64. The changes are mostly contained within the usual arch/arm64/, drivers/perf, the arm64 Documentation and kselftests. The exception is the Rust support which touches some generic build files. Summary: - Reorganise the arm64 kernel VA space and add support for LPA2 (at stage 1, KVM stage 2 was merged earlier) - 52-bit VA/PA address range with 4KB and 16KB pages - Enable Rust on arm64 - Support for the 2023 dpISA extensions (data processing ISA), host only - arm64 perf updates: - StarFive's StarLink (integrates one or more CPU cores with a shared L3 memory system) PMU support - Enable HiSilicon Erratum 162700402 quirk for HIP09 - Several updates for the HiSilicon PCIe PMU driver - Arm CoreSight PMU support - Convert all drivers under drivers/perf/ to use .remove_new() - Miscellaneous: - Don't enable workarounds for "rare" errata by default - Clean up the DAIF flags handling for EL0 returns (in preparation for NMI support) - Kselftest update for ptrace() - Update some of the sysreg field definitions - Slight improvement in the code generation for inline asm I/O accessors to permit offset addressing - kretprobes: acquire regs via a BRK exception (previously done via a trampoline handler) - SVE/SME cleanups, comment updates - Allow CALL_OPS+CC_OPTIMIZE_FOR_SIZE with clang (previously disabled due to gcc silently ignoring -falign-functions=N)" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (134 commits) Revert "mm: add arch hook to validate mmap() prot flags" Revert "arm64: mm: add support for WXN memory translation attribute" Revert "ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512" ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512 kselftest/arm64: Add 2023 DPISA hwcap test coverage kselftest/arm64: Add basic FPMR test kselftest/arm64: Handle FPMR context in generic signal frame parser arm64/hwcap: Define hwcaps for 2023 DPISA features arm64/ptrace: Expose FPMR via ptrace arm64/signal: Add FPMR signal handling arm64/fpsimd: Support FEAT_FPMR arm64/fpsimd: Enable host kernel access to FPMR arm64/cpufeature: Hook new identification registers up to cpufeature docs: perf: Fix build warning of hisi-pcie-pmu.rst perf: starfive: Only allow COMPILE_TEST for 64-bit architectures MAINTAINERS: Add entry for StarFive StarLink PMU docs: perf: Add description for StarFive's StarLink PMU dt-bindings: perf: starfive: Add JH8100 StarLink PMU perf: starfive: Add StarLink PMU support docs: perf: Update usage for target filter of hisi-pcie-pmu ...
2024-03-07Merge branches 'for-next/reorg-va-space', 'for-next/rust-for-arm64', ↵Catalin Marinas
'for-next/misc', 'for-next/daif-cleanup', 'for-next/kselftest', 'for-next/documentation', 'for-next/sysreg' and 'for-next/dpisa', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: (39 commits) docs: perf: Fix build warning of hisi-pcie-pmu.rst perf: starfive: Only allow COMPILE_TEST for 64-bit architectures MAINTAINERS: Add entry for StarFive StarLink PMU docs: perf: Add description for StarFive's StarLink PMU dt-bindings: perf: starfive: Add JH8100 StarLink PMU perf: starfive: Add StarLink PMU support docs: perf: Update usage for target filter of hisi-pcie-pmu drivers/perf: hisi_pcie: Merge find_related_event() and get_event_idx() drivers/perf: hisi_pcie: Relax the check on related events drivers/perf: hisi_pcie: Check the target filter properly drivers/perf: hisi_pcie: Add more events for counting TLP bandwidth drivers/perf: hisi_pcie: Fix incorrect counting under metric mode drivers/perf: hisi_pcie: Introduce hisi_pcie_pmu_get_event_ctrl_val() drivers/perf: hisi_pcie: Rename hisi_pcie_pmu_{config,clear}_filter() drivers/perf: hisi: Enable HiSilicon Erratum 162700402 quirk for HIP09 perf/arm_cspmu: Add devicetree support dt-bindings/perf: Add Arm CoreSight PMU perf/arm_cspmu: Simplify counter reset perf/arm_cspmu: Simplify attribute groups perf/arm_cspmu: Simplify initialisation ... * for-next/reorg-va-space: : Reorganise the arm64 kernel VA space in preparation for LPA2 support : (52-bit VA/PA). arm64: kaslr: Adjust randomization range dynamically arm64: mm: Reclaim unused vmemmap region for vmalloc use arm64: vmemmap: Avoid base2 order of struct page size to dimension region arm64: ptdump: Discover start of vmemmap region at runtime arm64: ptdump: Allow all region boundaries to be defined at boot time arm64: mm: Move fixmap region above vmemmap region arm64: mm: Move PCI I/O emulation region above the vmemmap region * for-next/rust-for-arm64: : Enable Rust support for arm64 arm64: rust: Enable Rust support for AArch64 rust: Refactor the build target to allow the use of builtin targets * for-next/misc: : Miscellaneous arm64 patches ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512 arm64: Remove enable_daif macro arm64/hw_breakpoint: Directly use ESR_ELx_WNR for an watchpoint exception arm64: cpufeatures: Clean up temporary variable to simplify code arm64: Update setup_arch() comment on interrupt masking arm64: remove unnecessary ifdefs around is_compat_task() arm64: ftrace: Don't forbid CALL_OPS+CC_OPTIMIZE_FOR_SIZE with Clang arm64/sme: Ensure that all fields in SMCR_EL1 are set to known values arm64/sve: Ensure that all fields in ZCR_EL1 are set to known values arm64/sve: Document that __SVE_VQ_MAX is much larger than needed arm64: make member of struct pt_regs and it's offset macro in the same order arm64: remove unneeded BUILD_BUG_ON assertion arm64: kretprobes: acquire the regs via a BRK exception arm64: io: permit offset addressing arm64: errata: Don't enable workarounds for "rare" errata by default * for-next/daif-cleanup: : Clean up DAIF handling for EL0 returns arm64: Unmask Debug + SError in do_notify_resume() arm64: Move do_notify_resume() to entry-common.c arm64: Simplify do_notify_resume() DAIF masking * for-next/kselftest: : Miscellaneous arm64 kselftest patches kselftest/arm64: Test that ptrace takes effect in the target process * for-next/documentation: : arm64 documentation patches arm64/sme: Remove spurious 'is' in SME documentation arm64/fp: Clarify effect of setting an unsupported system VL arm64/sme: Fix cut'n'paste in ABI document arm64/sve: Remove bitrotted comment about syscall behaviour * for-next/sysreg: : sysreg updates arm64/sysreg: Update ID_AA64DFR0_EL1 register arm64/sysreg: Update ID_DFR0_EL1 register fields arm64/sysreg: Add register fields for ID_AA64DFR1_EL1 * for-next/dpisa: : Support for 2023 dpISA extensions kselftest/arm64: Add 2023 DPISA hwcap test coverage kselftest/arm64: Add basic FPMR test kselftest/arm64: Handle FPMR context in generic signal frame parser arm64/hwcap: Define hwcaps for 2023 DPISA features arm64/ptrace: Expose FPMR via ptrace arm64/signal: Add FPMR signal handling arm64/fpsimd: Support FEAT_FPMR arm64/fpsimd: Enable host kernel access to FPMR arm64/cpufeature: Hook new identification registers up to cpufeature
2024-03-07arm64/fpsimd: Support FEAT_FPMRMark Brown
FEAT_FPMR defines a new EL0 accessible register FPMR use to configure the FP8 related features added to the architecture at the same time. Detect support for this register and context switch it for EL0 when present. Due to the sharing of responsibility for saving floating point state between the host kernel and KVM FP8 support is not yet implemented in KVM and a stub similar to that used for SVCR is provided for FPMR in order to avoid bisection issues. To make it easier to share host state with the hypervisor we store FPMR as a hardened usercopy field in uw (along with some padding). Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240306-arm64-2023-dpisa-v5-3-c568edc8ed7f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-02-22arm64/sme: Ensure that all fields in SMCR_EL1 are set to known valuesMark Brown
At present nothing in our CPU initialisation code ever sets unknown fields in SMCR_EL1 to known values, all updates to SMCR_EL1 are read/modify/write sequences. All the unknown fields are RES0, explicitly initialise them as such to avoid future surprises. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240213-arm64-fp-init-vec-cr-v1-2-7e7c2d584f26@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-02-22arm64/sve: Ensure that all fields in ZCR_EL1 are set to known valuesMark Brown
At present nothing in our CPU initialisation code ever sets unknown fields in ZCR_EL1 to known values, all updates to ZCR_EL1 are read/modify/write sequences for LEN. All the unknown fields are RES0, explicitly initialise them as such to avoid future surprises. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240213-arm64-fp-init-vec-cr-v1-1-7e7c2d584f26@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-02-20arm64/sme: Restore SMCR_EL1.EZT0 on exit from suspendMark Brown
The fields in SMCR_EL1 reset to an architecturally UNKNOWN value. Since we do not otherwise manage the traps configured in this register at runtime we need to reconfigure them after a suspend in case nothing else was kind enough to preserve them for us. Do so for SMCR_EL1.EZT0. Fixes: d4913eee152d ("arm64/sme: Add basic enumeration for SME2") Reported-by: Jackson Cooper-Driver <Jackson.Cooper-Driver@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240213-arm64-sme-resume-v3-2-17e05e493471@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-02-20arm64/sme: Restore SME registers on exit from suspendMark Brown
The fields in SMCR_EL1 and SMPRI_EL1 reset to an architecturally UNKNOWN value. Since we do not otherwise manage the traps configured in this register at runtime we need to reconfigure them after a suspend in case nothing else was kind enough to preserve them for us. The vector length will be restored as part of restoring the SME state for the next SME using task. Fixes: a1f4ccd25cc2 ("arm64/sme: Provide Kconfig for SME") Reported-by: Jackson Cooper-Driver <Jackson.Cooper-Driver@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240213-arm64-sme-resume-v3-1-17e05e493471@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-02-09arm64/signal: Don't assume that TIF_SVE means we saved SVE stateMark Brown
When we are in a syscall we will only save the FPSIMD subset even though the task still has access to the full register set, and on context switch we will only remove TIF_SVE when loading the register state. This means that the signal handling code should not assume that TIF_SVE means that the register state is stored in SVE format, it should instead check the format that was recorded during save. Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch") Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240130-arm64-sve-signal-regs-v2-1-9fc6f9502782@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-01-18arm64/sme: Always exit sme_alloc() early with existing storageMark Brown
When sme_alloc() is called with existing storage and we are not flushing we will always allocate new storage, both leaking the existing storage and corrupting the state. Fix this by separating the checks for flushing and for existing storage as we do for SVE. Callers that reallocate (eg, due to changing the vector length) should call sme_free() themselves. Fixes: 5d0a8d2fba50 ("arm64/ptrace: Ensure that SME is set up for target when writing SSVE state") Signed-off-by: Mark Brown <broonie@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20240115-arm64-sme-flush-v1-1-7472bd3459b7@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-01-18arm64/fpsimd: Remove spurious check for SVE supportMark Brown
There is no need to check for SVE support when changing vector lengths, even if the system is SME only we still need SVE storage for the streaming SVE state. Fixes: d4d5be94a878 ("arm64/fpsimd: Ensure SME storage is allocated after SVE VL changes") Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240115-arm64-sve-enabled-check-v1-1-a26360b00f6d@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-01-04Merge branch 'for-next/fpsimd' into for-next/coreWill Deacon
* for-next/fpsimd: arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD arm64: fpsimd: Preserve/restore kernel mode NEON at context switch arm64: fpsimd: Drop unneeded 'busy' flag
2023-12-13arm64: Cleanup system cpucap handlingMark Rutland
Recent changes to remove cpus_have_const_cap() introduced new users of cpus_have_cap() in the period between detecting system cpucaps and patching alternatives. It would be preferable to defer these until after the relevant cpucaps have been patched so that these can use the usual feature check helper functions, which is clearer and has less risk of accidental usage of code relying upon an alternative which has not yet been patched. This patch reworks the system-wide cpucap detection and patching to minimize this transient period: * The detection, enablement, and patching of system cpucaps is moved into a new setup_system_capabilities() function so that these can be grouped together more clearly, with no other functions called in the period between detection and patching. This is called from setup_system_features() before the subsequent checks that depend on the cpucaps. The logging of TTBR0 PAN and cpucaps with a mask is also moved here to keep these as close as possible to update_cpu_capabilities(). At the same time, comments are corrected and improved to make the intent clearer. * As hyp_mode_check() only tests system register values (not hwcaps) and must be called prior to patching, the call to hyp_mode_check() is moved before the call to setup_system_features(). * In setup_system_features(), the use of system_uses_ttbr0_pan() is restored, now that this occurs after alternatives are patched. This is a partial revert of commit: 53d62e995d9eaed1 ("arm64: Avoid cpus_have_const_cap() for ARM64_HAS_PAN") * In sve_setup() and sme_setup(), the use of system_supports_sve() and system_supports_sme() respectively are restored, now that these occur after alternatives are patched. This is a partial revert of commit: a76521d160284a1e ("arm64: Avoid cpus_have_const_cap() for ARM64_{SVE,SME,SME2,FA64}") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20231212170910.3745497-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12arm64: fpsimd: Implement lazy restore for kernel mode FPSIMDArd Biesheuvel
Now that kernel mode FPSIMD state is context switched along with other task state, we can enable the existing logic that keeps track of which task's FPSIMD state the CPU is holding in its registers. If it is the context of the task that we are switching to, we can elide the reload of the FPSIMD state from memory. Note that we also need to check whether the FPSIMD state on this CPU is the most recent: if a task gets migrated away and back again, the state in memory may be more recent than the state in the CPU. So add another CPU id field to task_struct to keep track of this. (We could reuse the existing CPU id field used for user mode context, but that might result in user state to be discarded unnecessarily, given that two distinct CPUs could be holding the most recent user mode state and the most recent kernel mode state) Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20231208113218.3001940-9-ardb@google.com Signed-off-by: Will Deacon <will@kernel.org>