summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2026-02-12Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves disk space by teaching ocfs2 to reclaim suballocator block group space (Heming Zhao) - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the ARRAY_END() macro and uses it in various places (Alejandro Colomar) - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size (Pnina Feder) - "kallsyms: Prevent invalid access when showing module buildid" cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces (Petr Mladek) - "Address page fault in ima_restore_measurement_list()" fixes a kexec-related crash that can occur when booting the second-stage kernel on x86 (Harshit Mogalapalli) - "kho: ABI headers and Documentation updates" updates the kexec handover ABI documentation (Mike Rapoport) - "Align atomic storage" adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain) - "kho: clean up page initialization logic" simplifies the page initialization logic in kho_restore_page() (Pratyush Yadav) - "Unload linux/kernel.h" moves several things out of kernel.h and into more appropriate places (Yury Norov) - "don't abuse task_struct.group_leader" removes the usage of ->group_leader when it is "obviously unnecessary" (Oleg Nesterov) - "list private v2 & luo flb" adds some infrastructure improvements to the live update orchestrator (Pasha Tatashin) * tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits) watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency procfs: fix missing RCU protection when reading real_parent in do_task_stat() watchdog/softlockup: fix sample ring index wrap in need_counting_irqs() kcsan, compiler_types: avoid duplicate type issues in BPF Type Format kho: fix doc for kho_restore_pages() tests/liveupdate: add in-kernel liveupdate test liveupdate: luo_flb: introduce File-Lifecycle-Bound global state liveupdate: luo_file: Use private list list: add kunit test for private list primitives list: add primitives for private list manipulations delayacct: fix uapi timespec64 definition panic: add panic_force_cpu= parameter to redirect panic to a specific CPU netclassid: use thread_group_leader(p) in update_classid_task() RDMA/umem: don't abuse current->group_leader drm/pan*: don't abuse current->group_leader drm/amd: kill the outdated "Only the pthreads threading model is supported" checks drm/amdgpu: don't abuse current->group_leader android/binder: use same_thread_group(proc->tsk, current) in binder_mmap() android/binder: don't abuse current->group_leader kho: skip memoryless NUMA nodes when reserving scratch areas ...
2026-02-12Merge tag 'mm-stable-2026-02-11-19-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "powerpc/64s: do not re-activate batched TLB flush" makes arch_{enter|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev) It adds a generic enter/leave layer and switches architectures to use it. Various hacks were removed in the process. - "zram: introduce compressed data writeback" implements data compression for zram writeback (Richard Chang and Sergey Senozhatsky) - "mm: folio_zero_user: clear page ranges" adds clearing of contiguous page ranges for hugepages. Large improvements during demand faulting are demonstrated (David Hildenbrand) - "memcg cleanups" tidies up some memcg code (Chen Ridong) - "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos stats" improves DAMOS stat's provided information, deterministic control, and readability (SeongJae Park) - "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few issues in the hugetlb cgroup charging selftests (Li Wang) - "Fix va_high_addr_switch.sh test failure - again" addresses several issues in the va_high_addr_switch test (Chunyu Hu) - "mm/damon/tests/core-kunit: extend existing test scenarios" improves the KUnit test coverage for DAMON (Shu Anzai) - "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to transiently return -EAGAIN (Shivank Garg) - "arch, mm: consolidate hugetlb early reservation" reworks and consolidates a pile of straggly code related to reservation of hugetlb memory from bootmem and creation of CMA areas for hugetlb (Mike Rapoport) - "mm: clean up anon_vma implementation" cleans up the anon_vma implementation in various ways (Lorenzo Stoakes) - "tweaks for __alloc_pages_slowpath()" does a little streamlining of the page allocator's slowpath code (Vlastimil Babka) - "memcg: separate private and public ID namespaces" cleans up the memcg ID code and prevents the internal-only private IDs from being exposed to userspace (Shakeel Butt) - "mm: hugetlb: allocate frozen gigantic folio" cleans up the allocation of frozen folios and avoids some atomic refcount operations (Kefeng Wang) - "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement of memory betewwn the active and inactive LRUs and adds auto-tuning of the ratio-based quotas and of monitoring intervals (SeongJae Park) - "Support page table check on PowerPC" makes CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan) - "nodemask: align nodes_and{,not} with underlying bitmap ops" makes nodes_and() and nodes_andnot() propagate the return values from the underlying bit operations, enabling some cleanup in calling code (Yury Norov) - "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up some DAMON internal interfaces (SeongJae Park) - "mm/khugepaged: cleanups and scan limit fix" does some cleanup work in khupaged and fixes a scan limit accounting issue (Shivank Garg) - "mm: balloon infrastructure cleanups" goes to town on the balloon infrastructure and its page migration function. Mainly cleanups, also some locking simplification (David Hildenbrand) - "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds additional tracepoints to the page reclaim code (Jiayuan Chen) - "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is part of Marco's kernel-wide migration from the legacy workqueue APIs over to the preferred unbound workqueues (Marco Crivellari) - "Various mm kselftests improvements/fixes" provides various unrelated improvements/fixes for the mm kselftests (Kevin Brodsky) - "mm: accelerate gigantic folio allocation" greatly speeds up gigantic folio allocation, mainly by avoiding unnecessary work in pfn_range_valid_contig() (Kefeng Wang) - "selftests/damon: improve leak detection and wss estimation reliability" improves the reliability of two of the DAMON selftests (SeongJae Park) - "mm/damon: cleanup kdamond, damon_call(), damos filter and DAMON_MIN_REGION" does some cleanup work in the core DAMON code (SeongJae Park) - "Docs/mm/damon: update intro, modules, maintainer profile, and misc" performs maintenance work on the DAMON documentation (SeongJae Park) - "mm: add and use vma_assert_stabilised() helper" refactors and cleans up the core VMA code. The main aim here is to be able to use the mmap write lock's lockdep state to perform various assertions regarding the locking which the VMA code requires (Lorenzo Stoakes) - "mm, swap: swap table phase II: unify swapin use" removes some old swap code (swap cache bypassing and swap synchronization) which wasn't working very well. Various other cleanups and simplifications were made. The end result is a 20% speedup in one benchmark (Kairui Song) - "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM available on 64-bit alpha, loongarch, mips, parisc, and um. Various cleanups were performed along the way (Qi Zheng) * tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits) mm/memory: handle non-split locks correctly in zap_empty_pte_table() mm: move pte table reclaim code to memory.c mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config um: mm: enable MMU_GATHER_RCU_TABLE_FREE parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE mips: mm: enable MMU_GATHER_RCU_TABLE_FREE LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles zsmalloc: make common caches global mm: add SPDX id lines to some mm source files mm/zswap: use %pe to print error pointers mm/vmscan: use %pe to print error pointers mm/readahead: fix typo in comment mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file() mm: refactor vma_map_pages to use vm_insert_pages mm/damon: unify address range representation with damon_addr_range mm/cma: replace snprintf with strscpy in cma_new_area ...
2026-02-12cgroup: fix race between task migration and iterationQingye Zhao
When a task is migrated out of a css_set, cgroup_migrate_add_task() first moves it from cset->tasks to cset->mg_tasks via: list_move_tail(&task->cg_list, &cset->mg_tasks); If a css_task_iter currently has it->task_pos pointing to this task, css_set_move_task() calls css_task_iter_skip() to keep the iterator valid. However, since the task has already been moved to ->mg_tasks, the iterator is advanced relative to the mg_tasks list instead of the original tasks list. As a result, remaining tasks on cset->tasks, as well as tasks queued on cset->mg_tasks, can be skipped by iteration. Fix this by calling css_set_skip_task_iters() before unlinking task->cg_list from cset->tasks. This advances all active iterators to the next task on cset->tasks, so iteration continues correctly even when a task is concurrently being migrated. This race is hard to hit in practice without instrumentation, but it can be reproduced by artificially slowing down cgroup_procs_show(). For example, on an Android device a temporary /sys/kernel/cgroup/cgroup_test knob can be added to inject a delay into cgroup_procs_show(), and then: 1) Spawn three long-running tasks (PIDs 101, 102, 103). 2) Create a test cgroup and move the tasks into it. 3) Enable a large delay via /sys/kernel/cgroup/cgroup_test. 4) In one shell, read cgroup.procs from the test cgroup. 5) Within the delay window, in another shell migrate PID 102 by writing it to a different cgroup.procs file. Under this setup, cgroup.procs can intermittently show only PID 101 while skipping PID 103. Once the migration completes, reading the file again shows all tasks as expected. Note that this change does not allow removing the existing css_set_skip_task_iters() call in css_set_move_task(). The new call in cgroup_migrate_add_task() only handles iterators that are racing with migration while the task is still on cset->tasks. Iterators may also start after the task has been moved to cset->mg_tasks. If we dropped css_set_skip_task_iters() from css_set_move_task(), such iterators could keep task_pos pointing to a migrating task, causing css_task_iter_advance() to malfunction on the destination css_set, up to and including crashes or infinite loops. The race window between migration and iteration is very small, and css_task_iter is not on a hot path. In the worst case, when an iterator is positioned on the first thread of the migrating process, cgroup_migrate_add_task() may have to skip multiple tasks via css_set_skip_task_iters(). However, this only happens when migration and iteration actually race, so the performance impact is negligible compared to the correctness fix provided here. Fixes: b636fd38dc40 ("cgroup: Implement css_task_iter_skip()") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Qingye Zhao <zhaoqingye@honor.com> Reviewed-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-02-11Merge tag 'net-next-7.0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core & protocols: - A significant effort all around the stack to guide the compiler to make the right choice when inlining code, to avoid unneeded calls for small helper and stack canary overhead in the fast-path. This generates better and faster code with very small or no text size increases, as in many cases the call generated more code than the actual inlined helper. - Extend AccECN implementation so that is now functionally complete, also allow the user-space enabling it on a per network namespace basis. - Add support for memory providers with large (above 4K) rx buffer. Paired with hw-gro, larger rx buffer sizes reduce the number of buffers traversing the stack, dincreasing single stream CPU usage by up to ~30%. - Do not add HBH header to Big TCP GSO packets. This simplifies the RX path, the TX path and the NIC drivers, and is possible because user-space taps can now interpret correctly such packets without the HBH hint. - Allow IPv6 routes to be configured with a gateway address that is resolved out of a different interface than the one specified, aligning IPv6 to IPv4 behavior. - Multi-queue aware sch_cake. This makes it possible to scale the rate shaper of sch_cake across multiple CPUs, while still enforcing a single global rate on the interface. - Add support for the nbcon (new buffer console) infrastructure to netconsole, enabling lock-free, priority-based console operations that are safer in crash scenarios. - Improve the TCP ipv6 output path to cache the flow information, saving cpu cycles, reducing cache line misses and stack use. - Improve netfilter packet tracker to resolve clashes for most protocols, avoiding unneeded drops on rare occasions. - Add IP6IP6 tunneling acceleration to the flowtable infrastructure. - Reduce tcp socket size by one cache line. - Notify neighbour changes atomically, avoiding inconsistencies between the notification sequence and the actual states sequence. - Add vsock namespace support, allowing complete isolation of vsocks across different network namespaces. - Improve xsk generic performances with cache-alignment-oriented optimizations. - Support netconsole automatic target recovery, allowing netconsole to reestablish targets when underlying low-level interface comes back online. Driver API: - Support for switching the working mode (automatic vs manual) of a DPLL device via netlink. - Introduce PHY ports representation to expose multiple front-facing media ports over a single MAC. - Introduce "rx-polarity" and "tx-polarity" device tree properties, to generalize polarity inversion requirements for differential signaling. - Add helper to create, prepare and enable managed clocks. Device drivers: - Add Huawei hinic3 PF etherner driver. - Add DWMAC glue driver for Motorcomm YT6801 PCIe ethernet controller. - Add ethernet driver for MaxLinear MxL862xx switches - Remove parallel-port Ethernet driver. - Convert existing driver timestamp configuration reporting to hwtstamp_get and remove legacy ioctl(). - Convert existing drivers to .get_rx_ring_count(), simplifing the RX ring count retrieval. Also remove the legacy fallback path. - Ethernet high-speed NICs: - Broadcom (bnxt, bng): - bnxt: add FW interface update to support FEC stats histogram and NVRAM defragmentation - bng: add TSO and H/W GRO support - nVidia/Mellanox (mlx5): - improve latency of channel restart operations, reducing the used H/W resources - add TSO support for UDP over GRE over VLAN - add flow counters support for hardware steering (HWS) rules - use a static memory area to store headers for H/W GRO, leading to 12% RX tput improvement - Intel (100G, ice, idpf): - ice: reorganizes layout of Tx and Rx rings for cacheline locality and utilizes __cacheline_group* macros on the new layouts - ice: introduces Synchronous Ethernet (SyncE) support - Meta (fbnic): - adds debugfs for firmware mailbox and tx/rx rings vectors - Ethernet virtual: - geneve: introduce GRO/GSO support for double UDP encapsulation - Ethernet NICs consumer, and embedded: - Synopsys (stmmac): - some code refactoring and cleanups - RealTek (r8169): - add support for RTL8127ATF (10G Fiber SFP) - add dash and LTR support - Airoha: - AN8811HB 2.5 Gbps phy support - Freescale (fec): - add XDP zero-copy support - Thunderbolt: - add get link setting support to allow bonding - Renesas: - add support for RZ/G3L GBETH SoC - Ethernet switches: - Maxlinear: - support R(G)MII slow rate configuration - add support for Intel GSW150 - Motorcomm (yt921x): - add DCB/QoS support - TI: - icssm-prueth: support bridging (STP/RSTP) via the switchdev framework - Ethernet PHYs: - Realtek: - enable SGMII and 2500Base-X in-band auto-negotiation - simplify and reunify C22/C45 drivers - Micrel: convert bindings to DT schema - CAN: - move skb headroom content into skb extensions, making CAN metadata access more robust - CAN drivers: - rcar_canfd: - add support for FD-only mode - add support for the RZ/T2H SoC - sja1000: cleanup the CAN state handling - WiFi: - implement EPPKE/802.1X over auth frames support - split up drop reasons better, removing generic RX_DROP - additional FTM capabilities: 6 GHz support, supported number of spatial streams and supported number of LTF repetitions - better mac80211 iterators to enumerate resources - initial UHR (Wi-Fi 8) support for cfg80211/mac80211 - WiFi drivers: - Qualcomm/Atheros: - ath11k: support for Channel Frequency Response measurement - ath12k: a significant driver refactor to support multi-wiphy devices and and pave the way for future device support in the same driver (rather than splitting to ath13k) - ath12k: support for the QCC2072 chipset - Intel: - iwlwifi: partial Neighbor Awareness Networking (NAN) support - iwlwifi: initial support for U-NII-9 and IEEE 802.11bn - RealTek (rtw89): - preparations for RTL8922DE support - Bluetooth: - implement setsockopt(BT_PHY) to set the connection packet type/PHY - set link_policy on incoming ACL connections - Bluetooth drivers: - btusb: add support for MediaTek7920, Realtek RTL8761BU and 8851BE - btqca: add WCN6855 firmware priority selection feature" * tag 'net-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1254 commits) bnge/bng_re: Add a new HSI net: macb: Fix tx/rx malfunction after phy link down and up af_unix: Fix memleak of newsk in unix_stream_connect(). net: ti: icssg-prueth: Add optional dependency on HSR net: dsa: add basic initial driver for MxL862xx switches net: mdio: add unlocked mdiodev C45 bus accessors net: dsa: add tag format for MxL862xx switches dt-bindings: net: dsa: add MaxLinear MxL862xx selftests: drivers: net: hw: Modify toeplitz.c to poll for packets octeontx2-pf: Unregister devlink on probe failure net: renesas: rswitch: fix forwarding offload statemachine ionic: Rate limit unknown xcvr type messages tcp: inet6_csk_xmit() optimization tcp: populate inet->cork.fl.u.ip6 in tcp_v6_syn_recv_sock() tcp: populate inet->cork.fl.u.ip6 in tcp_v6_connect() ipv6: inet6_csk_xmit() and inet6_csk_update_pmtu() use inet->cork.fl.u.ip6 ipv6: use inet->cork.fl.u.ip6 and np->final in ip6_datagram_dst_update() ipv6: use np->final in inet6_sk_rebuild_header() ipv6: add daddr/final storage in struct ipv6_pinfo net: stmmac: qcom-ethqos: fix qcom_ethqos_serdes_powerup() ...
2026-02-11tracing: Fix indentation of return statement in print_trace_fmt()Haoyang LIU
The return statement inside the nested if block in print_trace_fmt() is not properly indented, making the code structure unclear. This was flagged by smatch as a warning. Add proper indentation to the return statement to match the kernel coding style and improve readability. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260210153903.8041-1-tttturtleruss@gmail.com Signed-off-by: Haoyang LIU <tttturtleruss@gmail.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-02-11Merge tag 'pci-v7.0-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Don't try to enable Extended Tags on VFs since that bit is Reserved and causes misleading log messages (Håkon Bugge) - Initialize Endpoint Read Completion Boundary to match Root Port, regardless of ACPI _HPX (Håkon Bugge) - Apply _HPX PCIe Setting Record only to AER configuration, and only when OS owns PCIe hotplug but not AER, to avoid clobbering Extended Tag and Relaxed Ordering settings (Håkon Bugge) Resource management: - Move CardBus code to setup-cardbus.c and only build it when CONFIG_CARDBUS is set (Ilpo Järvinen) - Fix bridge window alignment with optional resources, where additional alignment requirement was previously lost (Ilpo Järvinen) - Stop over-estimating bridge window size since they are now assigned without any gaps between them (Ilpo Järvinen) - Increase resource MAX_IORES_LEVEL to avoid /proc/iomem flattening for nested bridges and endpoints (Ilpo Järvinen) - Add pbus_mem_size_optional() to handle sizes of optional resources (SR-IOV VF BARs, expansion ROMs, bridge windows) (Ilpo Järvinen) - Don't claim disabled bridge windows to avoid spurious claim failures (Ilpo Järvinen) Driver binding: - Fix device reference leak in pcie_port_remove_service() (Uwe Kleine-König) - Move pcie_port_bus_match() and pcie_port_bus_type to PCIe-specific portdrv.c (Uwe Kleine-König) - Convert portdrv to use pcie_port_bus_type.probe() and .remove() callbacks so .probe() and .remove() can eventually be removed from struct device_driver (Uwe Kleine-König) Error handling: - Clear stale errors on reporting agents upon probe so they don't look like recent errors (Lukas Wunner) - Add generic RAS tracepoint for hotplug events (Shuai Xue) - Add RAS tracepoint for link speed changes (Shuai Xue) Power management: - Avoid redundant delay on transition from D3hot to D3cold if the device was already in D3hot (Brian Norris) - Prevent runtime suspend until devices are fully initialized to avoid saving incompletely configured device state (Brian Norris) Power control: - Add power_on/off callbacks with generic signature to pwrseq, tc9563, and slot drivers so they can be used by pwrctrl core (Manivannan Sadhasivam) - Add PCIe M.2 connector support to the slot pwrctrl driver (Manivannan Sadhasivam) - Switch to pwrctrl interfaces to create, destroy, and power on/off devices, calling them from host controller drivers instead of the PCI core (Manivannan Sadhasivam) - Drop qcom .assert_perst() callbacks since this is now done by the controller driver instead of the pwrctrl driver (Manivannan Sadhasivam) Virtualization: - Remove an incorrect unlock in pci_slot_trylock() error handling (Jinhui Guo) - Lock the bridge device for slot reset (Keith Busch) - Enable ACS after IOMMU configuration on OF platforms so ACS is enabled an all devices; previously the first device enumerated (typically a Root Port) didn't have ACS enabled (Manivannan Sadhasivam) - Disable ACS Source Validation for IDT 0x80b5 and 0x8090 switches to work around hardware erratum; previously ACS SV was only temporarily disabled, which worked for enumeration but not after reset (Manivannan Sadhasivam) Peer-to-peer DMA: - Release per-CPU pgmap ref when vm_insert_page() fails to avoid hang when removing the PCI device (Hou Tao) - Remove incorrect p2pmem_alloc_mmap() warning about page refcount (Hou Tao) Endpoint framework: - Add configfs sub-groups synchronously to avoid NULL pointer dereference when racing with removal (Liu Song) - Fix swapped parameters in pci_{primary/secondary}_epc_epf_unlink() functions (Manikanta Maddireddy) ASPEED PCIe controller driver: - Add ASPEED Root Complex DT binding and driver (Jacky Chou) Freescale i.MX6 PCIe controller driver: - Add DT binding and driver support for an optional external refclock in addition to the refclock from the internal PLL (Richard Zhu) - Fix CLKREQ# control so host asserts it during enumeration and Endpoints can use it afterwards to exit the L1.2 link state (Richard Zhu) NVIDIA Tegra PCIe controller driver: - Export irq_domain_free_irqs() to allow PCI/MSI drivers that tear down MSI domains to be built as modules (Aaron Kling) - Allow pci-tegra to be built as a module (Aaron Kling) NVIDIA Tegra194 PCIe controller driver: - Relax Kconfig so tegra194 can be built for platforms beyond Tegra194 (Vidya Sagar) Qualcomm PCIe controller driver: - Merge SC8180x DT binding into SM8150 (Krzysztof Kozlowski) - Move SDX55, SDM845, QCS404, IPQ5018, IPQ6018, IPQ8074 Gen3, IPQ8074, IPQ4019, IPQ9574, APQ8064, MSM8996, APQ8084 to dedicated schema (Krzysztof Kozlowski) - Add DT binding and driver support for SA8255p Endpoint being configured by firmware (Mrinmay Sarkar) - Parse PERST# from all PCIe bridge nodes for future platforms that will have PERST# in Switch Downstream Ports as well as in Root Ports (Manivannan Sadhasivam) Renesas RZ/G3S PCIe controller driver: - Use pci_generic_config_write() since the writability provided by the custom wrapper is unnecessary (Claudiu Beznea) SOPHGO PCIe controller driver: - Disable ASPM L0s and L1 on Sophgo 2044 PCIe Root Ports (Inochi Amaoto) Synopsys DesignWare PCIe controller driver: - Extend PCI_FIND_NEXT_CAP() and PCI_FIND_NEXT_EXT_CAP() to return a pointer to the preceding Capability, to allow removal of Capabilities that are advertised but not fully implemented (Qiang Yu) - Remove MSI and MSI-X Capabilities in platforms that can't support them, so the PCI core automatically falls back to INTx (Qiang Yu) - Add ASPM L1.1 and L1.2 Substates context to debugfs ltssm_status for drivers that support this (Shawn Lin) - Skip PME_Turn_Off broadcast and L2/L3 transition during suspend if link is not up to avoid an unnecessary timeout (Manivannan Sadhasivam) - Revert dw-rockchip, qcom, and DWC core changes that used link-up IRQs to trigger enumeration instead of waiting for link to be up because the PCI core doesn't allocate bus number space for hierarchies that might be attached (Niklas Cassel) - Make endpoint iATU entry for MSI permanent instead of programming it dynamically, which is slow and racy with respect to other concurrent traffic, e.g., eDMA (Koichiro Den) - Use iMSI-RX MSI target address when possible to fix endpoints using 32-bit MSI (Shawn Lin) - Allow DWC host controller driver probe to continue if device is not found or found but inactive; only fail when there's an error with the link (Manivannan Sadhasivam) - For controllers like NXP i.MX6QP and i.MX7D, where LTSSM registers are not accessible after PME_Turn_Off, simply wait 10ms instead of polling for L2/L3 Ready (Richard Zhu) - Use multiple iATU entries to map large bridge windows and DMA ranges when necessary instead of failing (Samuel Holland) - Add EPC dynamic_inbound_mapping feature bit for Endpoint Controllers that can update BAR inbound address translation without requiring EPF driver to clear/reset the BAR first, and advertise it for DWC-based Endpoints (Koichiro Den) - Add EPC subrange_mapping feature bit for Endpoint Controllers that can map multiple independent inbound regions in a single BAR, implement subrange mapping, advertise it for DWC-based Endpoints, and add Endpoint selftests for it (Koichiro Den) - Make resizable BARs work for Endpoint multi-PF configurations; previously it only worked for PF 0 (Aksh Garg) - Fix Endpoint non-PF 0 support for BAR configuration, ATU mappings, and Address Match Mode (Aksh Garg) - Set up iATU when ECAM is enabled; previously IO and MEM outbound windows weren't programmed, and ECAM-related iATU entries weren't restored after suspend/resume, so config accesses failed (Krishna Chaitanya Chundru) Miscellaneous: - Use system_percpu_wq and WQ_PERCPU to explicitly request per-CPU work so WQ_UNBOUND can eventually be removed (Marco Crivellari)" * tag 'pci-v7.0-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (176 commits) PCI/bwctrl: Disable BW controller on Intel P45 using a quirk PCI: Disable ACS SV for IDT 0x8090 switch PCI: Disable ACS SV for IDT 0x80b5 switch PCI: Cache ACS Capabilities register PCI: Enable ACS after configuring IOMMU for OF platforms PCI: Add ACS quirk for Pericom PI7C9X2G404 switches [12d8:b404] PCI: Add ACS quirk for Qualcomm Hamoa & Glymur PCI: Use device_lock_assert() to verify device lock is held PCI: Use lockdep_assert_held(pci_bus_sem) to verify lock is held PCI: Fix pci_slot_lock () device locking PCI: Fix pci_slot_trylock() error handling PCI: Mark Nvidia GB10 to avoid bus reset PCI: Mark ASM1164 SATA controller to avoid bus reset PCI: host-generic: Avoid reporting incorrect 'missing reg property' error PCI/PME: Replace RMW of Root Status register with direct write PCI/AER: Clear stale errors on reporting agents upon probe PCI: Don't claim disabled bridge windows PCI: rzg3s-host: Fix device node reference leak in rzg3s_pcie_host_parse_port() PCI: dwc: Fix missing iATU setup when ECAM is enabled PCI: dwc: Clean up iATU index usage in dw_pcie_iatu_setup() ...
2026-02-11Merge tag 'printk-for-7.0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Check all mandatory callbacks when registering nbcon consoles - Fix some compiler warnings * tag 'printk-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: vsnprintf: drop __printf() attributes on binary printing functions printf: convert test_hashed into macro printk: nbcon: Check for device_{lock,unlock} callbacks
2026-02-11Merge tag 'kbuild-7.0-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux Pull Kbuild/Kconfig updates from Nathan Chancellor: "Kbuild: - Drop '*_probe' pattern from modpost section check allowlist, which hid legitimate warnings (Johan Hovold) - Disable -Wtype-limits altogether, instead of enabling at W=2 (Vincent Mailhol) - Improve UAPI testing to skip testing headers that require a libc when CONFIG_CC_CAN_LINK is not set, opening up testing of headers with no libc dependencies to more environments (Thomas Weißschuh) - Update gendwarfksyms documentation with required dependencies (Jihan LIN) - Reject invalid LLVM= values to avoid unintentionally falling back to system toolchain (Thomas Weißschuh) - Add a script to help run the kernel build process in a container for consistent environments and testing (Guillaume Tucker) - Simplify kallsyms by getting rid of the relative base (Ard Biesheuvel) - Performance and usability improvements to scripts/make_fit.py (Simon Glass) - Minor various clean ups and fixes Kconfig: - Move XPM icons to individual files, clearing up GTK deprecation warnings (Rostislav Krasny) - Support depends on FOO if BAR as syntactic sugar for depends on FOO || !BAR (Nicolas Pitre, Graham Roff) - Refactor merge_config.sh to use awk over shell/sed/grep, dramatically speeding up processing large number of config fragments (Anders Roxell, Mikko Rapeli)" * tag 'kbuild-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (39 commits) kbuild: remove dependency of run-command on config scripts/make_fit: Compress dtbs in parallel scripts/make_fit: Support a few more parallel compressors kbuild: Support a FIT_EXTRA_ARGS environment variable scripts/make_fit: Move dtb processing into a function scripts/make_fit: Support an initial ramdisk scripts/make_fit: Speed up operation rust: kconfig: Don't require RUST_IS_AVAILABLE for rustc-option MAINTAINERS: Add scripts/install.sh into Kbuild entry modpost: Amend ppc64 save/restfpr symnames for -Os build MIPS: tools: relocs: Ship a definition of R_MIPS_PC32 streamline_config.pl: remove superfluous exclamation mark kbuild: dummy-tools: Add python3 scripts: kconfig: merge_config.sh: warn on duplicate input files scripts: kconfig: merge_config.sh: use awk in checks too scripts: kconfig: merge_config.sh: refactor from shell/sed/grep to awk kallsyms: Get rid of kallsyms relative base mips: Add support for PC32 relocations in vmlinux Documentation: dev-tools: add container.rst page scripts: add tool to run containerized builds ...
2026-02-11Merge tag 'sched_ext-for-6.20' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - Move C example schedulers back from the external scx repo to tools/sched_ext as the authoritative source. scx_userland and scx_pair are returning while scx_sdt (BPF arena-based task data management) is new. These schedulers will be dropped from the external repo. - Improve error reporting by adding scx_bpf_error() calls when DSQ creation fails across all in-tree schedulers - Avoid redundant irq_work_queue() calls in destroy_dsq() by only queueing when llist_add() indicates an empty list - Fix flaky init_enable_count selftest by properly synchronizing pre-forked children using a pipe instead of sleep() * tag 'sched_ext-for-6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: selftests/sched_ext: Fix init_enable_count flakiness tools/sched_ext: Fix data header access during free in scx_sdt tools/sched_ext: Add error logging for dsq creation failures in remaining schedulers tools/sched_ext: add arena based scheduler tools/sched_ext: add scx_pair scheduler tools/sched_ext: add scx_userland scheduler sched_ext: Add error logging for dsq creation failures sched_ext: Avoid multiple irq_work_queue() calls in destroy_dsq()
2026-02-11Merge tag 'cgroup-for-6.20' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - cpuset changes: - Continue separating v1 and v2 implementations by moving more v1-specific logic into cpuset-v1.c - Improve partition handling. Sibling partitions are no longer invalidated on cpuset.cpus conflict, cpuset.cpus changes no longer fail in v2, and effective_xcpus computation is made consistent - Fix partition effective CPUs overlap that caused a warning on cpuset removal when sibling partitions shared CPUs - Increase the maximum cgroup subsystem count from 16 to 32 to accommodate future subsystem additions - Misc cleanups and selftest improvements including switching to css_is_online() helper, removing dead code and stale documentation references, using lockdep_assert_cpuset_lock_held() consistently, and adding polling helpers for asynchronously updated cgroup statistics * tag 'cgroup-for-6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits) cpuset: fix overlap of partition effective CPUs cgroup: increase maximum subsystem count from 16 to 32 cgroup: Remove stale cpu.rt.max reference from documentation cpuset: replace direct lockdep_assert_held() with lockdep_assert_cpuset_lock_held() cgroup/cpuset: Move the v1 empty cpus/mems check to cpuset1_validate_change() cgroup/cpuset: Don't invalidate sibling partitions on cpuset.cpus conflict cgroup/cpuset: Don't fail cpuset.cpus change in v2 cgroup/cpuset: Consistently compute effective_xcpus in update_cpumasks_hier() cgroup/cpuset: Streamline rm_siblings_excl_cpus() cpuset: remove dead code in cpuset-v1.c cpuset: remove v1-specific code from generate_sched_domains cpuset: separate generate_sched_domains for v1 and v2 cpuset: move update_domain_attr_tree to cpuset_v1.c cpuset: add cpuset1_init helper for v1 initialization cpuset: add cpuset1_online_css helper for v1-specific operations cpuset: add lockdep_assert_cpuset_lock_held helper cpuset: Remove unnecessary checks in rebuild_sched_domains_locked cgroup: switch to css_is_online() helper selftests: cgroup: Replace sleep with cg_read_key_long_poll() for waiting on nr_dying_descendants selftests: cgroup: make test_memcg_sock robust against delayed sock stats ...
2026-02-11Merge tag 'wq-for-6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
Pull workqueue updates from Tejun Heo: - Rework the rescuer to process work items one-by-one instead of slurping all pending work items in a single pass. As there is only one rescuer per workqueue, a single long-blocking work item could cause high latency for all tasks queued behind it, even after memory pressure is relieved and regular kworkers become available to service them. - Add CONFIG_BOOTPARAM_WQ_STALL_PANIC build-time option and workqueue.panic_on_stall_time parameter for time-based stall panic, giving systems more control over workqueue stall handling. - Replace BUG_ON() with panic() in the stall panic path for clearer intent and more informative output. * tag 'wq-for-6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: replace BUG_ON with panic in panic_on_wq_watchdog workqueue: add time-based panic for stalls workqueue: add CONFIG_BOOTPARAM_WQ_STALL_PANIC option workqueue: Process extra works in rescuer on memory pressure workqueue: Process rescuer work items one-by-one using a cursor workqueue: Make send_mayday() take a PWQ argument directly
2026-02-11sched/mmcid: Don't assume CID is CPU owned on mode switchThomas Gleixner
Shinichiro reported a KASAN UAF, which is actually an out of bounds access in the MMCID management code. CPU0 CPU1 T1 runs in userspace T0: fork(T4) -> Switch to per CPU CID mode fixup() set MM_CID_TRANSIT on T1/CPU1 T4 exit() T3 exit() T2 exit() T1 exit() switch to per task mode ---> Out of bounds access. As T1 has not scheduled after T0 set the TRANSIT bit, it exits with the TRANSIT bit set. sched_mm_cid_remove_user() clears the TRANSIT bit in the task and drops the CID, but it does not touch the per CPU storage. That's functionally correct because a CID is only owned by the CPU when the ONCPU bit is set, which is mutually exclusive with the TRANSIT flag. Now sched_mm_cid_exit() assumes that the CID is CPU owned because the prior mode was per CPU. It invokes mm_drop_cid_on_cpu() which clears the not set ONCPU bit and then invokes clear_bit() with an insanely large bit number because TRANSIT is set (bit 29). Prevent that by actually validating that the CID is CPU owned in mm_drop_cid_on_cpu(). Fixes: 007d84287c74 ("sched/mmcid: Drop per CPU CID immediately when switching to per task mode") Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Cc: stable@vger.kernel.org Closes: https://lore.kernel.org/aYsZrixn9b6s_2zL@shinmob Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-11tracing: Reset last_boot_info if ring buffer is resetMasami Hiramatsu (Google)
Commit 32dc0042528d ("tracing: Reset last-boot buffers when reading out all cpu buffers") resets the last_boot_info when user read out all data via trace_pipe* files. But it is not reset when user resets the buffer from other files. (e.g. write `trace` file) Reset it when the corresponding ring buffer is reset too. Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/177071302364.2293046.17895165659153977720.stgit@mhiramat.tok.corp.google.com Fixes: 32dc0042528d ("tracing: Reset last-boot buffers when reading out all cpu buffers") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-02-11tracing: Fix to set write permission to per-cpu buffer_size_kbMasami Hiramatsu (Google)
Since the per-cpu buffer_size_kb file is writable for changing per-cpu ring buffer size, the file should have the write access permission. Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/177071301597.2293046.11683339475076917920.stgit@mhiramat.tok.corp.google.com Fixes: 21ccc9cd7211 ("tracing: Disable "other" permission bits in the tracefs files") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-02-11Merge branch 'for-6.20' into for-linusPetr Mladek
2026-02-11time/jiffies: Inline jiffies_to_msecs() and jiffies_to_usecs()Eric Dumazet
For common cases (HZ=100, 250 or 1000), these helpers are at most one multiply, so there is no point calling a tiny function. Keep them out of line for HZ=300 and others. This saves cycles in TCP fast path, among other things. $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new add/remove: 0/8 grow/shrink: 25/89 up/down: 530/-3474 (-2944) ... nla_put_msecs 193 - -193 message_stats_print 2131 920 -1211 Total: Before=25365208, After=25362264, chg -0.01% Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260210170226.57209-1-edumazet@google.com
2026-02-10Merge tag 'powerpc-7.0-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates for 7.0 - Implement masked user access - Add bpf support for internal only per-CPU instructions and inline the bpf_get_smp_processor_id() and bpf_get_current_task() functions - Fix pSeries MSI-X allocation failure when quota is exceeded - Fix recursive pci_lock_rescan_remove locking in EEH event handling - Support tailcalls with subprogs & BPF exceptions on 64bit - Extend "trusted" keys to support the PowerVM Key Wrapping Module (PKWM) Thanks to Abhishek Dubey, Christophe Leroy, Gaurav Batra, Guangshuo Li, Jarkko Sakkinen, Mahesh Salgaonkar, Mimi Zohar, Miquel Sabaté Solà, Nam Cao, Narayana Murty N, Nayna Jain, Nilay Shroff, Puranjay Mohan, Saket Kumar Bhaskar, Sourabh Jain, Srish Srinivasan, and Venkat Rao Bagalkote. * tag 'powerpc-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (27 commits) powerpc/pseries: plpks: export plpks_wrapping_is_supported docs: trusted-encryped: add PKWM as a new trust source keys/trusted_keys: establish PKWM as a trusted source pseries/plpks: add HCALLs for PowerVM Key Wrapping Module pseries/plpks: expose PowerVM wrapping features via the sysfs powerpc/pseries: move the PLPKS config inside its own sysfs directory pseries/plpks: fix kernel-doc comment inconsistencies powerpc/smp: Add check for kcalloc() failure in parse_thread_groups() powerpc: kgdb: Remove OUTBUFMAX constant powerpc64/bpf: Additional NVR handling for bpf_throw powerpc64/bpf: Support exceptions powerpc64/bpf: Add arch_bpf_stack_walk() for BPF JIT powerpc64/bpf: Avoid tailcall restore from trampoline powerpc64/bpf: Support tailcalls with subprogs powerpc64/bpf: Moving tail_call_cnt to bottom of frame powerpc/eeh: fix recursive pci_lock_rescan_remove locking in EEH event handling powerpc/pseries: Fix MSI-X allocation failure when quota is exceeded powerpc/iommu: bypass DMA APIs for coherent allocations for pre-mapped memory powerpc64/bpf: Inline bpf_get_smp_processor_id() and bpf_get_current_task/_btf() powerpc64/bpf: Support internal-only MOV instruction to resolve per-CPU addrs ...
2026-02-10printk: Add execution context (task name/CPU) to printk_infoBreno Leitao
Extend struct printk_info to include the task name, pid, and CPU number where printk messages originate. This information is captured at vprintk_store() time and propagated through printk_message to nbcon_write_context, making it available to nbcon console drivers. This is useful for consoles like netconsole that want to include execution context in their output, allowing correlation of messages with specific tasks and CPUs regardless of where the console driver actually runs. The feature is controlled by CONFIG_PRINTK_EXECUTION_CTX, which is automatically selected by CONFIG_NETCONSOLE_DYNAMIC. When disabled, the helper functions compile to no-ops with no overhead. Suggested-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: John Ogness <john.ogness@linutronix.de> Link: https://patch.msgid.link/20260206-nbcon-v7-1-62bda69b1b41@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-10Merge tag 'x86_paravirt_for_v7.0_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt updates from Borislav Petkov: - A nice cleanup to the paravirt code containing a unification of the paravirt clock interface, taming the include hell by splitting the pv_ops structure and removing of a bunch of obsolete code (Juergen Gross) * tag 'x86_paravirt_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/paravirt: Use XOR r32,r32 to clear register in pv_vcpu_is_preempted() x86/paravirt: Remove trailing semicolons from alternative asm templates x86/pvlocks: Move paravirt spinlock functions into own header x86/paravirt: Specify pv_ops array in paravirt macros x86/paravirt: Allow pv-calls outside paravirt.h objtool: Allow multiple pv_ops arrays x86/xen: Drop xen_mmu_ops x86/xen: Drop xen_cpu_ops x86/xen: Drop xen_irq_ops x86/paravirt: Move pv_native_*() prototypes to paravirt.c x86/paravirt: Introduce new paravirt-base.h header x86/paravirt: Move paravirt_sched_clock() related code into tsc.c x86/paravirt: Use common code for paravirt_steal_clock() riscv/paravirt: Use common code for paravirt_steal_clock() loongarch/paravirt: Use common code for paravirt_steal_clock() arm64/paravirt: Use common code for paravirt_steal_clock() arm/paravirt: Use common code for paravirt_steal_clock() sched: Move clock related paravirt code to kernel/sched paravirt: Remove asm/paravirt_api_clock.h x86/paravirt: Move thunk macros to paravirt_types.h ...
2026-02-10Merge tag 'timers-vdso-2026-02-09' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull VDSO updates from Thomas Gleixner: - Provide the missing 64-bit variant of clock_getres() This allows the extension of CONFIG_COMPAT_32BIT_TIME to the vDSO and finally the removal of 32-bit time types from the kernel and UAPI. - Remove the useless and broken getcpu_cache from the VDSO The intention was to provide a trivial way to retrieve the CPU number from the VDSO, but as the VDSO data is per process there is no way to make it work. - Switch get/put_unaligned() from packed struct to memcpy() The packed struct violates strict aliasing rules which requires to pass -fno-strict-aliasing to the compiler. As this are scalar values __builtin_memcpy() turns them into simple loads and stores - Use __typeof_unqual__() for __unqual_scalar_typeof() The get/put_unaligned() changes triggered a new sparse warning when __beNN types are used with get/put_unaligned() as sparse builds add a special 'bitwise' attribute to them which prevents sparse to evaluate the Generic in __unqual_scalar_typeof(). Newer sparse versions support __typeof_unqual__() which avoids the problem, but requires a recent sparse install. So this adds a sanity check to sparse builds, which validates that sparse is available and capable of handling it. - Force inline __cvdso_clock_getres_common() Compilers sometimes un-inline agressively, which results in function call overhead and problems with automatic stack variable initialization. Interestingly enough the force inlining results in smaller code than the un-inlined variant produced by GCC when optimizing for size. * tag 'timers-vdso-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: vdso/gettimeofday: Force inlining of __cvdso_clock_getres_common() x86/percpu: Make CONFIG_USE_X86_SEG_SUPPORT work with sparse compiler: Use __typeof_unqual__() for __unqual_scalar_typeof() powerpc/vdso: Provide clock_getres_time64() tools headers: Remove unneeded ignoring of warnings in unaligned.h tools headers: Update the linux/unaligned.h copy with the kernel sources vdso: Switch get/put_unaligned() from packed struct to memcpy() parisc: Inline a type punning version of get_unaligned_le32() vdso: Remove struct getcpu_cache MIPS: vdso: Provide getres_time64() for 32-bit ABIs arm64: vdso32: Provide clock_getres_time64() ARM: VDSO: Provide clock_getres_time64() ARM: VDSO: Patch out __vdso_clock_getres() if unavailable x86/vdso: Provide clock_getres_time64() for x86-32 selftests: vDSO: vdso_test_abi: Add test for clock_getres_time64() selftests: vDSO: vdso_test_abi: Use UAPI system call numbers selftests: vDSO: vdso_config: Add configurations for clock_getres_time64() vdso: Add prototype for __vdso_clock_getres_time64()
2026-02-10Merge tag 'timers-core-2026-02-09' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer core updates from Thomas Gleixner: - Inline timecounter_cyc2time() as that is now used in the networking hotpath. Inlining it significantly improves performance. - Optimize the tick dependency check in case that the tracepoint is disabled, which improves the hotpath performance in the tick management code, which is a hotpath on transitions in and out of idle. - The usual cleanups and improvements * tag 'timers-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: time/kunit: Document handling of negative years of is_leap() tick/nohz: Optimize check_tick_dependency() with early return time/sched_clock: Use ACCESS_PRIVATE() to evaluate hrtimer::function hrtimer: Drop _tv64() helpers hrtimer: Remove public definition of HIGH_RES_NSEC hrtimer: Remove unused resolution constants time/timecounter: Inline timecounter_cyc2time()
2026-02-10Merge tag 'irq-msi-2026-02-09' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MSI updates from Thomas Gleixner: "Updates for the [PCI] MSI subsystem: - Add interrupt redirection infrastructure Some PCI controllers use a single demultiplexing interrupt for the MSI interrupts of subordinate devices. This prevents setting the interrupt affinity of device interrupts, which causes device interrupts to be delivered to a single CPU. That obviously is counterproductive for multi-queue devices and interrupt balancing. To work around this limitation the new infrastructure installs a dummy irq_set_affinity() callback which captures the affinity mask and picks a redirection target CPU out of the mask. When the PCI controller demultiplexes the interrupts it invokes a new handling function in the core, which either runs the interrupt handler in the context of the target CPU or delegates it to irq_work on the target CPU. - Utilize the interrupt redirection mechanism in the PCI DWC host controller driver. This allows affinity control for the subordinate device MSI interrupts instead of being randomly executed on the CPU which runs the demultiplex handler. - Replace the binary 64-bit MSI flag with a DMA mask Some PCI devices have PCI_MSI_FLAGS_64BIT in the MSI capability, but implement less than 64 address bits. This breaks on platforms where such a device is assigned an MSI address higher than what's supported. With the binary 64-bit flag there is no other choice than disabling 64-bit MSI support which leaves the device disfunctional. By using a DMA mask the address limit of a device can be described correctly which provides support for the above scenario. - Make use of the DMA mask based address limit in the hda/intel and radeon drivers to enable them on affected platforms - The usual small cleanups and improvements" * tag 'irq-msi-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ALSA: hda/intel: Make MSI address limit based on the device DMA limit drm/radeon: Make MSI address limit based on the device DMA limit PCI/MSI: Check the device specific address mask in msi_verify_entries() PCI/MSI: Convert the boolean no_64bit_msi flag to a DMA address mask genirq/redirect: Prevent writing MSI message on affinity change PCI/MSI: Unmap MSI-X region on error genirq: Update effective affinity for redirected interrupts PCI: dwc: Enable MSI affinity support PCI: dwc: Code cleanup genirq: Add interrupt redirection infrastructure genirq/msi: Correct kernel-doc in <linux/msi.h>
2026-02-10Merge tag 'irq-core-2026-02-09' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq core updates from Thomas Gleixner: "Updates for the interrupt core subsystem: - Remove the interrupt timing infrastructure This was added seven years ago to be used for power management purposes, but that integration never happened. - Clean up the remaining setup_percpu_irq() users The memory allocator is available when interrupts can be requested so there is not need for static irq_action. Move the remaining users to request_percpu_irq() and delete the historical cruft. - Warn when interrupt flag inconsistencies are detected in request*_irq(). Inconsistent flags can lead to hard to diagnose malfunction. The fallout of this new warning has been addressed in next and the fixes are coming in via the maintainer trees and the tip irq/cleanup pull requests. - Invoke affinity notifier when CPU hotplug breaks affinity Otherwise the code using the notifier misses the affinity change and operates on stale information. - The usual cleanups and improvements" * tag 'irq-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/proc: Replace snprintf with strscpy in register_handler_proc genirq/cpuhotplug: Notify about affinity changes breaking the affinity mask genirq: Move clear of kstat_irqs to free_desc() genirq: Warn about using IRQF_ONESHOT without a threaded handler irqdomain: Fix up const problem in irq_domain_set_name() genirq: Remove setup_percpu_irq() clocksource/drivers/mips-gic-timer: Move GIC timer to request_percpu_irq() MIPS: Move IP27 timer to request_percpu_irq() MIPS: Move IP30 timer to request_percpu_irq() genirq: Remove __request_percpu_irq() helper genirq: Remove IRQ timing tracking infrastructure
2026-02-10Merge tag 'sched-core-2026-02-09' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Scheduler Kconfig space updates: - Further consolidate configurable preemption modes (Peter Zijlstra) Reduce the number of architectures that are allowed to offer PREEMPT_NONE and PREEMPT_VOLUNTARY, reducing the number of preemption models from four to just two: 'full' and 'lazy' on up-to-date architectures (arm64, loongarch, powerpc, riscv, s390, x86). None and voluntary are only available as legacy features on platforms that don't implement lazy preemption yet, or which don't even support preemption. The goal is to eventually remove cond_resched() and voluntary preemption altogether. RSEQ based 'scheduler time slice extension' support (Thomas Gleixner and Peter Zijlstra): This allows a thread to request a time slice extension when it enters a critical section to avoid contention on a resource when the thread is scheduled out inside of the critical section. - Add fields and constants for time slice extension - Provide static branch for time slice extensions - Add statistics for time slice extensions - Add prctl() to enable time slice extensions - Implement sys_rseq_slice_yield() - Implement syscall entry work for time slice extensions - Implement time slice extension enforcement timer - Reset slice extension when scheduled - Implement rseq_grant_slice_extension() - entry: Hook up rseq time slice extension - selftests: Implement time slice extension test - Allow registering RSEQ with slice extension - Move slice_ext_nsec to debugfs - Lower default slice extension - selftests/rseq: Add rseq slice histogram script Scheduler performance/scalability improvements: - Update rq->avg_idle when a task is moved to an idle CPU, which improves the scalability of various workloads (Shubhang Kaushik) - Reorder fields in 'struct rq' for better caching (Blake Jones) - Fair scheduler SMP NOHZ balancing code speedups (Shrikanth Hegde): - Move checking for nohz cpus after time check - Change likelyhood of nohz.nr_cpus - Remove nohz.nr_cpus and use weight of cpumask instead - Avoid false sharing for sched_clock_irqtime (Wangyang Guo) - Cleanups (Yury Norov): - Drop useless cpumask_empty() in find_energy_efficient_cpu() - Simplify task_numa_find_cpu() - Use cpumask_weight_and() in sched_balance_find_dst_group() DL scheduler updates: - Add a deadline server for sched_ext tasks (by Andrea Righi and Joel Fernandes, with fixes by Peter Zijlstra) RT scheduler updates: - Skip currently executing CPU in rto_next_cpu() (Chen Jinghuang) Entry code updates and performance improvements (Jinjie Ruan) This is part of the scheduler tree in this cycle due to inter- dependencies with the RSEQ based time slice extension work: - Remove unused syscall argument from syscall_trace_enter() - Rework syscall_exit_to_user_mode_work() for architecture reuse - Add arch_ptrace_report_syscall_entry/exit() - Inline syscall_exit_work() and syscall_trace_enter() Scheduler core updates (Peter Zijlstra): - Rework sched_class::wakeup_preempt() and rq_modified_*() - Avoid rq->lock bouncing in sched_balance_newidle() - Rename rcu_dereference_check_sched_domain() => rcu_dereference_sched_domain() - <linux/compiler_types.h>: Add the __signed_scalar_typeof() helper Fair scheduler updates/refactoring (Peter Zijlstra and Ingo Molnar): - Fold the sched_avg update - Change rcu_dereference_check_sched_domain() to rcu-sched - Switch to rcu_dereference_all() - Remove superfluous rcu_read_lock() - Limit hrtick work - Join two #ifdef CONFIG_FAIR_GROUP_SCHED blocks - Clean up comments in 'struct cfs_rq' - Separate se->vlag from se->vprot - Rename cfs_rq::avg_load to cfs_rq::sum_weight - Rename cfs_rq::avg_vruntime to ::sum_w_vruntime & helper functions - Introduce and use the vruntime_cmp() and vruntime_op() wrappers for wrapped-signed aritmetics - Sort out 'blocked_load*' namespace noise Scheduler debugging code updates: - Export hidden tracepoints to modules (Gabriele Monaco) - Convert copy_from_user() + kstrtouint() to kstrtouint_from_user() (Fushuai Wang) - Add assertions to QUEUE_CLASS (Peter Zijlstra) - hrtimer: Fix tracing oddity (Thomas Gleixner) Misc fixes and cleanups: - Re-evaluate scheduling when migrating queued tasks out of throttled cgroups (Zicheng Qu) - Remove task_struct->faults_disabled_mapping (Christoph Hellwig) - Fix math notation errors in avg_vruntime comment (Zhan Xusheng) - sched/cpufreq: Use %pe format for PTR_ERR() printing (zenghongling)" * tag 'sched-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) sched: Re-evaluate scheduling when migrating queued tasks out of throttled cgroups sched/cpufreq: Use %pe format for PTR_ERR() printing sched/rt: Skip currently executing CPU in rto_next_cpu() sched/clock: Avoid false sharing for sched_clock_irqtime selftests/sched_ext: Add test for DL server total_bw consistency selftests/sched_ext: Add test for sched_ext dl_server sched/debug: Fix dl_server (re)start conditions sched/debug: Add support to change sched_ext server params sched_ext: Add a DL server for sched_ext tasks sched/debug: Stop and start server based on if it was active sched/debug: Fix updating of ppos on server write ops sched/deadline: Clear the defer params entry: Inline syscall_exit_work() and syscall_trace_enter() entry: Add arch_ptrace_report_syscall_entry/exit() entry: Rework syscall_exit_to_user_mode_work() for architecture reuse entry: Remove unused syscall argument from syscall_trace_enter() sched: remove task_struct->faults_disabled_mapping sched: Update rq->avg_idle when a task is moved to an idle CPU selftests/rseq: Add rseq slice histogram script hrtimer: Fix trace oddity ...
2026-02-10Merge tag 'locking-core-2026-02-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "Lock debugging: - Implement compiler-driven static analysis locking context checking, using the upcoming Clang 22 compiler's context analysis features (Marco Elver) We removed Sparse context analysis support, because prior to removal even a defconfig kernel produced 1,700+ context tracking Sparse warnings, the overwhelming majority of which are false positives. On an allmodconfig kernel the number of false positive context tracking Sparse warnings grows to over 5,200... On the plus side of the balance actual locking bugs found by Sparse context analysis is also rather ... sparse: I found only 3 such commits in the last 3 years. So the rate of false positives and the maintenance overhead is rather high and there appears to be no active policy in place to achieve a zero-warnings baseline to move the annotations & fixers to developers who introduce new code. Clang context analysis is more complete and more aggressive in trying to find bugs, at least in principle. Plus it has a different model to enabling it: it's enabled subsystem by subsystem, which results in zero warnings on all relevant kernel builds (as far as our testing managed to cover it). Which allowed us to enable it by default, similar to other compiler warnings, with the expectation that there are no warnings going forward. This enforces a zero-warnings baseline on clang-22+ builds (Which are still limited in distribution, admittedly) Hopefully the Clang approach can lead to a more maintainable zero-warnings status quo and policy, with more and more subsystems and drivers enabling the feature. Context tracking can be enabled for all kernel code via WARN_CONTEXT_ANALYSIS_ALL=y (default disabled), but this will generate a lot of false positives. ( Having said that, Sparse support could still be added back, if anyone is interested - the removal patch is still relatively straightforward to revert at this stage. ) Rust integration updates: (Alice Ryhl, Fujita Tomonori, Boqun Feng) - Add support for Atomic<i8/i16/bool> and replace most Rust native AtomicBool usages with Atomic<bool> - Clean up LockClassKey and improve its documentation - Add missing Send and Sync trait implementation for SetOnce - Make ARef Unpin as it is supposed to be - Add __rust_helper to a few Rust helpers as a preparation for helper LTO - Inline various lock related functions to avoid additional function calls WW mutexes: - Extend ww_mutex tests and other test-ww_mutex updates (John Stultz) Misc fixes and cleanups: - rcu: Mark lockdep_assert_rcu_helper() __always_inline (Arnd Bergmann) - locking/local_lock: Include more missing headers (Peter Zijlstra) - seqlock: fix scoped_seqlock_read kernel-doc (Randy Dunlap) - rust: sync: Replace `kernel::c_str!` with C-Strings (Tamir Duberstein)" * tag 'locking-core-2026-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (90 commits) locking/rwlock: Fix write_trylock_irqsave() with CONFIG_INLINE_WRITE_TRYLOCK rcu: Mark lockdep_assert_rcu_helper() __always_inline compiler-context-analysis: Remove __assume_ctx_lock from initializers tomoyo: Use scoped init guard crypto: Use scoped init guard kcov: Use scoped init guard compiler-context-analysis: Introduce scoped init guards cleanup: Make __DEFINE_LOCK_GUARD handle commas in initializers seqlock: fix scoped_seqlock_read kernel-doc tools: Update context analysis macros in compiler_types.h rust: sync: Replace `kernel::c_str!` with C-Strings rust: sync: Inline various lock related methods rust: helpers: Move #define __rust_helper out of atomic.c rust: wait: Add __rust_helper to helpers rust: time: Add __rust_helper to helpers rust: task: Add __rust_helper to helpers rust: sync: Add __rust_helper to helpers rust: refcount: Add __rust_helper to helpers rust: rcu: Add __rust_helper to helpers rust: processor: Add __rust_helper to helpers ...
2026-02-10Merge tag 'perf-core-2026-02-09' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance event updates from Ingo Molnar: "x86 PMU driver updates: - Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs (Dapeng Mi) Compared to previous iterations of the Intel PMU code, there's been a lot of changes, which center around three main areas: - Introduce the OFF-MODULE RESPONSE (OMR) facility to replace the Off-Core Response (OCR) facility - New PEBS data source encoding layout - Support the new "RDPMC user disable" feature - Likewise, a large series adds uncore PMU support for Intel Diamond Rapids (DMR) CPUs (Zide Chen) This centers around these four main areas: - DMR may have two Integrated I/O and Memory Hub (IMH) dies, separate from the compute tile (CBB) dies. Each CBB and each IMH die has its own discovery domain. - Unlike prior CPUs that retrieve the global discovery table portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON discovery and MSR for CBB PMON discovery. - DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA, UBR, PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6. - IIO free-running counters in DMR are MMIO-based, unlike SPR. - Also add support for Add missing PMON units for Intel Panther Lake, and support Nova Lake (NVL), which largely maps to Panther Lake. (Zide Chen) - KVM integration: Add support for mediated vPMUs (by Kan Liang and Sean Christopherson, with fixes and cleanups by Peter Zijlstra, Sandipan Das and Mingwei Zhang) - Add Intel cstate driver to support for Wildcat Lake (WCL) CPUs, which are a low-power variant of Panther Lake (Zide Chen) - Add core, cstate and MSR PMU support for the Airmont NP Intel CPU (aka MaxLinear Lightning Mountain), which maps to the existing Airmont code (Martin Schiller) Performance enhancements: - Speed up kexec shutdown by avoiding unnecessary cross CPU calls (Jan H. Schönherr) - Fix slow perf_event_task_exit() with LBR callstacks (Namhyung Kim) User-space stack unwinding support: - Various cleanups and refactorings in preparation to generalize the unwinding code for other architectures (Jens Remus) Uprobes updates: - Transition from kmap_atomic to kmap_local_page (Keke Ming) - Fix incorrect lockdep condition in filter_chain() (Breno Leitao) - Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov) Misc fixes and cleanups: - s390: Remove kvm_types.h from Kbuild (Randy Dunlap) - x86/intel/uncore: Convert comma to semicolon (Chen Ni) - x86/uncore: Clean up const mismatch (Greg Kroah-Hartman) - x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)" * tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits) s390: remove kvm_types.h from Kbuild uprobes: Fix incorrect lockdep condition in filter_chain() x86/ibs: Fix typo in dc_l2tlb_miss comment x86/uprobes: Fix XOL allocation failure for 32-bit tasks perf/x86/intel/uncore: Convert comma to semicolon perf/x86/intel: Add support for rdpmc user disable feature perf/x86: Use macros to replace magic numbers in attr_rdpmc perf/x86/intel: Add core PMU support for Novalake perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL perf/x86/intel: Add core PMU support for DMR perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL perf/core: Fix slow perf_event_task_exit() with LBR callstacks perf/core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls uprobes: use kmap_local_page() for temporary page mappings arm/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() mips/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() arm64/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() riscv/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() perf/x86/intel/uncore: Add Nova Lake support ...
2026-02-10Merge tag 'bpf-next-7.0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: - Support associating BPF program with struct_ops (Amery Hung) - Switch BPF local storage to rqspinlock and remove recursion detection counters which were causing false positives (Amery Hung) - Fix live registers marking for indirect jumps (Anton Protopopov) - Introduce execution context detection BPF helpers (Changwoo Min) - Improve verifier precision for 32bit sign extension pattern (Cupertino Miranda) - Optimize BTF type lookup by sorting vmlinux BTF and doing binary search (Donglin Peng) - Allow states pruning for misc/invalid slots in iterator loops (Eduard Zingerman) - In preparation for ASAN support in BPF arenas teach libbpf to move global BPF variables to the end of the region and enable arena kfuncs while holding locks (Emil Tsalapatis) - Introduce support for implicit arguments in kfuncs and migrate a number of them to new API. This is a prerequisite for cgroup sub-schedulers in sched-ext (Ihor Solodrai) - Fix incorrect copied_seq calculation in sockmap (Jiayuan Chen) - Fix ORC stack unwind from kprobe_multi (Jiri Olsa) - Speed up fentry attach by using single ftrace direct ops in BPF trampolines (Jiri Olsa) - Require frozen map for calculating map hash (KP Singh) - Fix lock entry creation in TAS fallback in rqspinlock (Kumar Kartikeya Dwivedi) - Allow user space to select cpu in lookup/update operations on per-cpu array and hash maps (Leon Hwang) - Make kfuncs return trusted pointers by default (Matt Bobrowski) - Introduce "fsession" support where single BPF program is executed upon entry and exit from traced kernel function (Menglong Dong) - Allow bpf_timer and bpf_wq use in all programs types (Mykyta Yatsenko, Andrii Nakryiko, Kumar Kartikeya Dwivedi, Alexei Starovoitov) - Make KF_TRUSTED_ARGS the default for all kfuncs and clean up their definition across the tree (Puranjay Mohan) - Allow BPF arena calls from non-sleepable context (Puranjay Mohan) - Improve register id comparison logic in the verifier and extend linked registers with negative offsets (Puranjay Mohan) - In preparation for BPF-OOM introduce kfuncs to access memcg events (Roman Gushchin) - Use CFI compatible destructor kfunc type (Sami Tolvanen) - Add bitwise tracking for BPF_END in the verifier (Tianci Cao) - Add range tracking for BPF_DIV and BPF_MOD in the verifier (Yazhou Tang) - Make BPF selftests work with 64k page size (Yonghong Song) * tag 'bpf-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (268 commits) selftests/bpf: Fix outdated test on storage->smap selftests/bpf: Choose another percpu variable in bpf for btf_dump test selftests/bpf: Remove test_task_storage_map_stress_lookup selftests/bpf: Update task_local_storage/task_storage_nodeadlock test selftests/bpf: Update task_local_storage/recursion test selftests/bpf: Update sk_storage_omem_uncharge test bpf: Switch to bpf_selem_unlink_nofail in bpf_local_storage_{map_free, destroy} bpf: Support lockless unlink when freeing map or local storage bpf: Prepare for bpf_selem_unlink_nofail() bpf: Remove unused percpu counter from bpf_local_storage_map_free bpf: Remove cgroup local storage percpu counter bpf: Remove task local storage percpu counter bpf: Change local_storage->lock and b->lock to rqspinlock bpf: Convert bpf_selem_unlink to failable bpf: Convert bpf_selem_link_map to failable bpf: Convert bpf_selem_unlink_map to failable bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage selftests/xsk: fix number of Tx frags in invalid packet selftests/xsk: properly handle batch ending in the middle of a packet bpf: Prevent reentrance into call_rcu_tasks_trace() ...
2026-02-10Merge tag 'modules-7.0-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux Pull module updates from Sami Tolvanen: "Module signing: - Remove SHA-1 support for signing modules. SHA-1 is no longer considered secure for signatures due to vulnerabilities that can lead to hash collisions. None of the major distributions use SHA-1 anymore, and the kernel has defaulted to SHA-512 since v6.11. Note that loading SHA-1 signed modules is still supported. - Update scripts/sign-file to use only the OpenSSL CMS API for signing. As SHA-1 support is gone, we can drop the legacy PKCS#7 API which was limited to SHA-1. This also cleans up support for legacy OpenSSL versions. Cleanups and fixes: - Use system_dfl_wq instead of the per-cpu system_wq following the ongoing workqueue API refactoring. - Avoid open-coded kvrealloc() in module decompression logic by using the standard helper. - Improve section annotations by replacing the custom __modinit with __init_or_module and removing several unused __INIT*_OR_MODULE macros. - Fix kernel-doc warnings in include/linux/moduleparam.h. - Ensure set_module_sig_enforced is only declared when module signing is enabled. - Fix gendwarfksyms build failures on 32-bit hosts. MAINTAINERS: - Update the module subsystem entry to reflect the maintainer rotation and update the git repository link" * tag 'modules-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux: modules: moduleparam.h: fix kernel-doc comments module: Only declare set_module_sig_enforced when CONFIG_MODULE_SIG=y module/decompress: Avoid open-coded kvrealloc() gendwarfksyms: Fix build on 32-bit hosts sign-file: Use only the OpenSSL CMS API for signing module: Remove SHA-1 support for module signing module: replace use of system_wq with system_dfl_wq params: Replace __modinit with __init_or_module module: Remove unused __INIT*_OR_MODULE macros MAINTAINERS: Update module subsystem maintainers and repository
2026-02-10Merge tag 'v7.0-p1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "API: - Fix race condition in hwrng core by using RCU Algorithms: - Allow authenc(sha224,rfc3686) in fips mode - Add test vectors for authenc(hmac(sha384),cbc(aes)) - Add test vectors for authenc(hmac(sha224),cbc(aes)) - Add test vectors for authenc(hmac(md5),cbc(des3_ede)) - Add lz4 support in hisi_zip - Only allow clear key use during self-test in s390/{phmac,paes} Drivers: - Set rng quality to 900 in airoha - Add gcm(aes) support for AMD/Xilinx Versal device - Allow tfms to share device in hisilicon/trng" * tag 'v7.0-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (100 commits) crypto: img-hash - Use unregister_ahashes in img_{un}register_algs crypto: testmgr - Add test vectors for authenc(hmac(md5),cbc(des3_ede)) crypto: cesa - Simplify return statement in mv_cesa_dequeue_req_locked crypto: testmgr - Add test vectors for authenc(hmac(sha224),cbc(aes)) crypto: testmgr - Add test vectors for authenc(hmac(sha384),cbc(aes)) hwrng: core - use RCU and work_struct to fix race condition crypto: starfive - Fix memory leak in starfive_aes_aead_do_one_req() crypto: xilinx - Fix inconsistant indentation crypto: rng - Use unregister_rngs in register_rngs crypto: atmel - Use unregister_{aeads,ahashes,skciphers} hwrng: optee - simplify OP-TEE context match crypto: ccp - Add sysfs attribute for boot integrity dt-bindings: crypto: atmel,at91sam9g46-sha: add microchip,lan9691-sha dt-bindings: crypto: atmel,at91sam9g46-aes: add microchip,lan9691-aes dt-bindings: crypto: qcom,inline-crypto-engine: document the Milos ICE crypto: caam - fix netdev memory leak in dpaa2_caam_probe crypto: hisilicon/qm - increase wait time for mailbox crypto: hisilicon/qm - obtain the mailbox configuration at one time crypto: hisilicon/qm - remove unnecessary code in qm_mb_write() crypto: hisilicon/qm - move the barrier before writing to the mailbox register ...
2026-02-10Revert "pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers"Oleg Nesterov
This reverts commit abdfd4948e45c51b19162cf8b3f5003f8f53c9b9. The changelog in this commit explains why it is not easy to avoid ns == NULL when the caller is exiting, but pid_vnr() is equally unsafe in this case. However, commit 006568ab4c5c ("pid: Add a judgment for ns null in pid_nr_ns") already added the ns != NULL check in pid_nr_ns(), so we can remove the same check from __task_pid_nr_ns(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://patch.msgid.link/20251015123613.GA9456@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-10pidfs: implement ino allocation without the pidmap lockMateusz Guzik
This paves the way for scalable PID allocation later. The 32 bit variant merely takes a spinlock for simplicity, the 64 bit variant uses a scalable scheme. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://patch.msgid.link/20260120184539.1480930-1-mjguzik@gmail.com Co-developed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-10pidfs: convert rb-tree to rhashtableChristian Brauner
Mateusz reported performance penalties [1] during task creation because pidfs uses pidmap_lock to add elements into the rbtree. Switch to an rhashtable to have separate fine-grained locking and to decouple from pidmap_lock moving all heavy manipulations outside of it. Convert the pidfs inode-to-pid mapping from an rb-tree with seqcount protection to an rhashtable. This removes the global pidmap_lock contention from pidfs_ino_get_pid() lookups and allows the hashtable insert to happen outside the pidmap_lock. pidfs_add_pid() is split. pidfs_prepare_pid() allocates inode number and initializes pid fields and is called inside pidmap_lock. pidfs_add_pid() inserts pid into rhashtable and is called outside pidmap_lock. Insertion into the rhashtable can fail and memory allocation may happen so we need to drop the spinlock. To guard against accidently opening an already reaped task pidfs_ino_get_pid() uses additional checks beyond pid_vnr(). If pid->attr is PIDFS_PID_DEAD or NULL the pid either never had a pidfd or it already went through pidfs_exit() aka the process as already reaped. If pid->attr is valid check PIDFS_ATTR_BIT_EXIT to figure out whether the task has exited. This slightly changes visibility semantics: pidfd creation is denied after pidfs_exit() runs, which is just before the pid number is removed from the via free_pid(). That should not be an issue though. Link: https://lore.kernel.org/20251206131955.780557-1-mjguzik@gmail.com [1] Link: https://patch.msgid.link/20260120-work-pidfs-rhashtable-v2-1-d593c4d0f576@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-10tracing: Fix false sharing in hwlat get_sample()Colin Lord
The get_sample() function in the hwlat tracer assumes the caller holds hwlat_data.lock, but this is not actually happening. The result is unprotected data access to hwlat_data, and in per-cpu mode can result in false sharing which may show up as false positive latency events. The specific case of false sharing observed was primarily between hwlat_data.sample_width and hwlat_data.count. These are separated by just 8B and are therefore likely to share a cache line. When one thread modifies count, the cache line is in a modified state so when other threads read sample_width in the main latency detection loop, they fetch the modified cache line. On some systems, the fetch itself may be slow enough to count as a latency event, which could set up a self reinforcing cycle of latency events as each event increments count which then causes more latency events, continuing the cycle. The other result of the unprotected data access is that hwlat_data.count can end up with duplicate or missed values, which was observed on some systems in testing. Convert hwlat_data.count to atomic64_t so it can be safely modified without locking, and prevent false sharing by pulling sample_width into a local variable. One system this was tested on was a dual socket server with 32 CPUs on each numa node. With settings of 1us threshold, 1000us width, and 2000us window, this change reduced the number of latency events from 500 per second down to approximately 1 event per minute. Some machines tested did not exhibit measurable latency from the false sharing. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260210074810.6328-1-clord@mykolab.com Signed-off-by: Colin Lord <clord@mykolab.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-02-09Merge tag 'kthread-for-7.0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks Pull kthread updates from Frederic Weisbecker: "The kthread code provides an infrastructure which manages the preferred affinity of unbound kthreads (node or custom cpumask) against housekeeping (CPU isolation) constraints and CPU hotplug events. One crucial missing piece is the handling of cpuset: when an isolated partition is created, deleted, or its CPUs updated, all the unbound kthreads in the top cpuset become indifferently affine to _all_ the non-isolated CPUs, possibly breaking their preferred affinity along the way. Solve this with performing the kthreads affinity update from cpuset to the kthreads consolidated relevant code instead so that preferred affinities are honoured and applied against the updated cpuset isolated partitions. The dispatch of the new isolated cpumasks to timers, workqueues and kthreads is performed by housekeeping, as per the nice Tejun's suggestion. As a welcome side effect, HK_TYPE_DOMAIN then integrates both the set from boot defined domain isolation (through isolcpus=) and cpuset isolated partitions. Housekeeping cpumasks are now modifiable with a specific RCU based synchronization. A big step toward making nohz_full= also mutable through cpuset in the future" * tag 'kthread-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks: (33 commits) doc: Add housekeeping documentation kthread: Document kthread_affine_preferred() kthread: Comment on the purpose and placement of kthread_affine_node() call kthread: Honour kthreads preferred affinity after cpuset changes sched/arm64: Move fallback task cpumask to HK_TYPE_DOMAIN sched: Switch the fallback task allowed cpumask to HK_TYPE_DOMAIN kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management kthread: Include kthreadd to the managed affinity list kthread: Include unbound kthreads in the managed affinity list kthread: Refine naming of affinity related fields PCI: Remove superfluous HK_TYPE_WQ check sched/isolation: Remove HK_TYPE_TICK test from cpu_is_isolated() cpuset: Remove cpuset_cpu_is_isolated() timers/migration: Remove superfluous cpuset isolation test cpuset: Propagate cpuset isolation update to timers through housekeeping cpuset: Propagate cpuset isolation update to workqueue through housekeeping PCI: Flush PCI probe workqueue on cpuset isolated partition change sched/isolation: Flush vmstat workqueues on cpuset isolated partition change sched/isolation: Flush memcg workqueues on cpuset isolated partition change cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset ...
2026-02-09Merge tag 'pm-6.20-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "By the number of commits, cpufreq is the leading party (again) and the most visible change there is the removal of the omap-cpufreq driver that has not been used for a long time (good riddance). There are also quite a few changes in the cppc_cpufreq driver, mostly related to fixing its frequency invariance engine in the case when the CPPC registers used by it are not in PCC. In addition to that, support for AM62L3 is added to the ti-cpufreq driver and the cpufreq-dt-platdev list is updated for some platforms. The remaining cpufreq changes are assorted fixes and cleanups. Next up is cpuidle and the changes there are dominated by intel_idle driver updates, mostly related to the new command line facility allowing users to adjust the list of C-states used by the driver. There are also a few updates of cpuidle governors, including two menu governor fixes and some refinements of the teo governor, and a MAINTAINERS update adding Christian Loehle as a cpuidle reviewer. [Thanks for stepping up Christian!] The most significant update related to system suspend and hibernation is the one to stop freezing the PM runtime workqueue during system PM transitions which allows some deadlocks to be avoided. There is also a fix for possible concurrent bit field updates in the core device suspend code and a few other minor fixes. Apart from the above, several drivers are updated to discard the return value of pm_runtime_put() which is going to be converted to a void function as soon as everybody stops using its return value, PL4 support for Ice Lake is added to the Intel RAPL power capping driver, and there are assorted cleanups, documentation fixes, and some cpupower utility improvements. Specifics: - Remove the unused omap-cpufreq driver (Andreas Kemnade) - Optimize error handling code in cpufreq_boost_trigger_state() and make cpufreq_boost_trigger_state() return -EOPNOTSUPP if no policy supports boost (Lifeng Zheng) - Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling, Dhruva Gole, and Konrad Dybcio) - Minor improvements to the cpufreq and cpumask rust implementation (Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen) - Add support for AM62L3 SoC to the ti-cpufreq driver (Dhruva Gole) - Update arch_freq_scale in the CPPC cpufreq driver's frequency invariance engine (FIE) in scheduler ticks if the related CPPC registers are not in PCC (Jie Zhan) - Assorted minor cleanups and improvements in ARM cpufreq drivers (Juan Martinez, Felix Gu, Luca Weiss, and Sergey Shtylyov) - Add generic helpers for sysfs show/store to cppc_cpufreq (Sumit Gupta) - Make the scaling_setspeed cpufreq sysfs attribute return the actual requested frequency to avoid confusion (Pengjie Zhang) - Simplify the idle CPU time granularity test in the ondemand cpufreq governor (Frederic Weisbecker) - Enable asym capacity in intel_pstate only when CPU SMT is not possible (Yaxiong Tian) - Update the description of rate_limit_us default value in cpufreq documentation (Yaxiong Tian) - Add a command line option to adjust the C-states table in the intel_idle driver, remove the 'preferred_cstates' module parameter from it, add C-states validation to it and clean it up (Artem Bityutskiy) - Make the menu cpuidle governor always check the time till the closest timer event when the scheduler tick has been stopped to prevent it from mistakenly selecting the deepest available idle state (Rafael Wysocki) - Update the teo cpuidle governor to avoid making suboptimal decisions in certain corner cases and generally improve idle state selection accuracy (Rafael Wysocki) - Remove an unlikely() annotation on the early-return condition in menu_select() that leads to branch misprediction 100% of the time on systems with only 1 idle state enabled, like ARM64 servers (Breno Leitao) - Add Christian Loehle to MAINTAINERS as a cpuidle reviewer (Christian Loehle) - Stop flagging the PM runtime workqueue as freezable to avoid system suspend and resume deadlocks in subsystems that assume asynchronous runtime PM to work during system-wide PM transitions (Rafael Wysocki) - Drop redundant NULL pointer checks before acomp_request_free() from the hibernation code handling image saving (Rafael Wysocki) - Update wakeup_sources_walk_start() to handle empty lists of wakeup sources as appropriate (Samuel Wu) - Make dev_pm_clear_wake_irq() check the power.wakeirq value under power.lock to avoid race conditions (Gui-Dong Han) - Avoid bit field races related to power.work_in_progress in the core device suspend code (Xuewen Yan) - Make several drivers discard pm_runtime_put() return value in preparation for converting that function to a void one (Rafael Wysocki) - Add PL4 support for Ice Lake to the Intel RAPL power capping driver (Daniel Tang) - Replace sprintf() with sysfs_emit() in power capping sysfs show functions (Sumeet Pawnikar) - Make dev_pm_opp_get_level() return value match the documentation after a previous update of the latter (Aleks Todorov) - Use scoped for each OF child loop in the OPP code (Krzysztof Kozlowski) - Fix a bug in an example code snippet and correct typos in the energy model management documentation (Patrick Little) - Fix miscellaneous problems in cpupower (Kaushlendra Kumar): * idle_monitor: Fix incorrect value logged after stop * Fix inverted APERF capability check * Use strcspn() to strip trailing newline * Reset errno before strtoull() * Show C0 in idle-info dump - Improve cpupower installation procedure by making the systemd step optional and allowing users to disable the installation of systemd's unit file (João Marcos Costa)" * tag 'pm-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits) PM: sleep: core: Avoid bit field races related to work_in_progress PM: sleep: wakeirq: harden dev_pm_clear_wake_irq() against races cpufreq: Documentation: Update description of rate_limit_us default value cpufreq: intel_pstate: Enable asym capacity only when CPU SMT is not possible PM: wakeup: Handle empty list in wakeup_sources_walk_start() PM: EM: Documentation: Fix bug in example code snippet Documentation: Fix typos in energy model documentation cpuidle: governors: teo: Refine intercepts-based idle state lookup cpuidle: governors: teo: Adjust the classification of wakeup events cpufreq: ondemand: Simplify idle cputime granularity test cpufreq: userspace: make scaling_setspeed return the actual requested frequency PM: hibernate: Drop NULL pointer checks before acomp_request_free() cpufreq: CPPC: Add generic helpers for sysfs show/store cpufreq: scmi: Fix device_node reference leak in scmi_cpu_domain_id() cpufreq: ti-cpufreq: add support for AM62L3 SoC cpufreq: dt-platdev: Add ti,am62l3 to blocklist cpufreq/amd-pstate: Add comment explaining nominal_perf usage for performance policy cpufreq: scmi: correct SCMI explanation cpufreq: dt-platdev: Block the driver from probing on more QC platforms rust: cpumask: rename methods of Cpumask for clarity and consistency ...
2026-02-09Merge tag 'acpi-6.20-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "This one is significantly larger than previous ACPI support pull requests because several significant updates have coincided in it. First, there is a routine ACPICA code update, to upstream version 20251212, but this time it covers new ACPI 6.6 material that has not been covered yet. Among other things, it includes definitions of a few new ACPI tables and updates of some others, like the GICv5 MADT structures and ARM IORT IWB node definitions that are used for adding GICv5 ACPI probing on ARM (that technically is IRQ subsystem material, but it depends on the ACPICA changes, so it is included here). The latter alone adds a few hundred lines of new code. Second, there is an update of ACPI _OSC handling including a fix that prevents failures from occurring in some corner cases due to careless handling of _OSC error bits. On top of that, the "system resource" ACPI device objects with the PNP0C01 and PNP0C02 are now going to be handled by the ACPI core device enumeration code instead of handing them over to the legacy PNP system driver which causes device enumeration issues to occur. Some of those issues have been worked around in device drivers and elsewhere and those workarounds should not be necessary any more, so they are going away. Moreover, the time has come to convert all "core ACPI" device drivers that were still using struct acpi_driver objects for device binding into proper platform drivers that use struct platform_driver for this purpose. These updates are accompanied by some requisite core ACPI device enumeration code changes. Next, there are ACPI APEI updates, including changes to avoid excess overhead in the NMI handler and in SEA on the ARM side, changes to unify ACPI-based HW error tracing and logging, and changes to prevent APEI code from reaching out of its allocated memory. There are also some ACPI power management updates, mostly related to the ACPI cpuidle support in the processor driver, suspend-to-idle handling on systems with ACPI support and to ACPI PM of devices. In addition to the above, bugs are fixed and the code is cleaned up in assorted places all over. Specifics: - Update the ACPICA code in the kernel to upstream version 20251212 which includes the following changes: * Add support for new ACPI table DTPR (Michal Camacho Romero) * Release objects with acpi_ut_delete_object_desc() (Zilin Guan) * Add UUIDs for Microsoft fan extensions and UUIDs associated with TPM 2.0 devices (Armin Wolf) * Fix NULL pointer dereference in acpi_ev_address_space_dispatch() (Alexey Simakov) * Add KEYP ACPI table definition (Dave Jiang) * Add support for the Microsoft display mux _OSI string (Armin Wolf) * Add definitions for the IOVT ACPI table (Xianglai Li) * Abort AML bytecode execution on AML_FATAL_OP (Armin Wolf) * Include all fields in subtable type1 for PPTT (Ben Horgan) * Add GICv5 MADT structures and Arm IORT IWB node definitions (Jose Marinho) * Update Parameter Block structure for RAS2 and add a new flag in Memory Affinity Structure for SRAT (Pawel Chmielewski) * Add _VDM (Voltage Domain) object (Pawel Chmielewski) - Add support for GICv5 ACPI probing on ARM which is based on the GICv5 MADT structures and ARM IORT IWB node definitions recently added to ACPICA (Lorenzo Pieralisi) - Rework ACPI PM notification setup for PCI root buses and modify the ACPI PM setup for devices to register wakeup source objects under physical (that is, PCI, platform, etc.) devices instead of doing that under their ACPI companions (Rafael Wysocki) - Adjust debug messages regarding postponed ACPI PM printed during system resume to be more accurate (Rafael Wysocki) - Remove dead code from lps0_device_attach() (Gergo Koteles) - Start to invoke Microsoft Function 9 (Turn On Display) of the Low- Power S0 Idle (LPS0) _DSM in the suspend-to-idle resume flow on systems with ACPI LPS0 support to address a functional issue on Lenovo Yoga Slim 7i Aura (15ILL9), where system fans and keyboard backlights fail to resume after suspend (Jakob Riemenschneider) - Add sysfs attribute cid for exposing _CID lists under ACPI device objects (Rafael Wysocki) - Replace sprintf() with sysfs_emit() in all of the core ACPI sysfs interface code (Sumeet Pawnikar) - Use acpi_get_local_u64_address() in the code implementing ACPI support for PCI to evaluate _ADR instead of evaluating that object directly (Andy Shevchenko) - Add JWIPC JVC9100 to irq1_level_low_skip_override[] to unbreak serial IRQs on that system (Ai Chao) - Fix handling of _OSC errors in acpi_run_osc() to avoid failures on systems where _OSC error bits are set even though the _OSC return buffer contains acknowledged feature bits (Rafael Wysocki) - Clean up and rearrange \_SB._OSC handling for general platform features and USB4 features to avoid code duplication and unnecessary memory management overhead (Rafael Wysocki) - Make the ACPI core device enumeration code handle PNP0C01 and PNP0C02 ("system resource") device objects directly instead of letting the legacy PNP system driver handle them to avoid device enumeration issues on systems where PNP0C02 is present in the _CID list under ACPI device objects with a _HID matching a proper device driver in Linux (Rafael Wysocki) - Drop workarounds for the known device enumeration issues related to _CID lists containing PNP0C02 (Rafael Wysocki) - Drop outdated comment regarding removed function in the ACPI-based device enumeration code (Julia Lawall) - Make PRP0001 device matching work as expected for ACPI device objects using it as a _HID for board development and similar purposes (Kartik Rajput) - Use async schedule function in acpi_scan_clear_dep_fn() to avoid races with user space initialization on some systems (Yicong Yang) - Add a piece of documentation explaining why binding drivers directly to ACPI device objects is not a good idea in general and why it is desirable to convert drivers doing so into proper platform drivers that use struct platform_driver for device binding (Rafael Wysocki) - Convert multiple "core ACPI" drivers, including the NFIT ACPI device driver, the generic ACPI button drivers, the generic ACPI thermal zone driver, the ACPI hardware event device (HED) driver, the ACPI EC driver, the ACPI SMBUS HC driver, the ACPI Smart Battery Subsystem (SBS) driver, and the ACPI backlight (video) driver to proper platform drivers that use struct platform_driver for device binding (Rafael Wysocki) - Use acpi_get_local_u64_address() in the ACPI backlight (video) driver to evaluate _ADR instead of evaluating that object directly (Andy Shevchenko) - Convert the generic ACPI battery driver to a proper platform driver using struct platform_driver for device binding (Rafael Wysocki) - Fix incorrect charging status when current is zero in the generic ACPI battery driver (Ata İlhan Köktürk) - Use LIST_HEAD() for initializing a stack-allocated list in the generic ACPI watchdog device driver (Can Peng) - Rework the ACPI idle driver initialization to register it directly from the common initialization code instead of doing that from a CPU hotplug "online" callback and clean it up (Huisong Li, Rafael Wysocki) - Fix a possible NULL pointer dereference in acpi_processor_errata_piix4() (Tuo Li) - Make read-only array non_mmio_desc[] static const (Colin Ian King) - Prevent the APEI GHES support code on ARM from accessing memory out of bounds or going past the ARM processor CPER record buffer (Mauro Carvalho Chehab) - Prevent cper_print_fw_err() from dumping the entire memory on systems with defective firmware (Mauro Carvalho Chehab) - Improve ghes_notify_nmi() status check to avoid unnecessary overhead in the NMI handler by carrying out all of the requisite preparations and the NMI registration time (Tony Luck) - Refactor the GHES driver by extracting common functionality into reusable helper functions to reduce code duplication and improve the ghes_notify_sea() status check in analogy with the previous ghes_notify_nmi() status check improvement (Shuai Xue) - Make ELOG and GHES log and trace consistently and support the CPER CXL protocol analogously (Fabio De Francesco) - Disable KASAN instrumentation in the APEI GHES driver when compile testing with clang < 18 (Nathan Chancellor) - Let ghes_edac be the preferred driver to load on __ZX__ and _BYO_ systems by extending the platform detection list in the APEI GHES driver (Tony W Wang-oc) - Clean up cppc_perf_caps and cppc_perf_ctrls structs and rename EPP constants for clarity in the ACPI CPPC library (Sumit Gupta)" * tag 'acpi-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (117 commits) ACPI: battery: fix incorrect charging status when current is zero ACPI: scan: Use async schedule function in acpi_scan_clear_dep_fn() ACPI: x86: s2idle: Invoke Microsoft _DSM Function 9 (Turn On Display) ACPI: APEI: GHES: Add ghes_edac support for __ZX__ and _BYO_ systems ACPI: APEI: GHES: Disable KASAN instrumentation when compile testing with clang < 18 ACPI: sysfs: Replace sprintf() with sysfs_emit() ACPI: CPPC: Rename EPP constants for clarity ACPI: CPPC: Clean up cppc_perf_caps and cppc_perf_ctrls structs ACPI: processor: idle: Rework the handling of acpi_processor_ffh_lpi_probe() ACPI: processor: idle: Convert acpi_processor_setup_cpuidle_dev() to void ACPI: processor: idle: Convert acpi_processor_setup_cpuidle_states() to void irqchip/gic-v5: Add ACPI IWB probing irqchip/gic-v5: Add ACPI ITS probing irqchip/gic-v5: Add ACPI IRS probing irqchip/gic-v5: Split IRS probing into OF and generic portions PCI/MSI: Make the pci_msi_map_rid_ctlr_node() interface firmware agnostic irqdomain: Add parent field to struct irqchip_fwid ACPI: PCI: simplify code with acpi_get_local_u64_address() ACPI: video: simplify code with acpi_get_local_u64_address() ACPI: PM: Adjust messages regarding postponed ACPI PM ...
2026-02-09Merge tag 'for-7.0/block-20260206' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Support for batch request processing for ublk, improving the efficiency of the kernel/ublk server communication. This can yield nice 7-12% performance improvements - Support for integrity data for ublk - Various other ublk improvements and additions, including a ton of selftests additions and updated - Move the handling of blk-crypto software fallback from below the block layer to above it. This reduces the complexity of dealing with bio splitting - Series fixing a number of potential deadlocks in blk-mq related to the queue usage counter and writeback throttling and rq-qos debugfs handling - Add an async_depth queue attribute, to resolve a performance regression that's been around for a qhilw related to the scheduler depth handling - Only use task_work for IOPOLL completions on NVMe, if it is necessary to do so. An earlier fix for an issue resulted in all these completions being punted to task_work, to guarantee that completions were only run for a given io_uring ring when it was local to that ring. With the new changes, we can detect if it's necessary to use task_work or not, and avoid it if possible. - rnbd fixes: - Fix refcount underflow in device unmap path - Handle PREFLUSH and NOUNMAP flags properly in protocol - Fix server-side bi_size for special IOs - Zero response buffer before use - Fix trace format for flags - Add .release to rnbd_dev_ktype - MD pull requests via Yu Kuai - Fix raid5_run() to return error when log_init() fails - Fix IO hang with degraded array with llbitmap - Fix percpu_ref not resurrected on suspend timeout in llbitmap - Fix GPF in write_page caused by resize race - Fix NULL pointer dereference in process_metadata_update - Fix hang when stopping arrays with metadata through dm-raid - Fix any_working flag handling in raid10_sync_request - Refactor sync/recovery code path, improve error handling for badblocks, and remove unused recovery_disabled field - Consolidate mddev boolean fields into mddev_flags - Use mempool to allocate stripe_request_ctx and make sure max_sectors is not less than io_opt in raid5 - Fix return value of mddev_trylock - Fix memory leak in raid1_run() - Add Li Nan as mdraid reviewer - Move phys_vec definitions to the kernel types, mostly in preparation for some VFIO and RDMA changes - Improve the speed for secure erase for some devices - Various little rust updates - Various other minor fixes, improvements, and cleanups * tag 'for-7.0/block-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits) blk-mq: ABI/sysfs-block: fix docs build warnings selftests: ublk: organize test directories by test ID block: decouple secure erase size limit from discard size limit block: remove redundant kill_bdev() call in set_blocksize() blk-mq: add documentation for new queue attribute async_dpeth block, bfq: convert to use request_queue->async_depth mq-deadline: covert to use request_queue->async_depth kyber: covert to use request_queue->async_depth blk-mq: add a new queue sysfs attribute async_depth blk-mq: factor out a helper blk_mq_limit_depth() blk-mq-sched: unify elevators checking for async requests block: convert nr_requests to unsigned int block: don't use strcpy to copy blockdev name blk-mq-debugfs: warn about possible deadlock blk-mq-debugfs: add missing debugfs_mutex in blk_mq_debugfs_register_hctxs() blk-mq-debugfs: remove blk_mq_debugfs_unregister_rqos() blk-mq-debugfs: make blk_mq_debugfs_register_rqos() static blk-rq-qos: fix possible debugfs_mutex deadlock blk-mq-debugfs: factor out a helper to register debugfs for all rq_qos blk-wbt: fix possible deadlock to nest pcpu_alloc_mutex under q_usage_counter ...
2026-02-09Merge tag 'io_uring-bpf-restrictions.4-20260206' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring bpf filters from Jens Axboe: "This adds support for both cBPF filters for io_uring, as well as task inherited restrictions and filters. seccomp and io_uring don't play along nicely, as most of the interesting data to filter on resides somewhat out-of-band, in the submission queue ring. As a result, things like containers and systemd that apply seccomp filters, can't filter io_uring operations. That leaves them with just one choice if filtering is critical - filter the actual io_uring_setup(2) system call to simply disallow io_uring. That's rather unfortunate, and has limited us because of it. io_uring already has some filtering support. It requires the ring to be setup in a disabled state, and then a filter set can be applied. This filter set is completely bi-modal - an opcode is either enabled or it's not. Once a filter set is registered, the ring can be enabled. This is very restrictive, and it's not useful at all to systemd or containers which really want both broader and more specific control. This first adds support for cBPF filters for opcodes, which enables tighter control over what exactly a specific opcode may do. As examples, specific support is added for IORING_OP_OPENAT/OPENAT2, allowing filtering on resolve flags. And another example is added for IORING_OP_SOCKET, allowing filtering on domain/type/protocol. These are both common use cases. cBPF was chosen rather than eBPF, because the latter is often restricted in containers as well. These filters are run post the init phase of the request, which allows filters to even dip into data that is being passed in struct in user memory, as the init side of requests make that data stable by bringing it into the kernel. This allows filtering without needing to copy this data twice, or have filters etc know about the exact layout of the user data. The filters get the already copied and sanitized data passed. On top of that support is added for per-task filters, meaning that any ring created with a task that has a per-task filter will get those filters applied when it's created. These filters are inherited across fork as well. Once a filter has been registered, any further added filters may only further restrict what operations are permitted. Filters cannot change the return value of an operation, they can only permit or deny it based on the contents" * tag 'io_uring-bpf-restrictions.4-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: io_uring: allow registration of per-task restrictions io_uring: add task fork hook io_uring/bpf_filter: add ref counts to struct io_bpf_filter io_uring/bpf_filter: cache lookup table in ctx->bpf_filters io_uring/bpf_filter: allow filtering on contents of struct open_how io_uring/net: allow filtering on IORING_OP_SOCKET data io_uring: add support for BPF filtering for opcode restrictions
2026-02-09Merge tag 'pull-filename' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs 'struct filename' updates from Al Viro: "[Mostly] sanitize struct filename handling" * tag 'pull-filename' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (68 commits) sysfs(2): fs_index() argument is _not_ a pathname alpha: switch osf_mount() to strndup_user() ksmbd: use CLASS(filename_kernel) mqueue: switch to CLASS(filename) user_statfs(): switch to CLASS(filename) statx: switch to CLASS(filename_maybe_null) quotactl_block(): switch to CLASS(filename) chroot(2): switch to CLASS(filename) move_mount(2): switch to CLASS(filename_maybe_null) namei.c: switch user pathname imports to CLASS(filename{,_flags}) namei.c: convert getname_kernel() callers to CLASS(filename_kernel) do_f{chmod,chown,access}at(): use CLASS(filename_uflags) do_readlinkat(): switch to CLASS(filename_flags) do_sys_truncate(): switch to CLASS(filename) do_utimes_path(): switch to CLASS(filename_uflags) chdir(2): unspaghettify a bit... do_fchownat(): unspaghettify a bit... fspick(2): use CLASS(filename_flags) name_to_handle_at(): use CLASS(filename_uflags) vfs_open_tree(): use CLASS(filename_uflags) ...
2026-02-09tracing: Move d_max_latency out of CONFIG_FSNOTIFY protectionSteven Rostedt
The tracing_max_latency shouldn't be limited if CONFIG_FSNOTIFY is defined or not and it was moved out of that protection to be always available with CONFIG_TRACER_MAX_TRACE. All was moved out except the dentry descriptor for it (d_max_latency) and it failed to build on some configs. Move that out of the CONFIG_FSNOTIFY protection too. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260209194631.788bfc85@fedora Fixes: ba73713da50e ("tracing: Clean up use of trace_create_maxlat_file()") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202602092133.fTdojd95-lkp@intel.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-02-09Merge tag 'vfs-7.0-rc1.misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "This contains a mix of VFS cleanups, performance improvements, API fixes, documentation, and a deprecation notice. Scalability and performance: - Rework pid allocation to only take pidmap_lock once instead of twice during alloc_pid(), improving thread creation/teardown throughput by 10-16% depending on false-sharing luck. Pad the namespace refcount to reduce false-sharing - Track file lock presence via a flag in ->i_opflags instead of reading ->i_flctx, avoiding false-sharing with ->i_readcount on open/close hot paths. Measured 4-16% improvement on 24-core open-in-a-loop benchmarks - Use a consume fence in locks_inode_context() to match the store-release/load-consume idiom, eliminating a hardware fence on some architectures - Annotate cdev_lock with __cacheline_aligned_in_smp to prevent false-sharing - Remove a redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu() that never fires since the caller already verifies it, eliminating a 100% mispredicted branch - Fix a 100% mispredicted likely() in devcgroup_inode_permission() that became wrong after a prior code reorder Bug fixes and correctness: - Make insert_inode_locked() wait for inode destruction instead of skipping, fixing a corner case where two matching inodes could exist in the hash - Move f_mode initialization before file_ref_init() in alloc_file() to respect the SLAB_TYPESAFE_BY_RCU ordering contract - Add a WARN_ON_ONCE guard in try_to_free_buffers() for folios with no buffers attached, preventing a null pointer dereference when AS_RELEASE_ALWAYS is set but no release_folio op exists - Fix select restart_block to store end_time as timespec64, avoiding truncation of tv_sec on 32-bit architectures - Make dump_inode() use get_kernel_nofault() to safely access inode and superblock fields, matching the dump_mapping() pattern API modernization: - Make posix_acl_to_xattr() allocate the buffer internally since every single caller was doing it anyway. Reduces boilerplate and unnecessary error checking across ~15 filesystems - Replace deprecated simple_strtoul() with kstrtoul() for the ihash_entries, dhash_entries, mhash_entries, and mphash_entries boot parameters, adding proper error handling - Convert chardev code to use guard(mutex) and __free(kfree) cleanup patterns - Replace min_t() with min() or umin() in VFS code to avoid silently truncating unsigned long to unsigned int - Gate LOOKUP_RCU assertions behind CONFIG_DEBUG_VFS since callers already check the flag Deprecation: - Begin deprecating legacy BSD process accounting (acct(2)). The interface has numerous footguns and better alternatives exist (eBPF) Documentation: - Fix and complete kernel-doc for struct export_operations, removing duplicated documentation between ReST and source - Fix kernel-doc warnings for __start_dirop() and ilookup5_nowait() Testing: - Add a kunit test for initramfs cpio handling of entries with filesize > PATH_MAX Misc: - Add missing <linux/init_task.h> include in fs_struct.c" * tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits) posix_acl: make posix_acl_to_xattr() alloc the buffer fs: make insert_inode_locked() wait for inode destruction initramfs_test: kunit test for cpio.filesize > PATH_MAX fs: improve dump_inode() to safely access inode fields fs: add <linux/init_task.h> for 'init_fs' docs: exportfs: Use source code struct documentation fs: move initializing f_mode before file_ref_init() exportfs: Complete kernel-doc for struct export_operations exportfs: Mark struct export_operations functions at kernel-doc exportfs: Fix kernel-doc output for get_name() acct(2): begin the deprecation of legacy BSD process accounting device_cgroup: remove branch hint after code refactor VFS: fix __start_dirop() kernel-doc warnings fs: Describe @isnew parameter in ilookup5_nowait() fs/namei: Remove redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu fs: only assert on LOOKUP_RCU when built with CONFIG_DEBUG_VFS select: store end_time as timespec64 in restart block chardev: Switch to guard(mutex) and __free(kfree) namespace: Replace simple_strtoul with kstrtoul to parse boot params dcache: Replace simple_strtoul with kstrtoul in set_dhash_entries ...
2026-02-09Merge tag 'lsm-pr-20260203' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: - Unify the security_inode_listsecurity() calls in NFSv4 While looking at security_inode_listsecurity() with an eye towards improving the interface, we realized that the NFSv4 code was making multiple calls to the LSM hook that could be consolidated into one. - Mark the LSM static branch keys as static - this helps resolve some sparse warnings - Add __rust_helper annotations to the LSM and cred wrapper functions - Remove the unsused set_security_override_from_ctx() function - Minor fixes to some of the LSM kdoc comment blocks * tag 'lsm-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lsm: make keys for static branch static cred: remove unused set_security_override_from_ctx() rust: security: add __rust_helper to helpers rust: cred: add __rust_helper to helpers nfs: unify security_inode_listsecurity() calls lsm: fix kernel-doc struct member names
2026-02-09Merge tag 'audit-pr-20260203' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: - Improve the NETFILTER_PKT audit records Add source and destination ports to the NETFILTER_PKT audit records while also consolidating a lot of the code into a new, singular audit_log_nf_skb() function. This new approach to structuring the NETFILTER_PKT record generation should eliminate some unnecessary overhead when audit is not built into the kernel. - Update the audit syscall classifier code Add the listxattrat(), getxattrat(), and fchmodat2() syscall to the audit code which classifies syscalls into categories of operations, e.g. "read" or "change attributes". - Move the syscall classifier declarations into audit_arch.h Shuffle around some header file declarations to resolve some sparse warnings. * tag 'audit-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: move the compat_xxx_class[] extern declarations to audit_arch.h audit: add missing syscalls to read class audit: include source and destination ports to NETFILTER_PKT audit: add audit_log_nf_skb helper function audit: add fchmodat2() to change attributes class
2026-02-09Merge tag 'rcu.release.v7.0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux Pull RCU updates from Boqun Feng: - RCU Tasks Trace: Re-implement RCU tasks trace in term of SRCU-fast, not only more than 500 lines of code are saved because of the reimplementation, a new set of API, rcu_read_{,un}lock_tasks_trace(), becomes possible as well. Compared to the previous rcu_read_{,un}lock_trace(), the new API avoid the task_struct accesses thanks to the SRCU-fast semantics. As a result, the old rcu_read{,un}lock_trace() API is now deprecated. - RCU Torture Test: - Multiple improvements on kvm-series.sh (parallel run and progress showing metrics) - Add context checks to rcu_torture_timer() - Make config2csv.sh properly handle comments in .boot files - Include commit discription in testid.txt - Miscellaneous RCU changes: - Reduce synchronize_rcu() latency by reporting GP kthread's CPU QS early - Use suitable gfp_flags for the init_srcu_struct_nodes() - Fix rcu_read_unlock() deadloop due to softirq - Correctly compute probability to invoke ->exp_current() in rcutorture - Make expedited RCU CPU stall warnings detect stall-end races - RCU nocb: - Remove unnecessary WakeOvfIsDeferred wake path and callback overload handling - Extract nocb_defer_wakeup_cancel() helper * tag 'rcu.release.v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (25 commits) rcu/nocb: Extract nocb_defer_wakeup_cancel() helper rcu/nocb: Remove dead callback overload handling rcu/nocb: Remove unnecessary WakeOvfIsDeferred wake path rcu: Reduce synchronize_rcu() latency by reporting GP kthread's CPU QS early srcu: Use suitable gfp_flags for the init_srcu_struct_nodes() rcu: Fix rcu_read_unlock() deadloop due to softirq rcutorture: Correctly compute probability to invoke ->exp_current() rcu: Make expedited RCU CPU stall warnings detect stall-end races rcutorture: Add --kill-previous option to terminate previous kvm.sh runs rcutorture: Prevent concurrent kvm.sh runs on same source tree torture: Include commit discription in testid.txt torture: Make config2csv.sh properly handle comments in .boot files torture: Make kvm-series.sh give run numbers and totals torture: Make kvm-series.sh give build numbers and totals torture: Parallelize kvm-series.sh guest-OS execution rcutorture: Add context checks to rcu_torture_timer() rcutorture: Test rcu_tasks_trace_expedite_current() srcu: Create an rcu_tasks_trace_expedite_current() function checkpatch: Deprecate rcu_read_{,un}lock_trace() rcu: Update Requirements.rst for RCU Tasks Trace ...
2026-02-08tracing: Better separate SNAPSHOT and MAX_TRACE optionsSteven Rostedt
The latency tracers (scheduler, irqsoff, etc) were created when tracing was first added. These tracers required a "snapshot" buffer that was the same size as the ring buffer being written to. When a new max latency was hit, the main ring buffer would swap with the snapshot buffer so that the trace leading up to the latency would be saved in the snapshot buffer (The snapshot buffer is never written to directly and the data within it can be viewed without fear of being overwritten). Later, a new feature was added to allow snapshots to be taken by user space or even event triggers. This created a "snapshot" file that allowed users to trigger a snapshot from user space to save the current trace. The config for this new feature (CONFIG_TRACER_SNAPSHOT) would select the latency tracer config (CONFIG_TRACER_MAX_LATENCY) as it would need all the functionality from it as it already existed. But this was incorrect. As the snapshot feature is really what the latency tracers need and not the other way around. Have CONFIG_TRACER_MAX_TRACE select CONFIG_TRACER_SNAPSHOT where the tracers that needs the max latency buffer selects the TRACE_MAX_TRACE which will then select TRACER_SNAPSHOT. Also, go through trace.c and trace.h and make the code that only needs the TRACER_MAX_TRACE protected by that and the code that always requires the snapshot to be protected by TRACER_SNAPSHOT. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://patch.msgid.link/20260208183856.767870992@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-02-08tracing: Add tracer_uses_snapshot() helper to remove #ifdefsSteven Rostedt
Instead of having #ifdef CONFIG_TRACER_MAX_TRACE around every access to the struct tracer's use_max_tr field, add a helper function for that access and if CONFIG_TRACER_MAX_TRACE is not configured it just returns false. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://patch.msgid.link/20260208183856.599390238@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-02-08tracing: Rename trace_array field max_buffer to snapshot_bufferSteven Rostedt
When tracing was first added, there were latency tracers that would take a snapshot of the current trace when a new max latency was hit. This snapshot buffer was called "max_buffer". Since then, a snapshot feature was added that allowed user space or event triggers to trigger a snapshot of the current buffer using the same max_buffer of the trace_array. As this snapshot buffer now has a more generic use case, calling it "max_buffer" is confusing. Rename it to snapshot_buffer. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://patch.msgid.link/20260208183856.428446729@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-02-08tracing: Move pid filtering into trace_pid.cSteven Rostedt
The trace.c file was a dumping ground for most tracing code. Start organizing it better by moving various functions out into their own files. Move the PID filtering functions from trace.c into its own trace_pid.c file. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://patch.msgid.link/20260208032450.998330662@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-02-08tracing: Move trace_printk functions out of trace.c and into trace_printk.cSteven Rostedt
The file trace.c has become a catchall for most things tracing. Start making it smaller by breaking out various aspects into their own files. Move the functions associated to the trace_printk operations out of trace.c and into trace_printk.c. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://patch.msgid.link/20260208032450.828744197@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-02-08tracing: Use system_state in trace_printk_init_buffers()Steven Rostedt
The function trace_printk_init_buffers() is used to expand tha trace_printk buffers when trace_printk() is used within the kernel or in modules. On kernel boot up, it holds off from starting the sched switch cmdline recorder, but will start it immediately when it is added by a module. Currently it uses a trick to see if the global_trace buffer has been allocated or not to know if it was called by module load or not. But this is more of a hack, and can not be used when this code is moved out of trace.c. Instead simply look at the system_state and if it is running then it is know that it could only be called by module load. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://patch.msgid.link/20260208032450.660237094@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>