summaryrefslogtreecommitdiff
path: root/kernel/vmcore_info.c
AgeCommit message (Collapse)Author
2026-02-12Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves disk space by teaching ocfs2 to reclaim suballocator block group space (Heming Zhao) - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the ARRAY_END() macro and uses it in various places (Alejandro Colomar) - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size (Pnina Feder) - "kallsyms: Prevent invalid access when showing module buildid" cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces (Petr Mladek) - "Address page fault in ima_restore_measurement_list()" fixes a kexec-related crash that can occur when booting the second-stage kernel on x86 (Harshit Mogalapalli) - "kho: ABI headers and Documentation updates" updates the kexec handover ABI documentation (Mike Rapoport) - "Align atomic storage" adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain) - "kho: clean up page initialization logic" simplifies the page initialization logic in kho_restore_page() (Pratyush Yadav) - "Unload linux/kernel.h" moves several things out of kernel.h and into more appropriate places (Yury Norov) - "don't abuse task_struct.group_leader" removes the usage of ->group_leader when it is "obviously unnecessary" (Oleg Nesterov) - "list private v2 & luo flb" adds some infrastructure improvements to the live update orchestrator (Pasha Tatashin) * tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits) watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency procfs: fix missing RCU protection when reading real_parent in do_task_stat() watchdog/softlockup: fix sample ring index wrap in need_counting_irqs() kcsan, compiler_types: avoid duplicate type issues in BPF Type Format kho: fix doc for kho_restore_pages() tests/liveupdate: add in-kernel liveupdate test liveupdate: luo_flb: introduce File-Lifecycle-Bound global state liveupdate: luo_file: Use private list list: add kunit test for private list primitives list: add primitives for private list manipulations delayacct: fix uapi timespec64 definition panic: add panic_force_cpu= parameter to redirect panic to a specific CPU netclassid: use thread_group_leader(p) in update_classid_task() RDMA/umem: don't abuse current->group_leader drm/pan*: don't abuse current->group_leader drm/amd: kill the outdated "Only the pthreads threading model is supported" checks drm/amdgpu: don't abuse current->group_leader android/binder: use same_thread_group(proc->tsk, current) in binder_mmap() android/binder: don't abuse current->group_leader kho: skip memoryless NUMA nodes when reserving scratch areas ...
2026-02-11Merge tag 'kbuild-7.0-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux Pull Kbuild/Kconfig updates from Nathan Chancellor: "Kbuild: - Drop '*_probe' pattern from modpost section check allowlist, which hid legitimate warnings (Johan Hovold) - Disable -Wtype-limits altogether, instead of enabling at W=2 (Vincent Mailhol) - Improve UAPI testing to skip testing headers that require a libc when CONFIG_CC_CAN_LINK is not set, opening up testing of headers with no libc dependencies to more environments (Thomas Weißschuh) - Update gendwarfksyms documentation with required dependencies (Jihan LIN) - Reject invalid LLVM= values to avoid unintentionally falling back to system toolchain (Thomas Weißschuh) - Add a script to help run the kernel build process in a container for consistent environments and testing (Guillaume Tucker) - Simplify kallsyms by getting rid of the relative base (Ard Biesheuvel) - Performance and usability improvements to scripts/make_fit.py (Simon Glass) - Minor various clean ups and fixes Kconfig: - Move XPM icons to individual files, clearing up GTK deprecation warnings (Rostislav Krasny) - Support depends on FOO if BAR as syntactic sugar for depends on FOO || !BAR (Nicolas Pitre, Graham Roff) - Refactor merge_config.sh to use awk over shell/sed/grep, dramatically speeding up processing large number of config fragments (Anders Roxell, Mikko Rapeli)" * tag 'kbuild-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (39 commits) kbuild: remove dependency of run-command on config scripts/make_fit: Compress dtbs in parallel scripts/make_fit: Support a few more parallel compressors kbuild: Support a FIT_EXTRA_ARGS environment variable scripts/make_fit: Move dtb processing into a function scripts/make_fit: Support an initial ramdisk scripts/make_fit: Speed up operation rust: kconfig: Don't require RUST_IS_AVAILABLE for rustc-option MAINTAINERS: Add scripts/install.sh into Kbuild entry modpost: Amend ppc64 save/restfpr symnames for -Os build MIPS: tools: relocs: Ship a definition of R_MIPS_PC32 streamline_config.pl: remove superfluous exclamation mark kbuild: dummy-tools: Add python3 scripts: kconfig: merge_config.sh: warn on duplicate input files scripts: kconfig: merge_config.sh: use awk in checks too scripts: kconfig: merge_config.sh: refactor from shell/sed/grep to awk kallsyms: Get rid of kallsyms relative base mips: Add support for PC32 relocations in vmlinux Documentation: dev-tools: add container.rst page scripts: add tool to run containerized builds ...
2026-01-31Merge branch 'mm-hotfixes-stable' into mm-nonmm-stable to pick up changesAndrew Morton
required to merge "kho: use unsigned long for nr_pages".
2026-01-26vmcoreinfo: make hwerr_data visible for debuggingBreno Leitao
If the kernel is compiled with LTO, hwerr_data symbol might be lost, and vmcoreinfo doesn't have it dumped. This is currently seen in some production kernels with LTO enabled. Remove the static qualifier from hwerr_data so that the information is still preserved when the kernel is built with LTO. Making hwerr_data a global symbol ensures its debug info survives the LTO link process and appears in kallsyms. Also document it, so it doesn't get removed in the future as suggested by akpm. Link: https://lkml.kernel.org/r/20260122-fix_vmcoreinfo-v2-1-2d6311f9e36c@debian.org Fixes: 3fa805c37dd4 ("vmcoreinfo: track and log recoverable hardware errors") Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Omar Sandoval <osandov@osandov.com> Cc: Shuai Xue <xueshuai@linux.alibaba.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Zhiquan Li <zhiquan1.li@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-22kallsyms: Get rid of kallsyms relative baseArd Biesheuvel
When the kallsyms relative base was introduced, per-CPU variable references on x86_64 SMP were implemented as offsets into the respective per-CPU region, rather than offsets relative to the location of the variable's template in the kernel image, which is how other architectures implement it. This required kallsyms to reason about the difference between the two, and the sign of the value in the kallsyms_offsets[] array was used to distinguish them. This meant that negative offsets were not permitted for ordinary variables, and so it was crucial that the relative base was chosen such that all offsets were positive numbers. This is no longer needed: instead, the offsets can simply be encoded as values in the range -/+ 2 GiB, which is precisely what PC32 relocations provide on most architectures. So it is possible to simplify the logic, and just use _text as the anchor directly, and let the linker calculate the final value based on the location of the entry itself. Some architectures (nios2, extensa) do not support place-relative relocations at all, but these are all 32-bit and non-relocatable, and so there is no need for place-relative relocations in the first place, and the actual symbol values can just be stored directly. This makes all entries in the kallsyms_offsets[] array visible as place-relative references in the ELF metadata, which will be important when implementing ELF-based fg-kaslr. Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://patch.msgid.link/20260116093359.2442297-6-ardb+git@google.com Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2026-01-20kernel: vmcoreinfo: allocate vmcoreinfo_data based on VMCOREINFO_BYTESPnina Feder
Patch series "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE". VMCOREINFO_BYTES is defined as a configurable size, but multiple code paths implicitly assume it always fits into a single page. This series removes that assumption by allocating and mapping vmcoreinfo based on its actual size. Patch 1 updates vmcoreinfo allocation to use get_order(VMCOREINFO_BYTES). Patch 2 updates crash kernel handling to correctly allocate and map multiple pages when copying vmcoreinfo. This makes vmcoreinfo size consistent across the kernel and avoids future breakage if VMCOREINFO_BYTES grows. (No functional change when VMCOREINFO_BYTES == PAGE_SIZE.) This patch (of 2): VMCOREINFO_BYTES defines the size of vmcoreinfo data, but the current implementation assumes a single page allocation. Allocate vmcoreinfo_data using get_order(VMCOREINFO_BYTES) so that vmcoreinfo can safely grow beyond PAGE_SIZE. This avoids hidden assumptions and keeps vmcoreinfo size consistent across the kernel. Link: https://lkml.kernel.org/r/20251216132801.807260-1-pnina.feder@mobileye.com Link: https://lkml.kernel.org/r/20251216132801.807260-2-pnina.feder@mobileye.com Signed-off-by: Pnina Feder <pnina.feder@mobileye.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-27vmcoreinfo: track and log recoverable hardware errorsBreno Leitao
Introduce a generic infrastructure for tracking recoverable hardware errors (HW errors that are visible to the OS but does not cause a panic) and record them for vmcore consumption. This aids post-mortem crash analysis tools by preserving a count and timestamp for the last occurrence of such errors. On the other side, correctable errors, which the OS typically remains unaware of because the underlying hardware handles them transparently, are less relevant for crash dump and therefore are NOT tracked in this infrastructure. Add centralized logging for sources of recoverable hardware errors based on the subsystem it has been notified. hwerror_data is write-only at kernel runtime, and it is meant to be read from vmcore using tools like crash/drgn. For example, this is how it looks like when opening the crashdump from drgn. >>> prog['hwerror_data'] (struct hwerror_info[1]){ { .count = (int)844, .timestamp = (time64_t)1752852018, }, ... This helps fleet operators quickly triage whether a crash may be influenced by hardware recoverable errors (which executes a uncommon code path in the kernel), especially when recoverable errors occurred shortly before a panic, such as the bug fixed by commit ee62ce7a1d90 ("page_pool: Track DMA-mapped pages and unmap them when destroying the pool") This is not intended to replace full hardware diagnostics but provides a fast way to correlate hardware events with kernel panics quickly. Rare machine check exceptions—like those indicated by mce_flags.p5 or mce_flags.winchip—are not accounted for in this method, as they fall outside the intended usage scope for this feature's user base. [leitao@debian.org: add hw-recoverable-errors to toctree] Link: https://lkml.kernel.org/r/20251127-vmcoreinfo_fix-v1-1-26f5b1c43da9@debian.org Link: https://lkml.kernel.org/r/20251010-vmcore_hw_error-v5-1-636ede3efe44@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Suggested-by: Tony Luck <tony.luck@intel.com> Suggested-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> [APEI] Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Bob Moore <robert.moore@intel.com> Cc: Borislav Betkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morse <james.morse@arm.com> Cc: Konrad Rzessutek Wilk <konrad.wilk@oracle.com> Cc: Len Brown <lenb@kernel.org> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Omar Sandoval <osandov@osandov.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11crash: export PAGE_UNACCEPTED_MAPCOUNT_VALUE to vmcoreinfoZhiquan Li
On Intel TDX guest, unaccepted memory is unusable free memory which is not managed by buddy, until it's accepted by guest. Before that, it cannot be accessed by the first kernel as well as the kexec'ed kernel. The kexec'ed kernel will skip these pages and fill in zero data for the reader of vmcore. The dump tool like makedumpfile creates a page descriptor (size 24 bytes) for each non-free page, including zero data page, but it will not create descriptor for free pages. If it is not able to distinguish these unaccepted pages with zero data pages, a certain amount of space will be wasted in proportion (~1/170). In fact, as a special kind of free page the unaccepted pages should be excluded, like the real free pages. Export the page type PAGE_UNACCEPTED_MAPCOUNT_VALUE to vmcoreinfo, so that dump tool can identify whether a page is unaccepted. [zhiquan1.li@intel.com: fix docs: "Title underline too short" warning] Link: https://lore.kernel.org/all/20240809114854.3745464-5-kirill.shutemov@linux.intel.com/ Link: https://lkml.kernel.org/r/20250405060610.860465-1-zhiquan1.li@intel.com Link: https://lore.kernel.org/all/20240809114854.3745464-5-kirill.shutemov@linux.intel.com/ Link: https://lkml.kernel.org/r/20250403030801.758687-1-zhiquan1.li@intel.com Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Zhiquan Li <zhiquan1.li@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-03mm: support only one page_type per pageMatthew Wilcox (Oracle)
By using a few values in the top byte, users of page_type can store up to 24 bits of additional data in page_type. It also reduces the code size as (with replacement of READ_ONCE() with data_race()), the kernel can check just a single byte. eg: ffffffff811e3a79: 8b 47 30 mov 0x30(%rdi),%eax ffffffff811e3a7c: 55 push %rbp ffffffff811e3a7d: 48 89 e5 mov %rsp,%rbp ffffffff811e3a80: 25 00 00 00 82 and $0x82000000,%eax ffffffff811e3a85: 3d 00 00 00 80 cmp $0x80000000,%eax ffffffff811e3a8a: 74 4d je ffffffff811e3ad9 <folio_mapping+0x69> becomes: ffffffff811e3a69: 80 7f 33 f5 cmpb $0xf5,0x33(%rdi) ffffffff811e3a6d: 55 push %rbp ffffffff811e3a6e: 48 89 e5 mov %rsp,%rbp ffffffff811e3a71: 74 4d je ffffffff811e3ac0 <folio_mapping+0x60> replacing three instructions with one. [wangkefeng.wang@huawei.com: fix ubsan warnings] Link: https://lkml.kernel.org/r/2d19c48a-c550-4345-bf36-d05cd303c5de@huawei.com Link: https://lkml.kernel.org/r/20240821173914.2270383-4-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: David Hildenbrand <david@redhat.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-20kallsyms: get rid of code for absolute kallsymsJann Horn
Commit cf8e8658100d ("arch: Remove Itanium (IA-64) architecture") removed the last use of the absolute kallsyms. Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/all/20240221202655.2423854-1-jannh@google.com/ [masahiroy@kernel.org: rebase the code and reword the commit description] Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-04-25mm: free up PG_slabMatthew Wilcox (Oracle)
Reclaim the Slab page flag by using a spare bit in PageType. We are perennially short of page flags for various purposes, and now that the original SLAB allocator has been retired, SLUB does not use the mapcount/page_type field. This lets us remove a number of special cases for ignoring mapcount on Slab pages. [willy@infradead.org: update vmcoreinfo] Link: https://lkml.kernel.org/r/ZgGV-O8WYQ_83kxp@casper.infradead.org Link: https://lkml.kernel.org/r/20240321142448.1645400-8-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: David Hildenbrand <david@redhat.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-24mm: turn folio_test_hugetlb into a PageTypeMatthew Wilcox (Oracle)
The current folio_test_hugetlb() can be fooled by a concurrent folio split into returning true for a folio which has never belonged to hugetlbfs. This can't happen if the caller holds a refcount on it, but we have a few places (memory-failure, compaction, procfs) which do not and should not take a speculative reference. Since hugetlb pages do not use individual page mapcounts (they are always fully mapped and use the entire_mapcount field to record the number of mappings), the PageType field is available now that page_mapcount() ignores the value in this field. In compaction and with CONFIG_DEBUG_VM enabled, the current implementation can result in an oops, as reported by Luis. This happens since 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR") effectively added some VM_BUG_ON() checks in the PageHuge() testing path. [willy@infradead.org: update vmcoreinfo] Link: https://lkml.kernel.org/r/ZgGZUvsdhaT1Va-T@casper.infradead.org Link: https://lkml.kernel.org/r/20240321142448.1645400-6-willy@infradead.org Fixes: 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Luis Chamberlain <mcgrof@kernel.org> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218227 Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04crash_core: export vmemmap when CONFIG_SPARSEMEM_VMEMMAP is enabledHuang Shijie
In memory_model.h, if CONFIG_SPARSEMEM_VMEMMAP is configed, kernel will use vmemmap to do the __pfn_to_page/page_to_pfn, and kernel will not use the "classic sparse" to do the __pfn_to_page/page_to_pfn. So export the vmemmap when CONFIG_SPARSEMEM_VMEMMAP is configed. This makes the user applications (crash, etc) get faster pfn_to_page/page_to_pfn operations too. Link: https://lkml.kernel.org/r/20240227014952.3184-1-shijie@os.amperecomputing.com Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Kazuhito Hagio <k-hagio-ab@nec.com> Cc: Lianbo Jiang <lijiang@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-23crash: split vmcoreinfo exporting code out from crash_core.cBaoquan He
Now move the relevant codes into separate files: kernel/crash_reserve.c, include/linux/crash_reserve.h. And add config item CRASH_RESERVE to control its enabling. And also update the old ifdeffery of CONFIG_CRASH_CORE, including of <linux/crash_core.h> and config item dependency on CRASH_CORE accordingly. And also do renaming as follows: - arch/xxx/kernel/{crash_core.c => vmcore_info.c} because they are only related to vmcoreinfo exporting on x86, arm64, riscv. And also Remove config item CRASH_CORE, and rely on CONFIG_KEXEC_CORE to decide if build in crash_core.c. [yang.lee@linux.alibaba.com: remove duplicated include in vmcore_info.c] Link: https://lkml.kernel.org/r/20240126005744.16561-1-yang.lee@linux.alibaba.com Link: https://lkml.kernel.org/r/20240124051254.67105-3-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Acked-by: Hari Bathini <hbathini@linux.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Pingfan Liu <piliu@redhat.com> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Michael Kelley <mhklinux@outlook.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>