summaryrefslogtreecommitdiff
path: root/tools/perf/util/header.c
AgeCommit message (Collapse)Author
2026-02-03perf header: Add e_machine/e_flags to the headerIan Rogers
Add 64-bits of feature data to record the ELF machine and flags. This allows readers to initialize based on the data. For example, `perf kvm stat` wants to initialize based on the kind of data to be read, but at initialization time there are no threads to base this data upon and using the host means cross platform support won't work. The values in the perf_env also act as a cache for these within the session. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Anup Patel <anup@brainfault.org> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quan Zhou <zhouquan@iscas.ac.cn> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-28perf header: Replace hardcoded max cpus by MAX_NR_CPUSSwapnil Sapkal
cpumask and cpulist from cpu-domain header have hardcoded max_cpus value of 1024. Current systems have more cpus than this value. Replace it with MAX_NR_CPUS. Also define a macro to represent domain name length. Fixes: d40c68a49f69c9bd ("perf header: Support CPU DOMAIN relation info") Reported-by: Shrikanth Hegde <sshegde@linux.ibm.com> Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Chen Yu <yu.c.chen@intel.com> Cc: Gautham Shenoy <gautham.shenoy@amd.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-23perf header: Fix memory leaks in process_cpu_domain_info()Ian Rogers
do_read_string() returns a string in allocated memory, for some reason there was unused memory allocations and unnecessary strdups. Remove these and make the "perf annotate basic tests" leak sanitizer clean. Fixes: d40c68a49f69c9bd ("perf header: Support CPU DOMAIN relation info") Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Guo Ren <guoren@kernel.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Julia Lawall <Julia.Lawall@inria.fr> Cc: Justin Stitt <justinstitt@google.com> Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-csky@vger.kernel.org Cc: linux-riscv@lists.infradead.org Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sergei Trofimovich <slyich@gmail.com> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Suchit Karunakaran <suchitkarunakaran@gmail.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Tianyou Li <tianyou.li@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Zecheng Li <zecheng@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-22perf sched stats: Add support for live modeSwapnil Sapkal
The live mode works similar to simple `perf stat` command, by profiling the target and printing results on the terminal as soon as the target finishes. Example usage: # perf sched stats -- true Description ---------------------------------------------------------------------------------------------------- DESC -> Description of the field COUNT -> Value of the field PCT_CHANGE -> Percent change with corresponding base value AVG_JIFFIES -> Avg time in jiffies between two consecutive occurrence of event ---------------------------------------------------------------------------------------------------- Time elapsed (in jiffies) : 1 ---------------------------------------------------------------------------------------------------- CPU: <ALL CPUS SUMMARY> ---------------------------------------------------------------------------------------------------- DESC COUNT PCT_CHANGE ---------------------------------------------------------------------------------------------------- yld_count : 0 array_exp : 0 sched_count : 0 sched_goidle : 0 ( 0.00% ) ttwu_count : 0 ttwu_local : 0 ( 0.00% ) rq_cpu_time : 27875 run_delay : 0 ( 0.00% ) pcount : 0 ---------------------------------------------------------------------------------------------------- CPU: <ALL CPUS SUMMARY> | DOMAIN: SMT ---------------------------------------------------------------------------------------------------- DESC COUNT AVG_JIFFIES ----------------------------------------- <Category busy> ------------------------------------------ busy_lb_count : 0 $ 0.00 $ busy_lb_balanced : 0 $ 0.00 $ busy_lb_failed : 0 $ 0.00 $ busy_lb_imbalance_load : 0 busy_lb_imbalance_util : 0 busy_lb_imbalance_task : 0 busy_lb_imbalance_misfit : 0 busy_lb_gained : 0 busy_lb_hot_gained : 0 busy_lb_nobusyq : 0 $ 0.00 $ busy_lb_nobusyg : 0 $ 0.00 $ *busy_lb_success_count : 0 *busy_lb_avg_pulled : 0.00 ... and so on. Output will show similar data for all the cpus in the system. Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com> Tested-by: Chen Yu <yu.c.chen@intel.com> Tested-by: James Clark <james.clark@linaro.org> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: David Vernet <void@manifault.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Gautham Shenoy <gautham.shenoy@amd.com> Cc: Graham Woodward <graham.woodward@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Madadi Vineeth Reddy <vineethr@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Shrikanth Hegde <sshegde@linux.ibm.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Yujie Liu <yujie.liu@intel.com> Cc: Zhongqiu Han <quic_zhonhan@quicinc.com> [ Avoid potentially using 'sv' uninitialized by calling free_cpu_domain_info() only when build_cpu_domain_map() is called ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-21perf header: Support CPU DOMAIN relation infoSwapnil Sapkal
The '/proc/schedstat' file gives info about load balancing statistics within a given domain. It also contains the cpu_mask giving information about the sibling cpus and domain names after schedstat version 17. Storing this information in perf header will help tools like `perf sched stats` for better analysis. Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com> Tested-by: Chen Yu <yu.c.chen@intel.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: David Vernet <void@manifault.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Gautham Shenoy <gautham.shenoy@amd.com> Cc: Graham Woodward <graham.woodward@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Madadi Vineeth Reddy <vineethr@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Shrikanth Hegde <sshegde@linux.ibm.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Yujie Liu <yujie.liu@intel.com> Cc: Zhongqiu Han <quic_zhonhan@quicinc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-12-07Merge tag 'perf-tools-for-v6.19-2025-12-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Namhyung Kim: "Perf event/metric description: Unify all event and metric descriptions in JSON format. Now event parsing and handling is greatly simplified by that. From users point of view, perf list will provide richer information about hardware events like the following. $ perf list hw List of pre-defined events (to be used in -e or -M): legacy hardware: branch-instructions [Retired branch instructions [This event is an alias of branches]. Unit: cpu] branch-misses [Mispredicted branch instructions. Unit: cpu] branches [Retired branch instructions [This event is an alias of branch-instructions]. Unit: cpu] bus-cycles [Bus cycles,which can be different from total cycles. Unit: cpu] cache-misses [Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates. Unit: cpu] cache-references [Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include prefetches and coherency messages; again this depends on the design of your CPU. Unit: cpu] cpu-cycles [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cycles]. Unit: cpu] cycles [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cpu-cycles]. Unit: cpu] instructions [Retired instructions. Be careful,these can be affected by various issues,most notably hardware interrupt counts. Unit: cpu] ref-cycles [Total cycles; not affected by CPU frequency scaling. Unit: cpu] But most notable changes would be in the perf stat. On the right side, the default metrics are better named and aligned. :) $ perf stat -- perf test -w noploop Performance counter stats for 'perf test -w noploop': 11 context-switches # 10.8 cs/sec cs_per_second 0 cpu-migrations # 0.0 migrations/sec migrations_per_second 3,612 page-faults # 3532.5 faults/sec page_faults_per_second 1,022.51 msec task-clock # 1.0 CPUs CPUs_utilized 110,466 branch-misses # 0.0 % branch_miss_rate (88.66%) 6,934,452,104 branches # 6781.8 M/sec branch_frequency (88.66%) 4,657,032,590 cpu-cycles # 4.6 GHz cycles_frequency (88.65%) 27,755,874,218 instructions # 6.0 instructions insn_per_cycle (89.03%) TopdownL1 # 0.3 % tma_backend_bound # 9.3 % tma_bad_speculation (89.05%) # 9.7 % tma_frontend_bound (77.86%) # 80.7 % tma_retiring (88.81%) 1.025318171 seconds time elapsed 1.013248000 seconds user 0.012014000 seconds sys Deferred unwinding support: With the kernel support (commit c69993ecdd4d: "perf: Support deferred user unwind"), perf can use deferred callchains for userspace stack trace with frame pointers like below: $ perf record --call-graph fp,defer ... This will be transparent to users when it comes to other commands like perf report and perf script. They will merge the deferred callchains to the previous samples as if they were collected together. ARM SPE updates - Extensive enhancements to support various kinds of memory operations including GCS, MTE allocation tags, memcpy/memset, register access, and SIMD operations. - Add inverted data source filter (inv_data_src_filter) support to exclude certain data sources. - Improve documentation. Vendor event updates: - Intel: Updated event files for Sierra Forest, Panther Lake, Meteor Lake, Lunar Lake, Granite Rapids, and others. - Arm64: Added metrics for i.MX94 DDR PMU and Cortex-A720AE definitions. - RISC-V: Added JSON support for T-HEAD C920V2. Misc: - Improve pointer tracking in data type profiling. It'd give better output when the variable is using container_of() to convert type. - Annotation support for perf c2c report in TUI. Press 'a' key to enter annotation view from cacheline browser window. This will show which instruction is causing the cacheline contention. - Lots of fixes and test coverage improvements!" * tag 'perf-tools-for-v6.19-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (214 commits) libperf: Use 'extern' in LIBPERF_API visibility macro perf stat: Improve handling of termination by signal perf tests stat: Add test for error for an offline CPU perf stat: When no events, don't report an error if there is none perf tests stat: Add "--null" coverage perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map perf stat: Allow no events to open if this is a "--null" run perf test kvm: Add some basic perf kvm test coverage perf tests evlist: Add basic evlist test perf tests script dlfilter: Add a dlfilter test perf tests kallsyms: Add basic kallsyms test perf tests timechart: Add a perf timechart test perf tests top: Add basic perf top coverage test perf tests buildid: Add purge and remove testing perf tests c2c: Add a basic c2c perf c2c: Clean up some defensive gets and make asan clean perf jitdump: Fix missed dso__put perf mem-events: Don't leak online CPU map perf hist: In init, ensure mem_info is put on error paths ...
2025-11-19perf header: Switch "cpu" for find_core_pmu in caps feature writingIan Rogers
Writing currently fails on non-x86 and hybrid CPUs. Switch to the more regular find_core_pmu that is normally used in this case. Tested on hybrid alderlake system. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-13perf header: Write bpf_prog (infos|btfs)_cnt to data fileThomas Falcon
With commit f0d0f978f3f5830a ("perf header: Don't write empty BPF/BTF info"), the write_bpf_( prog_info() | btf() ) functions exit without writing anything if env->bpf_prog.(infos| btfs)_cnt is zero. process_bpf_( prog_info() | btf() ), however, still expect a "count" value to exist in the data file. If btf information is empty, for example, process_bpf_btf will read garbage or some other data as the number of btf nodes in the data file. As a result, the data file will not be processed correctly. Instead, write the count to the data file and exit if it is zero. Fixes: f0d0f978f3f5830a ("perf header: Don't write empty BPF/BTF info") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-07perf tool: Add the perf_tool argument to all callbacksIan Rogers
Getting context for what a tool is doing, such as the perf_inject instance, using container_of the tool is a common pattern in the code. This isn't possible event_op2, event_op3 and event_op4 callbacks as the tool isn't passed. Add the argument and then fix function signatures to match. As tools maybe reading a tool from somewhere else, change that code to use the passed in tool. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-06perf record: Make sure to update build-ID cacheNamhyung Kim
Recent change on enabling --buildid-mmap by default brought an issue with build-id handling. With build-ID in MMAP2 records, we don't need to save the build-ID table in the header of a perf data file. But the actual file contents still need to be cached in the debug directory for annotation etc. Split the build-ID header processing and caching and make sure perf record to save hit DSOs in the build-ID cache by moving perf_session__cache_build_ids() to the end of the record__ finish_output(). Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-25perf header: Clean up use of perf_envIan Rogers
Always use the perf_env from the feat_fd's perf_header. Cache the value on entry to a function in `env` and use `env->` consistently in the code. Ensure the header is initialized for use in perf_session__do_write_header. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-12-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-25perf evlist: Change env variable to sessionIan Rogers
The session holds a perf_env pointer env. In UI code container_of is used to turn the env to a session, but this assumes the session header's env is in use. Rather than a dubious container_of, hold the session in the evlist and derive the env from the session with evsel__env, perf_session__env, etc. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-25perf build-id: Change sprintf functions to snprintfIan Rogers
Pass in a size argument rather than implying all build id strings must be SBUILD_ID_SIZE. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-4-irogers@google.com [ fixed some build errors ] Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-24libperf evsel: Rename own_cpus to pmu_cpusIan Rogers
own_cpus is generally the cpumask from the PMU. Rename to pmu_cpus to try to make this clearer. Variable rename with no other changes. Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250719030517.1990983-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-03perf header: Fix pipe mode header dumpingIan Rogers
The pipe mode header dumping was accidentally removed when tracing of header feature events in pipe mode was added. Minor spelling tweak to header test failure message. Fixes: 61051f9a8452 ("perf header: In pipe mode dump features without --header/-I") Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250703042000.2740640-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25perf header: Don't write empty BPF/BTF infoIan Rogers
If there are no values in bpf_prog_info or bpf_btf feature don't write the data into the header. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25perf header: Display message if BPF/BTF info is emptyIan Rogers
The perf.data file may contain a bpf_prog_info or bpf_btf feature. If the contents of these are empty then nothing is displayed. Rather than display nothing and not account for the file space, display an empty message. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25perf header: Allow tracing of attr eventsIan Rogers
In pipe mode attr events capture the perf_event_attr. Allow their dumping as they normally start the file. Before: ``` $ perf record -o - -a sleep 1 | perf script -D -i - . ... raw event: size 272 bytes . 0000: 40 00 00 00 00 00 10 01 00 00 00 00 88 00 00 00 @............... . 0010: 00 00 00 00 00 00 00 00 a0 0f 00 00 00 00 00 00 ................ . 0020: 87 01 01 00 00 00 00 00 14 00 00 00 00 00 00 00 ................ . 0030: 01 84 05 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0090: 91 08 00 00 00 00 00 00 92 08 00 00 00 00 00 00 ................ . 00a0: 93 08 00 00 00 00 00 00 94 08 00 00 00 00 00 00 ................ . 00b0: 95 08 00 00 00 00 00 00 96 08 00 00 00 00 00 00 ................ . 00c0: 97 08 00 00 00 00 00 00 98 08 00 00 00 00 00 00 ................ . 00d0: 99 08 00 00 00 00 00 00 9a 08 00 00 00 00 00 00 ................ . 00e0: 9b 08 00 00 00 00 00 00 9c 08 00 00 00 00 00 00 ................ . 00f0: 9d 08 00 00 00 00 00 00 9e 08 00 00 00 00 00 00 ................ . 0100: 9f 08 00 00 00 00 00 00 a0 08 00 00 00 00 00 00 ................ -1 -1 0 [0x110]: PERF_RECORD_ATTR 0x110@pipe [0x110]: event: 64 ... ``` After: ``` $ perf record -o - -a sleep 1 | perf script -D -i - 0@pipe [0x110]: event: 64 . . ... raw event: size 272 bytes . 0000: 40 00 00 00 00 00 10 01 00 00 00 00 88 00 00 00 @............... . 0010: 00 00 00 00 00 00 00 00 a0 0f 00 00 00 00 00 00 ................ . 0020: 87 01 01 00 00 00 00 00 14 00 00 00 00 00 00 00 ................ . 0030: 01 84 05 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0090: 5c 08 00 00 00 00 00 00 5d 08 00 00 00 00 00 00 \.......]....... . 00a0: 5e 08 00 00 00 00 00 00 5f 08 00 00 00 00 00 00 ^......._....... . 00b0: 60 08 00 00 00 00 00 00 61 08 00 00 00 00 00 00 `.......a....... . 00c0: 62 08 00 00 00 00 00 00 63 08 00 00 00 00 00 00 b.......c....... . 00d0: 64 08 00 00 00 00 00 00 65 08 00 00 00 00 00 00 d.......e....... . 00e0: 66 08 00 00 00 00 00 00 67 08 00 00 00 00 00 00 f.......g....... . 00f0: 68 08 00 00 00 00 00 00 69 08 00 00 00 00 00 00 h.......i....... . 0100: 6a 08 00 00 00 00 00 00 6b 08 00 00 00 00 00 00 j.......k....... -1 -1 0 [0x110]: PERF_RECORD_ATTR, type = 0 (PERF_TYPE_HARDWARE), size = 136, config = 0 (PERF_COUNT_HW_CPU_CYCLES), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, precise_ip = 3, sample_id_all = 1 0x110@pipe [0x110]: event: 64 ... ``` Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25perf header: In pipe mode dump features without --header/-IIan Rogers
In pipe mode the header features are contained within events. While other events dump details the header features only dump if --header or -I are passed, which doesn't make sense as in pipe mode there is no perf file header. Make the printing of the information conditional on dump_trace as with other events. Before: ``` $ perf record -o - -a sleep 1 | perf script -D -i - ... 0x2c8@pipe [0x54]: event: 80 . . ... raw event: size 84 bytes . 0000: 50 00 00 00 00 00 54 00 05 00 00 00 00 00 00 00 P.....T......... . 0010: 40 00 00 00 36 2e 31 35 2e 72 63 37 2e 67 61 64 @...6.15.rc7.gad . 0020: 32 61 36 39 31 63 39 39 66 62 00 00 00 00 00 00 2a691c99fb...... . 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0050: 00 00 00 00 .... 0 0 0x2c8 [0x54]: PERF_RECORD_FEATURE ``` After: ``` $ perf record -o - -a sleep 1 | perf script -D -i - ... 0x2c8@pipe [0x54]: event: 80 . . ... raw event: size 84 bytes . 0000: 50 00 00 00 00 00 54 00 05 00 00 00 00 00 00 00 P.....T......... . 0010: 40 00 00 00 36 2e 31 35 2e 72 63 37 2e 67 61 64 @...6.15.rc7.gad . 0020: 32 61 36 39 31 63 39 39 66 62 00 00 00 00 00 00 2a691c99fb...... . 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ . 0050: 00 00 00 00 .... 0 0 0x2c8 [0x54]: PERF_RECORD_FEATURE, # perf version : 6.15.rc7.gad2a691c99fb ``` Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf record: collect BPF metadata from new programsBlake Jones
This collects metadata for any BPF programs that were loaded during a "perf record" run, and emits it at the end of the run. Signed-off-by: Blake Jones <blakejones@google.com> Link: https://lore.kernel.org/r/20250612194939.162730-4-blakejones@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf header: remove unecessary core id testAnubhav Shelat
It is possible for systems to have a greater socket id number than the number of cpus present on a machine, so this test is obselete and should be removed. Signed-off-by: Anubhav Shelat <ashelat@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250618142921.4053400-2-ashelat@redhat.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24perf report: Fix a memory leak for perf_env on AMDNamhyung Kim
The env.pmu_mapping can be leaked when it reads data from a pipe on AMD. For a pipe data, it reads the header data including pmu_mapping from PERF_RECORD_HEADER_FEATURE runtime. But it's already set in: perf_session__new() __perf_session__new() evlist__init_trace_event_sample_raw() evlist__has_amd_ibs() perf_env__nr_pmu_mappings() Then it'll overwrite that when it processes the HEADER_FEATURE record. Here's a report from address sanitizer. Direct leak of 2689 byte(s) in 1 object(s) allocated from: #0 0x7fed8f814596 in realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98 #1 0x5595a7d416b1 in strbuf_grow util/strbuf.c:64 #2 0x5595a7d414ef in strbuf_init util/strbuf.c:25 #3 0x5595a7d0f4b7 in perf_env__read_pmu_mappings util/env.c:362 #4 0x5595a7d12ab7 in perf_env__nr_pmu_mappings util/env.c:517 #5 0x5595a7d89d2f in evlist__has_amd_ibs util/amd-sample-raw.c:315 #6 0x5595a7d87fb2 in evlist__init_trace_event_sample_raw util/sample-raw.c:23 #7 0x5595a7d7f893 in __perf_session__new util/session.c:179 #8 0x5595a7b79572 in perf_session__new util/session.h:115 #9 0x5595a7b7e9dc in cmd_report builtin-report.c:1603 #10 0x5595a7c019eb in run_builtin perf.c:351 #11 0x5595a7c01c92 in handle_internal_command perf.c:404 #12 0x5595a7c01deb in run_argv perf.c:448 #13 0x5595a7c02134 in main perf.c:556 #14 0x7fed85833d67 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 Let's free the existing pmu_mapping data if any. Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20250311000416.817631-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24perf header: Switch mem topology to io_dir__readdirIan Rogers
Switch memory_node__read and build_mem_topology from opendir/readdir to io_dir__readdir, with smaller stack allocations. Reduces peak memory consumption of perf record by 10kb. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250222061015.303622-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-10perf header: Fix one memory leakage in process_bpf_prog_info()Zhongqiu Han
Function __perf_env__insert_bpf_prog_info() will return without inserting bpf prog info node into perf env again due to a duplicate bpf prog info node insertion, causing the temporary info_linear and info_node memory to leak. Modify the return type of this function to bool and add a check to ensure the memory is freed if the function returns false. Fixes: 606f972b1361f477 ("perf bpf: Save bpf_prog_info information as headers to perf.data") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241205084500.823660-3-quic_zhonhan@quicinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-10perf header: Fix one memory leakage in process_bpf_btf()Zhongqiu Han
If __perf_env__insert_btf() returns false due to a duplicate btf node insertion, the temporary node will leak. Add a check to ensure the memory is freed if the function returns false. Fixes: a70a1123174ab592 ("perf bpf: Save BTF information as headers to perf.data") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241205084500.823660-2-quic_zhonhan@quicinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf header: Pass a perf_cpu rather than a PMU to get_cpuid_strIan Rogers
On ARM the cpuid is dependent on the core type of the CPU in question. The PMU was passed for the sake of the CPU map but this means in places a temporary PMU is created just to pass a CPU value. Just pass the CPU and fix up the callers. As there are no longer PMU users in header.h, shuffle forward declarations earlier to work around build failures. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf header: Refactor get_cpuid to take a CPU for ARMIan Rogers
ARM BIG.little has no notion of a constant CPUID for both core types. To reflect this reality, change the get_cpuid function to also pass in a possibly unused logical cpu. If the dummy value (-1) is passed in then ARM can, as currently happens, select the first logical CPU's "CPUID". The changes to ARM getcpuid happen in a follow up change. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf header: Move is_cpu_online to numa benchIan Rogers
The helper function is only used in the NUMA benchmark as typically online CPUs are determined through perf_cpu_map__new_online_cpus(). Reduce the scope of the function for now. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-08perf build: Include libtraceevent headers directly indicated by pkg-configYicong Yang
Currently the libtraceevent's found by pkg-config, which give the include path as: [root@localhost tmp]# pkg-config --cflags libtraceevent -I/usr/local/include/traceevent So we should include the libtraceevent headers directly without "traceevent/" prefix. Update all the users. Fixes: 0f0e1f445690 ("perf build: Use pkg-config for feature check for libtrace{event,fs}") Suggested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/linux-perf-users/ZyF5_Hf1iL01kldE@google.com/ Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: leo.yan@arm.com Cc: amadio@gentoo.org Cc: linuxarm@huawei.com Link: https://lore.kernel.org/r/20241105105649.45399-1-yangyicong@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-08-30perf header: Remove repipe optionIan Rogers
No longer used by `perf inject` the repipe_fd is always -1 and repipe is always false. Remove the options and associated code knowing the constant values of the removed variables. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Terrell <terrelln@fb.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240829150154.37929-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-30perf inject: Overhaul handling of pipe filesIan Rogers
Previously inject->is_pipe was set if the input or output were a pipe. Determining the input was a pipe had to be done prior to starting the session and opening the file. This was done by comparing the input file name with '-' but it fails if the pipe file is written to disk. Opening a pipe file from disk will correctly set perf_data.is_pipe, but this is too late for 'perf inject' and results in a broken file. A workaround is 'cat pipe_perf|perf inject -i - ...'. This change removes inject->is_pipe and changes the dependent conditions to use the is_pipe flag on the input (inject->session->data) and output files (inject->output). This ensures the is_pipe condition reflects things like the header being read. The change removes the use of perf file header repiping, that is writing the file header out while reading it in. The case of input pipe and output file cannot repipe as the attributes for the file are unknown. To resolve this, write the file header when writing to disk and as the attributes may be unknown, write them after the data. Update sessions repipe variable to be trace_event_repipe as those are the only events now impacted by it. Update __perf_session__new as the repipe_fd no longer needs passing. Fully removing repipe from session header reading will be done in a later change. Committer testing: root@number:~# perf record -e syscalls:sys_enter_*sleep/max-stack=4/ -o - sleep 0.01 | perf report -i - # To display the perf.data header info, please use --header/--header-only options. # [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.050 MB - ] # # Total Lost Samples: 0 # # Samples: 1 of event 'syscalls:sys_enter_clock_nanosleep' # Event count (approx.): 1 # # Overhead Command Shared Object Symbol # ........ ....... ............. ............................... # 100.00% sleep libc.so.6 [.] clock_nanosleep@GLIBC_2.2.5 | ---__libc_start_main@@GLIBC_2.34 __libc_start_call_main 0x562fc2560a9f clock_nanosleep@GLIBC_2.2.5 # # (Tip: Create an archive with symtabs to analyse on other machine: perf archive) # root@number:~# perf record -e syscalls:sys_enter_*sleep/max-stack=4/ -o - sleep 0.01 > pipe.data [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.050 MB - ] root@number:~# perf report --stdio -i pipe.data # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 1 of event 'syscalls:sys_enter_clock_nanosleep' # Event count (approx.): 1 # # Overhead Command Shared Object Symbol # ........ ....... ............. ............................... # 100.00% sleep libc.so.6 [.] clock_nanosleep@GLIBC_2.2.5 | ---__libc_start_main@@GLIBC_2.34 __libc_start_call_main 0x55f775975a9f clock_nanosleep@GLIBC_2.2.5 # # (Tip: To set sampling period of individual events use perf record -e cpu/cpu-cycles,period=100001/,cpu/branches,period=10001/ ...) # root@number:~# Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Terrell <terrelln@fb.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240829150154.37929-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-29perf header: Allow attributes to be written after dataIan Rogers
With a file, to write data an offset needs to be known. Typically data follows the event attributes in a file. However, if processing a pipe the number of event attributes may not be known. It is convenient in that case to write the attributes after the data. Expand perf_session__do_write_header() to allow this when the data offset and size are known. This approach may be useful for more than just taking a pipe file to write into a data file, `perf inject --itrace` will reserve and additional 8kb for attributes, which would be unnecessary if the attributes were written after the data. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Terrell <terrelln@fb.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240829150154.37929-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-29perf header: Fail read if header sections overlapIan Rogers
Buggy perf.data files can have the attributes and data overlapping. For example, when processing pipe data the attributes aren't known and so file offset header calculations can consider them not present. Later this can cause the attributes to overwrite the data. This can be seen in: $ perf record -o - true > a.data [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.059 MB - ] $ perf inject -i a.data -o b.data $ perf report --stats -i b.data 0x68 [0]: failed to process type: 510379 [Invalid argument] Error: failed to process sample $ This change makes reading the corrupt file fail: $ perf report --stats -i b.data Perf file header corrupt: Attributes and data overlap incompatible file format (rerun with -v to learn more) $ Which is more informative. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Terrell <terrelln@fb.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240829150154.37929-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-12perf tool: Constify tool pointersIan Rogers
The tool pointer (to a struct largely of function pointers) is passed around but is unchanged except at initialization. Change parameter and variable types to be const to lower the possibilities of what could happen with a tool. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Leo Yan <leo.yan@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20240812204720.631678-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-06perf dso: Add reference count checking and accessor functionsIan Rogers
Add reference count checking to struct dso, this can help with implementing correct reference counting discipline. To avoid RC_CHK_ACCESS everywhere, add accessor functions for the variables in struct dso. The majority of the change is mechanical in nature and not easy to split up. Committer testing: 'perf test' up to this patch shows no regressions. But: util/symbol.c: In function ‘dso__load_bfd_symbols’: util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’ 1683 | dso__set_adjust_symbols(dso); | ^~~~~~~~~~~~~~~~~~~~~~~ In file included from util/symbol.c:21: util/dso.h:268:20: note: declared here 268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val) | ^~~~~~~~~~~~~~~~~~~~~~~ make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1 MKDIR /tmp/tmp.ZWHbQftdN6/tests/workloads/ make[6]: *** Waiting for unfinished jobs.... This was updated: - symbols__fixup_end(&dso->symbols, false); - symbols__fixup_duplicate(&dso->symbols); - dso->adjust_symbols = 1; + symbols__fixup_end(dso__symbols(dso), false); + symbols__fixup_duplicate(dso__symbols(dso)); + dso__set_adjust_symbols(dso); But not build tested with BUILD_NONDISTRO and libbfd devel files installed (binutils-devel on fedora). Add the missing argument: symbols__fixup_end(dso__symbols(dso), false); symbols__fixup_duplicate(dso__symbols(dso)); - dso__set_adjust_symbols(dso); + dso__set_adjust_symbols(dso, true); Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Chengen Du <chengen.du@canonical.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-03perf env: Avoid recursively taking env->bpf_progs.lockIan Rogers
Add variants of perf_env__insert_bpf_prog_info(), perf_env__insert_btf() and perf_env__find_btf prefixed with __ to indicate the env->bpf_progs.lock is assumed held. Call these variants when the lock is held to avoid recursively taking it and potentially having a thread deadlock with itself. Fixes: f8dfeae009effc0b ("perf bpf: Show more BPF program info in print_bpf_prog_info()") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <song@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20231207014655.1252484-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-13perf header: Fix one memory leakage in perf_event__fprintf_event_update()Yicong Yang
When dump the raw trace by `perf report -D` ASan reports a memory leakage in perf_event__fprintf_event_update(). It shows that we allocated a temporary cpumap for dumping the CPUs but doesn't release it and it's not used elsewhere. Fix this by free the cpumap after the dumping. Fixes: c853f9394b7bc189 ("perf tools: Add perf_event__fprintf_event_update function") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxarm@huawei.com Link: https://lore.kernel.org/r/20231207081635.8427-2-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27perf header: Fix segfault on build_mem_topology() error pathAdrian Hunter
Do not increase the node count unless a node has been successfully read, because it can lead to a segfault if an error occurs. For example, if perf exceeds the open file limit in memory_node__read(), which, on a test system, could be made to happen by setting the file limit to exactly 32: Before: $ ulimit -n 32 $ perf mem record --all-user -- sleep 1 [ perf record: Woken up 1 times to write data ] failed: can't open memory sysfs data perf: Segmentation fault Obtained 14 stack frames. perf(sighandler_dump_stack+0x48) [0x55f4b1f59558] /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f4ba1c42520] /lib/x86_64-linux-gnu/libc.so.6(free+0x1e) [0x7f4ba1ca53fe] perf(+0x178ff4) [0x55f4b1f48ff4] perf(+0x179a70) [0x55f4b1f49a70] perf(+0x17ef5d) [0x55f4b1f4ef5d] perf(+0x85c0b) [0x55f4b1e55c0b] perf(cmd_record+0xe1d) [0x55f4b1e5920d] perf(cmd_mem+0xc96) [0x55f4b1e80e56] perf(+0x130460) [0x55f4b1f00460] perf(main+0x689) [0x55f4b1e427d9] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f4ba1c29d90] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f4ba1c29e40] perf(_start+0x25) [0x55f4b1e42a25] Segmentation fault (core dumped) $ After: $ ulimit -n 32 $ perf mem record --all-user -- sleep 1 [ perf record: Woken up 1 times to write data ] failed: can't open memory sysfs data [ perf record: Captured and wrote 0.005 MB perf.data (11 samples) ] $ Fixes: f8e502b9d1b3b197 ("perf header: Ensure bitmaps are freed") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231123075848.9652-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-10perf header: Additional note on AMD IBS for max_precise pmu capArnaldo Carvalho de Melo
x86 core PMU exposes supported maximum precision level via max_precise PMU capability. Although, AMD core PMU does not support precise mode, certain core PMU events with precise_ip > 0 are allowed and forwarded to IBS OP PMU. Display a note about this in the 'perf report' header output and document the details in the perf-list man page. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ross Zwisler <zwisler@chromium.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231107083331.901-2-ravi.bangoria@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-06perf header: Support num and width of branch countersKan Liang
To support the branch counters feature, the information of the maximum number of supported counters and the width of the counters is exposed in the sysfs caps folder. The perf tool can use the information to parse the logged counters in each branch. Store the information in the perf_env for later usage. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tinghao Zhang <tinghao.zhang@intel.com> Link: https://lore.kernel.org/r/20231025201626.3000228-7-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-10-12perf header: Fix various error path memory leaksIan Rogers
Memory leaks were detected by clang-tidy. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: llvm@lists.linux.dev Cc: Ming Wang <wangming01@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231009183920.200859-19-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-08-29perf tools: Convert to perf_record_header_attr_id()Namhyung Kim
Instead of accessing the attr.id directly, use the perf_record_header_attr_id() helper to handle old versions. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230825152552.112913-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29perf tools: Handle old data in PERF_RECORD_ATTRNamhyung Kim
The PERF_RECORD_ATTR is used for a pipe mode to describe an event with attribute and IDs. The ID table comes after the attr and it calculate size of the table using the total record size and the attr size. n_ids = (total_record_size - end_of_the_attr_field) / sizeof(u64) This is fine for most use cases, but sometimes it saves the pipe output in a file and then process it later. And it becomes a problem if there is a change in attr size between the record and report. $ perf record -o- > perf-pipe.data # old version $ perf report -i- < perf-pipe.data # new version For example, if the attr size is 128 and it has 4 IDs, then it would save them in 168 byte like below: 8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 }, 128 byte: perf event attr { .size = 128, ... }, 32 byte: event IDs [] = { 1234, 1235, 1236, 1237 }, But when report later, it thinks the attr size is 136 then it only read the last 3 entries as ID. 8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 }, 136 byte: perf event attr { .size = 136, ... }, 24 byte: event IDs [] = { 1235, 1236, 1237 }, // 1234 is missing So it should use the recorded version of the attr. The attr has the size field already then it should honor the size when reading data. Fixes: 2c46dbb517a10b18 ("perf: Convert perf header attrs into attr events") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Zanussi <zanussi@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230825152552.112913-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29perf tools: Allow to use cpuinfo on LoongArchYanteng Si
Define these macros so that the CPU name can be displayed when running 'perf report' and 'perf timechart'. Committer notes: No need to have: if (strcasestr(buf, "Model Name")) { strlcpy(cpu_m, &buf[13], 255); break; } else if (strcasestr(buf, "model name")) { strlcpy(cpu_m, &buf[13], 255); break; } As the point of strcasestr() is to be case insensitive to both the haystack and the needle, so simplify the above to just: if (strcasestr(buf, "model name")) { strlcpy(cpu_m, &buf[13], 255); break; } Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Acked-by: Huacai Chen <chenhuacai@loongson.cn> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: loongarch@lists.linux.dev Cc: loongson-kernel@lists.loongnix.cn Link: https://lore.kernel.org/r/db968a186a10e4629fe10c26a1210f7126ad41ec.1692962043.git.siyanteng@loongson.cn Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25perf pmu: Remove logic for PMU name being NULLIan Rogers
The PMU name could be NULL in the case of the fake_pmu. Initialize the name for the fake_pmu to "fake" so that all other logic can assume it is initialized. Add a const to the type of name so that a literal can be used to avoid additional initialization code. Propagate the cost through related routines and remove now unnecessary "(char *)" casts. Doing this located a bug in builtin-list for the pmu_glob that was missing a strdup. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20230825024002.801955-3-irogers@google.com Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Wei Li <liwei391@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Ming Wang <wangming01@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25perf header: Fix missing PMU capsIan Rogers
PMU caps are written as HEADER_PMU_CAPS or for the special case of the PMU "cpu" as HEADER_CPU_PMU_CAPS. As the PMU "cpu" is special, and not any "core" PMU, the logic had become broken and core PMUs not called "cpu" were not having their caps written. This affects ARM and s390 non-hybrid PMUs. Simplify the PMU caps writing logic to scan one fewer time and to be more explicit in its behavior. Fixes: 178ddf3bad981380 ("perf header: Avoid hybrid PMU list in write_pmu_caps") Reported-by: Wei Li <liwei391@huawei.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230825024002.801955-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-06-12perf header: Avoid out-of-bounds readIan Rogers
intel-pt tests were failing: -- Test virtual LBR --- Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.126 MB /tmp/perf-test-intel-pt-sh.FW57CXnCqQ/test-perf.data ] Failed with virtual lbr ... ``` The root cause is an out-of-bounds read in header (where maxbrstack.py is from test_intel_pt.sh): ``` $ perf --no-pager script --itrace=L -s maxbrstack.py ================================================================= ==3907930==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000095a8 at pc 0x563c26c840bb bp 0x7fff43582710 sp 0x7fff43582708 READ of size 4 at 0x6020000095a8 thread T0 #0 0x563c26c840ba in process_group_desc util/header.c:2847 #1 0x563c26c8bc78 in perf_file_section__process util/header.c:4037 #2 0x563c26c8aa9b in perf_header__process_sections util/header.c:3813 #3 0x563c26c8d028 in perf_session__read_header util/header.c:4286 #4 0x563c26cbab29 in perf_session__open util/session.c:113 #5 0x563c26cbb3d0 in __perf_session__new util/session.c:221 #6 0x563c26aacb14 in perf_session__new util/session.h:73 #7 0x563c26acf7f1 in cmd_script tools/perf/builtin-script.c:4212 #8 0x563c26bb58ff in run_builtin tools/perf/perf.c:323 #9 0x563c26bb5e70 in handle_internal_command tools/perf/perf.c:377 #10 0x563c26bb6238 in run_argv tools/perf/perf.c:421 #11 0x563c26bb67a0 in main tools/perf/perf.c:537 #12 0x7f34bde46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #13 0x7f34bde46244 in __libc_start_main_impl ../csu/libc-start.c:381 #14 0x563c26a33390 in _start (/tmp/perf/perf+0x1eb390) 0x6020000095a8 is located 8 bytes to the right of 16-byte region [0x602000009590,0x6020000095a0) allocated by thread T0 here: #0 0x7f34beeb83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77 #1 0x563c26c83df8 in process_group_desc util/header.c:2824 #2 0x563c26c8bc78 in perf_file_section__process util/header.c:4037 #3 0x563c26c8aa9b in perf_header__process_sections util/header.c:3813 #4 0x563c26c8d028 in perf_session__read_header util/header.c:4286 #5 0x563c26cbab29 in perf_session__open util/session.c:113 #6 0x563c26cbb3d0 in __perf_session__new util/session.c:221 #7 0x563c26aacb14 in perf_session__new util/session.h:73 #8 0x563c26acf7f1 in cmd_script tools/perf/builtin-script.c:4212 #9 0x563c26bb58ff in run_builtin tools/perf/perf.c:323 #10 0x563c26bb5e70 in handle_internal_command tools/perf/perf.c:377 #11 0x563c26bb6238 in run_argv tools/perf/perf.c:421 #12 0x563c26bb67a0 in main tools/perf/perf.c:537 #13 0x7f34bde46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 ``` Avoid the out-of-bounds read checking for the leader. Leave the 'nr' check intact as nr will be 0 or the counting down and evsel be a group member. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Brian Robbins <brianrob@linux.microsoft.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Fangrui Song <maskray@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Babrou <ivan@cloudflare.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Ye Xingchen <ye.xingchen@zte.com.cn> Cc: Yuan Can <yuancan@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/lkml/20230608232823.4027869-24-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-06-12perf header: Ensure bitmaps are freedIan Rogers
memory_node bitmaps need a bitmap_free to avoid memory leaks. Caught by leak sanitizer. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Brian Robbins <brianrob@linux.microsoft.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Fangrui Song <maskray@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Babrou <ivan@cloudflare.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Ye Xingchen <ye.xingchen@zte.com.cn> Cc: Yuan Can <yuancan@huawei.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230608232823.4027869-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-28perf header: Make nodes dynamic in write_mem_topology()Ian Rogers
Avoid a large static array, dynamically allocate the nodes avoiding a hard coded limited as well. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Ross Zwisler <zwisler@chromium.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20230526183401.2326121-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-27perf pmus: Remove perf_pmus__has_hybridIan Rogers
perf_pmus__has_hybrid was used to detect when there was >1 core PMU, this can be achieved with perf_pmus__num_core_pmus that doesn't depend upon is_pmu_hybrid and PMU name comparisons. When modifying the function calls take the opportunity to improve comments, enable/simplify tests that were previously failing for hybrid but now pass and to simplify generic code. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-34-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>