summaryrefslogtreecommitdiff
path: root/kernel/trace/trace_events_synth.c
AgeCommit message (Collapse)Author
2026-02-22Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL usesKees Cook
Conversion performed via this Coccinelle script: // SPDX-License-Identifier: GPL-2.0-only // Options: --include-headers-for-types --all-includes --include-headers --keep-comments virtual patch @gfp depends on patch && !(file in "tools") && !(file in "samples")@ identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex, kzalloc_obj,kzalloc_objs,kzalloc_flex, kvmalloc_obj,kvmalloc_objs,kvmalloc_flex, kvzalloc_obj,kvzalloc_objs,kvzalloc_flex}; @@ ALLOC(... - , GFP_KERNEL ) $ make coccicheck MODE=patch COCCI=gfp.cocci Build and boot tested x86_64 with Fedora 42's GCC and Clang: Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21treewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
2026-01-28tracing: Remove notrace from trace_event_raw_event_synth()Steven Rostedt
When debugging the synthetic events, being able to function trace its functions is very useful (now that CONFIG_FUNCTION_SELF_TRACING is available). For some reason trace_event_raw_event_synth() was marked as "notrace", which was totally unnecessary as all of the tracing directory had function tracing disabled until the recent FUNCTION_SELF_TRACING was added. Remove the notrace annotation from trace_event_raw_event_synth() as there's no reason to not trace it when tracing synthetic event functions. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260122204526.068a98c9@gandalf.local.home Acked-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-01-23tracing: Fix crash on synthetic stacktrace field usageSteven Rostedt
When creating a synthetic event based on an existing synthetic event that had a stacktrace field and the new synthetic event used that field a kernel crash occurred: ~# cd /sys/kernel/tracing ~# echo 's:stack unsigned long stack[];' > dynamic_events ~# echo 'hist:keys=prev_pid:s0=common_stacktrace if prev_state & 3' >> events/sched/sched_switch/trigger ~# echo 'hist:keys=next_pid:s1=$s0:onmatch(sched.sched_switch).trace(stack,$s1)' >> events/sched/sched_switch/trigger The above creates a synthetic event that takes a stacktrace when a task schedules out in a non-running state and passes that stacktrace to the sched_switch event when that task schedules back in. It triggers the "stack" synthetic event that has a stacktrace as its field (called "stack"). ~# echo 's:syscall_stack s64 id; unsigned long stack[];' >> dynamic_events ~# echo 'hist:keys=common_pid:s2=stack' >> events/synthetic/stack/trigger ~# echo 'hist:keys=common_pid:s3=$s2,i0=id:onmatch(synthetic.stack).trace(syscall_stack,$i0,$s3)' >> events/raw_syscalls/sys_exit/trigger The above makes another synthetic event called "syscall_stack" that attaches the first synthetic event (stack) to the sys_exit trace event and records the stacktrace from the stack event with the id of the system call that is exiting. When enabling this event (or using it in a historgram): ~# echo 1 > events/synthetic/syscall_stack/enable Produces a kernel crash! BUG: unable to handle page fault for address: 0000000000400010 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 6 UID: 0 PID: 1257 Comm: bash Not tainted 6.16.3+deb14-amd64 #1 PREEMPT(lazy) Debian 6.16.3-1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-debian-1.17.0-1 04/01/2014 RIP: 0010:trace_event_raw_event_synth+0x90/0x380 Code: c5 00 00 00 00 85 d2 0f 84 e1 00 00 00 31 db eb 34 0f 1f 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 <49> 8b 04 24 48 83 c3 01 8d 0c c5 08 00 00 00 01 cd 41 3b 5d 40 0f RSP: 0018:ffffd2670388f958 EFLAGS: 00010202 RAX: ffff8ba1065cc100 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: fffff266ffda7b90 RDI: ffffd2670388f9b0 RBP: 0000000000000010 R08: ffff8ba104e76000 R09: ffffd2670388fa50 R10: ffff8ba102dd42e0 R11: ffffffff9a908970 R12: 0000000000400010 R13: ffff8ba10a246400 R14: ffff8ba10a710220 R15: fffff266ffda7b90 FS: 00007fa3bc63f740(0000) GS:ffff8ba2e0f48000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000400010 CR3: 0000000107f9e003 CR4: 0000000000172ef0 Call Trace: <TASK> ? __tracing_map_insert+0x208/0x3a0 action_trace+0x67/0x70 event_hist_trigger+0x633/0x6d0 event_triggers_call+0x82/0x130 trace_event_buffer_commit+0x19d/0x250 trace_event_raw_event_sys_exit+0x62/0xb0 syscall_exit_work+0x9d/0x140 do_syscall_64+0x20a/0x2f0 ? trace_event_raw_event_sched_switch+0x12b/0x170 ? save_fpregs_to_fpstate+0x3e/0x90 ? _raw_spin_unlock+0xe/0x30 ? finish_task_switch.isra.0+0x97/0x2c0 ? __rseq_handle_notify_resume+0xad/0x4c0 ? __schedule+0x4b8/0xd00 ? restore_fpregs_from_fpstate+0x3c/0x90 ? switch_fpu_return+0x5b/0xe0 ? do_syscall_64+0x1ef/0x2f0 ? do_fault+0x2e9/0x540 ? __handle_mm_fault+0x7d1/0xf70 ? count_memcg_events+0x167/0x1d0 ? handle_mm_fault+0x1d7/0x2e0 ? do_user_addr_fault+0x2c3/0x7f0 entry_SYSCALL_64_after_hwframe+0x76/0x7e The reason is that the stacktrace field is not labeled as such, and is treated as a normal field and not as a dynamic event that it is. In trace_event_raw_event_synth() the event is field is still treated as a dynamic array, but the retrieval of the data is considered a normal field, and the reference is just the meta data: // Meta data is retrieved instead of a dynamic array str_val = (char *)(long)var_ref_vals[val_idx]; // Then when it tries to process it: len = *((unsigned long *)str_val) + 1; It triggers a kernel page fault. To fix this, first when defining the fields of the first synthetic event, set the filter type to FILTER_STACKTRACE. This is used later by the second synthetic event to know that this field is a stacktrace. When creating the field of the new synthetic event, have it use this FILTER_STACKTRACE to know to create a stacktrace field to copy the stacktrace into. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tom Zanussi <zanussi@kernel.org> Link: https://patch.msgid.link/20260122194824.6905a38e@gandalf.local.home Fixes: 00cf3d672a9d ("tracing: Allow synthetic events to pass around stacktraces") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix fixed array of synthetic eventSteven Rostedt
The commit 4d38328eb442d ("tracing: Fix synth event printk format for str fields") replaced "%.*s" with "%s" but missed removing the number size of the dynamic and static strings. The commit e1a453a57bc7 ("tracing: Do not add length to print format in synthetic events") fixed the dynamic part but did not fix the static part. That is, with the commands: # echo 's:wake_lat char[] wakee; u64 delta;' >> /sys/kernel/tracing/dynamic_events # echo 'hist:keys=pid:ts=common_timestamp.usecs if !(common_flags & 0x18)' > /sys/kernel/tracing/events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:delta=common_timestamp.usecs-$ts:onmatch(sched.sched_waking).trace(wake_lat,next_comm,$delta)' > /sys/kernel/tracing/events/sched/sched_switch/trigger That caused the output of: <idle>-0 [001] d..5. 193.428167: wake_lat: wakee=(efault)sshd-sessiondelta=155 sshd-session-879 [001] d..5. 193.811080: wake_lat: wakee=(efault)kworker/u34:5delta=58 <idle>-0 [002] d..5. 193.811198: wake_lat: wakee=(efault)bashdelta=91 The commit e1a453a57bc7 fixed the part where the synthetic event had "char[] wakee". But if one were to replace that with a static size string: # echo 's:wake_lat char[16] wakee; u64 delta;' >> /sys/kernel/tracing/dynamic_events Where "wakee" is defined as "char[16]" and not "char[]" making it a static size, the code triggered the "(efaul)" again. Remove the added STR_VAR_LEN_MAX size as the string is still going to be nul terminated. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Douglas Raillard <douglas.raillard@arm.com> Link: https://patch.msgid.link/20251204151935.5fa30355@gandalf.local.home Fixes: e1a453a57bc7 ("tracing: Do not add length to print format in synthetic events") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-11-04tracing: Allow tracer to add more than 32 optionsMasami Hiramatsu (Google)
Since enum trace_iterator_flags is 32bit, the max number of the option flags is limited to 32 and it is fully used now. To add a new option, we need to expand it. So replace the TRACE_ITER_##flag with TRACE_ITER(flag) macro which is 64bit bitmask. Link: https://lore.kernel.org/all/176187877103.994619.166076000668757232.stgit@devnote2/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2025-08-01tracing: Add guard(ring_buffer_nest)Steven Rostedt
Some calls to the tracing ring buffer can happen when the ring buffer is already being written to by the same context (for example, a trace_printk() in between a ring_buffer_lock_reserve() and a ring_buffer_unlock_commit()). In order to not trigger the recursion detection, these functions use ring_buffer_nest_start() and ring_buffer_nest_end(). Create a guard() for these functions so that their use cases can be simplified and not need to use goto for the release. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250801203857.710501021@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-09tracing: Do not add length to print format in synthetic eventsSteven Rostedt
The following causes a vsnprintf fault: # echo 's:wake_lat char[] wakee; u64 delta;' >> /sys/kernel/tracing/dynamic_events # echo 'hist:keys=pid:ts=common_timestamp.usecs if !(common_flags & 0x18)' > /sys/kernel/tracing/events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:delta=common_timestamp.usecs-$ts:onmatch(sched.sched_waking).trace(wake_lat,next_comm,$delta)' > /sys/kernel/tracing/events/sched/sched_switch/trigger Because the synthetic event's "wakee" field is created as a dynamic string (even though the string copied is not). The print format to print the dynamic string changed from "%*s" to "%s" because another location (__set_synth_event_print_fmt()) exported this to user space, and user space did not need that. But it is still used in print_synth_event(), and the output looks like: <idle>-0 [001] d..5. 193.428167: wake_lat: wakee=(efault)sshd-sessiondelta=155 sshd-session-879 [001] d..5. 193.811080: wake_lat: wakee=(efault)kworker/u34:5delta=58 <idle>-0 [002] d..5. 193.811198: wake_lat: wakee=(efault)bashdelta=91 bash-880 [002] d..5. 193.811371: wake_lat: wakee=(efault)kworker/u35:2delta=21 <idle>-0 [001] d..5. 193.811516: wake_lat: wakee=(efault)sshd-sessiondelta=129 sshd-session-879 [001] d..5. 193.967576: wake_lat: wakee=(efault)kworker/u34:5delta=50 The length isn't needed as the string is always nul terminated. Just print the string and not add the length (which was hard coded to the max string length anyway). Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tom Zanussi <zanussi@kernel.org> Cc: Douglas Raillard <douglas.raillard@arm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/20250407154139.69955768@gandalf.local.home Fixes: 4d38328eb442d ("tracing: Fix synth event printk format for str fields"); Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-25tracing: Replace strncpy with memcpy for fixed-length substring copySiddarth G
checkpatch.pl reports the following warning: WARNING: Prefer strscpy, strscpy_pad, or __nonstring over strncpy (see: https://github.com/KSPP/linux/issues/90) In synth_field_string_size(), replace strncpy() with memcpy() to copy 'len' characters from 'start' to 'buf'. The code manually adds a NUL terminator after the copy, making memcpy safe here. Link: https://lore.kernel.org/20250325181232.38284-1-siddarthsgml@gmail.com Signed-off-by: Siddarth G <siddarthsgml@gmail.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-25tracing: Fix synth event printk format for str fieldsDouglas Raillard
The printk format for synth event uses "%.*s" to print string fields, but then only passes the pointer part as var arg. Replace %.*s with %s as the C string is guaranteed to be null-terminated. The output in print fmt should never have been updated as __get_str() handles the string limit because it can access the length of the string in the string meta data that is saved in the ring buffer. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: 8db4d6bfbbf92 ("tracing: Change synthetic event string format to limit printed length") Link: https://lore.kernel.org/20250325165202.541088-1-douglas.raillard@arm.com Signed-off-by: Douglas Raillard <douglas.raillard@arm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-24tracing: Do not use PERF enums when perf is not definedSteven Rostedt
An update was made to up the module ref count when a synthetic event is registered for both trace and perf events. But if perf is not configured in, the perf enums used will cause the kernel to fail to build. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Douglas Raillard <douglas.raillard@arm.com> Link: https://lore.kernel.org/20250323152151.528b5ced@batman.local.home Fixes: 21581dd4e7ff ("tracing: Ensure module defining synth event cannot be unloaded while tracing") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202503232230.TeREVy8R-lkp@intel.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-23tracing: Ensure module defining synth event cannot be unloaded while tracingDouglas Raillard
Currently, using synth_event_delete() will fail if the event is being used (tracing in progress), but that is normally done in the module exit function. At that stage, failing is problematic as returning a non-zero status means the module will become locked (impossible to unload or reload again). Instead, ensure the module exit function does not get called in the first place by increasing the module refcnt when the event is enabled. Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: 35ca5207c2d11 ("tracing: Add synthetic event command generation functions") Link: https://lore.kernel.org/20250318180906.226841-1-douglas.raillard@arm.com Signed-off-by: Douglas Raillard <douglas.raillard@arm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-23tracing: Align synth event print fmtDouglas Raillard
The vast majority of ftrace event print fmt consist of a space-separated field=value pair. Synthetic event currently use a comma-separated field=value pair, which sticks out from events created via more classical means. Align the format of synth events so they look just like any other event, for better consistency and less headache when doing crude text-based data processing. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250319215028.1680278-1-douglas.raillard@arm.com Signed-off-by: Douglas Raillard <douglas.raillard@arm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-26tracing: Switch trace_events_synth.c code over to use guard()Steven Rostedt
There are a couple functions in trace_events_synth.c that have "goto out" or equivalent on error in order to release locks that were taken. This can be error prone or just simply make the code more complex. Switch every location that ends with unlocking a mutex on error over to using the guard(mutex)() infrastructure to let the compiler worry about releasing locks. This makes the code easier to read and understand. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/20241219201346.371082515@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-02-15tracing/synthetic: Fix trace_string() return valueThorsten Blum
Fix trace_string() by assigning the string length to the return variable which got lost in commit ddeea494a16f ("tracing/synthetic: Use union instead of casts") and caused trace_string() to always return 0. Link: https://lore.kernel.org/linux-trace-kernel/20240214220555.711598-1-thorsten.blum@toblux.com Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: ddeea494a16f ("tracing/synthetic: Use union instead of casts") Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-20tracing/synthetic: fix kernel-doc warningsRandy Dunlap
scripts/kernel-doc warns about using @args: for variadic arguments to functions. Documentation/doc-guide/kernel-doc.rst says that this should be written as @...: instead, so update the source code to match that, preventing the warnings. trace_events_synth.c:1165: warning: Excess function parameter 'args' description in '__synth_event_gen_cmd_start' trace_events_synth.c:1714: warning: Excess function parameter 'args' description in 'synth_event_trace' Link: https://lore.kernel.org/linux-trace-kernel/20231220061226.30962-1-rdunlap@infradead.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: 35ca5207c2d11 ("tracing: Add synthetic event command generation functions") Fixes: 8dcc53ad956d2 ("tracing: Add synth_event_trace() and related functions") Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-11-01tracing: Have the user copy of synthetic event address use correct contextSteven Rostedt (Google)
A synthetic event is created by the synthetic event interface that can read both user or kernel address memory. In reality, it reads any arbitrary memory location from within the kernel. If the address space is in USER (where CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE is set) then it uses strncpy_from_user_nofault() to copy strings otherwise it uses strncpy_from_kernel_nofault(). But since both functions use the same variable there's no annotation to what that variable is (ie. __user). This makes sparse complain. Quiet sparse by typecasting the strncpy_from_user_nofault() variable to a __user pointer. Link: https://lore.kernel.org/linux-trace-kernel/20231031151033.73c42e23@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Fixes: 0934ae9977c2 ("tracing: Fix reading strings from synthetic events"); Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311010013.fm8WTxa5-lkp@intel.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-11tracing/synthetic: Print out u64 values properlyTero Kristo
The synth traces incorrectly print pointer to the synthetic event values instead of the actual value when using u64 type. Fix by addressing the contents of the union properly. Link: https://lore.kernel.org/linux-trace-kernel/20230911141704.3585965-1-tero.kristo@linux.intel.com Fixes: ddeea494a16f ("tracing/synthetic: Use union instead of casts") Cc: stable@vger.kernel.org Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-08-16tracing/synthetic: Allocate one additional element for sizeSven Schnelle
While debugging another issue I noticed that the stack trace contains one invalid entry at the end: <idle>-0 [008] d..4. 26.484201: wake_lat: pid=0 delta=2629976084 000000009cc24024 stack=STACK: => __schedule+0xac6/0x1a98 => schedule+0x126/0x2c0 => schedule_timeout+0x150/0x2c0 => kcompactd+0x9ca/0xc20 => kthread+0x2f6/0x3d8 => __ret_from_fork+0x8a/0xe8 => 0x6b6b6b6b6b6b6b6b This is because the code failed to add the one element containing the number of entries to field_size. Link: https://lkml.kernel.org/r/20230816154928.4171614-4-svens@linux.ibm.com Cc: Masami Hiramatsu <mhiramat@kernel.org> Fixes: 00cf3d672a9d ("tracing: Allow synthetic events to pass around stacktraces") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-08-16tracing/synthetic: Skip first entry for stack tracesSven Schnelle
While debugging another issue I noticed that the stack trace output contains the number of entries on top: <idle>-0 [000] d..4. 203.322502: wake_lat: pid=0 delta=2268270616 stack=STACK: => 0x10 => __schedule+0xac6/0x1a98 => schedule+0x126/0x2c0 => schedule_timeout+0x242/0x2c0 => __wait_for_common+0x434/0x680 => __wait_rcu_gp+0x198/0x3e0 => synchronize_rcu+0x112/0x138 => ring_buffer_reset_online_cpus+0x140/0x2e0 => tracing_reset_online_cpus+0x15c/0x1d0 => tracing_set_clock+0x180/0x1d8 => hist_register_trigger+0x486/0x670 => event_hist_trigger_parse+0x494/0x1318 => trigger_process_regex+0x1d4/0x258 => event_trigger_write+0xb4/0x170 => vfs_write+0x210/0xad0 => ksys_write+0x122/0x208 Fix this by skipping the first element. Also replace the pointer logic with an index variable which is easier to read. Link: https://lkml.kernel.org/r/20230816154928.4171614-3-svens@linux.ibm.com Cc: Masami Hiramatsu <mhiramat@kernel.org> Fixes: 00cf3d672a9d ("tracing: Allow synthetic events to pass around stacktraces") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-08-16tracing/synthetic: Use union instead of castsSven Schnelle
The current code uses a lot of casts to access the fields member in struct synth_trace_events with different sizes. This makes the code hard to read, and had already introduced an endianness bug. Use a union and struct instead. Link: https://lkml.kernel.org/r/20230816154928.4171614-2-svens@linux.ibm.com Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Fixes: 00cf3d672a9dd ("tracing: Allow synthetic events to pass around stacktraces") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-28tracing/synthetic: Fix kernel-doc warnings in trace_events_synth.cGaosheng Cui
Fix kernel-doc warning: kernel/trace/trace_events_synth.c:1257: warning: Function parameter or member 'mod' not described in 'synth_event_gen_cmd_array_start' Link: https://lkml.kernel.org/r/20230724140827.1023266-3-cuigaosheng1@huawei.com Cc: <mhiramat@kernel.org> Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-04-06tracing/synthetic: Make lastcmd_mutex staticSteven Rostedt (Google)
The lastcmd_mutex is only used in trace_events_synth.c and should be static. Link: https://lore.kernel.org/linux-trace-kernel/202304062033.cRStgOuP-lkp@intel.com/ Link: https://lore.kernel.org/linux-trace-kernel/20230406111033.6e26de93@gandalf.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Tze-nan Wu <Tze-nan.Wu@mediatek.com> Fixes: 4ccf11c4e8a8e ("tracing/synthetic: Fix races on freeing last_cmd") Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-04-03tracing/synthetic: Fix races on freeing last_cmdTze-nan Wu
Currently, the "last_cmd" variable can be accessed by multiple processes asynchronously when multiple users manipulate synthetic_events node at the same time, it could lead to use-after-free or double-free. This patch add "lastcmd_mutex" to prevent "last_cmd" from being accessed asynchronously. ================================================================ It's easy to reproduce in the KASAN environment by running the two scripts below in different shells. script 1: while : do echo -n -e '\x88' > /sys/kernel/tracing/synthetic_events done script 2: while : do echo -n -e '\xb0' > /sys/kernel/tracing/synthetic_events done ================================================================ double-free scenario: process A process B ------------------- --------------- 1.kstrdup last_cmd 2.free last_cmd 3.free last_cmd(double-free) ================================================================ use-after-free scenario: process A process B ------------------- --------------- 1.kstrdup last_cmd 2.free last_cmd 3.tracing_log_err(use-after-free) ================================================================ Appendix 1. KASAN report double-free: BUG: KASAN: double-free in kfree+0xdc/0x1d4 Free of addr ***** by task sh/4879 Call trace: ... kfree+0xdc/0x1d4 create_or_delete_synth_event+0x60/0x1e8 trace_parse_run_command+0x2bc/0x4b8 synth_events_write+0x20/0x30 vfs_write+0x200/0x830 ... Allocated by task 4879: ... kstrdup+0x5c/0x98 create_or_delete_synth_event+0x6c/0x1e8 trace_parse_run_command+0x2bc/0x4b8 synth_events_write+0x20/0x30 vfs_write+0x200/0x830 ... Freed by task 5464: ... kfree+0xdc/0x1d4 create_or_delete_synth_event+0x60/0x1e8 trace_parse_run_command+0x2bc/0x4b8 synth_events_write+0x20/0x30 vfs_write+0x200/0x830 ... ================================================================ Appendix 2. KASAN report use-after-free: BUG: KASAN: use-after-free in strlen+0x5c/0x7c Read of size 1 at addr ***** by task sh/5483 sh: CPU: 7 PID: 5483 Comm: sh ... __asan_report_load1_noabort+0x34/0x44 strlen+0x5c/0x7c tracing_log_err+0x60/0x444 create_or_delete_synth_event+0xc4/0x204 trace_parse_run_command+0x2bc/0x4b8 synth_events_write+0x20/0x30 vfs_write+0x200/0x830 ... Allocated by task 5483: ... kstrdup+0x5c/0x98 create_or_delete_synth_event+0x80/0x204 trace_parse_run_command+0x2bc/0x4b8 synth_events_write+0x20/0x30 vfs_write+0x200/0x830 ... Freed by task 5480: ... kfree+0xdc/0x1d4 create_or_delete_synth_event+0x74/0x204 trace_parse_run_command+0x2bc/0x4b8 synth_events_write+0x20/0x30 vfs_write+0x200/0x830 ... Link: https://lore.kernel.org/linux-trace-kernel/20230321110444.1587-1-Tze-nan.Wu@mediatek.com Fixes: 27c888da9867 ("tracing: Remove size restriction on synthetic event cmd error logging") Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: "Tom Zanussi" <zanussi@kernel.org> Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-02-25Merge tag 'probes-v6.3-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull kprobes cleanup updates from Masami Hiramatsu: "These are probe events cleanups, no new features but improve readability: - Rename print_probe_args() to trace_probe_print_args() and un-inline it - Introduce a set of default data fetch functions for dynamic probe events - Extract common code of data fetch process of dynamic probe events" * tag 'probes-v6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: kernel/trace: extract common part in process_fetch_insn kernel/trace: Provide default impelentations defined in trace_probe_tmpl.h kernel/trace: Introduce trace_probe_print_args and use it in *probes
2023-02-24kernel/trace: Provide default impelentations defined in trace_probe_tmpl.hSong Chen
There are 6 function definitions in trace_probe_tmpl.h, they are: 1, fetch_store_strlen 2, fetch_store_string 3, fetch_store_strlen_user 4, fetch_store_string_user 5, probe_mem_read 6, probe_mem_read_user Every C file which includes trace_probe_tmpl.h has to implement them, otherwise it gets warnings and errors. However, some of them are identical, like kprobe and eprobe, as a result, there is a lot redundant code in those 2 files. This patch would like to provide default behaviors for those functions which kprobe and eprobe can share by just including trace_probe_kernel.h with trace_probe_tmpl.h together. It removes redundant code, increases readability, and more importantly, makes it easier to introduce a new feature based on trace probe (it's possible). Link: https://lore.kernel.org/all/1672382018-18347-1-git-send-email-chensong_2000@189.cn/ Signed-off-by: Song Chen <chensong_2000@189.cn> Reported-by: kernel test robot <lkp@intel.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-02-15tracing/histogram: Don't use strlen to find length of stacktrace variablesTom Zanussi
Because stacktraces are saved in dynamic strings, trace_event_raw_event_synth() uses strlen to determine the length of the stack. Stacktraces may contain 0-bytes, though, in the saved addresses, so the length found and passed to reserve() will be too small. Fix this by using the first unsigned long in the stack variables to store the actual number of elements in the stack and have trace_event_raw_event_synth() use that to determine the length of the stack. Link: https://lkml.kernel.org/r/1ed6906cd9d6477ef2bd8e63c61de20a9ffe64d7.1676063532.git.zanussi@kernel.org Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-02-07tracing: Fix trace_event_raw_event_synth() if else statementSteven Rostedt (Google)
The test to check if the field is a stack is to be done if it is not a string. But the code had: } if (event->fields[i]->is_stack) { and not } else if (event->fields[i]->is_stack) { which would cause it to always be tested. Worse yet, this also included an "else" statement that was only to be called if the field was not a string and a stack, but this code allows it to be called if it was a string (and not a stack). Also fixed some whitespace issues. Link: https://lore.kernel.org/all/202301302110.mEtNwkBD-lkp@intel.com/ Link: https://lore.kernel.org/linux-trace-kernel/20230131095237.63e3ca8d@gandalf.local.home Cc: Tom Zanussi <zanussi@kernel.org> Fixes: 00cf3d672a9d ("tracing: Allow synthetic events to pass around stacktraces") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-01-25tracing: Allow synthetic events to pass around stacktracesSteven Rostedt (Google)
Allow a stacktrace from one event to be displayed by the end event of a synthetic event. This is very useful when looking for the longest latency of a sleep or something blocked on I/O. # cd /sys/kernel/tracing/ # echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events # echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace if prev_state == 1||prev_state == 2' > events/sched/sched_switch/trigger # echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(block_lat,prev_pid,$delta,$s)' >> events/sched/sched_switch/trigger The above creates a "block_lat" synthetic event that take the stacktrace of when a task schedules out in either the interruptible or uninterruptible states, and on a new per process max $delta (the time it was scheduled out), will print the process id and the stacktrace. # echo 1 > events/synthetic/block_lat/enable # cat trace # TASK-PID CPU# ||||| TIMESTAMP FUNCTION # | | | ||||| | | kworker/u16:0-767 [006] d..4. 560.645045: block_lat: pid=767 delta=66 stack=STACK: => __schedule => schedule => pipe_read => vfs_read => ksys_read => do_syscall_64 => 0x966000aa <idle>-0 [003] d..4. 561.132117: block_lat: pid=0 delta=413787 stack=STACK: => __schedule => schedule => schedule_hrtimeout_range_clock => do_sys_poll => __x64_sys_poll => do_syscall_64 => 0x966000aa <...>-153 [006] d..4. 562.068407: block_lat: pid=153 delta=54 stack=STACK: => __schedule => schedule => io_schedule => rq_qos_wait => wbt_wait => __rq_qos_throttle => blk_mq_submit_bio => submit_bio_noacct_nocheck => ext4_bio_write_page => mpage_submit_page => mpage_process_page_bufs => mpage_prepare_extent_to_map => ext4_do_writepages => ext4_writepages => do_writepages => __writeback_single_inode Link: https://lkml.kernel.org/r/20230117152236.010941267@goodmis.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Tom Zanussi <zanussi@kernel.org> Cc: Ross Zwisler <zwisler@google.com> Cc: Ching-lin Yu <chinglinyu@google.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-01-25tracing: Simplify calculating entry size using struct_size()Steven Rostedt (Google)
When tracing a dynamic string field for a synthetic event, the offset calculation for where to write the next event can use struct_size() to find what the current size of the structure is. This simplifies the code and makes it less error prone. Link: https://lkml.kernel.org/r/20230117152235.698632147@goodmis.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Tom Zanussi <zanussi@kernel.org> Cc: Ross Zwisler <zwisler@google.com> Cc: Ching-lin Yu <chinglinyu@google.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-12-10tracing: Fix issue of missing one synthetic fieldZheng Yejian
The maximum number of synthetic fields supported is defined as SYNTH_FIELDS_MAX which value currently is 64, but it actually fails when try to generate a synthetic event with 64 fields by executing like: # echo "my_synth_event int v1; int v2; int v3; int v4; int v5; int v6;\ int v7; int v8; int v9; int v10; int v11; int v12; int v13; int v14;\ int v15; int v16; int v17; int v18; int v19; int v20; int v21; int v22;\ int v23; int v24; int v25; int v26; int v27; int v28; int v29; int v30;\ int v31; int v32; int v33; int v34; int v35; int v36; int v37; int v38;\ int v39; int v40; int v41; int v42; int v43; int v44; int v45; int v46;\ int v47; int v48; int v49; int v50; int v51; int v52; int v53; int v54;\ int v55; int v56; int v57; int v58; int v59; int v60; int v61; int v62;\ int v63; int v64" >> /sys/kernel/tracing/synthetic_events Correct the field counting to fix it. Link: https://lore.kernel.org/linux-trace-kernel/20221207091557.3137904-1-zhengyejian1@huawei.com Cc: <mhiramat@kernel.org> Cc: <zanussi@kernel.org> Cc: stable@vger.kernel.org Fixes: c9e759b1e845 ("tracing: Rework synthetic event command parsing") Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-11-23tracing: Add tracing_reset_all_online_cpus_unlocked() functionSteven Rostedt (Google)
Currently the tracing_reset_all_online_cpus() requires the trace_types_lock held. But only one caller of this function actually has that lock held before calling it, and the other just takes the lock so that it can call it. More users of this function is needed where the lock is not held. Add a tracing_reset_all_online_cpus_unlocked() function for the one use case that calls it without being held, and also add a lockdep_assert to make sure it is held when called. Then have tracing_reset_all_online_cpus() take the lock internally, such that callers do not need to worry about taking it. Link: https://lkml.kernel.org/r/20221123192741.658273220@goodmis.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-11-17tracing: Fix wild-memory-access in register_synth_event()Shang XiaoJing
In register_synth_event(), if set_synth_event_print_fmt() failed, then both trace_remove_event_call() and unregister_trace_event() will be called, which means the trace_event_call will call __unregister_trace_event() twice. As the result, the second unregister will causes the wild-memory-access. register_synth_event set_synth_event_print_fmt failed trace_remove_event_call event_remove if call->event.funcs then __unregister_trace_event (first call) unregister_trace_event __unregister_trace_event (second call) Fix the bug by avoiding to call the second __unregister_trace_event() by checking if the first one is called. general protection fault, probably for non-canonical address 0xfbd59c0000000024: 0000 [#1] SMP KASAN PTI KASAN: maybe wild-memory-access in range [0xdead000000000120-0xdead000000000127] CPU: 0 PID: 3807 Comm: modprobe Not tainted 6.1.0-rc1-00186-g76f33a7eedb4 #299 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_trace_event+0x6e/0x280 Code: 00 fc ff df 4c 89 ea 48 c1 ea 03 80 3c 02 00 0f 85 0e 02 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 63 08 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 0f 85 e2 01 00 00 49 89 2c 24 48 85 ed 74 28 e8 7a 9b RSP: 0018:ffff88810413f370 EFLAGS: 00010a06 RAX: dffffc0000000000 RBX: ffff888105d050b0 RCX: 0000000000000000 RDX: 1bd5a00000000024 RSI: ffff888119e276e0 RDI: ffffffff835a8b20 RBP: dead000000000100 R08: 0000000000000000 R09: fffffbfff0913481 R10: ffffffff8489a407 R11: fffffbfff0913480 R12: dead000000000122 R13: ffff888105d050b8 R14: 0000000000000000 R15: ffff888105d05028 FS: 00007f7823e8d540(0000) GS:ffff888119e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7823e7ebec CR3: 000000010a058002 CR4: 0000000000330ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __create_synth_event+0x1e37/0x1eb0 create_or_delete_synth_event+0x110/0x250 synth_event_run_command+0x2f/0x110 test_gen_synth_cmd+0x170/0x2eb [synth_event_gen_test] synth_event_gen_test_init+0x76/0x9bc [synth_event_gen_test] do_one_initcall+0xdb/0x480 do_init_module+0x1cf/0x680 load_module+0x6a50/0x70a0 __do_sys_finit_module+0x12f/0x1c0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Link: https://lkml.kernel.org/r/20221117012346.22647-3-shangxiaojing@huawei.com Fixes: 4b147936fa50 ("tracing: Add support for 'synthetic' events") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Cc: stable@vger.kernel.org Cc: <mhiramat@kernel.org> Cc: <zanussi@kernel.org> Cc: <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-10-12tracing: Fix reading strings from synthetic eventsSteven Rostedt (Google)
The follow commands caused a crash: # cd /sys/kernel/tracing # echo 's:open char file[]' > dynamic_events # echo 'hist:keys=common_pid:file=filename:onchange($file).trace(open,$file)' > events/syscalls/sys_enter_openat/trigger' # echo 1 > events/synthetic/open/enable BOOM! The problem is that the synthetic event field "char file[]" will read the value given to it as a string without any memory checks to make sure the address is valid. The above example will pass in the user space address and the sythetic event code will happily call strlen() on it and then strscpy() where either one will cause an oops when accessing user space addresses. Use the helper functions from trace_kprobe and trace_eprobe that can read strings safely (and actually succeed when the address is from user space and the memory is mapped in). Now the above can show: packagekitd-1721 [000] ...2. 104.597170: open: file=/usr/lib/rpm/fileattrs/cmake.attr in:imjournal-978 [006] ...2. 104.599642: open: file=/var/lib/rsyslog/imjournal.state.tmp packagekitd-1721 [000] ...2. 104.626308: open: file=/usr/lib/rpm/fileattrs/debuginfo.attr Link: https://lkml.kernel.org/r/20221012104534.826549315@goodmis.org Cc: stable@vger.kernel.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Tom Zanussi <zanussi@kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Tom Zanussi <zanussi@kernel.org> Fixes: bd82631d7ccdc ("tracing: Add support for dynamic strings to synthetic events") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-11tracing: Fix strncpy warning in trace_events_synth.cTom Zanussi
0-day reported the strncpy error below: ../kernel/trace/trace_events_synth.c: In function 'last_cmd_set': ../kernel/trace/trace_events_synth.c:65:9: warning: 'strncpy' specified bound depends on the length o\ f the source argument [-Wstringop-truncation] 65 | strncpy(last_cmd, str, strlen(str) + 1); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../kernel/trace/trace_events_synth.c:65:32: note: length computed here 65 | strncpy(last_cmd, str, strlen(str) + 1); | ^~~~~~~~~~~ There's no reason to use strncpy here, in fact there's no reason to do anything but a simple kstrdup() (note we don't even need to check for failure since last_cmod is expected to be either the last cmd string or NULL, and the containing function is a void return). Link: https://lkml.kernel.org/r/77deca8cbfd226981b3f1eab203967381e9b5bd9.camel@kernel.org Fixes: 27c888da9867 ("tracing: Remove size restriction on synthetic event cmd error logging") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-10tracing: Remove size restriction on synthetic event cmd error loggingTom Zanussi
Currently, synthetic event command error strings are restricted to a length of MAX_FILTER_STR_VAL (256), which is too short for some commands already seen in the wild (with cmd strings longer than that showing up truncated in err_log). Remove the restriction so that no synthetic event command error string is ever truncated. Link: https://lkml.kernel.org/r/0376692396a81d0b795127c66ea92ca5bf60f481.1643399022.git.zanussi@kernel.org Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-01-16Merge tag 'trace-v5.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "New: - The Real Time Linux Analysis (RTLA) tool is added to the tools directory. - Can safely filter on user space pointers with: field.ustring ~ "match-string" - eprobes can now be filtered like any other event. - trace_marker(_raw) now uses stream_open() to allow multiple threads to safely write to it. Note, this could possibly break existing user space, but we will not know until we hear about it, and then can revert the change if need be. - New field in events to display when bottom halfs are disabled. - Sorting of the ftrace functions are now done at compile time instead of at bootup. Infrastructure changes to support future efforts: - Added __rel_loc type for trace events. Similar to __data_loc but the offset to the dynamic data is based off of the location of the descriptor and not the beginning of the event. Needed for user defined events. - Some simplification of event trigger code. - Make synthetic events process its callback better to not hinder other event callbacks that are registered. Needed for user defined events. And other small fixes and cleanups" * tag 'trace-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (50 commits) tracing: Add ustring operation to filtering string pointers rtla: Add rtla timerlat hist documentation rtla: Add rtla timerlat top documentation rtla: Add rtla timerlat documentation rtla: Add rtla osnoise hist documentation rtla: Add rtla osnoise top documentation rtla: Add rtla osnoise man page rtla: Add Documentation rtla/timerlat: Add timerlat hist mode rtla: Add timerlat tool and timelart top mode rtla/osnoise: Add the hist mode rtla/osnoise: Add osnoise top mode rtla: Add osnoise tool rtla: Helper functions for rtla rtla: Real-Time Linux Analysis tool tracing/osnoise: Properly unhook events if start_per_cpu_kthreads() fails tracing: Remove duplicate warnings when calling trace_create_file() tracing/kprobes: 'nmissed' not showed correctly for kretprobe tracing: Add test for user space strings when filtering on string pointers tracing: Have syscall trace events use trace_event_buffer_lock_reserve() ...
2021-12-11tracing: Do not let synth_events block other dyn_event systems during createBeau Belgrave
synth_events is returning -EINVAL if the dyn_event create command does not contain ' \t'. This prevents other systems from getting called back. synth_events needs to return -ECANCELED in these cases when the command is not targeting the synth_event system. Link: https://lore.kernel.org/linux-trace-devel/20210930223821.11025-1-beaub@linux.microsoft.com Fixes: c9e759b1e8456 ("tracing: Rework synthetic event command parsing") Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-12-09tracing: Fix possible memory leak in __create_synth_event() error pathMiaoqian Lin
There's error paths in __create_synth_event() after the argv is allocated that fail to free it. Add a jump to free it when necessary. Link: https://lkml.kernel.org/r/20211209024317.11783-1-linmq006@gmail.com Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Miaoqian Lin <linmq006@gmail.com> [ Fixed up the patch and change log ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-12-06tracing: Fix synth_event_add_val() kernel-doc commentQiujun Huang
It's named field here. Link: https://lkml.kernel.org/r/20210516022410.64271-1-hqjagain@gmail.com Signed-off-by: Qiujun Huang <hqjagain@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-08tracing: Disable "other" permission bits in the tracefs filesSteven Rostedt (VMware)
When building the files in the tracefs file system, do not by default set any permissions for OTH (other). This will make it easier for admins who want to define a group for accessing tracefs and not having to first disable all the permission bits for "other" in the file system. As tracing can leak sensitive information, it should never by default allowing all users access. An admin can still set the permission bits for others to have access, which may be useful for creating a honeypot and seeing who takes advantage of it and roots the machine. Link: https://lkml.kernel.org/r/20210818153038.864149276@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-18tracing: Have dynamic events have a ref counterSteven Rostedt (VMware)
As dynamic events are not created by modules, if something is attached to one, calling "try_module_get()" on its "mod" field, is not going to keep the dynamic event from going away. Since dynamic events do not need the "mod" pointer of the event structure, make a union out of it in order to save memory (there's one structure for each of the thousand+ events in the kernel), and have any event with the DYNAMIC flag set to use a ref counter instead. Link: https://lore.kernel.org/linux-trace-devel/20210813004448.51c7de69ce432d338f4d226b@kernel.org/ Link: https://lkml.kernel.org/r/20210817035027.174869074@goodmis.org Suggested-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-18tracing: Add DYNAMIC flag for dynamic eventsSteven Rostedt (VMware)
To differentiate between static and dynamic events, add a new flag DYNAMIC to the event flags that all dynamic events have set. This will allow to differentiate when attaching to a dynamic event from a static event. Static events have a mod pointer that references the module they were created in (or NULL for core kernel). This can be incremented when the event has something attached to it. But there exists no such mechanism for dynamic events. This is dangerous as the dynamic events may now disappear without the "attachment" knowing that it no longer exists. To enforce the dynamic flag, change dyn_event_add() to pass the event that is being created such that it can set the DYNAMIC flag of the event. This helps make sure that no location that creates a dynamic event misses setting this flag. Link: https://lore.kernel.org/linux-trace-devel/20210813004448.51c7de69ce432d338f4d226b@kernel.org/ Link: https://lkml.kernel.org/r/20210817035026.936958254@goodmis.org Suggested-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-07-23tracing: Clean up alloc_synth_event()Steven Rostedt (VMware)
alloc_synth_event() currently has the following code to initialize the event fields and dynamic_fields: for (i = 0, j = 0; i < n_fields; i++) { event->fields[i] = fields[i]; if (fields[i]->is_dynamic) { event->dynamic_fields[j] = fields[i]; event->dynamic_fields[j]->field_pos = i; event->dynamic_fields[j++] = fields[i]; event->n_dynamic_fields++; } } 1) It would make more sense to have all fields keep track of their field_pos. 2) event->dynmaic_fields[j] is assigned twice for no reason. 3) We can move updating event->n_dynamic_fields outside the loop, and just assign it to j. This combination makes the code much cleaner. Link: https://lkml.kernel.org/r/20210721195341.29bb0f77@oasis.local.home Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-23tracing: Fix various typos in commentsIngo Molnar
Fix ~59 single-word typos in the tracing code comments, and fix the grammar in a handful of places. Link: https://lore.kernel.org/r/20210322224546.GA1981273@gmail.com Link: https://lkml.kernel.org/r/20210323174935.GA4176821@gmail.com Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04tracing: Fix memory leak in __create_synth_event()Vamshi K Sthambamkadi
kmemleak report: unreferenced object 0xc5a6f708 (size 8): comm "ftracetest", pid 1209, jiffies 4294911500 (age 6.816s) hex dump (first 8 bytes): 00 c1 3d 60 14 83 1f 8a ..=`.... backtrace: [<f0aa4ac4>] __kmalloc_track_caller+0x2a6/0x460 [<7d3d60a6>] kstrndup+0x37/0x70 [<45a0e739>] argv_split+0x1c/0x120 [<c17982f8>] __create_synth_event+0x192/0xb00 [<0708b8a3>] create_synth_event+0xbb/0x150 [<3d1941e1>] create_dyn_event+0x5c/0xb0 [<5cf8b9e3>] trace_parse_run_command+0xa7/0x140 [<04deb2ef>] dyn_event_write+0x10/0x20 [<8779ac95>] vfs_write+0xa9/0x3c0 [<ed93722a>] ksys_write+0x89/0xc0 [<b9ca0507>] __ia32_sys_write+0x15/0x20 [<7ce02d85>] __do_fast_syscall_32+0x45/0x80 [<cb0ecb35>] do_fast_syscall_32+0x29/0x60 [<2467454a>] do_SYSENTER_32+0x15/0x20 [<9beaa61d>] entry_SYSENTER_32+0xa9/0xfc unreferenced object 0xc5a6f078 (size 8): comm "ftracetest", pid 1209, jiffies 4294911500 (age 6.816s) hex dump (first 8 bytes): 08 f7 a6 c5 00 00 00 00 ........ backtrace: [<bbac096a>] __kmalloc+0x2b6/0x470 [<aa2624b4>] argv_split+0x82/0x120 [<c17982f8>] __create_synth_event+0x192/0xb00 [<0708b8a3>] create_synth_event+0xbb/0x150 [<3d1941e1>] create_dyn_event+0x5c/0xb0 [<5cf8b9e3>] trace_parse_run_command+0xa7/0x140 [<04deb2ef>] dyn_event_write+0x10/0x20 [<8779ac95>] vfs_write+0xa9/0x3c0 [<ed93722a>] ksys_write+0x89/0xc0 [<b9ca0507>] __ia32_sys_write+0x15/0x20 [<7ce02d85>] __do_fast_syscall_32+0x45/0x80 [<cb0ecb35>] do_fast_syscall_32+0x29/0x60 [<2467454a>] do_SYSENTER_32+0x15/0x20 [<9beaa61d>] entry_SYSENTER_32+0xa9/0xfc In __create_synth_event(), while iterating field/type arguments, the argv_split() will return array of atleast 2 elements even when zero arguments(argc=0) are passed. for e.g. when there is double delimiter or string ends with delimiter To fix call argv_free() even when argc=0. Link: https://lkml.kernel.org/r/20210304094521.GA1826@cosmos Signed-off-by: Vamshi K Sthambamkadi <vamshi.k.sthambamkadi@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-09tracing: Add a backward-compatibility check for synthetic event creationTom Zanussi
The synthetic event parsing rework now requires semicolons between synthetic event fields. That requirement breaks existing users who might already have used the old synthetic event command format, so this adds an inner loop that can parse more than one field, if present, between semicolons. For each field, parse_synth_field() checks in which version that field was introduced, using check_field_version(). The caller, __create_synth_event() can then use that version information to determine whether or not to enforce the requirement on the command as a whole. In the future, if/when new features are added, the requirement will be that any field/string containing the new feature must use semicolons, and the check_field_version() check can then check for those and enforce it. Using a version number allows this scheme to be extended if necessary. Link: https://lkml.kernel.org/r/74fcc500d561b40ce91c5ee94818c70c6b0c9330.1612208610.git.zanussi@kernel.org [ zanussi: added check_field_version() comment from rostedt@goodmis.org ] Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-09tracing: Update synth command errorsTom Zanussi
Since array types are handled differently, errors referencing them also need to be handled differently. Add and use a new INVALID_ARRAY_SPEC error. Also add INVALID_CMD and INVALID_DYN_CMD to catch and display the correct form for badly-formed commands, which can also be used in place of CMD_INCOMPLETE, which is removed, and remove CMD_TOO_LONG, since it's no longer used. Link: https://lkml.kernel.org/r/b9dd434dc6458dcff11adc6ed616fe93a8794770.1612208610.git.zanussi@kernel.org Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-09tracing: Rework synthetic event command parsingTom Zanussi
Now that command parsing has been delegated to the create functions and we're no longer constrained by argv_split(), we can modify the synthetic event command parser to better match the higher-level structure of the synthetic event commands, which is basically an event name followed by a set of semicolon-separated fields. Since we're also now passed the raw command, we can also save it directly and can get rid of save_cmdstr(). Link: https://lkml.kernel.org/r/cb9e2be92d992ce59f2b4f132264a5d467f3933f.1612208610.git.zanussi@kernel.org Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>