Hi list,

I'm trying to profile some Linux kernel subsystems with systemtap, so I want to know the overhead of the various kinds of systemtap probes.

Basically, I sampled two kernel function probes:

    probe kernel.function("arch_cpu_idle_enter").call
    probe kernel.function("local_touch_nmi").call

========
A snippet of the objdump output for the code:

    ffffffff81035740 :
    ffffffff81035740: e8 4b 76 55 00 callq ffffffff8158cd90 <__fentry__>
    ffffffff81035745: e8 f6 96 ff ff callq ffffffff8102ee40
========

Because local IRQs are disabled and the second call immediately follows the first, I expected the time between these two probes to be essentially zero. However, when I run the attached stp script, here is what I get:

    $ stap test.stp
    Press ^C to stop
    Total time: 10106 miliseconds
    t_end @count=2908 @min=676 @max=88152 @sum=15005802 @avg=5160

The duration varies from 676 ns to 88152 ns. I understand that any tracing mechanism has some overhead, and 676 ns is acceptable, but 88152 ns is not. I have no idea why the range is so large.

Please let me know if I can provide more information from my side.

Thanks,
-Aubrey

    $ stap -V
    Systemtap translator/driver (version 3.1/0.167, commit release-3.0-273-g7d27ff5084d6)
    Copyright (C) 2005-2016 Red Hat, Inc. and others
    This is free software; see the source for copying conditions.
    tested kernel versions: 2.6.18 ... 4.8
    enabled features: NLS

    $ uname -a
    Linux aubrey-hp 4.8.10+ #5 SMP Fri Nov 25 02:11:32 CST 2016 x86_64 GNU/Linux