* [Bug tapsets/20734] New: "sleeping function called from invalid context" bogus kernel BUG on s390x
@ 2016-10-24 16:55 dsmith at redhat dot com
2017-10-05 18:13 ` [Bug tapsets/20734] " dsmith at redhat dot com
From: dsmith at redhat dot com @ 2016-10-24 16:55 UTC
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=20734
Bug ID: 20734
Summary: "sleeping function called from invalid context" bogus
kernel BUG on s390x
Product: systemtap
Version: unspecified
Status: NEW
Severity: normal
Priority: P2
Component: tapsets
Assignee: systemtap at sourceware dot org
Reporter: dsmith at redhat dot com
Target Milestone: ---
Host: s390x
On the RHEL6.8 debug kernel (2.6.32-642.el6.s390x.debug), we're getting a bogus
kernel BUG:
====
BUG: sleeping function called from invalid context at mm/memory.c:3752
in_atomic(): 1, irqs_disabled(): 0, pid: 52998, name: cxxclass.exe
INFO: lockdep is turned off.
CPU: 0 Not tainted 2.6.32-642.el6.s390x.debug #1
Process cxxclass.exe (pid: 52998, task: 000000004197c540, ksp: 000000003b8cf9d8)
000000003b8cfa28 000000003b8cf9a8 0000000000000002 0000000000000000
000000003b8cfa48 000000003b8cf9c0 000000003b8cf9c0 000000000050ed16
0000000000000000 000000004197ca30 0000000000000001 0000000000400868
000000000000000d 000000000000000c 000000003b8cfa18 0000000000000000
0000000000000000 0000000000105fbc 000000003b8cf9a8 000000003b8cf9e8
Call Trace:
([<0000000000105eb4>] show_trace+0xf0/0x148)
[<000000000013c3ee>] __might_sleep+0x12e/0x15c
[<0000000000230b16>] might_fault+0x42/0xc8
[<000003e00266ecca>] function___global_user_string__overload_0+0x18a/0x2d8 [stap_2d8297dddcad2965f3be0ef5bb85c32_52997]
[<000003e0026752aa>] probe_2842+0x9e/0x294 [stap_2d8297dddcad2965f3be0ef5bb85c32_52997]
[<000003e002672dac>] enter_uprobe_probe+0x250/0x474 [stap_2d8297dddcad2965f3be0ef5bb85c32_52997]
[<000003e00108f492>] uprobe_report_signal+0xa86/0xe80 [uprobes]
[<00000000001bf10c>] utrace_get_signal+0x2d4/0x758
[<000000000016744c>] get_signal_to_deliver+0x35c/0x4b8
[<000000000010ebbc>] do_signal+0x90/0xa1c
[<000000000011a1d0>] sysc_sigpending+0xe/0x22
[<00000000004006ea>] cio_clear+0x136/0x180
====
This kernel BUG is bogus, for the following reason. The error is raised in the
tapset function user_string(), which ends up calling
_stp_strncpy_from_user(), which looks like this:
====
static long _stp_strncpy_from_user(char *dst, const char __user *src, long count)
{
        long res = -EFAULT;
        mm_segment_t _oldfs = get_fs();
        set_fs(USER_DS);
        pagefault_disable();
        if (!lookup_bad_addr(VERIFY_READ, (const unsigned long)src, count))
                res = strncpy_from_user(dst, src, count);
        pagefault_enable();
        set_fs(_oldfs);
        return res;
}
====
So, in that function, we disable page faults and call the kernel's
strncpy_from_user(). For RHEL6 s390x, that comes from
arch/s390/include/asm/uaccess.h and looks like this:
====
static inline long __must_check
strncpy_from_user(char *dst, const char __user *src, long count)
{
        long res = -EFAULT;
        might_fault();
        if (access_ok(VERIFY_READ, src, 1))
                res = uaccess.strncpy_from_user(count, src, dst);
        return res;
}
====
The 'might_fault()' call here is the problem. In RHEL6, it looks like this
(from mm/memory.c):
====
void might_fault(void)
{
        /*
         * Some code (nfs/sunrpc) uses socket ops on kernel memory while
         * holding the mmap_sem, this is safe because kernel memory doesn't
         * get paged out, therefore we'll never actually fault, and the
         * below annotations will generate false positives.
         */
        if (segment_eq(get_fs(), KERNEL_DS))
                return;
        might_sleep();
        /*
         * it would be nicer only to annotate paths which are not under
         * pagefault_disable, however that requires a larger audit and
         * providing helpers like get_user_atomic.
         */
        if (!in_atomic() && current->mm)
                might_lock_read(&current->mm->mmap_sem);
}
====
In RHEL7 and newer kernels, the following check comes before the
'might_sleep()' call:
====
        if (pagefault_disabled())
                return;
====
So, we're getting this kernel BUG report because of a bug in RHEL6's
might_fault(): it doesn't check whether page faults are disabled before
calling might_sleep().
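The difference between the two might_fault() variants can be modelled in user
space. What follows is only a sketch, not kernel code: every function is a
stand-in, the names might_fault_rhel6(), might_fault_newer() and
warnings_with_pagefaults_disabled() are made up for illustration, and the
separate pagefault_depth counter is an assumption about what
pagefault_disabled() reports. The one kernel behaviour it relies on is that on
2.6.32-era kernels pagefault_disable() raises the preempt count, which matches
the "in_atomic(): 1" in the report above.

```c
/* User-space model only; none of this is the kernel's actual code. */
#include <assert.h>
#include <stdbool.h>

static int preempt_count;      /* models the task's preempt counter */
static int pagefault_depth;    /* assumption: what pagefault_disabled() reports */
static int sleep_warnings;     /* number of "sleeping function" reports raised */

/* On 2.6.32-era kernels pagefault_disable() raises the preempt count. */
static void pagefault_disable(void)
{
        preempt_count++;
        pagefault_depth++;
}

static void pagefault_enable(void)
{
        assert(pagefault_depth > 0);   /* calls must be balanced */
        pagefault_depth--;
        preempt_count--;
}

static bool in_atomic(void) { return preempt_count != 0; }
static bool pagefault_disabled(void) { return pagefault_depth != 0; }

/* Debug-kernel check: report when called from atomic context. */
static void might_sleep(void)
{
        if (in_atomic())
                sleep_warnings++;
}

/* RHEL6-style: goes straight to might_sleep(). */
static void might_fault_rhel6(void)
{
        might_sleep();
}

/* Newer style: returns early when page faults are disabled. */
static void might_fault_newer(void)
{
        if (pagefault_disabled())
                return;
        might_sleep();
}

/* Calls the given variant with page faults disabled; returns the report count. */
static int warnings_with_pagefaults_disabled(void (*might_fault_fn)(void))
{
        sleep_warnings = 0;
        pagefault_disable();
        might_fault_fn();
        pagefault_enable();
        return sleep_warnings;
}
```

In this model, passing might_fault_rhel6 to warnings_with_pagefaults_disabled()
yields one report and passing might_fault_newer yields none, which is the
behaviour described above.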
The only fix I can think of is to have an arch-specific s390x version of
strncpy_from_user() for RHEL6-era kernels that doesn't call might_fault() (or
checks to see if page faulting is disabled before calling might_fault()).
(Alternatively, since we only get this BUG on a debug kernel, we could just
ignore it. If we decide to ignore this BUG, we could treat this bug report as
documentation for this issue.)
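A rough user-space sketch of the second variant of that idea (skip
might_fault() when page faults are disabled) could look like the code below.
It is not a patch: stp_s390x_strncpy_from_user() and raw_strncpy_from_user()
are hypothetical names, the raw copy stands in for
uaccess.strncpy_from_user(), the access_ok() check is left out, and in_atomic()
is used as the "page faults disabled" test on the assumption that a 2.6.32-era
kernel has no separate pagefault_disabled() helper.

```c
/* User-space model only; all kernel functions here are stand-ins. */
#include <assert.h>
#include <stdbool.h>

static int preempt_count;       /* models the task's preempt counter */
static int might_fault_calls;   /* how often the might_fault() stand-in ran */

static void pagefault_disable(void) { preempt_count++; }

static void pagefault_enable(void)
{
        assert(preempt_count > 0);   /* calls must be balanced */
        preempt_count--;
}

static bool in_atomic(void) { return preempt_count != 0; }

/* Stand-in for the kernel's might_fault(); only counts invocations. */
static void might_fault(void) { might_fault_calls++; }

/*
 * Stand-in for uaccess.strncpy_from_user(count, src, dst): copies up to
 * count bytes, returns the string length without the NUL, or count if no
 * NUL was found within count bytes.
 */
static long raw_strncpy_from_user(long count, const char *src, char *dst)
{
        long i;

        for (i = 0; i < count; i++) {
                dst[i] = src[i];
                if (src[i] == '\0')
                        return i;
        }
        return count;
}

/* Hypothetical wrapper: annotate only when page faults are enabled. */
static long stp_s390x_strncpy_from_user(char *dst, const char *src, long count)
{
        if (!in_atomic())
                might_fault();
        return raw_strncpy_from_user(count, src, dst);
}
```

Called between pagefault_disable() and pagefault_enable(), the wrapper never
reaches might_fault(), so the debug check would not fire; outside such a
section the annotation is kept.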
--
You are receiving this mail because:
You are the assignee for the bug.
* [Bug tapsets/20734] "sleeping function called from invalid context" bogus kernel BUG on s390x
2016-10-24 16:55 [Bug tapsets/20734] New: "sleeping function called from invalid context" bogus kernel BUG on s390x dsmith at redhat dot com
@ 2017-10-05 18:13 ` dsmith at redhat dot com
From: dsmith at redhat dot com @ 2017-10-05 18:13 UTC
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=20734
David Smith <dsmith at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
--- Comment #1 from David Smith <dsmith at redhat dot com> ---
This one is fixed now. As part of the bpf runtime changes,
_stp_strncpy_from_user() now calls _stp_deref_string_nofault(), which calls
__stp_get_user(). __stp_get_user() is really __get_user() on RHEL6 s390x. Since
__get_user() doesn't call might_fault(), we don't get this bogus BUG.
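The shape of that approach, copying the string one byte at a time through a
get_user-style primitive, can be sketched in user space as follows. This is
not the runtime's actual code, which is not reproduced here:
model_get_user() and model_deref_string_nofault() are hypothetical stand-ins
for __get_user() and _stp_deref_string_nofault(), and the mapped_start and
mapped_end bounds are an invented way to simulate a faulting address.

```c
/* User-space model only; not the SystemTap runtime's implementation. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Invented for the model: addresses outside [start, end) "fault". */
static const char *mapped_start;
static const char *mapped_end;

/* Stand-in for __get_user(): fetch one byte, 0 on success, -EFAULT on fault. */
static int model_get_user(char *val, const char *uaddr)
{
        if (uaddr < mapped_start || uaddr >= mapped_end)
                return -EFAULT;
        *val = *uaddr;
        return 0;
}

/*
 * Hypothetical string copy built on the primitive above. Copies at most
 * count bytes. Returns the string length without the NUL, count if no NUL
 * was seen within count bytes, or -EFAULT if a byte could not be read.
 * Note that nothing on this path calls a might_fault()-style annotation.
 */
static long model_deref_string_nofault(char *dst, const char *src, long count)
{
        long i;

        assert(dst != NULL && count >= 0);
        for (i = 0; i < count; i++) {
                char c;

                if (model_get_user(&c, src + i))
                        return -EFAULT;
                dst[i] = c;
                if (c == '\0')
                        return i;
        }
        return count;
}
```

With the whole source string inside the mapped range the function returns its
length; if the range ends before the terminating NUL it returns -EFAULT
instead of faulting.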