public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
From: "dsmith at redhat dot com" <sourceware-bugzilla@sourceware.org>
To: systemtap@sourceware.org
Subject: [Bug runtime/20742] kernel pointer dereference BUG on RHEL6 s390x
Date: Tue, 01 Nov 2016 20:15:00 -0000	[thread overview]
Message-ID: <bug-20742-6586-N4nCqxzzU7@http.sourceware.org/bugzilla/> (raw)
In-Reply-To: <bug-20742-6586@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=20742

--- Comment #1 from David Smith <dsmith at redhat dot com> ---
Created attachment 9606
  --> https://sourceware.org/bugzilla/attachment.cgi?id=9606&action=edit
proposed patch

Here's a patch that attempts to fix this problem. There are actually 3 fixes in
this patch.

Fix #1: uprobe_report_clone(), did the following:

====
        rcu_read_lock();
        ptask = (struct uprobe_task *)rcu_dereference(engine->data);
        uproc = ptask->uproc;
        rcu_read_unlock();

        /*
         * Lock uproc so no new uprobes can be installed 'til all
         * report_clone activities are completed.  Lock uproc_table
         * in case we have to run uprobe_fork_uproc().
         */
        lock_uproc_table();
        down_write(&uproc->rwsem);
====

This isn't correct. When we're under the rcu_read_lock(), the ptask and uproc
points should be valid. However, outside of the rcu protection, the ptask/uproc
memory can be freed.

To fix this, I changed the code to the following:

====
        rcu_read_lock();
        ptask = (struct uprobe_task *)rcu_dereference(engine->data);
        BUG_ON(!ptask);
        /* Keep uproc intact until just before we return. */
        uproc = uprobe_get_process(ptask->uproc);
        rcu_read_unlock();

        if (!uproc)
                /* uprobe_free_process() has probably clobbered utask->proc. */
                return UTRACE_DETACH;

        /*
         * Lock uproc so no new uprobes can be installed 'til all
         * report_clone activities are completed.  Lock uproc_table
         * in case we have to run uprobe_fork_uproc().
         */
        lock_uproc_table();
        down_write(&uproc->rwsem);
====

This changes the code to increase the reference count on uproc so it won't get
freed. Similar code is present in uprobe_report_signal(), uprobe_report_exit(),
etc.

I also added code to uprobe_report_clone() to decrement the reference count on
uproc before returning.

When the systemtap.unprivileged/pr16806.exp is run with the
uprobe_report_clone() changes, the test no longer crashed the system, but 2 new
problems popped up:

[ BUG: lock held when returning to user space! ]                                
------------------------------------------------                                
loop/4866 is leaving the kernel with locks still held!                          
2 locks held by loop/4866:                                                      
 #0:  (&uproc->rwsem){+++++.}, at: [<000003e00108ad32>]
uprobe_report_signal+0x2
96/0xe80 [uprobes]                                                              
 #1:  (&slot->rwsem){+.+...}, at: [<000003e0010898e0>]
uprobe_find_insn_slot+0x1
60/0x33c [uprobes]                                                              


Fix #2: To fix first lock being held problem above I took a look at
uprobe_report_signal(). That code is quit complicated, and has 'goto'
statements that jump from one switch statement case to another. I could see
several paths through that code that could lead to uproc->rwsem still being
held on exit.

The changes to uprobe_report_signal() attempt to fix those issues.


Fix #3: uprobe_find_insn_slot() always returns the slot read-locked. It is only
called by uprobe_get_insn_slot(), which returns that read-locked slot.
uprobe_get_insn_slot() is only called by uprobe_pre_ssout(), which wasn't
unlocking the slot. I changed uprobe_pre_ssout() to unlock the slot before
returning.


With this path the test passes fine. However, I'll need to do more testing to
ensure I haven't broken anything else.

-- 
You are receiving this mail because:
You are the assignee for the bug.

  reply	other threads:[~2016-11-01 20:15 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-27 19:47 [Bug runtime/20742] New: " dsmith at redhat dot com
2016-11-01 20:15 ` dsmith at redhat dot com [this message]
2016-11-10 16:00 ` [Bug runtime/20742] " dsmith at redhat dot com

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-20742-6586-N4nCqxzzU7@http.sourceware.org/bugzilla/ \
    --to=sourceware-bugzilla@sourceware.org \
    --cc=systemtap@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).