From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <sourceware-bugzilla@sourceware.org>
Received: by sourceware.org (Postfix, from userid 48)
 id D529B3858410; Fri,  5 Nov 2021 17:58:51 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org D529B3858410
From: "cvs-commit at gcc dot gnu.org" <sourceware-bugzilla@sourceware.org>
To: gdb-prs@sourceware.org
Subject: [Bug threads/28065] FAIL:
 gdb.threads/access-mem-running-thread-exit.exp: all-stop: access mem (print
 global_var after writing again, inf=1, iter=354)
Date: Fri, 05 Nov 2021 17:58:51 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gdb
X-Bugzilla-Component: threads
X-Bugzilla-Version: HEAD
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: normal
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at sourceware dot org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-28065-4717-ONam0khWOq@http.sourceware.org/bugzilla/>
In-Reply-To: <bug-28065-4717@http.sourceware.org/bugzilla/>
References: <bug-28065-4717@http.sourceware.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gdb-prs@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gdb-prs mailing list <gdb-prs.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb-prs>,
 <mailto:gdb-prs-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb-prs/>
List-Help: <mailto:gdb-prs-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb-prs>,
 <mailto:gdb-prs-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Fri, 05 Nov 2021 17:58:51 -0000

https://sourceware.org/bugzilla/show_bug.cgi?id=28065
--- Comment #22 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=8a89ddbda2ecb41be0f12142e5d4b95c7bd5a138

commit 8a89ddbda2ecb41be0f12142e5d4b95c7bd5a138
Author: Pedro Alves <pedro@palves.net>
Date:   Tue Sep 14 19:01:37 2021 +0100

    Avoid /proc/pid/mem races (PR 28065)

    PR 28065 (gdb.threads/access-mem-running-thread-exit.exp intermittent
    failure) shows that GDB can hit an unexpected scenario -- it can
    happen that the kernel manages to open a /proc/PID/task/LWP/mem file,
    but then reading from the file returns 0/EOF, even though the process
    hasn't exited or execed.

    "0" out of read/write is normally what you get when the address space
    of the process the file was opened for is gone, because the process
    execed or exited.  So when GDB gets the 0, it reports a memory access
    failure.  In the bad case in question, the process hasn't execed or
    exited, so GDB fails a memory access when the access should have
    worked.
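
    As an illustration only (this is not GDB's code), the sketch below
    shows how a reader of a /proc/PID/mem descriptor typically maps a
    zero-byte result to "address space gone".  The names
    read_inferior_memory and AddressSpaceGone are hypothetical.

```python
import os


class AddressSpaceGone(Exception):
    """Hypothetical error: the mem file has no address space behind it."""


def read_inferior_memory(mem_fd: int, addr: int, length: int) -> bytes:
    """Read up to `length` bytes at `addr` via an open /proc/PID/mem fd.

    A zero-byte result is treated as "the process execed or exited",
    which is exactly the interpretation that goes wrong in the race
    described above.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    # pread() uses the target address as the file offset.
    data = os.pread(mem_fd, length, addr)
    if not data:
        raise AddressSpaceGone(f"read at {addr:#x} returned 0/EOF")
    return data
```

    The caller cannot tell from the 0 alone whether the process is
    really gone or whether the descriptor was opened through a thread
    that was exiting.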

    GDB has code in place to gracefully handle the case of opening the
    /proc/PID/task/LWP/mem file just as the LWP is exiting -- most often
    the open fails with EACCES or ENOENT.  When that happens, GDB just
    tries opening the file for a different thread of the process.  The
    testcase is written such that it stresses GDB's logic of
    closing/reopening the /proc/PID/task/LWP/mem file, by constantly
    spawning short-lived threads.
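
    The retry logic described above can be sketched roughly as follows.
    This is a simplified illustration, not GDB's implementation; the
    function name and the injectable `opener` parameter are assumptions
    made so the logic can be exercised without a live process.

```python
import errno
import os
from typing import Callable, Iterable, Optional


def open_mem_via_any_thread(
    pid: int,
    tids: Iterable[int],
    opener: Callable[[str, int], int] = os.open,
) -> Optional[int]:
    """Try each thread's mem file in turn; return the first fd that opens.

    EACCES and ENOENT are taken to mean "this thread is exiting", so the
    next thread is tried.  Any other error is propagated to the caller.
    """
    for tid in tids:
        path = f"/proc/{pid}/task/{tid}/mem"
        try:
            return opener(path, os.O_RDWR | os.O_CLOEXEC)
        except OSError as exc:
            if exc.errno in (errno.EACCES, errno.ENOENT):
                continue
            raise
    # No thread could be opened; the whole process is probably gone.
    return None
```

    Note that a successful open here says nothing about whether the
    file actually has an address space attached.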

    However, there's a window where the kernel manages to find the thread,
    but the thread exits just after and clears its address space pointer.
    In this case, the kernel creates a file successfully, but the file
    ends up with no address space associated, so a subsequent read/write
    returns 0/EOF too, just as if the whole process had execed or
    exited.  This is the case in question that GDB does not handle.

    Oleg Nesterov gave this suggestion as workaround for that race:

        gdb can open(/proc/pid/mem) and then read (say) /proc/pid/statm.
        If statm reports something non-zero, then open() was "successful".

    I think that might work.  However, I didn't try it, because I realized
    we have another nasty race that it wouldn't fix.
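
    For reference, the suggested check could look something like the
    sketch below.  It is untested against the race itself, and the
    helper names are hypothetical; it only encodes "open the mem file,
    then require a non-zero field in statm".

```python
import os
from typing import Optional


def statm_shows_address_space(statm_text: str) -> bool:
    """Return True if any field of a /proc/PID/statm line is non-zero."""
    return any(int(field) != 0 for field in statm_text.split())


def open_mem_checked(pid: int) -> Optional[int]:
    """Open /proc/PID/mem, then confirm via statm that the open "took".

    Returns the fd on success, or None if statm is unreadable or
    reports only zeros (in which case the fd is closed again).
    """
    fd = os.open(f"/proc/{pid}/mem", os.O_RDWR | os.O_CLOEXEC)
    try:
        with open(f"/proc/{pid}/statm") as statm:
            if statm_shows_address_space(statm.read()):
                return fd
    except (OSError, ValueError):
        pass
    os.close(fd)
    return None
```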

    The other race is this: because we close/reopen the
    /proc/PID/task/LWP/mem file when GDB switches to a different inferior,
    GDB can end up reopening /proc/PID/task/LWP/mem just after a thread
    execs, and before GDB has seen the corresponding exec event.  I.e.,
    we can open a /proc/PID/task/LWP/mem file accessing the post-exec
    address space thinking we're accessing the pre-exec address space.

    A few months back, Simon, Oleg and I discussed a similar race:

      [Bug gdb/26754] Race condition when resuming threads and one does an exec
      https://sourceware.org/bugzilla/show_bug.cgi?id=26754

    The solution back then was to make the kernel fail any ptrace
    operation until the exec event is consumed, with this kernel commit:

     commit dbb5afad100a828c97e012c6106566d99f041db6
     Author:     Oleg Nesterov <oleg@redhat.com>
     AuthorDate: Wed May 12 15:33:08 2021 +0200
     Commit:     Linus Torvalds <torvalds@linux-foundation.org>
     CommitDate: Wed May 12 10:45:22 2021 -0700

         ptrace: make ptrace() fail if the tracee changed its pid unexpectedly

    This, however, only applies to ptrace, not to the /proc/pid/mem file
    opening case.  Also, even if it did apply to the file open case, we
    would want to support current kernels until such a fix is more
    widespread anyhow.

    So all in all, this commit gives up on the idea of only ever keeping
    one /proc/pid/mem file descriptor open.  Instead, make GDB open a
    /proc/pid/mem per inferior, and keep it open until the inferior exits,
    is detached or execs.  Make GDB open the file right after the inferior
    is created or is attached to or forks, at which point we know the
    inferior is stable and stopped and thus isn't going to exec, or have a
    thread exit, and so the file open won't fail (unless the whole process
    is SIGKILLed from outside GDB, at which point it doesn't matter
    whether we open the file).

    This way, we avoid both races described above, at the expense of using
    more file descriptors (one per inferior).
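
    The bookkeeping this implies can be sketched as a small table keyed
    by inferior PID.  This is a minimal illustration under stated
    assumptions, not the code in this commit: the class and method
    names are hypothetical, and reopening the file once the exec event
    has been seen is an assumption about what follows the close.

```python
import os
from typing import Callable, Dict, Optional


class PerInferiorMemFiles:
    """Keep one /proc/PID/mem descriptor open per inferior."""

    def __init__(
        self,
        opener: Callable[[str, int], int] = os.open,
        closer: Callable[[int], None] = os.close,
    ) -> None:
        self._fds: Dict[int, int] = {}
        self._opener = opener
        self._closer = closer

    def open_for(self, pid: int) -> None:
        """Call while the inferior is stopped: after create, attach or fork."""
        self.close_for(pid)
        self._fds[pid] = self._opener(f"/proc/{pid}/mem", os.O_RDWR | os.O_CLOEXEC)

    def close_for(self, pid: int) -> None:
        """Call when the inferior exits or is detached."""
        fd = self._fds.pop(pid, None)
        if fd is not None:
            self._closer(fd)

    def reopen_after_exec(self, pid: int) -> None:
        """Call once the exec event has been consumed (assumed behaviour)."""
        self.open_for(pid)

    def fd_for(self, pid: int) -> Optional[int]:
        """Return the descriptor for `pid`, or None if none is open."""
        return self._fds.get(pid)
```

    Switching between inferiors then becomes a table lookup rather than
    a close/reopen, which is what removes the windows for both races.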

    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28065
    Change-Id: Iff943b95126d0f98a7973a07e989e4f020c29419
