public inbox for glibc-bugs@sourceware.org
* [Bug libc/23296] Data race in setting function descriptor during lazy binding
From: cvs-commit at gcc dot gnu.org @ 2020-03-30 20:38 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=23296

--- Comment #18 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by John David Anglin
<danglin@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=1a044511a3f9020c3f430164e0a6a77426fecd7e

commit 1a044511a3f9020c3f430164e0a6a77426fecd7e
Author: John David Anglin <danglin@gcc.gnu.org>
Date:   Mon Mar 30 20:36:49 2020 +0000

    Fix data race in setting function descriptors during lazy binding on hppa.

    This addresses an issue that is present mainly on SMP machines running
    threaded code.  In a typical indirect call or PLT import stub, the
    target address is loaded first.  Then the global pointer is loaded into
    the PIC register in the delay slot of a branch to the target address.
    During lazy binding, the target address is a trampoline which transfers
    to _dl_runtime_resolve().

    _dl_runtime_resolve() uses the relocation offset stored in the global
    pointer and the linkage map stored in the trampoline to find the
    relocation.  Then, the function descriptor is updated.

    In a multi-threaded application, it is possible for the global pointer
    in the descriptor to be updated between the load of the target address
    and the load of the global pointer.  When this happens, the relocation
    offset has been replaced by the new global pointer.  The function
    pointer has probably been updated as well, but there is no way to find
    the address of the function descriptor and to transfer to the target.
    So, _dl_runtime_resolve() typically crashes.
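
The interleaving described above can be replayed deterministically in a small
C model.  This is only a sketch: the struct layout, the function names and the
single-threaded "update between the loads" switch are assumptions made for
illustration, not code from glibc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit hppa function descriptor (PLT entry).  */
struct fdesc
{
  uint32_t ip;   /* target address, or the trampoline while unbound */
  uint32_t gp;   /* global pointer, or the relocation offset while unbound */
};

/* What one caller holds after its two separate loads.  */
struct call_view
{
  uint32_t ip;
  uint32_t gp;
};

/* Stand-in for another thread finishing lazy binding of the same entry.  */
static void
bind_descriptor (struct fdesc *d, uint32_t real_ip, uint32_t real_gp)
{
  d->gp = real_gp;
  d->ip = real_ip;
}

/* Replay the caller's two loads, optionally letting the other thread's
   update land between them.  */
static struct call_view
load_for_call (struct fdesc *d, bool bound_between_loads,
               uint32_t real_ip, uint32_t real_gp)
{
  struct call_view v;

  assert (d != NULL);
  v.ip = d->ip;                 /* first load: target address */
  if (bound_between_loads)
    bind_descriptor (d, real_ip, real_gp);
  v.gp = d->gp;                 /* second load: done in the delay slot */
  return v;
}
```

With bound_between_loads set, the caller ends up with the old trampoline
address paired with the new global pointer, so the resolver is entered
without the relocation offset it expects.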

    HP-UX addressed this problem by adding an extra pc-relative branch to
    the trampoline.  The descriptor is initially set up to point to the
    branch.  The branch then transfers to the trampoline.  This allowed
    the trampoline code to figure out which descriptor was being used
    without any modification to user code.  I didn't use this approach
    as it is more complex and changes function pointer canonicalization.

    The order of loading the target address and global pointer in
    indirect calls was not consistent with the order used in import stubs.
    In particular, $$dyncall and some inline versions of it loaded the
    global pointer first.  This was inconsistent with the global pointer
    being updated first in dl-machine.h.  Assuming the accesses are
    ordered, we want elf_machine_fixup_plt() to store the global pointer
    first and calls to load it last.  Then, the global pointer will be
    correct when the target function is entered.
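
The ordering rule in this paragraph (store the global pointer first, load it
last) can be sketched with C11 atomics.  The release/acquire pair below is a
portable stand-in for "assuming the accesses are ordered"; the type and
function names are hypothetical, and this is not how dl-machine.h is written.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor model with separately accessed words.  */
struct fdesc_model
{
  _Atomic uint32_t ip;   /* target address */
  _Atomic uint32_t gp;   /* global pointer */
};

/* Writer side: publish the global pointer before the target address.  */
static void
fixup_plt_model (struct fdesc_model *d, uint32_t ip, uint32_t gp)
{
  assert (d != NULL);
  atomic_store_explicit (&d->gp, gp, memory_order_relaxed);
  atomic_store_explicit (&d->ip, ip, memory_order_release);
}

/* Caller side: load the target address first and the global pointer last.
   A caller that observes the new target address through the acquire load
   is then guaranteed to observe the new global pointer as well.  */
static void
load_for_call_model (struct fdesc_model *d, uint32_t *ip, uint32_t *gp)
{
  assert (d != NULL && ip != NULL && gp != NULL);
  *ip = atomic_load_explicit (&d->ip, memory_order_acquire);
  *gp = atomic_load_explicit (&d->gp, memory_order_relaxed);
}
```

The reverse combination is the one that still needs help: a caller can see
the old target address (the trampoline) together with the new global pointer.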

    However, just to make things more fun, HP added support for
    out-of-order execution of accesses in PA 2.0.  The accesses used by
    calls are weakly ordered.  So, it's possible under some circumstances
    that a function might be entered with the wrong global pointer.
    However, HP uses weakly ordered accesses in 64-bit HP-UX, so I assume
    that loading the global pointer in the delay slot of the branch must
    work consistently.

    The basic fix for the race is a combination of modifying user code to
    preserve the address of the function descriptor in register %r22 and
    setting the least-significant bit in the relocation offset.  The
    latter was suggested by Carlos as a way to distinguish relocation
    offsets from global pointer values.  Conventionally, %r22 is used
    as the address of the function descriptor in calls to $$dyncall.
    So, it wasn't hard to preserve the address in %r22.
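
A minimal sketch of the least-significant-bit marker follows.  It assumes that
both relocation offsets and real global pointer values are even, so bit 0 is
free to act as a tag; the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 0 is never set in a real global pointer value.  */
#define RELOC_OFFSET_TAG UINT32_C (1)

/* Mark a relocation offset before storing it in the descriptor's
   global pointer word.  */
static uint32_t
tag_reloc_offset (uint32_t reloc_offset)
{
  assert ((reloc_offset & RELOC_OFFSET_TAG) == 0);
  return reloc_offset | RELOC_OFFSET_TAG;
}

/* True if the word still holds a tagged relocation offset rather than
   a global pointer written by another thread.  */
static bool
is_reloc_offset (uint32_t gp_word)
{
  return (gp_word & RELOC_OFFSET_TAG) != 0;
}

/* Recover the original relocation offset from a tagged word.  */
static uint32_t
untag_reloc_offset (uint32_t gp_word)
{
  return gp_word & ~RELOC_OFFSET_TAG;
}
```

The check costs a single bit test, which fits the remark later in the message
that the fast path gains only one instruction.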

    I have updated gcc trunk and gcc-9 branch to not clobber %r22 in
    $$dyncall and inline indirect calls.  I have also modified the import
    stubs in binutils trunk and the 2.33 branch to preserve %r22.  This
    required making the stubs one instruction longer but we save one
    relocation.  I also modified binutils to align the .plt section on
    an 8-byte boundary.  This allows descriptors to be updated atomically
    with a floating-point store.
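
Why the 8-byte alignment matters can be shown with a portable stand-in: when
both descriptor words live in one aligned 64-bit cell, a single store
publishes them together.  The sketch below uses a C11 atomic 64-bit store in
place of the hppa floating-point doubleword store; the names and the packing
order are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: both descriptor words in one aligned cell.  */
struct fdesc_cell
{
  _Alignas (8) _Atomic uint64_t words;
};

/* Pack the target address into the high half and the global pointer
   into the low half (an arbitrary choice for this model).  */
static uint64_t
pack_fdesc (uint32_t ip, uint32_t gp)
{
  return ((uint64_t) ip << 32) | gp;
}

/* Publish both words with one 64-bit store.  */
static void
publish_fdesc (struct fdesc_cell *cell, uint32_t ip, uint32_t gp)
{
  assert (((uintptr_t) cell & 7) == 0);   /* requires 8-byte alignment */
  atomic_store_explicit (&cell->words, pack_fdesc (ip, gp),
                         memory_order_release);
}

/* Read both words back with one 64-bit load.  */
static void
read_fdesc (struct fdesc_cell *cell, uint32_t *ip, uint32_t *gp)
{
  uint64_t w = atomic_load_explicit (&cell->words, memory_order_acquire);

  *ip = (uint32_t) (w >> 32);
  *gp = (uint32_t) w;
}
```

Existing callers still read the two words with separate loads, so an atomic
store on the writer side does not remove the need for the fallback path.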

    With these changes, _dl_runtime_resolve() can fall back to an alternate
    mechanism to find the relocation offset when it has been clobbered.
    There's just one additional instruction in the fast path.  I tested
    the fallback function, _dl_fix_reloc_arg(), by changing the branch to
    always use the fallback.  Old code still runs as it did before.
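
One plausible shape for such a fallback is a search of the PLT relocations
for the entry that patches the descriptor whose address was preserved in
%r22.  The sketch below is a simplified model under that assumption; the
record layout and function names are hypothetical and do not reproduce
_dl_fix_reloc_arg().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified PLT relocation record: only the field the
   search needs.  */
struct plt_reloc
{
  uintptr_t target;   /* address of the function descriptor it patches */
};

#define RELOC_NOT_FOUND ((size_t) -1)

/* Fallback: recover the relocation's byte offset from the descriptor
   address that the caller preserved.  */
static size_t
find_reloc_offset (const struct plt_reloc *relocs, size_t count,
                   uintptr_t fdesc_addr)
{
  assert (relocs != NULL || count == 0);
  for (size_t i = 0; i < count; i++)
    if (relocs[i].target == fdesc_addr)
      return i * sizeof (struct plt_reloc);
  return RELOC_NOT_FOUND;
}

/* Resolver entry: use the tagged word when it is intact, otherwise
   fall back to the search.  */
static size_t
reloc_offset_for_call (uint32_t gp_word, uintptr_t fdesc_addr,
                       const struct plt_reloc *relocs, size_t count)
{
  if (gp_word & 1)                      /* tag still present: fast path */
    return gp_word & ~UINT32_C (1);
  return find_reloc_offset (relocs, count, fdesc_addr);
}
```

The fast path stays a bit test and a mask; the linear search only runs when
another thread has already overwritten the relocation offset.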

    Fixes bug 23296.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

-- 
You are receiving this mail because:
You are on the CC list for the bug.

* [Bug libc/23296] Data race in setting function descriptor during lazy binding
From: danglin at gcc dot gnu.org @ 2020-03-30 20:45 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=23296

John David Anglin <danglin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #19 from John David Anglin <danglin at gcc dot gnu.org> ---
Fixed on trunk by commit 1a044511a3f9020c3f430164e0a6a77426fecd7e.

* [Bug libc/23296] Data race in setting function descriptor during lazy binding
From: cvs-commit at gcc dot gnu.org @ 2020-05-04 19:59 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=23296

--- Comment #20 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The release/2.31/master branch has been updated by Aurelien Jarno
<aurel32@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=91b909315c4e33e4569529886d8b7dbbf97b244c

commit 91b909315c4e33e4569529886d8b7dbbf97b244c
Author: John David Anglin <danglin@gcc.gnu.org>
Date:   Mon Mar 30 20:36:49 2020 +0000

    Fix data race in setting function descriptors during lazy binding on hppa.

    Fixes bug 23296.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
    (cherry picked from commit 1a044511a3f9020c3f430164e0a6a77426fecd7e)

* [Bug libc/23296] Data race in setting function descriptor during lazy binding
From: cvs-commit at gcc dot gnu.org @ 2020-05-04 20:01 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=23296

--- Comment #21 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The release/2.30/master branch has been updated by Aurelien Jarno
<aurel32@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6f4527a7dddad303ef2d4ad99a970fac2fab6629

commit 6f4527a7dddad303ef2d4ad99a970fac2fab6629
Author: John David Anglin <danglin@gcc.gnu.org>
Date:   Mon Mar 30 20:36:49 2020 +0000

    Fix data race in setting function descriptors during lazy binding on hppa.

    Fixes bug 23296.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
    (cherry picked from commit 1a044511a3f9020c3f430164e0a6a77426fecd7e)

* [Bug libc/23296] Data race in setting function descriptor during lazy binding
From: jsm28 at gcc dot gnu.org @ 2020-07-08 16:44 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=23296

Joseph Myers <jsm28 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |2.32
