From: Alan Modra <amodra@gmail.com>
To: libc-alpha@sourceware.org
Subject: [PATCH 0/6] PowerPC64 ELFv2 PPC64_OPT_LOCALENTRY
Date: Thu, 01 Jun 2017 13:04:00 -0000
Message-ID: <20170601130442.GF8842@bubble.grove.modra.org>

ELFv2 functions with localentry:0 are those with a single entry point,
i.e. global entry == local entry, that have no requirement on r2 or
r12, and guarantee r2 is unchanged on return.  Such an external
function can be called via the PLT without saving r2 or restoring it
on return, avoiding a common load-hit-store for small functions.  The
optimization is attractive: the TOC pointer load-hit-store is a major
reason why calls to small functions that need no register saves (or,
with shrink-wrapping, none on a fast path) are slow on powerpc64le.
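
For readers unfamiliar with how localentry is recorded, the following is
a minimal sketch of decoding the local entry offset from a symbol's
st_other byte.  The two STO_PPC64_* constants are meant to mirror the
PowerPC64 definitions in elf.h; the function names local_entry_offset
and is_localentry_zero are hypothetical, chosen only for this
illustration, and are not part of the series.

```c
#include <assert.h>

/* Intended to mirror the PowerPC64 st_other definitions in elf.h:
   the top three bits of st_other encode the local entry offset.  */
#define STO_PPC64_LOCAL_BIT   5
#define STO_PPC64_LOCAL_MASK  (7 << STO_PPC64_LOCAL_BIT)

/* Hypothetical helper: byte offset from the global entry point to the
   local entry point, as encoded in ST_OTHER.  Field values 0 and 1
   both decode to offset 0; 2 gives 4, 3 gives 8, and so on.  */
static unsigned int
local_entry_offset (unsigned char st_other)
{
  unsigned int field
    = (st_other & STO_PPC64_LOCAL_MASK) >> STO_PPC64_LOCAL_BIT;
  unsigned int off = ((1u << field) >> 2) << 2;

  /* Entry points are instruction aligned.  */
  assert ((off & 3) == 0);
  return off;
}

/* Hypothetical helper: nonzero when the localentry field is zero,
   i.e. the symbol is marked localentry:0.  The low visibility bits
   of ST_OTHER are ignored.  */
static int
is_localentry_zero (unsigned char st_other)
{
  return (st_other & STO_PPC64_LOCAL_MASK) == 0;
}
```

A function with the usual two-instruction TOC setup at its global entry
would typically carry st_other 0x60, which this sketch decodes to an
offset of 8 bytes.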

This patch series is the glibc part of this optimization: the checks
in ld.so necessary to ensure that functions with non-zero st_other
localentry are not called by code expecting localentry:0.  Also, lots
of powerpc64 glibc assembly didn't use the proper localentry:0
designation for functions that don't use r2, so the series fixes that
too, along with some other assorted problems I noticed along the way.
Many of the mem and str functions benefit.

Note that building a multiarch glibc kills this optimization for most
functions.  IFUNCs can't be called with the optimized stub without
changing the ABI.

Regression tested on powerpc64le using --with-cpu=power8,
--with-cpu=power7, and --with-cpu=power8 --enable-multiarch.

Alan Modra (6):
  PowerPC64, fix calls to _mcount
  PowerPC64 FRAME_PARM_SAVE
  PowerPC64 sysdep.h tidy
  PowerPC64 strncpy, stpncpy and strstr fixes
  PowerPC64 ENTRY_TOCLESS
  PowerPC64 ELFv2 PPC64_OPT_LOCALENTRY

 ChangeLog                                          | 195 +++++++++++++++++++
 elf/dl-runtime.c                                   |   3 +-
 elf/elf.h                                          |   3 +-
 elf/testobj6.c                                     |   3 +
 sysdeps/aarch64/dl-machine.h                       |   1 +
 sysdeps/alpha/dl-machine.h                         |   5 +-
 sysdeps/arm/dl-machine.h                           |   1 +
 sysdeps/generic/dl-machine.h                       |   7 +-
 sysdeps/hppa/dl-machine.h                          |   5 +-
 sysdeps/i386/dl-machine.h                          |   1 +
 sysdeps/ia64/dl-machine.h                          |   3 +-
 sysdeps/m68k/dl-machine.h                          |   1 +
 sysdeps/microblaze/dl-machine.h                    |   1 +
 sysdeps/mips/dl-machine.h                          |   1 +
 sysdeps/nios2/dl-machine.h                         |   1 +
 sysdeps/powerpc/powerpc32/dl-machine.h             |   1 +
 sysdeps/powerpc/powerpc64/a2/memcpy.S              |   2 +-
 sysdeps/powerpc/powerpc64/addmul_1.S               |   2 +-
 sysdeps/powerpc/powerpc64/cell/memcpy.S            |   2 +-
 sysdeps/powerpc/powerpc64/dl-machine.c             |  22 ++-
 sysdeps/powerpc/powerpc64/dl-machine.h             |  54 +++---
 sysdeps/powerpc/powerpc64/dl-trampoline.S          |   4 +-
 sysdeps/powerpc/powerpc64/fpu/s_ceil.S             |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_ceilf.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_copysign.S         |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_copysignl.S        |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_fabsl.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_floor.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_floorf.S           |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_isnan.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_llrint.S           |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_llrintf.S          |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S        |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S       |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_rint.S             |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_rintf.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_round.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_roundf.S           |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_trunc.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_truncf.S           |   2 +-
 sysdeps/powerpc/powerpc64/lshift.S                 |   2 +-
 sysdeps/powerpc/powerpc64/memcpy.S                 |   2 +-
 sysdeps/powerpc/powerpc64/memset.S                 |   2 +-
 sysdeps/powerpc/powerpc64/mul_1.S                  |   2 +-
 .../powerpc/powerpc64/multiarch/stpncpy-power7.S   |   3 +
 .../powerpc/powerpc64/multiarch/stpncpy-power8.S   |   5 +
 .../powerpc/powerpc64/multiarch/strncpy-power7.S   |   3 +
 .../powerpc/powerpc64/multiarch/strncpy-power8.S   |   3 +
 .../powerpc/powerpc64/multiarch/strrchr-power8.S   |  21 +-
 .../powerpc/powerpc64/multiarch/strstr-power7.S    |   5 +
 sysdeps/powerpc/powerpc64/power4/memcmp.S          |   2 +-
 sysdeps/powerpc/powerpc64/power4/memcpy.S          |   2 +-
 sysdeps/powerpc/powerpc64/power4/memset.S          |   4 +-
 sysdeps/powerpc/powerpc64/power4/strncmp.S         |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_ceil.S     |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_ceilf.S    |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_floor.S    |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S   |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S  |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_round.S    |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_roundf.S   |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_trunc.S    |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_truncf.S   |   2 +-
 sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S     |   2 +-
 sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S  |   2 +-
 sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S     |   2 +-
 sysdeps/powerpc/powerpc64/power6/memcpy.S          |   2 +-
 sysdeps/powerpc/powerpc64/power6/memset.S          |   4 +-
 sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S    |   2 +-
 sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S   |   2 +-
 sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S  |   2 +-
 sysdeps/powerpc/powerpc64/power7/add_n.S           |   2 +-
 sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S    |   2 +-
 sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S     |   2 +-
 sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S     |   2 +-
 sysdeps/powerpc/powerpc64/power7/memchr.S          |   2 +-
 sysdeps/powerpc/powerpc64/power7/memcmp.S          |   2 +-
 sysdeps/powerpc/powerpc64/power7/memcpy.S          |   2 +-
 sysdeps/powerpc/powerpc64/power7/memmove.S         |   4 +-
 sysdeps/powerpc/powerpc64/power7/mempcpy.S         |   2 +-
 sysdeps/powerpc/powerpc64/power7/memrchr.S         |   2 +-
 sysdeps/powerpc/powerpc64/power7/memset.S          |   4 +-
 sysdeps/powerpc/powerpc64/power7/rawmemchr.S       |   2 +-
 sysdeps/powerpc/powerpc64/power7/strcasecmp.S      |   3 +-
 sysdeps/powerpc/powerpc64/power7/strchr.S          |   2 +-
 sysdeps/powerpc/powerpc64/power7/strchrnul.S       |   2 +-
 sysdeps/powerpc/powerpc64/power7/strcmp.S          |   2 +-
 sysdeps/powerpc/powerpc64/power7/strlen.S          |   2 +-
 sysdeps/powerpc/powerpc64/power7/strncmp.S         |   2 +-
 sysdeps/powerpc/powerpc64/power7/strncpy.S         |   9 +-
 sysdeps/powerpc/powerpc64/power7/strnlen.S         |   2 +-
 sysdeps/powerpc/powerpc64/power7/strrchr.S         |   2 +-
 sysdeps/powerpc/powerpc64/power7/strstr.S          |  16 +-
 sysdeps/powerpc/powerpc64/power8/fpu/e_expf.S      |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_cosf.S      |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S    |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S     |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S     |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S    |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S   |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_sinf.S      |   2 +-
 sysdeps/powerpc/powerpc64/power8/memcmp.S          |   2 +-
 sysdeps/powerpc/powerpc64/power8/memset.S          |   4 +-
 sysdeps/powerpc/powerpc64/power8/strcasestr.S      |   2 +-
 sysdeps/powerpc/powerpc64/power8/strchr.S          |   2 +-
 sysdeps/powerpc/powerpc64/power8/strcmp.S          |   2 +-
 sysdeps/powerpc/powerpc64/power8/strcpy.S          |   2 +-
 sysdeps/powerpc/powerpc64/power8/strlen.S          |   2 +-
 sysdeps/powerpc/powerpc64/power8/strncmp.S         |   2 +-
 sysdeps/powerpc/powerpc64/power8/strncpy.S         |  10 +-
 sysdeps/powerpc/powerpc64/power8/strnlen.S         |   2 +-
 sysdeps/powerpc/powerpc64/power8/strrchr.S         |   2 +-
 sysdeps/powerpc/powerpc64/power8/strspn.S          |   2 +-
 sysdeps/powerpc/powerpc64/power9/strcmp.S          |   2 +-
 sysdeps/powerpc/powerpc64/power9/strncmp.S         |   2 +-
 sysdeps/powerpc/powerpc64/ppc-mcount.S             |   4 +-
 sysdeps/powerpc/powerpc64/start.S                  |   4 +-
 sysdeps/powerpc/powerpc64/strchr.S                 |   2 +-
 sysdeps/powerpc/powerpc64/strcmp.S                 |   2 +-
 sysdeps/powerpc/powerpc64/strlen.S                 |   2 +-
 sysdeps/powerpc/powerpc64/strncmp.S                |   2 +-
 sysdeps/powerpc/powerpc64/sysdep.h                 | 213 ++++++++++-----------
 sysdeps/s390/s390-32/dl-machine.h                  |   1 +
 sysdeps/s390/s390-64/dl-machine.h                  |   1 +
 sysdeps/sh/dl-machine.h                            |   1 +
 sysdeps/sparc/sparc32/dl-machine.h                 |   1 +
 sysdeps/sparc/sparc64/dl-machine.h                 |   1 +
 sysdeps/tile/dl-machine.h                          |   3 +-
 .../sysv/linux/powerpc/powerpc64/makecontext.S     |  26 +--
 sysdeps/x86_64/dl-machine.h                        |   1 +
 130 files changed, 562 insertions(+), 274 deletions(-)


-- 
Alan Modra
Australia Development Lab, IBM

Thread overview: 19+ messages
2017-06-01 13:04 Alan Modra [this message]
2017-06-01 13:08 ` [PATCH 1/6] PowerPC64, fix calls to _mcount Alan Modra
2017-06-12 18:07   ` Tulio Magno Quites Machado Filho
2017-06-01 13:09 ` [PATCH 3/6] PowerPC64 sysdep.h tidy Alan Modra
2017-06-12 18:12   ` Tulio Magno Quites Machado Filho
2017-06-13  1:27     ` Alan Modra
2017-06-13 17:05       ` Tulio Magno Quites Machado Filho
2017-06-01 13:09 ` [PATCH 2/6] PowerPC64 FRAME_PARM_SAVE Alan Modra
2017-06-12 18:07   ` Tulio Magno Quites Machado Filho
2017-06-01 13:10 ` [PATCH 4/6] strncpy, stpncpy and strstr fixes Alan Modra
2017-06-12 19:55   ` Tulio Magno Quites Machado Filho
2017-06-01 13:11 ` [PATCH 5/6] PowerPC64 ENTRY_TOCLESS Alan Modra
2017-06-13 12:23   ` Tulio Magno Quites Machado Filho
2017-06-13 14:46     ` Alan Modra
2017-06-13 17:07       ` Tulio Magno Quites Machado Filho
2017-06-01 13:11 ` [PATCH 6/6] PowerPC64 ELFv2 PPC64_OPT_LOCALENTRY Alan Modra
2017-06-03  4:46   ` Alan Modra
2017-06-13 18:21     ` Tulio Magno Quites Machado Filho
2017-06-14  8:53       ` Alan Modra
