public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
To: bmahi496@linux.ibm.com, libc-alpha@sourceware.org
Cc: rajis@linux.ibm.com, bergner@linux.ibm.com,
	Mahesh Bodapati <mahesh.bodapati@ibm.com>
Subject: Re: [PATCH v5] PowerPC: Influence cpu/arch hwcap features via GLIBC_TUNABLES.
Date: Tue, 11 Jul 2023 10:45:39 -0300	[thread overview]
Message-ID: <2b71d409-0649-f47c-0626-f4726456256d@linaro.org> (raw)
In-Reply-To: <20230710182150.2376678-1-bmahi496@linux.ibm.com>



On 10/07/23 15:21, bmahi496@linux.ibm.com wrote:
> From: Mahesh Bodapati <mahesh.bodapati@ibm.com>
> 
> This patch enables the option to influence hwcaps used by PowerPC.
> The environment variable, GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,-zzz....,
> can be used to enable CPU/ARCH feature yyy, disable CPU/ARCH feature xxx
> and zzz, where the feature name is case-sensitive and has to match the ones
> mentioned in the file{sysdeps/powerpc/dl-procinfo.c}.
> 
> Note that the tunable only handles the features which are really used
> in the IFUNC selection.  All others are ignored as the values are only
> used inside glibc.

It is still missing a regression test to check if tunable work as intended.

> ---
>  manual/tunables.texi                          |   5 +-
>  sysdeps/powerpc/cpu-features.c                |  97 ++++++++++++++++-
>  sysdeps/powerpc/cpu-features.h                | 102 ++++++++++++++++++
>  sysdeps/powerpc/dl-tunables.list              |   3 +
>  sysdeps/powerpc/hwcapinfo.c                   |   4 +
>  .../power4/multiarch/ifunc-impl-list.c        |   4 +-
>  .../powerpc32/power4/multiarch/init-arch.h    |  10 +-
>  sysdeps/powerpc/powerpc64/dl-machine.h        |   2 -
>  .../powerpc64/multiarch/ifunc-impl-list.c     |   7 +-
>  9 files changed, 222 insertions(+), 12 deletions(-)
> 
> diff --git a/manual/tunables.texi b/manual/tunables.texi
> index 4ca0e42a11..776fd93fd9 100644
> --- a/manual/tunables.texi
> +++ b/manual/tunables.texi
> @@ -513,7 +513,10 @@ On s390x, the supported HWCAP and STFLE features can be found in
>  @code{sysdeps/s390/cpu-features.c}.  In addition the user can also set
>  a CPU arch-level like @code{z13} instead of single HWCAP and STFLE features.
>  
> -This tunable is specific to i386, x86-64 and s390x.
> +On powerpc, the supported HWCAP and HWCAP2 features can be found in
> +@code{sysdeps/powerpc/dl-procinfo.c}.
> +
> +This tunable is specific to i386, x86-64, s390x and powerpc.
>  @end deftp
>  
>  @deftp Tunable glibc.cpu.cached_memopt
> diff --git a/sysdeps/powerpc/cpu-features.c b/sysdeps/powerpc/cpu-features.c
> index 0ef3cf89d2..bf1c5353da 100644
> --- a/sysdeps/powerpc/cpu-features.c
> +++ b/sysdeps/powerpc/cpu-features.c
> @@ -19,14 +19,109 @@
>  #include <stdint.h>
>  #include <cpu-features.h>
>  #include <elf/dl-tunables.h>
> +#include <unistd.h>
> +#include <string.h>
> +#define MEMCMP_DEFAULT memcmp
> +#define STRLEN_DEFAULT strlen
> +
> +static void
> +TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp)
> +{
> +  /* The current IFUNC selection is always using the most recent
> +     features which are available via AT_HWCAP or AT_HWCAP2.  But in
> +     some scenarios it is useful to adjust this selection.
> +
> +     The environment variable:
> +
> +     GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,....
> +
> +     Can be used to enable HWCAP/HWCAP2 feature yyy, disable HWCAP/HWCAP2
> +     feature xxx, where the feature name is case-sensitive and has to match
> +     the ones mentioned in the file{sysdeps/powerpc/dl-procinfo.c}. */
> +
> +  /* Copy the features from dl_powerpc_cpu_features, which contains the
> +     features provided by AT_HWCAP and AT_HWCAP2.  */
> +  struct cpu_features *cpu_features = &GLRO(dl_powerpc_cpu_features);
> +  unsigned long int tcbv_hwcap = cpu_features->hwcap;
> +  unsigned long int tcbv_hwcap2 = cpu_features->hwcap2;
> +  unsigned int tun_count;
> +  const char *token = valp->strval;
> +  tun_count = sizeof(hwcap_tunables)/sizeof(hwcap_tunables[0]);
> +  do
> +    {
> +      const char *token_end, *feature;
> +      bool disable;
> +      unsigned short offset=0;
> +      size_t token_len, i, feature_len;
> +      /* Find token separator or end of string.  */
> +      for (token_end = token; *token_end != ','; token_end++)
> +	if (*token_end == '\0')
> +	  break;
> +
> +      /* Determine feature.  */
> +      token_len = token_end - token;
> +      if (*token == '-')
> +	{
> +	  disable = true;
> +	  feature = token + 1;
> +	  feature_len = token_len - 1;
> +	}
> +      else
> +	{
> +	  disable = false;
> +	  feature = token;
> +	  feature_len = token_len;
> +	}
> +      for (i = 0; i < tun_count; ++i)
> +	{
> +	  const char *hwcap_name = hwcap_names + offset;
> +	  /* Check the tunable name on the supported list.  */
> +	  if (STRLEN_DEFAULT (hwcap_name) == feature_len
> +	      && MEMCMP_DEFAULT (feature, hwcap_name, feature_len)
> +	      == 0)
> +	    {
> +	      /* Update the hwcap and hwcap2 bits.  */
> +	      if (disable)
> +		{
> +		  /* Id is 1 for hwcap2 tunable.  */
> +		  if (hwcap_tunables[i].id)
> +		    cpu_features->hwcap2 &= ~(hwcap_tunables[i].mask);
> +		  else
> +		    cpu_features->hwcap &= ~(hwcap_tunables[i].mask);
> +		}
> +	      else
> +		{
> +		  /* Enable the features and also checking that no unsupported
> +		     features were enabled by user.  */
> +		  if (hwcap_tunables[i].id)
> +		    cpu_features->hwcap2 |= (tcbv_hwcap2 & hwcap_tunables[i].mask);
> +		  else
> +		    cpu_features->hwcap |= (tcbv_hwcap & hwcap_tunables[i].mask);
> +		}
> +	      break;
> +	    }
> +	  offset = offset + STRLEN_DEFAULT (hwcap_name) + 1;
> +	}
> +	token += token_len;
> +	/* ... and skip token separator for next round.  */
> +	if (*token == ',') token++;
> +    }
> +  while (*token != '\0');
> +}
>  
>  static inline void
> -init_cpu_features (struct cpu_features *cpu_features)
> +init_cpu_features (struct cpu_features *cpu_features, uint64_t hwcaps[])
>  {
> +  /* Fill the cpu_features with the supported hwcaps
> +     which are set by __tcb_parse_hwcap_and_convert_at_platform.  */
> +  cpu_features->hwcap = hwcaps[0];
> +  cpu_features->hwcap2 = hwcaps[1];
>    /* Default is to use aligned memory access on optimized function unless
>       tunables is enable, since for this case user can explicit disable
>       unaligned optimizations.  */
>    int32_t cached_memfunc = TUNABLE_GET (glibc, cpu, cached_memopt, int32_t,
>  					NULL);
>    cpu_features->use_cached_memopt = (cached_memfunc > 0);
> +  TUNABLE_GET (glibc, cpu, hwcaps, tunable_val_t *,
> +	       TUNABLE_CALLBACK (set_hwcaps));
>  }
> diff --git a/sysdeps/powerpc/cpu-features.h b/sysdeps/powerpc/cpu-features.h
> index d316dc3d64..e5fce88e5e 100644
> --- a/sysdeps/powerpc/cpu-features.h
> +++ b/sysdeps/powerpc/cpu-features.h
> @@ -19,10 +19,112 @@
>  # define __CPU_FEATURES_POWERPC_H
>  
>  #include <stdbool.h>
> +#include <sys/auxv.h>
>  
>  struct cpu_features
>  {
>    bool use_cached_memopt;
> +  unsigned long int hwcap;
> +  unsigned long int hwcap2;
> +};
> +
> +static const char hwcap_names[] = {
> +  "4xxmac\0"
> +  "altivec\0"
> +  "arch_2_05\0"
> +  "arch_2_06\0"
> +  "archpmu\0"
> +  "booke\0"
> +  "cellbe\0"
> +  "dfp\0"
> +  "efpdouble\0"
> +  "efpsingle\0"
> +  "fpu\0"
> +  "ic_snoop\0"
> +  "mmu\0"
> +  "notb\0"
> +  "pa6t\0"
> +  "power4\0"
> +  "power5\0"
> +  "power5+\0"
> +  "power6x\0"
> +  "ppc32\0"
> +  "ppc601\0"
> +  "ppc64\0"
> +  "ppcle\0"
> +  "smt\0"
> +  "spe\0"
> +  "true_le\0"
> +  "ucache\0"
> +  "vsx\0"
> +  "arch_2_07\0"
> +  "dscr\0"
> +  "ebb\0"
> +  "htm\0"
> +  "htm-nosc\0"
> +  "htm-no-suspend\0"
> +  "isel\0"
> +  "tar\0"
> +  "vcrypto\0"
> +  "arch_3_00\0"
> +  "ieee128\0"
> +  "darn\0"
> +  "scv\0"
> +  "arch_3_1\0"
> +  "mma\0"
> +};
> +
> +static const struct
> +{
> +  unsigned int mask;
> +  bool id;
> +} hwcap_tunables[] = {
> +   /* AT_HWCAP tunable masks.  */
> +   { PPC_FEATURE_HAS_4xxMAC,                 0 },
> +   { PPC_FEATURE_HAS_ALTIVEC,                0 },
> +   { PPC_FEATURE_ARCH_2_05,                  0 },
> +   { PPC_FEATURE_ARCH_2_06,                  0 },
> +   { PPC_FEATURE_PSERIES_PERFMON_COMPAT,     0 },
> +   { PPC_FEATURE_BOOKE,                      0 },
> +   { PPC_FEATURE_CELL_BE,                    0 },
> +   { PPC_FEATURE_HAS_DFP,                    0 },
> +   { PPC_FEATURE_HAS_EFP_DOUBLE,             0 },
> +   { PPC_FEATURE_HAS_EFP_SINGLE,             0 },
> +   { PPC_FEATURE_HAS_FPU,                    0 },
> +   { PPC_FEATURE_ICACHE_SNOOP,               0 },
> +   { PPC_FEATURE_HAS_MMU,                    0 },
> +   { PPC_FEATURE_NO_TB,                      0 },
> +   { PPC_FEATURE_PA6T,                       0 },
> +   { PPC_FEATURE_POWER4,                     0 },
> +   { PPC_FEATURE_POWER5,                     0 },
> +   { PPC_FEATURE_POWER5_PLUS,                0 },
> +   { PPC_FEATURE_POWER6_EXT,                 0 },
> +   { PPC_FEATURE_32,                         0 },
> +   { PPC_FEATURE_601_INSTR,                  0 },
> +   { PPC_FEATURE_64,                         0 },
> +   { PPC_FEATURE_PPC_LE,                     0 },
> +   { PPC_FEATURE_SMT,                        0 },
> +   { PPC_FEATURE_HAS_SPE,                    0 },
> +   { PPC_FEATURE_TRUE_LE,                    0 },
> +   { PPC_FEATURE_UNIFIED_CACHE,              0 },
> +   { PPC_FEATURE_HAS_VSX,                    0 },
> +
> +   /* AT_HWCAP2 tunable masks.  */
> +   { PPC_FEATURE2_ARCH_2_07,                 1 },
> +   { PPC_FEATURE2_HAS_DSCR,                  1 },
> +   { PPC_FEATURE2_HAS_EBB,                   1 },
> +   { PPC_FEATURE2_HAS_HTM,                   1 },
> +   { PPC_FEATURE2_HTM_NOSC,                  1 },
> +   { PPC_FEATURE2_HTM_NO_SUSPEND,            1 },
> +   { PPC_FEATURE2_HAS_ISEL,                  1 },
> +   { PPC_FEATURE2_HAS_TAR,                   1 },
> +   { PPC_FEATURE2_HAS_VEC_CRYPTO,            1 },
> +   { PPC_FEATURE2_ARCH_3_00,                 1 },
> +   { PPC_FEATURE2_HAS_IEEE128,               1 },
> +   { PPC_FEATURE2_DARN,                      1 },
> +   { PPC_FEATURE2_SCV,                       1 },
> +   { PPC_FEATURE2_ARCH_3_1,                  1 },
> +   { PPC_FEATURE2_MMA,                       1 },
>  };
>  
>  #endif /* __CPU_FEATURES_H  */
> diff --git a/sysdeps/powerpc/dl-tunables.list b/sysdeps/powerpc/dl-tunables.list
> index 87d6235c75..807b7f8013 100644
> --- a/sysdeps/powerpc/dl-tunables.list
> +++ b/sysdeps/powerpc/dl-tunables.list
> @@ -24,5 +24,8 @@ glibc {
>        maxval: 1
>        default: 0
>      }
> +    hwcaps {
> +      type: STRING
> +    }
>    }
>  }
> diff --git a/sysdeps/powerpc/hwcapinfo.c b/sysdeps/powerpc/hwcapinfo.c
> index e26e64d99e..f2c473c556 100644
> --- a/sysdeps/powerpc/hwcapinfo.c
> +++ b/sysdeps/powerpc/hwcapinfo.c
> @@ -19,6 +19,7 @@
>  #include <unistd.h>
>  #include <shlib-compat.h>
>  #include <dl-procinfo.h>
> +#include <cpu-features.c>
>  
>  tcbhead_t __tcb __attribute__ ((visibility ("hidden")));
>  
> @@ -63,6 +64,9 @@ __tcb_parse_hwcap_and_convert_at_platform (void)
>    else if (h1 & PPC_FEATURE_POWER5)
>      h1 |= PPC_FEATURE_POWER4;
>  
> +  uint64_t array_hwcaps[] = { h1, h2 };
> +  init_cpu_features (&GLRO(dl_powerpc_cpu_features), array_hwcaps);
> +
>    /* Consolidate both HWCAP and HWCAP2 into a single doubleword so that
>       we can read both in a single load later.  */
>    __tcb.hwcap = (h1 << 32) | (h2 & 0xffffffff);
> diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
> index b4f80539e7..986c37d71e 100644
> --- a/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
> @@ -21,6 +21,7 @@
>  #include <wchar.h>
>  #include <ldsodefs.h>
>  #include <ifunc-impl-list.h>
> +#include <cpu-features.h>
>  
>  size_t
>  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
> @@ -28,7 +29,8 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>  {
>    size_t i = max;
>  
> -  unsigned long int hwcap = GLRO(dl_hwcap);
> +  const struct cpu_features *features = &GLRO(dl_powerpc_cpu_features);
> +  unsigned long int hwcap = features->hwcap;
>    /* hwcap contains only the latest supported ISA, the code checks which is
>       and fills the previous supported ones.  */
>    if (hwcap & PPC_FEATURE_ARCH_2_06)
> diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h b/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
> index 3dd00e02ee..a0bbd12012 100644
> --- a/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
> +++ b/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
> @@ -16,6 +16,7 @@
>     <https://www.gnu.org/licenses/>.  */
>  
>  #include <ldsodefs.h>
> +#include <cpu-features.h>
>  
>  /* The code checks if _rtld_global_ro was realocated before trying to access
>     the dl_hwcap field. The assembly is to make the compiler not optimize the
> @@ -32,11 +33,12 @@
>  # define __GLRO(value)  GLRO(value)
>  #endif
>  
> -/* dl_hwcap contains only the latest supported ISA, the macro checks which is
> -   and fills the previous ones.  */
> +/* Get the hardware information post the tunables set , the macro checks
> +   it and fills the previous ones.  */
>  #define INIT_ARCH() \
> -  unsigned long int hwcap = __GLRO(dl_hwcap); 			\
> -  unsigned long int __attribute__((unused)) hwcap2 = __GLRO(dl_hwcap2); \
> +  const struct cpu_features *features = &GLRO(dl_powerpc_cpu_features);	\
> +  unsigned long int hwcap = features->hwcap;				\
> +  unsigned long int __attribute__((unused)) hwcap2 = features->hwcap2; \
>    bool __attribute__((unused)) use_cached_memopt =		\
>      __GLRO(dl_powerpc_cpu_features.use_cached_memopt);		\
>    if (hwcap & PPC_FEATURE_ARCH_2_06)				\
> diff --git a/sysdeps/powerpc/powerpc64/dl-machine.h b/sysdeps/powerpc/powerpc64/dl-machine.h
> index 9b8943bc91..449208e86f 100644
> --- a/sysdeps/powerpc/powerpc64/dl-machine.h
> +++ b/sysdeps/powerpc/powerpc64/dl-machine.h
> @@ -27,7 +27,6 @@
>  #include <dl-tls.h>
>  #include <sysdep.h>
>  #include <hwcapinfo.h>
> -#include <cpu-features.c>
>  #include <dl-static-tls.h>
>  #include <dl-funcdesc.h>
>  #include <dl-machine-rel.h>
> @@ -297,7 +296,6 @@ static inline void __attribute__ ((unused))
>  dl_platform_init (void)
>  {
>    __tcb_parse_hwcap_and_convert_at_platform ();
> -  init_cpu_features (&GLRO(dl_powerpc_cpu_features));
>  }
>  #endif
>  
> diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
> index ebe9434052..fc26dd0e17 100644
> --- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
> @@ -17,6 +17,7 @@
>     <https://www.gnu.org/licenses/>.  */
>  
>  #include <assert.h>
> +#include <cpu-features.h>
>  #include <string.h>
>  #include <wchar.h>
>  #include <ldsodefs.h>
> @@ -27,9 +28,9 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>  			size_t max)
>  {
>    size_t i = max;
> -
> -  unsigned long int hwcap = GLRO(dl_hwcap);
> -  unsigned long int hwcap2 = GLRO(dl_hwcap2);
> +  const struct cpu_features *features = &GLRO(dl_powerpc_cpu_features);
> +  unsigned long int hwcap = features->hwcap;
> +  unsigned long int hwcap2 = features->hwcap2;
>  #ifdef SHARED
>    int cacheline_size = GLRO(dl_cache_line_size);
>  #endif

  parent reply	other threads:[~2023-07-11 13:45 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-10 18:21 bmahi496
2023-07-10 21:28 ` Peter Bergner
2023-07-11 12:24   ` Adhemerval Zanella Netto
2023-07-11 16:05     ` MAHESH BODAPATI
2023-07-11 13:45 ` Adhemerval Zanella Netto [this message]
2023-07-11 16:03   ` MAHESH BODAPATI
2023-07-11 16:23     ` Adhemerval Zanella Netto
2023-07-11 16:23     ` Adhemerval Zanella Netto
2023-07-12 11:04       ` MAHESH BODAPATI
2023-07-12 14:07         ` Adhemerval Zanella Netto
2023-07-13 13:17           ` MAHESH BODAPATI

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2b71d409-0649-f47c-0626-f4726456256d@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=bergner@linux.ibm.com \
    --cc=bmahi496@linux.ibm.com \
    --cc=libc-alpha@sourceware.org \
    --cc=mahesh.bodapati@ibm.com \
    --cc=rajis@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).