From: Andrew Pinski
Date: Wed, 13 Mar 2024 12:31:08 -0700
Subject: Re: [PATCH] AArch64: Check kernel version for SVE ifuncs
To: Wilco Dijkstra
Cc: GNU C Library

On Wed, Mar 13, 2024 at 7:32 AM
Wilco Dijkstra wrote:
>
>
> Older Linux kernels may disable SVE after certain syscalls.  Calling the
> SVE-optimized memcpy afterwards will then cause a trap to reenable SVE.
> As a result, applications with a high use of syscalls may run slower with
> the SVE memcpy.  Avoid this by checking the kernel version and enable the
> SVE-optimized memcpy/memmove ifuncs only on Linux kernel 6.2 or newer.
>
> Passes regress, OK for commit?
>
> ---
>
> diff --git a/sysdeps/aarch64/cpu-features.h b/sysdeps/aarch64/cpu-features.h
> index 77a782422af1b6e4b2af32bfebfda37874111510..5f2da91ebbd0adafb0d84ec503b0f902f566da5a 100644
> --- a/sysdeps/aarch64/cpu-features.h
> +++ b/sysdeps/aarch64/cpu-features.h
> @@ -71,6 +71,7 @@ struct cpu_features
>    /* Currently, the GLIBC memory tagging tunable only defines 8 bits.  */
>    uint8_t mte_state;
>    bool sve;
> +  bool prefer_sve_ifuncs;
>    bool mops;
>  };
>
> diff --git a/sysdeps/aarch64/multiarch/init-arch.h b/sysdeps/aarch64/multiarch/init-arch.h
> index c52860efb22d70eb4bdf356781f51c7de8ec67dc..61dc40088f4d9e5e06b57bdc7d54bde1e2a686a4 100644
> --- a/sysdeps/aarch64/multiarch/init-arch.h
> +++ b/sysdeps/aarch64/multiarch/init-arch.h
> @@ -36,5 +36,7 @@
>      MTE_ENABLED ();                                       \
>    bool __attribute__((unused)) sve =                      \
>      GLRO(dl_aarch64_cpu_features).sve;                    \
> +  bool __attribute__((unused)) prefer_sve_ifuncs =        \
> +    GLRO(dl_aarch64_cpu_features).prefer_sve_ifuncs;      \
>    bool __attribute__((unused)) mops =                     \
>      GLRO(dl_aarch64_cpu_features).mops;
> diff --git a/sysdeps/aarch64/multiarch/memcpy.c b/sysdeps/aarch64/multiarch/memcpy.c
> index d12eccfca51f4bcfef6ccf5aa286edb301e361ac..ce53567dab33c2f00b89b4069235abd4651811a6 100644
> --- a/sysdeps/aarch64/multiarch/memcpy.c
> +++ b/sysdeps/aarch64/multiarch/memcpy.c
> @@ -47,7 +47,7 @@ select_memcpy_ifunc (void)
>      {
>        if (IS_A64FX (midr))
>          return __memcpy_a64fx;
> -      return __memcpy_sve;
> +      return prefer_sve_ifuncs ?
__memcpy_sve : __memcpy_generic;

Does it make sense to check prefer_sve_ifuncs before checking IS_A64FX?
This is about the use of SVE registers and the kernel version, and the
a64fx versions use SVE too.
That is, doing:

  if (!prefer_sve_ifuncs)
    return __memcpy_generic;
  if (IS_A64FX (midr))
    return __memcpy_a64fx;
  return __memcpy_sve;

Thanks,
Andrew Pinski

>      }
>
>    if (IS_THUNDERX (midr))
> diff --git a/sysdeps/aarch64/multiarch/memmove.c b/sysdeps/aarch64/multiarch/memmove.c
> index 2081eeb4d40e0240e67a7b26b64576f44eaf18e3..fe95037be391896c7670ef606bf4d3ba7dfb6a00 100644
> --- a/sysdeps/aarch64/multiarch/memmove.c
> +++ b/sysdeps/aarch64/multiarch/memmove.c
> @@ -47,7 +47,7 @@ select_memmove_ifunc (void)
>      {
>        if (IS_A64FX (midr))
>          return __memmove_a64fx;
> -      return __memmove_sve;
> +      return prefer_sve_ifuncs ? __memmove_sve : __memmove_generic;
>      }
>
>    if (IS_THUNDERX (midr))
> diff --git a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
> index b1a3f673f067280bdacfddd92723a81e418023e5..13b02c45df80b493516b3c9d4acbbbffaa47af92 100644
> --- a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
> +++ b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
> @@ -21,6 +21,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>
>  #define DCZID_DZP_MASK (1 << 4)
> @@ -62,6 +63,41 @@ get_midr_from_mcpu (const struct tunable_str_t *mcpu)
>    return UINT64_MAX;
>  }
>
> +/* Parse kernel version without calling any library functions.
> +   Allow 2 digits for kernel version and 3 digits for major version,
> +   separated by '.': "kk.mmm.".
> +   Return kernel version * 1000 + major version, or -1 on failure.
*/
> +
> +static inline int
> +kernel_version (void)
> +{
> +  struct utsname buf;
> +  const char *p = &buf.release[0];
> +  int kernel = 0;
> +  int major = 0;
> +
> +  if (__uname (&buf) < 0)
> +    return -1;
> +
> +  if (*p >= '0' && *p <= '9')
> +    kernel = (kernel * 10) + *p++ - '0';
> +  if (*p >= '0' && *p <= '9')
> +    kernel = (kernel * 10) + *p++ - '0';
> +  if (*p != '.')
> +    return -1;
> +  p++;
> +  if (*p >= '0' && *p <= '9')
> +    major = (major * 10) + *p++ - '0';
> +  if (*p >= '0' && *p <= '9')
> +    major = (major * 10) + *p++ - '0';
> +  if (*p >= '0' && *p <= '9')
> +    major = (major * 10) + *p++ - '0';
> +  if (*p != '.' && *p != '\0')
> +    return -1;
> +
> +  return kernel * 1000 + major;
> +}
> +
>  static inline void
>  init_cpu_features (struct cpu_features *cpu_features)
>  {
> @@ -126,6 +162,10 @@ init_cpu_features (struct cpu_features *cpu_features)
>    /* Check if SVE is supported.  */
>    cpu_features->sve = GLRO (dl_hwcap) & HWCAP_SVE;
>
> +  /* Prefer using SVE in string ifuncs from Linux 6.2 onwards.  */
> +  cpu_features->prefer_sve_ifuncs =
> +    cpu_features->sve && kernel_version () >= 6002;
> +
>    /* Check if MOPS is supported.  */
>    cpu_features->mops = GLRO (dl_hwcap2) & HWCAP2_MOPS;
>  }
>
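
For illustration, the check order suggested in the reply can be combined
with the patch's version parsing in a small standalone C sketch.  This is
not the glibc code: parse_kernel_release, select_memcpy_sketch and the
memcpy_impl enum are hypothetical names, the release string is passed in
as an argument instead of being read with __uname, and the THUNDERX and
other non-SVE selections are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real ifunc targets.  */
enum memcpy_impl { MEMCPY_GENERIC, MEMCPY_A64FX, MEMCPY_SVE };

/* Parse a release string of the form "kk.mmm" or "kk.mmm.<rest>" using the
   same rules as the quoted kernel_version: at most 2 digits, a '.', then
   at most 3 digits followed by '.' or end of string.
   Return kernel * 1000 + major, or -1 on failure.  */
static int
parse_kernel_release (const char *p)
{
  int kernel = 0;
  int major = 0;

  for (int i = 0; i < 2 && *p >= '0' && *p <= '9'; i++)
    kernel = kernel * 10 + (*p++ - '0');
  if (*p != '.')
    return -1;
  p++;
  for (int i = 0; i < 3 && *p >= '0' && *p <= '9'; i++)
    major = major * 10 + (*p++ - '0');
  if (*p != '.' && *p != '\0')
    return -1;

  return kernel * 1000 + major;
}

/* Simplified selection with the kernel check first, so that the A64FX
   variant (which also uses SVE) is skipped on kernels older than 6.2.  */
static enum memcpy_impl
select_memcpy_sketch (bool sve, bool is_a64fx, const char *release)
{
  bool prefer_sve_ifuncs = sve && parse_kernel_release (release) >= 6002;

  if (!prefer_sve_ifuncs)
    return MEMCPY_GENERIC;
  if (is_a64fx)
    return MEMCPY_A64FX;
  return MEMCPY_SVE;
}
```

Under these assumptions an SVE-capable A64FX on a 6.1 kernel gets the
generic routine, which is the effect the reordering is meant to have.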