public inbox for libc-alpha@sourceware.org
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Paul E Murphy <murphyp@linux.ibm.com>, libc-alpha@sourceware.org
Subject: Re: [PATCH v3 7/9] powerpc64: Add optimized chacha20
Date: Wed, 20 Apr 2022 16:23:01 -0300
Message-ID: <bc76b66f-918e-193d-460a-c1f807988e0c@linaro.org>
In-Reply-To: <d2f96cb4-314e-165c-8a63-63357c9bd322@linux.ibm.com>



On 20/04/2022 15:38, Paul E Murphy wrote:
> 
> 
> On 4/19/22 4:28 PM, Adhemerval Zanella via Libc-alpha wrote:
>> It adds vectorized ChaCha20 implementation based on libgcrypt
>> cipher/chacha20-ppc.c.  It targets POWER8 and it is used on
>> default for LE.
> 
>> diff --git a/sysdeps/powerpc/powerpc64/chacha20-ppc.c b/sysdeps/powerpc/powerpc64/chacha20-ppc.c
>> new file mode 100644
>> index 0000000000..e2567c379a
>> --- /dev/null
>> +++ b/sysdeps/powerpc/powerpc64/chacha20-ppc.c
> 
> How difficult is it to keep this synchronized with the upstream version in libgcrypt?  Also, this seems like it would be a better placed in the power8 subdirectory.

It would be somewhat complicated, because libgcrypt also implements
poly1305 in the same file (which shares common macros and definitions
with chacha20) and it adds a final XOR with the input stream (which
is not required for arc4random usage, since it does not add any
hardening).

It would require refactoring the libgcrypt code a bit to split
chacha and poly1305 and to add a macro to XOR the input.
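
As a rough illustration of the kind of macro meant here, a minimal sketch
follows.  The names CHACHA20_XOR_INPUT, CHACHA20_STORE and
chacha20_store_block are hypothetical and exist in neither libgcrypt nor
glibc; the sketch only shows how one source file could serve both the
cipher case (output is input XOR keystream) and the arc4random case
(output is the raw keystream).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build-time switch (an assumption, not an existing macro):
   define CHACHA20_XOR_INPUT for cipher behaviour, leave it undefined to
   emit the raw keystream, which is all arc4random needs.  */
#ifdef CHACHA20_XOR_INPUT
# define CHACHA20_STORE(dst, src, ks, i) \
  ((dst)[i] = (uint8_t) ((src)[i] ^ (ks)[i]))
#else
# define CHACHA20_STORE(dst, src, ks, i) \
  ((void) (src), (dst)[i] = (ks)[i])
#endif

/* Write LEN output bytes from the already computed keystream KS.
   SRC is only read when CHACHA20_XOR_INPUT is defined.  */
static void
chacha20_store_block (uint8_t *dst, const uint8_t *src, const uint8_t *ks,
                      size_t len)
{
  assert (dst != NULL && ks != NULL);
  for (size_t i = 0; i < len; i++)
    CHACHA20_STORE (dst, src, ks, i);
}
```

With the switch left undefined, as above, the source buffer is ignored and
the output equals the keystream.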

> 
>> diff --git a/sysdeps/powerpc/powerpc64/chacha20_arch.h b/sysdeps/powerpc/powerpc64/chacha20_arch.h
>> new file mode 100644
>> index 0000000000..a18115392f
>> --- /dev/null
>> +++ b/sysdeps/powerpc/powerpc64/chacha20_arch.h
>> @@ -0,0 +1,47 @@
>> +/* PowerPC optimization for ChaCha20.
>> +   Copyright (C) 2022 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <http://www.gnu.org/licenses/>.  */
>> +
>> +#include <stdbool.h>
>> +#include <ldsodefs.h>
>> +
>> +unsigned int __chacha20_power8_blocks4 (uint32_t *state, uint8_t *dst,
>> +                    const uint8_t *src, size_t nblks)
>> +     attribute_hidden;
>> +
>> +static void
>> +chacha20_crypt (uint32_t *state, uint8_t *dst,
>> +        const uint8_t *src, size_t bytes)
>> +{
>> +  _Static_assert (CHACHA20_BUFSIZE % 4 == 0,
>> +          "CHACHA20_BUFSIZE not multiple of 4");
>> +  _Static_assert (CHACHA20_BUFSIZE >= CHACHA20_BLOCK_SIZE * 4,
>> +          "CHACHA20_BUFSIZE < CHACHA20_BLOCK_SIZE * 4");
>> +
>> +#ifdef __LITTLE_ENDIAN__
>> +  __chacha20_power8_blocks4 (state, dst, src,
>> +                 CHACHA20_BUFSIZE / CHACHA20_BLOCK_SIZE);
>> +#else
>> +  unsigned long int hwcap = GLRO(dl_hwcap);
>> +  unsigned long int hwcap2 = GLRO(dl_hwcap2);
>> +  if (hwcap2 & PPC_FEATURE2_ARCH_2_07 && hwcap & PPC_FEATURE_HAS_ALTIVEC)
>> +    __chacha20_power8_blocks4 (state, dst, src,
>> +                   CHACHA20_BUFSIZE / CHACHA20_BLOCK_SIZE);
>> +  else
>> +    chacha20_crypt_generic (state, dst, src, bytes);
>> +#endif
> 
> This file doesn't seem to obey the multiarch conventions of other powerpc64 specific bits. Is it possible to implement multiarch support similar to the libc/libm routines?

I am not very fond of the powerpc multiarch convention, and it would
require some more boilerplate code to handle BE, but it is doable.

So LE will continue to use __chacha20_power8_blocks4 by
default, while BE will select it only if --with-arch=power8 is defined
for the default build.  With --disable-multi-arch, power8 will be
selected iff --with-arch=power8 is set.
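
For reference, the selection logic of the v3 header quoted above can be
sketched as a standalone function.  This is only an illustration: the
SKETCH_* masks are local stand-ins for PPC_FEATURE_HAS_ALTIVEC and
PPC_FEATURE2_ARCH_2_07, the enum and function names are hypothetical, and
in glibc the values come from GLRO(dl_hwcap) and GLRO(dl_hwcap2) rather
than from function arguments.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the hwcap bits tested in chacha20_arch.h
   (assumed values, defined here so the sketch builds anywhere).  */
#define SKETCH_FEATURE_HAS_ALTIVEC 0x10000000UL
#define SKETCH_FEATURE2_ARCH_2_07  0x80000000UL

static_assert ((SKETCH_FEATURE_HAS_ALTIVEC & SKETCH_FEATURE2_ARCH_2_07) == 0,
               "the two masks must be distinct bits");

/* Hypothetical names, used only for this illustration.  */
enum chacha20_impl { CHACHA20_IMPL_GENERIC, CHACHA20_IMPL_POWER8 };

static enum chacha20_impl
select_chacha20_impl (bool little_endian, unsigned long int hwcap,
                      unsigned long int hwcap2)
{
  /* LE uses the POWER8 routine unconditionally, as in the
     #ifdef __LITTLE_ENDIAN__ branch.  */
  if (little_endian)
    return CHACHA20_IMPL_POWER8;

  /* BE needs both ISA 2.07 and AltiVec; otherwise fall back to the
     generic C implementation.  */
  if ((hwcap2 & SKETCH_FEATURE2_ARCH_2_07) != 0
      && (hwcap & SKETCH_FEATURE_HAS_ALTIVEC) != 0)
    return CHACHA20_IMPL_POWER8;

  return CHACHA20_IMPL_GENERIC;
}
```

The explicit parentheses are equivalent to the unparenthesized test in the
patch, since & binds tighter than && in C.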

---

diff --git a/sysdeps/powerpc/powerpc64/Makefile b/sysdeps/powerpc/powerpc64/Makefile
index 18943ef09e..679d5e49ba 100644
--- a/sysdeps/powerpc/powerpc64/Makefile
+++ b/sysdeps/powerpc/powerpc64/Makefile
@@ -66,9 +66,6 @@ tst-setjmp-bug21895-static-ENV = \
 endif
 
 ifeq ($(subdir),stdlib)
-sysdep_routines += chacha20-ppc
-CFLAGS-chacha20-ppc.c += -mcpu=power8
-
 CFLAGS-tst-ucontext-ppc64-vscr.c += -maltivec
 tests += tst-ucontext-ppc64-vscr
 endif
diff --git a/sysdeps/powerpc/powerpc64/be/multiarch/Makefile b/sysdeps/powerpc/powerpc64/be/multiarch/Makefile
new file mode 100644
index 0000000000..8c75165f7f
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/be/multiarch/Makefile
@@ -0,0 +1,4 @@
+ifeq ($(subdir),stdlib)
+sysdep_routines += chacha20-ppc
+CFLAGS-chacha20-ppc.c += -mcpu=power8
+endif
diff --git a/sysdeps/powerpc/powerpc64/be/multiarch/chacha20-ppc.c b/sysdeps/powerpc/powerpc64/be/multiarch/chacha20-ppc.c
new file mode 100644
index 0000000000..cf9e735326
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/be/multiarch/chacha20-ppc.c
@@ -0,0 +1 @@
+#include <sysdeps/powerpc/powerpc64/power8/chacha20-ppc.c>
diff --git a/sysdeps/powerpc/powerpc64/chacha20_arch.h b/sysdeps/powerpc/powerpc64/be/multiarch/chacha20_arch.h
similarity index 92%
rename from sysdeps/powerpc/powerpc64/chacha20_arch.h
rename to sysdeps/powerpc/powerpc64/be/multiarch/chacha20_arch.h
index a18115392f..6d2762d82b 100644
--- a/sysdeps/powerpc/powerpc64/chacha20_arch.h
+++ b/sysdeps/powerpc/powerpc64/be/multiarch/chacha20_arch.h
@@ -32,10 +32,6 @@ chacha20_crypt (uint32_t *state, uint8_t *dst,
   _Static_assert (CHACHA20_BUFSIZE >= CHACHA20_BLOCK_SIZE * 4,
 		  "CHACHA20_BUFSIZE < CHACHA20_BLOCK_SIZE * 4");
 
-#ifdef __LITTLE_ENDIAN__
-  __chacha20_power8_blocks4 (state, dst, src,
-			     CHACHA20_BUFSIZE / CHACHA20_BLOCK_SIZE);
-#else
   unsigned long int hwcap = GLRO(dl_hwcap);
   unsigned long int hwcap2 = GLRO(dl_hwcap2);
   if (hwcap2 & PPC_FEATURE2_ARCH_2_07 && hwcap & PPC_FEATURE_HAS_ALTIVEC)
@@ -43,5 +39,4 @@ chacha20_crypt (uint32_t *state, uint8_t *dst,
 			       CHACHA20_BUFSIZE / CHACHA20_BLOCK_SIZE);
   else
     chacha20_crypt_generic (state, dst, src, bytes);
-#endif
 }
diff --git a/sysdeps/powerpc/powerpc64/power8/Makefile b/sysdeps/powerpc/powerpc64/power8/Makefile
index 71a59529f3..abb0aa3f11 100644
--- a/sysdeps/powerpc/powerpc64/power8/Makefile
+++ b/sysdeps/powerpc/powerpc64/power8/Makefile
@@ -1,3 +1,8 @@
 ifeq ($(subdir),string)
 sysdep_routines += strcasestr-ppc64
 endif
+
+ifeq ($(subdir),stdlib)
+sysdep_routines += chacha20-ppc
+CFLAGS-chacha20-ppc.c += -mcpu=power8
+endif
diff --git a/sysdeps/powerpc/powerpc64/chacha20-ppc.c b/sysdeps/powerpc/powerpc64/power8/chacha20-ppc.c
similarity index 100%
rename from sysdeps/powerpc/powerpc64/chacha20-ppc.c
rename to sysdeps/powerpc/powerpc64/power8/chacha20-ppc.c
diff --git a/sysdeps/powerpc/powerpc64/power8/chacha20_arch.h b/sysdeps/powerpc/powerpc64/power8/chacha20_arch.h
new file mode 100644
index 0000000000..270c71130f
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/power8/chacha20_arch.h
@@ -0,0 +1,37 @@
+/* PowerPC optimization for ChaCha20.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <stdbool.h>
+#include <ldsodefs.h>
+
+unsigned int __chacha20_power8_blocks4 (uint32_t *state, uint8_t *dst,
+					const uint8_t *src, size_t nblks)
+     attribute_hidden;
+
+static void
+chacha20_crypt (uint32_t *state, uint8_t *dst,
+		const uint8_t *src, size_t bytes)
+{
+  _Static_assert (CHACHA20_BUFSIZE % 4 == 0,
+		  "CHACHA20_BUFSIZE not multiple of 4");
+  _Static_assert (CHACHA20_BUFSIZE >= CHACHA20_BLOCK_SIZE * 4,
+		  "CHACHA20_BUFSIZE < CHACHA20_BLOCK_SIZE * 4");
+
+  __chacha20_power8_blocks4 (state, dst, src,
+			     CHACHA20_BUFSIZE / CHACHA20_BLOCK_SIZE);
+}
