From: Mariam Arutunian
Date: Wed, 19 Jun 2024 19:20:12 +0400
Subject: Re: [RFC/RFA] [PATCH 06/12] aarch64: Implement new expander for efficient CRC computation
To: Mariam Arutunian, GCC Patches, richard.sandiford@arm.com

On Sat, Jun 8, 2024 at 3:41 PM Richard Sandiford wrote:

> Mariam Arutunian writes:
> > This patch introduces two new expanders for the aarch64 backend,
> > dedicated to generate optimized code for CRC computations.
> > The new expanders are designed to leverage specific hardware capabilities
> > to achieve faster CRC calculations,
> > particularly using the pmul or crc32 instructions when supported by the
> > target architecture.
>
> Thanks for porting this to aarch64!
>
> > Expander 1: Bit-Forward CRC (crc4)
> > For targets that support pmul instruction (TARGET_AES),
> > the expander will generate code that uses the pmul (crypto_pmulldi)
> > instruction for CRC computation.
> >
> > Expander 2: Bit-Reversed CRC (crc_rev4)
> > The expander first checks if the target supports the CRC32 instruction set
> > (TARGET_CRC32)
> > and the polynomial in use is 0x1EDC6F41 (iSCSI). If the conditions are met,
> > it emits calls to the corresponding crc32 instruction (crc32b, crc32h,
> > crc32w, or crc32x depending on the data size).
> > If the target does not support crc32 but supports pmul, it then uses the
> > pmul (crypto_pmulldi) instruction for bit-reversed CRC computation.
> >
> > Otherwise table-based CRC is generated.
> >
> > gcc/config/aarch64/
> >
> > * aarch64-protos.h (aarch64_expand_crc_using_clmul): New extern
> > function declaration.
> > (aarch64_expand_reversed_crc_using_clmul): Likewise.
> > * aarch64.cc (aarch64_expand_crc_using_clmul): New function.
> > (aarch64_expand_reversed_crc_using_clmul): Likewise.
> > * aarch64.md (UNSPEC_CRC, UNSPEC_CRC_REV): New unspecs.
> > (crc_rev4): New expander for reversed CRC.
> > (crc4): New expander for reversed CRC.
> > * iterators.md (crc_data_type): New mode attribute.
> >
> > gcc/testsuite/gcc.target/aarch64/
> >
> > * crc-1-pmul.c: Likewise.
> > * crc-10-pmul.c: Likewise.
> > * crc-12-pmul.c: Likewise.
> > * crc-13-pmul.c: Likewise.
> > * crc-14-pmul.c: Likewise.
> > * crc-17-pmul.c: Likewise.
> > * crc-18-pmul.c: Likewise.
> > * crc-21-pmul.c: Likewise.
> > * crc-22-pmul.c: Likewise.
> > * crc-23-pmul.c: Likewise.
> > * crc-4-pmul.c: Likewise.
> > * crc-5-pmul.c: Likewise.
> > * crc-6-pmul.c: Likewise.
> > * crc-7-pmul.c: Likewise.
> > * crc-8-pmul.c: Likewise.
> > * crc-9-pmul.c: Likewise.
> > * crc-CCIT-data16-pmul.c: Likewise.
> > * crc-CCIT-data8-pmul.c: Likewise.
> > * crc-coremark-16bitdata-pmul.c: Likewise.
> > * crc-crc32-data16.c: New test.
> > * crc-crc32-data32.c: Likewise.
> > * crc-crc32-data8.c: Likewise.
> >
> > Signed-off-by: Mariam Arutunian
> > diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
> > index 1d3f94c813e..167e1140f0d 100644
> > --- a/gcc/config/aarch64/aarch64-protos.h
> > +++ b/gcc/config/aarch64/aarch64-protos.h
> > @@ -1117,5 +1117,8 @@ extern void mingw_pe_encode_section_info (tree, rtx, int);
> >
> >  bool aarch64_optimize_mode_switching (aarch64_mode_entity);
> >  void aarch64_restore_za (rtx);
> > +void aarch64_expand_crc_using_clmul (rtx *);
> > +void aarch64_expand_reversed_crc_using_clmul (rtx *);
> > +
> >
> >  #endif /* GCC_AARCH64_PROTOS_H */
> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > index ee12d8897a8..05cd0296d38 100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -30265,6 +30265,135 @@ aarch64_retrieve_sysreg (const char *regname, bool write_p, bool is128op)
> >    return sysreg->encoding;
> >  }
> >
> > +/* Generate assembly to calculate CRC
> > +   using carry-less multiplication instruction.
> > +   OPERANDS[1] is input CRC,
> > +   OPERANDS[2] is data (message),
> > +   OPERANDS[3] is the polynomial without the leading 1.  */
> > +
> > +void
> > +aarch64_expand_crc_using_clmul (rtx *operands)
>
> This should probably be pmul rather than clmul.
>
> > +{
> > +  /* Check and keep arguments.  */
> > +  gcc_assert (!CONST_INT_P (operands[0]));
> > +  gcc_assert (CONST_INT_P (operands[3]));
> > +  rtx crc = operands[1];
> > +  rtx data = operands[2];
> > +  rtx polynomial = operands[3];
> > +
> > +  unsigned HOST_WIDE_INT
> > +    crc_size = GET_MODE_BITSIZE (GET_MODE (operands[0])).to_constant ();
> > +  gcc_assert (crc_size <= 32);
> > +  unsigned HOST_WIDE_INT
> > +    data_size = GET_MODE_BITSIZE (GET_MODE (data)).to_constant ();
>
> We could instead make the interface:
>
>   void
>   aarch64_expand_crc_using_pmul (scalar_mode crc_mode, scalar_mode data_mode,
>                                  rtx *operands)
>
> so that the lines above don't need the to_constant. This should "just
> work" on the .md file side, since the modes being passed are naturally
> scalar_mode.
>
> I think it'd be worth asserting also that data_size <= crc_size.
> (Although we could handle any MAX (data_size, crc_size) <= 32
> with some adjustment.)
>
> > +
> > +  /* Calculate the quotient.  */
> > +  unsigned HOST_WIDE_INT
> > +    q = gf2n_poly_long_div_quotient (UINTVAL (polynomial), crc_size + 1);
> > +
> > +  /* CRC calculation's main part.  */
> > +  if (crc_size > data_size)
> > +    crc = expand_shift (RSHIFT_EXPR, DImode, crc, crc_size - data_size,
> > +                        NULL_RTX, 1);
> > +
> > +  rtx t0 = gen_reg_rtx (DImode);
> > +  aarch64_emit_move (t0, gen_int_mode (q, DImode));
>
> It's only a minor simplification, but this could instead be:
>
>   rtx t0 = force_reg (DImode, gen_int_mode (q, DImode));
>
> > +  rtx t1 = gen_reg_rtx (DImode);
> > +  aarch64_emit_move (t1, polynomial);
>
> If polynomial is a constant operand of mode crc_mode, GCC's standard
> CONST_INT representation is to sign-extend the constant to 64 bits.
> E.g. a QImode value of 0b1000_0000 would be represented as -128.
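A small illustration of the sign-extension point, in plain C rather than GCC internals. It is only an analogy: the 8-bit pattern stands in for a QImode constant and `int64_t` for the 64-bit CONST_INT payload, and both helper names are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not GCC code): widen an 8-bit pattern the way a
   sign-extended constant would be stored in 64 bits.  */
static int64_t
widen_sign_extended (uint8_t bits)
{
  /* Patterns with the top bit set map to negative values.  */
  return bits >= 0x80 ? (int64_t) bits - 0x100 : (int64_t) bits;
}

/* Hypothetical helper (not GCC code): widen the same pattern with zero
   extension, which is the form a 64-bit polynomial operand needs.  */
static int64_t
widen_zero_extended (uint8_t bits)
{
  return (int64_t) bits;
}
```

For the pattern 0b1000_0000, the first helper returns -128 (all upper bits set when viewed as unsigned), while the second returns 128. Feeding the first form into a 64-bit carry-less multiply would drag the set upper bits into the product.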
>
> I think here we want the zero-extended form, so it might be safer to do:
>
>   polynomial = simplify_gen_unary (ZERO_EXTEND, DImode, polynomial,
>                                    crc_mode);
>   rtx t1 = force_reg (DImode, polynomial);
>
> > +
> > +  rtx a0 = expand_binop (DImode, xor_optab, crc, data, NULL_RTX, 1,
> > +                         OPTAB_WIDEN);
> > +
> > +  rtx clmul_res = gen_reg_rtx (TImode);
> > +  emit_insn (gen_aarch64_crypto_pmulldi (clmul_res, a0, t0));
> > +  a0 = gen_lowpart (DImode, clmul_res);
> > +
> > +  a0 = expand_shift (RSHIFT_EXPR, DImode, a0, crc_size, NULL_RTX, 1);
> > +
> > +  emit_insn (gen_aarch64_crypto_pmulldi (clmul_res, a0, t1));
> > +  a0 = gen_lowpart (DImode, clmul_res);
> > +
> > +  if (crc_size > data_size)
> > +    {
> > +      rtx crc_part = expand_shift (LSHIFT_EXPR, DImode, operands[1], data_size,
> > +                                   NULL_RTX, 0);
> > +      a0 =  expand_binop (DImode, xor_optab, a0, crc_part, NULL_RTX, 1,
> > +                          OPTAB_DIRECT);
>
> Formatting nit: extra space after "a0 = "
>
> > +    }
> > +  /* Zero upper bits beyond crc_size.  */
> > +  rtx num_shift = gen_int_mode (64 - crc_size, DImode);
> > +  a0 = expand_shift (LSHIFT_EXPR, DImode, a0, 64 - crc_size, NULL_RTX, 0);
> > +  a0 = expand_shift (RSHIFT_EXPR, DImode, a0, 64 - crc_size, NULL_RTX, 1);
>
> Rather than shift left and then right, I think we should just AND:
>
>   rtx mask = gen_int_mode (GET_MODE_MASK (crc_mode), DImode);
>   a0 = expand_binop (DImode, and_optab, a0, mask, NULL_RTX, 1,
>                      OPTAB_DIRECT);
>
> That said, it looks like operands[0] has crc_mode. The register bits
> above crc_size therefore shouldn't matter, since they're undefined on read.
> E.g. even though (reg:SI R) is stored in an X register, only the low 32
> bits are defined; the upper 32 bits can be any value.
>
> So I'd expect we could replace this and...
>
> > +
> > +  rtx tgt = simplify_gen_subreg (DImode, operands[0],
> > +                                 GET_MODE (operands[0]), 0);
> > +  aarch64_emit_move (tgt, a0);
>
> ...this with just:
>
>   aarch64_emit_move (operands[0], gen_lowpart (crc_mode, a0));
>
> Perhaps that would break down if operands[0] is a subreg with
> SUBREG_PROMOTED_VAR_P set, but I think it's up to target-independent
> code to handle that case.
>
> > @@ -4543,6 +4545,63 @@
> >    [(set_attr "type" "crc")]
> >  )
> >
> > +;; Reversed CRC
> > +(define_expand "crc_rev4"
> > +  ;; return value (calculated CRC)
> > +  [(set (match_operand:ALLX 0 "register_operand" "=r")
> > +        ;; initial CRC
> > +        (unspec:ALLX [(match_operand:ALLX 1 "register_operand" "r")
> > +                      ;; data
> > +                      (match_operand:ALLI 2 "register_operand" "r")
> > +                      ;; polynomial without leading 1
> > +                      (match_operand:ALLX 3)]
> > +                     UNSPEC_CRC_REV))]
>
> Since we (rightly) never generate the RTL above, I think this can just be:
>
>   (define_expand "crc_rev4"
>     [;; return value (calculated CRC)
>      (match_operand:ALLX 0 "register_operand")
>      ;; initial CRC
>      (match_operand:ALLX 1 "register_operand")
>      ;; data
>      (match_operand:ALLI 2 "register_operand")
>      ;; polynomial without leading 1
>      (match_operand:ALLX 3)]
>
> without the unspec and constraints.
>
> > +  ""
> > +  {
> > +    /* If the polynomial is the same as the polynomial of crc32 instruction,
> > +       put that instruction.  crc32 uses iSCSI polynomial (0x1EDC6F41).  */
> > +    if (TARGET_CRC32 && INTVAL (operands[3]) == 517762881)
>
> The hex constant feels a little easier to read. I think it'd also
> be worth checking mode == SImode, even though it's currently
> redundant (given that no other choice would allow that polynomial).
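For reference, the decimal constant in the patch and the hex spelling are the same number, and the constant used by the bit-reversed loops in the crc-crc32-data*.c tests is its bit reversal. A minimal C sketch to check that, assuming nothing beyond standard C (`reflect32` is a made-up helper name, not a GCC function):

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the order of the 32 bits in X (bit 0 becomes bit 31, etc.).  */
static uint32_t
reflect32 (uint32_t x)
{
  uint32_t r = 0;
  for (int i = 0; i < 32; i++)
    {
      /* Shift the result left and append the current lowest bit of X.  */
      r = (r << 1) | (x & 1);
      x >>= 1;
    }
  return r;
}
```

`reflect32 (0x1EDC6F41)` gives 0x82F63B78, the value XORed in by the test loops; the same helper maps the CRC32 polynomial 0x04C11DB7 to 0xEDB88320.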
>
> > +      {
> > +        rtx crc_result = gen_reg_rtx (SImode);
> > +        rtx crc = operands[1];
> > +        rtx data = operands[2];
> > +        emit_insn (gen_aarch64_crc32c (crc_result, crc,
> > +                                       data));
> > +        emit_move_insn (operands[0],
> > +                        gen_lowpart (GET_MODE (operands[0]), crc_result));
>
> If operands[0] has ALLX mode (== SImode), it looks like we should be
> able to use operands[0] directly as the result of the CRC32C.
>
> FWIW, there's also CRC32 for the HDLC etc. polynomial 0x04C11DB7.
>
> > +      }
> > +    else if (TARGET_AES)
>
> I think we also need to check <= for this.
> Similarly for the unreversed CRC pattern.
>
> Thanks again for doing this. I realise RISC-V is the lead target for
> this work, so you've gone above and beyond by doing a full AArch64
> port too. It'd be perfectly valid to ask Arm developers to deal
> with the comments above, so please let me know if you'd prefer that.
> The patch looks close to ready to me though.
>

Thanks for your suggestions and explanations, and thank you for recognizing
my work. I'll resolve all the comments.

Best regards,
Mariam

> Richard
>
> > +      aarch64_expand_reversed_crc_using_clmul (operands);
> > +    else
> > +      {
> > +        /* Otherwise, generate table-based CRC.  */
> > +        expand_reversed_crc_table_based (operands[0], operands[1], operands[2],
> > +                                         operands[3], GET_MODE (operands[2]),
> > +                                         generate_reflecting_code_standard);
> > +      }
> > +    DONE;
> > +  }
> > +)
> > +
> > +;; Bit-forward CRC
> > +(define_expand "crc4"
> > +  ;; return value (calculated CRC)
> > +  [(set (match_operand:ALLX 0 "register_operand" "=r")
> > +        ;; initial CRC
> > +        (unspec:ALLX [(match_operand:ALLX 1 "register_operand" "r")
> > +                      ; data
> > +                      (match_operand:ALLI 2 "register_operand" "r")
> > +                      ;; polynomial without leading 1
> > +                      (match_operand:ALLX 3)]
> > +                     UNSPEC_CRC))]
> > +  "TARGET_AES"
> > +  {
> > +    aarch64_expand_crc_using_clmul (operands);
> > +    DONE;
> > +  }
> > +)
> > +
> > +
> >  (define_insn "*csinc2_insn"
> >    [(set (match_operand:GPI 0 "register_operand" "=r")
> >          (plus:GPI (match_operand 2 "aarch64_comparison_operation" "")
> > diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> > index 99cde46f1ba..86e4863d684 100644
> > --- a/gcc/config/aarch64/iterators.md
> > +++ b/gcc/config/aarch64/iterators.md
> > @@ -1276,6 +1276,10 @@
> >  ;; Map a mode to a specific constraint character.
> >  (define_mode_attr cmode [(QI "q") (HI "h") (SI "s") (DI "d")])
> >
> > +;; Map a mode to a specific constraint character for calling
> > +;; appropriate version of crc.
> > +(define_mode_attr crc_data_type [(QI "b") (HI "h") (SI "w") (DI "x")])
> > +
> >  ;; Map modes to Usg and Usj constraints for SISD right shifts
> >  (define_mode_attr cmode_simd [(SI "g") (DI "j")])
> >
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c
> > new file mode 100644
> > index 00000000000..2bea6280762
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c
> > @@ -0,0 +1,8 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details -fdisable-tree-phiopt2 -fdisable-tree-phiopt3" } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-1.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > \ No newline at end of file
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-10-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-10-pmul.c
> > new file mode 100644
> > index 00000000000..846eecbaa85
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-10-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-10.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-12-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-12-pmul.c
> > new file mode 100644
> > index 00000000000..0eea6aa6741
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-12-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details -fdisable-tree-phiopt2 -fdisable-tree-phiopt3" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-12.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-13-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-13-pmul.c
> > new file mode 100644
> > index 00000000000..7ff8fbcb665
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-13-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-13.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-14-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-14-pmul.c
> > new file mode 100644
> > index 00000000000..80766daf487
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-14-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-14.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-17-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-17-pmul.c
> > new file mode 100644
> > index 00000000000..0e32fffa0b6
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-17-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-17.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-18-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-18-pmul.c
> > new file mode 100644
> > index 00000000000..87f4c63b5ea
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-18-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-18.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-21-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-21-pmul.c
> > new file mode 100644
> > index 00000000000..6eeac8cf97f
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-21-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-21.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-22-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-22-pmul.c
> > new file mode 100644
> > index 00000000000..76e3c00ce9f
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-22-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-22.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-23-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-23-pmul.c
> > new file mode 100644
> > index 00000000000..e3a5e99ffba
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-23-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-23.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c
> > new file mode 100644
> > index 00000000000..528006c0099
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-4.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c
> > new file mode 100644
> > index 00000000000..41e1f8202bc
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -w -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-5.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > \ No newline at end of file
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c
> > new file mode 100644
> > index 00000000000..83db99ccb8b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-6.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > \ No newline at end of file
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c
> > new file mode 100644
> > index 00000000000..7ad777aac8c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-7.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c
> > new file mode 100644
> > index 00000000000..da1b619c418
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-8.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c
> > new file mode 100644
> > index 00000000000..33bbe0bfb26
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-9.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data16-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data16-pmul.c
> > new file mode 100644
> > index 00000000000..0c452c1c0f4
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data16-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-w -march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-CCIT-data16.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > \ No newline at end of file
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data8-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data8-pmul.c
> > new file mode 100644
> > index 00000000000..87a0b4489a2
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data8-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-w -march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-CCIT-data8.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > \ No newline at end of file
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-coremark-16bitdata-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-coremark-16bitdata-pmul.c
> > new file mode 100644
> > index 00000000000..75ed5aff80b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-coremark-16bitdata-pmul.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-w -march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include "../../gcc.c-torture/execute/crc-coremark16-data16.c"
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
> > \ No newline at end of file
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32-data16.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data16.c
> > new file mode 100644
> > index 00000000000..d5aeee7c0c4
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data16.c
> > @@ -0,0 +1,53 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include <stdint.h>
> > +#include <stdlib.h>
> > +
> > +__attribute__ ((noinline,optimize(0)))
> > +uint32_t _crc32_O0 (uint32_t crc, uint16_t data) {
> > +  int i;
> > +  crc = crc ^ data;
> > +
> > +  for (i = 0; i < 8; i++) {
> > +      if (crc & 1)
> > +        crc = (crc >> 1) ^ 0x82F63B78;
> > +      else
> > +        crc = (crc >> 1);
> > +  }
> > +
> > +  return crc;
> > +}
> > +
> > +uint32_t _crc32 (uint32_t crc, uint16_t data) {
> > +  int i;
> > +  crc = crc ^ data;
> > +
> > +  for (i = 0; i < 8; i++) {
> > +      if (crc & 1)
> > +        crc = (crc >> 1) ^ 0x82F63B78;
> > +      else
> > +        crc = (crc >> 1);
> > +  }
> > +
> > +  return crc;
> > +}
> > +
> > +int main ()
> > +{
> > +  uint32_t crc = 0x0D800D80;
> > +  for (uint16_t i = 0; i < 0xffff; i++)
> > +    {
> > +      uint32_t res1 = _crc32_O0 (crc, i);
> > +      uint32_t res2 = _crc32 (crc, i);
> > +      if (res1 != res2)
> > +        abort ();
> > +      crc = res1;
> > +    }
> > +}
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32" "dfinish"} } */
> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32-data32.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data32.c
> > new file mode 100644
> > index 00000000000..f0e319b3ab8
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data32.c
> > @@ -0,0 +1,52 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include <stdint.h>
> > +#include <stdlib.h>
> > +__attribute__ ((noinline,optimize(0)))
> > +uint32_t _crc32_O0 (uint32_t crc, uint32_t data) {
> > +  int i;
> > +  crc = crc ^ data;
> > +
> > +  for (i = 0; i < 32; i++) {
> > +      if (crc & 1)
> > +        crc = (crc >> 1) ^ 0x82F63B78;
> > +      else
> > +        crc = (crc >> 1);
> > +  }
> > +
> > +  return crc;
> > +}
> > +
> > +uint32_t _crc32 (uint32_t crc, uint32_t data) {
> > +  int i;
> > +  crc = crc ^ data;
> > +
> > +  for (i = 0; i < 32; i++) {
> > +      if (crc & 1)
> > +        crc = (crc >> 1) ^ 0x82F63B78;
> > +      else
> > +        crc = (crc >> 1);
> > +  }
> > +
> > +  return crc;
> > +}
> > +
> > +int main ()
> > +{
> > +  uint32_t crc = 0x0D800D80;
> > +  for (uint8_t i = 0; i < 0xff; i++)
> > +    {
> > +      uint32_t res1 = _crc32_O0 (crc, i);
> > +      uint32_t res2 = _crc32 (crc, i);
> > +      if (res1 != res2)
> > +        abort ();
> > +      crc = res1;
> > +    }
> > +}
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32" "dfinish"} } */
> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32-data8.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data8.c
> > new file mode 100644
> > index 00000000000..95ffde6a9d2
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data8.c
> > @@ -0,0 +1,53 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc-details" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
> > +
> > +#include <stdint.h>
> > +#include <stdlib.h>
> > +
> > +__attribute__ ((noinline,optimize(0)))
> > +uint32_t _crc32_O0 (uint32_t crc, uint8_t data) {
> > +  int i;
> > +  crc = crc ^ data;
> > +
> > +  for (i = 0; i < 8; i++) {
> > +      if (crc & 1)
> > +        crc = (crc >> 1) ^ 0x82F63B78;
> > +      else
> > +        crc = (crc >> 1);
> > +  }
> > +
> > +  return crc;
> > +}
> > +
> > +uint32_t _crc32 (uint32_t crc, uint8_t data) {
> > +  int i;
> > +  crc = crc ^ data;
> > +
> > +  for (i = 0; i < 8; i++) {
> > +      if (crc & 1)
> > +        crc = (crc >> 1) ^ 0x82F63B78;
> > +      else
> > +        crc = (crc >> 1);
> > +  }
> > +
> > +  return crc;
> > +}
> > +
> > +int main ()
> > +{
> > +  uint32_t crc = 0x0D800D80;
> > +  for (uint8_t i = 0; i < 0xff; i++)
> > +    {
> > +      uint32_t res1 = _crc32_O0 (crc, i);
> > +      uint32_t res2 = _crc32 (crc, i);
> > +      if (res1 != res2)
> > +        abort ();
> > +      crc = res1;
> > +    }
> > +}
> > +
> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32" "dfinish"} } */
> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */
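For anyone who wants to sanity-check the two-multiplication scheme used by the bit-forward expander outside the compiler, here is a rough portable C model. It is a sketch under assumptions, not the code from the patch: `clmul32` is a software stand-in for PMULL, `barrett_quotient` is a hypothetical helper that may not follow the same convention as `gf2n_poly_long_div_quotient`, and only the case crc_size == data_size with sizes up to 16 bits is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Software carry-less multiply of two 32-bit values (stand-in for PMULL).  */
static uint64_t
clmul32 (uint32_t a, uint32_t b)
{
  uint64_t r = 0;
  for (int i = 0; i < 32; i++)
    if ((b >> i) & 1)
      r ^= (uint64_t) a << i;
  return r;
}

/* Hypothetical helper: quotient of x^(2n) divided by P = x^n + POLY over
   GF(2), for 1 <= n <= 16.  The result has n + 1 significant bits.  */
static uint32_t
barrett_quotient (uint32_t poly, unsigned n)
{
  uint64_t p = ((uint64_t) 1 << n) | poly;  /* Full polynomial, degree n.  */
  uint64_t rem = (uint64_t) 1 << (2 * n);   /* Dividend x^(2n).  */
  uint32_t q = 0;
  for (unsigned i = n + 1; i-- > 0;)
    if ((rem >> (n + i)) & 1)
      {
        q |= (uint32_t) 1 << i;
        rem ^= p << i;
      }
  return q;
}

/* Bit-forward CRC step with crc_size == data_size == n, using two
   carry-less multiplications: one by the quotient, one by POLY.  */
static uint32_t
crc_clmul (uint32_t crc, uint32_t data, uint32_t poly, unsigned n)
{
  uint32_t mask = ((uint32_t) 1 << n) - 1;
  uint32_t a0 = (crc ^ data) & mask;
  uint32_t t = (uint32_t) (clmul32 (a0, barrett_quotient (poly, n)) >> n);
  return (uint32_t) clmul32 (t, poly) & mask;
}

/* Reference: the same step computed one bit at a time, MSB first.  */
static uint32_t
crc_bitwise (uint32_t crc, uint32_t data, uint32_t poly, unsigned n)
{
  uint32_t mask = ((uint32_t) 1 << n) - 1;
  uint32_t top = (uint32_t) 1 << (n - 1);
  crc = (crc ^ data) & mask;
  for (unsigned i = 0; i < n; i++)
    crc = (crc & top) ? ((crc << 1) ^ poly) & mask : (crc << 1) & mask;
  return crc;
}
```

With the CRC-8 polynomial 0x07, `crc_clmul (c, d, 0x07, 8)` should agree with `crc_bitwise (c, d, 0x07, 8)` for every 8-bit `c` and `d`. The crc_size > data_size adjustment and the bit-reversed variant from the patch are deliberately left out of this model.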