From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pj1-x102c.google.com (mail-pj1-x102c.google.com [IPv6:2607:f8b0:4864:20::102c]) by sourceware.org (Postfix) with ESMTPS id 759CE3858409 for ; Fri, 10 Feb 2023 23:07:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 759CE3858409 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-pj1-x102c.google.com with SMTP id o13so6910822pjg.2 for ; Fri, 10 Feb 2023 15:07:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=yT2LGXph7feImNttWmuqxOLJQwMBD1eSJdMvOq2elAk=; b=J+hvZgRAzqY9WU+4lH09X0JAmdTmnyo4disIWyEcxz7utO1rSwfsR3uiJZKzXaKCAN VjSxHcaiTeJgJyNPBVS/MitQ8u8XiMM4AgL3MUyPg2St5ev9hroDK8SxFVzc3BEy0KL+ yIxXH+ms39Pkd+L5SVWcEMgFCC2xwqI/K6Sk934/CXiaTwoiqRwggEX7gAI4i3Ra3w6n RZrqfsJeY9zC7M6GDbFPUwlQ94QQcAm7asdg24U4CLMxGIyAC85H1XY0klhazrzmyezr 9MoofC7mrKnWFhwNRIMHJsOskvHps4C0TuV934rZ0qO6FERHT2WHxTcKHzwpnhJm7WuE slfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=yT2LGXph7feImNttWmuqxOLJQwMBD1eSJdMvOq2elAk=; b=klxG5WMl4OcwbGXDE0mN8UK4Geec008oVlv9iQ+iR/UZRiOBAtrL7NuZA5FaNEyDid oe0HS1lY7BcsjksOVsM6JHIVyu6AvuoP1ZHb1hpHZv7oMDSyTz5ZrrCm/iudLgKpe0/S b9+Gj8qJUbfjhhq6RlWHPWpjcBiWYX2M6EqJg1TUzdE7u5t2J6WoWmLS4yqcQcDoz9Oy knErBdeYZeazLek7FlGCA7LxTL20VrRo7vZfeuBy7lR3CqfGmwkNDyyaeeTPQH8R1UuP EjCg7aujPrzN1dc6hwrwLgwXvE3ySiUQYgOI/Ful3dIBA/QdpGBrSkMXQkHTBZADKxTP SBEQ== X-Gm-Message-State: AO0yUKW2Z1msAOhEiuC3yJxQzh5QlrCTz/7xwnDrmznRNhVnagfgoRYM GX+IHcwj7/FMAKmTKG+hrwntzRvoQcvL1oXQQvc= X-Google-Smtp-Source: AK7set+caKKUXtdXNUFdTOSqLBVkSRupK/dkq/bfeZlI16ASruOi5aaO5iZb//LTBQdPOYPI05Mweq37jwS0Yv64Bk8= X-Received: by 2002:a17:903:1315:b0:199:4934:9d29 with SMTP id iy21-20020a170903131500b0019949349d29mr2897597plb.5.1676070455215; Fri, 10 Feb 2023 15:07:35 -0800 (PST) MIME-Version: 1.0 References: <20230210224150.2801962-1-philipp.tomsich@vrull.eu> <20230210224150.2801962-9-philipp.tomsich@vrull.eu> In-Reply-To: <20230210224150.2801962-9-philipp.tomsich@vrull.eu> From: Andrew Pinski Date: Fri, 10 Feb 2023 15:07:22 -0800 Message-ID: Subject: Re: [RFC PATCH v1 08/10] ifcvt: add if-conversion to conditional-zero instructions To: Philipp Tomsich Cc: gcc-patches@gcc.gnu.org, Kito Cheng , Christoph Muellner , Palmer Dabbelt , Andrew Waterman , Vineet Gupta Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-6.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_SHORT,LIKELY_SPAM_BODY,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Fri, Feb 10, 2023 at 2:47 PM Philipp Tomsich wrote: > > Some architectures, as it the case on RISC-V with the proposed > ZiCondOps and the vendor-defined XVentanaCondOps, define a > conditional-zero instruction that is equivalent to: > - the positive form: rd = (rc != 0) ? rs : 0 > - the negated form: rd = (rc == 0) ? rs : 0 > > While noce_try_store_flag_mask will somewhat work for this case, it > will generate a number of atomic RTX that will misdirect the cost > calculation and may be too long (i.e., 4 RTX and more) to successfully > merge at combine-time. Can you expand on this? Especially when there are patterns that use (if_then_else) already. > > Instead, we add two new transforms that attempt to build up what we > define as the canonical form of a conditional-zero expression: > > (set (match_operand 0 "register_operand" "=r") > (and (neg (eq_or_ne (match_operand 1 "register_operand" "r") > (const_int 0))) > (match_operand 2 "register_operand" "r"))) Again why are you not using: (set (reg) (if_then_else (eq_ne (reg) (const_int 0)) (reg) (const_int 0))) Form instead of the really bad "canonical" form of the above? Thanks, Andrew Pinski > > Architectures that provide a conditional-zero are thus expected to > define an instruction matching this pattern in their backend. > > Based on this, we support the following cases: > - noce_try_condzero: > a ? a : b > a ? b : 0 (and then/else swapped) > !a ? b : 0 (and then/else swapped) > - noce_try_condzero_arith: > conditional-plus, conditional-minus, conditional-and, > conditional-or, conditional-xor, conditional-shift, > conditional-and > > Given that this is hooked into the CE passes, it is less powerful than > a tree-pass (e.g., it can not transform cases where an extension, such > as for uint16_t operations is in either the then or else-branch > together with the arithmetic) but already covers a good array of cases > and triggers across SPEC CPU 2017. > > Adding transformations in a tree pass should come in a future > improvement. > > gcc/ChangeLog: > > * ifcvt.cc (noce_emit_insn): Add prototype. > (noce_emit_condzero): Helper for noce_try_condzero and > noce_try_condzero_arith transforms. > (noce_try_condzero): New transform. > (noce_try_condzero_arith): New transform for conditional > arithmetic that can be built up by exploiting that the > conditional-zero instruction will inject 0, which acts > as the neutral element for operations. > (noce_process_if_block): Call noce_try_condzero and > noce_try_condzero_arith. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/xventanacondops-and-01.c: New test. > * gcc.target/riscv/xventanacondops-and-02.c: New test. > * gcc.target/riscv/xventanacondops-eq-01.c: New test. > * gcc.target/riscv/xventanacondops-eq-02.c: New test. > * gcc.target/riscv/xventanacondops-lt-01.c: New test. > * gcc.target/riscv/xventanacondops-ne-01.c: New test. > * gcc.target/riscv/xventanacondops-xor-01.c: New test. > > Signed-off-by: Philipp Tomsich > --- > > gcc/ifcvt.cc | 216 ++++++++++++++++++ > .../gcc.target/riscv/zicond-and-01.c | 16 ++ > .../gcc.target/riscv/zicond-and-02.c | 15 ++ > gcc/testsuite/gcc.target/riscv/zicond-eq-01.c | 11 + > gcc/testsuite/gcc.target/riscv/zicond-eq-02.c | 14 ++ > gcc/testsuite/gcc.target/riscv/zicond-lt-01.c | 16 ++ > gcc/testsuite/gcc.target/riscv/zicond-ne-01.c | 10 + > .../gcc.target/riscv/zicond-xor-01.c | 14 ++ > 8 files changed, 312 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-and-01.c > create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-and-02.c > create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-eq-01.c > create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-eq-02.c > create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-lt-01.c > create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-ne-01.c > create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-xor-01.c > > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc > index 008796838f7..7ac3bd8f18e 100644 > --- a/gcc/ifcvt.cc > +++ b/gcc/ifcvt.cc > @@ -97,6 +97,7 @@ static int find_if_case_2 (basic_block, edge, edge); > static int dead_or_predicable (basic_block, basic_block, basic_block, > edge, int); > static void noce_emit_move_insn (rtx, rtx); > +static rtx_insn *noce_emit_insn (rtx); > static rtx_insn *block_has_only_trap (basic_block); > static void need_cmov_or_rewire (basic_block, hash_set *, > hash_map *); > @@ -787,6 +788,9 @@ static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn **); > static int noce_try_minmax (struct noce_if_info *); > static int noce_try_abs (struct noce_if_info *); > static int noce_try_sign_mask (struct noce_if_info *); > +static rtx noce_emit_condzero (struct noce_if_info *, rtx, bool = false); > +static int noce_try_condzero (struct noce_if_info *); > +static int noce_try_condzero_arith (struct noce_if_info *); > > /* Return the comparison code for reversed condition for IF_INFO, > or UNKNOWN if reversing the condition is not possible. */ > @@ -1664,6 +1668,214 @@ noce_try_addcc (struct noce_if_info *if_info) > return FALSE; > } > > +/* Helper to noce_try_condzero: cond ? a : 0. */ > +static rtx > +noce_emit_condzero (struct noce_if_info *if_info, rtx a, bool reverse) > +{ > + /* The canonical form for a conditional-zero-or-value is: > + (set (match_operand 0 "register_operand" "=r") > + (and (neg (eq_or_ne (match_operand 1 "register_operand" "r") > + (const_int 0))) > + (match_operand 2 "register_operand" "r"))) > + */ > + > + machine_mode opmode = GET_MODE (if_info->x); > + enum rtx_code code = GET_CODE (if_info->cond); > + rtx cond; > + rtx op_a = XEXP (if_info->cond, 0); > + rtx op_b = XEXP (if_info->cond, 1); > + > + /* If it is not a EQ/NE comparison against const0_rtx, canonicalize > + by first synthesizing a truth-value and then building a NE > + condition around it. */ > + if ((code != EQ && code != NE) || XEXP (if_info->cond, 1) != const0_rtx) > + { > + rtx tmp = gen_reg_rtx (opmode); > + > + start_sequence (); > + cond = gen_rtx_fmt_ee (code, opmode, op_a, op_b); > + if (!noce_emit_insn (gen_rtx_SET (tmp, cond))) > + { > + end_sequence (); > + > + /* If we can't emit this pattern, try to reverse it and > + invert the polarity of the second test. */ > + start_sequence (); > + cond = gen_rtx_fmt_ee (reverse_condition (code), opmode, op_a, op_b); > + if (!noce_emit_insn (gen_rtx_SET (tmp, cond))) { > + end_sequence (); > + return NULL_RTX; > + } > + > + /* We have recovered by reversing the first comparison, > + so we need change the second one around as well... */ > + reverse = !reverse; > + } > + rtx_insn *seq = get_insns (); > + end_sequence (); > + emit_insn (seq); > + > + /* Set up the second comparison that will be embedded in the > + canonical conditional-zero-or-value RTX. */ > + code = NE; > + op_a = tmp; > + op_b = const0_rtx; > + } > + > + cond = gen_rtx_fmt_ee (reverse ? reverse_condition (code) : code, > + opmode, op_a, op_b); > + > + /* Build (and (neg (eq_or_ne ... const0_rtx)) (reg )) */ > + rtx target = gen_reg_rtx (opmode); > + rtx czero = gen_rtx_AND (opmode, gen_rtx_NEG (opmode, cond), a); > + noce_emit_move_insn (target, czero); > + > + return target; > +} > + > +/* Use a conditional-zero instruction for "if (test) x = 0;", if available. */ > +static int > +noce_try_condzero (struct noce_if_info *if_info) > +{ > + rtx target; > + rtx_insn *seq; > + int reversep = 0; > + /* Keep local copies of the constituent elements of if_info, as we > + may be changing them. We are not allowed to modify if_info > + though, as we may fail in this function and can't leave different > + semantics behind for the next functions. */ > + rtx a = if_info->a; > + rtx b = if_info->b; > + rtx x = if_info->x; > + rtx cond = if_info->cond; > + enum rtx_code code = GET_CODE (cond); > + rtx cond_arg0 = XEXP (cond, 0); > + rtx cond_arg1 = XEXP (cond, 1); > + rtx orig_b = NULL_RTX; > + > + if (!noce_simple_bbs (if_info)) > + return FALSE; > + > + /* We may encounter the form "(b != 0) ? b : a", which can be > + simplified to "b | ((b != 0) ? 0 : a)". */ > + if (code == NE && cond_arg1 == const0_rtx && > + REG_P (b) && rtx_equal_p (b, cond_arg0)) > + { > + orig_b = b; > + b = const0_rtx; > + } > + > + /* We may encounter the form "(b == 0) ? b : a", which can be > + simplied to "(b == 0) ? 0 : a". */ > + if (code == EQ && cond_arg1 == const0_rtx && > + REG_P (b) && rtx_equal_p (b, cond_arg0)) > + { > + b = const0_rtx; > + } > + > + start_sequence (); > + > + if ((a == const0_rtx && (REG_P (b) || rtx_equal_p (b, x))) > + || ((reversep = (noce_reversed_cond_code (if_info) != UNKNOWN)) > + && b == const0_rtx && (REG_P (a) || rtx_equal_p (a, x)))) > + { > + target = noce_emit_condzero(if_info, reversep ? a : b, reversep); > + > + /* Handle the case where we replace b in "(b != 0) ? b : a" with > + with const0_rtx to then emit "b | ((b != 0) ? 0 : a)". */ > + if (orig_b && target) > + target = expand_simple_binop (GET_MODE (x), IOR, orig_b, > + target, x, 0, OPTAB_WIDEN); > + > + if (target) > + { > + if (target != if_info->x) > + noce_emit_move_insn (if_info->x, target); > + > + seq = end_ifcvt_sequence (if_info); > + if (!seq || !targetm.noce_conversion_profitable_p (seq, if_info)) > + return FALSE; > + > + emit_insn_before_setloc (seq, if_info->jump, > + INSN_LOCATION (if_info->insn_a)); > + if_info->transform_name = "noce_try_condzero"; > + > + return TRUE; > + } > + } > + > + end_sequence (); > + > + return FALSE; > +} > + > +/* Convert "if (test) x op= a;" to a branchless sequence using the > + canonical form for a conditional-zero. */ > +static int > +noce_try_condzero_arith (struct noce_if_info *if_info) > +{ > + rtx target; > + rtx_insn *seq; > + rtx_code op = GET_CODE (if_info->a); > + const rtx arg0 = XEXP (if_info->a, 0); > + const rtx arg1 = XEXP (if_info->a, 1); > + > + if (!noce_simple_bbs (if_info)) > + return FALSE; > + > + /* Check for no else condition. */ > + if (!rtx_equal_p (if_info->x, if_info->b)) > + return FALSE; > + > + if (op != PLUS && op != MINUS && op != IOR && op != XOR && > + op != ASHIFT && op != ASHIFTRT && op != LSHIFTRT && op != AND) > + return FALSE; > + > + if (!rtx_equal_p (if_info->x, arg0)) > + return FALSE; > + > + start_sequence (); > + > + target = noce_emit_condzero(if_info, arg1, op != AND ? true : false); > + > + if (target) > + { > + rtx op1 = if_info->x; > + > + if (op == AND) > + { > + /* Emit "tmp = x & val;" followed by "tmp |= !cond ? x : 0;" */ > + op1 = expand_simple_binop (GET_MODE (if_info->x), AND, op1, > + arg1, NULL_RTX, 0, OPTAB_WIDEN); > + op = IOR; > + } > + > + if (op1) > + target = expand_simple_binop (GET_MODE (if_info->x), op, op1, > + target, if_info->x, 0, OPTAB_WIDEN); > + } > + > + if (target) > + { > + if (target != if_info->x) > + noce_emit_move_insn (if_info->x, target); > + > + seq = end_ifcvt_sequence (if_info); > + if (!seq || !targetm.noce_conversion_profitable_p (seq, if_info)) > + return FALSE; > + > + emit_insn_before_setloc(seq, if_info->jump, > + INSN_LOCATION(if_info->insn_a)); > + if_info->transform_name = "noce_try_condzero_arith"; > + > + return TRUE; > + } > + > + end_sequence (); > + > + return FALSE; > +} > + > /* Convert "if (test) x = 0;" to "x &= -(test == 0);" */ > > static int > @@ -3967,8 +4179,12 @@ noce_process_if_block (struct noce_if_info *if_info) > { > if (noce_try_addcc (if_info)) > goto success; > + if (noce_try_condzero (if_info)) > + goto success; > if (noce_try_store_flag_mask (if_info)) > goto success; > + if (noce_try_condzero_arith (if_info)) > + goto success; > if (HAVE_conditional_move > && noce_try_cmove_arith (if_info)) > goto success; > diff --git a/gcc/testsuite/gcc.target/riscv/zicond-and-01.c b/gcc/testsuite/gcc.target/riscv/zicond-and-01.c > new file mode 100644 > index 00000000000..d9b0ff00756 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/zicond-and-01.c > @@ -0,0 +1,16 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gc_zicond -mabi=lp64 -mbranch-cost=4" } */ > +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */ > + > +long and1(long a, long b, long c, long d) > +{ > + if (c < d) > + a &= b; > + > + return a; > +} > + > +/* { dg-final { scan-assembler-times "and\t" 1 } } */ > +/* { dg-final { scan-assembler-times "slt" 1 } } */ > +/* { dg-final { scan-assembler-times "czero.nez" 1 } } */ > +/* { dg-final { scan-assembler-times "or\t" 1 } } */ > diff --git a/gcc/testsuite/gcc.target/riscv/zicond-and-02.c b/gcc/testsuite/gcc.target/riscv/zicond-and-02.c > new file mode 100644 > index 00000000000..80f417cfb54 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/zicond-and-02.c > @@ -0,0 +1,15 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gc_zicond -mabi=lp64 -mbranch-cost=4" } */ > +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */ > + > +int and2(int a, int b, long c) > +{ > + if (c) > + a &= b; > + > + return a; > +} > + > +/* { dg-final { scan-assembler-times "and\t" 1 } } */ > +/* { dg-final { scan-assembler-times "czero.nez" 1 } } */ > +/* { dg-final { scan-assembler-times "or\t" 1 } } */ > diff --git a/gcc/testsuite/gcc.target/riscv/zicond-eq-01.c b/gcc/testsuite/gcc.target/riscv/zicond-eq-01.c > new file mode 100644 > index 00000000000..4f933c1db60 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/zicond-eq-01.c > @@ -0,0 +1,11 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */ > +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */ > + > +long > +eq1 (long a, long b) > +{ > + return (a == 0) ? b : 0; > +} > + > +/* { dg-final { scan-assembler-times "czero.nez" 1 } } */ > diff --git a/gcc/testsuite/gcc.target/riscv/zicond-eq-02.c b/gcc/testsuite/gcc.target/riscv/zicond-eq-02.c > new file mode 100644 > index 00000000000..a7bc747ab1d > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/zicond-eq-02.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */ > +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */ > + > +long > +eq2 (long a, long b) > +{ > + if (a == 0) > + return b; > + > + return 0; > +} > + > +/* { dg-final { scan-assembler-times "czero.nez" 1 } } */ > diff --git a/gcc/testsuite/gcc.target/riscv/zicond-lt-01.c b/gcc/testsuite/gcc.target/riscv/zicond-lt-01.c > new file mode 100644 > index 00000000000..830bfc6449f > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/zicond-lt-01.c > @@ -0,0 +1,16 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */ > +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */ > + > +long long sink (long long); > + > +long long lt3 (long long a, long long b) > +{ > + if (a < b) > + b = 0; > + > + return sink(b); > +} > + > +/* { dg-final { scan-assembler-times "czero.nez\t" 1 } } */ > +/* { dg-final { scan-assembler-times "slt\t" 1 } } */ > diff --git a/gcc/testsuite/gcc.target/riscv/zicond-ne-01.c b/gcc/testsuite/gcc.target/riscv/zicond-ne-01.c > new file mode 100644 > index 00000000000..f25e601ae3c > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/zicond-ne-01.c > @@ -0,0 +1,10 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */ > +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */ > + > +long long ne1(long long a, long long b) > +{ > + return (a != 0) ? b : 0; > +} > + > +/* { dg-final { scan-assembler-times "czero.eqz" 1 } } */ > diff --git a/gcc/testsuite/gcc.target/riscv/zicond-xor-01.c b/gcc/testsuite/gcc.target/riscv/zicond-xor-01.c > new file mode 100644 > index 00000000000..c45a3be2680 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/zicond-xor-01.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */ > +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */ > + > +long xor1(long crc, long poly) > +{ > + if (crc & 1) > + crc ^= poly; > + > + return crc; > +} > + > +/* { dg-final { scan-assembler-times "czero.eqz" 1 } } */ > +/* { dg-final { scan-assembler-times "xor\t" 1 } } */ > -- > 2.34.1 >