public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: HAO CHEN GUI <guihaoc@linux.ibm.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	 Segher Boessenkool <segher@kernel.crashing.org>,
	David <dje.gcc@gmail.com>,  "Kewen.Lin" <linkw@linux.ibm.com>,
	Peter Bergner <bergner@linux.ibm.com>,
	 "Vladimir N. Makarov" <vmakarov@redhat.com>
Subject: Re: [PATCH-1, rs6000] Put constant into pseudo at expand when it needs two insns [PR86106]
Date: Thu, 16 Mar 2023 08:57:08 +0100	[thread overview]
Message-ID: <CAFiYyc068n0cFZDAsZzYpERUXYpScHZfe0LZQoPvrb1esCBATA@mail.gmail.com> (raw)
In-Reply-To: <22e83da3-a81f-dd61-c04b-a39b459a965f@linux.ibm.com>

On Thu, Mar 16, 2023 at 6:35 AM HAO CHEN GUI via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
>   Currently, rs6000 directly expands to 2 insns if an integer constant is the
> second operand and it needs two insns. For example, addi/addis and ori/oris.
> It may not benefit when the constant is used for more than 2 times in an
> extended basic block, just like the case in PR shows.
>
>   One possible solution is to force the constant in pseudo at expand and let
> propagation pass and combine pass decide if the pseudo should be replaced
> with the constant or not by comparing the rtx/insn cost.
>
>   It generates a constant move if the constant is forced to a pseudo. There
> is one constant move if it's used only once. The combine pass can combine
> the constant move and add/ior/xor insn and eliminate the move as the insn
> cost reduces. There are multiple moves if the constant is used for several
> times. In an extended basic block, these constant moves are merged to one by
> propagation pass. The combine pass can't replace the pseudo with the constant
> as it is no cost saving.
>
>   In an extreme case, the constant is used twice in an extended basic block.
> The cost(latency) is unchanged between putting constant in pseudo and
> generating 2 insns. The dependence of instructions reduces but one more
> register is used. In other case, it should be always optimal to put constant
> in a pseudo.
>
>   This patch changes the expander of integer add and force constant to a
> pseudo when it needs 2 insn. Also a combine and split pattern is defined.

So this is one way around the lack of CSE/PRE of constant operands.  I'd
argue that a better spot for this _might_ be LRA (split the constant out if
there's a free register available), postreload-[g]cse (CSE the constants) and
then maybe cprop_hardreg to combine back single-use constants?

I'm not sure if careful constraints massaging like adding magic letters to
alternatives with constants to pessimize them for LRA, making them
more expensive than spilling the constant to a register but avoid
secondary reloads with spilling a register to the stack to make room
for the constant, is possible - but in theory a special constraint modifier
for this purpose could be invented.

Richard.

>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
>
> Thanks
> Gui Haochen
>
> ChangeLog
> 2023-03-14  Haochen Gui <guihaoc@linux.ibm.com>
>
> gcc/
>         * config/rs6000/predicates.md (add_2insn_cint_operand): New predicate
>         which returns true when op is a 32-bit but not a 16-bit signed
>         integer constant.
>         * config/rs6000/rs6000.md (add<mode>3): Put the second operand into
>         register when it's a constant and need 2 add insns.
>         (*add<mode>_2insn): New insn_and_split for 2-insn add.
>
>
> patch.diff
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index a1764018545..09e59a48cd3 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -282,6 +282,13 @@ (define_predicate "s32bit_cint_operand"
>    (and (match_code "const_int")
>         (match_test "(0x80000000 + UINTVAL (op)) >> 32 == 0")))
>
> +;; Return 1 if op is a 32-bit but not 16-bit constant signed integer
> +(define_predicate "add_2insn_cint_operand"
> +  (and (match_code "const_int")
> +       (and (match_operand 0 "s32bit_cint_operand")
> +           (and (not (match_operand 0 "short_cint_operand"))
> +                (not (match_operand 0 "upper16_cint_operand"))))))
> +
>  ;; Return 1 if op is a constant 32-bit unsigned
>  (define_predicate "c32bit_cint_operand"
>    (and (match_code "const_int")
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 6011f5bf76a..dba41e3df90 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -1796,12 +1796,44 @@ (define_expand "add<mode>3"
>        /* The ordering here is important for the prolog expander.
>          When space is allocated from the stack, adding 'low' first may
>          produce a temporary deallocation (which would be bad).  */
> -      emit_insn (gen_add<mode>3 (tmp, operands[1], GEN_INT (rest)));
> -      emit_insn (gen_add<mode>3 (operands[0], tmp, GEN_INT (low)));
> -      DONE;
> +      if (!can_create_pseudo_p ())
> +       {
> +         emit_insn (gen_add<mode>3 (tmp, operands[1], GEN_INT (rest)));
> +         emit_insn (gen_add<mode>3 (operands[0], tmp, GEN_INT (low)));
> +         DONE;
> +       }
> +
> +      operands[2] = force_reg (<MODE>mode, operands[2]);
>      }
>  })
>
> +/* The ordering here is important for the prolog expander.
> +   When space is allocated from the stack, adding 'low' first may
> +   produce a temporary deallocation (which would be bad).  */
> +
> +(define_insn_and_split "*add<mode>_2insn"
> +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=b")
> +       (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%b")
> +                 (match_operand:GPR 2 "add_2insn_cint_operand" "n")))]
> +  "!TARGET_PREFIXED"
> +  "#"
> +  "&& 1"
> +  [(set (match_dup 0)
> +       (plus:GPR (match_dup 1)
> +                 (match_dup 3)))
> +   (set (match_dup 0)
> +       (plus:GPR (match_dup 0)
> +                 (match_dup 4)))]
> +{
> +  HOST_WIDE_INT val = INTVAL (operands[2]);
> +  HOST_WIDE_INT low = sext_hwi (val, 16);
> +  HOST_WIDE_INT rest = trunc_int_for_mode (val - low, <MODE>mode);
> +
> +  operands[3] = GEN_INT (rest);
> +  operands[4] = GEN_INT (low);
> +}
> +  [(set_attr "length" "8")])
> +
>  (define_insn "*add<mode>3"
>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r,r")
>         (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b,b")

  reply	other threads:[~2023-03-16  7:57 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-16  5:34 HAO CHEN GUI
2023-03-16  7:57 ` Richard Biener [this message]
2023-03-16  9:04   ` HAO CHEN GUI
2023-03-16 10:36     ` Richard Biener
2023-03-17  2:48       ` HAO CHEN GUI
2023-03-22  3:08   ` HAO CHEN GUI

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFiYyc068n0cFZDAsZzYpERUXYpScHZfe0LZQoPvrb1esCBATA@mail.gmail.com \
    --to=richard.guenther@gmail.com \
    --cc=bergner@linux.ibm.com \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=guihaoc@linux.ibm.com \
    --cc=linkw@linux.ibm.com \
    --cc=segher@kernel.crashing.org \
    --cc=vmakarov@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).