public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jakub Jelinek <jakub@redhat.com>
To: Jeff Law <jeffreyalaw@gmail.com>
Cc: Richard Biener <rguenther@suse.de>,
	Richard Sandiford <richard.sandiford@arm.com>,
	Eric Botcazou <botcazou@adacore.com>,
	gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] dse: Handle SUBREGs of word REGs differently for WORD_REGISTER_OPERATIONS targets [PR109040]
Date: Wed, 5 Apr 2023 16:51:52 +0200	[thread overview]
Message-ID: <ZC2LCD6SIJJLVv0v@tucnak> (raw)
In-Reply-To: <8e0e3cd5-e4db-ce8a-b7dc-baac32aed516@gmail.com>

On Wed, Apr 05, 2023 at 07:14:23AM -0600, Jeff Law wrote:
> > The following testcase is miscompiled on riscv since the addition
> > of *mvconst_internal define_insn_and_split.
> > I believe the bug is in DSE.  We have:
> > (insn 36 35 39 2 (set (mem/c:SI (plus:SI (reg/f:SI 65 frame)
> >                  (const_int -64 [0xffffffffffffffc0])) [2  S4 A128])
> >          (reg:SI 166)) "pr109040.c":9:11 178 {*movsi_internal}
> >       (expr_list:REG_DEAD (reg:SI 166)
> >          (nil)))
> > (insn 39 36 40 2 (set (reg:SI 171)
> >          (zero_extend:SI (mem/c:HI (plus:SI (reg/f:SI 65 frame)
> >                      (const_int -64 [0xffffffffffffffc0])) [0  S2 A128]))) "pr109040.c":9:11 111 {*zero_extendhisi2}
> >       (nil))
> > and RTL DSE's replace_read since r0-86337-g18b526e806ab6455 handles
> > even different modes like in the above case, and so it optimizes it into:
> > (insn 47 35 39 2 (set (reg:HI 175)
> >          (subreg:HI (reg:SI 166) 0)) "pr109040.c":9:11 179 {*movhi_internal}
> >       (expr_list:REG_DEAD (reg:SI 166)
> >          (nil)))
> > (insn 39 47 40 2 (set (reg:SI 171)
> >          (zero_extend:SI (reg:HI 175))) "pr109040.c":9:11 111 {*zero_extendhisi2}
> >       (expr_list:REG_DEAD (reg:HI 175)
> >          (nil)))
> > Pseudo 166 is result of AND with 0x8084c constant (forced into a register).
> Right.  But do we agree that the two above are equivalent?  If they are then
> changing DSE just papers over the combine issue downstream.

It is true that an instruction like
(insn 8 7 9 2 (set (reg:HI 141)
        (subreg:HI (reg:SI 142) 0)) "aauu.c":6:18 181 {*movhi_internal}
     (nil))
can appear in the IL on WORD_REGISTER_OPERATIONS target, but I think the
upper bits shouldn't be random garbage in that case, it should be zero
extended or sign extended.

What happens in combine is we enter combine.cc (simplify_set) with
(set (reg:HI 175)
    (subreg:HI (and:SI (reg:SI 167 [ m ])
            (reg:SI 168)) 0))
and there trigger the
  /* If we have (set x (subreg:m1 (op:m2 ...) 0)) with OP being some operation,
     and X being a REG or (subreg (reg)), we may be able to convert this to
     (set (subreg:m2 x) (op)).

     We can always do this if M1 is narrower than M2 because that means that
     we only care about the low bits of the result.

     However, on machines without WORD_REGISTER_OPERATIONS defined, we cannot
     perform a narrower operation than requested since the high-order bits will
     be undefined.  On machine where it is defined, this transformation is safe
     as long as M1 and M2 have the same number of words.  */
transformation into:
(set (subreg:SI (reg:HI 175) 0)
    (and:SI (reg:SI 167 [ m ])
        (reg:SI 168)))
Though, it is !paradoxical_subreg_p (src) in that case, so it is done
regardless of WORD_REGISTER_OPERATIONS I think.

Then after that try_combine we do:
13325		record_value_for_reg (dest, record_dead_insn,
13326				      WORD_REGISTER_OPERATIONS
13327				      && word_register_operation_p (SET_SRC (setter))
13328				      && paradoxical_subreg_p (SET_DEST (setter))
13329				      ? SET_SRC (setter)
13330				      : gen_lowpart (GET_MODE (dest),
13331						     SET_SRC (setter)));
and the 3 conditions are true here and so record value of the whole setter.
That then records among other things nonzero_bits as 0x8084c.

Next when trying to combine
(insn 39 47 40 2 (set (reg:SI 171)
        (zero_extend:SI (reg:HI 175))) "pr109040.c":10:11 111 {*zero_extendhisi2}
     (expr_list:REG_DEAD (reg:HI 175)
        (nil)))
into
(insn 40 39 43 2 (set (reg:SI 172)
        (leu:SI (reg:SI 171)
            (const_int 5 [0x5]))) "pr109040.c":10:11 291 {*sleu_sisi}
     (expr_list:REG_DEAD (reg:SI 171)
        (nil)))
we i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0);
and that correctly simplifies it into
(and:SI (subreg:SI (reg:HI 175) 0)
    (const_int 2124 [0x84c]))
We substitute that
(leu:SI (and:SI (subreg:SI (reg:HI 175) 0)
        (const_int 2124 [0x84c]))
    (const_int 5 [0x5]))
but then trigger the WORD_REGISTER_OPERATIONS block in simplify_comparison:
          /* If this is (and:M1 (subreg:M1 X:M2 0) (const_int C1)) where C1
             fits in both M1 and M2 and the SUBREG is either paradoxical
             or represents the low part, permute the SUBREG and the AND
             and try again.  */
          if (GET_CODE (XEXP (op0, 0)) == SUBREG
              && CONST_INT_P (XEXP (op0, 1)))
            {
              unsigned HOST_WIDE_INT c1 = INTVAL (XEXP (op0, 1));
              /* Require an integral mode, to avoid creating something like
                 (AND:SF ...).  */
              if ((is_a <scalar_int_mode>
                   (GET_MODE (SUBREG_REG (XEXP (op0, 0))), &tmode))
                  /* It is unsafe to commute the AND into the SUBREG if the
                     SUBREG is paradoxical and WORD_REGISTER_OPERATIONS is
                     not defined.  As originally written the upper bits
                     have a defined value due to the AND operation.
                     However, if we commute the AND inside the SUBREG then
                     they no longer have defined values and the meaning of
                     the code has been changed.
                     Also C1 should not change value in the smaller mode,
                     see PR67028 (a positive C1 can become negative in the
                     smaller mode, so that the AND does no longer mask the
                     upper bits).  */
                  && ((WORD_REGISTER_OPERATIONS
                       && mode_width > GET_MODE_PRECISION (tmode)
                       && mode_width <= BITS_PER_WORD
                       && trunc_int_for_mode (c1, tmode) == (HOST_WIDE_INT) c1)
                      || (mode_width <= GET_MODE_PRECISION (tmode)
                          && subreg_lowpart_p (XEXP (op0, 0))))
                  && mode_width <= HOST_BITS_PER_WIDE_INT
                  && HWI_COMPUTABLE_MODE_P (tmode)
                  && (c1 & ~mask) == 0
                  && (c1 & ~GET_MODE_MASK (tmode)) == 0
                  && c1 != mask
                  && c1 != GET_MODE_MASK (tmode))
                {
                  op0 = simplify_gen_binary (AND, tmode,
                                             SUBREG_REG (XEXP (op0, 0)),
                                             gen_int_mode (c1, tmode));
                  op0 = gen_lowpart (mode, op0);
                  continue;
                }
            }
c1 is 0x84c.  I believe this is the exact spot where things go wrong,
and is because for WORD_REGISTER_OPERATIONS we assume something that the
DSE added instruction didn't guarantee.

	Jakub


  reply	other threads:[~2023-04-05 14:52 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-05  9:16 Jakub Jelinek
2023-04-05 13:14 ` Jeff Law
2023-04-05 14:51   ` Jakub Jelinek [this message]
2023-04-05 16:17     ` Jeff Law
2023-04-05 16:48       ` Jakub Jelinek
2023-04-05 17:31         ` Jeff Law
2023-04-06  9:31           ` Richard Sandiford
2023-04-06  9:37             ` Li, Pan2
2023-04-06 14:49               ` Jeff Law
2023-04-06 14:45             ` Jeff Law
2023-04-06 10:15           ` Eric Botcazou
2023-04-06 10:31             ` [PATCH] combine: Fix simplify_comparison AND handling " Jakub Jelinek
2023-04-06 10:51               ` Eric Botcazou
2023-04-06 11:37                 ` Jakub Jelinek
2023-04-06 14:21                   ` Eric Botcazou
2023-04-09  0:25                     ` Jeff Law
2023-04-10  7:10                       ` Jakub Jelinek
2023-04-12  1:26                         ` Jeff Law
2023-04-12  6:21                           ` Jakub Jelinek
2023-04-12 10:02                             ` [PATCH] combine, v3: Fix " Jakub Jelinek
2023-04-12 14:17                               ` Jeff Law
2023-04-12 14:30                                 ` Jakub Jelinek
2023-04-12 15:24                               ` Segher Boessenkool
2023-04-12 16:58                               ` [PATCH] combine, v4: " Jakub Jelinek
2023-04-13  4:05                                 ` Jeff Law
2023-04-13 10:57                                   ` Segher Boessenkool
2023-04-13 12:35                                     ` Jeff Law
2023-04-13 13:45                                       ` [PATCH] loop-iv: Fix up bounds computation Jakub Jelinek
2023-04-13 15:07                                         ` Jeff Law
2023-04-13 19:37                                         ` Jeff Law
2023-04-12 13:29                             ` [PATCH] combine: Fix simplify_comparison AND handling for WORD_REGISTER_OPERATIONS targets [PR109040] Jeff Law
2023-04-09  1:15                   ` Jeff Law
2023-04-10  5:13                     ` Hongtao Liu
2023-04-10  5:15                       ` Hongtao Liu
2023-04-06 14:35               ` Jeff Law
2023-04-06 15:06               ` Jeff Law
2023-04-06 14:53             ` [PATCH] dse: Handle SUBREGs of word REGs differently " Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZC2LCD6SIJJLVv0v@tucnak \
    --to=jakub@redhat.com \
    --cc=botcazou@adacore.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jeffreyalaw@gmail.com \
    --cc=rguenther@suse.de \
    --cc=richard.sandiford@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).