public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/96965] New: combine RMW and flags
@ 2020-09-07 22:09 aoliva at gcc dot gnu.org
  2020-09-07 22:16 ` [Bug rtl-optimization/96965] " aoliva at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: aoliva at gcc dot gnu.org @ 2020-09-07 22:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96965

            Bug ID: 96965
           Summary: combine RMW and flags
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: segher at gcc dot gnu.org
          Reporter: aoliva at gcc dot gnu.org
  Target Milestone: ---

Consider:

typedef unsigned char T;
T i[2];
int f() {
  T *p = &i[0], *q = &i[1];
  T c = __builtin_add_overflow(*p, 1, p);
  *q += c;
}

The desired code sequence on x86_64 is:

  addb $1, i(%rip)
  adcb $0, i+1(%rip)

What we get instead of the desired addb are separate load, addb, and store
instructions.  There are two reasons why we don't combine them to form the
addb:

- when we try_combine the 3 of them, the flag-store insn is still present,
between M (add) and W (store), thus can_combine_p fails.  after we combine the
flag-store into adcb, we do not retry

- if I manually force the retry, we break up the M parallel insn into a naked
add in i2, and a flag-setting non-canonical compare in i0.  we substitute R and
M into W, for an add without flag-setting.  finally, we deal with added_sets,
building a new parallel to hold the RMW add and appending the flag-setter as
the second item, after the combined add.  alas, recog won't match them in this
order.  *add<mode>3_cc_overflow_1 requires the flag-setter before the
reg-setter.

Here's discussion and combine dumbs from a slightly different testcase that
triggers the same combine behavior:

https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553242.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug rtl-optimization/96965] combine RMW and flags
  2020-09-07 22:09 [Bug rtl-optimization/96965] New: combine RMW and flags aoliva at gcc dot gnu.org
@ 2020-09-07 22:16 ` aoliva at gcc dot gnu.org
  2020-09-08 19:04 ` segher at gcc dot gnu.org
  2021-05-04 12:31 ` rguenth at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: aoliva at gcc dot gnu.org @ 2020-09-07 22:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96965

--- Comment #1 from Alexandre Oliva <aoliva at gcc dot gnu.org> ---
One nit: I wrote the flag-setting non-canonical compare ended up in i0, but it
actually becomes i1, with the original i1 (R) moved to i0.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug rtl-optimization/96965] combine RMW and flags
  2020-09-07 22:09 [Bug rtl-optimization/96965] New: combine RMW and flags aoliva at gcc dot gnu.org
  2020-09-07 22:16 ` [Bug rtl-optimization/96965] " aoliva at gcc dot gnu.org
@ 2020-09-08 19:04 ` segher at gcc dot gnu.org
  2021-05-04 12:31 ` rguenth at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: segher at gcc dot gnu.org @ 2020-09-08 19:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96965

Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-09-08
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Segher Boessenkool <segher at gcc dot gnu.org> ---
insn_cost 8 for     6: r91:SI=zero_extend([`i'])
insn_cost 4 for     8:
{flags:CCC=cmp(r91:SI#0+0x1,r91:SI#0);r92:QI=r91:SI#0+0x1
;}
      REG_DEAD r91:SI
insn_cost 4 for    33: r89:QI=ltu(flags:CCC,0)
      REG_DEAD flags:CCC
insn_cost 4 for    17: [`i']=r92:QI
      REG_DEAD r92:QI

8+17 does not work, because the combined insn replaces 17, but it has to
stay before 33.

8+33+17 is never tried.  I don't immediately see why not?

It may still not work because of the irregularities in the x86 ISA, but
6+8+33+17 should work (if that would be tried...  but if 8+33+17 already
is not done, that needs to be fixed first).

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug rtl-optimization/96965] combine RMW and flags
  2020-09-07 22:09 [Bug rtl-optimization/96965] New: combine RMW and flags aoliva at gcc dot gnu.org
  2020-09-07 22:16 ` [Bug rtl-optimization/96965] " aoliva at gcc dot gnu.org
  2020-09-08 19:04 ` segher at gcc dot gnu.org
@ 2021-05-04 12:31 ` rguenth at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-05-04 12:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96965

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-05-04 12:31 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-07 22:09 [Bug rtl-optimization/96965] New: combine RMW and flags aoliva at gcc dot gnu.org
2020-09-07 22:16 ` [Bug rtl-optimization/96965] " aoliva at gcc dot gnu.org
2020-09-08 19:04 ` segher at gcc dot gnu.org
2021-05-04 12:31 ` rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).