public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp
@ 2021-10-15 21:07 christophm30 at gmail dot com
  2021-10-15 21:10 ` [Bug tree-optimization/102793] " pinskia at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: christophm30 at gmail dot com @ 2021-10-15 21:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793

            Bug ID: 102793
           Summary: AArch64: sequential comparisons with equal conditional
                    blocks don't use ccmp
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: christophm30 at gmail dot com
  Target Milestone: ---

The following code:

int ccmp(uint64_t* s1, uint64_t* s2, int(*foo)(void))
{
    uint64_t d1, d2, bar;

    d1 = *s1++;
    d2 = *s2++;
    bar = (d1 ^ d2) & 0xabcd;
    if (bar == 0 || d1 != d2)
      return foo();
    return 0;
}

int noccmp(uint64_t* s1, uint64_t* s2, int(*foo)(void))
{
    uint64_t d1, d2, bar;

    d1 = *s1++;
    d2 = *s2++;
    bar = (d1 ^ d2) & 0xabcd;
    if (bar == 0)
      return foo();
    if (d1 != d2)
      return foo();
    return 0;
}

...produces (GCC master or earlier, ARM64, with -O3):

ccmp:
        ldr     x3, [x0]
        mov     x4, 43981
        ldr     x0, [x1]
        eor     x1, x3, x0
        tst     x1, x4
        ccmp    x3, x0, 0, ne
        bne     .L5
        mov     w0, 0
        ret
.L5:
        mov     x16, x2
        br      x16
noccmp:
        ldr     x3, [x0]
        mov     x4, 43981
        ldr     x0, [x1]
        eor     x1, x3, x0
        tst     x1, x4
        beq     .L8
        cmp     x3, x0
        beq     .L9
.L8:
        mov     x16, x2
        br      x16
.L9:
        mov     w0, 0
        ret

Since both conditional blocks do exactly the same (they call foo()), both
functions could use the ccmp instruction.
I just observed this and have not analysed this any further.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
  2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
@ 2021-10-15 21:10 ` pinskia at gcc dot gnu.org
  2021-10-15 21:19 ` pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-10-15 21:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-10-15
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
          Component|middle-end                  |tree-optimization

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This is a gimple level missed optimization where the indirect function call is
not "commonized".  There are a few other bugs which are similar to this too.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
  2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
  2021-10-15 21:10 ` [Bug tree-optimization/102793] " pinskia at gcc dot gnu.org
@ 2021-10-15 21:19 ` pinskia at gcc dot gnu.org
  2021-10-18  6:35 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-10-15 21:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #1)
> This is a gimple level missed optimization where the indirect function call
> is not "commonized".  There are a few other bugs which are similar to this
> too.

I should mention the "commonization" happens at the RTL level which makes it
too late to convert to ccmp.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
  2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
  2021-10-15 21:10 ` [Bug tree-optimization/102793] " pinskia at gcc dot gnu.org
  2021-10-15 21:19 ` pinskia at gcc dot gnu.org
@ 2021-10-18  6:35 ` rguenth at gcc dot gnu.org
  2021-10-18  6:43 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-10-18  6:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |aarch64

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
I wonder why tail-merging doesn't do it's job here.  It does (on x86_64-linux):

find_duplicates: <bb 3> duplicate of <bb 5>
Removing basic block 5
;; basic block 5, loop depth 0
;;  pred:
_12 = foo_10(D) ();
;;  succ:       6

and .optimized:


int noccmp (uint64_t * s1, uint64_t * s2, int (*<T3aa>) (void) foo)
{
  uint64_t bar;
  uint64_t d2;
  uint64_t d1;
  long unsigned int _1;
  int _2;
  int _14;

  <bb 2> [local count: 1073741824]:
  d1_6 = *s1_4(D);
  d2_8 = *s2_7(D);
  _1 = d1_6 ^ d2_8;
  bar_9 = _1 & 43981;
  if (bar_9 == 0)
    goto <bb 3>; [34.00%]
  else
    goto <bb 4>; [66.00%]

  <bb 3> [local count: 719407024]:
  _14 = foo_10(D) (); [tail call]
  goto <bb 5>; [100.00%]

  <bb 4> [local count: 708669601]:
  if (d1_6 != d2_8)
    goto <bb 3>; [50.00%]
  else
    goto <bb 5>; [50.00%]

  <bb 5> [local count: 1073741824]:
  # _2 = PHI <_14(3), 0(4)>
  return _2;

}

but what's missing is possibly some if-combine?

IMHO ccmp expansion should be re-written to a pre RTL expansion GIMPLE
transform.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
  2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
                   ` (2 preceding siblings ...)
  2021-10-18  6:35 ` rguenth at gcc dot gnu.org
@ 2021-10-18  6:43 ` pinskia at gcc dot gnu.org
  2023-08-26  7:27 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-10-18  6:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> I wonder why tail-merging doesn't do it's job here.  It does (on
> x86_64-linux):

Oh I missed that.

> but what's missing is possibly some if-combine?

Yes because ifcombine happens way way too early. I wonder if we should not have
another one or move the currently one to be after the loop optimizations are
done.

> IMHO ccmp expansion should be re-written to a pre RTL expansion GIMPLE
> transform.

basically ccmp expension is just ifcombine really.
That is:
cc = bar cmp 0
cc = cc.eq ? ne : d1 cmp d2
cset cc.ne
So this is why doing a late pass ifcombine might be good for aarch64; I don't
know if how much compile time it would cost though.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
  2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
                   ` (3 preceding siblings ...)
  2021-10-18  6:43 ` pinskia at gcc dot gnu.org
@ 2023-08-26  7:27 ` pinskia at gcc dot gnu.org
  2024-04-05 12:35 ` manolis.tsamis at vrull dot eu
  2024-04-05 12:41 ` manolis.tsamis at vrull dot eu
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-26  7:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #4)
> basically ccmp expension is just ifcombine really.
> That is:
> cc = bar cmp 0
> cc = cc.eq ? ne : d1 cmp d2
> cset cc.ne
> So this is why doing a late pass ifcombine might be good for aarch64; I
> don't know if how much compile time it would cost though.

Or maybe isel can do this too ...

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
  2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
                   ` (4 preceding siblings ...)
  2023-08-26  7:27 ` pinskia at gcc dot gnu.org
@ 2024-04-05 12:35 ` manolis.tsamis at vrull dot eu
  2024-04-05 12:41 ` manolis.tsamis at vrull dot eu
  6 siblings, 0 replies; 8+ messages in thread
From: manolis.tsamis at vrull dot eu @ 2024-04-05 12:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793

Manolis Tsamis <manolis.tsamis at vrull dot eu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |manolis.tsamis at vrull dot eu

--- Comment #6 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
Created attachment 57886
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57886&action=edit
Patch with proposed fix for the issue.

Adding a copy of the ifcombine pass after pre solves this issues and improves
code generation is some other cases too. I specifically chose after the pre
pass because it deduplicates basic blocks and helps in maximizing the effect of
ifcombine.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
  2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
                   ` (5 preceding siblings ...)
  2024-04-05 12:35 ` manolis.tsamis at vrull dot eu
@ 2024-04-05 12:41 ` manolis.tsamis at vrull dot eu
  6 siblings, 0 replies; 8+ messages in thread
From: manolis.tsamis at vrull dot eu @ 2024-04-05 12:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793

--- Comment #7 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
Also submitted in the lists:
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648856.html

I should note that I needed to modify the test uninit-pred-6_c.c and remove
this check:

  if (l)
    if (n > 12)
      blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */

because it was inconsistent. If the branch is merged to a simple if (l && (n >
12)) is fails (on master). With the proposed patch we get more ifcombining and
better codegen but this condition fails. I believe this would need an
enhancement in tree-ssa-uninit.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-04-05 12:41 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
2021-10-15 21:10 ` [Bug tree-optimization/102793] " pinskia at gcc dot gnu.org
2021-10-15 21:19 ` pinskia at gcc dot gnu.org
2021-10-18  6:35 ` rguenth at gcc dot gnu.org
2021-10-18  6:43 ` pinskia at gcc dot gnu.org
2023-08-26  7:27 ` pinskia at gcc dot gnu.org
2024-04-05 12:35 ` manolis.tsamis at vrull dot eu
2024-04-05 12:41 ` manolis.tsamis at vrull dot eu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).