public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/67635] New: [SH] ifcvt missed optimization
@ 2015-09-19  4:09 olegendo at gcc dot gnu.org
  2015-09-30 12:49 ` [Bug rtl-optimization/67635] " olegendo at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-09-19  4:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67635

            Bug ID: 67635
           Summary: [SH] ifcvt missed optimization
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: olegendo at gcc dot gnu.org
  Target Milestone: ---
            Target: sh*-*-*

At least on SH, the following:

bool test (int a, int b, int* r)
{
  return __builtin_mul_overflow (a, b, r);
}

compiled with -m4 -ml -O2 results in:

        dmuls.l r5,r4
        mov     #0,r0
        sts     macl,r1
        sts     mach,r2
        cmp/gt  r1,r0
        subc    r3,r3
        cmp/eq  r2,r3    <<
        bf      .L6      <<
.L2:
        rts
        mov.l   r1,@r6
        .align 1
.L6:
        bra     .L2
        mov     #1,r0


The expected code would be:
        dmuls.l r5,r4
        mov     #0,r0
        sts     macl,r1
        sts     mach,r2
        cmp/gt  r1,r0
        subc    r3,r3
        cmp/eq  r2,r3    // T = r2 == r3
        mov     #-1,r0
        negc    r0,r0    // T = r2 != r3
        rts
        mov.l   r1,@r6


Alternatively, in cases zero-displacement branches are fast (e.g. SH4):
        dmuls.l r5,r4
        mov     #0,r0
        sts     macl,r1
        sts     mach,r2
        cmp/gt  r1,r0
        subc    r3,r3
        cmp/eq  r2,r3
        bt      0f
        mov     #1,r0
0:
        rts
        mov.l   r1,@r6


I think I have seen similar cases before, where something tries to preserve the
zero constant in a reg.  Instead of overwriting it with a cstore value {0,1},
conditional branches are created for the non-zero paths.

If conditional move patterns are enabled on SH with -mpretend-cmove the
conditional branches go away:

        dmuls.l r5,r4
        mov     #-1,r0
        sts     macl,r2
        sts     mach,r3
        sts     macl,r1
        shll    r2
        subc    r2,r2
        cmp/eq  r3,r2
        mov.l   r1,@r6
        rts
        negc    r0,r0

But enabling -mpretend-cmove on SH has some other side effects and currently is
not safe (PR 58517).


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug rtl-optimization/67635] [SH] ifcvt missed optimization
  2015-09-19  4:09 [Bug rtl-optimization/67635] New: [SH] ifcvt missed optimization olegendo at gcc dot gnu.org
@ 2015-09-30 12:49 ` olegendo at gcc dot gnu.org
  2015-10-19 16:06 ` ktkachov at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-09-30 12:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67635

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kyrylo.tkachov at arm dot com

--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Here are some more missed ifcvt cases.  I haven't done any analysis what's
happening, just dumping them here for later.  When compiling those cases, they
result in conditional branches, although they could be done without.

I ran into these when adding the addsicc pattern to SH (PR 54236 attachment
36012).  However, I haven't committed the actual addsicc part, because it
doesn't seem to do anything on SH.

Kyrill, maybe some of those cases could be interesting for you, since you've
been doing some ifcvt work recently.

// this is a GCC 6 regression.  it works on GCC 5
// if "y != x" is changed to "y == x" it also works.
int test_01_3 (int x, int y)
{
  if (y != x)
    ++x;
  return x;

/*
GCC 5:
        cmp/eq  r4,r5
        mov     #-1,r1
        mov     r4,r0
        rts     
        subc    r1,r0
*/
}

// another GCC 6 regression.  works on GCC 5
int test_01_5 (int x, float y, float z)
{
  if (y >= z)
    ++x;

  return x;
}
/*
good:
        fpchg
        fcmp/gt fr4,fr5
        mov     #0,r0
        bt      .L6
        fcmp/eq fr4,fr5
.L6:
        addc    r4,r0        << expected addc
        fpchg
        rts     
        nop
*/


int test_00 (int x, int y)
{
  return x == 0 ? 1 : x;
}

int test_00_1 (int x, int y)
{
  return x ? x += 1 : x;
}

unsigned int test_02 (unsigned int x)
{
  return x == 0 ? x + 1 : x;  // will not use addcc in ifcvt
}

int test_02 (int x)
{
  return x == 0 ? x + 1 : x; // will not use addc
}


// doesn't work with unsigned int, but ...
unsigned int test_03 (unsigned int x)
{
  return x > 0 ? x + 1 : x;
}

// ... works with signed int (in pr54236-5.c)
int test_05_3 (int x)
{
  return x > 0 ? x + 1 : x;
}


int test_05_1 (int x)
{
  return x != 0 ? x + 1 : x;
}

int test_05_2 (int x)
{
  return x == 0 ? x + 1 : x;
}

unsigned int check_0 (unsigned int x)
{
  return x == 0 ? 1 : x;
}

unsigned int test_130 (unsigned int x)
{
  return x != 0 ? (x - 1) : x;
}

unsigned int test_101 (unsigned int x, unsigned int y)
{
  return x == y ? 1 : x;
}

unsigned int test_102 (unsigned int x)
{
  return x == 3 ? 1 : x;
}


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug rtl-optimization/67635] [SH] ifcvt missed optimization
  2015-09-19  4:09 [Bug rtl-optimization/67635] New: [SH] ifcvt missed optimization olegendo at gcc dot gnu.org
  2015-09-30 12:49 ` [Bug rtl-optimization/67635] " olegendo at gcc dot gnu.org
@ 2015-10-19 16:06 ` ktkachov at gcc dot gnu.org
  2015-10-21 13:51 ` olegendo at gcc dot gnu.org
  2023-08-08  7:02 ` pinskia at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2015-10-19 16:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67635

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ktkachov at gcc dot gnu.org

--- Comment #2 from ktkachov at gcc dot gnu.org ---
I had a look at:
unsigned int test_03 (unsigned int x)
{
  return x > 0 ? x + 1 : x;
}

The cse1 pass transforms this into something not quite comfortable for ifcvt.
Basically, it transforms the RTL equivalent of:

compare (x, 0);
if (x > 0)
  temp := x + 1;
else
  temp := x;
return temp;

into:
compare (x, 0);
if (x > 0)
  temp := x + 1;
else
  temp := 0;
return temp;

which is a valid transformation to make as far as cse is concerned.
Unfortunately, when ifcvt comes along, it's asked to optimise a conditional
select between '0' and 'x + 1', which is not something it can do with an
addc-style pattern.

I initially thought of looking into the condition to see if it contains x (so
that we can deduce that the 'else' arm could actually be replaced with x,
undoing the cse in that case), but the condition rtl contains a comparison with
the T-register, rather than the original comparison with 0.

Perhaps we can consider teaching cse to not transform these kinds of
expressions (c ? x : x + a) if the target has a store_flag/addcc instruction of
the appropriate mode?

But I'm not familiar with the cse code, so I don't know how easy/clean that is


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug rtl-optimization/67635] [SH] ifcvt missed optimization
  2015-09-19  4:09 [Bug rtl-optimization/67635] New: [SH] ifcvt missed optimization olegendo at gcc dot gnu.org
  2015-09-30 12:49 ` [Bug rtl-optimization/67635] " olegendo at gcc dot gnu.org
  2015-10-19 16:06 ` ktkachov at gcc dot gnu.org
@ 2015-10-21 13:51 ` olegendo at gcc dot gnu.org
  2023-08-08  7:02 ` pinskia at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-10-21 13:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67635

--- Comment #3 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to ktkachov from comment #2)
> 
> Perhaps we can consider teaching cse to not transform these kinds of
> expressions (c ? x : x + a) if the target has a store_flag/addcc instruction
> of the appropriate mode?

Sounds a bit shaky.  These sequences could also be introduced by other things
than the CSE pass.  I guess a more "stable" solution would be to make ifcvt
realize that in one path x must be zero and effectively un-CSE it, then proceed
as usual.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug rtl-optimization/67635] [SH] ifcvt missed optimization
  2015-09-19  4:09 [Bug rtl-optimization/67635] New: [SH] ifcvt missed optimization olegendo at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2015-10-21 13:51 ` olegendo at gcc dot gnu.org
@ 2023-08-08  7:02 ` pinskia at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-08  7:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67635

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-08-08
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-08-08  7:02 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-19  4:09 [Bug rtl-optimization/67635] New: [SH] ifcvt missed optimization olegendo at gcc dot gnu.org
2015-09-30 12:49 ` [Bug rtl-optimization/67635] " olegendo at gcc dot gnu.org
2015-10-19 16:06 ` ktkachov at gcc dot gnu.org
2015-10-21 13:51 ` olegendo at gcc dot gnu.org
2023-08-08  7:02 ` pinskia at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).