public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/68136] New: missed tree-level optimization with redundant computations
@ 2015-10-28 14:39 ktkachov at gcc dot gnu.org
  2015-10-28 14:39 ` [Bug tree-optimization/68136] " ktkachov at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2015-10-28 14:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68136

            Bug ID: 68136
           Summary: missed tree-level optimization with redundant
                    computations
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---

Take the testcase gcc.dg/ifcvt-3.c:
typedef long long s64;

int
foo (s64 a, s64 b, s64 c)
{
  s64 d = a - b;

  if (d == 0)
    return a + c;
  else
    return b + d + c;
}


On aarch64 this produces the simplest possible code:
foo:
        add     w0, w2, w0
        ret


However, this is only due to RTL-level if-conversion.
The final tree dump is more complex:
foo (s64D.2694 aD.2695, s64D.2694 bD.2696, s64D.2694 cD.2697)
{
  s64D.2694 dD.2700;
  intD.7 _1;
  unsigned int _5;
  unsigned int _7;
  unsigned int _8;
  intD.7 _9;
  unsigned int _10;
  unsigned int _11;
  unsigned int _13;
  unsigned int _14;
  intD.7 _15;
  unsigned int _17;

;;   basic block 2, loop depth 0, count 0, freq 10000, maybe hot
;;    prev block 0, next block 3, flags: (NEW, REACHABLE)
;;    pred:       ENTRY [100.0%]  (FALLTHRU,EXECUTABLE)
  d_4 = a_2(D) - b_3(D);
  if (d_4 == 0)
    goto <bb 3>;
  else
    goto <bb 4>;
;;    succ:       3 [39.0%]  (TRUE_VALUE,EXECUTABLE)
;;                4 [61.0%]  (FALSE_VALUE,EXECUTABLE)

;;   basic block 3, loop depth 0, count 0, freq 3900, maybe hot
;;    prev block 2, next block 4, flags: (NEW, REACHABLE)
;;    pred:       2 [39.0%]  (TRUE_VALUE,EXECUTABLE)
  # RANGE [0, 4294967295]
  _5 = (unsigned int) a_2(D);
  # RANGE [0, 4294967295]
  _7 = (unsigned int) c_6(D);
  # RANGE [0, 4294967295]
  _8 = _5 + _7;
  _9 = (intD.7) _8;
  goto <bb 5>;
;;    succ:       5 [100.0%]  (FALLTHRU,EXECUTABLE)

;;   basic block 4, loop depth 0, count 0, freq 6100, maybe hot
;;    prev block 3, next block 5, flags: (NEW, REACHABLE)
;;    pred:       2 [61.0%]  (FALSE_VALUE,EXECUTABLE)
  # RANGE [0, 4294967295]
  _10 = (unsigned int) b_3(D);
  # RANGE [0, 4294967295]
  _11 = (unsigned int) d_4;
  # RANGE [0, 4294967295]
  _13 = (unsigned int) c_6(D);
  # RANGE [0, 4294967295]
  _17 = _10 + _13;
  # RANGE [0, 4294967295]
  _14 = _11 + _17;
  _15 = (intD.7) _14;
;;    succ:       5 [100.0%]  (FALLTHRU,EXECUTABLE)

;;   basic block 5, loop depth 0, count 0, freq 10000, maybe hot
;;    prev block 4, next block 1, flags: (NEW, REACHABLE)
;;    pred:       3 [100.0%]  (FALLTHRU,EXECUTABLE)
;;                4 [100.0%]  (FALLTHRU,EXECUTABLE)
  # _1 = PHI <_9(3), _15(4)>
  # VUSE <.MEM_16(D)>
  return _1;
;;    succ:       EXIT [100.0%] 

}

It's probably a good idea to detect this earlier and produce a "return a + c;"
at the tree level.
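
For reference, the equivalence being asked for can be written out at the
source level. This is only a sketch: foo_simplified and check_foo_forms are
hypothetical names that do not appear in the testsuite, and the sample inputs
are chosen so that no signed overflow occurs.

```c
#include <assert.h>

typedef long long s64;

/* The testcase as written in the report.  */
int
foo (s64 a, s64 b, s64 c)
{
  s64 d = a - b;

  if (d == 0)
    return a + c;
  else
    return b + d + c;
}

/* Hypothetical: the form the tree optimizers would ideally reach.
   Since d = a - b, the else arm b + d + c is also a + c.  */
int
foo_simplified (s64 a, s64 c)
{
  return a + c;
}

/* Compare both forms on a few inputs; asserts if they ever differ.  */
void
check_foo_forms (void)
{
  assert (foo (1, 1, 5) == foo_simplified (1, 5));
  assert (foo (7, 3, -2) == foo_simplified (7, -2));
  assert (foo (-4, 9, 100) == foo_simplified (-4, 100));
}
```

Both arms return the same value, so neither the branch nor the subtraction
is needed for the result.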



* [Bug tree-optimization/68136] missed tree-level optimization with redundant computations
  2015-10-28 14:39 [Bug tree-optimization/68136] New: missed tree-level optimization with redundant computations ktkachov at gcc dot gnu.org
@ 2015-10-28 14:39 ` ktkachov at gcc dot gnu.org
  2015-10-28 15:04 ` rguenth at gcc dot gnu.org
  2021-07-26 23:08 ` pinskia at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2015-10-28 14:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68136

--- Comment #1 from ktkachov at gcc dot gnu.org ---
This came out of PR67462.



* [Bug tree-optimization/68136] missed tree-level optimization with redundant computations
  2015-10-28 14:39 [Bug tree-optimization/68136] New: missed tree-level optimization with redundant computations ktkachov at gcc dot gnu.org
  2015-10-28 14:39 ` [Bug tree-optimization/68136] " ktkachov at gcc dot gnu.org
@ 2015-10-28 15:04 ` rguenth at gcc dot gnu.org
  2021-07-26 23:08 ` pinskia at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-10-28 15:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68136

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-10-28
                 CC|                            |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
int
foo (s64 a, s64 b, s64 c)
{
  s64 d = a - b;

  if (d == 0)
    return a + c;
  else
    return b + d + c;
}

OK, so the issue here is that the additions are narrowed to unsigned int
because of the apparent cast to 'int' for the return, while the
d = a - b assignment is not narrowed in this way.

 (u32)b + (u32)(a - b) + (u32)c

cannot easily be simplified to the desired (u32)a + (u32)c.
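
The two expressions are equal as values, because truncation to 32 bits
distributes over addition and subtraction modulo 2^32; the difficulty is
getting the optimizers to see through the mixed widths. A small sketch of
the identity, assuming u32 means uint32_t and that a - b does not overflow
in 64 bits (the function names are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

typedef long long s64;
typedef uint32_t u32;

/* Shape of the else arm after narrowing: the additions are done in
   u32 while the subtraction is still done in s64.  */
u32
narrowed_else (s64 a, s64 b, s64 c)
{
  return (u32) b + (u32) (a - b) + (u32) c;
}

/* The desired simplified form.  */
u32
narrowed_then (s64 a, s64 c)
{
  return (u32) a + (u32) c;
}

/* Compare both forms, including inputs that do not fit in 32 bits.  */
void
check_narrowing_identity (void)
{
  assert (narrowed_else (5, 3, 10) == narrowed_then (5, 10));
  assert (narrowed_else (0x100000005LL, -7, 0xffffffffLL)
	  == narrowed_then (0x100000005LL, 0xffffffffLL));
}
```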

As 'd' is multi-use, we don't change the compare to a == b either,
which would expose the then single-use 'd' to reassoc (which could get
some tricks to perform narrowing).
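
At the source level, the shape described above would look roughly like the
following. This is an illustration only, not something GCC produces today;
foo_compare_rewritten and its check helper are hypothetical names.

```c
#include <assert.h>

typedef long long s64;

/* Hypothetical rewrite: the compare uses a == b directly, so 'd' has
   a single use, in the else arm only.  */
int
foo_compare_rewritten (s64 a, s64 b, s64 c)
{
  if (a == b)
    return a + c;
  else
    {
      s64 d = a - b;	/* Single use: feeds only the sum below.  */
      return b + d + c;
    }
}

/* Both arms should still evaluate to a + c.  */
void
check_compare_rewritten (void)
{
  assert (foo_compare_rewritten (4, 4, 1) == 5);
  assert (foo_compare_rewritten (9, 2, 3) == 12);
}
```

With 'd' confined to one arm, the whole chain b + (a - b) + c sits in a
single place where a reassociation pass could in principle cancel b.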

I don't see a very good answer here.  Pattern-matching the whole thing
is of course an option but I don't see that as a scalable solution.

Maybe reassoc could be hacked up to do that narrowing trick and also to
consider non-single-use chains (marked specially so they only participate in
simplification but never in actual association or code-gen).



* [Bug tree-optimization/68136] missed tree-level optimization with redundant computations
  2015-10-28 14:39 [Bug tree-optimization/68136] New: missed tree-level optimization with redundant computations ktkachov at gcc dot gnu.org
  2015-10-28 14:39 ` [Bug tree-optimization/68136] " ktkachov at gcc dot gnu.org
  2015-10-28 15:04 ` rguenth at gcc dot gnu.org
@ 2021-07-26 23:08 ` pinskia at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-26 23:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68136

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2015-10-28 00:00:00         |2021-7-26

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Improved for GCC 11 by r11-3207. The testcase was even changed to show that it
is not fully fixed:

typedef long long s64;
int
foo (s64 a, s64 b, s64 c)
{
  s64 d = a - b;

  if (d == 0)
    return a + c;
  else
    return b + c + d;
}

