* [Bug tree-optimization/108041] New: ivopts results in extra instruction in simple loop
@ 2022-12-09 22:45 law at gcc dot gnu.org
  2022-12-11 12:26 ` [Bug tree-optimization/108041] " rguenth at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: law at gcc dot gnu.org @ 2022-12-09 22:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041

            Bug ID: 108041
           Summary: ivopts results in extra instruction in simple loop
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: law at gcc dot gnu.org
                CC: rzinsly at ventanamicro dot com
  Target Milestone: ---

ivopts seems to make a bit of a mess of this code, leaving the loop with an
unnecessary instruction.  Compile for rv64 with -O2:

typedef struct network
{
  long nr_group, full_groups, max_elems;
} network_t;
void marc_arcs(network_t* net)
{
  while (net->full_groups < 0) {
    net->full_groups = net->nr_group + net->full_groups;
    net->max_elems--;
  }
}





After slp1 we have this loop:
;;   basic block 3, loop depth 0
;;    pred:       2
  _1 = net_8(D)->nr_group;
  net__max_elems_lsm.4_16 = net_8(D)->max_elems;
;;    succ:       4

;;   basic block 4, loop depth 1
;;    pred:       7
;;                3
  # _13 = PHI <_2(7), _11(3)>
  # net__max_elems_lsm.4_5 = PHI <_4(7), net__max_elems_lsm.4_16(3)>
  _2 = _1 + _13;
  _4 = net__max_elems_lsm.4_5 + -1;
  if (_2 < 0)
    goto <bb 7>; [89.00%]
  else
    goto <bb 5>; [11.00%]
;;    succ:       7
;;                5

;;   basic block 7, loop depth 1
;;    pred:       4
  goto <bb 4>; [100.00%]
;;    succ:       4

;;   basic block 5, loop depth 0
;;    pred:       4
  # _12 = PHI <_2(4)>
  # _17 = PHI <_4(4)>
  net_8(D)->full_groups = _12;
  net_8(D)->max_elems = _17;
;;    succ:       6


Of particular interest is the max_elems computation into _4.  We accumulate it
in the loop, then do the final store after the loop (thank you LSM!).  After
ivopts we have:


;;   basic block 3, loop depth 0
;;    pred:       2
  _1 = net_8(D)->nr_group;
  net__max_elems_lsm.4_16 = net_8(D)->max_elems;
  _22 = net__max_elems_lsm.4_16 + -1;
  ivtmp.10_21 = (unsigned long) _22;
;;    succ:       4

;;   basic block 4, loop depth 1
;;    pred:       7
;;                3
  # _13 = PHI <_2(7), _11(3)>
  # ivtmp.10_3 = PHI <ivtmp.10_18(7), ivtmp.10_21(3)>
  _2 = _1 + _13;
  _4 = (long int) ivtmp.10_3;
  ivtmp.10_18 = ivtmp.10_3 - 1;
  if (_2 < 0)
    goto <bb 7>; [89.00%]
  else
    goto <bb 5>; [11.00%]
;;    succ:       7
;;                5

;;   basic block 7, loop depth 1
;;    pred:       4 
  goto <bb 4>; [100.00%]
;;    succ:       4

;;   basic block 5, loop depth 0
;;    pred:       4
  # _12 = PHI <_2(4)>
  # _17 = PHI <_4(4)>
  net_8(D)->full_groups = _12;
  net_8(D)->max_elems = _17;
;;    succ:       6

Note the introduction of the IV and its relationship to _4.  Essentially we
compute both in the loop even though _4 is always one greater than the updated
IV.  Worse yet, the IV is only used to compute _4!  Because the two differ by
1, we end up computing both and keeping both alive, resulting in this final
code for rv64:




.L3:
        add     a5,a5,a2
        mv      a3,a4
        addi    a4,a4,-1
        blt     a5,zero,.L3
        sd      a5,8(a0)
        sd      a3,16(a0)


Note how we had to "stash away" the value of a4 before the decrement so that we
could store it after the loop.  The induction variable doesn't really buy us
anything in this loop -- it's actively harmful.  Not using the IV would
probably be best.  Second best would be to realize that _4 (aka a3) can be
derived from the IV (a4) after the loop by adding 1.
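
To make the "second best" option concrete, here is a rough C sketch of the two
shapes.  It is a hand-written approximation of the GIMPLE above, not compiler
output.  The names marc_arcs_stash, marc_arcs_derived and check_equivalent are
made up for illustration, and the sketch assumes nr_group is positive (so the
loop terminates) and that nothing overflows.

```c
#include <assert.h>

typedef struct network
{
  long nr_group, full_groups, max_elems;
} network_t;

/* The loop exactly as written in the report.  */
void marc_arcs(network_t *net)
{
  while (net->full_groups < 0) {
    net->full_groups = net->nr_group + net->full_groups;
    net->max_elems--;
  }
}

/* Hypothetical mirror of the post-ivopts shape: the pre-decrement IV
   value is copied into `last` on every iteration (the "mv a3,a4").  */
void marc_arcs_stash(network_t *net)
{
  long sum = net->full_groups;
  if (sum >= 0)
    return;
  long step = net->nr_group;                                /* _1 */
  unsigned long iv = (unsigned long) (net->max_elems - 1);  /* ivtmp.10_21 */
  long last;
  do {
    sum = step + sum;                                       /* _2 */
    last = (long) iv;                                       /* _4 */
    iv = iv - 1;                                            /* ivtmp.10_18 */
  } while (sum < 0);
  net->full_groups = sum;
  net->max_elems = last;
}

/* Hypothetical "second best" shape: no copy inside the loop; the value
   to store is recovered after the loop as IV + 1.  */
void marc_arcs_derived(network_t *net)
{
  long sum = net->full_groups;
  if (sum >= 0)
    return;
  long step = net->nr_group;
  unsigned long iv = (unsigned long) (net->max_elems - 1);
  do {
    sum = step + sum;
    iv = iv - 1;
  } while (sum < 0);
  net->full_groups = sum;
  net->max_elems = (long) (iv + 1);
}

/* Assumed helper: run all three variants on the same input and check
   that they leave the struct in the same state.  */
void check_equivalent(long nr_group, long full_groups, long max_elems)
{
  network_t a = { nr_group, full_groups, max_elems };
  network_t b = a, c = a;
  marc_arcs(&a);
  marc_arcs_stash(&b);
  marc_arcs_derived(&c);
  assert(a.full_groups == b.full_groups && a.max_elems == b.max_elems);
  assert(a.full_groups == c.full_groups && a.max_elems == c.max_elems);
}
```

For example, check_equivalent(3, -10, 100) runs four iterations in each
variant.  In the derived form the loop body only has the add and the
decrement; the +1 is paid once on the exit path.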


* [Bug tree-optimization/108041] ivopts results in extra instruction in simple loop
From: rguenth at gcc dot gnu.org @ 2022-12-11 12:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think IVOPTs does not consider the use on the exit edge and its effect on
liveness, but it does consider both IVs.


* [Bug tree-optimization/108041] ivopts results in extra instruction in simple loop
From: rguenth at gcc dot gnu.org @ 2022-12-11 12:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-12-11

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed also on x86_64 btw.


* [Bug tree-optimization/108041] ivopts results in extra instruction in simple loop
From: law at gcc dot gnu.org @ 2023-05-29 16:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041

--- Comment #3 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Created attachment 55185
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55185&action=edit
(Incomplete) Patch


* [Bug tree-optimization/108041] ivopts results in extra instruction in simple loop
From: law at gcc dot gnu.org @ 2023-05-30 20:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041

--- Comment #4 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Patch was for a different problem.  Sorry.
