public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/112746] New: Missed optimization for redundancy computation elimination (fre1(tree) for global variable)
@ 2023-11-28 16:10 652023330028 at smail dot nju.edu.cn
  2023-11-28 16:28 ` [Bug tree-optimization/112746] Missed optimization for exact division with multi-use addition chain rguenth at gcc dot gnu.org
  0 siblings, 1 reply; 2+ messages in thread
From: 652023330028 at smail dot nju.edu.cn @ 2023-11-28 16:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112746

            Bug ID: 112746
           Summary: Missed optimization for redundancy computation
                    elimination (fre1(tree) for global variable)
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: 652023330028 at smail dot nju.edu.cn
  Target Milestone: ---

Hello, we noticed that maybe there is a missed optimization for redundancy
computation elimination.

When b is a local variable, the computation of m optimizes to 3 as expected.
When b appears as a global variable, it misses this optimization.

https://godbolt.org/z/j3cKsrKb6

int n,m;
int b;
void test(int b){
    b=n+n;
    m=(n+b)/(n);
}
void test2(){
    b=n+n;
    m=(n+b)/(n);
}

But GCC -O3 -fwrapv or GCC -O3:
test(int):
        mov     DWORD PTR m[rip], 3
        ret
test2():
        mov     ecx, DWORD PTR n[rip]
        lea     eax, [rcx+rcx]
        mov     DWORD PTR b[rip], eax
        add     eax, ecx
        cdq
        idiv    ecx
        mov     DWORD PTR m[rip], eax
        ret

Expected code (Clang):
test2():                              # @test2()
        mov     eax, dword ptr [rip + n]
        add     eax, eax
        mov     dword ptr [rip + b], eax
        mov     dword ptr [rip + m], 3
        ret

We see a difference in fre1(tree) for these two functions, which might help
with this issue:
void test (int b)
{
  int n.0_1;
  int _3;
  int _5;

  <bb 2> :
  # DEBUG BEGIN_STMT
  n.0_1 = n;
  b_7 = n.0_1 * 2;
  # DEBUG b => b_7
  # DEBUG BEGIN_STMT
  _3 = n.0_1 * 3;
  _5 = 3;
  m = 3;
  return;

}

void test2 ()
{
  int n.3_1;
  int _2;
  int _5;
  int _7;

  <bb 2> :
  # DEBUG BEGIN_STMT
  n.3_1 = n;
  _2 = n.3_1 * 2;
  b = _2;
  # DEBUG BEGIN_STMT
  _5 = n.3_1 + _2;
  _7 = _5 / n.3_1;
  m = _7;
  return;

}


Thank you very much for your time and effort! We look forward to hearing from
you.

^ permalink raw reply	[flat|nested] 2+ messages in thread

* [Bug tree-optimization/112746] Missed optimization for exact division with multi-use addition chain
  2023-11-28 16:10 [Bug tree-optimization/112746] New: Missed optimization for redundancy computation elimination (fre1(tree) for global variable) 652023330028 at smail dot nju.edu.cn
@ 2023-11-28 16:28 ` rguenth at gcc dot gnu.org
  0 siblings, 0 replies; 2+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-11-28 16:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112746

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
            Summary|Missed optimization for     |Missed optimization for
                   |redundancy computation      |exact division with
                   |elimination (fre1(tree) for |multi-use addition chain
                   |global variable)            |
   Last reconfirmed|                            |2023-11-28

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue with test2 is that with value-numbering we have

(n.3_1 * 2 + n.3_1) / n.3_1

and that does not simplify.  That is, we do not simplify 2 * n + n to
3 * n as in general that wouldn't be profitable if 2 * n is live after
the use is elided (and it is live since it's stored to 'b').

Which means we ask for (n.3_1 * 2 + n.3_1) / n.3_1 which we currently
cannot simplify to a constant.  Handling cases like this in match.pd
feels wrong.

Note that with -fwrapv the optimization wouldn't be valid.

Other passes face the same issue, 'b' keeps n*2 live so an add
looks cheaper than to compute n*3 for the division.  We're not
anticipating the division here.

But:

  _3 = n.3_1 + _2;
  _4 = _3 / n.3_1;

could be simplified to

  _5 = _2 / n.3_1;
  _4 = _5 + 1;

if we know _2 is a multiple of n.3_1 which then could be simplified as well.

Note that all passes doing analyses on addition chains also stop at the
multi-use, so we'd need to improve at that point somehow.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2023-11-28 16:28 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-28 16:10 [Bug tree-optimization/112746] New: Missed optimization for redundancy computation elimination (fre1(tree) for global variable) 652023330028 at smail dot nju.edu.cn
2023-11-28 16:28 ` [Bug tree-optimization/112746] Missed optimization for exact division with multi-use addition chain rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).