public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/66489] New: combine fails to merge insns if some are reused later on
@ 2015-06-10 14:30 ktkachov at gcc dot gnu.org
  2015-06-10 14:37 ` [Bug rtl-optimization/66489] " pinskia at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2015-06-10 14:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66489

            Bug ID: 66489
           Summary: combine fails to merge insns if some are reused later
                    on
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---

Consider code:
int f2(int a, int b, int c, int d)
{
  a = -a;
  b += a * c;
  c += a * d;
  return b ^ c;
}

On aarch64 this could be written with to msub instructions, RTL code:

(set (reg/i:SI 0 x0)
     (minus:SI
       (reg:SI 1 x1 [ b ])
       (mult:SI (reg:SI 0 x0 [ a ])
                (reg:SI 2 x2 [ c ])))) 

However, combine doesn't merge the neg and the multiply-adds and generates:
f2:
        neg     w4, w0
        madd    w0, w4, w2, w1
        madd    w3, w4, w3, w2
        eor     w0, w0, w3
        ret


If we modify the code to only do a single multiply-accumulate:
int f2(int a, int b, int c, int d)
{
  a = -a;
  b += a * c;
  return b;
}

Then they the expected single msub instruction is generated.
I think this is due to combine being scared of the negated 'a' being used in
two places.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug rtl-optimization/66489] combine fails to merge insns if some are reused later on
  2015-06-10 14:30 [Bug rtl-optimization/66489] New: combine fails to merge insns if some are reused later on ktkachov at gcc dot gnu.org
@ 2015-06-10 14:37 ` pinskia at gcc dot gnu.org
  2015-06-10 14:41 ` ktkachov at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2015-06-10 14:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66489

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This was the reason why the pass on the tree level to fused multiply add was
added.  Maybe we can do the same.

Do you know how often this shows up?


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug rtl-optimization/66489] combine fails to merge insns if some are reused later on
  2015-06-10 14:30 [Bug rtl-optimization/66489] New: combine fails to merge insns if some are reused later on ktkachov at gcc dot gnu.org
  2015-06-10 14:37 ` [Bug rtl-optimization/66489] " pinskia at gcc dot gnu.org
@ 2015-06-10 14:41 ` ktkachov at gcc dot gnu.org
  2015-06-10 14:54 ` pinskia at gcc dot gnu.org
  2023-06-02  4:53 ` pinskia at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2015-06-10 14:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66489

--- Comment #2 from ktkachov at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #1)
> This was the reason why the pass on the tree level to fused multiply add was
> added.  Maybe we can do the same.
> 
> Do you know how often this shows up?

My understanding is that it occurs in SPEC2006 and leslie3d in particular, but
with FP code there (just substitute int for double in the testcase and pick
something other than eor).

Don't have any numbers on how often or how much of a bottleneck it is to
performance...


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug rtl-optimization/66489] combine fails to merge insns if some are reused later on
  2015-06-10 14:30 [Bug rtl-optimization/66489] New: combine fails to merge insns if some are reused later on ktkachov at gcc dot gnu.org
  2015-06-10 14:37 ` [Bug rtl-optimization/66489] " pinskia at gcc dot gnu.org
  2015-06-10 14:41 ` ktkachov at gcc dot gnu.org
@ 2015-06-10 14:54 ` pinskia at gcc dot gnu.org
  2023-06-02  4:53 ` pinskia at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2015-06-10 14:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66489

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
For floating point, tree-ssa-math-opts.c  handles the FMA and then expand is
supposed to handle the neg inside the FMA_EXPR but for some reason on this
testcase it is not. 
        def0 = get_def_for_expr (treeop0, NEGATE_EXPR);
        /* The multiplication is commutative - look at its 2nd operand
           if the first isn't fed by a negate.  */
        if (!def0)
          {
            def0 = get_def_for_expr (treeop1, NEGATE_EXPR);
            /* Swap operands if the 2nd operand is fed by a negate.  */
            if (def0)
              {
                tree tem = treeop0;
                treeop0 = treeop1;
                treeop1 = tem;
              }
          }
        def2 = get_def_for_expr (treeop2, NEGATE_EXPR);

        op0 = op2 = NULL;

        if (def0 && def2
            && optab_handler (fnms_optab, mode) != CODE_FOR_nothing)
          {
            opt = fnms_optab;
            op0 = expand_normal (gimple_assign_rhs1 (def0));
            op2 = expand_normal (gimple_assign_rhs1 (def2));
          }
        else if (def0
                 && optab_handler (fnma_optab, mode) != CODE_FOR_nothing)
          {
            opt = fnma_optab;
            op0 = expand_normal (gimple_assign_rhs1 (def0));
          }
        else if (def2
                 && optab_handler (fms_optab, mode) != CODE_FOR_nothing)
          {
            opt = fms_optab;
            op2 = expand_normal (gimple_assign_rhs1 (def2));
          }

So you are going to need to debug why expr.c is not handling this case for
floating point.

Again for integer, it is a different story.  And I doubt for integer it shows
up that often and if it did, the neg could most likely schedule with some other
instruction.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug rtl-optimization/66489] combine fails to merge insns if some are reused later on
  2015-06-10 14:30 [Bug rtl-optimization/66489] New: combine fails to merge insns if some are reused later on ktkachov at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2015-06-10 14:54 ` pinskia at gcc dot gnu.org
@ 2023-06-02  4:53 ` pinskia at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-06-02  4:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66489

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=85160
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED
   Target Milestone|---                         |9.0

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed in GCC 9 with 2->2 combing being done now (r9-2064-gc4c5ad1d6d1e1e1fe7a):

Trying 11 -> 12:
   11: r96:SI=-r115:SI
      REG_DEAD r115:SI
   12: r108:SI=r96:SI*r103:SI
Failed to match this instruction:
(parallel [
        (set (reg:SI 108)
            (mult:SI (neg:SI (reg:SI 115))
                (reg/v:SI 103 [ cD.4384 ])))
        (set (reg/v:SI 96 [ aD.4382 ])
            (neg:SI (reg:SI 115)))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:SI 108)
            (mult:SI (neg:SI (reg:SI 115))
                (reg/v:SI 103 [ cD.4384 ])))
        (set (reg/v:SI 96 [ aD.4382 ])
            (neg:SI (reg:SI 115)))
    ])
Successfully matched this instruction:
(set (reg/v:SI 96 [ aD.4382 ])
    (neg:SI (reg:SI 115)))
Successfully matched this instruction:
(set (reg:SI 108)
    (mult:SI (neg:SI (reg:SI 115))
        (reg/v:SI 103 [ cD.4384 ])))
allowing combination of insns 11 and 12
original costs 4 + 12 = 16
replacement costs 4 + 12 = 16
modifying insn i2    11: r96:SI=-r115:SI
deferring rescan insn with uid = 11.
modifying insn i3    12: r108:SI=-r115:SI*r103:SI
      REG_DEAD r115:SI
deferring rescan insn with uid = 12.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-06-02  4:53 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-10 14:30 [Bug rtl-optimization/66489] New: combine fails to merge insns if some are reused later on ktkachov at gcc dot gnu.org
2015-06-10 14:37 ` [Bug rtl-optimization/66489] " pinskia at gcc dot gnu.org
2015-06-10 14:41 ` ktkachov at gcc dot gnu.org
2015-06-10 14:54 ` pinskia at gcc dot gnu.org
2023-06-02  4:53 ` pinskia at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).