public inbox for gcc-bugs@sourceware.org
* [Bug target/99083] New: Big run-time regressions of 519.lbm_r with LTO
@ 2021-02-12 23:32 jamborm at gcc dot gnu.org
  2021-02-13  8:53 ` [Bug target/99083] " ubizjak at gmail dot com
                   ` (18 more replies)
  0 siblings, 19 replies; 20+ messages in thread
From: jamborm at gcc dot gnu.org @ 2021-02-12 23:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

            Bug ID: 99083
           Summary: Big run-time regressions of 519.lbm_r with LTO
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
                CC: ubizjak at gmail dot com
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-linux
            Target: x86_64-linux

On AMD Zen2 CPUs, 519.lbm_r is 62.12% slower when built with -O2 and
-flto than when not using LTO.  It is also 62.12% slower than when
using GCC 10 with the two options.  My measurements match those from
LNT on a different zen2:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=325.477.0&plot.1=312.477.0&plot.2=349.477.0&plot.3=278.477.0&plot.4=401.477.0&plot.5=298.477.0

On the same CPU, compiling the benchmark with -Ofast -march=native
-flto is slower than non-LTO, by 8.07% on Zen2 and 6.06% on Zen3.  The
Zen2 case has also been caught by LNT:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=295.477.0&plot.1=293.477.0&plot.2=287.477.0&plot.3=286.477.0&

I have bisected both of these regressions (on Zen2s) to:

  commit 4c61e35f20fe2ffeb9421dbd6f26c767a234a4a0
  Author: Uros Bizjak <ubizjak@gmail.com>
  Date:   Wed Dec 9 21:06:07 2020 +0100

      i386: Remove REG_ALLOC_ORDER definition

      REG_ALLOC_ORDER just defines what the default is set to.

      2020-12-09  Uroš Bizjak  <ubizjak@gmail.com>

      gcc/    
              * config/i386/i386.h (REG_ALLOC_ORDER): Remove

...which looks like it was supposed to be a no-op, but I looked at the
-O2 LTO case and the assembly generated by this commit definitely
differs from the assembly produced by the previous one in instruction
selection, spilling and even some scheduling.  For example, I see
hunks like:

@@ -994,10 +996,10 @@
        movapd  %xmm13, %xmm9
        movsd   96(%rsp), %xmm13
        subsd   %xmm12, %xmm9
-       movsd   256(%rsp), %xmm12
+       movq    %rbx, %xmm12
+       mulsd   %xmm6, %xmm12
        movsd   %xmm5, 15904(%rdx)
        movsd   72(%rax), %xmm5
-       mulsd   %xmm6, %xmm12
        mulsd   %xmm0, %xmm9
        subsd   %xmm10, %xmm5
        movsd   216(%rsp), %xmm10

The -Ofast native LTO assemblies also differ.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: ubizjak at gmail dot com @ 2021-02-13  8:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
This should be a no-op. According to the documentation:

--q--
Macro: REG_ALLOC_ORDER

    If defined, an initializer for a vector of integers, containing the numbers
of hard registers in the order in which GCC should prefer to use them (from
most preferred to least).

    If this macro is not defined, registers are used lowest numbered first (all
else being equal).

    One use of this macro is on machines where the highest numbered registers
must always be saved and the save-multiple-registers instruction supports only
sequences of consecutive registers. On such machines, define REG_ALLOC_ORDER to
be an initializer that lists the highest numbered allocable register first. 
--/q--

and the patch removed:

-#define REG_ALLOC_ORDER                                                \
-{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,                \
-  16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,      \
-  32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,      \
-  48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,      \
-  64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75 }

It is trivial to revert the offending commit, but I think that this PR warrants
some more analysis of the underlying problem, presumably in the generic code.
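
For illustration only, a minimal sketch of why the removed initializer looks redundant on paper: it lists the registers in numeric order, and an identity order inverts to itself. The names TOY_NREGS, toy_invert_order, toy_is_identity and toy_check_identity are hypothetical; the code models the idea of an allocation order and its inverse, and is not taken from GCC.

```c
#include <assert.h>

/* Toy model, not GCC source.  All toy_* names are hypothetical.  */
#define TOY_NREGS 8

/* inv[r] becomes the position of hard register r in the preference
   list ORDER, so a smaller value means "more preferred".  */
static void
toy_invert_order (const int order[TOY_NREGS], int inv[TOY_NREGS])
{
  for (int i = 0; i < TOY_NREGS; i++)
    inv[order[i]] = i;
}

/* Return 1 if ORDER is the identity permutation, i.e. it only spells
   out the documented "lowest numbered first" default.  */
static int
toy_is_identity (const int order[TOY_NREGS])
{
  for (int i = 0; i < TOY_NREGS; i++)
    if (order[i] != i)
      return 0;
  return 1;
}

/* The inverse of an identity order is again the identity.  */
static void
toy_check_identity (void)
{
  const int order[TOY_NREGS] = { 0, 1, 2, 3, 4, 5, 6, 7 };
  int inv[TOY_NREGS];

  toy_invert_order (order, inv);
  assert (toy_is_identity (inv));
}
```

The sketch says nothing about code that merely tests whether the macro is defined; that is a separate question from the values it expands to.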

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: marxin at gcc dot gnu.org @ 2021-02-15  8:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
                 CC|                            |marxin at gcc dot gnu.org
   Last reconfirmed|                            |2021-02-15
     Ever confirmed|0                           |1

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: rguenth at gcc dot gnu.org @ 2021-02-15  8:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The following ira-color.c piece has heuristics that get triggered differently:

/* Return TRUE if spilling pseudo-registers whose numbers are in array
   REGNOS is better than spilling pseudo-registers with numbers in
   OTHER_REGNOS for reload with given IN and OUT for INSN.  The
   function used by the reload pass to make better register spilling
   decisions.  */
bool
ira_better_spill_reload_regno_p (int *regnos, int *other_regnos,
                                 rtx in, rtx out, rtx_insn *insn)
{
..
#ifdef REG_ALLOC_ORDER
  if (hard_regno >= 0 && other_hard_regno >= 0)
    return (inv_reg_alloc_order[hard_regno]
            < inv_reg_alloc_order[other_hard_regno]);
#else
  if (call_used_count != other_call_used_count)
    return call_used_count > other_call_used_count;
#endif
  return false;
}

It somehow reads to me as if that should have tested HONOR_REG_ALLOC_ORDER
instead?  Not sure how likely it is that we run into this (last) condition.

Interestingly x86 defines ADJUST_REG_ALLOC_ORDER.
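
A minimal sketch of why the #ifdef above matters even when the order is the identity: the two branches are different tie-breakers, so merely defining or undefining the macro can change the answer. The toy_* functions are hypothetical and mirror only the final comparison of the quoted function, assuming an identity order so that a register's position equals its number.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC source.  All toy_* names are hypothetical.  */

/* Branch used when REG_ALLOC_ORDER is defined: compare positions in
   the inverse allocation order.  With an identity order (assumed
   here) the position of a register is simply its number.  */
static bool
toy_better_with_order (int hard_regno, int other_hard_regno)
{
  if (hard_regno >= 0 && other_hard_regno >= 0)
    return hard_regno < other_hard_regno;
  return false;
}

/* Branch used when REG_ALLOC_ORDER is not defined: compare the
   call-used counts of the two candidates.  */
static bool
toy_better_without_order (int call_used_count, int other_call_used_count)
{
  if (call_used_count != other_call_used_count)
    return call_used_count > other_call_used_count;
  return false;
}

/* One input on which the two branches disagree.  */
static void
toy_show_disagreement (void)
{
  /* Candidate A: hard register 3, call-used count 0.
     Candidate B: hard register 5, call-used count 1.  */
  assert (toy_better_with_order (3, 5) == true);
  assert (toy_better_without_order (0, 1) == false);
}
```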

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: ubizjak at gmail dot com @ 2021-02-15  9:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
It looks to me like another one is in reload1.c, find_reg:

          if (this_cost < best_cost
              /* Among registers with equal cost, prefer caller-saved ones, or
                 use REG_ALLOC_ORDER if it is defined.  */
              || (this_cost == best_cost
#ifdef REG_ALLOC_ORDER
                  && (inv_reg_alloc_order[regno]
                      < inv_reg_alloc_order[best_reg])
#else
                  && crtl->abi->clobbers_full_reg_p (regno)
                  && !crtl->abi->clobbers_full_reg_p (best_reg)
#endif
                  ))
            {
              best_reg = regno;
              best_cost = this_cost;
            }

According to the comment, REG_ALLOC_ORDER has to be defined to use preferences.

As mentioned by Richard in Comment #2, x86 defines ADJUST_REG_ALLOC_ORDER,
where the real allocation order is computed. But the documentation doesn't
mention that REG_ALLOC_ORDER also needs to be defined. It even explicitly says:

     The macro body should not assume anything about the contents of
     'reg_alloc_order' before execution of the macro.

But, we want to use the order from reg_alloc_order, so x86 should define
HONOR_REG_ALLOC_ORDER:

 -- Macro: HONOR_REG_ALLOC_ORDER
     Normally, IRA tries to estimate the costs for saving a register in
     the prologue and restoring it in the epilogue.  This discourages it
     from using call-saved registers.  If a machine wants to ensure that
     IRA allocates registers in the order given by REG_ALLOC_ORDER even
     if some call-saved registers appear earlier than call-used ones,
     then define this macro as a C expression to nonzero.  Default is 0.

But...

x86_order_regs_for_local_alloc lists general call_used_or_fixed_regs first, so
it should not matter anyway as far as call_used regs are concerned.
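
As a rough sketch of the find_reg tie-break quoted above, the toy loop below picks the cheapest register and breaks cost ties in one of the two ways. TOY_NREGS, toy_find_reg and toy_check_find_reg are hypothetical names, the cost and clobber tables are made up, and an identity allocation order is assumed (the real x86 order comes from ADJUST_REG_ALLOC_ORDER).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC source.  All toy_* names are hypothetical.  */
#define TOY_NREGS 4

/* Return the cheapest register.  Among equal costs, USE_ORDER selects
   the "REG_ALLOC_ORDER defined" tie-break (identity order assumed, so
   the lower number wins); otherwise a call-clobbered register beats a
   register that is not call-clobbered.  */
static int
toy_find_reg (const int cost[TOY_NREGS], const bool clobbered[TOY_NREGS],
              bool use_order)
{
  int best_reg = -1;
  int best_cost = 0;

  for (int regno = 0; regno < TOY_NREGS; regno++)
    {
      int this_cost = cost[regno];
      bool tie_win = false;

      if (best_reg >= 0 && this_cost == best_cost)
        tie_win = (use_order
                   ? regno < best_reg
                   : clobbered[regno] && !clobbered[best_reg]);

      if (best_reg < 0 || this_cost < best_cost || tie_win)
        {
          best_reg = regno;
          best_cost = this_cost;
        }
    }
  return best_reg;
}

/* Same costs, different winner depending on the tie-break.  */
static void
toy_check_find_reg (void)
{
  const int cost[TOY_NREGS] = { 5, 5, 5, 5 };
  const bool clobbered[TOY_NREGS] = { false, false, true, false };

  assert (toy_find_reg (cost, clobbered, true) == 0);
  assert (toy_find_reg (cost, clobbered, false) == 2);
}
```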

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: ubizjak at gmail dot com @ 2021-02-15 12:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #4 from Uroš Bizjak <ubizjak at gmail dot com> ---
Created attachment 50185
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50185&action=edit
Proposed patch

Proposed patch that fixes ira-color.c and introduces HONOR_REG_ALLOC_ORDER.

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: ubizjak at gmail dot com @ 2021-02-15 12:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #5 from Uroš Bizjak <ubizjak at gmail dot com> ---
Martin, can you please benchmark the patch from Comment #4?

The patch is not totally trivial, because it introduces HONOR_REG_ALLOC_ORDER
to x86 and this define disables some other code in ira-color.c,
assign_hard_reg:

      if (!HONOR_REG_ALLOC_ORDER)
        {
          if ((saved_nregs = calculate_saved_nregs (hard_regno, mode)) != 0)
          /* We need to save/restore the hard register in
             epilogue/prologue.  Therefore we increase the cost.  */
          {
            rclass = REGNO_REG_CLASS (hard_regno);
            add_cost = ((ira_memory_move_cost[mode][rclass][0]
                         + ira_memory_move_cost[mode][rclass][1])
                        * saved_nregs / hard_regno_nregs (hard_regno,
                                                          mode) - 1);
            cost += add_cost;
            full_cost += add_cost;
          }
        }
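
To make the arithmetic of the quoted block concrete, here is a small restatement with plain integers. The function and parameter names are hypothetical: cost0 and cost1 stand in for the two ira_memory_move_cost entries, and the numbers in the check are made up for illustration.

```c
#include <assert.h>

/* Toy restatement of the quoted arithmetic, not GCC source.
   COST0 and COST1 stand in for the two ira_memory_move_cost entries,
   SAVED_NREGS for the result of calculate_saved_nregs and NREGS for
   hard_regno_nregs.  Returns the extra cost that the quoted code adds
   to both cost and full_cost.  */
static int
toy_save_restore_cost (int cost0, int cost1, int saved_nregs, int nregs)
{
  if (saved_nregs == 0)
    return 0;                   /* Nothing to save, nothing added.  */
  return (cost0 + cost1) * saved_nregs / nregs - 1;
}

/* With both memory move costs at 4 and one register to save out of
   one, the penalty is (4 + 4) * 1 / 1 - 1 = 7.  */
static void
toy_check_save_restore_cost (void)
{
  assert (toy_save_restore_cost (4, 4, 1, 1) == 7);
  assert (toy_save_restore_cost (4, 4, 0, 1) == 0);
}
```

When HONOR_REG_ALLOC_ORDER is nonzero the quoted block is skipped, which in this toy model corresponds to a penalty of 0 for every register.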

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: ubizjak at gmail dot com @ 2021-02-15 12:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #6 from Uroš Bizjak <ubizjak at gmail dot com> ---
As a side note, it is strange that ADJUST_REG_ALLOC_ORDER somehow requires
REG_ALLOC_ORDER to be defined (cf. Comment #3), while its documentation says:

     The macro body should not assume anything about the contents of
     'reg_alloc_order' before execution of the macro.

This mess begs for the redefinition of REG_ALLOC_ORDER/ADJUST_REG_ALLOC_ORDER
as a target hook.

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: rguenth at gcc dot gnu.org @ 2021-02-15 12:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Btw, for GCC 11 it might be tempting to simply revert the "no-op" change?

There are a lot of targets that define REG_ALLOC_ORDER ^ HONOR_REG_ALLOC_ORDER
and thus are affected by this change...

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: ubizjak at gmail dot com @ 2021-02-15 13:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #8 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #7)
> Btw, for GCC 11 it might be tempting to simply revert the "no-op" change?

I agree, this is the safest way at this time. The situation now looks like
going down a rabbit hole.

> There are a lot of targets that define REG_ALLOC_ORDER ^
> HONOR_REG_ALLOC_ORDER and thus are affected by this change...

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: jamborm at gcc dot gnu.org @ 2021-02-15 13:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #9 from Martin Jambor <jamborm at gcc dot gnu.org> ---
I will benchmark the patch later this week, just so that we know, but I agree
that reverting the patch and applying it again at the beginning of stage1 is
probably the best.

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: ubizjak at gmail dot com @ 2021-02-15 13:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #10 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #7)
> There are a lot of targets that define REG_ALLOC_ORDER ^
> HONOR_REG_ALLOC_ORDER and thus are affected by this change...

The following patch should solve this issue:

--cut here--
diff --git a/gcc/defaults.h b/gcc/defaults.h
index 91216593e75..2af4add0c05 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1047,7 +1047,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #endif

 #ifndef HONOR_REG_ALLOC_ORDER
-#define HONOR_REG_ALLOC_ORDER 0
+# if defined REG_ALLOC_ORDER
+#  define HONOR_REG_ALLOC_ORDER 1
+# else
+#  define HONOR_REG_ALLOC_ORDER 0
+# endif
 #endif

 /* EXIT_IGNORE_STACK should be nonzero if, when returning from a function,
--cut here--

So, if REG_ALLOC_ORDER is defined, then IRA should obey the order by default.
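
The effect of the proposed default can be tried in isolation with stand-in macros. This is only a sketch: the TOY_* macros and toy_* functions are hypothetical and merely copy the preprocessor logic of the patch, with TOY_REG_ALLOC_ORDER playing the role of a target that defines an order but says nothing about honoring it.

```c
#include <assert.h>

/* Toy stand-ins, not GCC source.  A target header would define or
   omit these; here the order is defined and the "honor" macro is not,
   which is the case the proposed default changes.  */
#define TOY_REG_ALLOC_ORDER { 0, 1, 2, 3 }

/* Same shape as the proposed defaults.h hunk.  */
#ifndef TOY_HONOR_REG_ALLOC_ORDER
# if defined TOY_REG_ALLOC_ORDER
#  define TOY_HONOR_REG_ALLOC_ORDER 1
# else
#  define TOY_HONOR_REG_ALLOC_ORDER 0
# endif
#endif

/* Return the value the default logic picked.  */
static int
toy_honors_order (void)
{
  return TOY_HONOR_REG_ALLOC_ORDER;
}

/* An order is defined above, so the default resolves to 1.  */
static void
toy_check_default (void)
{
  assert (toy_honors_order () == 1);
}
```

A target that defines TOY_HONOR_REG_ALLOC_ORDER itself before this point keeps its own value, since the whole block is guarded by #ifndef.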

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: ubizjak at gmail dot com @ 2021-02-21 17:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #11 from Uroš Bizjak <ubizjak at gmail dot com> ---
Fixed by revert of 4c61e35f20fe2ffeb9421dbd6f26c767a234a4a0.

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: ubizjak at gmail dot com @ 2021-02-21 17:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.0

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: jamborm at gcc dot gnu.org @ 2021-02-23 17:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #12 from Martin Jambor <jamborm at gcc dot gnu.org> ---
For the record, I have benchmarked the patches from comment #4 and comment #10
on top of commit 6b1633378b7 (for which I already have unpatched benchmark
results) and the regression of 519.lbm_r compiled with -O2 LTO dropped from 62%
to 8%.  

The -Ofast -march=native -flto vs. non-LTO regression also dropped from 8% to
about 5% (GCC 10 also has non-LTO 2.5% faster than LTO, but at least both times
improved vs. GCC 10).

The only notable regression brought about by the patch was 538.imagick_r when
compiled at -Ofast -march=native without LTO, which was 6% slower with the
patch.

All of the measurements were done on a Zen2 machine.

Thank you for reverting the patch, now we need to look for LNT to pick up the
changes.

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: ubizjak at gmail dot com @ 2021-02-25  9:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
           Keywords|                            |patch
           Assignee|unassigned at gcc dot gnu.org      |ubizjak at gmail dot com
         Resolution|FIXED                       |---

--- Comment #13 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Martin Jambor from comment #12)
> For the record, I have benchmarked the patches from comment #4 and comment
> #10 on top of commit 6b1633378b7 (for which I already have unpatched
> benchmark results) and the regression of 519.lbm_r compiled with -O2 LTO
> dropped from 62% to 8%.  
> 
> The -Ofast -march=native -flto vs. non-LTO regression also dropped from 8%
> to about 5% (GCC 10 also has non-LTO 2.5% faster than LTO, but at least both
> times improved vs. GCC 10).
> 
> The only notable regression brought about by the patch was 538.imagick_r
> when compiled at -Ofast -march=native without LTO, which was 6% slower with
> the patch.
> 
> All of the measurements were done on a Zen2 machine.
> 
> Thank you for reverting the patch, now we need to look for LNT to pick up
> the changes.

The complete patch that presumably corrects HONOR_REG_ALLOC_ORDER usage is at
[1], but IIUC the above measurements, there is still a regression of 8% vs. the
unpatched compiler.  With the complete patch [1], the
ira_better_spill_reload_regno_p change should be a no-op, but the new default
also disables the internal calculations in assign_hard_reg; please see [2] for
the reasoning.

Based on the above benchmarks, it looks like disabling the internal
calculations in assign_hard_reg is harmful even for HONOR_REG_ALLOC_ORDER
targets; at least the patched x86 compiler shows this effect. Maybe Vlad could
comment on this part.

Let's reopen this PR to keep the discussions in one place.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-February/565640.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2021-February/565699.html

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: jakub at gcc dot gnu.org @ 2021-04-27 11:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.0                        |11.2

--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 11.1 has been released, retargeting bugs to GCC 11.2.

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: rguenth at gcc dot gnu.org @ 2021-07-28  7:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.2                        |11.3

--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 11.2 is being released, retargeting bugs to GCC 11.3

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: rguenth at gcc dot gnu.org @ 2022-04-21  7:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.3                        |11.4

--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 11.3 is being released, retargeting bugs to GCC 11.4.

* [Bug target/99083] Big run-time regressions of 519.lbm_r with LTO
From: jakub at gcc dot gnu.org @ 2023-05-29 10:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.4                        |11.5

--- Comment #17 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 11.4 is being released, retargeting bugs to GCC 11.5.
