public inbox for gcc-bugs@sourceware.org
* [Bug lto/110136] New: After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction.
@ 2023-06-06  6:36 chenglulu at loongson dot cn
  2023-06-06  6:54 ` [Bug target/110136] " pinskia at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: chenglulu at loongson dot cn @ 2023-06-06  6:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110136

            Bug ID: 110136
           Summary: After optimization, the $r1 register will be broken
                    when jumping to the jump table, resulting in a
                    significant increase in the false prediction rate of
                    branch prediction.
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: lto
          Assignee: unassigned at gcc dot gnu.org
          Reporter: chenglulu at loongson dot cn
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---
            Target: loongarch64-*-linux

Created attachment 55267
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55267&action=edit
perlbench.ltrans15.ltrans.args.0

tag: releases/gcc-12.2.0
The code here replicates the problem.

$ ./libexec/gcc/loongarch64-linux-gnu/12.2.0/lto1 -quiet -dumpbase
./perlbench.ltrans15.ltrans -mabi=lp64d -march=loongarch64 -mfpu=64
-mcmodel=normal -mtune=loongarch64 -g -g -Ofast -Ofast -version -fno-openmp
-fno-openacc -fcf-protection=none -fno-omit-frame-pointer -funroll-all-loops 
-fltrans @./perlbench.ltrans15.ltrans.args.0 -fdump-rtl-all -o
./perlbench.ltrans15.ltrans.s -fpie

Perl_sv_upgrade:
...
 5908         addi.w  $r18,$r0,15                     # 0xf
 5909         bgtu    $r23,$r18,.L502
 5910         la.local        $r16,.L504
 5911         slli.d  $r19,$r23,3
 5912         ldx.d   $r20,$r16,$r19
 5913         add.d   $r1,$r16,$r20
 5914         jr      $r1
...

In the regrename pass, the registers on lines 5913 and 5914 are replaced
with $r1.
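
For context, a sequence like the one in the listing is what a compiler commonly
emits for a dense switch statement. The sketch below is only an illustration:
the function name dispatch and all case bodies are hypothetical and unrelated to
the real Perl_sv_upgrade, and whether a given compiler actually builds a jump
table for it depends on the target and the optimization flags.

```c
/* Hypothetical 16-way dispatch.  The case bodies are invented for
   illustration only.  */
int
dispatch (unsigned kind, int x)
{
  /* Conceptually, a jump-table lowering of this switch does:
       1. bounds check: kind > 15 goes to the default label
          (compare "bgtu $r23,$r18,.L502" in the listing);
       2. scale the index and load the table entry
          (compare "slli.d" and "ldx.d");
       3. add the table base and jump indirectly
          (compare "add.d" and "jr").  */
  switch (kind)
    {
    case 0:  return x + 1;
    case 1:  return x * 3;
    case 2:  return x - 7;
    case 3:  return x ^ 0x55;
    case 4:  return -x;
    case 5:  return x / 2;
    case 6:  return x % 5;
    case 7:  return x & 0xf;
    case 8:  return x | 0x10;
    case 9:  return x + 100;
    case 10: return x * x;
    case 11: return x - x / 4;
    case 12: return ~x;
    case 13: return x > 0;
    case 14: return x == 0;
    case 15: return 15 - x;
    default: return -1;
    }
}
```

The register holding the computed target in step 3 is the one this report is
about: it should not end up being $r1.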

While debugging, I found that defining the HARD_REGNO_RENAME_OK hook makes the
problem go away. However, this is only an accident: the hook does not guarantee
that the register used for the jump-table jump will never be renamed to $r1.
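
To make the "no guarantee" point concrete, here is a toy model rather than GCC
code. The array ever_live is a hypothetical stand-in for GCC's dataflow
information, and both function names are invented. It contrasts a
liveness-based rule, like the one in the patch below, with a rule that
excludes $r1 explicitly.

```c
#include <stdbool.h>

#define NUM_GPRS  32
#define RA_REGNUM 1   /* $r1, the return-address register */

/* Toy version of a liveness-based rule: a rename is allowed only if
   the destination register is already used somewhere in the function.
   ever_live[] is a hypothetical stand-in for the compiler's data.  */
bool
rename_ok_if_live (const bool ever_live[NUM_GPRS], unsigned to_regno)
{
  return ever_live[to_regno];
}

/* Toy version of an explicit rule: never rename into $r1, regardless
   of liveness.  */
bool
rename_ok_not_ra (unsigned to_regno)
{
  return to_regno != RA_REGNUM;
}
```

Under the first rule, $r1 is rejected only while it happens to be unused. As
soon as it counts as live in a function, the rule permits it again. The second
rule does not depend on that coincidence.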

The patch that defines the HARD_REGNO_RENAME_OK is as follows:
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 5c9a33c14f7..0df0ae15c3e 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -5782,6 +5782,19 @@ loongarch_starting_frame_offset (void)
   return crtl->outgoing_args_size;
 }

+/* Return nonzero if register FROM_REGNO can be renamed to register
+   TO_REGNO.  */
+
+bool
+loongarch_hard_regno_rename_ok (unsigned from_regno ATTRIBUTE_UNUSED,
+                           unsigned to_regno)
+{
+  return df_regs_ever_live_p (to_regno);
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index f9de9a6e4fb..b22b439eaac 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -563,6 +563,8 @@ enum reg_class
 #define IMM_BITS 12
 #define IMM_REACH (1LL << IMM_BITS)

+#define HARD_REGNO_RENAME_OK(FROM, TO) loongarch_hard_regno_rename_ok (FROM, TO)
+
 /* True if VALUE is an unsigned 6-bit number.  */


Is there a way to ensure that the $r1 register is not clobbered when jumping
through the jump table?

* [Bug target/110136] After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction.
  2023-06-06  6:36 [Bug lto/110136] New: After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction chenglulu at loongson dot cn
@ 2023-06-06  6:54 ` pinskia at gcc dot gnu.org
  2023-06-06  6:57 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-06-06  6:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110136

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|lto                         |target

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>In the regrename passover optimization

I am trying to understand the issue.

5912         ldx.d   $r20,$r16,$r19
 5913         add.d   $r1,$r16,$r20
 5914         jr      $r1

Is the issue that jr does not like the r1 register, or is it some other kind of
performance issue?

* [Bug target/110136] After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction.
  2023-06-06  6:36 [Bug lto/110136] New: After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction chenglulu at loongson dot cn
  2023-06-06  6:54 ` [Bug target/110136] " pinskia at gcc dot gnu.org
@ 2023-06-06  6:57 ` pinskia at gcc dot gnu.org
  2023-06-06  9:55 ` chenglulu at loongson dot cn
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-06-06  6:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110136

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #1)
> >In the regrename passover optimization
> 
> I am trying to understand the issue.
> 
> 5912         ldx.d   $r20,$r16,$r19
>  5913         add.d   $r1,$r16,$r20
>  5914         jr      $r1
> 
> Is the issue that jr does not like the r1 register, or is it some other kind
> of performance issue?

If it is just r1 that is the issue, you could change the pattern in
loongarch.md to discourage r1 by changing the constraints there.
Because right now it assumes all registers are similar in cost:

(define_insn "@indirect_jump<mode>"
  [(set (pc) (match_operand:P 0 "register_operand" "r"))]
  ""
  "jr\t%0"
  [(set_attr "type" "jump")
   (set_attr "mode" "none")])

* [Bug target/110136] After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction.
  2023-06-06  6:36 [Bug lto/110136] New: After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction chenglulu at loongson dot cn
  2023-06-06  6:54 ` [Bug target/110136] " pinskia at gcc dot gnu.org
  2023-06-06  6:57 ` pinskia at gcc dot gnu.org
@ 2023-06-06  9:55 ` chenglulu at loongson dot cn
  2023-06-06  9:58 ` chenglulu at loongson dot cn
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: chenglulu at loongson dot cn @ 2023-06-06  9:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110136

--- Comment #3 from chenglulu <chenglulu at loongson dot cn> ---
(In reply to Andrew Pinski from comment #1)
> >In the regrename passover optimization
> 
> I am trying to understand the issue.
> 
> 5912         ldx.d   $r20,$r16,$r19
>  5913         add.d   $r1,$r16,$r20
>  5914         jr      $r1
> 
> Is the issue that jr does not like the r1 register, or is it some other kind
> of performance issue?

It is a performance issue: if $r1 is clobbered and used for the jump to the
jump table, the hardware's branch prediction accuracy suffers.

* [Bug target/110136] After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction.
  2023-06-06  6:36 [Bug lto/110136] New: After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction chenglulu at loongson dot cn
                   ` (2 preceding siblings ...)
  2023-06-06  9:55 ` chenglulu at loongson dot cn
@ 2023-06-06  9:58 ` chenglulu at loongson dot cn
  2023-06-15  8:15 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: chenglulu at loongson dot cn @ 2023-06-06  9:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110136

--- Comment #4 from chenglulu <chenglulu at loongson dot cn> ---
(In reply to Andrew Pinski from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > >In the regrename passover optimization
> > 
> > I am trying to understand the issue.
> > 
> > 5912         ldx.d   $r20,$r16,$r19
> >  5913         add.d   $r1,$r16,$r20
> >  5914         jr      $r1
> > 
> > Is the issue that jr does not like the r1 register, or is it some other
> > kind of performance issue?
> 
> If it is just r1 that is the issue, you could change the pattern in
> loongarch.md to discourage r1 by changing the constraints there.
> Because right now it assumes all registers are similar in cost:
> 
> (define_insn "@indirect_jump<mode>"
>   [(set (pc) (match_operand:P 0 "register_operand" "r"))]
>   ""
>   "jr\t%0"
>   [(set_attr "type" "jump")
>    (set_attr "mode" "none")])

Thank you very much. I will modify the pattern and give it a try.

* [Bug target/110136] After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction.
  2023-06-06  6:36 [Bug lto/110136] New: After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction chenglulu at loongson dot cn
                   ` (3 preceding siblings ...)
  2023-06-06  9:58 ` chenglulu at loongson dot cn
@ 2023-06-15  8:15 ` cvs-commit at gcc dot gnu.org
  2023-06-15  8:24 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-06-15  8:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110136

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by LuluCheng <chenglulu@gcc.gnu.org>:

https://gcc.gnu.org/g:5430c86e71927492399129f3df80824c6c334ddf

commit r14-1866-g5430c86e71927492399129f3df80824c6c334ddf
Author: Lulu Cheng <chenglulu@loongson.cn>
Date:   Wed Jun 7 10:21:58 2023 +0800

    LoongArch: Avoid non-returning indirect jumps through $ra [PR110136]

    Micro-architecture unconditionally treats a "jr $ra" as "return from subroutine",
    hence doing "jr $ra" would interfere with both subroutine return prediction and
    the more general indirect branch prediction.

    Therefore, a problem like PR110136 can cause a significant increase in branch error
    prediction rate and affect performance. The same problem exists with "indirect_jump".

    gcc/ChangeLog:

            PR target/110136
            * config/loongarch/loongarch.md: Modify the register constraints for template
            "jumptable" and "indirect_jump" from "r" to "e".

    Co-authored-by: Andrew Pinski <apinski@marvell.com>
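
The effect of narrowing a constraint can be pictured with a small toy model.
This is a sketch under stated assumptions, not GCC's register allocator. The
masks CLASS_R and CLASS_E and the function pick_jump_reg are hypothetical, and
the real members of the "r" and "e" classes are not listed in this thread. The
only property taken from the commit is that the new constraint keeps the jump
away from $ra ($r1).

```c
#include <stdint.h>

/* One bit per general-purpose register: bit n stands for $rn.
   Both masks are illustrative assumptions, not GCC's real classes.  */
#define CLASS_R UINT32_C (0xffffffff)                /* stand-in for "r" */
#define CLASS_E (CLASS_R & ~(UINT32_C (1) << 1))     /* same, minus $r1 */

/* Return the lowest-numbered register that is both a member of
   class_mask and marked available in free_mask, or -1 if none is.  */
int
pick_jump_reg (uint32_t class_mask, uint32_t free_mask)
{
  uint32_t candidates = class_mask & free_mask;

  for (int regno = 0; regno < 32; regno++)
    if (candidates & (UINT32_C (1) << regno))
      return regno;
  return -1;
}
```

With $r1 and $r12 both free, the wide class can hand out $r1 for the jump,
while the narrowed class can only ever return $r12.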

* [Bug target/110136] After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction.
  2023-06-06  6:36 [Bug lto/110136] New: After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction chenglulu at loongson dot cn
                   ` (4 preceding siblings ...)
  2023-06-15  8:15 ` cvs-commit at gcc dot gnu.org
@ 2023-06-15  8:24 ` cvs-commit at gcc dot gnu.org
  2023-06-15  8:26 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-06-15  8:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110136

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by LuluCheng
<chenglulu@gcc.gnu.org>:

https://gcc.gnu.org/g:ddec24e5abe99033c8d6bbe544b4c2b35a0232f2

commit r12-9698-gddec24e5abe99033c8d6bbe544b4c2b35a0232f2
Author: Lulu Cheng <chenglulu@loongson.cn>
Date:   Wed Jun 7 10:21:58 2023 +0800

    LoongArch: Avoid non-returning indirect jumps through $ra [PR110136]

    Micro-architecture unconditionally treats a "jr $ra" as "return from subroutine",
    hence doing "jr $ra" would interfere with both subroutine return prediction and
    the more general indirect branch prediction.

    Therefore, a problem like PR110136 can cause a significant increase in branch error
    prediction rate and affect performance. The same problem exists with "indirect_jump".

    gcc/ChangeLog:

            PR target/110136
            * config/loongarch/loongarch.md: Modify the register constraints for template
            "jumptable" and "indirect_jump" from "r" to "e".

    Co-authored-by: Andrew Pinski <apinski@marvell.com>

    (cherry picked from commit 5430c86e71927492399129f3df80824c6c334ddf)

* [Bug target/110136] After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction.
  2023-06-06  6:36 [Bug lto/110136] New: After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction chenglulu at loongson dot cn
                   ` (5 preceding siblings ...)
  2023-06-15  8:24 ` cvs-commit at gcc dot gnu.org
@ 2023-06-15  8:26 ` cvs-commit at gcc dot gnu.org
  2023-06-15  8:31 ` chenglulu at loongson dot cn
  2023-06-15  8:51 ` xry111 at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-06-15  8:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110136

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-13 branch has been updated by LuluCheng
<chenglulu@gcc.gnu.org>:

https://gcc.gnu.org/g:f829733b5c92877247727347246d9f927372f0c1

commit r13-7448-gf829733b5c92877247727347246d9f927372f0c1
Author: Lulu Cheng <chenglulu@loongson.cn>
Date:   Wed Jun 7 10:21:58 2023 +0800

    LoongArch: Avoid non-returning indirect jumps through $ra [PR110136]

    Micro-architecture unconditionally treats a "jr $ra" as "return from subroutine",
    hence doing "jr $ra" would interfere with both subroutine return prediction and
    the more general indirect branch prediction.

    Therefore, a problem like PR110136 can cause a significant increase in branch error
    prediction rate and affect performance. The same problem exists with "indirect_jump".

    gcc/ChangeLog:

            PR target/110136
            * config/loongarch/loongarch.md: Modify the register constraints for template
            "jumptable" and "indirect_jump" from "r" to "e".

    Co-authored-by: Andrew Pinski <apinski@marvell.com>

    (cherry picked from commit 5430c86e71927492399129f3df80824c6c334ddf)

* [Bug target/110136] After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction.
  2023-06-06  6:36 [Bug lto/110136] New: After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction chenglulu at loongson dot cn
                   ` (6 preceding siblings ...)
  2023-06-15  8:26 ` cvs-commit at gcc dot gnu.org
@ 2023-06-15  8:31 ` chenglulu at loongson dot cn
  2023-06-15  8:51 ` xry111 at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: chenglulu at loongson dot cn @ 2023-06-15  8:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110136

chenglulu <chenglulu at loongson dot cn> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from chenglulu <chenglulu at loongson dot cn> ---
This issue is resolved.

* [Bug target/110136] After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction.
  2023-06-06  6:36 [Bug lto/110136] New: After optimization, the $r1 register will be broken when jumping to the jump table, resulting in a significant increase in the false prediction rate of branch prediction chenglulu at loongson dot cn
                   ` (7 preceding siblings ...)
  2023-06-15  8:31 ` chenglulu at loongson dot cn
@ 2023-06-15  8:51 ` xry111 at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-06-15  8:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110136

Xi Ruoyao <xry111 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.4
                 CC|                            |xry111 at gcc dot gnu.org
