* [PATCH] LoongArch: Change jumptable's register constraint to 'q' [PR110136]
@ 2023-06-07 2:31 Lulu Cheng
2023-06-07 3:26 ` WANG Xuerui
0 siblings, 1 reply; 5+ messages in thread
From: Lulu Cheng @ 2023-06-07 2:31 UTC (permalink / raw)
To: gcc-patches; +Cc: xry111, i, xuchenghua, Lulu Cheng
If the $ra register is modified during the jump to the jump table, the hardware
branch prediction function will be broken, resulting in a significant increase
in the branch false prediction rate and affecting performance.
gcc/ChangeLog:
* config/loongarch/loongarch.md: Change register constraint to 'q'.
---
gcc/config/loongarch/loongarch.md | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 816a943d155..f9b64173104 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -2926,9 +2926,11 @@ (define_expand "tablejump"
DONE;
})
+;; Jump to the jump table Avoid using the $r1 register to prevent
+;; affecting hardware branch prediction.
(define_insn "@tablejump<mode>"
[(set (pc)
- (match_operand:P 0 "register_operand" "r"))
+ (match_operand:P 0 "register_operand" "q"))
(use (label_ref (match_operand 1 "" "")))]
""
"jr\t%0"
--
2.31.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] LoongArch: Change jumptable's register constraint to 'q' [PR110136]
2023-06-07 2:31 [PATCH] LoongArch: Change jumptable's register constraint to 'q' [PR110136] Lulu Cheng
@ 2023-06-07 3:26 ` WANG Xuerui
2023-06-07 3:36 ` Lulu Cheng
0 siblings, 1 reply; 5+ messages in thread
From: WANG Xuerui @ 2023-06-07 3:26 UTC (permalink / raw)
To: Lulu Cheng, gcc-patches; +Cc: xry111, xuchenghua
Hi,
On 2023/6/7 10:31, Lulu Cheng wrote:
> If the $ra register is modified during the jump to the jump table, the hardware
> branch prediction function will be broken, resulting in a significant increase
> in the branch false prediction rate and affecting performance.
Thanks for the insight! This is the kind of improvement that will
probably become a lot harder to even *sight* without uarch details
available.
However, I think it's better to also include a minimized test case to
ensure the compiled code doesn't regress. (Comparison of relevant
statistics, e.g. output of perf stat, would be even nicer to have!)
>
> gcc/ChangeLog:
>
> * config/loongarch/loongarch.md: Change register constraint to 'q'.
> ---
> gcc/config/loongarch/loongarch.md | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
> index 816a943d155..f9b64173104 100644
> --- a/gcc/config/loongarch/loongarch.md
> +++ b/gcc/config/loongarch/loongarch.md
> @@ -2926,9 +2926,11 @@ (define_expand "tablejump"
> DONE;
> })
>
> +;; Jump to the jump table Avoid using the $r1 register to prevent
> +;; affecting hardware branch prediction.
> (define_insn "@tablejump<mode>"
> [(set (pc)
> - (match_operand:P 0 "register_operand" "r"))
> + (match_operand:P 0 "register_operand" "q"))
> (use (label_ref (match_operand 1 "" "")))]
> ""
> "jr\t%0"
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] LoongArch: Change jumptable's register constraint to 'q' [PR110136]
2023-06-07 3:26 ` WANG Xuerui
@ 2023-06-07 3:36 ` Lulu Cheng
2023-06-07 7:37 ` WANG Xuerui
0 siblings, 1 reply; 5+ messages in thread
From: Lulu Cheng @ 2023-06-07 3:36 UTC (permalink / raw)
To: WANG Xuerui, gcc-patches; +Cc: xry111, xuchenghua
在 2023/6/7 上午11:26, WANG Xuerui 写道:
> Hi,
>
> On 2023/6/7 10:31, Lulu Cheng wrote:
>> If the $ra register is modified during the jump to the jump table,
>> the hardware
>> branch prediction function will be broken, resulting in a significant
>> increase
>> in the branch false prediction rate and affecting performance.
>
> Thanks for the insight! This is the kind of improvement that will
> probably become a lot harder to even *sight* without uarch details
> available.
>
> However, I think it's better to also include a minimized test case to
> ensure the compiled code doesn't regress. (Comparison of relevant
> statistics, e.g. output of perf stat, would be even nicer to have!)
>
There was no way I could find a small test case that would replicate
this problem. This occurs when compiling spec2006 400.perlbench. And it
only appears when '-flto' is added.:-(
But I paid for reproducible programs and compilation methods under the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110136.
>>
>> gcc/ChangeLog:
>>
>> * config/loongarch/loongarch.md: Change register constraint to 'q'.
>> ---
>> gcc/config/loongarch/loongarch.md | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/config/loongarch/loongarch.md
>> b/gcc/config/loongarch/loongarch.md
>> index 816a943d155..f9b64173104 100644
>> --- a/gcc/config/loongarch/loongarch.md
>> +++ b/gcc/config/loongarch/loongarch.md
>> @@ -2926,9 +2926,11 @@ (define_expand "tablejump"
>> DONE;
>> })
>> +;; Jump to the jump table Avoid using the $r1 register to prevent
>> +;; affecting hardware branch prediction.
>> (define_insn "@tablejump<mode>"
>> [(set (pc)
>> - (match_operand:P 0 "register_operand" "r"))
>> + (match_operand:P 0 "register_operand" "q"))
>> (use (label_ref (match_operand 1 "" "")))]
>> ""
>> "jr\t%0"
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] LoongArch: Change jumptable's register constraint to 'q' [PR110136]
2023-06-07 3:36 ` Lulu Cheng
@ 2023-06-07 7:37 ` WANG Xuerui
2023-06-07 7:51 ` Lulu Cheng
0 siblings, 1 reply; 5+ messages in thread
From: WANG Xuerui @ 2023-06-07 7:37 UTC (permalink / raw)
To: Lulu Cheng, gcc-patches; +Cc: xry111, xuchenghua
On 2023/6/7 11:36, Lulu Cheng wrote:
>
> 在 2023/6/7 上午11:26, WANG Xuerui 写道:
>> Hi,
>>
>> On 2023/6/7 10:31, Lulu Cheng wrote:
>>> If the $ra register is modified during the jump to the jump table,
>>> the hardware
>>> branch prediction function will be broken, resulting in a significant
>>> increase
>>> in the branch false prediction rate and affecting performance.
>>
>> Thanks for the insight! This is the kind of improvement that will
>> probably become a lot harder to even *sight* without uarch details
>> available.
>>
>> However, I think it's better to also include a minimized test case to
>> ensure the compiled code doesn't regress. (Comparison of relevant
>> statistics, e.g. output of perf stat, would be even nicer to have!)
>>
> There was no way I could find a small test case that would replicate
> this problem. This occurs when compiling spec2006 400.perlbench. And it
> only appears when '-flto' is added.:-(
>
> But I paid for reproducible programs and compilation methods under the
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110136.
Ah okay. I missed the bug number in the title initially.
After reading through the context, I think the reason for avoiding $ra
(in general) should be that the micro-architecture unconditionally
treats a "jr $ra" as "return from subroutine", hence doing "jr $ra"
would interfere with *both* subroutine return prediction *and* the more
general indirect branch prediction; am I right? This could be more
informative than merely saying "HW branch prediction will be broken".
>
>>>
>>> gcc/ChangeLog:
>>>
>>> * config/loongarch/loongarch.md: Change register constraint to 'q'.
>>> ---
>>> gcc/config/loongarch/loongarch.md | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/gcc/config/loongarch/loongarch.md
>>> b/gcc/config/loongarch/loongarch.md
>>> index 816a943d155..f9b64173104 100644
>>> --- a/gcc/config/loongarch/loongarch.md
>>> +++ b/gcc/config/loongarch/loongarch.md
>>> @@ -2926,9 +2926,11 @@ (define_expand "tablejump"
>>> DONE;
>>> })
>>> +;; Jump to the jump table Avoid using the $r1 register to prevent
>>> +;; affecting hardware branch prediction.
>>> (define_insn "@tablejump<mode>"
>>> [(set (pc)
>>> - (match_operand:P 0 "register_operand" "r"))
>>> + (match_operand:P 0 "register_operand" "q"))
>>> (use (label_ref (match_operand 1 "" "")))]
>>> ""
>>> "jr\t%0"
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] LoongArch: Change jumptable's register constraint to 'q' [PR110136]
2023-06-07 7:37 ` WANG Xuerui
@ 2023-06-07 7:51 ` Lulu Cheng
0 siblings, 0 replies; 5+ messages in thread
From: Lulu Cheng @ 2023-06-07 7:51 UTC (permalink / raw)
To: WANG Xuerui, gcc-patches; +Cc: xry111, xuchenghua
在 2023/6/7 下午3:37, WANG Xuerui 写道:
> On 2023/6/7 11:36, Lulu Cheng wrote:
>>
>> 在 2023/6/7 上午11:26, WANG Xuerui 写道:
>>> Hi,
>>>
>>> On 2023/6/7 10:31, Lulu Cheng wrote:
>>>> If the $ra register is modified during the jump to the jump table,
>>>> the hardware
>>>> branch prediction function will be broken, resulting in a
>>>> significant increase
>>>> in the branch false prediction rate and affecting performance.
>>>
>>> Thanks for the insight! This is the kind of improvement that will
>>> probably become a lot harder to even *sight* without uarch details
>>> available.
>>>
>>> However, I think it's better to also include a minimized test case
>>> to ensure the compiled code doesn't regress. (Comparison of relevant
>>> statistics, e.g. output of perf stat, would be even nicer to have!)
>>>
>> There was no way I could find a small test case that would replicate
>> this problem. This occurs when compiling spec2006 400.perlbench. And
>> it only appears when '-flto' is added.:-(
>>
>> But I paid for reproducible programs and compilation methods under
>> the https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110136.
>
> Ah okay. I missed the bug number in the title initially.
>
> After reading through the context, I think the reason for avoiding $ra
> (in general) should be that the micro-architecture unconditionally
> treats a "jr $ra" as "return from subroutine", hence doing "jr $ra"
> would interfere with *both* subroutine return prediction *and* the
> more general indirect branch prediction; am I right? This could be
> more informative than merely saying "HW branch prediction will be
> broken".
>
Exactly what you described. :-)
I'll change the description.
Thank you very much!
>>
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> * config/loongarch/loongarch.md: Change register constraint to
>>>> 'q'.
>>>> ---
>>>> gcc/config/loongarch/loongarch.md | 4 +++-
>>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/gcc/config/loongarch/loongarch.md
>>>> b/gcc/config/loongarch/loongarch.md
>>>> index 816a943d155..f9b64173104 100644
>>>> --- a/gcc/config/loongarch/loongarch.md
>>>> +++ b/gcc/config/loongarch/loongarch.md
>>>> @@ -2926,9 +2926,11 @@ (define_expand "tablejump"
>>>> DONE;
>>>> })
>>>> +;; Jump to the jump table Avoid using the $r1 register to prevent
>>>> +;; affecting hardware branch prediction.
>>>> (define_insn "@tablejump<mode>"
>>>> [(set (pc)
>>>> - (match_operand:P 0 "register_operand" "r"))
>>>> + (match_operand:P 0 "register_operand" "q"))
>>>> (use (label_ref (match_operand 1 "" "")))]
>>>> ""
>>>> "jr\t%0"
>>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-06-07 7:51 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-07 2:31 [PATCH] LoongArch: Change jumptable's register constraint to 'q' [PR110136] Lulu Cheng
2023-06-07 3:26 ` WANG Xuerui
2023-06-07 3:36 ` Lulu Cheng
2023-06-07 7:37 ` WANG Xuerui
2023-06-07 7:51 ` Lulu Cheng
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).