public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [patch, i386] false dependencies fix
@ 2018-05-02  9:55 Nesterovskiy, Alexander
  2018-06-28  4:29 ` Jeff Law
  0 siblings, 1 reply; 7+ messages in thread
From: Nesterovskiy, Alexander @ 2018-05-02  9:55 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 294 bytes --]

This patch fixes false dependencies for vmovss, vmovsd, vrcpss, vrsqrtss, vsqrtss and vsqrtsd instructions.

Tested on x86-64/Linux, no new test fails, some SPEC 2006/2017 performance gains.
Please let me know if something is wrong here and should be changed.

--
Alexander Nesterovskiy

[-- Attachment #2: falsedep.patch --]
[-- Type: application/octet-stream, Size: 2382 bytes --]

--- i386.md	(revision 259756)
+++ i386.md	(working copy)
@@ -3547,7 +3547,7 @@
 	{
 	case MODE_DF:
 	  if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
-	    return "vmovsd\t{%1, %0, %0|%0, %0, %1}";
+	    return "%vmovsd\t{%d1, %0|%0, %d1}";
 	  return "%vmovsd\t{%1, %0|%0, %1}";
 
 	case MODE_V4SF:
@@ -3748,7 +3748,7 @@
 	{
 	case MODE_SF:
 	  if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
-	    return "vmovss\t{%1, %0, %0|%0, %0, %1}";
+	    return "%vmovss\t{%d1, %0|%0, %d1}";
 	  return "%vmovss\t{%1, %0|%0, %1}";
 
 	case MODE_V16SF:
@@ -15094,11 +15094,13 @@
 	 (symbol_ref "false"))))])
 
 (define_insn "*rcpsf2_sse"
-  [(set (match_operand:SF 0 "register_operand" "=x")
-	(unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "xm")]
+  [(set (match_operand:SF 0 "register_operand" "=x,x")
+	(unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "x,m")]
 		   UNSPEC_RCP))]
   "TARGET_SSE && TARGET_SSE_MATH"
-  "%vrcpss\t{%1, %d0|%d0, %1}"
+  "@
+   %vrcpss\t{%d1, %0|%0, %d1}
+   %vrcpss\t{%1, %d0|%d0, %1}"
   [(set_attr "type" "sse")
    (set_attr "atom_sse_attr" "rcp")
    (set_attr "btver2_sse_attr" "rcp")
@@ -15396,11 +15398,13 @@
    (set_attr "bdver1_decode" "direct")])
 
 (define_insn "*rsqrtsf2_sse"
-  [(set (match_operand:SF 0 "register_operand" "=x")
-	(unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "xm")]
+  [(set (match_operand:SF 0 "register_operand" "=x,x")
+	(unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "x,m")]
 		   UNSPEC_RSQRT))]
   "TARGET_SSE && TARGET_SSE_MATH"
-  "%vrsqrtss\t{%1, %d0|%d0, %1}"
+  "@
+   %vrsqrtss\t{%d1, %0|%0, %d1}
+   %vrsqrtss\t{%1, %d0|%d0, %1}"
   [(set_attr "type" "sse")
    (set_attr "atom_sse_attr" "rcp")
    (set_attr "btver2_sse_attr" "rcp")
@@ -15418,11 +15422,13 @@
 })
 
 (define_insn "*sqrt<mode>2_sse"
-  [(set (match_operand:MODEF 0 "register_operand" "=v")
+  [(set (match_operand:MODEF 0 "register_operand" "=v,v")
 	(sqrt:MODEF
-	  (match_operand:MODEF 1 "nonimmediate_operand" "vm")))]
+	  (match_operand:MODEF 1 "nonimmediate_operand" "v,m")))]
   "SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH"
-  "%vsqrt<ssemodesuffix>\t{%1, %d0|%d0, %1}"
+  "@
+   %vsqrt<ssemodesuffix>\t{%d1, %0|%0, %d1}
+   %vsqrt<ssemodesuffix>\t{%1, %d0|%d0, %1}"
   [(set_attr "type" "sse")
    (set_attr "atom_sse_attr" "sqrt")
    (set_attr "btver2_sse_attr" "sqrt")

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch, i386] false dependencies fix
  2018-05-02  9:55 [patch, i386] false dependencies fix Nesterovskiy, Alexander
@ 2018-06-28  4:29 ` Jeff Law
  2018-06-28  8:57   ` Nesterovskiy, Alexander
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Law @ 2018-06-28  4:29 UTC (permalink / raw)
  To: Nesterovskiy, Alexander, gcc-patches

On 05/02/2018 03:55 AM, Nesterovskiy, Alexander wrote:
> This patch fixes false dependencies for vmovss, vmovsd, vrcpss, vrsqrtss, vsqrtss and vsqrtsd instructions.
> 
> Tested on x86-64/Linux, no new test fails, some SPEC 2006/2017 performance gains.
> Please let me know if something is wrong here and should be changed.
> 
> --
> Alexander Nesterovskiy
> 
> 
> falsedep.patch
> 
> 
> --- i386.md	(revision 259756)
> +++ i386.md	(working copy)
> @@ -3547,7 +3547,7 @@
>  	{
>  	case MODE_DF:
>  	  if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
> -	    return "vmovsd\t{%1, %0, %0|%0, %0, %1}";
> +	    return "%vmovsd\t{%d1, %0|%0, %d1}";
>  	  return "%vmovsd\t{%1, %0|%0, %1}";
>  
>  	case MODE_V4SF:
> @@ -3748,7 +3748,7 @@
>  	{
>  	case MODE_SF:
>  	  if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
> -	    return "vmovss\t{%1, %0, %0|%0, %0, %1}";
> +	    return "%vmovss\t{%d1, %0|%0, %d1}";
>  	  return "%vmovss\t{%1, %0|%0, %1}";
So what I'm confused about is in the original output template operand 0
is duplicated. In the new template operand 1 is duplicated.

Presumably what you're trying to accomplish is avoiding a false read on
operand 0 (the destination)?  Can you please confirm?

Knowing that should also help me evaluate the changes to recp and rsqrt
since they're being changed to the same style encoding when operating
strictly on registers.


THanks,
jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [patch, i386] false dependencies fix
  2018-06-28  4:29 ` Jeff Law
@ 2018-06-28  8:57   ` Nesterovskiy, Alexander
  0 siblings, 0 replies; 7+ messages in thread
From: Nesterovskiy, Alexander @ 2018-06-28  8:57 UTC (permalink / raw)
  To: Jeff Law, gcc-patches

Hello!

> So what I'm confused about is in the original output template operand 
> 0 is duplicated. In the new template operand 1 is duplicated.
>
> Presumably what you're trying to accomplish is avoiding a false read 
> on operand 0 (the destination)?  Can you please confirm?

> Knowing that should also help me evaluate the changes to recp and 
> rsqrt since they're being changed to the same style encoding when 
> operating strictly on registers.

Yes, it's the same for all instructions in the patch - we're not just avoiding
read but present more possibilities to execute speculatively for CPU here.

The destination depends only on the source after the patch, and (thanks
to CPU register renaming) CPU can successfully execute this instruction
even if some previous instruction with write to the same destination is
not finished currently.

--
Alexander Nesterovskiy

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch, i386] false dependencies fix
  2018-06-28  7:16 Uros Bizjak
  2018-06-28 13:56 ` Jeff Law
@ 2018-06-29  0:04 ` Jeff Law
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff Law @ 2018-06-29  0:04 UTC (permalink / raw)
  To: Uros Bizjak, gcc-patches; +Cc: Nesterovskiy, Alexander

On 06/28/2018 01:16 AM, Uros Bizjak wrote:
> Hello!
> 
>>> --- i386.md (revision 259756)
>>> +++ i386.md (working copy)
>>> @@ -3547,7 +3547,7 @@
>>>   {
>>>   case MODE_DF:
>>>    if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>>> -    return "vmovsd\t{%1, %0, %0|%0, %0, %1}";
>>> +    return "%vmovsd\t{%d1, %0|%0, %d1}";
>>>    return "%vmovsd\t{%1, %0|%0, %1}";
>>>
>>>   case MODE_V4SF:
>>> @@ -3748,7 +3748,7 @@
>>>   {
>>>   case MODE_SF:
>>>    if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>>> -    return "vmovss\t{%1, %0, %0|%0, %0, %1}";
>>> +    return "%vmovss\t{%d1, %0|%0, %d1}";
>>>    return "%vmovss\t{%1, %0|%0, %1}";
>> So what I'm confused about is in the original output template operand 0
>> is duplicated. In the new template operand 1 is duplicated.
>>
>> Presumably what you're trying to accomplish is avoiding a false read on
>> operand 0 (the destination)?  Can you please confirm?
> 
>> Knowing that should also help me evaluate the changes to recp and rsqrt
>> since they're being changed to the same style encoding when operating
>> strictly on registers.
> 
> Please don't change "v" -> "%v" for TARGET_AVX templates. We know that
> in this case, all insn mnemonics are prefixed with "v".
I didn't realize you'd already acked the patch without the v->%v bits
and Sebastian committed the changes about a month ago...

Oh well, at least I know a bit more about the inner workings of the AVX
unit than I did a last week.

Sorry for the noise.

jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch, i386] false dependencies fix
  2018-06-28  7:16 Uros Bizjak
@ 2018-06-28 13:56 ` Jeff Law
  2018-06-29  0:04 ` Jeff Law
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff Law @ 2018-06-28 13:56 UTC (permalink / raw)
  To: Uros Bizjak, gcc-patches; +Cc: Nesterovskiy, Alexander

On 06/28/2018 01:16 AM, Uros Bizjak wrote:
> Hello!
> 
>>> --- i386.md (revision 259756)
>>> +++ i386.md (working copy)
>>> @@ -3547,7 +3547,7 @@
>>>   {
>>>   case MODE_DF:
>>>    if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>>> -    return "vmovsd\t{%1, %0, %0|%0, %0, %1}";
>>> +    return "%vmovsd\t{%d1, %0|%0, %d1}";
>>>    return "%vmovsd\t{%1, %0|%0, %1}";
>>>
>>>   case MODE_V4SF:
>>> @@ -3748,7 +3748,7 @@
>>>   {
>>>   case MODE_SF:
>>>    if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>>> -    return "vmovss\t{%1, %0, %0|%0, %0, %1}";
>>> +    return "%vmovss\t{%d1, %0|%0, %d1}";
>>>    return "%vmovss\t{%1, %0|%0, %1}";
>> So what I'm confused about is in the original output template operand 0
>> is duplicated. In the new template operand 1 is duplicated.
>>
>> Presumably what you're trying to accomplish is avoiding a false read on
>> operand 0 (the destination)?  Can you please confirm?
> 
>> Knowing that should also help me evaluate the changes to recp and rsqrt
>> since they're being changed to the same style encoding when operating
>> strictly on registers.
> 
> Please don't change "v" -> "%v" for TARGET_AVX templates. We know that
> in this case, all insn mnemonics are prefixed with "v".
ACK on that Uros -- I'd convinced myself that v->%v for TARGET_AVX
couldn't hurt anything since  we already had the v prefix in place.  I'm
happy to ensure this follows your preferred convention.

I was mostly trying to make sure I understood the other aspects of the
proposed change.

Cheers,
jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch, i386] false dependencies fix
@ 2018-06-28  7:16 Uros Bizjak
  2018-06-28 13:56 ` Jeff Law
  2018-06-29  0:04 ` Jeff Law
  0 siblings, 2 replies; 7+ messages in thread
From: Uros Bizjak @ 2018-06-28  7:16 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jeff Law, Nesterovskiy, Alexander

Hello!

>> --- i386.md (revision 259756)
>> +++ i386.md (working copy)
>> @@ -3547,7 +3547,7 @@
>>   {
>>   case MODE_DF:
>>    if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>> -    return "vmovsd\t{%1, %0, %0|%0, %0, %1}";
>> +    return "%vmovsd\t{%d1, %0|%0, %d1}";
>>    return "%vmovsd\t{%1, %0|%0, %1}";
>>
>>   case MODE_V4SF:
>> @@ -3748,7 +3748,7 @@
>>   {
>>   case MODE_SF:
>>    if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>> -    return "vmovss\t{%1, %0, %0|%0, %0, %1}";
>> +    return "%vmovss\t{%d1, %0|%0, %d1}";
>>    return "%vmovss\t{%1, %0|%0, %1}";
> So what I'm confused about is in the original output template operand 0
> is duplicated. In the new template operand 1 is duplicated.
>
> Presumably what you're trying to accomplish is avoiding a false read on
> operand 0 (the destination)?  Can you please confirm?

> Knowing that should also help me evaluate the changes to recp and rsqrt
> since they're being changed to the same style encoding when operating
> strictly on registers.

Please don't change "v" -> "%v" for TARGET_AVX templates. We know that
in this case, all insn mnemonics are prefixed with "v".

Uros.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch, i386] false dependencies fix
@ 2018-05-03  9:01 Uros Bizjak
  0 siblings, 0 replies; 7+ messages in thread
From: Uros Bizjak @ 2018-05-03  9:01 UTC (permalink / raw)
  To: gcc-patches; +Cc: alexander.nesterovskiy

> This patch fixes false dependencies for vmovss, vmovsd, vrcpss, vrsqrtss, vsqrtss and vsqrtsd
> instructions.
>
> Tested on x86-64/Linux, no new test fails, some SPEC 2006/2017 performance gains.
> Please let me know if something is wrong here and should be changed.

Your submission needs a ChangeLog entry. Please see [1]

  case MODE_DF:
   if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
-    return "vmovsd\t{%1, %0, %0|%0, %0, %1}";
+    return "%vmovsd\t{%d1, %0|%0, %d1}";
   return "%vmovsd\t{%1, %0|%0, %1}";

No need to change "v" to "%v". We are always in AVX domain here.

Otherwise OK. Please resubmit the patch with the ChangeLog entry.

[1] https://gcc.gnu.org/contribute.html

Thanks,
Uros.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-06-28 23:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-02  9:55 [patch, i386] false dependencies fix Nesterovskiy, Alexander
2018-06-28  4:29 ` Jeff Law
2018-06-28  8:57   ` Nesterovskiy, Alexander
2018-05-03  9:01 Uros Bizjak
2018-06-28  7:16 Uros Bizjak
2018-06-28 13:56 ` Jeff Law
2018-06-29  0:04 ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).