public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Fix for mov<mode>_internal pattern
@ 2013-03-21 16:37 Michael Zolotukhin
  2013-03-21 17:27 ` H.J. Lu
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Zolotukhin @ 2013-03-21 16:37 UTC (permalink / raw)
  To: gcc-patches; +Cc: Uros Bizjak

[-- Attachment #1: Type: text/plain, Size: 1091 bytes --]

Hi,
I've found a little bit strange code in "mov<mode>_internal"
RTL-pattern from config/i386/sse.md:
        case MODE_V2DF:
         if (TARGET_AVX
             && (misaligned_operand (operands[0], <MODE>mode)
                 || misaligned_operand (operands[1], <MODE>mode)))
           return "vmovupd\t{%1, %0|%0, %1}";
          else
            return "%vmovapd\t{%1, %0|%0, %1}";

That could lead to generation of vmovapd for misaligned operands. Was
there any special intention behind that, or is that just a bug?
Attached is a patch that removes TARGET_AVX from the condition, thus
preventing generation of movapd for misaligned operands and enabling
generation of movupd for them.

If the patch is ok, could anyone please commit it? Of course, that
could wait until 4.8 is released.

Bootstrapped and tested on x86_64-unknown-linux-gnu and i686-linux.

Changelog:
2013-03-21  Michael Zolotukhin  <michael.v.zolotukhin@gmail.com>

        * config/i386/sse.md (*mov<mode>_internal): Fix condition.

-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

[-- Attachment #2: mov_mode_internal.patch --]
[-- Type: application/octet-stream, Size: 1410 bytes --]

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index e2bb68b..b454ad5 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -445,28 +445,25 @@
 	{
 	case MODE_V8SF:
 	case MODE_V4SF:
-	  if (TARGET_AVX
-	      && (misaligned_operand (operands[0], <MODE>mode)
-		  || misaligned_operand (operands[1], <MODE>mode)))
-	    return "vmovups\t{%1, %0|%0, %1}";
+	  if (misaligned_operand (operands[0], <MODE>mode)
+	      || misaligned_operand (operands[1], <MODE>mode))
+	    return "%vmovups\t{%1, %0|%0, %1}";
 	  else
 	    return "%vmovaps\t{%1, %0|%0, %1}";
 
 	case MODE_V4DF:
 	case MODE_V2DF:
-	  if (TARGET_AVX
-	      && (misaligned_operand (operands[0], <MODE>mode)
-		  || misaligned_operand (operands[1], <MODE>mode)))
-	    return "vmovupd\t{%1, %0|%0, %1}";
+	  if (misaligned_operand (operands[0], <MODE>mode)
+	      || misaligned_operand (operands[1], <MODE>mode))
+	    return "%vmovupd\t{%1, %0|%0, %1}";
 	  else
 	    return "%vmovapd\t{%1, %0|%0, %1}";
 
 	case MODE_OI:
 	case MODE_TI:
-	  if (TARGET_AVX
-	      && (misaligned_operand (operands[0], <MODE>mode)
-		  || misaligned_operand (operands[1], <MODE>mode)))
-	    return "vmovdqu\t{%1, %0|%0, %1}";
+	  if (misaligned_operand (operands[0], <MODE>mode)
+	      || misaligned_operand (operands[1], <MODE>mode))
+	    return "%vmovdqu\t{%1, %0|%0, %1}";
 	  else
 	    return "%vmovdqa\t{%1, %0|%0, %1}";
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix for mov<mode>_internal pattern
  2013-03-21 16:37 [PATCH] Fix for mov<mode>_internal pattern Michael Zolotukhin
@ 2013-03-21 17:27 ` H.J. Lu
  2013-03-22 12:50   ` Michael Zolotukhin
  0 siblings, 1 reply; 7+ messages in thread
From: H.J. Lu @ 2013-03-21 17:27 UTC (permalink / raw)
  To: Michael Zolotukhin; +Cc: gcc-patches, Uros Bizjak

On Thu, Mar 21, 2013 at 9:36 AM, Michael Zolotukhin
<michael.v.zolotukhin@gmail.com> wrote:
> Hi,
> I've found a little bit strange code in "mov<mode>_internal"
> RTL-pattern from config/i386/sse.md:
>         case MODE_V2DF:
>          if (TARGET_AVX
>              && (misaligned_operand (operands[0], <MODE>mode)
>                  || misaligned_operand (operands[1], <MODE>mode)))
>            return "vmovupd\t{%1, %0|%0, %1}";
>           else
>             return "%vmovapd\t{%1, %0|%0, %1}";
>
> That could lead to generation of vmovapd for misaligned operands. Was
> there any special intention behind that, or is that just a bug?
> Attached is a patch that removes TARGET_AVX from the condition, thus
> preventing generation of movapd for misaligned operands and enabling
> generation of movupd for them.
>
> If the patch is ok, could anyone please commit it? Of course, that
> could wait until 4.8 is released.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu and i686-linux.
>
> Changelog:
> 2013-03-21  Michael Zolotukhin  <michael.v.zolotukhin@gmail.com>
>
>         * config/i386/sse.md (*mov<mode>_internal): Fix condition.
>
> --

Do you have a testcase to show there is a problem?
The misaligned case should only be generated from
ix86_avx256_split_vector_move_misalign.


-- 
H.J.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix for mov<mode>_internal pattern
  2013-03-21 17:27 ` H.J. Lu
@ 2013-03-22 12:50   ` Michael Zolotukhin
  2013-03-22 16:30     ` H.J. Lu
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Zolotukhin @ 2013-03-22 12:50 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc-patches, Uros Bizjak

> Do you have a testcase to show there is a problem?
> The misaligned case should only be generated from
> ix86_avx256_split_vector_move_misalign.
I faced it when playing with optimized expanding of memmov (with
vector instructions). There I generated v2di-move with emit_strmov,
which later used "*movv2di_internal" pattern, and there aligned
version was chosen because (TARGET_AVX) was false. Probably, that's
not how emit_strmov should've been used, but then it'd be cool to have
an assert or an additional check somewhere, which would tell me, that
I'm using some function in a wrong way.
So, I guess it's true that in trunk everything is ok here and the
misaligned case could only come from
ix86_avx256_split_vector_move_misalign - nevertheless, this place
seems to me a bit untidy.


On 21 March 2013 21:27, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Mar 21, 2013 at 9:36 AM, Michael Zolotukhin
> <michael.v.zolotukhin@gmail.com> wrote:
>> Hi,
>> I've found a little bit strange code in "mov<mode>_internal"
>> RTL-pattern from config/i386/sse.md:
>>         case MODE_V2DF:
>>          if (TARGET_AVX
>>              && (misaligned_operand (operands[0], <MODE>mode)
>>                  || misaligned_operand (operands[1], <MODE>mode)))
>>            return "vmovupd\t{%1, %0|%0, %1}";
>>           else
>>             return "%vmovapd\t{%1, %0|%0, %1}";
>>
>> That could lead to generation of vmovapd for misaligned operands. Was
>> there any special intention behind that, or is that just a bug?
>> Attached is a patch that removes TARGET_AVX from the condition, thus
>> preventing generation of movapd for misaligned operands and enabling
>> generation of movupd for them.
>>
>> If the patch is ok, could anyone please commit it? Of course, that
>> could wait until 4.8 is released.
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu and i686-linux.
>>
>> Changelog:
>> 2013-03-21  Michael Zolotukhin  <michael.v.zolotukhin@gmail.com>
>>
>>         * config/i386/sse.md (*mov<mode>_internal): Fix condition.
>>
>> --
>
> Do you have a testcase to show there is a problem?
> The misaligned case should only be generated from
> ix86_avx256_split_vector_move_misalign.
>
>
> --
> H.J.



-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix for mov<mode>_internal pattern
  2013-03-22 12:50   ` Michael Zolotukhin
@ 2013-03-22 16:30     ` H.J. Lu
  2013-03-22 16:58       ` Michael Zolotukhin
  0 siblings, 1 reply; 7+ messages in thread
From: H.J. Lu @ 2013-03-22 16:30 UTC (permalink / raw)
  To: Michael Zolotukhin; +Cc: gcc-patches, Uros Bizjak

On Fri, Mar 22, 2013 at 5:49 AM, Michael Zolotukhin
<michael.v.zolotukhin@gmail.com> wrote:
>> Do you have a testcase to show there is a problem?
>> The misaligned case should only be generated from
>> ix86_avx256_split_vector_move_misalign.
> I faced it when playing with optimized expanding of memmov (with
> vector instructions). There I generated v2di-move with emit_strmov,

You can't use aligned vector move if alignment is unknown or
not properly aligned.

> which later used "*movv2di_internal" pattern, and there aligned
> version was chosen because (TARGET_AVX) was false. Probably, that's
> not how emit_strmov should've been used, but then it'd be cool to have
> an assert or an additional check somewhere, which would tell me, that
> I'm using some function in a wrong way.
> So, I guess it's true that in trunk everything is ok here and the
> misaligned case could only come from
> ix86_avx256_split_vector_move_misalign - nevertheless, this place
> seems to me a bit untidy.
>
>

In ix86_expand_vector_move_misalign, we generate different SSE
unaligned load/store, depending on optimization option.  For AVX,
we just use 256-bit unaligned load/store or split it to 128-bit normal
load/store since there is no performance difference between
normal load/store vs unaligned load/store on aligned memory.


-- 
H.J.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix for mov<mode>_internal pattern
  2013-03-22 16:30     ` H.J. Lu
@ 2013-03-22 16:58       ` Michael Zolotukhin
  2013-03-22 19:48         ` H.J. Lu
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Zolotukhin @ 2013-03-22 16:58 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc-patches, Uros Bizjak

> You can't use aligned vector move if alignment is unknown or
> not properly aligned.
Yes, sure. But I just emit V2DI move for an operands, which are
aligned to 64-bit (not 128-bit!) - and it's encoded into movdqa which
is obviously incorrect. The compiler knows, that operands are
misaligned, but the check is (TARGET_AVX && (misaligned_operand(op0)
|| misaligned_operand(op1)) - in case when TARGET_AVX is false it
doesn't matter whether operands are misaligned or not (in this check)
- the compiler will always emit movdqa. I understand that this check
wasn't used for such cases before and probably isn't intended for
that. But in this case, it needs some guarding checks I guess.

On 22 March 2013 20:30, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Fri, Mar 22, 2013 at 5:49 AM, Michael Zolotukhin
> <michael.v.zolotukhin@gmail.com> wrote:
>>> Do you have a testcase to show there is a problem?
>>> The misaligned case should only be generated from
>>> ix86_avx256_split_vector_move_misalign.
>> I faced it when playing with optimized expanding of memmov (with
>> vector instructions). There I generated v2di-move with emit_strmov,
>
> You can't use aligned vector move if alignment is unknown or
> not properly aligned.
>
>> which later used "*movv2di_internal" pattern, and there aligned
>> version was chosen because (TARGET_AVX) was false. Probably, that's
>> not how emit_strmov should've been used, but then it'd be cool to have
>> an assert or an additional check somewhere, which would tell me, that
>> I'm using some function in a wrong way.
>> So, I guess it's true that in trunk everything is ok here and the
>> misaligned case could only come from
>> ix86_avx256_split_vector_move_misalign - nevertheless, this place
>> seems to me a bit untidy.
>>
>>
>
> In ix86_expand_vector_move_misalign, we generate different SSE
> unaligned load/store, depending on optimization option.  For AVX,
> we just use 256-bit unaligned load/store or split it to 128-bit normal
> load/store since there is no performance difference between
> normal load/store vs unaligned load/store on aligned memory.
>
>
> --
> H.J.



-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix for mov<mode>_internal pattern
  2013-03-22 16:58       ` Michael Zolotukhin
@ 2013-03-22 19:48         ` H.J. Lu
  2013-03-23  8:26           ` Michael Zolotukhin
  0 siblings, 1 reply; 7+ messages in thread
From: H.J. Lu @ 2013-03-22 19:48 UTC (permalink / raw)
  To: Michael Zolotukhin; +Cc: gcc-patches, Uros Bizjak

On Fri, Mar 22, 2013 at 9:58 AM, Michael Zolotukhin
<michael.v.zolotukhin@gmail.com> wrote:
>> You can't use aligned vector move if alignment is unknown or
>> not properly aligned.
> Yes, sure. But I just emit V2DI move for an operands, which are
> aligned to 64-bit (not 128-bit!) - and it's encoded into movdqa which

Can you try gen_movv2di for V2DI move?

-- 
H.J.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix for mov<mode>_internal pattern
  2013-03-22 19:48         ` H.J. Lu
@ 2013-03-23  8:26           ` Michael Zolotukhin
  0 siblings, 0 replies; 7+ messages in thread
From: Michael Zolotukhin @ 2013-03-23  8:26 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc-patches, Uros Bizjak

Thanks for the hints!
gen_movv2di seems to handle the situation well: it calls
ix86_expand_vector_move which honors alignment and emits correct code.
However, it was very convenient to use gen_strmov for moves in any
mode - I just needed to define the biggest mode with which a piece of
memory could be copied. Probably, I'll need to implement such
'universal'-move function and stop using gen_strmov, that should solve
the problem.

On 22 March 2013 23:47, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Fri, Mar 22, 2013 at 9:58 AM, Michael Zolotukhin
> <michael.v.zolotukhin@gmail.com> wrote:
>>> You can't use aligned vector move if alignment is unknown or
>>> not properly aligned.
>> Yes, sure. But I just emit V2DI move for an operands, which are
>> aligned to 64-bit (not 128-bit!) - and it's encoded into movdqa which
>
> Can you try gen_movv2di for V2DI move?
>
> --
> H.J.



-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-03-23  8:26 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-21 16:37 [PATCH] Fix for mov<mode>_internal pattern Michael Zolotukhin
2013-03-21 17:27 ` H.J. Lu
2013-03-22 12:50   ` Michael Zolotukhin
2013-03-22 16:30     ` H.J. Lu
2013-03-22 16:58       ` Michael Zolotukhin
2013-03-22 19:48         ` H.J. Lu
2013-03-23  8:26           ` Michael Zolotukhin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).