public inbox for gcc-bugs@sourceware.org
* [Bug target/32951]  New: missed memcpy -> movdqa optimization.
@ 2007-07-31 18:40 pluto at agmk dot net
  2007-07-31 18:46 ` [Bug target/32951] " pinskia at gcc dot gnu dot org
                   ` (8 more replies)
  0 siblings, 9 replies; 11+ messages in thread
From: pluto at agmk dot net @ 2007-07-31 18:40 UTC (permalink / raw)
  To: gcc-bugs

#include <emmintrin.h>

typedef char const* __attribute__((aligned(16))) aligned_byte_buffer;

__m128i load_1( aligned_byte_buffer buf )
{
        return *((__m128i*)buf);
}

__m128i load_2( aligned_byte_buffer buf )
{
        __m128i m;
        __builtin_memcpy( &m, buf, sizeof( m ) );
        return m;
}

gcc-4.2.1 produces suboptimal code for load_2:

load_1:
        movdqa  (%rdi), %xmm0
        ret

load_2:
        movq    (%rdi), %rax
        movq    %rax, -24(%rsp)
        movq    8(%rdi), %rax
        movq    %rax, -16(%rsp)
        movdqa  -24(%rsp), %xmm0
        ret


-- 
           Summary: missed memcpy -> movdqa optimization.
           Product: gcc
           Version: 4.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: pluto at agmk dot net
GCC target triplet: x86_64-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32951



* [Bug target/32951] missed memcpy -> movdqa optimization.
  2007-07-31 18:40 [Bug target/32951] New: missed memcpy -> movdqa optimization pluto at agmk dot net
@ 2007-07-31 18:46 ` pinskia at gcc dot gnu dot org
  2007-08-06 12:42 ` pluto at agmk dot net
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2007-07-31 18:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from pinskia at gcc dot gnu dot org  2007-07-31 18:46 -------
(insn 7 6 8 t.c:15 (set (reg:DI 61)
        (mem:DI (reg/v/f:DI 59 [ buf ]) [0 S8 A8])) -1 (nil))


See the A8: the access is only assumed to be 8-bit (1-byte) aligned.

So the aligned attribute is not applying where you think it should.
This is how you get the correct aligned attribute:
typedef char a __attribute__((aligned(16)));

typedef a const* aligned_byte_buffer;

Even after that, memcpy is still not using the vector registers:
(insn 7 6 8 t.c:17 (set (reg:DI 61)
        (mem:DI (reg/v/f:DI 59 [ buf ]) [0 S8 A128])) -1 (nil))

(insn 8 7 9 t.c:17 (set (mem/c/i:DI (reg:DI 60) [0 m+0 S8 A128])
        (reg:DI 61)) -1 (nil))

(insn 9 8 10 t.c:17 (set (reg:DI 62)
        (mem:DI (plus:DI (reg/v/f:DI 59 [ buf ])
                (const_int 8 [0x8])) [0 S8 A64])) -1 (nil))

(insn 10 9 0 t.c:17 (set (mem/c/i:DI (plus:DI (reg:DI 60)
                (const_int 8 [0x8])) [0 m+8 S8 A64])
        (reg:DI 62)) -1 (nil))
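
The placement difference described in this comment can also be checked without reading RTL. The following is only a minimal sketch, assuming GCC or Clang (it relies on the `__alignof__` and `__typeof__` extensions). The names `report_buffer`, `a16`, `fixed_buffer`, and both helper functions are hypothetical and introduced purely for illustration:

```c
#include <stddef.h>

/* Typedef as written in the original report: the attribute follows the '*',
   so it does not describe the bytes being pointed to. */
typedef char const* __attribute__((aligned(16))) report_buffer;

/* Typedefs as suggested in comment #1: the attribute is on the element type. */
typedef char a16 __attribute__((aligned(16)));
typedef a16 const* fixed_buffer;

/* Alignment, in bytes, that the compiler may assume for the pointed-to data
   when going through the report's typedef. */
size_t report_pointee_alignment(void)
{
    return __alignof__(__typeof__(*(report_buffer)0));
}

/* Same query for the corrected typedef. */
size_t fixed_pointee_alignment(void)
{
    return __alignof__(__typeof__(*(fixed_buffer)0));
}
```

The first helper should report 1 and the second 16, which lines up with the A8 and A128 annotations in the dumps (RTL records alignment in bits).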


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2007-07-31 18:46:45
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32951



* [Bug target/32951] missed memcpy -> movdqa optimization.
  2007-07-31 18:40 [Bug target/32951] New: missed memcpy -> movdqa optimization pluto at agmk dot net
  2007-07-31 18:46 ` [Bug target/32951] " pinskia at gcc dot gnu dot org
@ 2007-08-06 12:42 ` pluto at agmk dot net
  2007-08-06 12:43   ` Andrew Pinski
  2007-08-06 12:43 ` pinskia at gmail dot com
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 11+ messages in thread
From: pluto at agmk dot net @ 2007-08-06 12:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from pluto at agmk dot net  2007-08-06 12:42 -------
thanks for the explanation about the aligned attribute.

moreover i'm wondering why gcc uses movdqa for unaligned loads?
it should use movdqu for *((__m128i*)ptr) and _mm_set_epi8(ptr[15],...,ptr[0]).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32951




* [Bug target/32951] missed memcpy -> movdqa optimization.
  2007-07-31 18:40 [Bug target/32951] New: missed memcpy -> movdqa optimization pluto at agmk dot net
  2007-07-31 18:46 ` [Bug target/32951] " pinskia at gcc dot gnu dot org
  2007-08-06 12:42 ` pluto at agmk dot net
@ 2007-08-06 12:43 ` pinskia at gmail dot com
  2007-08-06 12:56 ` pluto at agmk dot net
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: pinskia at gmail dot com @ 2007-08-06 12:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from pinskia at gmail dot com  2007-08-06 12:43 -------
Subject: Re:  missed memcpy -> movdqa optimization.

On 6 Aug 2007 12:42:18 -0000, pluto at agmk dot net
<gcc-bugzilla@gcc.gnu.org> wrote:
> moreover i'm wondering why gcc uses movdqa for unaligned loads?

Because __m128i is assumed to be aligned.
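
To illustrate the point: a plain dereference of a `__m128i*` tells the compiler that the address meets the alignment of `__m128i`, which is why movdqa is a valid choice there. A minimal sketch of a check a caller could make first, assuming x86 with SSE2 and a GCC-compatible compiler; `is_m128i_aligned` is a hypothetical helper, not part of <emmintrin.h>:

```c
#include <emmintrin.h>
#include <stdint.h>

/* Returns nonzero when p meets the alignment that a plain *(__m128i*)p
   dereference (and therefore movdqa) relies on. */
int is_m128i_aligned(const void *p)
{
    return ((uintptr_t)p % __alignof__(__m128i)) == 0;
}
```

If the check fails, the pointer should not be dereferenced as a `__m128i*`; an explicitly unaligned load is needed instead.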

-- Pinski


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32951



* [Bug target/32951] missed memcpy -> movdqa optimization.
  2007-07-31 18:40 [Bug target/32951] New: missed memcpy -> movdqa optimization pluto at agmk dot net
                   ` (2 preceding siblings ...)
  2007-08-06 12:43 ` pinskia at gmail dot com
@ 2007-08-06 12:56 ` pluto at agmk dot net
  2008-03-31 15:58 ` hjl dot tools at gmail dot com
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: pluto at agmk dot net @ 2007-08-06 12:56 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from pluto at agmk dot net  2007-08-06 12:56 -------
(In reply to comment #3)
> Subject: Re:  missed memcpy -> movdqa optimization.
> 
> On 6 Aug 2007 12:42:18 -0000, pluto at agmk dot net
> <gcc-bugzilla@gcc.gnu.org> wrote:
> > moreover i'm wondering why gcc uses movdqa for unaligned loads?
> 
> Because __m128i is assumed to be aligned.

so, as far as i can see, there's currently no way to generate a movdqu load
with <emmintrin.h> and __builtin_memcpy( &__m128i, unaligned_ptr, 16 ),
and i must write a short piece of inline assembly for this purpose.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32951



* [Bug target/32951] missed memcpy -> movdqa optimization.
  2007-07-31 18:40 [Bug target/32951] New: missed memcpy -> movdqa optimization pluto at agmk dot net
                   ` (3 preceding siblings ...)
  2007-08-06 12:56 ` pluto at agmk dot net
@ 2008-03-31 15:58 ` hjl dot tools at gmail dot com
  2008-05-29  4:46 ` hjl dot tools at gmail dot com
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: hjl dot tools at gmail dot com @ 2008-03-31 15:58 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from hjl dot tools at gmail dot com  2008-03-31 15:57 -------
Can you use _mm_loadu_si128 instead?

[hjl@gnu-6 tmp]$ cat v.c
#include <emmintrin.h>

__m128i load1( char const* buf )
{
  return _mm_loadu_si128 ((__m128i const *) buf);
}

__m128i load2( char const* buf )
{
  return _mm_load_si128 ((__m128i const *) buf);
}
[hjl@gnu-6 tmp]$ /usr/gcc-4.4/bin/gcc -O2 v.c -S 
[hjl@gnu-6 tmp]$ cat v.s
        .file   "v.c"
        .text
        .p2align 4,,15
.globl load2
        .type   load2, @function
load2:
.LFB519:
        movdqa  (%rdi), %xmm0
        ret
.LFE519:
        .size   load2, .-load2
        .p2align 4,,15
.globl load1
        .type   load1, @function
load1:
.LFB518:
        movdqu  (%rdi), %xmm0
        ret
.LFE518:
        .size   load1, .-load1
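
A sketch of how the unaligned intrinsic can be exercised against a memcpy-based load from the same misaligned address, assuming x86 with SSE2. The helpers `load_unaligned`, `load_via_memcpy`, and `same_bytes` are hypothetical names used only for this illustration:

```c
#include <emmintrin.h>
#include <string.h>

/* Unaligned 16-byte load through the intrinsic used in load1 above. */
__m128i load_unaligned(const char *p)
{
    return _mm_loadu_si128((const __m128i *)p);
}

/* The same load written with memcpy, which makes no alignment assumption
   about p. */
__m128i load_via_memcpy(const char *p)
{
    __m128i m;
    memcpy(&m, p, sizeof m);
    return m;
}

/* Returns 1 when all 16 bytes of a and b are equal. */
int same_bytes(__m128i a, __m128i b)
{
    return _mm_movemask_epi8(_mm_cmpeq_epi8(a, b)) == 0xFFFF;
}
```

Both loaders should return the same 16 bytes for any readable address, aligned or not; which instruction the memcpy variant compiles to depends on the compiler version, as the rest of this thread shows.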


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32951



* [Bug target/32951] missed memcpy -> movdqa optimization.
  2007-07-31 18:40 [Bug target/32951] New: missed memcpy -> movdqa optimization pluto at agmk dot net
                   ` (4 preceding siblings ...)
  2008-03-31 15:58 ` hjl dot tools at gmail dot com
@ 2008-05-29  4:46 ` hjl dot tools at gmail dot com
  2010-03-09 19:34 ` pluto at agmk dot net
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: hjl dot tools at gmail dot com @ 2008-05-29  4:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from hjl dot tools at gmail dot com  2008-05-29 04:45 -------
Type alignment is ignored by the call expander. See PR 35771.


-- 

hjl dot tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|                            |35771
Bug 32951 depends on bug 35771, which changed state.

Bug 35771 Summary: Call expander ignores type alignment
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35771

           What    |Old Value                   |New Value
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|FIXED                       |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32951



* [Bug target/32951] missed memcpy -> movdqa optimization.
  2007-07-31 18:40 [Bug target/32951] New: missed memcpy -> movdqa optimization pluto at agmk dot net
                   ` (5 preceding siblings ...)
  2008-05-29  4:46 ` hjl dot tools at gmail dot com
@ 2010-03-09 19:34 ` pluto at agmk dot net
  2010-03-10 10:52 ` rguenth at gcc dot gnu dot org
  2010-03-10 10:54 ` rguenth at gcc dot gnu dot org
  8 siblings, 0 replies; 11+ messages in thread
From: pluto at agmk dot net @ 2010-03-09 19:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from pluto at agmk dot net  2010-03-09 19:34 -------
current 4.4.x generates 'movdqa (%rdi), %xmm0' in both cases.
4.2 branch is closed, 4.3 is near to close.
can we close this bug as fixed?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32951



* [Bug target/32951] missed memcpy -> movdqa optimization.
  2007-07-31 18:40 [Bug target/32951] New: missed memcpy -> movdqa optimization pluto at agmk dot net
                   ` (6 preceding siblings ...)
  2010-03-09 19:34 ` pluto at agmk dot net
@ 2010-03-10 10:52 ` rguenth at gcc dot gnu dot org
  2010-03-10 10:54 ` rguenth at gcc dot gnu dot org
  8 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-03-10 10:52 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from rguenth at gcc dot gnu dot org  2010-03-10 10:52 -------
(In reply to comment #7)
> current 4.4.x generates 'movdqa (%rdi), %xmm0' in both cases.
> 4.2 branch is closed, 4.3 is near to close.
> can we close this bug as fixed?

For me, GCC 4.4 generates movdqu (%rdi), %xmm0 for load_2, so no (4.5 does the same).


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |4.4.3 4.5.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32951



* [Bug target/32951] missed memcpy -> movdqa optimization.
  2007-07-31 18:40 [Bug target/32951] New: missed memcpy -> movdqa optimization pluto at agmk dot net
                   ` (7 preceding siblings ...)
  2010-03-10 10:52 ` rguenth at gcc dot gnu dot org
@ 2010-03-10 10:54 ` rguenth at gcc dot gnu dot org
  8 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-03-10 10:54 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from rguenth at gcc dot gnu dot org  2010-03-10 10:54 -------
Actually, it does generate movdqa with the fixed aligned attribute.  Fixed.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
      Known to fail|4.4.3 4.5.0                 |
      Known to work|                            |4.4.3 4.5.0
         Resolution|                            |FIXED
   Target Milestone|---                         |4.4.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32951


