public inbox for gcc-bugs@sourceware.org
* [Bug target/33103]  New: Redundant multiplications for memset
@ 2007-08-18  5:55 guillaume dot melquiond at ens-lyon dot fr
  2007-08-18  9:03 ` [Bug middle-end/33103] " pinskia at gcc dot gnu dot org
  0 siblings, 1 reply; 4+ messages in thread
From: guillaume dot melquiond at ens-lyon dot fr @ 2007-08-18  5:55 UTC
  To: gcc-bugs

This report was prompted by a message on the LKML that suggested
hand-crafting memset: http://lkml.org/lkml/2007/8/17/309 . I wondered whether
the code generated for __builtin_memset was any good and could be used instead
of hand-crafted code. I tested with (Debian) GCC 3.4.6, 4.1.3, 4.2.1, and a
snapshot of GCC 4.3. The results are all similar, so I only show them for
GCC 4.2 on x86-64. Everything was compiled with -O3.

First, the __builtin_memset code:

  void fill1(char *s, int a)
  {
    __builtin_memset(s, a, 15);
  }

GCC generates:

   0:   40 0f b6 c6             movzbl %sil,%eax
   4:   48 ba 01 01 01 01 01    mov    $0x101010101010101,%rdx
   b:   01 01 01 
   e:   40 0f b6 ce             movzbl %sil,%ecx
  12:   48 0f af c2             imul   %rdx,%rax
  16:   40 88 77 0e             mov    %sil,0xe(%rdi)
  1a:   48 89 07                mov    %rax,(%rdi)
  1d:   40 0f b6 c6             movzbl %sil,%eax
  21:   69 c0 01 01 01 01       imul   $0x1010101,%eax,%eax
  27:   89 47 08                mov    %eax,0x8(%rdi)
  2a:   89 c8                   mov    %ecx,%eax
  2c:   c1 e0 08                shl    $0x8,%eax
  2f:   01 c8                   add    %ecx,%eax
  31:   66 89 47 0c             mov    %ax,0xc(%rdi)
  35:   c3                      retq   

Notice that GCC first computes %sil * 0x0101010101010101 and puts it into
%rax. It then computes %sil * 0x01010101 and puts it into %eax (where that
value already was, as the low half of the previous product), and finally
computes %sil * 0x0101 with a shift and an add and puts it into %ax (where it
already was, again).
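
The redundancy can be checked directly in C: the 4-byte and 2-byte fill
patterns are simply the low bits of the 8-byte one. The following is only a
sketch; the names splat64, splat32, splat16 and check_narrow_splats are made up
for this illustration and do not correspond to anything inside GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: replicate one byte across 8, 4 and 2 bytes. */
static uint64_t splat64(unsigned char b)
{
    return b * UINT64_C(0x0101010101010101);
}

static uint32_t splat32(unsigned char b)
{
    return b * UINT32_C(0x01010101);
}

static uint16_t splat16(unsigned char b)
{
    return (uint16_t)(b * 0x0101u);
}

/* For every byte value, the narrow patterns equal the truncated wide
   pattern, so one multiplication is enough for all three stores. */
static void check_narrow_splats(void)
{
    for (unsigned v = 0; v < 256; v++) {
        uint64_t wide = splat64((unsigned char)v);
        assert((uint32_t)wide == splat32((unsigned char)v));
        assert((uint16_t)wide == splat16((unsigned char)v));
    }
}
```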

Second, some code where multiplication results are reused:

  void fill2(char *s, int a)
  {
    unsigned long long int v = (unsigned char)a * 0x0101010101010101ull;
    *(unsigned long long int *)s = v;
    *(unsigned *)(s + 8) = v;
    *(unsigned short *)(s + 12) = v;
    *(s + 14) = v;
  }

GCC generates:

   0:   40 0f b6 f6             movzbl %sil,%esi
   4:   48 b8 01 01 01 01 01    mov    $0x101010101010101,%rax
   b:   01 01 01 
   e:   48 0f af f0             imul   %rax,%rsi
  12:   48 89 37                mov    %rsi,(%rdi)
  15:   89 77 08                mov    %esi,0x8(%rdi)
  18:   66 89 77 0c             mov    %si,0xc(%rdi)
  1c:   40 88 77 0e             mov    %sil,0xe(%rdi)
  20:   c3                      retq   

The function is 21 bytes smaller (33 bytes instead of 54, about 40% less), it
does not need the two additional registers (%rcx and %rdx), and it will not be
slower.
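
To compare such a hand-written fill against memset, something like the sketch
below could be used. It is an illustration, not the code from this report: the
names fill15_reuse and check_fill15 are invented, and memcpy replaces the
pointer casts so that the stores do not depend on alignment or on aliasing
assumptions. The last store goes to offset 14, the final byte that
memset(s, a, 15) writes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical variant: one multiplication, then four stores covering
   bytes 0-7, 8-11, 12-13 and 14. */
static void fill15_reuse(char *s, int a)
{
    uint64_t v = (unsigned char)a * UINT64_C(0x0101010101010101);
    uint32_t v32 = (uint32_t)v;   /* low 4 bytes of the pattern */
    uint16_t v16 = (uint16_t)v;   /* low 2 bytes of the pattern */

    memcpy(s, &v, sizeof v);
    memcpy(s + 8, &v32, sizeof v32);
    memcpy(s + 12, &v16, sizeof v16);
    ((unsigned char *)s)[14] = (unsigned char)v;
}

/* Compare with memset on a 16-byte buffer; byte 15 is a guard that
   neither function may touch. */
static void check_fill15(int a)
{
    char got[16], want[16];

    memset(got, 0x55, sizeof got);
    memset(want, 0x55, sizeof want);
    fill15_reuse(got, a);
    memset(want, a, 15);
    assert(memcmp(got, want, sizeof got) == 0);
}
```

Because every byte of the pattern is identical, the result does not depend on
the byte order of the machine.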

The same issue arises on 32-bit x86. The hand-written code (using 32-bit
integers this time) is 14 bytes smaller for memset(,,15).
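
The 32-bit hand-written code is not shown in this report. A guess at its
general shape, under the assumption that it reuses a single 32-bit product for
all stores, might look like the following; fill15_u32 and check_fill15_u32 are
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed shape of a 32-bit variant: one multiplication, three 4-byte
   stores, one 2-byte store and one 1-byte store (15 bytes in total). */
static void fill15_u32(char *s, int a)
{
    uint32_t v = (unsigned char)a * UINT32_C(0x01010101);
    uint16_t v16 = (uint16_t)v;   /* low 2 bytes of the pattern */

    memcpy(s, &v, sizeof v);
    memcpy(s + 4, &v, sizeof v);
    memcpy(s + 8, &v, sizeof v);
    memcpy(s + 12, &v16, sizeof v16);
    ((unsigned char *)s)[14] = (unsigned char)v;
}

/* Compare with memset; byte 15 is a guard that must stay untouched. */
static void check_fill15_u32(int a)
{
    char got[16], want[16];

    memset(got, 0x55, sizeof got);
    memset(want, 0x55, sizeof want);
    fill15_u32(got, a);
    memset(want, a, 15);
    assert(memcmp(got, want, sizeof got) == 0);
}
```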


-- 
           Summary: Redundant multiplications for memset
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: guillaume dot melquiond at ens-lyon dot fr
GCC target triplet: x86_64-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33103


Thread overview: 4+ messages
     [not found] <bug-33103-4@http.gcc.gnu.org/bugzilla/>
2012-06-06 10:12 ` [Bug middle-end/33103] Redundant multiplications for memset rguenth at gcc dot gnu.org
2021-08-22  0:06 ` pinskia at gcc dot gnu.org
2021-08-22  1:26 ` hjl.tools at gmail dot com
2007-08-18  5:55 [Bug target/33103] New: " guillaume dot melquiond at ens-lyon dot fr
2007-08-18  9:03 ` [Bug middle-end/33103] " pinskia at gcc dot gnu dot org
