public inbox for gcc-bugs@sourceware.org
* [Bug target/98442] New: [X86] suboptimal for memset with CLEAR_BY_PIECES
@ 2020-12-25  1:38 crazylht at gmail dot com
  2020-12-31  3:38 ` [Bug target/98442] " crazylht at gmail dot com
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: crazylht at gmail dot com @ 2020-12-25  1:38 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98442

            Bug ID: 98442
           Summary: [X86] suboptimal for memset with CLEAR_BY_PIECES
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
                CC: hjl.tools at gmail dot com, wei3.xiao at intel dot com,
                    wwwhhhyyy333 at gmail dot com
  Target Milestone: ---
            Target: x86_64-*-* i?86-*-*

cat test.c

--------
char Tab[64];
void foo(int n)
{
    for (int i = 0; i != 64; i++)
        Tab[i] = 0;
}
----
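
For reference, the loop above is the kind of pattern GCC can recognize as a memset, which is why the summary mentions memset. A minimal sketch of the two equivalent forms follows; the names foo_loop, foo_memset and tab_is_zero are hypothetical and are not part of the original test case.

```c
#include <stddef.h>
#include <string.h>

char Tab[64];

/* Loop form, as in the test case of this report. */
void foo_loop(void)
{
    for (int i = 0; i != 64; i++)
        Tab[i] = 0;
}

/* Explicit memset form (hypothetical name); same effect on Tab. */
void foo_memset(void)
{
    memset(Tab, 0, sizeof Tab);
}

/* Hypothetical helper: returns 1 if every byte of Tab is zero. */
int tab_is_zero(void)
{
    for (size_t i = 0; i < sizeof Tab; i++)
        if (Tab[i] != 0)
            return 0;
    return 1;
}
```

Both functions leave Tab fully zeroed; whether the compiler expands them identically depends on the optimization level and target options.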


GCC generates:

------
foo(int):
  vpxor xmm0, xmm0, xmm0
  vmovdqa XMMWORD PTR Tab[rip], xmm0
  vmovdqa XMMWORD PTR Tab[rip+16], xmm0
  vmovdqa XMMWORD PTR Tab[rip+32], xmm0
  vmovdqa XMMWORD PTR Tab[rip+48], xmm0
  ret
Tab:
  .zero 64
---------

This could be better, using two 256-bit stores instead of four 128-bit stores:

----
foo(int):
        vpxor     ymm0, ymm0, ymm0                              #4.5
        vmovdqu   YMMWORD PTR Tab[rip], ymm0                    #4.5
        vmovdqu   YMMWORD PTR 32+Tab[rip], ymm0                 #4.5
        vzeroupper                                              #6.1
        ret                                                     #6.1
Tab:
-----

GCC uses 128-bit stores by default:
----
bool
default_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
                                        unsigned int alignment,
                                        enum by_pieces_operation op,
                                        bool speed_p)
{
  unsigned int max_size = 0;
  unsigned int ratio = 0;

  switch (op)
    {
    case CLEAR_BY_PIECES:
      max_size = STORE_MAX_PIECES;
      ratio = CLEAR_RATIO (speed_p);
----

Should TARGET_USE_BY_PIECES_INFRASTRUCTURE_P be defined for i386?
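
To make the cost difference concrete, here is a rough model of how the maximum piece size affects the number of stores. It is only a sketch, not GCC's actual by-pieces algorithm: stores_needed is a hypothetical helper, and it assumes max_piece is a power of two and that any tail is covered by successively smaller power-of-two stores.

```c
#include <stddef.h>

/* Hypothetical model: count the stores needed to clear `size` bytes
   when the widest available store is `max_piece` bytes.
   Assumes max_piece is a power of two; the remainder is covered
   with smaller power-of-two pieces. */
size_t stores_needed(size_t size, size_t max_piece)
{
    size_t count = 0;

    for (size_t piece = max_piece; piece > 0 && size > 0; piece /= 2) {
        count += size / piece;  /* full stores of this width */
        size %= piece;          /* bytes left for narrower stores */
    }
    return count;
}
```

Under this model, clearing the 64-byte Tab takes stores_needed(64, 16) = 4 stores with 16-byte pieces and stores_needed(64, 32) = 2 stores with 32-byte pieces, matching the two assembly listings above.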


Thread overview: 6+ messages
2020-12-25  1:38 [Bug target/98442] New: [X86] suboptimal for memset with CLEAR_BY_PIECES crazylht at gmail dot com
2020-12-31  3:38 ` [Bug target/98442] " crazylht at gmail dot com
2020-12-31  3:48 ` hjl.tools at gmail dot com
2020-12-31  3:56 ` hjl.tools at gmail dot com
2021-01-05 10:05 ` rguenth at gcc dot gnu.org
2021-10-06 23:48 ` hjl.tools at gmail dot com
