public inbox for gcc@gcc.gnu.org
From: "Stefan Kanthak" <stefan.kanthak@nexgo.de>
To: <gcc@gnu.org>
Subject: Epic code generator/optimiser failures
Date: Sat, 27 May 2023 19:32:52 +0200
Message-ID: <F6AF971658BE49E783455B0F03A18A06@H270>

--- demo.c ---
int ispowerof2(unsigned long long argument) {
    return (argument != 0) && ((argument & argument - 1) == 0);
}
--- EOF ---
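
As a sanity check on the source itself, here is a minimal sketch that compares ispowerof2() against a naive bit-counting reference. The names ispowerof2_reference() and count_mismatches() are made up for illustration and are not part of demo.c; the sample set (every power of two plus its two neighbours) is likewise just an assumption about what is worth testing.

```c
#include <stdint.h>

/* The function from demo.c, unchanged. */
int ispowerof2(unsigned long long argument) {
    return (argument != 0) && ((argument & argument - 1) == 0);
}

/* Hypothetical reference: count the set bits one at a time.
   A power of two has exactly one bit set. */
int ispowerof2_reference(unsigned long long argument) {
    int bits = 0;
    while (argument != 0) {
        bits += (int)(argument & 1);
        argument >>= 1;
    }
    return bits == 1;
}

/* Hypothetical helper: compare both functions on all 64 powers of two
   and their immediate neighbours; returns the number of mismatches. */
int count_mismatches(void) {
    int mismatches = 0;
    for (int shift = 0; shift < 64; ++shift) {
        unsigned long long power = 1ULL << shift;
        unsigned long long samples[3] = { power - 1, power, power + 1 };
        for (int i = 0; i < 3; ++i)
            if (ispowerof2(samples[i]) != ispowerof2_reference(samples[i]))
                ++mismatches;
    }
    return mismatches;
}
```

If count_mismatches() returns 0, the bit trick agrees with the reference on these samples, which include 0 (as 1 - 1).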

GCC 13.1    gcc -m32 -mavx -O3 # or -march=native instead of -mavx

https://gcc.godbolt.org/z/T31Gzo85W
ispowerof2(unsigned long long):
        vmovq   xmm1, QWORD PTR [esp+4]        ->    movq     xmm0, qword ptr [esp+4]
        xor     eax, eax                       ->    xor      eax, eax
        vpunpcklqdq     xmm0, xmm1, xmm1       # superfluous
        vptest  xmm0, xmm0                     ->    ptest    xmm0, xmm0
        je      .L1                            ->    jz       .L1
        vpcmpeqd        xmm0, xmm0, xmm0       ->    pcmpeqd  xmm1, xmm1
        xor     eax, eax                       # superfluous
        vpaddq  xmm0, xmm1, xmm0               ->    paddq    xmm1, xmm0
        vpand   xmm0, xmm0, xmm1               # superfluous
        vpunpcklqdq     xmm0, xmm0, xmm0       # superfluous
        vptest  xmm0, xmm0                     ->    ptest    xmm1, xmm0
        sete    al                             ->    setz     al
.L1:
        ret                                    ->    ret

4 out of 13 instructions are SUPERFLUOUS here!

OUCH #1: there's ABSOLUTELY no need to generate AVX instructions and
         bloat the code through VEX prefixes and longer instructions!

OUCH #2: [V]MOVQ clears the upper lane of XMM registers, there's
         ABSOLUTELY no need for [V]PUNPCKLQDQ instructions.
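
To illustrate why the zeroed upper lane makes the unpack redundant, here is a rough plain-C model of the shorter sequence from the right-hand column. The xmm_model type and the model_*() helpers are hypothetical stand-ins for the instructions named in their comments, not real intrinsics; this is a sketch of the reasoning, not of what any compiler emits.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit XMM register as two 64-bit lanes. */
typedef struct { uint64_t lo, hi; } xmm_model;

/* Models [V]MOVQ xmm, m64: loads the low lane and zeroes the upper lane. */
static xmm_model model_movq(uint64_t value) {
    xmm_model r = { value, 0 };
    return r;
}

/* Models PCMPEQD with identical operands: every bit is set. */
static xmm_model model_all_ones(void) {
    xmm_model r = { UINT64_MAX, UINT64_MAX };
    return r;
}

/* Models PADDQ: independent 64-bit additions per lane, with wraparound. */
static xmm_model model_paddq(xmm_model a, xmm_model b) {
    xmm_model r = { a.lo + b.lo, a.hi + b.hi };
    return r;
}

/* Models the zero flag after PTEST a, b: 1 when (a AND b) is zero
   across all 128 bits, otherwise 0. */
static int model_ptest_zf(xmm_model a, xmm_model b) {
    return ((a.lo & b.lo) | (a.hi & b.hi)) == 0;
}

/* The shorter sequence: movq, ptest, pcmpeqd, paddq, ptest. */
int ispowerof2_model(uint64_t argument) {
    xmm_model x = model_movq(argument);
    if (model_ptest_zf(x, x))
        return 0;                         /* argument == 0 */
    /* Low lane becomes argument - 1; the upper lane becomes all ones. */
    xmm_model m = model_paddq(model_all_ones(), x);
    /* The upper lane of x is zero, so only the low lanes affect the flag. */
    return model_ptest_zf(m, x);
}
```

Because x.hi is always 0 after the load, the all-ones upper lane of m is masked away by the final test, so no lane duplication is needed.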

GCC 13.1    gcc -m32 -msse4.1 -O3

https://gcc.godbolt.org/z/bqsqec6r1
ispowerof2(unsigned long long):
        movq    xmm1, QWORD PTR [esp+4]       ->    movq    xmm0, [esp+4]
        xor     eax, eax                      ->    xor     eax, eax
        movdqa  xmm0, xmm1                    # superfluous
        punpcklqdq      xmm0, xmm1            # superfluous
        ptest   xmm0, xmm0                    ->    ptest   xmm0, xmm0
        je      .L1                           ->    jz      .L1
        pcmpeqd xmm0, xmm0                    ->    pcmpeqq xmm1, xmm1
        xor     eax, eax                      # superfluous
        paddq   xmm0, xmm1                    ->    paddq   xmm1, xmm0
        pand    xmm0, xmm1                    # superfluous
        punpcklqdq      xmm0, xmm0            # superfluous
        ptest   xmm0, xmm0                    ->    ptest   xmm1, xmm0
        sete    al                            ->    setz    al
.L1:
        ret                                   ->    ret

5 out of 14 instructions are superfluous here, or 18 of 50 bytes!

OUCH #3/#4: see above!

Will GCC eventually generate proper SSE4.1/AVX code?

Stefan
