public inbox for gcc@gcc.gnu.org
From: "Stefan Kanthak" <stefan.kanthak@nexgo.de>
To: <gcc@gnu.org>
Subject: Will GCC eventually learn to use BSR or even TZCNT on AMD/Intel processors?
Date: Mon, 5 Jun 2023 12:17:43 +0200
Message-ID: <5982A5DF4D694B4EA971B2597E833FC6@H270>

--- failure.c ---
int _clz(unsigned long long argument) {
    return __builtin_clzll(argument);
}

int _ctz(unsigned long long argument) {
    return __builtin_ctzll(argument);
}
--- EOF ---

GCC 13.1    -m32 -mabm -mbmi -mlzcnt -O3 failure.c

<https://godbolt.org/z/MMf11hKch>
_clz(unsigned long long):
        mov     edx, DWORD PTR [esp+8]
        xor     ecx, ecx
        xor     eax, eax
        lzcnt   eax, DWORD PTR [esp+4]
        add     eax, 32
        lzcnt   ecx, edx
        test    edx, edx
        cmovne  eax, ecx
        ret
_ctz(unsigned long long):
        sub     esp, 20
        push    DWORD PTR [esp+28]
        push    DWORD PTR [esp+28]
        call    __ctzdi2
        add     esp, 28
        ret

OUCH: although TZCNT is EXPLICITLY enabled via -mbmi (BMI1 provides TZCNT on
      both AMD and Intel processors; -mabm and -mlzcnt only cover LZCNT and
      POPCNT), GCC generates slow-motion code calling __ctzdi2() instead of
      TZCNT instructions, which have been available for 10 (in words: TEN)
      years.
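
For illustration, a minimal sketch of the split a 32-bit target could use
instead of the library call: count trailing zeros in the low half and fall
back to the high half plus 32. The helper name ctz64_via_32 is hypothetical
and is not part of GCC or libgcc; the sketch only assumes that
__builtin_ctz() on a 32-bit operand maps to a single TZCNT or BSF.

```c
#include <stdint.h>

/* Hypothetical helper (not from GCC or libgcc): trailing-zero count of a
   64-bit value built from 32-bit operations only.
   Like __builtin_ctzll(), the result is undefined for x == 0. */
static int ctz64_via_32(uint64_t x) {
    uint32_t lo = (uint32_t)x;          /* bits 0..31  */
    uint32_t hi = (uint32_t)(x >> 32);  /* bits 32..63 */

    if (lo != 0)
        return __builtin_ctz(lo);       /* lowest set bit is in the low half */
    return 32 + __builtin_ctz(hi);      /* otherwise it is in the high half */
}
```

This has the same shape as the _clz code above, with the roles of the two
halves swapped.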


GCC 13.1    -m32 -march=i386 -O3 failure.c

<https://godbolt.org/z/16ezfaexb>
_clz(unsigned long long):
        mov     edx, DWORD PTR [esp+4]
        mov     eax, DWORD PTR [esp+8]
        test    eax, eax
        je      .L2
        bsr     eax, eax
        xor     eax, 31
        ret
.L2:
        bsr     eax, edx
        xor     eax, 31
        lea     eax, [eax+32]
        ret
_ctz(unsigned long long):
        sub     esp, 20
        push    DWORD PTR [esp+28]
        push    DWORD PTR [esp+28]
        call    __ctzdi2
        add     esp, 28
        ret
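
The _clz listing above relies on the identity clz(x) = 31 - bsr(x) =
bsr(x) ^ 31 for a nonzero 32-bit x. A minimal sketch in C that mirrors its
branch structure; bsr32 and clz64_via_bsr are hypothetical names, and bsr32
is a portable loop standing in for the BSR instruction:

```c
#include <stdint.h>

/* Hypothetical stand-in for BSR: index of the highest set bit.
   Only meaningful for x != 0, like the instruction itself. */
static int bsr32(uint32_t x) {
    int index = 0;
    while (x >>= 1)
        index++;
    return index;
}

/* Mirrors the listing: test the high half, apply BSR to the half that
   holds the highest set bit, XOR with 31, and add 32 when the low half
   was used. Undefined for x == 0, like __builtin_clzll(). */
static int clz64_via_bsr(uint64_t x) {
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);

    if (hi != 0)
        return bsr32(hi) ^ 31;
    return (bsr32(lo) ^ 31) + 32;
}
```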

OUCH²: the BSF/BSR instructions were introduced 38 (in words: THIRTY-EIGHT)
       years ago with the i386 processor; GCC uses BSR for _clz, but fails
       to use BSF for _ctz -- a real shame!

OUCH³: an optimising compiler would of course generate "JMP __ctzdi2" instead
       of code fiddling with the stack!

Stefan Kanthak


