public inbox for gcc@gcc.gnu.org
* Re: Will GCC eventually learn to use BSR or even TZCNT on AMD/Intel processors?
@ 2023-06-05 12:55 Julian Waters
  0 siblings, 0 replies; 17+ messages in thread
From: Julian Waters @ 2023-06-05 12:55 UTC (permalink / raw)
  To: gcc

gcc -O0 -c -mabm -mbmi retard.c -o retard.o
           ^
           |
           |
           |

#include <immintrin.h>

int code(unsigned long long number) {
    return (int) _tzcnt_u64(number);
}

objdump --disassemble-all retard.o

0000000000000000 <code>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 83 ec 10             sub    $0x10,%rsp
   8:   48 89 4d 10             mov    %rcx,0x10(%rbp)
   c:   48 8b 45 10             mov    0x10(%rbp),%rax
  10:   48 89 45 f8             mov    %rax,-0x8(%rbp)
  14:   31 c0                   xor    %eax,%eax
  16:   f3 48 0f bc 45 f8       tzcnt  -0x8(%rbp),%rax
 <---------------------
  1c:   48 83 c4 10             add    $0x10,%rsp
  20:   5d                      pop    %rbp

Moron.

* Re: Will GCC eventually learn to use BSR or even TZCNT on AMD/Intel processors?
@ 2023-06-06  8:36 Julian Waters
  2023-06-06 12:53 ` Paul Smith
  0 siblings, 1 reply; 17+ messages in thread
From: Julian Waters @ 2023-06-06  8:36 UTC (permalink / raw)
  To: gabravier, gcc

It's alright Gabriel, pay this intellectually challenged individual no
mind. It's clear that, unlike Stefan, who at the very least knows how to
disassemble native code and understands what the instruction sequences mean
(even though the way he goes about it is flat out wrong), this retard
doesn't know how the fuck compiler code emission works in the slightest.
I'd be surprised if our friend Dave here (from "killthe.net", VERY
professional Dave!) even knows what registers or the stack are. Hell Dave,
forget about something as advanced as a compiler for native chip
architectures, do you even know the basic design principles of a fucking
Interpreter? Should be easy for a "Big Dog" like you, shouldn't it? Hell,
I'll make it easy for you, you can ignore ultra low level Interpreters like
the one within the JVM (because there is no fucking way in hell someone
like you with IQ comparable to room temperature can ever grasp how such a
complex system works), how would a basic Interpreter even function? How
does simple interpreter dispatch work, hm? How does cpython (or PyPy) or
even one as simple as the Matz Ruby Interpreter operate? Can you answer
that Dave? Or have you deluded your braindead self into thinking that
you're one of the "Big Dogs" (Your own words) and are now punching way
above your league? Are you going to answer with the "Hurr durr they work by
translating source code to machine code on the fly" that outdated and flat
out wrong as shit text in computer science textbooks says? Pffft



Sorry for my outburst, to the rest of this list. I can no longer stay
silent and watch these little shits bully people who are too kind to fire
back with the same kind of venom in their words. They may be polite enough
to refrain from doing so, Dave, but rest assured, I am far from as kind as
they are when dealing with assholes. I'm all for constructive criticism to
help improve a product as a whole, but if any of Stefan's (or god forbid,
David's) mails are "constructive", then Donald Trump is the second fucking
incarnation of Jesus Christ himself returning to earth for the second coming

* Will GCC eventually learn to use BSR or even TZCNT on AMD/Intel processors?
@ 2023-06-05 10:17 Stefan Kanthak
  2023-06-05 10:33 ` Jonathan Wakely
  2023-06-05 11:35 ` Gabriel Ravier
  0 siblings, 2 replies; 17+ messages in thread
From: Stefan Kanthak @ 2023-06-05 10:17 UTC (permalink / raw)
  To: gcc

--- failure.c ---
int _clz(unsigned long long argument) {
    return __builtin_clzll(argument);
}

int _ctz(unsigned long long argument) {
    return __builtin_ctzll(argument);
}
--- EOF ---

GCC 13.1    -m32 -mabm -mbmi -mlzcnt -O3 failure.c

<https://godbolt.org/z/MMf11hKch>
_clz(unsigned long long):
        mov     edx, DWORD PTR [esp+8]
        xor     ecx, ecx
        xor     eax, eax
        lzcnt   eax, DWORD PTR [esp+4]
        add     eax, 32
        lzcnt   ecx, edx
        test    edx, edx
        cmovne  eax, ecx
        ret
_ctz(unsigned long long):
        sub     esp, 20
        push    DWORD PTR [esp+28]
        push    DWORD PTR [esp+28]
        call    __ctzdi2
        add     esp, 28
        ret

OUCH: although EXPLICITLY enabled via -mabm (for AMD processors) and -mbmi
      (for Intel processors), GCC generates slow-motion code calling __ctzdi2()
      instead of the TZCNT instruction, which has been available for 10 (in
      words: TEN) years.
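
For illustration, a minimal sketch of how a 64-bit trailing-zero count can be
composed from two 32-bit counts, which is roughly what one would hope to see
inlined on a 32-bit target instead of the __ctzdi2() call. The function name
ctz64_from_halves and the choice to return 64 for a zero argument are
assumptions made for this example (__builtin_ctzll(0) itself is undefined);
this is not GCC's actual expansion.

```c
#include <stdint.h>

/* Count trailing zeros of a 64-bit value using only 32-bit operations.
   Hypothetical helper for illustration. Returns 64 for x == 0, which is
   a choice made here; __builtin_ctzll(0) is undefined. */
int ctz64_from_halves(uint64_t x)
{
    uint32_t lo = (uint32_t)x;          /* low 32 bits  */
    uint32_t hi = (uint32_t)(x >> 32);  /* high 32 bits */

    if (lo != 0)
        return __builtin_ctz(lo);       /* lowest set bit is in the low half  */
    if (hi != 0)
        return 32 + __builtin_ctz(hi);  /* low half is all zeros, look higher */
    return 64;                          /* no bit set at all */
}
```

With -mbmi each __builtin_ctz on a 32-bit operand can map to a single TZCNT;
without it, to BSF.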


GCC 13.1    -m32 -march=i386 -O3 failure.c

<https://godbolt.org/z/16ezfaexb>
_clz(unsigned long long):
        mov     edx, DWORD PTR [esp+4]
        mov     eax, DWORD PTR [esp+8]
        test    eax, eax
        je      .L2
        bsr     eax, eax
        xor     eax, 31
        ret
.L2:
        bsr     eax, edx
        xor     eax, 31
        lea     eax, [eax+32]
        ret
_ctz(unsigned long long):
        sub     esp, 20
        push    DWORD PTR [esp+28]
        push    DWORD PTR [esp+28]
        call    __ctzdi2
        add     esp, 28
        ret

OUCH²: the BSF/BSR instructions were introduced 38 (in words: THIRTY-EIGHT)
       years ago with the i386 processor, but GCC fails to know/use BSF --
       a real shame!

OUCH³: an optimising compiler would of course generate "JMP __ctzdi2" instead
       of code fiddling with the stack!
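
For readers following the _clz listing above: the "bsr" / "xor eax, 31" pairs
turn the index of the highest set bit into a leading-zero count, because
31 - i equals 31 ^ i for any i in 0..31. A minimal sketch in C, mirroring the
structure of that listing (test the high half first, else use the low half and
add 32). All function names are hypothetical, and the loop only stands in for
what BSR computes; arguments are assumed to be nonzero.

```c
#include <stdint.h>

/* Index of the highest set bit, i.e. what BSR yields; x must be nonzero. */
int highest_bit_index(uint32_t x)
{
    int i = 0;
    while (x >>= 1)
        i++;
    return i;
}

/* Leading zeros of a nonzero 32-bit value: 31 - index == index ^ 31. */
int clz32_via_index(uint32_t x)
{
    return highest_bit_index(x) ^ 31;
}

/* Leading zeros of a nonzero 64-bit value, built from the 32-bit halves. */
int clz64_from_halves(uint64_t x)
{
    uint32_t hi = (uint32_t)(x >> 32);
    uint32_t lo = (uint32_t)x;

    if (hi != 0)
        return clz32_via_index(hi);     /* first branch in the listing */
    return 32 + clz32_via_index(lo);    /* the .L2 path in the listing */
}
```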

Stefan Kanthak



Thread overview: 17+ messages
2023-06-05 12:55 Will GCC eventually learn to use BSR or even TZCNT on AMD/Intel processors? Julian Waters
  -- strict thread matches above, loose matches on Subject: below --
2023-06-06  8:36 Julian Waters
2023-06-06 12:53 ` Paul Smith
2023-06-06 15:37   ` David Brown
2023-06-06 17:39     ` David Edelsohn
2023-06-06 18:43       ` Arsen Arsenović
2023-06-08 11:36         ` Mark Wielaard
2023-06-05 10:17 Stefan Kanthak
2023-06-05 10:33 ` Jonathan Wakely
2023-06-05 11:35 ` Gabriel Ravier
2023-06-05 22:23   ` Dave Blanchard
2023-06-05 23:59     ` Gabriel Ravier
2023-06-06  0:09       ` Dave Blanchard
2023-06-06  0:28         ` Gabriel Ravier
2023-06-06  0:28         ` Paul Koning
2023-06-06  7:57         ` Jonathan Wakely
2023-06-06  8:31         ` David Brown
