public inbox for gcc@gcc.gnu.org
From: "Stefan Kanthak" <stefan.kanthak@nexgo.de>
To: "Jonathan Wakely" <jwakely.gcc@gmail.com>
Cc: <gcc@gnu.org>, "Andrew Pinski" <pinskia@gmail.com>
Subject: Re: Will GCC eventually support SSE2 or SSE4.1?
Date: Fri, 26 May 2023 09:58:43 +0200
Message-ID: <896EB515110646CEBAA84E98E273E4B8@H270>
In-Reply-To: <CAH6eHdQ6m-erXFFtVF-bKLqgejqgfyUd+e9LQhKzqEYfHiNX3w@mail.gmail.com>

"Jonathan Wakely" <jwakely.gcc@gmail.com> wrote:

> On Fri, 26 May 2023, 08:01 Andrew Pinski via Gcc, <gcc@gcc.gnu.org> wrote:
>
>> On Thu, May 25, 2023 at 11:56 PM Stefan Kanthak <stefan.kanthak@nexgo.de>
>> wrote:
>>>
>>> Hi,
>>>
>>> compile the following function on a system with Core2 processor
>>> (released January 2008) for the 32-bit execution environment:
>>>
>>> --- demo.c ---
>>> int ispowerof2(unsigned long long argument)
>>> {
>>>     return (argument & argument - 1) == 0;
>>> }
>>> --- EOF ---
>>>
>>> GCC 13.3: gcc -m32 -O3 demo.c
>>>
>>> NOTE: -mtune=native is the default!
>>
>> You need to use -march=native and not -mtune=native .... to turn on
>> the architecture features.

(Un)fortunately this changes nothing!

STOP: that's wrong, it makes it even WORSE!

# Compilation provided by Compiler Explorer at https://godbolt.org/
ispowerof2(unsigned long long):
        vmovq   xmm1, QWORD PTR [esp+4]
        vpcmpeqd        xmm0, xmm0, xmm0
        xor     eax, eax
        vpaddq  xmm0, xmm1, xmm0
        vpand   xmm0, xmm0, xmm1
        vpunpcklqdq     xmm0, xmm0, xmm0
        vptest  xmm0, xmm0
        sete    al
        ret

That's what I call a REALLY EPIC FAILURE!

Compare this inefficient BLOAT to the SSE4.1 code from my original post!

> Yes this is just user error. You didn't use the right options to say you
> want SSE2.

ARGH: please read CAREFULLY what I wrote!

1) I didn't tell GCC to use SSE at all (I DON'T want any compiler to use
   SSE by default, especially when the generated code is SLOWER and BIGGER
   than conventional code using the general purpose registers)!

2) GCC uses SSE2 on its own, but doesn't support it well: it FAILS to use
   PMOVMSKB here, despite -O3!

3) -march=core2 doesn't help either, GCC fails to use SSE4.1 at all!
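
For illustration, the "conventional code using the general purpose
registers" from point 1 can be sketched in C by splitting the 64-bit
argument into two 32-bit halves by hand. This is only a minimal sketch
under stated assumptions: it is neither compiler output nor the code
from the original post, and the name ispowerof2_gpr32 is hypothetical.
It shows the arithmetic a 32-bit target has to perform (subtract with
borrow, two ANDs, one OR), not how any particular GCC version lowers it.

```c
#include <stdint.h>

/* Formulation from demo.c: 1 if at most one bit is set.
   Note that 0 also yields 1, because 0 & (0 - 1) == 0. */
int ispowerof2(unsigned long long argument)
{
    return (argument & (argument - 1)) == 0;
}

/* Hypothetical equivalent using only 32-bit operations, roughly what
   code restricted to 32-bit general purpose registers must compute. */
int ispowerof2_gpr32(unsigned long long argument)
{
    uint32_t lo  = (uint32_t)argument;          /* low  32 bits */
    uint32_t hi  = (uint32_t)(argument >> 32);  /* high 32 bits */
    uint32_t lo1 = lo - 1u;                     /* low half of argument - 1 */
    uint32_t hi1 = hi - (uint32_t)(lo == 0u);   /* borrow when low half is 0 */

    /* (argument & argument - 1) is zero iff both halves AND to zero. */
    return ((lo & lo1) | (hi & hi1)) == 0;
}
```

Both functions should agree for every input; for example, both return 1
for 1ULL << 32 and 0 for 0x100000001ULL.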

> GCC supports it fine already.

DREAM ON!
Again: view the 2 counter examples from my original post CAREFULLY!

not amused
Stefan
