public inbox for gcc@gcc.gnu.org
From: "Stefan Kanthak" <stefan.kanthak@nexgo.de>
To: "Jonathan Wakely" <jwakely.gcc@gmail.com>
Cc: <gcc@gnu.org>, "Andrew Pinski" <pinskia@gmail.com>
Subject: Re: Will GCC eventually support SSE2 or SSE4.1?
Date: Fri, 26 May 2023 10:59:03 +0200
Message-ID: <4BD5D8BA8E0F45098CC3E2B188A216E6@H270>
In-Reply-To: <CAH6eHdTy=Ln-t7UP2psyB6jQ1BEyecjP+oebT1h+JFi1_Qac-Q@mail.gmail.com>

"Jonathan Wakely" <jwakely.gcc@gmail.com> wrote:

> On Fri, 26 May 2023 at 09:00, Stefan Kanthak <stefan.kanthak@nexgo.de> wrote:
>>
>> "Jonathan Wakely" <jwakely.gcc@gmail.com> wrote:
>>
>> > On Fri, 26 May 2023, 08:01 Andrew Pinski via Gcc, <gcc@gcc.gnu.org> wrote:
>> >
>> >> On Thu, May 25, 2023 at 11:56 PM Stefan Kanthak <stefan.kanthak@nexgo.de>
>> >> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> compile the following function on a system with Core2 processor
>> >>> (released January 2008) for the 32-bit execution environment:
>> >>>
>> >>> --- demo.c ---
>> >>> int ispowerof2(unsigned long long argument)
>> >>> {
>> >>>     return (argument & argument - 1) == 0;
>> >>> }
>> >>> --- EOF ---
>> >>>
>> >>> GCC 13.3: gcc -m32 -O3 demo.c
>> >>>
>> >>> NOTE: -mtune=native is the default!
>> >>
>> >> You need to use -march=native and not -mtune=native .... to turn on
>> >> the architecture features.
>>
>> (Un)fortunately this changes nothing!
>>
>> STOP: that's wrong, it makes it even WORSE!
>>
>> # Compilation provided by Compiler Explorer at https://godbolt.org/
>> ispowerof2(unsigned long long):
>>         vmovq   xmm1, QWORD PTR [esp+4]
>>         vpcmpeqd        xmm0, xmm0, xmm0
>>         xor     eax, eax
>>         vpaddq  xmm0, xmm1, xmm0
>>         vpand   xmm0, xmm0, xmm1
>>         vpunpcklqdq     xmm0, xmm0, xmm0
>>         vptest  xmm0, xmm0
>>         sete    al
>>         ret
>>
>> That's what I call a REALLY EPIC FAILURE!
>>
>> Compare this inefficient BLOAT to the SSE4.1 code from my original post!
>>
>> > Yes this is just user error. You didn't use the right options to say you
>> > want SSE2.
>>
>> ARGH: please read CAREFULLY what I wrote!
> 
> You wrote "Now add the -mtune=core2 option to EXPLICITLY enable the
> NATIVE SSE4.1 alias "Penryn New Instruction Set" of the Core2
> processor" which is wrong, that's not what -mtune does.
> 
> Read the docs CAREFULLY: https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

3) SSE4.1 is supported since Core2, but -march=core2 fails to enable it.
   That's bad, REALITY CHECK, please!

4) If the documentation is right, then the behaviour of GCC is wrong: it
   doesn't allow using SSE4.1 without SSE4.2!

5) Compile the function with -march=nehalem (which according to the
   documentation enables support for BOTH SSE4.1 and SSE4.2) and notice
   that GCC fails to use SSE4.1!
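
One way to see which instruction sets a given -march= or -m option really
turns on is to test the compiler's predefined macros. The sketch below
assumes GCC's documented behaviour of defining __SSE2__, __SSE4_1__ and
__SSE4_2__ when the corresponding instruction set is enabled; the helper
names are made up for illustration and are not part of demo.c.

```c
/* Each helper returns 1 if the compiler was told it may emit the named
 * instruction set for this translation unit, and 0 otherwise.
 * The function names are hypothetical; only the macros come from GCC. */

int sse2_enabled(void)
{
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}

int sse4_1_enabled(void)
{
#ifdef __SSE4_1__
    return 1;
#else
    return 0;
#endif
}

int sse4_2_enabled(void)
{
#ifdef __SSE4_2__
    return 1;
#else
    return 0;
#endif
}
```

Calling these from a small main() and building once with -m32 -O3
-march=core2 and once with -march=nehalem should show what each option
enables; adding -msse4.1 to the command line is another case worth checking.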

>> 1) I didn't tell GCC to use SSE at all (I DON'T want any compiler to use
>>    SSE per default, especially when the generated code is SLOWER and BIGGER
>>    than conventional code using the general purpose registers)!
>>
>> 2) GCC uses SSE2 on its own, but doesn't support it well: it FAILS to use
>>    PMOVMSKB here, despite -O3!
> 
> So report a bug to bugzilla, not via an email to the wrong list.
> 
>>
>> 3) -march=core2 doesn't help either, GCC fails to use SSE4.1 at all!
> 
> core2 doesn't enable SSE4.1, as clearly shown in the docs:
> https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
> 
> If you send emails full of confused mistakes, don't be surprised if
> the replies aren't what you want.
> 
> If you think GCC is generating bad code, file a bug. But make sure
> you're actually using the right options to enable the right
> instruction sets before complaining about the instructions used.

See above: GCC fails to use SSE4.1 despite -march=nehalem.
And (if the documentation is right) GCC fails to support SSE4.1
without SSE4.2.
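
To compare the code generated under the different -march settings discussed
above, it may help to first pin down what the function from demo.c is
supposed to return. The sketch below repeats that function (with parentheses
added around the subtraction) next to a slow bit-counting reference;
ispowerof2_reference and count_mismatches are hypothetical helpers, not part
of the original post.

```c
/* Function from demo.c; note that it also returns 1 for an argument of 0. */
int ispowerof2(unsigned long long argument)
{
    return (argument & (argument - 1)) == 0;
}

/* Hypothetical reference: count the set bits one at a time and accept
 * any value with at most one bit set (so 0 is accepted, as above). */
int ispowerof2_reference(unsigned long long argument)
{
    int bits = 0;
    while (argument != 0) {
        bits += (int)(argument & 1);
        argument >>= 1;
    }
    return bits <= 1;
}

/* Hypothetical helper: compare both versions on every single-bit value
 * and its two neighbours, and return the number of disagreements. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (int shift = 0; shift < 64; ++shift) {
        unsigned long long base = 1ULL << shift;
        unsigned long long samples[3] = { base - 1, base, base + 1 };
        for (int i = 0; i < 3; ++i) {
            if (ispowerof2(samples[i]) != ispowerof2_reference(samples[i]))
                ++mismatches;
        }
    }
    return mismatches;
}
```

Whatever instructions the compiler picks for ispowerof2 under a given set of
options, count_mismatches() is expected to stay at 0.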

Stefan
