public inbox for gcc@gcc.gnu.org
From: "Stefan Kanthak" <stefan.kanthak@nexgo.de>
To: "Jonathan Wakely" <jwakely.gcc@gmail.com>
Cc: "Jakub Jelinek" <jakub@redhat.com>, <gcc@gnu.org>,
	"Andrew Pinski" <pinskia@gmail.com>
Subject: Re: Will GCC eventually support SSE2 or SSE4.1?
Date: Fri, 26 May 2023 14:03:36 +0200
Message-ID: <07FDB7C375CD46C1B6955B66338DD58E@H270>
In-Reply-To: <CAH6eHdQ31oC4jqxq5gV5T_P3v67ic=1xRQLdEkdLxHUBQZW8WQ@mail.gmail.com>

"Jonathan Wakely" <jwakely.gcc@gmail.com> wrote:

> On Fri, 26 May 2023 at 12:29, Stefan Kanthak <stefan.kanthak@nexgo.de> wrote:
>>
>> "Jakub Jelinek" <jakub@redhat.com> wrote:
>>
>> > On Fri, May 26, 2023 at 10:59:03AM +0200, Stefan Kanthak wrote:
>> >> 3) SSE4.1 is supported since Core2, but -march=core2 fails to enable it.
>> >>    That's bad, REALITY CHECK, please!
>> >
>> > You're wrong.
>> > SSE4.1 first appeared in the 45nm versions of Core2, the 65nm versions
>> > didn't have it.
>>
>> That's correct, I failed to see this difference.
> 
> REALITY CHECK please!

Dumbass check please!

>> > The supported CPU names don't distinguish between core2 submodels,
>> > so if you have core2 with sse4.1, you should either be using -march=native
>> > if compiling on such a machine, or use -march=core2 -msse4.1,
>>
>> This is one of the combinations I didn't test until now; with it (and with
>> -m32 -msse4.1 too) GCC generates SSE4.1 instructions, but FAILS to optimise:
>>
>> # Compilation provided by Compiler Explorer at https://godbolt.org/
>> ispowerof2(unsigned long long):
>>         movq    xmm1, QWORD PTR [esp+4]
>>         pcmpeqd xmm0, xmm0
>>         xor     eax, eax
>>         paddq   xmm0, xmm1
>>         pand    xmm0, xmm1            # SUPERFLUOUS!
>>         punpcklqdq      xmm0, xmm0    # SUPERFLUOUS!
>>         ptest   xmm0, xmm0            # should be: ptest xmm0, xmm1
>>         sete    al
>>         ret
>>
>> 9 instructions in 36 bytes instead of 7 instructions in 26 bytes.

No comment here?
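
For readers following along: the C source behind the listing above is not
quoted in this message. A minimal sketch of what it presumably looks like,
assuming the function tests (x & (x - 1)) == 0, which is what the
paddq/pand/ptest/sete sequence computes:

```c
/* Hypothetical reconstruction; the original source is not quoted in
   this message. Like the assembly listing, it also returns 1 for
   x == 0. */
int ispowerof2(unsigned long long x)
{
    /* x - 1 clears the lowest set bit of x and sets every bit below it,
       so the AND is zero exactly when x has at most one bit set. */
    return (x & (x - 1)) == 0;
}
```

Since MOVQ clears the high quadword, a single ptest of x - 1 against x
would already set ZF from the same AND, which is why the pand and
punpcklqdq are marked superfluous above.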

>> JFTR: the documentation of MOVQ specifies
>>
>> | when the destination operand is an XMM register, the quadword is
>> | stored to the low quadword of the register, and the high quadword
>> | is cleared to all 0s.
>>
>> > there is no -march={conroe,allendale,wolfdale,merom,penryn,...}.
>> >
>> >> 4) If the documentation is right, then the behaviour of GCC is wrong: it
>> >>    doesn't allow using SSE4.1 without SSE4.2!
>> >
>> > If you aren't able to read the documentation, it is hard to argue.
>>
>> When the documentation is wrong or incomplete it's hard to trust it!
> 
> Just like when you make incorrect statements and assume everybody else is wrong.

Do I assume that? Or did you just make this up?

> The documentation isn't perfect, but you should not just ignore it and
> assume you know better in all cases.
> 
>> | -m32
>> ...
>> | The -m32 option sets int, long, and pointer types to 32 bits, and
>> | generates code that runs on any i386 system.
>>   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> OUCH: as shown in https://godbolt.org/z/b43cjGdY9, -m32 ALONE
>>       generates SSE2 instructions, which DON'T run on ANY i386 system!
> 
> That's https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109954

I posted this here some years ago; see for example
<https://skanthak.homepage.t-online.de/gcc.html#case27>
Ignorance is bliss?!

>> OOPS: as shown above, -m32 -msse4.1 (or another -msse*) also generates
>>       code that does NOT run on ANY i386 system!
>>
>> Where is the precedence of the different -m* options for the CPU type
>> documented?
>> Where is their influence on each other documented?
> 
> -march enables the instructions listed for the relevant cpu family,
> then using -mxxx or -mno-xxx adds or removes particular instruction
> sets from the ones enabled by -march.

ADD THIS TO THE DOCUMENTATION!
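
The effect of the rule quoted above can also be observed from inside a
translation unit. A minimal sketch, relying only on the __SSE2__ and
__SSE4_1__ macros that GCC predefines when the corresponding instruction
set is enabled; the function names are illustrative and not part of any
GCC interface:

```c
/* Report which SSE levels this translation unit was compiled with.
   The function names are hypothetical; only the predefined macros
   __SSE2__ and __SSE4_1__ come from GCC. */
int built_with_sse2(void)
{
#ifdef __SSE2__
    return 1;   /* SSE2 enabled via -march=, -msse2 or the configured default */
#else
    return 0;
#endif
}

int built_with_sse4_1(void)
{
#ifdef __SSE4_1__
    return 1;   /* SSE4.1 enabled via -march= or -msse4.1 */
#else
    return 0;
#endif
}
```

Compiled with -m32 -march=core2, the second function should return 0;
adding -msse4.1 should turn it into 1, while the first returns 1 in both
cases.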

> If you give an option twice, e.g. -march=core2 -march=nehalem, then
> the second one wins. If you use -msse2 -mno-sse2 then the second one
> wins.

ARGH: not repetitions of ONE particular option or its negation, stupid!

> You can check this using e.g.
> 
> gcc -Q --help=target -march=core2 -msse2
> 
>> | -march=cpu-type
>> ...
>> |   Specifying -march=cpu-type implies -mtune=cpu-type, except where noted
>> |   otherwise.
>> ...
>> | -mtune=cpu-type
>> ...
>> |    the compiler does not generate any code that cannot run on the default
>> |    machine type unless you use a -march=cpu-type option.
>>
>> Why is the "default machine type" not mentioned/specified with -march=?
> 
> Using -march overrides it. The default is set during configure.

And exactly this is missing in the documentation for -march=!
Guess why I cited the documentation for -mtune= where it is mentioned?

> Adding -v to the compilation will show what -march option is used by cc1 by
> default.

Not reliable unless documented elsewhere!

Stefan
