public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/107006] New: Missing optimization: common idiom for external data
From: hpa at zytor dot com @ 2022-09-22  2:27 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

            Bug ID: 107006
           Summary: Missing optimization: common idiom for external data
           Product: gcc
           Version: 12.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hpa at zytor dot com
  Target Milestone: ---

Created attachment 53602
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53602&action=edit
C test case source

The only *portable* way in C to deal with external data structures containing
data of specific endianness, possibly unaligned, is to operate on them as byte
(char) arrays.
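
As an illustration of the idiom described above (the attached test case is not reproduced here, so the function names below are hypothetical and the attachment may differ in detail), a byte-array accessor written as a loop might look roughly like this:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper names; not taken from the attached test case. */

/* Assemble a little-endian 32-bit value: p[0] is the least significant byte. */
static inline uint32_t get_le32(const unsigned char *p)
{
    uint32_t v = 0;
    for (size_t i = 0; i < sizeof v; i++)
        v |= (uint32_t)p[i] << (i * 8);
    return v;
}

/* Assemble a big-endian 32-bit value: p[0] is the most significant byte. */
static inline uint32_t get_be32(const unsigned char *p)
{
    uint32_t v = 0;
    for (size_t i = 0; i < sizeof v; i++)
        v |= (uint32_t)p[i] << ((sizeof v - 1 - i) * 8);
    return v;
}
```

Both functions rely only on ISO C: they make no assumption about the alignment of p or the byte order of the host.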

At least on x86 (which supports arbitrarily aligned loads), gcc *sometimes*
recognizes these as single loads, but sometimes does not.

In the included test cases, there is a plain C implementation and an
implementation wrapped in a C++ class.

Compiling the former with:

gcc -std=c2x -g -O3 -W -Wall -[cSE] -o bswap.[osi] bswap.c

... recognizes the load idiom for 16-bit numbers but not for 32- or 64-bit
numbers.

Compiling the latter with:

gcc -std=c++20 -g -O3 -W -Wall -[cSE] -o bswapcc.[osi] bswapcc.cc

... *additionally* recognizes the 32-bit load, *but only in the bigendian case*
(that is, it generates a load and a bswap instruction); whereas in the
littleendian -- native -- case, this does not happen!

I am familiar with the use of packed arrays and __builtin_bswap*() for these
accesses, but unfortunately these are gcc-specific.
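
For comparison, a minimal sketch of the compiler-specific approach mentioned above. It assumes GCC or a compatible compiler (the packed and may_alias attributes, __builtin_bswap32 and the __BYTE_ORDER__ macros are extensions, not ISO C) and a host that is either little- or big-endian. The type and function names are made up for this example:

```c
#include <stdint.h>

/* GNU extension: a packed struct has alignment 1, so the compiler must
   emit an access that is valid at any address.  may_alias lets the
   struct be used to read memory declared as a byte array. */
struct unaligned_u32 {
    uint32_t v;
} __attribute__((packed, may_alias));

/* Hypothetical helper: load a big-endian 32-bit value from any address. */
static inline uint32_t load_be32_gnu(const void *p)
{
    uint32_t v = ((const struct unaligned_u32 *)p)->v;
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap32(v);   /* host is little-endian: swap to get BE */
#endif
    return v;
}
```

This states the intended load and byte swap directly, at the cost of tying the source to one compiler family.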


* [Bug rtl-optimization/107006] Missing optimization: common idiom for external data
From: hpa at zytor dot com @ 2022-09-22  2:28 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

--- Comment #1 from H. Peter Anvin <hpa at zytor dot com> ---
Created attachment 53603
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53603&action=edit
C test case preprocessed source


* [Bug rtl-optimization/107006] Missing optimization: common idiom for external data
From: hpa at zytor dot com @ 2022-09-22  2:28 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

--- Comment #2 from H. Peter Anvin <hpa at zytor dot com> ---
Created attachment 53604
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53604&action=edit
C test case assembly output


* [Bug rtl-optimization/107006] Missing optimization: common idiom for external data
From: hpa at zytor dot com @ 2022-09-22  2:28 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

--- Comment #3 from H. Peter Anvin <hpa at zytor dot com> ---
Created attachment 53605
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53605&action=edit
C test case object code


* [Bug rtl-optimization/107006] Missing optimization: common idiom for external data
From: hpa at zytor dot com @ 2022-09-22  2:29 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

--- Comment #4 from H. Peter Anvin <hpa at zytor dot com> ---
Created attachment 53606
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53606&action=edit
C++ test case class definition header file


* [Bug rtl-optimization/107006] Missing optimization: common idiom for external data
From: hpa at zytor dot com @ 2022-09-22  2:30 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

--- Comment #5 from H. Peter Anvin <hpa at zytor dot com> ---
Created attachment 53607
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53607&action=edit
C++ test case main file


* [Bug rtl-optimization/107006] Missing optimization: common idiom for external data
From: hpa at zytor dot com @ 2022-09-22  2:30 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

--- Comment #6 from H. Peter Anvin <hpa at zytor dot com> ---
Created attachment 53608
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53608&action=edit
C++ test case preprocessed source


* [Bug rtl-optimization/107006] Missing optimization: common idiom for external data
From: hpa at zytor dot com @ 2022-09-22  2:30 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

--- Comment #7 from H. Peter Anvin <hpa at zytor dot com> ---
Created attachment 53609
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53609&action=edit
C++ test case assembly output


* [Bug rtl-optimization/107006] Missing optimization: common idiom for external data
From: hpa at zytor dot com @ 2022-09-22  2:31 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

--- Comment #8 from H. Peter Anvin <hpa at zytor dot com> ---
Created attachment 53610
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53610&action=edit
C++ test case object code


* [Bug rtl-optimization/107006] Missing optimization: common idiom for external data
From: hpa at zytor dot com @ 2022-09-22  2:32 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

--- Comment #9 from H. Peter Anvin <hpa at zytor dot com> ---
To clarify: the C test case produces the same output regardless of whether it
is compiled as C or C++. Only the C++ wrapped class definition detects the
additional case of a 32-bit bigendian load.


* [Bug tree-optimization/107006] Missing optimization: common idiom for external data
From: rguenth at gcc dot gnu.org @ 2022-09-22  7:08 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|rtl-optimization            |tree-optimization
   Last reconfirmed|                            |2022-09-22
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |missed-optimization

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
The reason is that the loops are not unrolled (early enough or at all) for the
bswap/load detection.  For the detection to work you have to unroll the loops
manually or direct GCC to do that, for example

inline uint64_t get_le64 (const unsigned char x[64/8]) {
  uint64_t y = 0;
#pragma GCC unroll 8
  for (size_t i = 0; i < sizeof y; i++)
    if (0)
      y |= (uint64_t)x[i] << ((sizeof y - 1 - i)*8);
    else
      y |= (uint64_t)x[i] << i*8;
  return y;
}

produces

get_le64:
.LFB11:
        .cfi_startproc
        movq    (%rdi), %rax
        ret

The unroll heuristics do not anticipate that later bswap/load detection will
merge all the loads, so that unrolling would not grow the code much.
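
The same pragma can be applied to the big-endian form of the loop. The following is only a sketch with a made-up function name, and the generated code has not been checked here; the hope is that, with the loop unrolled early, the detection sees four adjacent byte loads and can emit a single load plus a byte swap on x86:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name.  The pragma is a GCC extension; compilers that do
   not recognise it may warn about it or ignore it. */
static inline uint32_t get_be32_unrolled(const unsigned char x[32/8])
{
    uint32_t y = 0;
#pragma GCC unroll 4
    for (size_t i = 0; i < sizeof y; i++)
        y |= (uint32_t)x[i] << ((sizeof y - 1 - i) * 8);
    return y;
}
```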


* [Bug tree-optimization/107006] Missing optimization: common idiom for external data
From: hpa at zytor dot com @ 2022-09-22 14:26 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

--- Comment #11 from H. Peter Anvin <hpa at zytor dot com> ---
If you look at the output, you see that the loops are already fully unrolled
(at considerable code size cost.)

Unfortunately, since the issue at hand is dealing with code written to be
portable, adding gcc-specific hacks is not really a reasonable option.


* [Bug tree-optimization/107006] Missing optimization: common idiom for external data
From: rguenth at gcc dot gnu.org @ 2022-09-23  6:11 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to H. Peter Anvin from comment #11)
> If you look at the output, you see that the loops are already fully unrolled
> (at considerable code size cost.)

The unrolling is done too late for the bswap detection pass to trigger.

> Unfortunately, since the issue at hand is dealing with code written to be
> portable, adding gcc-specific hacks are not really a reasonable option.

Well, #pragma GCC unroll n is "portable" in that #pragma is an ISO C feature
and pragmas in the 'GCC' domain are supposed to be ignored by other compilers,
so I am not sure what you mean here.

You can also manually unroll of course.
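
A manually unrolled version could be sketched as below. It uses only ISO C; the function names are hypothetical and not taken from the test case:

```c
#include <stdint.h>

/* Little-endian 32-bit load with the loop written out by hand. */
static inline uint32_t get_le32_manual(const unsigned char *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Big-endian 64-bit load with the loop written out by hand. */
static inline uint64_t get_be64_manual(const unsigned char *p)
{
    return ((uint64_t)p[0] << 56)
         | ((uint64_t)p[1] << 48)
         | ((uint64_t)p[2] << 40)
         | ((uint64_t)p[3] << 32)
         | ((uint64_t)p[4] << 24)
         | ((uint64_t)p[5] << 16)
         | ((uint64_t)p[6] << 8)
         |  (uint64_t)p[7];
}
```

Because there is no loop left to unroll, the byte loads are visible as straight-line code from the start.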

Alternatively, somebody can try to implement loop pattern matching for
bswap/load.  It's a reduction, so a blueprint might be found in the
strlen pattern matching in loop_distribution::transform_reduction_loop.

