public inbox for gcc-bugs@sourceware.org
* [Bug d/102765] New: [11 Regression] GDC11 stopped inlining library functions and lambdas used by a binary search one-liner code
From: siarhei.siamashka at gmail dot com @ 2021-10-15  6:54 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102765

            Bug ID: 102765
           Summary: [11 Regression] GDC11 stopped inlining library
                    functions and lambdas used by a binary search
                    one-liner code
           Product: gcc
           Version: 11.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: d
          Assignee: ibuclaw at gdcproject dot org
          Reporter: siarhei.siamashka at gmail dot com
  Target Milestone: ---

The performance of the following simple binary search code regressed
substantially starting with GDC 11 (about 1.8s with gdc-10.3.0 versus about
6.9s with gdc-11.2.0 in the timings below):

/*******************************************************/
import std.algorithm, std.range, std.stdio, std.stdint;

// calculate integer square root using binary search;
// 3037000499 is floor(sqrt(int64_t.max)), so v * v cannot overflow
int64_t isqrt(int64_t x) {
  return iota(0, min(x, 3037000499) + 1)
         .map!(v => (v * v > x))
         .assumeSorted.lowerBound(true)
         .length - 1;
}

// print the sum of 20M square roots
void main() { 20000000.iota.map!isqrt.sum.writeln; }
/*******************************************************/

$ gdc-6.3.0 -g -O3 -frelease -fno-bounds-check test.d && time ./a.out 
59618479180

real    0m1.924s
user    0m1.924s
sys     0m0.000s

$ gdc-9.3.0 -g -O3 -frelease -fno-bounds-check test.d && time ./a.out 
59618479180

real    0m2.100s
user    0m2.099s
sys     0m0.000s

$ gdc-10.3.0 -g -O3 -frelease -fno-bounds-check test.d && time ./a.out 
59618479180

real    0m1.776s
user    0m1.776s
sys     0m0.000s

$ gdc-11.2.0 -g -O3 -frelease -fno-bounds-check test.d && time ./a.out 
59618479180

real    0m6.889s
user    0m6.887s
sys     0m0.000s
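
For anyone reproducing these numbers, the measurement can also be taken
in-process, which excludes runtime startup from the timing. The following is
only a sketch under assumptions: the use of StopWatch from
std.datetime.stopwatch and the output format are additions made for
illustration and are not part of the original test case.

```d
import std.algorithm : map, min, sum;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.range : assumeSorted, iota;
import std.stdio : writefln;

// Same range-based integer square root as in the report.
long isqrt(long x)
{
    return iota(0, min(x, 3037000499) + 1)
           .map!(v => (v * v > x))
           .assumeSorted.lowerBound(true)
           .length - 1;
}

void main()
{
    // Time only the summation of 20M square roots (assumption: wall-clock
    // time from StopWatch is an adequate stand-in for `time ./a.out`).
    auto sw = StopWatch(AutoStart.yes);
    immutable total = 20_000_000.iota.map!isqrt.sum;
    sw.stop();
    writefln("sum=%s elapsed=%s ms", total, sw.peek.total!"msecs");
}
```

Building this with the same flags as above (with and without -flto) should
show whether the difference comes from the search itself.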


My expectation is that the compilers should inline everything here and generate
code for a small and efficient binary search loop. But GDC11 stopped doing
this, as can be confirmed by running "perf record ./a.out && perf report":

    27.86%  a.out    a.out             [.]
_D3std5range__T11SortedRangeTSQBc9algorithm9iteration__T9MapResultS4test5isqrtFlZ9__lambda2TSQDnQDm__T4iotaTiTlZQkFilZ6ResultZQCsVAyaa5_61203c2062ZQFc__T18getTransitionIndexVEQGrQGq12SearchPolicyi3SQHoQHn__TQHkTQHaVQDha5_61203c2062ZQIj3geqTbZQDlMFNaNbNiNfbZm
    15.02%  a.out    a.out             [.]
_D3std5range__T11SortedRangeTSQBc9algorithm9iteration__T9MapResultS4test5isqrtFlZ9__lambda2TSQDnQDm__T4iotaTiTlZQkFilZ6ResultZQCsVAyaa5_61203c2062ZQFc__T3geqTbTbZQjMFNaNbNiNfbbZb
    10.34%  a.out    a.out             [.]
_D3std9algorithm9iteration__T9MapResultS4test5isqrtFlZ9__lambda2TSQCm5range__T4iotaTiTlZQkFilZ6ResultZQCv7opIndexMFNaNbNiNfmZb
    10.31%  a.out    a.out             [.]
_D3std10functional__T9binaryFunVAyaa5_61203c2062VQra1_61VQza1_62Z__TQBvTbTbZQCdFNaNbNiNfKbKbZb
     3.03%  a.out    a.out             [.]
_D3std5range__T4iotaTiTlZQkFilZ6Result7opIndexMNgFNaNbNiNfmZNgl
     2.34%  a.out    a.out             [.] 0x0000000000031a09
     2.28%  a.out    a.out             [.]
_D4core6atomic__T7casImplTmTxmTmZQqFNaNbNiNePOmxmmZb
     2.11%  a.out    a.out             [.]
_D3std5range__T11SortedRangeTSQBc9algorithm9iteration__T9MapResultS4test5isqrtFlZ9__lambda2TSQDnQDm__T4iotaTiTlZQkFilZ6ResultZQCsVAyaa5_61203c2062ZQFc7opSliceMFNaNbNiNfmmZSQGoQGn__TQGkTQGaVQCha5_61203c2062ZQHj
     2.02%  a.out    a.out             [.]
_D3std5range__T12assumeSortedVAyaa5_61203c2062TSQBu9algorithm9iteration__T9MapResultS4test5isqrtFlZ9__lambda2TSQEfQEe__T4iotaTiTlZQkFilZ6ResultZQCsZQFdFNaNbNiNfQEjZSQGhQGg__T11SortedRangeTQFlVQGga5_61203c2062ZQBj
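
For comparison, the kind of loop that full inlining is expected to boil down
to can be written by hand. The following is only an illustrative sketch, not
actual compiler output; the name isqrtLoop is hypothetical and does not appear
in the original test case.

```d
import std.stdio : writeln;

// Hand-written equivalent of the range-based isqrt: binary search for the
// first v in [0, min(x, 3037000499)] with v * v > x, then step back by one.
long isqrtLoop(long x)
{
    long lo = 0;
    // One past the last candidate, mirroring iota(0, min(x, 3037000499) + 1).
    long hi = (x < 3037000499 ? x : 3037000499) + 1;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (mid * mid > x)
            hi = mid;      // predicate holds: first match is at or before mid
        else
            lo = mid + 1;  // predicate fails: first match is after mid
    }
    // lo is now the number of candidates with v * v <= x.
    return lo - 1;
}

void main()
{
    writeln(isqrtLoop(16));
}
```

With the same search interval and predicate, this should return the same
values as the range-based isqrt, which makes it a convenient baseline when
comparing timings across compiler versions.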


Using either the -fwhole-program or the -flto command-line option resolves the
performance problem and allows all of these functions to be inlined again:

$ gdc-11.2.0 -g -O3 -frelease -fno-bounds-check -flto test.d && time ./a.out 
59618479180

real    0m2.085s
user    0m2.085s
sys     0m0.000s


But is this expected? Does GDC, starting from version 11, now require the
-flto option to get reasonable performance? Or is this a real performance
regression, and can something be done to improve the inlining behaviour?

* [Bug d/102765] [11 Regression] GDC11 stopped inlining library functions and lambdas used by a binary search one-liner code
From: rguenth at gcc dot gnu.org @ 2021-11-05 13:33 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102765

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.3

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Inline heuristics changed between GCC 10 and GCC 11. I suspect trunk (GCC 12)
is still affected?

* [Bug d/102765] [11 Regression] GDC11 stopped inlining library functions and lambdas used by a binary search one-liner code
From: ibuclaw at gdcproject dot org @ 2021-11-05 13:47 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102765

--- Comment #2 from Iain Buclaw <ibuclaw at gdcproject dot org> ---
D semantics for template symbols is that they must be overridable - even by
normal global symbols.

So in version 11.1, the default linkage for templates was switched over to
weak, and with that, you can't safely inline them without violating ODR.

To revert to the old behaviour, use `-fno-weak-templates`.

* [Bug d/102765] [11 Regression] GDC11 stopped inlining library functions and lambdas used by a binary search one-liner code
From: siarhei.siamashka at gmail dot com @ 2021-12-09  2:33 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102765

--- Comment #3 from Siarhei Siamashka <siarhei.siamashka at gmail dot com> ---
Thanks for the explanations. Is there a small example that demonstrates
template inlining causing a real, practical problem in older versions of GDC?
A link to a bug tracker entry, commit message, mailing list post, forum thread
or any other source of information would be very welcome. How is LDC able to
work around this without sacrificing template inlining and without requiring
LTO?

Also, it's good to know about `-fno-weak-templates`. If it just reverts to the
old behaviour, then it's probably somewhat less risky than `-flto` for those
who are just upgrading from older versions of GDC and don't want any
surprises.

* [Bug d/102765] [11 Regression] GDC11 stopped inlining library functions and lambdas used by a binary search one-liner code
From: siarhei.siamashka at gmail dot com @ 2022-02-01  3:47 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102765

--- Comment #4 from Siarhei Siamashka <siarhei.siamashka at gmail dot com> ---
First of all, it's my own fault for not bisecting the GDC code from day one to
figure out all the relevant details many months earlier. The code is large and
takes a long time to compile, so I was lazy, and I apologise for this.

Comments from https://forum.dlang.org/thread/sspkdp$1m4n$1@digitalmars.com
have now provided some missing pieces of important information. I may still be
wrong, so please correct me if necessary, but the root cause of this
performance regression appears to be an attempt to fix the actual problem,
PR104317, in GDC11 via the excessively invasive PR99914, which ended up
evolving GDC in the wrong direction.

Just imagine someone encountering something like the examples from
https://stackoverflow.com/questions/3691835/why-uninitialized-global-variable-is-weak-symbol
and then suddenly drawing the strange conclusion that all template functions
should be non-inlineable in a C++ compiler (unless LTO is enabled). That looks
like exactly what happened to GDC. The D language standard documentation is
incomplete, and this isn't helping. But the developers of the other D
compilers seem to be of the opinion that inlining template functions is okay
(due to the same, or at least similar, ODR rules as in C++).

* [Bug d/102765] [11 Regression] GDC11 stopped inlining library functions and lambdas used by a binary search one-liner code
From: rguenth at gcc dot gnu.org @ 2022-04-21  7:50 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102765

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.3                        |11.4

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 11.3 is being released, retargeting bugs to GCC 11.4.

* [Bug d/102765] [11 Regression] GDC11 stopped inlining library functions and lambdas used by a binary search one-liner code
From: ibuclaw at gdcproject dot org @ 2022-08-09 19:27 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102765

--- Comment #6 from Iain Buclaw <ibuclaw at gdcproject dot org> ---
r13-2002 (and r12-8673) is a start that sows the seeds for making the codegen
option -fno-weak-templates the default.  It should just be a case of extending
the forced emission to all instantiations too.

* [Bug d/102765] [11 Regression] GDC11 stopped inlining library functions and lambdas used by a binary search one-liner code
From: ibuclaw at gdcproject dot org @ 2022-10-13  5:45 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102765

Iain Buclaw <ibuclaw at gdcproject dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |witold.baryluk+gcc at gmail dot com

--- Comment #7 from Iain Buclaw <ibuclaw at gdcproject dot org> ---
*** Bug 107241 has been marked as a duplicate of this bug. ***

* [Bug d/102765] [11 Regression] GDC11 stopped inlining library functions and lambdas used by a binary search one-liner code
From: jakub at gcc dot gnu.org @ 2023-05-29 10:05 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102765

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.4                        |11.5

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 11.4 is being released, retargeting bugs to GCC 11.5.
