public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/107895] New: mt19937 bad performance on LP64
@ 2022-11-28 10:36 jengelh at inai dot de
2022-11-28 11:13 ` [Bug tree-optimization/107895] " jengelh at inai dot de
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: jengelh at inai dot de @ 2022-11-28 10:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107895
Bug ID: 107895
Summary: mt19937 bad performance on LP64
Product: gcc
Version: 12.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: jengelh at inai dot de
Target Milestone: ---
Input
=====
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <random>
#include <unistd.h>

static std::mt19937 rng;

int main() {
    for (size_t j = 0; j < 0x8000; ++j) {
        uint32_t numbers[65536];
        for (size_t i = 0; i < std::size(numbers); ++i)
            numbers[i] = rng();
        // ensure number generation is not all optimized away
        write(STDOUT_FILENO, numbers, sizeof(numbers));
    }
}
Observed
========
Target: x86_64-suse-linux
gcc version 12.2.1 20221020 [revision 0aaef83351473e8f4eb774f8f999bbe87a4866d7]
(SUSE Linux)
$ g++ x.cpp -O2 && time ./a.out >/dev/zero
                 -m32   -m64
===============  =====  ======
std::mt19937      3.9s  11.5s
std::mt19937_64  14.0s  11.6s
===============  =====  ======
error ±0.1s
With -ftree-loop-if-convert [Bug #80520] the -m64 times improve, but
std::mt19937 is still not at -m32 levels:
+-ftree-         -m32   -m64
===============  =====  ======
std::mt19937      3.9s   5.2s
std::mt19937_64  14.0s   5.4s
===============  =====  ======
error ±0.1s
Expected
========
Expected to see <= 4.7s on -m64 at all times (3.9s plus a ~20% margin for
wider CPU<->cache/RAM transfers).
The -m64 builds should have roughly equal or faster runtime (because of
more registers, more opportunities); concerns like https://gmplib.org/32vs64
apply to old CPUs, but I do not think they are indicative of how contemporary
x86_64 systems perform.
Additional information
======================
CPUs:
"11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz"
[fam 6 model 140 stepping 1 microcode 0xa4] and
"AMD Ryzen 7 3700X 8-Core Processor"
[fam 23 model 113 stepping 0 microcode 0x8701013]
(about 3.0 and 10.2 seconds runtime, respectively)
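The Input program above is timed as a whole process, so the figures include the cost of write(). A minimal sketch of an in-process measurement follows, assuming one wants to time only the generator calls. The names TimingResult and time_engine are hypothetical and not part of the report, and returning the last drawn value is only one possible way to keep the loop from being optimized away.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical helper type: elapsed wall-clock time plus the last value drawn.
struct TimingResult {
    double seconds;
    std::uint64_t last;
};

// Draw `count` values from a default-constructed Engine and time the loop.
// The last value is returned so the compiler has to keep the generator calls.
template <typename Engine>
TimingResult time_engine(std::size_t count) {
    Engine rng;
    std::uint64_t last = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < count; ++i)
        last = rng();
    const auto stop = std::chrono::steady_clock::now();
    return {std::chrono::duration<double>(stop - start).count(), last};
}
```

Calling time_engine<std::mt19937>(std::size_t{0x8000} * 65536) would draw the same number of values as the Input program; the resulting times are not expected to match the tables above, since no output is written.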
* [Bug tree-optimization/107895] mt19937 bad performance on LP64
From: jengelh at inai dot de @ 2022-11-28 11:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107895
--- Comment #1 from Jan Engelhardt <jengelh at inai dot de> ---
clang-15.0.5+gnustdlibc timing distribution.
            -m32  -m64
mt19937      6.0   4.7
mt19937_64   9.2   4.7
* [Bug tree-optimization/107895] mt19937 bad performance on LP64
From: rguenth at gcc dot gnu.org @ 2022-11-28 11:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107895
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |x86_64-*-*
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Sounds like an if-conversion issue, thus RTL? Btw, we inline a wrapper
doing
  if (..->_M_p > 623)
    mersenne_twister_engine ();
  else
    {
      some inlined stuff, incrementing _M_p
    }
but the inlined stuff is already fully if-converted and thus not the
timing critical part?
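For readers unfamiliar with the shape of that wrapper, here is a minimal sketch of the textbook MT19937 algorithm. It is not the libstdc++ source: SketchMt19937, refill and matches_std_mt19937 are hypothetical names, and the single modulo-indexed refill loop is a simplification chosen for brevity. The rarely taken branch regenerates all 624 state words; the common path only tempers one word and increments the index.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>

// Illustrative stand-in for std::mt19937 (textbook constants, default seed 5489).
struct SketchMt19937 {
    static constexpr std::size_t N = 624, M = 397;
    std::uint32_t x[N];
    std::size_t p = N;  // index of the next state word; N forces a refill first

    explicit SketchMt19937(std::uint32_t seed = 5489u) {
        x[0] = seed;
        for (std::size_t i = 1; i < N; ++i)
            x[i] = 1812433253u * (x[i - 1] ^ (x[i - 1] >> 30))
                   + static_cast<std::uint32_t>(i);
    }

    // Slow path: the "twist" that regenerates the whole state array.
    void refill() {
        for (std::size_t k = 0; k < N; ++k) {
            std::uint32_t y = (x[k] & 0x80000000u) | (x[(k + 1) % N] & 0x7fffffffu);
            x[k] = x[(k + M) % N] ^ (y >> 1) ^ ((y & 1u) ? 0x9908b0dfu : 0u);
        }
        p = 0;
    }

    // Fast path: the index check, then tempering of one state word.
    std::uint32_t operator()() {
        if (p > N - 1)  // the "_M_p > 623" test from the comment above
            refill();
        std::uint32_t z = x[p++];
        z ^= z >> 11;
        z ^= (z << 7) & 0x9d2c5680u;
        z ^= (z << 15) & 0xefc60000u;
        z ^= z >> 18;
        return z;
    }
};

// Compare the sketch against the standard library engine for `count` draws.
bool matches_std_mt19937(std::size_t count) {
    SketchMt19937 sketch;
    std::mt19937 reference;
    for (std::size_t i = 0; i < count; ++i)
        if (sketch() != static_cast<std::uint32_t>(reference()))
            return false;
    return true;
}
```

In this sketch the refill runs once per 624 draws, so whether the tempering path or the refill loop dominates the runtime is exactly the question raised in the comment; the sketch does not answer it.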
* [Bug tree-optimization/107895] mt19937 bad performance on LP64
From: marxin at gcc dot gnu.org @ 2022-11-28 13:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107895
Martin Liška <marxin at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
                 CC|                            |marxin at gcc dot gnu.org
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-11-28
--- Comment #3 from Martin Liška <marxin at gcc dot gnu.org> ---
Btw, the benchmark was sped up by r13-739-g793f847ba7dbe763, from 8.4s to 5.3s
on an AMD Ryzen 9 5950X 16-Core Processor with -O2.