public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c++/110879] New: Unnecessary reread from memory in a loop
@ 2023-08-02 19:26 palevichva at gmail dot com
2023-08-03 11:42 ` [Bug libstdc++/110879] [14 Regression] Unnecessary reread from memory in a loop with std::vector rguenth at gcc dot gnu.org
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: palevichva at gmail dot com @ 2023-08-02 19:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110879
Bug ID: 110879
Summary: Unnecessary reread from memory in a loop
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: palevichva at gmail dot com
Target Milestone: ---
Created attachment 55678
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55678&action=edit
preprocessed file by g++ from revision dd2eb972a
I've found a strange regression in optimization. Trunk version of g++ produces
less optimal assembly. It rereads same memory location in every iteration of a
loop. More specifically, it rereads fields _M_finish and _M_end_of_storage of a
vector from memory every push_back call, although it is not necessary.
Released version 13.2 doesn't do that, and just uses values from registers.
I'm compiling following code:
#include <vector>
std::vector<int> f(std::size_t n) {
std::vector<int> res;
res.reserve(n);
for (std::size_t i = 0; i < n; ++i) {
res.push_back(i*i);
}
return res;
}
The main body of a loop looks like this:
~/.local/gcc/bin/g++ -S -fverbose-asm -O3 -std=c++20 pb.cpp
>.L41:
># /home/scaiper/.local/gcc/include/c++/14.0.0/bits/stl_construct.h:97: { return ::new((void*)__location) _Tp(std::forward<_Args>(__args)...); }
> movl %r15d, (%rbx) # _3, *prephitmp_51
># /home/scaiper/.local/gcc/include/c++/14.0.0/bits/vector.tcc:119: ++this->_M_impl._M_finish;
> addq $4, %rbx #, tmp135
> movq %rbx, 8(%rbp) # tmp135, res_8(D)->D.35756._M_impl.D.35067._M_finish
>.L8:
># pb.cpp:6: for (std::size_t i = 0; i < n; ++i) {
> addq $1, %r13 #, i
># pb.cpp:6: for (std::size_t i = 0; i < n; ++i) {
> cmpq %r13, %r12 # i, n
> je .L1 #,
># /home/scaiper/.local/gcc/include/c++/14.0.0/bits/vector.tcc:114: if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
> movq 8(%rbp), %rbx # res_8(D)->D.35756._M_impl.D.35067._M_finish, prephitmp_51
># /home/scaiper/.local/gcc/include/c++/14.0.0/bits/vector.tcc:114: if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
> movq 16(%rbp), %rax # res_8(D)->D.35756._M_impl.D.35067._M_end_of_storage, pretmp_52
>.L16:
># pb.cpp:7: res.push_back(i*i);
> movl %r13d, %r15d # i, _3
> imull %r13d, %r15d # i, _3
># /home/scaiper/.local/gcc/include/c++/14.0.0/bits/vector.tcc:114: if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
> cmpq %rax, %rbx # pretmp_52, prephitmp_51
> jne .L41 #,
Same loop as produced by 13.2:
~/.local/gcc-13.2/bin/g++ -v -S -fverbose-asm -O3 -std=c++20 pb.cpp
>.L43:
># /home/scaiper/.local/gcc-13.2/include/c++/13.2.0/bits/stl_construct.h:97: { return ::new((void*)__location) _Tp(std::forward<_Args>(__args)...); }
> movl %r12d, (%rcx) # _3, *prephitmp_4
># /home/scaiper/.local/gcc-13.2/include/c++/13.2.0/bits/vector.tcc:119: ++this->_M_impl._M_finish;
> addq $4, %rcx #, prephitmp_4
> movq %rcx, 8(%rbp) # prephitmp_4, res_8(D)->D.35699._M_impl.D.35010._M_finish
>.L8:
># pb.cpp:6: for (std::size_t i = 0; i < n; ++i) {
> addq $1, %rbx #, i
># pb.cpp:6: for (std::size_t i = 0; i < n; ++i) {
> cmpq %rbx, %r13 # i, n
> je .L1 #,
>.L18:
># pb.cpp:7: res.push_back(i*i);
> movl %ebx, %r12d # i, _3
> imull %ebx, %r12d # i, _3
># /home/scaiper/.local/gcc-13.2/include/c++/13.2.0/bits/vector.tcc:114: if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
> cmpq %r8, %rcx # prephitmp_74, prephitmp_4
> jne .L43 #,
Notice this extra commands in the first snippet:
movq 8(%rbp), %rbx
movq 16(%rbp), %rax
I've bisected this problem to the commit dd2eb972a (libstdc++: Use RAII in
std::vector::_M_realloc_insert)
(https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=dd2eb972a5b063e10c83878d5c9336a818fa8291).
It doesn't look like commit is the problem. Code looks pretty equivalent. But
for some reason compiler produces different result.
I'm using version built from aforementioned commit dd2eb972a:
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --enable-languages=c++ --disable-multilib
--prefix=/home/scaiper/.local/gcc
gcc version 14.0.0 20230623 (experimental) (GCC)
COLLECT_GCC_OPTIONS='-v' '-S' '-fverbose-asm' '-O3' '-std=c++20'
'-shared-libgcc' '-mtune=generic' '-march=x86-64'
Comparing with 13.2:
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --enable-languages=c++ --disable-multilib
--prefix=/home/scaiper/.local/gcc-13.2
gcc version 13.2.0 (GCC)
COLLECT_GCC_OPTIONS='-v' '-S' '-fverbose-asm' '-O3' '-std=c++20'
'-shared-libgcc' '-mtune=generic' '-march=x86-64'
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug libstdc++/110879] [14 Regression] Unnecessary reread from memory in a loop with std::vector
2023-08-02 19:26 [Bug c++/110879] New: Unnecessary reread from memory in a loop palevichva at gmail dot com
@ 2023-08-03 11:42 ` rguenth at gcc dot gnu.org
2023-09-01 15:02 ` cvs-commit at gcc dot gnu.org
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-08-03 11:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110879
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |14.0
Summary|Unnecessary reread from |[14 Regression] Unnecessary
|memory in a loop |reread from memory in a
| |loop with std::vector
Component|c++ |libstdc++
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug libstdc++/110879] [14 Regression] Unnecessary reread from memory in a loop with std::vector
2023-08-02 19:26 [Bug c++/110879] New: Unnecessary reread from memory in a loop palevichva at gmail dot com
2023-08-03 11:42 ` [Bug libstdc++/110879] [14 Regression] Unnecessary reread from memory in a loop with std::vector rguenth at gcc dot gnu.org
@ 2023-09-01 15:02 ` cvs-commit at gcc dot gnu.org
2023-09-01 16:12 ` redi at gcc dot gnu.org
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-09-01 15:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110879
--- Comment #1 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:
https://gcc.gnu.org/g:419c423d3aeca754e47e1ce1bf707735603a90a3
commit r14-3627-g419c423d3aeca754e47e1ce1bf707735603a90a3
Author: Vladimir Palevich <palevichva@gmail.com>
Date: Wed Aug 9 01:34:05 2023 +0300
libstdc++: fix memory clobbering in std::vector [PR110879]
Fix ordering to prevent clobbering of class members by a call to deallocate
in _M_realloc_insert and _M_default_append.
Because of recent changes in _M_realloc_insert and _M_default_append,
calls to deallocate were ordered after assignment to class members of
std::vector (in the guard destructor), which is causing said members to
be call-clobbered. This is preventing further optimization, the
compiler is unable to move memory read out of a hot loop in this case.
This patch reorders the call to before assignments by putting guard in
its own block. Plus a new testsuite for this case. I'm not very happy
with the new testsuite, but I don't know how to properly test this.
PR libstdc++/110879
libstdc++-v3/ChangeLog:
* include/bits/vector.tcc (_M_realloc_insert): End guard
lifetime just before assignment to class members.
(_M_default_append): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/pr110879.C: New test.
Signed-off-by: Vladimir Palevich <palevichva@gmail.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug libstdc++/110879] [14 Regression] Unnecessary reread from memory in a loop with std::vector
2023-08-02 19:26 [Bug c++/110879] New: Unnecessary reread from memory in a loop palevichva at gmail dot com
2023-08-03 11:42 ` [Bug libstdc++/110879] [14 Regression] Unnecessary reread from memory in a loop with std::vector rguenth at gcc dot gnu.org
2023-09-01 15:02 ` cvs-commit at gcc dot gnu.org
@ 2023-09-01 16:12 ` redi at gcc dot gnu.org
2023-11-22 8:53 ` jakub at gcc dot gnu.org
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2023-09-01 16:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110879
Jonathan Wakely <redi at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution|--- |FIXED
--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
This should be fixed now. Thanks for the report and the patch.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug libstdc++/110879] [14 Regression] Unnecessary reread from memory in a loop with std::vector
2023-08-02 19:26 [Bug c++/110879] New: Unnecessary reread from memory in a loop palevichva at gmail dot com
` (2 preceding siblings ...)
2023-09-01 16:12 ` redi at gcc dot gnu.org
@ 2023-11-22 8:53 ` jakub at gcc dot gnu.org
2023-11-22 11:27 ` redi at gcc dot gnu.org
2023-11-23 10:58 ` cvs-commit at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-11-22 8:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110879
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The new testcase FAILs on both x86_64-linux and i686-linux with C++98:
FAIL: g++.dg/opt/pr110879.C -std=gnu++98 scan-tree-dump-not optimized
"=\\s*\\S*res_(?!\\S*_M_end_of_storage;)"
Is it expected (if so, the test should have { target c++11 } either on dg-do or
on dg-final), or not?
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug libstdc++/110879] [14 Regression] Unnecessary reread from memory in a loop with std::vector
2023-08-02 19:26 [Bug c++/110879] New: Unnecessary reread from memory in a loop palevichva at gmail dot com
` (3 preceding siblings ...)
2023-11-22 8:53 ` jakub at gcc dot gnu.org
@ 2023-11-22 11:27 ` redi at gcc dot gnu.org
2023-11-23 10:58 ` cvs-commit at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2023-11-22 11:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110879
--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Ah I think that's probably expected. In _M_realloc_insert (and now
_M_realloc_append) we have:
#if __cplusplus >= 201103L
if _GLIBCXX17_CONSTEXPR (_S_use_relocate())
{
// Relocation cannot throw.
__new_finish = _S_relocate(__old_start, __position.base(),
__new_start, _M_get_Tp_allocator());
++__new_finish;
__new_finish = _S_relocate(__position.base(), __old_finish,
__new_finish, _M_get_Tp_allocator());
}
else
#endif
and then an alternative path used for non-trivial types and for C++98. That
alternative path does more work and probably can't be optimized as well, so the
reads from _M_end_of_storage aren't optimized out.
I think we can just use { target c++11 } for the test.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug libstdc++/110879] [14 Regression] Unnecessary reread from memory in a loop with std::vector
2023-08-02 19:26 [Bug c++/110879] New: Unnecessary reread from memory in a loop palevichva at gmail dot com
` (4 preceding siblings ...)
2023-11-22 11:27 ` redi at gcc dot gnu.org
@ 2023-11-23 10:58 ` cvs-commit at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-23 10:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110879
--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:
https://gcc.gnu.org/g:256d64b3460807f33ff2f70e676801bdcf908c1c
commit r14-5774-g256d64b3460807f33ff2f70e676801bdcf908c1c
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Wed Nov 22 21:24:08 2023 +0000
c++: Require C++11 for g++.dg/opt/pr110879.C [PR110879]
The _M_realloc_insert member does not have the trivial relocation
optimization for C++98, which seems to be why the _M_end_of_storage
member does not get optimized away. Make this test unsupported for
C++98.
gcc/testsuite/ChangeLog:
PR libstdc++/110879
* g++.dg/opt/pr110879.C: Require C++11 or later.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-11-23 10:58 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-02 19:26 [Bug c++/110879] New: Unnecessary reread from memory in a loop palevichva at gmail dot com
2023-08-03 11:42 ` [Bug libstdc++/110879] [14 Regression] Unnecessary reread from memory in a loop with std::vector rguenth at gcc dot gnu.org
2023-09-01 15:02 ` cvs-commit at gcc dot gnu.org
2023-09-01 16:12 ` redi at gcc dot gnu.org
2023-11-22 8:53 ` jakub at gcc dot gnu.org
2023-11-22 11:27 ` redi at gcc dot gnu.org
2023-11-23 10:58 ` cvs-commit at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).