public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/113468] New: copy of large struct violates semantics of 'volatile'
@ 2024-01-18 5:52 gnu at kosak dot com
2024-01-18 6:02 ` [Bug target/113468] " pinskia at gcc dot gnu.org
0 siblings, 1 reply; 2+ messages in thread
From: gnu at kosak dot com @ 2024-01-18 5:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113468
Bug ID: 113468
Summary: copy of large struct violates semantics of 'volatile'
Product: gcc
Version: 13.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: gnu at kosak dot com
Target Milestone: ---
Created attachment 57132
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57132&action=edit
the .ii file from --save-temps
Hello,
I'm not 100% certain, but I'm pretty sure that the assembly generated for this
code violates the semantics for 'volatile'. The reason I think so is that it
reads and writes the same memory locations more than once, and also reads and
writes them out of order. I believe that both of these actions violate the
rules for 'volatile'.
```
struct Foo {
volatile int values[256];
};
void copy(const Foo *src, Foo *dest) {
*dest = *src;
}
```
Relevant assembly on x86_64, invoked with g++ -S -O3 test.cc
```
_Z4copyPK3FooPS_:
.LFB0:
.cfi_startproc
endbr64
movq (%rdi), %rdx
movq %rdi, %rax
movq %rsi, %rcx
movq %rdx, (%rsi)
movq 1016(%rdi), %rdx
leaq 8(%rsi), %rdi
andq $-8, %rdi
subq %rdi, %rcx
movq %rdx, 1016(%rsi)
subq %rcx, %rax
addl $1024, %ecx
movq %rax, %rsi
shrl $3, %ecx
rep movsq
ret
.cfi_endproc
```
Analysis:
The compiler wants to use "rep movsq", but Foo is only aligned to 4 bytes, so
the generated code takes extra pains to make sure that the destination register
rdi is aligned to an 8 byte boundary so that "rep movsq" can be efficient.
The trick used is to first move the first and last quadwords (regardless of
alignment), and then do a "rep movsq" of the interior, but with care taken so
that the destination is aligned to an 8-byte boundary. There is a 2x2 matrix of
cases here: src will be aligned to either 4 or 8; likewise dest will be aligned
to 4 or 8. To illustrate the problem, consider when src and dest are aligned to
4 but not 8. For the sake of the example, assume that src is 0x50000004 and
dest is 0x60000004
First, the code moves a quadword from [0x50000004] to [0x60000004]. Call this
[FIRST STEP] because we reference it below.
Then it moves a quadword from [0x50000004+1016] to [0x60000004+1016]. To my
mind this already is a problem because it violates the ordering assumptions of
'volatile'. But that's not the only problem.
Finally the code does a bit of arithmetic and masking, ultimately ending up
with rsi = 0x50000008 and rdi = 0x60000008 and then it does the "rep movsq".
My main objection is that this is a partially overlapping re-read (and
re-write) of one-half of the same quadword that was copied in [FIRST STEP],
which I believe violates the rules for volatile. If Foo were a bank of hardware
registers this might be an unwelcome access pattern.
Similar violations happen for the other cases in the 2x2 matrix.
Apologies if I'm wrong about the above. I'm certainly not a 'volatile' expert
but this strikes me as rather hinky.
I would stress that I have no objection to the code for the non-volatile case.
Output from g++ -v:
```
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-linux-gnu/13/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 13.2.0-4ubuntu3'
--with-bugurl=file:///usr/share/doc/gcc-13/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-13
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/libexec --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib
--enable-libphobos-checking=release --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet
--with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32
--enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
--with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (Ubuntu 13.2.0-4ubuntu3)
```
^ permalink raw reply [flat|nested] 2+ messages in thread
* [Bug target/113468] copy of large struct violates semantics of 'volatile'
2024-01-18 5:52 [Bug target/113468] New: copy of large struct violates semantics of 'volatile' gnu at kosak dot com
@ 2024-01-18 6:02 ` pinskia at gcc dot gnu.org
0 siblings, 0 replies; 2+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-18 6:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113468
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |DUPLICATE
Status|UNCONFIRMED |RESOLVED
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Dup of bug 47409.
*** This bug has been marked as a duplicate of bug 47409 ***
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2024-01-18 6:02 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-18 5:52 [Bug target/113468] New: copy of large struct violates semantics of 'volatile' gnu at kosak dot com
2024-01-18 6:02 ` [Bug target/113468] " pinskia at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).