public inbox for gcc-bugs@sourceware.org
* [Bug target/111522] New: Different code path for static initialization with flto
@ 2023-09-21 15:04 malat at debian dot org
2023-09-21 15:53 ` [Bug target/111522] " pinskia at gcc dot gnu.org
` (11 more replies)
0 siblings, 12 replies; 13+ messages in thread
From: malat at debian dot org @ 2023-09-21 15:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522
Bug ID: 111522
Summary: Different code path for static initialization with
flto
Product: gcc
Version: 13.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: malat at debian dot org
Target Milestone: ---
Created attachment 55962
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55962&action=edit
c++ code
I am seeing a difference in behavior when using -flto. Consider the following:
% c++ -std=c++11 -g -O2 -o works lto.cc
% c++ -std=c++11 -g -O2 -flto -o fails lto.cc
% ./works&& echo "success"
success
% ./fails
zsh: illegal hardware instruction ./fails
This is a POWER8 system.
* [Bug target/111522] Different code path for static initialization with flto
From: pinskia at gcc dot gnu.org @ 2023-09-21 15:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |WAITING
Ever confirmed|0 |1
Last reconfirmed| |2023-09-21
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think this is just broken code.
It does:
#define HWY_BEFORE_NAMESPACE() \
HWY_PUSH_ATTRIBUTES("altivec,vsx,power8-vector" \
",cpu=power10")
But does not do a pop before the main function.
And then you are testing on power8, which obviously does not have all of the
instructions that power10 has ...
That it works without -flto is pure accident: that build just happens not to
use any instructions that are missing on power8.
Anyway, I suspect this testcase has been reduced too much, so you might need to
provide the original one.
* [Bug target/111522] Different code path for static initialization with flto
From: malat at debian dot org @ 2023-09-25 7:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522
--- Comment #2 from Mathieu Malaterre <malat at debian dot org> ---
(In reply to Andrew Pinski from comment #1)
> I think this is just broken code.
>
> It does:
> #define HWY_BEFORE_NAMESPACE() \
> HWY_PUSH_ATTRIBUTES("altivec,vsx,power8-vector" \
> ",cpu=power10")
>
> But does not do a pop before the main function.
>
> And then you are testing on power8, which obviously does not have all of the
> instructions that power10 has ...
> That it works without -flto is pure accident: that build just happens not to
> use any instructions that are missing on power8.
>
> Anyway, I suspect this testcase has been reduced too much, so you might need to
> provide the original one.
I reported this one after reading #111380. Honestly, there is no "wrong-code"
here. The LTO case is simply an eager init of the global variable, while the
non-LTO case is a lazy init of the global var. So the original (upstream) code is
somewhat buggy as it relies on lazy init for global var.
Could someone please confirm that eager init of global vars is expected in
the LTO case? Then we could just close this one.
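As a side note on terminology, the difference between an eagerly initialized global and a lazily initialized one can be sketched portably as below. This is only an illustration of the C++ initialization rules, not a reproduction of what -flto does; `InitLog`, `Traced`, `kEagerNearOne` and `LazyNearOne` are hypothetical names introduced for the sketch.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: records the order in which initializers run.
// A function-local static is used so the log is usable during start-up.
std::vector<std::string>& InitLog() {
  static std::vector<std::string> log;
  return log;
}

// Same bit-cast as in the reduced testcase, without any target pragma.
float BitCast(int in) {
  float out;
  std::memcpy(&out, &in, sizeof(out));
  return out;
}

// Logs who asked for the value, then computes it.
float Traced(const char* who, int bits) {
  InitLog().push_back(who);
  return BitCast(bits);
}

// Eager: a namespace-scope variable with a dynamic initializer.
// Its initializer runs during program start-up, whether or not it is used.
float kEagerNearOne = Traced("eager", 1065353215);

// Lazy: a function-local static. Its initializer runs only when control
// first passes through the declaration, i.e. on the first call.
float LazyNearOne() {
  static const float value = Traced("lazy", 1065353215);
  return value;
}
```

With this layout, `kEagerNearOne` is initialized during start-up (in practice before `main` runs), whereas the initializer inside `LazyNearOne` runs on the first call only and never if the function is not called.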
* [Bug target/111522] Different code path for static initialization with flto
From: malat at debian dot org @ 2023-09-25 7:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522
--- Comment #3 from Mathieu Malaterre <malat at debian dot org> ---
For reference:
* https://github.com/google/highway/commit/fea3dba9cfec3a74ddcd8ecac3a5d4d8429191e4
* [Bug target/111522] Different code path for static initialization with flto
From: malat at debian dot org @ 2023-09-25 7:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522
--- Comment #4 from Mathieu Malaterre <malat at debian dot org> ---
> So the original
> (upstream) code is somewhat buggy as it relies on lazy init for global var.
Those global vars are in a different namespace; I actually fail to understand why
the definition with ",cpu=power10" gets pulled in...
* [Bug target/111522] Different code path for static initialization with flto
From: pinskia at gcc dot gnu.org @ 2023-09-25 7:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Mathieu Malaterre from comment #4)
> > So the original
> > (upstream) code is somewhat buggy as it relies on lazy init for global var.
>
> Those global vars are in a different namespace; I actually fail to understand
> why the definition with ",cpu=power10" gets pulled in...
Because `#pragma GCC target targets_str` is global state and unrelated to
namespace ...
* [Bug target/111522] Different code path for static initialization with flto
From: malat at debian dot org @ 2023-09-25 9:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522
--- Comment #6 from Mathieu Malaterre <malat at debian dot org> ---
(In reply to Andrew Pinski from comment #5)
> (In reply to Mathieu Malaterre from comment #4)
> > > So the original
> > > (upstream) code is somewhat buggy as it relies on lazy init for global var.
> >
> > Those global vars are in a different namespace; I actually fail to understand
> > why the definition with ",cpu=power10" gets pulled in...
>
> Because `#pragma GCC target targets_str` is global state and unrelated to
> namespace ...
I forgot to mention that each per-namespace `#pragma GCC target` is wrapped in
`#pragma GCC push_options` / `#pragma GCC pop_options`. This implements "per
namespace" target-specific options AFAIK.
* [Bug target/111522] Different code path for static initialization with flto
From: malat at debian dot org @ 2023-09-25 10:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522
--- Comment #7 from Mathieu Malaterre <malat at debian dot org> ---
Created attachment 55987
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55987&action=edit
gcc -E -P
* [Bug target/111522] Different code path for static initialization with flto
From: malat at debian dot org @ 2023-09-25 10:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522
--- Comment #8 from Mathieu Malaterre <malat at debian dot org> ---
Created attachment 55988
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55988&action=edit
gcc -E -P
* [Bug target/111522] Different code path for static initialization with flto
From: malat at debian dot org @ 2023-09-25 10:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522
--- Comment #9 from Mathieu Malaterre <malat at debian dot org> ---
If you download pr111522.cc from comment #8, you should be able to reproduce
exactly the original upstream issue.
Steps:
% c++ -O2 -flto pr111522.cc && ./a.out
vs
% c++ -O2 pr111522.cc && ./a.out
* [Bug target/111522] Different code path for static initialization with flto
From: malat at debian dot org @ 2023-09-25 10:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522
--- Comment #10 from Mathieu Malaterre <malat at debian dot org> ---
for reference:
% c++ --verbose -O2 -flto base2.cc && ./a.out
Using built-in specs.
COLLECT_GCC=c++
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/powerpc64le-linux-gnu/13/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: powerpc64le-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 13.2.0-4'
--with-bugurl=file:///usr/share/doc/gcc-13/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-13
--program-prefix=powerpc64le-linux-gnu- --enable-shared
--enable-linker-build-id --libexecdir=/usr/libexec --without-included-gettext
--enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap
--enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --with-libphobos-druntime-only=yes
--enable-objc-gc=auto --enable-secureplt --enable-targets=powerpcle-linux
--disable-multilib --enable-multiarch --disable-werror --with-long-double-128
--enable-offload-targets=nvptx-none=/build/reproducible-path/gcc-13-13.2.0/debian/tmp-nvptx/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=powerpc64le-linux-gnu --host=powerpc64le-linux-gnu
--target=powerpc64le-linux-gnu --with-build-config=bootstrap-lto-lean
--enable-link-serialization=4
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (Debian 13.2.0-4)
COLLECT_GCC_OPTIONS='-v' '-O2' '-flto' '-shared-libgcc' '-dumpdir' 'a-'
/usr/libexec/gcc/powerpc64le-linux-gnu/13/cc1plus -quiet -v -imultiarch
powerpc64le-linux-gnu -D_GNU_SOURCE base2.cc -msecure-plt -quiet -dumpdir a-
-dumpbase base2.cc -dumpbase-ext .cc -O2 -version -flto
-fasynchronous-unwind-tables -o /tmp/cc1cimSD.s
GNU C++17 (Debian 13.2.0-4) version 13.2.0 (powerpc64le-linux-gnu)
compiled by GNU C version 13.2.0, GMP version 6.3.0, MPFR version
4.2.1, MPC version 1.3.1, isl version isl-0.26-GMP
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory
"/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../include/powerpc64-linux-gnu/c++/13"
ignoring nonexistent directory "/usr/local/include/powerpc64le-linux-gnu"
ignoring nonexistent directory
"/usr/lib/gcc/powerpc64le-linux-gnu/13/include-fixed/powerpc64le-linux-gnu"
ignoring nonexistent directory
"/usr/lib/gcc/powerpc64le-linux-gnu/13/include-fixed"
ignoring nonexistent directory
"/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../powerpc64le-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
/usr/include/c++/13
/usr/include/powerpc64le-linux-gnu/c++/13
/usr/include/c++/13/backward
/usr/lib/gcc/powerpc64le-linux-gnu/13/include
/usr/local/include
/usr/include/powerpc64le-linux-gnu
/usr/include
End of search list.
Compiler executable checksum: 403ce0768541423839c6b7d8fd9dfeff
COLLECT_GCC_OPTIONS='-v' '-O2' '-flto' '-shared-libgcc' '-dumpdir' 'a-'
as -v -a64 -mpower8 -many -mlittle -o /tmp/ccFzBgtQ.o /tmp/cc1cimSD.s
GNU assembler version 2.41 (powerpc64le-linux-gnu) using BFD version (GNU
Binutils for Debian) 2.41
COMPILER_PATH=/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-O2' '-flto' '-shared-libgcc' '-dumpdir' 'a.'
/usr/libexec/gcc/powerpc64le-linux-gnu/13/collect2 -plugin
/usr/libexec/gcc/powerpc64le-linux-gnu/13/liblto_plugin.so
-plugin-opt=/usr/libexec/gcc/powerpc64le-linux-gnu/13/lto-wrapper
-plugin-opt=-fresolution=/tmp/ccSvdAAw.res -plugin-opt=-pass-through=-lgcc_s
-plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lc
-plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc -flto
--build-id --eh-frame-hdr -V -m elf64lppc --hash-style=gnu --as-needed
-dynamic-linker /lib64/ld64.so.2 -pie
/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/Scrt1.o
/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/crti.o
/usr/lib/gcc/powerpc64le-linux-gnu/13/crtbeginS.o
-L/usr/lib/gcc/powerpc64le-linux-gnu/13
-L/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu
-L/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib
-L/lib/powerpc64le-linux-gnu -L/lib/../lib -L/usr/lib/powerpc64le-linux-gnu
-L/usr/lib/../lib -L/usr/lib/gcc/powerpc64le-linux-gnu/13/../../..
/tmp/ccFzBgtQ.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc
/usr/lib/gcc/powerpc64le-linux-gnu/13/crtendS.o
/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/crtn.o
GNU ld (GNU Binutils for Debian) 2.41
Supported emulations:
elf64lppc
elf32lppc
elf32lppclinux
elf32lppcsim
elf64ppc
elf32ppc
elf32ppclinux
elf32ppcsim
/usr/libexec/gcc/powerpc64le-linux-gnu/13/lto-wrapper
-fresolution=/tmp/ccSvdAAw.res -flinker-output=pie /tmp/ccFzBgtQ.o
/usr/libexec/gcc/powerpc64le-linux-gnu/13/lto-wrapper
-fresolution=/tmp/ccSvdAAw.res -flinker-output=pie /tmp/ccFzBgtQ.o
c++ @/tmp/ccHHJvsS
Using built-in specs.
COLLECT_GCC=c++
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: powerpc64le-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 13.2.0-4'
--with-bugurl=file:///usr/share/doc/gcc-13/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-13
--program-prefix=powerpc64le-linux-gnu- --enable-shared
--enable-linker-build-id --libexecdir=/usr/libexec --without-included-gettext
--enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap
--enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --with-libphobos-druntime-only=yes
--enable-objc-gc=auto --enable-secureplt --enable-targets=powerpcle-linux
--disable-multilib --enable-multiarch --disable-werror --with-long-double-128
--enable-offload-targets=nvptx-none=/build/reproducible-path/gcc-13-13.2.0/debian/tmp-nvptx/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=powerpc64le-linux-gnu --host=powerpc64le-linux-gnu
--target=powerpc64le-linux-gnu --with-build-config=bootstrap-lto-lean
--enable-link-serialization=4
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (Debian 13.2.0-4)
COLLECT_GCC_OPTIONS='-c' '-fno-openmp' '-fno-openacc' '-fPIC'
'-fcf-protection=none' '-msecure-plt' '-O2' '-fasynchronous-unwind-tables' '-v'
'-O2' '-shared-libgcc' '-fltrans-output-list=/tmp/ccjomOFj.ltrans.out'
'-fwpa' '-fresolution=/tmp/ccSvdAAw.res' '-flinker-output=pie' '-shared-libgcc'
/usr/libexec/gcc/powerpc64le-linux-gnu/13/lto1 -quiet -dumpbase ./a.wpa
-msecure-plt -O2 -O2 -version -fno-openmp -fno-openacc -fPIC
-fcf-protection=none -fasynchronous-unwind-tables
-fltrans-output-list=/tmp/ccjomOFj.ltrans.out -fwpa
-fresolution=/tmp/ccSvdAAw.res -flinker-output=pie @/tmp/ccu56D0B
GNU GIMPLE (Debian 13.2.0-4) version 13.2.0 (powerpc64le-linux-gnu)
compiled by GNU C version 13.2.0, GMP version 6.3.0, MPFR version
4.2.1, MPC version 1.3.1, isl version isl-0.26-GMP
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
COMPILER_PATH=/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/../lib/:/lib/../lib/powerpc64le-linux-gnu/:/lib/../lib/../lib/:/usr/lib/../lib/powerpc64le-linux-gnu/:/usr/lib/../lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../:/lib/:/usr/lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-c' '-fno-openmp' '-fno-openacc' '-fPIC'
'-fcf-protection=none' '-msecure-plt' '-O2' '-fasynchronous-unwind-tables' '-v'
'-O2' '-shared-libgcc' '-fltrans-output-list=/tmp/ccjomOFj.ltrans.out'
'-fwpa' '-fresolution=/tmp/ccSvdAAw.res' '-flinker-output=pie' '-shared-libgcc'
'-dumpdir' './a.wpa.'
c++ @/tmp/ccHUbBpN
Using built-in specs.
COLLECT_GCC=c++
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: powerpc64le-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 13.2.0-4'
--with-bugurl=file:///usr/share/doc/gcc-13/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-13
--program-prefix=powerpc64le-linux-gnu- --enable-shared
--enable-linker-build-id --libexecdir=/usr/libexec --without-included-gettext
--enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap
--enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --with-libphobos-druntime-only=yes
--enable-objc-gc=auto --enable-secureplt --enable-targets=powerpcle-linux
--disable-multilib --enable-multiarch --disable-werror --with-long-double-128
--enable-offload-targets=nvptx-none=/build/reproducible-path/gcc-13-13.2.0/debian/tmp-nvptx/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=powerpc64le-linux-gnu --host=powerpc64le-linux-gnu
--target=powerpc64le-linux-gnu --with-build-config=bootstrap-lto-lean
--enable-link-serialization=4
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (Debian 13.2.0-4)
COLLECT_GCC_OPTIONS='-c' '-fno-openmp' '-fno-openacc' '-fPIC'
'-fcf-protection=none' '-msecure-plt' '-O2' '-fasynchronous-unwind-tables' '-v'
'-O2' '-shared-libgcc' '-fltrans' '-o' '/tmp/ccjomOFj.ltrans0.ltrans.o'
'-shared-libgcc'
/usr/libexec/gcc/powerpc64le-linux-gnu/13/lto1 -quiet -dumpbase
./a.ltrans0.ltrans -msecure-plt -O2 -O2 -version -fno-openmp -fno-openacc -fPIC
-fcf-protection=none -fasynchronous-unwind-tables -fltrans @/tmp/ccli6r4u -o
/tmp/ccOP9VCS.s
GNU GIMPLE (Debian 13.2.0-4) version 13.2.0 (powerpc64le-linux-gnu)
compiled by GNU C version 13.2.0, GMP version 6.3.0, MPFR version
4.2.1, MPC version 1.3.1, isl version isl-0.26-GMP
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
COLLECT_GCC_OPTIONS='-c' '-fno-openmp' '-fno-openacc' '-fPIC'
'-fcf-protection=none' '-msecure-plt' '-O2' '-fasynchronous-unwind-tables' '-v'
'-O2' '-shared-libgcc' '-fltrans' '-o' '/tmp/ccjomOFj.ltrans0.ltrans.o'
'-shared-libgcc'
as -v -a64 -mpower8 -many -mlittle -o /tmp/ccjomOFj.ltrans0.ltrans.o
/tmp/ccOP9VCS.s
GNU assembler version 2.41 (powerpc64le-linux-gnu) using BFD version (GNU
Binutils for Debian) 2.41
COMPILER_PATH=/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/../lib/:/lib/../lib/powerpc64le-linux-gnu/:/lib/../lib/../lib/:/usr/lib/../lib/powerpc64le-linux-gnu/:/usr/lib/../lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../:/lib/:/usr/lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-c' '-fno-openmp' '-fno-openacc' '-fPIC'
'-fcf-protection=none' '-msecure-plt' '-O2' '-fasynchronous-unwind-tables' '-v'
'-O2' '-shared-libgcc' '-fltrans' '-o' '/tmp/ccjomOFj.ltrans0.ltrans.o'
'-shared-libgcc' '-dumpdir' './a.ltrans0.ltrans.'
COLLECT_GCC_OPTIONS='-v' '-O2' '-flto' '-shared-libgcc' '-dumpdir' 'a.'
zsh: illegal hardware instruction ./a.out
* [Bug target/111522] Different code path for static initialization with flto
From: malat at debian dot org @ 2023-09-29 14:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522
--- Comment #11 from Mathieu Malaterre <malat at debian dot org> ---
Here is a dead simple reduced version:
```
% cat pr111522.cc
#include <iostream>
#include <cstring>
#pragma GCC push_options
#pragma GCC target "cpu=power10"
float BitCast(int in) {
float out;
memcpy(&out, &in, sizeof(out));
return out;
}
float kNearOneF = BitCast(1065353215);
#pragma GCC pop_options
int main() { std::cout << kNearOneF << std::endl; }
```
You can compare:
g++ -o works -O2 pr111522.cc -Wall -Wextra -Werror -Wfatal-errors
vs
g++ -o fails -flto -O2 pr111522.cc -Wall -Wextra -Werror -Wfatal-errors
For some reason, `-flto` rightfully generates a `xxspltidp` instruction:
(gdb) display/i $pc
1: x/i $pc
=> 0x100000940 <_Z7BitCasti.constprop.0>: xxspltidp vs1,1065353215
I am not sure I understand the behavior of the non-LTO case now...
* [Bug target/111522] Different code path for static initialization with flto
From: linkw at gcc dot gnu.org @ 2023-10-16 9:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522
Kewen Lin <linkw at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |INVALID
CC| |rguenth at gcc dot gnu.org
Status|WAITING |RESOLVED
--- Comment #12 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Mathieu Malaterre from comment #11)
> Here is a dead simple reduced version:
>
> ```
> % cat pr111522.cc
> #include <iostream>
> #include <cstring>
> #pragma GCC push_options
> #pragma GCC target "cpu=power10"
> float BitCast(int in) {
> float out;
> memcpy(&out, &in, sizeof(out));
> return out;
> }
> float kNearOneF = BitCast(1065353215);
> #pragma GCC pop_options
> int main() { std::cout << kNearOneF << std::endl; }
> ```
>
> You can compare:
>
> g++ -o works -O2 pr111522.cc -Wall -Wextra -Werror -Wfatal-errors
>
> vs
>
> g++ -o fails -flto -O2 pr111522.cc -Wall -Wextra -Werror -Wfatal-errors
>
> For some reason, `-flto` rightfully generates a `xxspltidp` instruction:
>
> (gdb) display/i $pc
> 1: x/i $pc
> => 0x100000940 <_Z7BitCasti.constprop.0>: xxspltidp vs1,1065353215
>
> I am not sure I understand the behavior of the non LTO case now...
I think this is a test issue. The given source code asks for the function
BitCast to be compiled with -mcpu=power10, so it is valid to generate power10
insns for it and for its specialized clones.
Without LTO, no power10 insn helps the general BitCast, so the generated insns
look like:
0000000010000b10 <_Z7BitCasti>:
10000b10: c6 07 69 78 rldicr r9,r3,32,31
10000b14: 66 01 29 7c mtfprd f1,r9
10000b18: 2c 0d 20 f0 xscvspdpn vs1,vs1
10000b1c: 20 00 80 4e blr
while with LTO, function versioning is able to create one specialized function
with the fixed argument 1065353215, and the newly created one is able to
leverage a power10 insn, so we have:
// specialized via constant propagation of the argument
0000000010000840 <_Z7BitCasti.constprop.0>:
10000840: 7f 3f 00 05 xxspltidp vs1,1065353215
10000844: ff ff 24 80
10000848: 20 00 80 4e blr
while the global variable initialization still uses power8 insns:
0000000010000940 <_GLOBAL__sub_I__Z7BitCasti>:
10000940: 02 10 40 3c lis r2,4098
10000944: 00 7f 42 38 addi r2,r2,32512
10000948: a6 02 08 7c mflr r0
1000094c: 10 00 01 f8 std r0,16(r1)
10000950: e1 ff 21 f8 stdu r1,-32(r1)
10000954: dd fe ff 4b bl 10000830 <00000184.long_branch.184:6>
10000958: 18 00 41 e8 ld r2,24(r1)
1000095c: 20 00 21 38 addi r1,r1,32
10000960: 00 00 00 60 nop
10000964: 10 00 01 e8 ld r0,16(r1)
10000968: 5c 81 22 d0 stfs f1,-32420(r2)
1000096c: a6 03 08 7c mtlr r0
10000970: 20 00 80 4e blr
If we specify -mcpu=power10 -flto, we can see that _GLOBAL__sub_I__Z7BitCasti
directly adopts p10 insns (which implicitly indicates that, with the default
-mcpu=power8, inlining considers it unsafe to inline _Z7BitCasti.constprop.0):
0000000010000900 <_GLOBAL__sub_I__Z7BitCasti>:
10000900: 7f 3f 00 05 xxspltidp vs0,1065353215
10000904: ff ff 04 80
10000908: 01 00 10 06 pstfs f0,128852 # 1002005c <kNearOneF>
1000090c: 54 f7 00 d0
10000910: 20 00 80 4e blr
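One way to keep such power10-only code off a power8 machine is to guard the call with a runtime CPU check. The following is a minimal sketch under stated assumptions, not what the upstream code does: `BitCastBaseline` and `BitCastP10` are hypothetical names, the check relies on GCC's PowerPC built-in `__builtin_cpu_supports("arch_3_1")`, and on any other compiler or target the sketch simply falls back to the baseline function.

```cpp
#include <cstring>

// Baseline version: compiled with the translation unit's default target
// (e.g. -mcpu=power8), so it is safe on every supported machine.
float BitCastBaseline(int in) {
  float out;
  std::memcpy(&out, &in, sizeof(out));
  return out;
}

#if defined(__powerpc64__) && defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC target("cpu=power10")
// The compiler may use power10-only insns (such as xxspltidp) here,
// including in clones specialized for a constant argument.
float BitCastP10(int in) {
  float out;
  std::memcpy(&out, &in, sizeof(out));
  return out;
}
#pragma GCC pop_options

// Hypothetical dispatcher, compiled for the default target: it asks the
// running CPU for ISA 3.1 support before entering power10 code.
float BitCast(int in) {
  return __builtin_cpu_supports("arch_3_1") ? BitCastP10(in)
                                            : BitCastBaseline(in);
}
#else
// Assumption for this sketch: everywhere else only the baseline exists.
float BitCast(int in) { return BitCastBaseline(in); }
#endif

// The initializer now only reaches power10 code when the CPU supports it.
float kNearOneF = BitCast(1065353215);
```

The dispatcher stays outside the push_options/pop_options region on purpose: as noted above, GCC considers it unsafe to inline a power10 function into a power8 caller, so the power10 insns stay behind the runtime check.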