public inbox for gcc-bugs@sourceware.org
* [Bug target/111522] New: Different code path for static initialization with flto
@ 2023-09-21 15:04 malat at debian dot org
  2023-09-21 15:53 ` [Bug target/111522] " pinskia at gcc dot gnu.org
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: malat at debian dot org @ 2023-09-21 15:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

            Bug ID: 111522
           Summary: Different code path for static initialization with
                    flto
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: malat at debian dot org
  Target Milestone: ---

Created attachment 55962
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55962&action=edit
c++ code

I am seeing a difference in behavior when using -flto. Consider the following:

% c++ -std=c++11  -g -O2  -o works lto.cc
% c++ -std=c++11  -g -O2 -flto -o fails lto.cc
% ./works&& echo "success"
success
% ./fails
zsh: illegal hardware instruction  ./fails

This is a POWER8 system.


* [Bug target/111522] Different code path for static initialization with flto
  2023-09-21 15:04 [Bug target/111522] New: Different code path for static initialization with flto malat at debian dot org
@ 2023-09-21 15:53 ` pinskia at gcc dot gnu.org
  2023-09-25  7:07 ` malat at debian dot org
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-09-21 15:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-09-21

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think this is just broken code.

It does:
#define HWY_BEFORE_NAMESPACE()                        \
  HWY_PUSH_ATTRIBUTES("altivec,vsx,power8-vector"     \
                      ",cpu=power10")

But does not do a pop before the main function.

And then you are testing on power8, which obviously does not have all of the
instructions that power10 has.
That it works without -flto is pure accident: the generated code just happens
not to use any instruction that is missing on power8.

Anyway, I suspect this testcase has been reduced too much, so you might need to
provide the original one.


* [Bug target/111522] Different code path for static initialization with flto
  2023-09-21 15:04 [Bug target/111522] New: Different code path for static initialization with flto malat at debian dot org
  2023-09-21 15:53 ` [Bug target/111522] " pinskia at gcc dot gnu.org
@ 2023-09-25  7:07 ` malat at debian dot org
  2023-09-25  7:08 ` malat at debian dot org
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: malat at debian dot org @ 2023-09-25  7:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

--- Comment #2 from Mathieu Malaterre <malat at debian dot org> ---
(In reply to Andrew Pinski from comment #1)
> I think this is just broken code.
> 
> It does:
> #define HWY_BEFORE_NAMESPACE()                        \
>   HWY_PUSH_ATTRIBUTES("altivec,vsx,power8-vector"     \
>                       ",cpu=power10")
> 
> But does not do a pop before the main function.
> 
> And then you are testing on power8, which obviously does not have all of the
> instructions that power10 has.
> That it works without -flto is pure accident: the generated code just happens
> not to use any instruction that is missing on power8.
> 
> Anyway, I suspect this testcase has been reduced too much, so you might need to
> provide the original one.

I reported this one after reading #111380. Honestly, there is no "wrong-code"
here. The LTO case is simply an eager init of a global variable, while the
non-LTO case is a lazy init of the global variable. So the original (upstream)
code is somewhat buggy, as it relies on lazy init for the global variable.

Could someone please confirm that eager init of global variables is expected in
the LTO case? Then we could just close this one.
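
The eager/lazy distinction mentioned above can be sketched in plain C++,
independently of LTO. In practice a namespace-scope variable with a
non-constant initializer is initialized before `main` starts, while a
function-local static is initialized the first time control reaches its
declaration. The names `g_init_calls`, `BitCastCounted`, `kEager` and
`LazyNearOne` are hypothetical and exist only for this illustration:

```cpp
#include <cstring>

// Counts how often the initializer helper runs. It is constant-initialized,
// so it is safe to touch during dynamic initialization of other globals.
int g_init_calls = 0;

// Same memcpy-based bit cast as in the report, plus a call counter.
float BitCastCounted(int in) {
  ++g_init_calls;
  float out;
  std::memcpy(&out, &in, sizeof(out));
  return out;
}

// "Eager": dynamic initialization of a namespace-scope variable. With GCC
// this runs from a _GLOBAL__sub_I_* constructor before main.
float kEager = BitCastCounted(1065353215);

// "Lazy": the local static is initialized on the first call only.
float LazyNearOne() {
  static const float value = BitCastCounted(1065353215);
  return value;
}
```

If the helper is compiled for a newer CPU than the machine running the
program, the eager form executes it unconditionally at startup, whereas the
lazy form only executes it once `LazyNearOne()` is actually called.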


* [Bug target/111522] Different code path for static initialization with flto
  2023-09-21 15:04 [Bug target/111522] New: Different code path for static initialization with flto malat at debian dot org
  2023-09-21 15:53 ` [Bug target/111522] " pinskia at gcc dot gnu.org
  2023-09-25  7:07 ` malat at debian dot org
@ 2023-09-25  7:08 ` malat at debian dot org
  2023-09-25  7:14 ` malat at debian dot org
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: malat at debian dot org @ 2023-09-25  7:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

--- Comment #3 from Mathieu Malaterre <malat at debian dot org> ---
For reference:

* https://github.com/google/highway/commit/fea3dba9cfec3a74ddcd8ecac3a5d4d8429191e4


* [Bug target/111522] Different code path for static initialization with flto
  2023-09-21 15:04 [Bug target/111522] New: Different code path for static initialization with flto malat at debian dot org
                   ` (2 preceding siblings ...)
  2023-09-25  7:08 ` malat at debian dot org
@ 2023-09-25  7:14 ` malat at debian dot org
  2023-09-25  7:23 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: malat at debian dot org @ 2023-09-25  7:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

--- Comment #4 from Mathieu Malaterre <malat at debian dot org> ---
> So the original
> (upstream) code is somewhat buggy as it relies on lazy init for global var.

Those global vars are in a different namespace; I actually fail to understand
why the definition with ",cpu=power10" gets pulled in...


* [Bug target/111522] Different code path for static initialization with flto
  2023-09-21 15:04 [Bug target/111522] New: Different code path for static initialization with flto malat at debian dot org
                   ` (3 preceding siblings ...)
  2023-09-25  7:14 ` malat at debian dot org
@ 2023-09-25  7:23 ` pinskia at gcc dot gnu.org
  2023-09-25  9:10 ` malat at debian dot org
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-09-25  7:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Mathieu Malaterre from comment #4)
> > So the original
> > (upstream) code is somewhat buggy as it relies on lazy init for global var.
> 
> Those global vars are in a different namespace; I actually fail to understand
> why the definition with ",cpu=power10" gets pulled in...

Because `#pragma GCC target targets_str` is global state and unrelated to
namespace ...
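
To illustrate the point: the target pragma stays in effect for everything
defined after it until a matching `pop_options`, and closing a namespace does
not end it. A minimal sketch, assuming GCC on powerpc64; the guard macro
`SKETCH_P10_PRAGMAS` and all function names are hypothetical, and on other
compilers or architectures the pragmas are simply skipped:

```cpp
#include <cstring>

// Assumption for this sketch: only emit the POWER-specific pragmas when
// building with GCC for powerpc64, so the file still compiles elsewhere.
#if defined(__GNUC__) && !defined(__clang__) && defined(__powerpc64__)
#define SKETCH_P10_PRAGMAS 1
#endif

#ifdef SKETCH_P10_PRAGMAS
#pragma GCC push_options
#pragma GCC target "cpu=power10"
#endif

namespace p10 {
// Defined while the target pragma is active: may use power10 instructions.
float bit_cast(int in) {
  float out;
  std::memcpy(&out, &in, sizeof(out));
  return out;
}
}  // namespace p10 (closing the namespace does NOT restore the old target)

// Still compiled for power10, because no pop_options has been seen yet.
int after_namespace(int x) { return x + 1; }

#ifdef SKETCH_P10_PRAGMAS
#pragma GCC pop_options
#endif

// Only from here on is the command-line target (e.g. power8) back in effect.
int baseline(int x) { return x + 1; }
```

Anything placed between the push and the pop, including `main` or global
initializers, is fair game for power10 instructions.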


* [Bug target/111522] Different code path for static initialization with flto
  2023-09-21 15:04 [Bug target/111522] New: Different code path for static initialization with flto malat at debian dot org
                   ` (4 preceding siblings ...)
  2023-09-25  7:23 ` pinskia at gcc dot gnu.org
@ 2023-09-25  9:10 ` malat at debian dot org
  2023-09-25 10:28 ` malat at debian dot org
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: malat at debian dot org @ 2023-09-25  9:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

--- Comment #6 from Mathieu Malaterre <malat at debian dot org> ---
(In reply to Andrew Pinski from comment #5)
> (In reply to Mathieu Malaterre from comment #4)
> > > So the original
> > > (upstream) code is somewhat buggy as it relies on lazy init for global var.
> > 
> > Those global vars are in a different namespace; I actually fail to understand
> > why the definition with ",cpu=power10" gets pulled in...
> 
> Because `#pragma GCC target targets_str` is global state and unrelated to
> namespace ...

I forgot to mention that each per-namespace `#pragma GCC target` is wrapped in
`#pragma GCC push_options` / `#pragma GCC pop_options`. AFAIK this implements
"per namespace" target-specific options.
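
The push/pop pairing described here can be sketched with `_Pragma`-based
wrapper macros. The `SKETCH_*` macros below are hypothetical and are not the
upstream Highway definitions; the sketch assumes GCC on powerpc64 and expands
to nothing elsewhere:

```cpp
#if defined(__GNUC__) && !defined(__clang__) && defined(__powerpc64__)
// _Pragma takes a string literal, so stringize the pragma tokens.
#define SKETCH_PRAGMA(tokens) _Pragma(#tokens)
#define SKETCH_BEFORE_NAMESPACE()        \
  SKETCH_PRAGMA(GCC push_options)        \
  SKETCH_PRAGMA(GCC target "cpu=power10")
#define SKETCH_AFTER_NAMESPACE() SKETCH_PRAGMA(GCC pop_options)
#else
// Other compilers or architectures: the macros expand to nothing.
#define SKETCH_BEFORE_NAMESPACE()
#define SKETCH_AFTER_NAMESPACE()
#endif

SKETCH_BEFORE_NAMESPACE()
namespace sketch_p10 {
// Compiled with the power10 target when the pragmas are active.
int Twice(int x) { return 2 * x; }
}  // namespace sketch_p10
SKETCH_AFTER_NAMESPACE()

// Compiled with the command-line target again, because the pop restored it.
int TwiceBaseline(int x) { return 2 * x; }
```

The options are "per namespace" only as long as every `BEFORE` is matched by
an `AFTER` right after the namespace closes; the pragmas themselves know
nothing about namespaces.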


* [Bug target/111522] Different code path for static initialization with flto
  2023-09-21 15:04 [Bug target/111522] New: Different code path for static initialization with flto malat at debian dot org
                   ` (5 preceding siblings ...)
  2023-09-25  9:10 ` malat at debian dot org
@ 2023-09-25 10:28 ` malat at debian dot org
  2023-09-25 10:33 ` malat at debian dot org
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: malat at debian dot org @ 2023-09-25 10:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

--- Comment #7 from Mathieu Malaterre <malat at debian dot org> ---
Created attachment 55987
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55987&action=edit
gcc -E -P


* [Bug target/111522] Different code path for static initialization with flto
  2023-09-21 15:04 [Bug target/111522] New: Different code path for static initialization with flto malat at debian dot org
                   ` (6 preceding siblings ...)
  2023-09-25 10:28 ` malat at debian dot org
@ 2023-09-25 10:33 ` malat at debian dot org
  2023-09-25 10:34 ` malat at debian dot org
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: malat at debian dot org @ 2023-09-25 10:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

--- Comment #8 from Mathieu Malaterre <malat at debian dot org> ---
Created attachment 55988
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55988&action=edit
gcc -E -P


* [Bug target/111522] Different code path for static initialization with flto
  2023-09-21 15:04 [Bug target/111522] New: Different code path for static initialization with flto malat at debian dot org
                   ` (7 preceding siblings ...)
  2023-09-25 10:33 ` malat at debian dot org
@ 2023-09-25 10:34 ` malat at debian dot org
  2023-09-25 10:36 ` malat at debian dot org
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: malat at debian dot org @ 2023-09-25 10:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

--- Comment #9 from Mathieu Malaterre <malat at debian dot org> ---
If you download pr111522.cc from comment #8, you should be able to reproduce
exactly the original upstream issue.

Steps:

% c++ -O2 -flto pr111522.cc  && ./a.out


vs

% c++ -O2 pr111522.cc && ./a.out


* [Bug target/111522] Different code path for static initialization with flto
  2023-09-21 15:04 [Bug target/111522] New: Different code path for static initialization with flto malat at debian dot org
                   ` (8 preceding siblings ...)
  2023-09-25 10:34 ` malat at debian dot org
@ 2023-09-25 10:36 ` malat at debian dot org
  2023-09-29 14:41 ` malat at debian dot org
  2023-10-16  9:07 ` linkw at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: malat at debian dot org @ 2023-09-25 10:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

--- Comment #10 from Mathieu Malaterre <malat at debian dot org> ---
For reference:

% c++ --verbose  -O2 -flto   base2.cc  && ./a.out
Using built-in specs.
COLLECT_GCC=c++
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/powerpc64le-linux-gnu/13/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: powerpc64le-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 13.2.0-4'
--with-bugurl=file:///usr/share/doc/gcc-13/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-13
--program-prefix=powerpc64le-linux-gnu- --enable-shared
--enable-linker-build-id --libexecdir=/usr/libexec --without-included-gettext
--enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap
--enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --with-libphobos-druntime-only=yes
--enable-objc-gc=auto --enable-secureplt --enable-targets=powerpcle-linux
--disable-multilib --enable-multiarch --disable-werror --with-long-double-128
--enable-offload-targets=nvptx-none=/build/reproducible-path/gcc-13-13.2.0/debian/tmp-nvptx/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=powerpc64le-linux-gnu --host=powerpc64le-linux-gnu
--target=powerpc64le-linux-gnu --with-build-config=bootstrap-lto-lean
--enable-link-serialization=4
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (Debian 13.2.0-4) 
COLLECT_GCC_OPTIONS='-v' '-O2' '-flto' '-shared-libgcc' '-dumpdir' 'a-'
 /usr/libexec/gcc/powerpc64le-linux-gnu/13/cc1plus -quiet -v -imultiarch
powerpc64le-linux-gnu -D_GNU_SOURCE base2.cc -msecure-plt -quiet -dumpdir a-
-dumpbase base2.cc -dumpbase-ext .cc -O2 -version -flto
-fasynchronous-unwind-tables -o /tmp/cc1cimSD.s
GNU C++17 (Debian 13.2.0-4) version 13.2.0 (powerpc64le-linux-gnu)
        compiled by GNU C version 13.2.0, GMP version 6.3.0, MPFR version
4.2.1, MPC version 1.3.1, isl version isl-0.26-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory
"/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../include/powerpc64-linux-gnu/c++/13"
ignoring nonexistent directory "/usr/local/include/powerpc64le-linux-gnu"
ignoring nonexistent directory
"/usr/lib/gcc/powerpc64le-linux-gnu/13/include-fixed/powerpc64le-linux-gnu"
ignoring nonexistent directory
"/usr/lib/gcc/powerpc64le-linux-gnu/13/include-fixed"
ignoring nonexistent directory
"/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../powerpc64le-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/include/c++/13
 /usr/include/powerpc64le-linux-gnu/c++/13
 /usr/include/c++/13/backward
 /usr/lib/gcc/powerpc64le-linux-gnu/13/include
 /usr/local/include
 /usr/include/powerpc64le-linux-gnu
 /usr/include
End of search list.
Compiler executable checksum: 403ce0768541423839c6b7d8fd9dfeff
COLLECT_GCC_OPTIONS='-v' '-O2' '-flto' '-shared-libgcc' '-dumpdir' 'a-'
 as -v -a64 -mpower8 -many -mlittle -o /tmp/ccFzBgtQ.o /tmp/cc1cimSD.s
GNU assembler version 2.41 (powerpc64le-linux-gnu) using BFD version (GNU
Binutils for Debian) 2.41
COMPILER_PATH=/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-O2' '-flto' '-shared-libgcc' '-dumpdir' 'a.'
 /usr/libexec/gcc/powerpc64le-linux-gnu/13/collect2 -plugin
/usr/libexec/gcc/powerpc64le-linux-gnu/13/liblto_plugin.so
-plugin-opt=/usr/libexec/gcc/powerpc64le-linux-gnu/13/lto-wrapper
-plugin-opt=-fresolution=/tmp/ccSvdAAw.res -plugin-opt=-pass-through=-lgcc_s
-plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lc
-plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc -flto
--build-id --eh-frame-hdr -V -m elf64lppc --hash-style=gnu --as-needed
-dynamic-linker /lib64/ld64.so.2 -pie
/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/Scrt1.o
/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/crti.o
/usr/lib/gcc/powerpc64le-linux-gnu/13/crtbeginS.o
-L/usr/lib/gcc/powerpc64le-linux-gnu/13
-L/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu
-L/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib
-L/lib/powerpc64le-linux-gnu -L/lib/../lib -L/usr/lib/powerpc64le-linux-gnu
-L/usr/lib/../lib -L/usr/lib/gcc/powerpc64le-linux-gnu/13/../../..
/tmp/ccFzBgtQ.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc
/usr/lib/gcc/powerpc64le-linux-gnu/13/crtendS.o
/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/crtn.o
GNU ld (GNU Binutils for Debian) 2.41
  Supported emulations:
   elf64lppc
   elf32lppc
   elf32lppclinux
   elf32lppcsim
   elf64ppc
   elf32ppc
   elf32ppclinux
   elf32ppcsim
/usr/libexec/gcc/powerpc64le-linux-gnu/13/lto-wrapper
-fresolution=/tmp/ccSvdAAw.res -flinker-output=pie /tmp/ccFzBgtQ.o 
/usr/libexec/gcc/powerpc64le-linux-gnu/13/lto-wrapper
-fresolution=/tmp/ccSvdAAw.res -flinker-output=pie /tmp/ccFzBgtQ.o 
c++ @/tmp/ccHHJvsS
Using built-in specs.
COLLECT_GCC=c++
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: powerpc64le-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 13.2.0-4'
--with-bugurl=file:///usr/share/doc/gcc-13/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-13
--program-prefix=powerpc64le-linux-gnu- --enable-shared
--enable-linker-build-id --libexecdir=/usr/libexec --without-included-gettext
--enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap
--enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --with-libphobos-druntime-only=yes
--enable-objc-gc=auto --enable-secureplt --enable-targets=powerpcle-linux
--disable-multilib --enable-multiarch --disable-werror --with-long-double-128
--enable-offload-targets=nvptx-none=/build/reproducible-path/gcc-13-13.2.0/debian/tmp-nvptx/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=powerpc64le-linux-gnu --host=powerpc64le-linux-gnu
--target=powerpc64le-linux-gnu --with-build-config=bootstrap-lto-lean
--enable-link-serialization=4
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (Debian 13.2.0-4) 
COLLECT_GCC_OPTIONS='-c' '-fno-openmp' '-fno-openacc' '-fPIC'
'-fcf-protection=none' '-msecure-plt' '-O2' '-fasynchronous-unwind-tables' '-v'
'-O2' '-shared-libgcc'   '-fltrans-output-list=/tmp/ccjomOFj.ltrans.out'
'-fwpa' '-fresolution=/tmp/ccSvdAAw.res' '-flinker-output=pie' '-shared-libgcc'
 /usr/libexec/gcc/powerpc64le-linux-gnu/13/lto1 -quiet -dumpbase ./a.wpa
-msecure-plt -O2 -O2 -version -fno-openmp -fno-openacc -fPIC
-fcf-protection=none -fasynchronous-unwind-tables
-fltrans-output-list=/tmp/ccjomOFj.ltrans.out -fwpa
-fresolution=/tmp/ccSvdAAw.res -flinker-output=pie @/tmp/ccu56D0B
GNU GIMPLE (Debian 13.2.0-4) version 13.2.0 (powerpc64le-linux-gnu)
        compiled by GNU C version 13.2.0, GMP version 6.3.0, MPFR version
4.2.1, MPC version 1.3.1, isl version isl-0.26-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
COMPILER_PATH=/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/../lib/:/lib/../lib/powerpc64le-linux-gnu/:/lib/../lib/../lib/:/usr/lib/../lib/powerpc64le-linux-gnu/:/usr/lib/../lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../:/lib/:/usr/lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-c' '-fno-openmp' '-fno-openacc' '-fPIC'
'-fcf-protection=none' '-msecure-plt' '-O2' '-fasynchronous-unwind-tables' '-v'
'-O2' '-shared-libgcc'   '-fltrans-output-list=/tmp/ccjomOFj.ltrans.out'
'-fwpa' '-fresolution=/tmp/ccSvdAAw.res' '-flinker-output=pie' '-shared-libgcc'
'-dumpdir' './a.wpa.'
c++ @/tmp/ccHUbBpN
Using built-in specs.
COLLECT_GCC=c++
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: powerpc64le-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 13.2.0-4'
--with-bugurl=file:///usr/share/doc/gcc-13/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-13
--program-prefix=powerpc64le-linux-gnu- --enable-shared
--enable-linker-build-id --libexecdir=/usr/libexec --without-included-gettext
--enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap
--enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --with-libphobos-druntime-only=yes
--enable-objc-gc=auto --enable-secureplt --enable-targets=powerpcle-linux
--disable-multilib --enable-multiarch --disable-werror --with-long-double-128
--enable-offload-targets=nvptx-none=/build/reproducible-path/gcc-13-13.2.0/debian/tmp-nvptx/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=powerpc64le-linux-gnu --host=powerpc64le-linux-gnu
--target=powerpc64le-linux-gnu --with-build-config=bootstrap-lto-lean
--enable-link-serialization=4
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (Debian 13.2.0-4) 
COLLECT_GCC_OPTIONS='-c' '-fno-openmp' '-fno-openacc' '-fPIC'
'-fcf-protection=none' '-msecure-plt' '-O2' '-fasynchronous-unwind-tables' '-v'
'-O2' '-shared-libgcc'   '-fltrans' '-o' '/tmp/ccjomOFj.ltrans0.ltrans.o'
'-shared-libgcc'
 /usr/libexec/gcc/powerpc64le-linux-gnu/13/lto1 -quiet -dumpbase
./a.ltrans0.ltrans -msecure-plt -O2 -O2 -version -fno-openmp -fno-openacc -fPIC
-fcf-protection=none -fasynchronous-unwind-tables -fltrans @/tmp/ccli6r4u -o
/tmp/ccOP9VCS.s
GNU GIMPLE (Debian 13.2.0-4) version 13.2.0 (powerpc64le-linux-gnu)
        compiled by GNU C version 13.2.0, GMP version 6.3.0, MPFR version
4.2.1, MPC version 1.3.1, isl version isl-0.26-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
COLLECT_GCC_OPTIONS='-c' '-fno-openmp' '-fno-openacc' '-fPIC'
'-fcf-protection=none' '-msecure-plt' '-O2' '-fasynchronous-unwind-tables' '-v'
'-O2' '-shared-libgcc'   '-fltrans' '-o' '/tmp/ccjomOFj.ltrans0.ltrans.o'
'-shared-libgcc'
 as -v -a64 -mpower8 -many -mlittle -o /tmp/ccjomOFj.ltrans0.ltrans.o
/tmp/ccOP9VCS.s
GNU assembler version 2.41 (powerpc64le-linux-gnu) using BFD version (GNU
Binutils for Debian) 2.41
COMPILER_PATH=/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/13/:/usr/libexec/gcc/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/../lib/:/lib/../lib/powerpc64le-linux-gnu/:/lib/../lib/../lib/:/usr/lib/../lib/powerpc64le-linux-gnu/:/usr/lib/../lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../powerpc64le-linux-gnu/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../../lib/:/lib/powerpc64le-linux-gnu/:/lib/../lib/:/usr/lib/powerpc64le-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../:/lib/:/usr/lib/:/usr/lib/gcc/powerpc64le-linux-gnu/13/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-c' '-fno-openmp' '-fno-openacc' '-fPIC'
'-fcf-protection=none' '-msecure-plt' '-O2' '-fasynchronous-unwind-tables' '-v'
'-O2' '-shared-libgcc'   '-fltrans' '-o' '/tmp/ccjomOFj.ltrans0.ltrans.o'
'-shared-libgcc' '-dumpdir' './a.ltrans0.ltrans.'
COLLECT_GCC_OPTIONS='-v' '-O2' '-flto' '-shared-libgcc' '-dumpdir' 'a.'
zsh: illegal hardware instruction  ./a.out


* [Bug target/111522] Different code path for static initialization with flto
  2023-09-21 15:04 [Bug target/111522] New: Different code path for static initialization with flto malat at debian dot org
                   ` (9 preceding siblings ...)
  2023-09-25 10:36 ` malat at debian dot org
@ 2023-09-29 14:41 ` malat at debian dot org
  2023-10-16  9:07 ` linkw at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: malat at debian dot org @ 2023-09-29 14:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

--- Comment #11 from Mathieu Malaterre <malat at debian dot org> ---
Here is a dead simple reduced version:

```
% cat pr111522.cc
#include <iostream>
#include <cstring>
#pragma GCC push_options
#pragma GCC target "cpu=power10"
float BitCast(int in) {
  float out;
  memcpy(&out, &in, sizeof(out));
  return out;
}
float kNearOneF = BitCast(1065353215);
#pragma GCC pop_options
int main() { std::cout << kNearOneF << std::endl; }
```

You can compare:

g++ -o works -O2 pr111522.cc -Wall -Wextra -Werror -Wfatal-errors

vs

g++ -o fails -flto -O2 pr111522.cc -Wall -Wextra -Werror -Wfatal-errors

For some reason, only the `-flto` build generates an `xxspltidp` instruction
(rightfully so):

(gdb) display/i $pc
1: x/i $pc
=> 0x100000940 <_Z7BitCasti.constprop.0>:       xxspltidp vs1,1065353215

I am not sure I understand the behavior of the non-LTO case now...
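
As a side note on the constant in the reduced test: 1065353215 is 0x3F7FFFFF,
one less than the bit pattern of 1.0f (0x3F800000 = 1065353216). Assuming a
32-bit `int` and an IEEE 754 binary32 `float`, `BitCast(1065353215)` is
therefore the largest float below 1.0. A small sketch that checks this; the
helper name `NearOneIsPredecessorOfOne` is hypothetical:

```cpp
#include <cmath>
#include <cstring>
#include <limits>

// Assumptions of this sketch, checked at compile time.
static_assert(sizeof(float) == sizeof(int), "needs 32-bit int and float");
static_assert(std::numeric_limits<float>::is_iec559, "needs IEEE 754 float");

// Same memcpy-based bit cast as in the reduced test case.
float BitCast(int in) {
  float out;
  std::memcpy(&out, &in, sizeof(out));
  return out;
}

// True if 0x3F7FFFFF decodes to the float immediately below 1.0f and
// 0x3F800000 decodes to 1.0f itself.
bool NearOneIsPredecessorOfOne() {
  const float near_one = BitCast(1065353215);  // 0x3F7FFFFF
  const float one = BitCast(1065353216);       // 0x3F800000
  return one == 1.0f && near_one == std::nextafter(1.0f, 0.0f);
}
```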


* [Bug target/111522] Different code path for static initialization with flto
  2023-09-21 15:04 [Bug target/111522] New: Different code path for static initialization with flto malat at debian dot org
                   ` (10 preceding siblings ...)
  2023-09-29 14:41 ` malat at debian dot org
@ 2023-10-16  9:07 ` linkw at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: linkw at gcc dot gnu.org @ 2023-10-16  9:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
                 CC|                            |rguenth at gcc dot gnu.org
             Status|WAITING                     |RESOLVED

--- Comment #12 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Mathieu Malaterre from comment #11)
> Here is a dead simple reduced version:
> 
> ```
> % cat pr111522.cc
> #include <iostream>
> #include <cstring>
> #pragma GCC push_options
> #pragma GCC target "cpu=power10"
> float BitCast(int in) {
>   float out;
>   memcpy(&out, &in, sizeof(out));
>   return out;
> }
> float kNearOneF = BitCast(1065353215);
> #pragma GCC pop_options
> int main() { std::cout << kNearOneF << std::endl; }
> ```
> 
> You can compare:
> 
> g++ -o works -O2 pr111522.cc -Wall -Wextra -Werror -Wfatal-errors
> 
> vs
> 
> g++ -o fails -flto -O2 pr111522.cc -Wall -Wextra -Werror -Wfatal-errors
> 
> For some reason, only the `-flto` build generates an `xxspltidp` instruction
> (rightfully so):
> 
> (gdb) display/i $pc
> 1: x/i $pc
> => 0x100000940 <_Z7BitCasti.constprop.0>:       xxspltidp vs1,1065353215
> 
> I am not sure I understand the behavior of the non-LTO case now...

I think this is a test issue. The given source code asks for the function
BitCast to be compiled with -mcpu=power10, so it is valid to generate power10
insns for it and for its specialized clones.

Without LTO, no power10 insn helps the general BitCast, so the generated insns
look like:

0000000010000b10 <_Z7BitCasti>:
    10000b10:   c6 07 69 78     rldicr  r9,r3,32,31
    10000b14:   66 01 29 7c     mtfprd  f1,r9
    10000b18:   2c 0d 20 f0     xscvspdpn vs1,vs1
    10000b1c:   20 00 80 4e     blr

while with LTO, function versioning is able to create a specialized clone
with the fixed argument 1065353215, and the newly created clone is able to
leverage a power10 insn, so we have:

// specialized clone created by constant-argument propagation
0000000010000840 <_Z7BitCasti.constprop.0>:
    10000840:   7f 3f 00 05     xxspltidp vs1,1065353215
    10000844:   ff ff 24 80
    10000848:   20 00 80 4e     blr

while the global variable initialization still uses power8 insns:

0000000010000940 <_GLOBAL__sub_I__Z7BitCasti>:
    10000940:   02 10 40 3c     lis     r2,4098
    10000944:   00 7f 42 38     addi    r2,r2,32512
    10000948:   a6 02 08 7c     mflr    r0
    1000094c:   10 00 01 f8     std     r0,16(r1)
    10000950:   e1 ff 21 f8     stdu    r1,-32(r1)
    10000954:   dd fe ff 4b     bl      10000830 <00000184.long_branch.184:6>
    10000958:   18 00 41 e8     ld      r2,24(r1)
    1000095c:   20 00 21 38     addi    r1,r1,32
    10000960:   00 00 00 60     nop
    10000964:   10 00 01 e8     ld      r0,16(r1)
    10000968:   5c 81 22 d0     stfs    f1,-32420(r2)
    1000096c:   a6 03 08 7c     mtlr    r0
    10000970:   20 00 80 4e     blr

If we specify -mcpu=power10 -flto, we can see that _GLOBAL__sub_I__Z7BitCasti
directly uses p10 insns (which implicitly indicates that, with the default
-mcpu=power8, the inliner considers it unsafe to inline
_Z7BitCasti.constprop.0):

0000000010000900 <_GLOBAL__sub_I__Z7BitCasti>:
    10000900:   7f 3f 00 05     xxspltidp vs0,1065353215
    10000904:   ff ff 04 80
    10000908:   01 00 10 06     pstfs   f0,128852       # 1002005c <kNearOneF>
    1000090c:   54 f7 00 d0
    10000910:   20 00 80 4e     blr
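
Given this analysis, code compiled for power10 must not be reached on older
hardware. One possible way to arrange that is a runtime CPU check in front of
the call. The sketch below is an assumption-laden illustration, not the
upstream fix: it assumes GCC on powerpc64 with a glibc recent enough for
`__builtin_cpu_supports`, and a GCC release that accepts the `"arch_3_1"`
feature string (ISA 3.1, i.e. POWER10). The names `BitCastP10`,
`BitCastBaseline` and `BitCastDispatch` are hypothetical; on other platforms
only the baseline path is compiled:

```cpp
#include <cstring>

#if defined(__GNUC__) && !defined(__clang__) && defined(__powerpc64__)
#define SKETCH_HAVE_P10_PATH 1
#pragma GCC push_options
#pragma GCC target "cpu=power10"
// May be compiled (and cloned by LTO) using power10-only instructions.
float BitCastP10(int in) {
  float out;
  std::memcpy(&out, &in, sizeof(out));
  return out;
}
#pragma GCC pop_options
#endif

// Compiled for the command-line target, so it is safe on every machine the
// binary is meant to run on.
float BitCastBaseline(int in) {
  float out;
  std::memcpy(&out, &in, sizeof(out));
  return out;
}

// Calls the power10 variant only when the running CPU reports ISA 3.1.
float BitCastDispatch(int in) {
#ifdef SKETCH_HAVE_P10_PATH
  if (__builtin_cpu_supports("arch_3_1")) return BitCastP10(in);
#endif
  return BitCastBaseline(in);
}
```

With this shape, a global such as `float kNearOneF = BitCastDispatch(1065353215);`
would only execute the power10 clone on hardware that supports it.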
