public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/114676] New: [12/13/14 Regression] DSE removes assignment that is used later
@ 2024-04-10 10:52 aleksei.nikiforov at linux dot ibm.com
  2024-04-10 11:06 ` [Bug target/114676] " pinskia at gcc dot gnu.org
                   ` (17 more replies)
  0 siblings, 18 replies; 19+ messages in thread
From: aleksei.nikiforov at linux dot ibm.com @ 2024-04-10 10:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114676

            Bug ID: 114676
           Summary: [12/13/14 Regression] DSE removes assignment that is
                    used later
           Product: gcc
           Version: 12.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: aleksei.nikiforov at linux dot ibm.com
  Target Milestone: ---

Created attachment 57916
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57916&action=edit
GridSamplerKernel.cpp.ZVECTOR.cpp.o.prep2.cpp.bz2

When building pytorch on s390x with gcc >= 12, the resulting pytorch application
crashes in some tests. This doesn't happen with gcc <= 11. I've bisected gcc, and
the issue first appears with gcc commit 32955416d8040b1fa1ba21cd4179b3264e6c5bd6.
I've also found the object file in which the miscompilation happens.

gcc configuration:
/bin/sh /var/tmp/portage/sys-devel/gcc-12.3.9999/work/gcc-12.3.9999/configure
--host=s390x-ibm-linux-gnu --build=s390x-ibm-linux-gnu --prefix=/usr
--bindir=/usr/s390x-ibm-linux-gnu/gcc-bin/12
--includedir=/usr/lib/gcc/s390x-ibm-linux-gnu/12/include
--datadir=/usr/share/gcc-data/s390x-ibm-linux-gnu/12
--mandir=/usr/share/gcc-data/s390x-ibm-linux-gnu/12/man
--infodir=/usr/share/gcc-data/s390x-ibm-linux-gnu/12/info
--with-gxx-include-dir=/usr/lib/gcc/s390x-ibm-linux-gnu/12/include/g++-v12
--disable-silent-rules --disable-dependency-tracking
--with-python-dir=/share/gcc-data/s390x-ibm-linux-gnu/12/python
--enable-languages='c,c++,fortran' --enable-obsolete --enable-secureplt
--disable-werror --with-system-zlib --enable-nls --without-included-gettext
--disable-libunwind-exceptions --enable-checking=release
--with-bugurl='https://bugs.gentoo.org/' --with-pkgversion='Gentoo 12.0.0,
commit 32955416d8040b1fa1ba21cd4179b3264e6c5bd6' --with-gcc-major-version-only
--enable-libstdcxx-time --enable-lto --disable-libstdcxx-pch --enable-shared
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu
--disable-multilib --disable-fixed-point --enable-libgomp --disable-libssp
--disable-libada --disable-cet --disable-systemtap
--disable-valgrind-annotations --disable-vtable-verify --disable-libvtv
--without-zstd --without-isl --disable-libsanitizer --enable-default-pie
--enable-default-ssp --with-arch=z15

I'm attaching the preprocessed file. The full compilation command is:
/usr/bin/g++-12 -DAT_PER_OPERATOR_HEADERS -DCAFFE2_BUILD_MAIN_LIB
-DFMT_HEADER_ONLY=1 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1
-DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS
-DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DUSE_C10D_GLOO
-DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_RPC -DUSE_TENSORPIPE
-D_FILE_OFFSET_BITS=64 -Dtorch_cpu_EXPORTS
-I/home/user/work12/pytorch/build/aten/src -I/home/user/work12/pytorch/aten/src
-I/home/user/work12/pytorch/build -I/home/user/work12/pytorch
-I/home/user/work12/pytorch/cmake/../third_party/benchmark/include
-I/home/user/work12/pytorch/third_party/onnx
-I/home/user/work12/pytorch/build/third_party/onnx
-I/home/user/work12/pytorch/third_party/foxi
-I/home/user/work12/pytorch/build/third_party/foxi
-I/home/user/work12/pytorch/torch/csrc/api
-I/home/user/work12/pytorch/torch/csrc/api/include
-I/home/user/work12/pytorch/caffe2/aten/src/TH
-I/home/user/work12/pytorch/build/caffe2/aten/src/TH
-I/home/user/work12/pytorch/build/caffe2/aten/src
-I/home/user/work12/pytorch/build/caffe2/../aten/src
-I/home/user/work12/pytorch/torch/csrc
-I/home/user/work12/pytorch/third_party/miniz-2.1.0
-I/home/user/work12/pytorch/third_party/kineto/libkineto/include
-I/home/user/work12/pytorch/third_party/kineto/libkineto/src
-I/home/user/work12/pytorch/aten/src/ATen/.. -I/home/user/work12/pytorch/c10/..
-I/home/user/work12/pytorch/third_party/FP16/include
-I/home/user/work12/pytorch/third_party/tensorpipe
-I/home/user/work12/pytorch/build/third_party/tensorpipe
-I/home/user/work12/pytorch/third_party/tensorpipe/third_party/libnop/include
-I/home/user/work12/pytorch/third_party/fmt/include
-I/home/user/work12/pytorch/third_party/flatbuffers/include -isystem
/home/user/work12/pytorch/build/third_party/gloo -isystem
/home/user/work12/pytorch/cmake/../third_party/gloo -isystem
/home/user/work12/pytorch/cmake/../third_party/tensorpipe/third_party/libuv/include
-isystem
/home/user/work12/pytorch/cmake/../third_party/googletest/googlemock/include
-isystem
/home/user/work12/pytorch/cmake/../third_party/googletest/googletest/include
-isystem /home/user/work12/pytorch/third_party/protobuf/src -isystem
/home/user/work12/pytorch/cmake/../third_party/eigen -isystem
/home/user/work12/pytorch/build/include -march=z15 -D_GLIBCXX_USE_CXX11_ABI=1
-fvisibility-inlines-hidden -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI
-DLIBKINETO_NOROCTRACER -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall
-Wextra -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits
-Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter
-Wno-unused-function -Wno-unused-result -Wno-strict-overflow
-Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi
-Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces
-fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable
-Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math
-Wno-stringop-overflow -DHAVE_ZVECTOR_CPU_DEFINITION -O3 -DNDEBUG -DNDEBUG
-std=gnu++17 -fPIC -DTORCH_USE_LIBUV -DCAFFE2_USE_GLOO -Wall -Wextra
-Wdeprecated -Wno-unused-parameter -Wno-unused-function
-Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-type-limits
-Wno-array-bounds -Wno-strict-overflow -Wno-strict-aliasing
-Wno-maybe-uninitialized -fvisibility=hidden -O2 -fopenmp -O3    -mvx -mzvector
-march=z15 -mtune=z15 -DCPU_CAPABILITY=ZVECTOR -DCPU_CAPABILITY_ZVECTOR -MD -MT
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/GridSamplerKernel.cpp.ZVECTOR.cpp.o
-MF
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/GridSamplerKernel.cpp.ZVECTOR.cpp.o.d
-o
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/GridSamplerKernel.cpp.ZVECTOR.cpp.o
-c
/home/user/work12/pytorch/build/aten/src/ATen/native/cpu/GridSamplerKernel.cpp.ZVECTOR.cpp


The preprocessed file contains the following lines around line 121590:

    integer_t mask_arr[iVec::size()];
    mask.store(mask_arr);


    scalar_t gInp_corner_arr[Vec::size()];
    delta.store(gInp_corner_arr);

    mask_scatter_add(gInp_corner_arr, data, i_gInp_offset_arr, mask_arr, len);

The store call is defined at lines 117929-117940:
  void __attribute__((__always_inline__)) inline store(void* ptr, int count = size()) const {
    if (count == size()) {

# 421 "/home/user/work12/pytorch/aten/src/ATen/cpu/vec/vec256/zarch/vec256_zarch.h" 3 4
     __builtin_s390_vec_xst
# 421 "/home/user/work12/pytorch/aten/src/ATen/cpu/vec/vec256/zarch/vec256_zarch.h"
            (_vec0, offset0, reinterpret_cast<ElementType*>(ptr));

# 422 "/home/user/work12/pytorch/aten/src/ATen/cpu/vec/vec256/zarch/vec256_zarch.h" 3 4
     __builtin_s390_vec_xst
# 422 "/home/user/work12/pytorch/aten/src/ATen/cpu/vec/vec256/zarch/vec256_zarch.h"
            (_vec1, offset16, reinterpret_cast<ElementType*>(ptr));

mask.store(mask_arr) is first replaced by 2 corresponding calls to
__builtin_s390_vec_xst, and those are later incorrectly removed by DSE.
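
For illustration only, here is a minimal portable sketch of the pattern
described above. It is not a reproducer for the s390x miscompilation: Vec4i64,
mask_scatter_add_sketch and run_sketch are hypothetical stand-ins, std::memcpy
replaces __builtin_s390_vec_xst, and the types are simplified. It only shows
that the array written by store() is read afterwards, so the two stores are
not dead.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the 256-bit vector type: two 16-byte halves,
// mirroring the _vec0/_vec1 members seen in the preprocessed source.
struct Vec4i64 {
    std::int64_t vec0[2];
    std::int64_t vec1[2];

    static constexpr int size() { return 4; }

    // Models store(): two 16-byte writes at byte offsets 0 and 16.
    // memcpy is used instead of the s390 builtin so this builds anywhere.
    void store(void* ptr) const {
        auto* p = static_cast<unsigned char*>(ptr);
        std::memcpy(p, vec0, 16);
        std::memcpy(p + 16, vec1, 16);
    }
};

// Simplified scalar consumer (assumed shape, loosely modeled on the
// mask_scatter_add call in the report): adds src[i] where mask[i] is set.
inline void mask_scatter_add_sketch(const double* src, double* base,
                                    const std::int64_t* offsets,
                                    const std::int64_t* mask, int len) {
    assert(len >= 0);
    for (int i = 0; i < len; ++i) {
        if (mask[i] & 1) {
            base[offsets[i]] += src[i];
        }
    }
}

// Stores the mask into a local array, then reads it back element by element.
inline double run_sketch() {
    Vec4i64 mask{{1, 0}, {1, 1}};
    std::int64_t mask_arr[Vec4i64::size()];
    mask.store(mask_arr);  // live store: mask_arr is read by the call below

    const double src[4] = {1.0, 2.0, 4.0, 8.0};
    const std::int64_t offsets[4] = {0, 0, 0, 0};
    double data[1] = {0.0};
    mask_scatter_add_sketch(src, data, offsets, mask_arr, 4);
    return data[0];
}
```

If both stores into mask_arr were dropped, the loop would read uninitialized
memory, which matches the kind of failure described in this report.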

I've also run the compilation command with -fdump-tree-all-all -fdump-rtl-all-all.
In the *.040t.dse1 file I found the following lines:

;; Function at::native::{anonymous}::ApplyGridSample<double, 2,
at::native::detail::GridSamplerInterpolation::Bicubic,
at::native::detail::GridSamplerPadding::Border, true>::add_value_bounded
(_ZNK2at6native12_GLOBAL__N_115ApplyGridSampleIdLi2ELNS0_6detail24GridSamplerInterpolationE2ELNS3_18GridSamplerPaddingE1ELb1EE17add_value_boundedEPdlRKNS_3vec7ZVECTOR10VectorizedIdvEESD_SD_,
funcdef_no=13629, decl_uid=274419, cgraph_uid=8075, symbol_order=9478)


Pass statistics of "dse": ----------------

  Deleted dead store: # .MEM_369 = VDEF <.MEM_368>
MEM <const vtypeD.254540> [(ElementTypeD.254545 *)&mask_arrD.383153 + 16B] = _244;

  Deleted dead store: # .MEM_368 = VDEF <.MEM_360>
MEM <const vtypeD.254540> [(ElementTypeD.254545 *)&mask_arrD.383153] = _242;
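
As a general illustration of why these two deletions are suspicious, the
sketch below contrasts a store that really is dead with stores that are read
later. It is a simplified, portable example under stated assumptions: the
function names are hypothetical, and std::memcpy through a void* stands in for
the two vector stores at offsets 0 and 16.

```cpp
#include <cstdint>
#include <cstring>

// Case 1: the first store is dead. It is overwritten before any read,
// so deleting it cannot change the returned value.
inline int overwritten_before_read() {
    int buf[2];
    buf[0] = 1;  // dead store
    buf[0] = 2;
    buf[1] = 3;
    return buf[0] + buf[1];
}

// Case 2: both stores are live. They go through a void* and the array is
// read afterwards, so deleting either write would leave half of arr
// uninitialized.
inline std::int64_t read_after_store() {
    const std::int64_t lo[2] = {10, 20};
    const std::int64_t hi[2] = {30, 40};
    std::int64_t arr[4];
    void* ptr = arr;
    std::memcpy(ptr, lo, sizeof lo);                                  // bytes 0..15
    std::memcpy(static_cast<unsigned char*>(ptr) + 16, hi, sizeof hi);  // bytes 16..31
    std::int64_t sum = 0;
    for (int i = 0; i < 4; ++i) {
        sum += arr[i];
    }
    return sum;
}
```

The stores into mask_arr listed in the dump correspond to the second case,
since mask_arr is passed to mask_scatter_add after being written.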



Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
2024-04-10 10:52 [Bug tree-optimization/114676] New: [12/13/14 Regression] DSE removes assignment that is used later aleksei.nikiforov at linux dot ibm.com
2024-04-10 11:06 ` [Bug target/114676] " pinskia at gcc dot gnu.org
2024-04-10 11:22 ` aleksei.nikiforov at linux dot ibm.com
2024-04-10 11:25 ` pinskia at gcc dot gnu.org
2024-04-10 11:28 ` aleksei.nikiforov at linux dot ibm.com
2024-04-10 11:33 ` pinskia at gcc dot gnu.org
2024-04-10 13:40 ` jakub at gcc dot gnu.org
2024-04-10 16:47 ` rguenth at gcc dot gnu.org
2024-04-11 12:30 ` krebbel at gcc dot gnu.org
2024-04-11 12:35 ` jakub at gcc dot gnu.org
2024-04-11 12:36 ` jakub at gcc dot gnu.org
2024-04-11 15:48 ` krebbel at gcc dot gnu.org
2024-04-12 13:32 ` law at gcc dot gnu.org
2024-04-12 18:36 ` sjames at gcc dot gnu.org
2024-04-17 17:01 ` krebbel at gcc dot gnu.org
2024-04-17 17:34 ` jakub at gcc dot gnu.org
2024-04-18 14:29 ` aleksei.nikiforov at linux dot ibm.com
2024-04-22  9:37 ` krebbel at gcc dot gnu.org
2024-04-23  8:17 ` cvs-commit at gcc dot gnu.org
