public inbox for gcc-bugs@sourceware.org
* [Bug c++/106187] New: armhf: Miscompilation with -O2
@ 2022-07-04 16:23 mathieu.malaterre at gmail dot com
  2022-07-04 16:37 ` [Bug c++/106187] " mathieu.malaterre at gmail dot com
                   ` (59 more replies)
  0 siblings, 60 replies; 61+ messages in thread
From: mathieu.malaterre at gmail dot com @ 2022-07-04 16:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

            Bug ID: 106187
           Summary: armhf: Miscompilation with -O2
           Product: gcc
           Version: 11.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mathieu.malaterre at gmail dot com
  Target Milestone: ---

I can trigger an assertion failure in the highway unit test suite on armhf when
compiling with -O2 (it does not happen at -O0).

Symptoms:

% tests/mul_test
"--gtest_filter=HwyMulTestGroup/HwyMulTest.TestAllMulAdd/Emu128"
"--gtest_also_run_disabled_tests"
Running main() from ./googletest/src/gtest_main.cc
Note: Google Test filter = HwyMulTestGroup/HwyMulTest.TestAllMulAdd/Emu128
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HwyMulTestGroup/HwyMulTest
[ RUN      ] HwyMulTestGroup/HwyMulTest.TestAllMulAdd/Emu128


f32x4 expect [0+ ->]:
  5,11,19,29,
f32x4 actual [0+ ->]:
  -9,11,19,29,
Abort at /home/malat/highway/hwy/tests/mul_test.cc:308: Emu128, f32x4 lane 0
mismatch: expected '5', got '-9'.

zsh: abort      tests/mul_test  "--gtest_also_run_disabled_tests"


* [Bug c++/106187] armhf: Miscompilation with -O2
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
@ 2022-07-04 16:37 ` mathieu.malaterre at gmail dot com
  2022-07-04 16:37 ` mathieu.malaterre at gmail dot com
                   ` (58 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: mathieu.malaterre at gmail dot com @ 2022-07-04 16:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #1 from Mathieu Malaterre <mathieu.malaterre at gmail dot com> ---
It seems to work fine with:

% g++-12 --version
g++-12 (Debian 12.1.0-5) 12.1.0


* [Bug c++/106187] armhf: Miscompilation with -O2
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
  2022-07-04 16:37 ` [Bug c++/106187] " mathieu.malaterre at gmail dot com
@ 2022-07-04 16:37 ` mathieu.malaterre at gmail dot com
  2022-07-04 16:42 ` [Bug c++/106187] armhf: Miscompilation at all optimization levels mathieu.malaterre at gmail dot com
                   ` (57 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: mathieu.malaterre at gmail dot com @ 2022-07-04 16:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #2 from Mathieu Malaterre <mathieu.malaterre at gmail dot com> ---
g++-10 seems affected:

% g++-10 --version
g++-10 (Debian 10.4.0-1) 10.4.0


* [Bug c++/106187] armhf: Miscompilation at all optimization levels
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
  2022-07-04 16:37 ` [Bug c++/106187] " mathieu.malaterre at gmail dot com
  2022-07-04 16:37 ` mathieu.malaterre at gmail dot com
@ 2022-07-04 16:42 ` mathieu.malaterre at gmail dot com
  2022-07-04 20:19 ` pinskia at gcc dot gnu.org
                   ` (56 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: mathieu.malaterre at gmail dot com @ 2022-07-04 16:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #3 from Mathieu Malaterre <mathieu.malaterre at gmail dot com> ---
> I can trigger an assertion in highway unit test suite on armhf when using -O2 (does not happen at -O0).

The above sentence is wrong. I can make the symptoms go away using:

CXXFLAGS=-fsanitize=undefined


* [Bug c++/106187] armhf: Miscompilation at all optimization levels
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (2 preceding siblings ...)
  2022-07-04 16:42 ` [Bug c++/106187] armhf: Miscompilation at all optimization levels mathieu.malaterre at gmail dot com
@ 2022-07-04 20:19 ` pinskia at gcc dot gnu.org
  2022-07-04 20:22 ` pinskia at gcc dot gnu.org
                   ` (55 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-07-04 20:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
                URL|https://github.com/google/h |
                   |ighway/                     |

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
https://github.com/google/highway/


* [Bug c++/106187] armhf: Miscompilation at all optimization levels
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (3 preceding siblings ...)
  2022-07-04 20:19 ` pinskia at gcc dot gnu.org
@ 2022-07-04 20:22 ` pinskia at gcc dot gnu.org
  2022-07-05  7:18 ` [Bug target/106187] " mathieu.malaterre at gmail dot com
                   ` (54 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-07-04 20:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2022-07-04
     Ever confirmed|0                           |1

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Does it work with -fstrict-aliasing ?

Does adding -fsanitize=address report anything?

Please reduce it down to which file is being miscompiled at least. You can
compile the objects with -O2 and do a bisection of the ones needing to be
compiled with -O0.


* [Bug target/106187] armhf: Miscompilation at all optimization levels
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (4 preceding siblings ...)
  2022-07-04 20:22 ` pinskia at gcc dot gnu.org
@ 2022-07-05  7:18 ` mathieu.malaterre at gmail dot com
  2022-07-05  7:46 ` [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working) jan.wassenberg at gmail dot com
                   ` (53 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: mathieu.malaterre at gmail dot com @ 2022-07-05  7:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #6 from Mathieu Malaterre <mathieu.malaterre at gmail dot com> ---
(In reply to Andrew Pinski from comment #5)
> Does it work with -fstrict-aliasing ?

Yes, with and without valgrind I can reproduce the assert.

> Does adding -fsanitize=address report anything?

When I use either `-fsanitize=address` or `-fsanitize=undefined`, the symptoms
go away.

> Please reduce it down to which file is being miscompiled at least. You can
> compile the objects with -O2 and do a bisection of the ones needing to be
> compiled with -O0.

@jan, could you suggest a way to reduce:

...
ForFloatTypes<hwy::N_EMU128::ForPartialVectors<hwy::N_EMU128::TestMulAdd> >
...

using a non-brute-force approach? Thanks *very* much.


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (5 preceding siblings ...)
  2022-07-05  7:18 ` [Bug target/106187] " mathieu.malaterre at gmail dot com
@ 2022-07-05  7:46 ` jan.wassenberg at gmail dot com
  2022-07-05  8:00 ` mathieu.malaterre at gmail dot com
                   ` (52 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: jan.wassenberg at gmail dot com @ 2022-07-05  7:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #7 from Jan Wassenberg <jan.wassenberg at gmail dot com> ---
The easiest way to reduce the amount of code in the binary is to comment out
from mul_test.cc all the HWY_EXPORT_AND_TEST_P except the one with
TestAllMulEven.

The actual miscompilation is probably happening within ops/emu128-inl.h.

You can further reduce instantiations by replacing
ForFloatTypes(ForPartialVectors<TestMulAdd>());
with
TestMulAdd()(float(), FixedTag<float, 4>());

That gets us down to a fairly minimal single TU (mul_test.cc).


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (6 preceding siblings ...)
  2022-07-05  7:46 ` [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working) jan.wassenberg at gmail dot com
@ 2022-07-05  8:00 ` mathieu.malaterre at gmail dot com
  2022-07-07  7:50 ` mathieu.malaterre at gmail dot com
                   ` (51 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: mathieu.malaterre at gmail dot com @ 2022-07-05  8:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #8 from Mathieu Malaterre <mathieu.malaterre at gmail dot com> ---
(In reply to Jan Wassenberg from comment #7)
> The easiest way to reduce the amount of code in the binary is to comment out
> from mul_test.cc all the HWY_EXPORT_AND_TEST_P except the one with
> TestAllMulEven.
> 
> The actual miscompilation is probably happening within ops/emu128-inl.h.
> 
> You can further reduce instantiations by replacing
> ForFloatTypes(ForPartialVectors<TestMulAdd>());
> with
> TestMulAdd()(float(), FixedTag<float, 4>());
> 
> That gets us down to a fairly minimal single TU (mul_test.cc).

Bingo !

Program received signal SIGABRT, Aborted.
0xb6cafe86 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
(gdb) bt
#0  0xb6cafe86 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
#1  0xb6cbf0e4 in raise () from /lib/arm-linux-gnueabihf/libc.so.6
#2  0xb6cafa16 in abort () from /lib/arm-linux-gnueabihf/libc.so.6
#3  0x00416a2c in hwy::Abort (file=file@entry=0x43acf8
"/home/malat/highway/hwy/tests/mul_test.cc", line=line@entry=308,
format=0x43b6ec "%s, %sx%llu lane %llu mismatch: expected '%s', got '%s'.\n")
    at /home/malat/highway/hwy/targets.cc:322
#4  0x00416cbc in hwy::detail::PrintMismatchAndAbort (info=...,
expected_ptr=expected_ptr@entry=0x0, actual_ptr=0x467580,
actual_ptr@entry=0x43acf8, target_name=target_name@entry=0x43ad24 "Emu128",
    filename=0x43acf8 "/home/malat/highway/hwy/tests/mul_test.cc",
filename@entry=0xb6ff3010 "", line=line@entry=308, lane=0,
num_lanes=num_lanes@entry=4) at /home/malat/highway/hwy/tests/test_util.cc:90
#5  0x00416d54 in hwy::detail::AssertArrayEqual (info=...,
expected_void=expected_void@entry=0x467500,
actual_void=actual_void@entry=0x467580, N=N@entry=4, target_name=0x43ad24
"Emu128",
    filename=0x43acf8 "/home/malat/highway/hwy/tests/mul_test.cc", line=308) at
/home/malat/highway/hwy/tests/test_util.cc:113
#6  0x004108ae in hwy::N_EMU128::AssertVecEqual<hwy::N_EMU128::Simd<float, 4u,
0>, float, hwy::N_EMU128::Vec128<float, 4u> > (line=308, filename=0x43acf8
"/home/malat/highway/hwy/tests/mul_test.cc",
    actual=..., expected=0x467500, d=...) at
/home/malat/highway/hwy/tests/test_util-inl.h:51
#7  hwy::N_EMU128::TestMulAdd::operator()<float, hwy::N_EMU128::Simd<float, 4u,
0> > (d=..., this=<optimized out>) at
/home/malat/highway/hwy/tests/mul_test.cc:308
#8  0x0043a114 in void
testing::internal::HandleExceptionsInMethodIfSupported<testing::Test,
void>(testing::Test*, void (testing::Test::*)(), char const*) ()


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (7 preceding siblings ...)
  2022-07-05  8:00 ` mathieu.malaterre at gmail dot com
@ 2022-07-07  7:50 ` mathieu.malaterre at gmail dot com
  2022-07-07  8:00 ` mathieu.malaterre at gmail dot com
                   ` (50 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: mathieu.malaterre at gmail dot com @ 2022-07-07  7:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #9 from Mathieu Malaterre <mathieu.malaterre at gmail dot com> ---
Created attachment 53271
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53271&action=edit
object files compiled using gcc or gcc12


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (8 preceding siblings ...)
  2022-07-07  7:50 ` mathieu.malaterre at gmail dot com
@ 2022-07-07  8:00 ` mathieu.malaterre at gmail dot com
  2022-07-07  9:38 ` rearnsha at gcc dot gnu.org
                   ` (49 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: mathieu.malaterre at gmail dot com @ 2022-07-07  8:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #10 from Mathieu Malaterre <mathieu.malaterre at gmail dot com> ---
I uploaded the bad (gcc-11) and the good (gcc-12) object files. Not sure if
this is what was expected. In any case, let me know if you want me to provide
more info.


% gdb -batch -ex "disassemble/rs _ZN3hwy8N_EMU12813TestAllMulAddEv"
CMakeFiles/mul_test.dir/hwy/tests/mul_test.cc.o
Dump of assembler code for function _ZN3hwy8N_EMU12813TestAllMulAddEv:
/home/malat/highway/hwy/tests/mul_test.cc:
343     HWY_NOINLINE void TestAllMulAdd() {
344       //ForFloatTypes(ForPartialVectors<TestMulAdd>());
345       TestMulAdd()(float(), FixedTag<float, 4>());
   0x0000ead8 <+0>:     fa f7 64 bb     b.w     0x91a4
<_ZN3hwy8N_EMU12810TestMulAddclIfNS0_4SimdIfLj4ELi0EEEEEvT_T0_>
End of assembler dump.


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (9 preceding siblings ...)
  2022-07-07  8:00 ` mathieu.malaterre at gmail dot com
@ 2022-07-07  9:38 ` rearnsha at gcc dot gnu.org
  2022-07-08  9:01 ` mathieu.malaterre at gmail dot com
                   ` (48 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-07  9:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #11 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Object files are no use to us.  We need preprocessed source along with analysis
of what you think has gone wrong ("my program crashes" or "it prints the wrong
number" is rarely enough).  Also, please make sure that you have compiled with
all warnings enabled and that any relevant warnings have been fixed.

See https://gcc.gnu.org/bugs/ for more details about what we need.


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (10 preceding siblings ...)
  2022-07-07  9:38 ` rearnsha at gcc dot gnu.org
@ 2022-07-08  9:01 ` mathieu.malaterre at gmail dot com
  2022-07-08  9:01 ` mathieu.malaterre at gmail dot com
                   ` (47 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: mathieu.malaterre at gmail dot com @ 2022-07-08  9:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

Mathieu Malaterre <mathieu.malaterre at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #53271|0                           |1
        is obsolete|                            |

--- Comment #12 from Mathieu Malaterre <mathieu.malaterre at gmail dot com> ---
Created attachment 53276
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53276&action=edit
gcc11 -save-temps


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (11 preceding siblings ...)
  2022-07-08  9:01 ` mathieu.malaterre at gmail dot com
@ 2022-07-08  9:01 ` mathieu.malaterre at gmail dot com
  2022-07-08  9:03 ` mathieu.malaterre at gmail dot com
                   ` (46 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: mathieu.malaterre at gmail dot com @ 2022-07-08  9:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #13 from Mathieu Malaterre <mathieu.malaterre at gmail dot com> ---
Created attachment 53277
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53277&action=edit
gcc-12 -save-temps


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (12 preceding siblings ...)
  2022-07-08  9:01 ` mathieu.malaterre at gmail dot com
@ 2022-07-08  9:03 ` mathieu.malaterre at gmail dot com
  2022-07-08 13:50 ` rearnsha at gcc dot gnu.org
                   ` (45 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: mathieu.malaterre at gmail dot com @ 2022-07-08  9:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #14 from Mathieu Malaterre <mathieu.malaterre at gmail dot com> ---
@Richard

I've uploaded the generated *.ii files (-save-temps), as discussed with
upstream:

* https://github.com/google/highway/issues/776#issuecomment-1177864014

I do not know the codebase very well so I cannot provide much help other than
"it fails the test suite on this single arch/opt flag"...


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (13 preceding siblings ...)
  2022-07-08  9:03 ` mathieu.malaterre at gmail dot com
@ 2022-07-08 13:50 ` rearnsha at gcc dot gnu.org
  2022-07-08 13:59 ` malat at debian dot org
                   ` (44 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-08 13:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #15 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
What's the output of "gcc -v" for the failing compiler(s)?


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (14 preceding siblings ...)
  2022-07-08 13:50 ` rearnsha at gcc dot gnu.org
@ 2022-07-08 13:59 ` malat at debian dot org
  2022-07-08 14:16 ` rearnsha at gcc dot gnu.org
                   ` (43 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: malat at debian dot org @ 2022-07-08 13:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #16 from Mathieu Malaterre <malat at debian dot org> ---
This one is producing wrong code:

% gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/11/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Debian 11.3.0-4'
--with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-11
--program-prefix=arm-linux-gnueabihf- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm
--disable-libquadmath --disable-libquadmath-support --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-sjlj-exceptions --with-arch=armv7-a+fp --with-float=hard
--with-mode=thumb --disable-werror --enable-checking=release
--build=arm-linux-gnueabihf --host=arm-linux-gnueabihf
--target=arm-linux-gnueabihf
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.3.0 (Debian 11.3.0-4)

This one seems to be working ok:

% gcc-12 -v
Using built-in specs.
COLLECT_GCC=gcc-12
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/12/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Debian 12.1.0-5'
--with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-12
--program-prefix=arm-linux-gnueabihf- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-libitm --disable-libquadmath
--disable-libquadmath-support --enable-plugin --enable-default-pie
--with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-sjlj-exceptions --with-arch=armv7-a+fp --with-float=hard
--with-mode=thumb --disable-werror --enable-checking=release
--build=arm-linux-gnueabihf --host=arm-linux-gnueabihf
--target=arm-linux-gnueabihf
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.1.0 (Debian 12.1.0-5)


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (15 preceding siblings ...)
  2022-07-08 13:59 ` malat at debian dot org
@ 2022-07-08 14:16 ` rearnsha at gcc dot gnu.org
  2022-07-08 14:18 ` malat at debian dot org
                   ` (42 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-08 14:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #17 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
And what options are you passing to cmake?


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (16 preceding siblings ...)
  2022-07-08 14:16 ` rearnsha at gcc dot gnu.org
@ 2022-07-08 14:18 ` malat at debian dot org
  2022-07-08 14:51 ` rearnsha at gcc dot gnu.org
                   ` (41 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: malat at debian dot org @ 2022-07-08 14:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #18 from Mathieu Malaterre <malat at debian dot org> ---
The complete line to generate the *.ii file is:

```
% /usr/bin/g++ -DHWY_STATIC_DEFINE -I/home/malat/highway -O2 -fstrict-aliasing
-ggdb3 -fPIE -fvisibility=hidden -fvisibility-inlines-hidden
-Wno-builtin-macro-redefined -D__DATE__=\"redacted\"
-D__TIMESTAMP__=\"redacted\" -D__TIME__=\"redacted\" -fmerge-all-constants
-Wall -Wextra -Wconversion -Wsign-conversion -Wvla -Wnon-virtual-dtor
-fmath-errno -fno-exceptions -DHWY_IS_TEST=1 -DGTEST_HAS_PTHREAD=1 -MD -MT
CMakeFiles/mul_test.dir/hwy/tests/mul_test.cc.o -MF
CMakeFiles/mul_test.dir/hwy/tests/mul_test.cc.o.d -o
CMakeFiles/mul_test.dir/hwy/tests/mul_test.cc.o -save-temps -c
/home/malat/highway/hwy/tests/mul_test.cc
```

Where the important cmake setup is:

```
CMAKE_BUILD_TYPE:STRING=None
CMAKE_CXX_FLAGS:STRING=-O2 -fstrict-aliasing -ggdb3
```


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (17 preceding siblings ...)
  2022-07-08 14:18 ` malat at debian dot org
@ 2022-07-08 14:51 ` rearnsha at gcc dot gnu.org
  2022-07-08 15:03 ` malat at debian dot org
                   ` (40 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-08 14:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #19 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Hmm, I'm not sure that makes sense to me. AFAICT highway requires an Arm CPU
with Neon, but the default compiler flags that you posted (-mthumb
-march=armv7-a+fp) do not provide that.


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (18 preceding siblings ...)
  2022-07-08 14:51 ` rearnsha at gcc dot gnu.org
@ 2022-07-08 15:03 ` malat at debian dot org
  2022-07-08 17:24 ` rearnsha at gcc dot gnu.org
                   ` (39 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: malat at debian dot org @ 2022-07-08 15:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #20 from Mathieu Malaterre <malat at debian dot org> ---
The Debian package highway 0.17.0-9 is not built with the Neon option (i.e.
ARM7=OFF in cmake); the code builds fine and the test suite runs fine (-O1):

* https://buildd.debian.org/status/fetch.php?pkg=highway&arch=armhf&ver=0.17.0-9&stamp=1657025640&raw=0

If you go through the logs of older builds, you should find the one with -O2
at:

* https://buildd.debian.org/status/fetch.php?pkg=highway&arch=armhf&ver=0.17.0-2&stamp=1655882301&raw=0

I can also build and use it just fine on Debian porterbox `abel.d.o` which is:

```
% cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 2 (v7l)
BogoMIPS        : 50.00
Features        : half thumb fastmult vfp edsp thumbee vfpv3 tls idiva idivt
vfpd32 lpae
CPU implementer : 0x56
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0x584
CPU revision    : 2
```


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (19 preceding siblings ...)
  2022-07-08 15:03 ` malat at debian dot org
@ 2022-07-08 17:24 ` rearnsha at gcc dot gnu.org
  2022-07-08 17:31 ` rearnsha at gcc dot gnu.org
                   ` (38 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-08 17:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #21 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
I've finally managed to reproduce the test failure (someone has turned off
emu128 in the sources I have).

Rebuilding the failing test with -fno-strict-aliasing causes the test to pass. 
That strongly suggests (though doesn't prove) this is a problem in the sources
with aliasing violations, rather than a bug in GCC.  Aliasing problems can be
very hit-and-miss because the rules give more freedom to the compiler to move
instructions around.  Indeed disabling the instruction scheduler passes also
causes the test to pass.
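
As an illustration of why such problems are hit-and-miss, here is a minimal
sketch. It is not taken from the highway sources, and the function name
`StoreBoth` is hypothetical. With -fstrict-aliasing the compiler may assume
that an `int*` and a `float*` never refer to the same object, so it is free to
reorder the two stores or to return the constant 1 without reloading `*i`. The
call is only well defined when the two pointers really refer to distinct
objects.

```cpp
// Hypothetical illustration; not code from highway or GCC.
// With -fstrict-aliasing (enabled at -O2), the compiler may assume that
// `i` and `f` point to different objects because their types differ.
int StoreBoth(int* i, float* f) {
  *i = 1;
  *f = 0.0f;  // assumed not to modify *i
  return *i;  // may be folded to the constant 1, or the stores reordered
}
```

If a caller passes the same storage through both pointers, the behaviour is
undefined, and whether the result looks right can depend on the optimization
level and on scheduling decisions.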


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (20 preceding siblings ...)
  2022-07-08 17:24 ` rearnsha at gcc dot gnu.org
@ 2022-07-08 17:31 ` rearnsha at gcc dot gnu.org
  2022-07-08 19:27 ` jan.wassenberg at gmail dot com
                   ` (37 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-08 17:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #22 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
I notice that the sources seem to do floating-point negation by casting values
to integers, xor-ing the sign bit and then casting the result back to a float. 
This is exactly the sort of operation that is likely to violate the aliasing
rules (though I don't know if this is the precise source of the problem in this
case).
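
For reference, a minimal sketch of the operation described above, written so
that it does not depend on type punning through pointers. This is not the
highway implementation; the helper name `NegateViaSignBit` is hypothetical and
the code assumes a 32-bit IEEE-754 `float`.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not the highway implementation: negates a float by
// flipping its sign bit. Assumes float is a 32-bit IEEE-754 value.
float NegateViaSignBit(float x) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "size mismatch");
  std::uint32_t bits;
  // An expression such as *reinterpret_cast<std::uint32_t*>(&x) would read a
  // float object through an unrelated integer type, which violates the
  // aliasing rules. Copying the bytes instead is well defined.
  std::memcpy(&bits, &x, sizeof(bits));
  bits ^= 0x80000000u;  // toggle the sign bit
  std::memcpy(&x, &bits, sizeof(x));
  return x;
}
```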


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (21 preceding siblings ...)
  2022-07-08 17:31 ` rearnsha at gcc dot gnu.org
@ 2022-07-08 19:27 ` jan.wassenberg at gmail dot com
  2022-07-14 13:03 ` rearnsha at gcc dot gnu.org
                   ` (36 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: jan.wassenberg at gmail dot com @ 2022-07-08 19:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #23 from Jan Wassenberg <jan.wassenberg at gmail dot com> ---
Thanks for having a look. For casting, we CopyBytes between the two
representations, which boils down to __builtin_memcpy
(https://github.com/google/highway/blob/master/hwy/base.h#L819). Is there some
other preferred way of doing this?
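
A rough sketch of the kind of helper meant here, for readers who do not have
the sources at hand. The name `CopyBytesSketch`, its signature and the
`FloatBits` wrapper are assumptions made for illustration; the real helper is
the one at the link above. `__builtin_memcpy` is a GCC/Clang builtin, and the
expected bit patterns assume a 32-bit IEEE-754 `float`.

```cpp
#include <cstddef>
#include <cstdint>

// Sketch of a CopyBytes-style helper. The name and signature are assumptions
// for illustration only; see hwy/base.h for the real one.
template <std::size_t kBytes, typename From, typename To>
void CopyBytesSketch(const From* from, To* to) {
  static_assert(sizeof(From) >= kBytes && sizeof(To) >= kBytes,
                "kBytes exceeds the size of an operand");
  // Copies the object representation; no object is accessed through an
  // lvalue of an incompatible type.
  __builtin_memcpy(to, from, kBytes);
}

// Returns the bit pattern of a float as an unsigned 32-bit integer.
std::uint32_t FloatBits(float f) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "size mismatch");
  std::uint32_t bits = 0;
  CopyBytesSketch<sizeof(bits)>(&f, &bits);
  return bits;
}
```

In C++20, `std::bit_cast` from `<bit>` expresses the same conversion directly.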


* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (22 preceding siblings ...)
  2022-07-08 19:27 ` jan.wassenberg at gmail dot com
@ 2022-07-14 13:03 ` rearnsha at gcc dot gnu.org
  2022-07-14 13:18 ` rearnsha at gcc dot gnu.org
                   ` (35 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-14 13:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #53276|0                           |1
        is obsolete|                            |
  Attachment #53277|0                           |1
        is obsolete|                            |

--- Comment #24 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Created attachment 53295
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53295&action=edit
reduced testcase

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (23 preceding siblings ...)
  2022-07-14 13:03 ` rearnsha at gcc dot gnu.org
@ 2022-07-14 13:18 ` rearnsha at gcc dot gnu.org
  2022-07-14 16:09 ` pinskia at gcc dot gnu.org
                   ` (34 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-14 13:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #25 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
A quick status update.

I've managed to reduce the testcase to the latest attachment.  The program is
heavily reduced (so some bits likely don't make much sense); the test still
'passes' when compiled with -fno-strict-aliasing but fails with the same error
when that option is omitted.

Looking at the assembler output of void
hwy::N_EMU128::TestMulAdd::operator()<float, hwy::N_EMU128::Simd<float, 4u, 0>
>(float, hwy::N_EMU128::Simd<float, 4u, 0>) [clone .isra.0]

we see (correct on left, incorrect on right):


        add     r3, sp, #148                    add     r3, sp, #148
        vmov.f32        s14, #3.0e+0            vmov.f32        s14, #3.0e+0
[1]     mov     r6, r4                          mov     r6, r4
        vmov.f32        s15, #2.0e+0            vmov.f32        s15, #2.0e+0
        add     r8, sp, #100                    add     r8, sp, #100
        add     lr, sp, #132                    add     lr, sp, #132
        ldm     r3, {r0, r1, r2, r3}            ldm     r3, {r0, r1, r2, r3}
        vstr.32 s14, [sp, #152]                 vstr.32 s14, [sp, #152]
        vmov.f32        s14, #4.0e+0            vmov.f32        s14, #4.0e+0
[2]     stm     r4, {r0, r1, r2, r3}  |         stm     r5, {r0, r1, r2, r3}
        add     ip, sp, #116                    add     ip, sp, #116
        vstr.32 s14, [sp, #156]                 vstr.32 s14, [sp, #156]
        vmov.f32        s14, #5.0e+0            vmov.f32        s14, #5.0e+0
        stm     r5, {r0, r1, r2, r3}  <
        add     r5, sp, #36                     add     r5, sp, #36
        add     r10, sp, #196                   add     r10, sp, #196
        vstr.32 s14, [sp, #160]                 vstr.32 s14, [sp, #160]
        add     r9, sp, #152                    add     r9, sp, #152
[3]     vldr.32 s14, [r6]                       vldr.32 s14, [r6]
[4]     stm     r8, {r0, r1, r2, r3}  |         stm     r4, {r0, r1, r2, r3}
        vmul.f32        s15, s14, s15           vmul.f32        s15, s14, s15
                                      >         stm     r8, {r0, r1, r2, r3}

At [1] we see that r6 and r4 hold the same value.  We also see that at [3] a
register is loaded using r6 as the base.  In the good code on the left, the STM
to r4 is at [2], but in the incorrect code it does not occur until [4], i.e.
immediately after the load at [3].

I need to dig a bit deeper now on this specific function to see if the alias
information is correct, or if it has somehow been lost/corrupted during the
compilation.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (24 preceding siblings ...)
  2022-07-14 13:18 ` rearnsha at gcc dot gnu.org
@ 2022-07-14 16:09 ` pinskia at gcc dot gnu.org
  2022-07-18 15:45 ` rearnsha at gcc dot gnu.org
                   ` (33 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-07-14 16:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |UNCONFIRMED
     Ever confirmed|1                           |0

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (25 preceding siblings ...)
  2022-07-14 16:09 ` pinskia at gcc dot gnu.org
@ 2022-07-18 15:45 ` rearnsha at gcc dot gnu.org
  2022-07-18 15:48 ` rearnsha at gcc dot gnu.org
                   ` (32 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-18 15:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #26 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
git bisect points to commit r11-9688 resolving the issue.  Before that commit
the ivopts pass generates:

  ivtmp.761_217 = (unsigned int) &au;
  _222 = &bu + 4;
  ivtmp.767_220 = (unsigned int) _222;
  _225 = (unsigned int) &au;
  _228 = _225 + 16;

  <bb 9> [local count: 858993457]:
  # prephitmp_136 = PHI <pretmp_120(10), 1073741824(8)>
  # prephitmp_32 = PHI <pretmp_18(10), 2147483648(8)>
  # ivtmp.761_278 = PHI <ivtmp.761_216(10), ivtmp.761_217(8)>
  # ivtmp.767_218 = PHI <ivtmp.767_219(10), ivtmp.767_220(8)>
  _16 = prephitmp_32 ^ prephitmp_136;
  _223 = (void *) ivtmp.761_278;
  MEM[(unsigned int *)_223] = _16;
  ivtmp.761_216 = ivtmp.761_278 + 4;
  if (ivtmp.761_216 != _228)
    goto <bb 10>; [75.00%]
  else
    goto <bb 11>; [25.00%]

  <bb 10> [local count: 644245086]:
  _230 = (void *) ivtmp.761_216;
  pretmp_120 = MEM[(unsigned int *)_230];
  _229 = (void *) ivtmp.767_218;
  pretmp_18 = MEM[(unsigned int *)_229];
  ivtmp.767_219 = ivtmp.767_218 + 4;
  goto <bb 9>; [100.00%]

And once that patch is applied we get:

  ivtmp.761_217 = (unsigned int) &au;
  ivtmp.766_220 = (unsigned int) &bu;
  _223 = (unsigned int) &au;
  _225 = _223 + 16;

  <bb 9> [local count: 858993457]:
  # prephitmp_136 = PHI <pretmp_120(10), 1073741824(8)>
  # prephitmp_32 = PHI <pretmp_18(10), 2147483648(8)>
  # ivtmp.761_278 = PHI <ivtmp.761_216(10), ivtmp.761_217(8)>
  # ivtmp.766_218 = PHI <ivtmp.766_219(10), ivtmp.766_220(8)>
  _16 = prephitmp_32 ^ prephitmp_136;
  _222 = (void *) ivtmp.761_278;
  MEM[(unsigned int *)_222] = _16;
  ivtmp.761_216 = ivtmp.761_278 + 4;
  if (ivtmp.761_216 != _225)
    goto <bb 10>; [75.00%]
  else
    goto <bb 11>; [25.00%]

The main difference being that in the 'bad' code we start with &bu + 4, while
in the good code we start with &bu.

I'm afraid I don't know enough about this code to take this further.  Richi?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (26 preceding siblings ...)
  2022-07-18 15:45 ` rearnsha at gcc dot gnu.org
@ 2022-07-18 15:48 ` rearnsha at gcc dot gnu.org
  2022-07-19  7:27 ` rguenth at gcc dot gnu.org
                   ` (31 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-18 15:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #27 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
BTW, compile flags for an arm-none-eabi compiler are:

cc1plus -march=armv7-a+fp -mfloat-abi=hard -O2 -quiet  -mthumb -fno-short-enums
-fno-exceptions -fvisibility-inlines-hidden -fmath-errno -fmerge-all-constants
-fvisibility=hidden -fstack-protector-strong -std=c++11

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (27 preceding siblings ...)
  2022-07-18 15:48 ` rearnsha at gcc dot gnu.org
@ 2022-07-19  7:27 ` rguenth at gcc dot gnu.org
  2022-07-19  9:00 ` rearnsha at gcc dot gnu.org
                   ` (30 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-07-19  7:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #28 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Earnshaw from comment #26)
> git bisect points to commit r11-9688 resolving the issue.  Before that
> commit the ivopts pass generates:
> 
>   ivtmp.761_217 = (unsigned int) &au;
>   _222 = &bu + 4;
>   ivtmp.767_220 = (unsigned int) _222;
>   _225 = (unsigned int) &au;
>   _228 = _225 + 16;
> 
>   <bb 9> [local count: 858993457]:
>   # prephitmp_136 = PHI <pretmp_120(10), 1073741824(8)>
>   # prephitmp_32 = PHI <pretmp_18(10), 2147483648(8)>
>   # ivtmp.761_278 = PHI <ivtmp.761_216(10), ivtmp.761_217(8)>
>   # ivtmp.767_218 = PHI <ivtmp.767_219(10), ivtmp.767_220(8)>
>   _16 = prephitmp_32 ^ prephitmp_136;
>   _223 = (void *) ivtmp.761_278;
>   MEM[(unsigned int *)_223] = _16;
>   ivtmp.761_216 = ivtmp.761_278 + 4;
>   if (ivtmp.761_216 != _228)
>     goto <bb 10>; [75.00%]
>   else
>     goto <bb 11>; [25.00%]
> 
>   <bb 10> [local count: 644245086]:
>   _230 = (void *) ivtmp.761_216;
>   pretmp_120 = MEM[(unsigned int *)_230];
>   _229 = (void *) ivtmp.767_218;
>   pretmp_18 = MEM[(unsigned int *)_229];
>   ivtmp.767_219 = ivtmp.767_218 + 4;
>   goto <bb 9>; [100.00%]
> 
> And once that patch is applied we get:
> 
>   ivtmp.761_217 = (unsigned int) &au;
>   ivtmp.766_220 = (unsigned int) &bu;
>   _223 = (unsigned int) &au;
>   _225 = _223 + 16;
> 
>   <bb 9> [local count: 858993457]:
>   # prephitmp_136 = PHI <pretmp_120(10), 1073741824(8)>
>   # prephitmp_32 = PHI <pretmp_18(10), 2147483648(8)>
>   # ivtmp.761_278 = PHI <ivtmp.761_216(10), ivtmp.761_217(8)>
>   # ivtmp.766_218 = PHI <ivtmp.766_219(10), ivtmp.766_220(8)>
>   _16 = prephitmp_32 ^ prephitmp_136;
>   _222 = (void *) ivtmp.761_278;
>   MEM[(unsigned int *)_222] = _16;
>   ivtmp.761_216 = ivtmp.761_278 + 4;
>   if (ivtmp.761_216 != _225)
>     goto <bb 10>; [75.00%]
>   else
>     goto <bb 11>; [25.00%]
> 
> The main difference being that in the 'bad' code we start with &bu + 4,
> while in the good code we start with &bu.
> 
> I'm afraid I don't know enough about this code to take this further.  Richi?

There's no functional difference.  You omitted BB10 after the patch, which
for me looks like

  <bb 10> [local count: 644245086]:
  # PT = { D.22767 }
  _228 = (voidD.73 *) ivtmp.741_281;
  [t.ii:2167:17] pretmp_155 = MEM[(unsigned intD.11 *)_228];
  [t.ii:2167:26] ivtmp.746_28 = ivtmp.746_299 + 4;
  # PT = { D.22768 }
  _227 = (voidD.73 *) ivtmp.746_28;
  [t.ii:2167:26] pretmp_183 = MEM[(unsigned intD.11 *)_227];
  goto <bb 9>; [100.00%]

so we changed from post-increment to pre-increment of 4 - the accesses
happen to the same memory location.
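
To illustrate the point, here is a small sketch (hypothetical names, not
taken from the testcase) that records the byte offsets, relative to &bu, read
by each loop shape.  It only models the induction variable and the order of
load and increment.

```cpp
#include <cstddef>
#include <vector>

// Shape before the patch: start at &bu + 4, load, then advance
// (post-increment).
std::vector<std::size_t> OffsetsPostIncrement(std::size_t iterations) {
  std::vector<std::size_t> offsets;
  std::size_t iv = 4;  // corresponds to &bu + 4
  for (std::size_t i = 0; i < iterations; ++i) {
    offsets.push_back(iv);  // the load uses the current value
    iv += 4;
  }
  return offsets;
}

// Shape after the patch: start at &bu, advance, then load (pre-increment).
std::vector<std::size_t> OffsetsPreIncrement(std::size_t iterations) {
  std::vector<std::size_t> offsets;
  std::size_t iv = 0;  // corresponds to &bu
  for (std::size_t i = 0; i < iterations; ++i) {
    iv += 4;
    offsets.push_back(iv);  // the load uses the incremented value
  }
  return offsets;
}
```

For any iteration count the two functions return the same sequence
(4, 8, 12, ...), which is why the ivopts change alone is not a functional
difference.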

I'm dumping with -alias-uid-lineno and alias info looks fine to me here.

It might very well be that the change above triggers a bug elsewhere.  Does
reverting the "fixing" revision make the issue appear on trunk as well?

The code at RTL expansion time looks reasonable (also from an aliasing POV).
If -fno-strict-aliasing fixes it, does -fno-schedule-insns{,2} also?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (28 preceding siblings ...)
  2022-07-19  7:27 ` rguenth at gcc dot gnu.org
@ 2022-07-19  9:00 ` rearnsha at gcc dot gnu.org
  2022-07-19  9:13 ` rguenth at gcc dot gnu.org
                   ` (29 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-19  9:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #29 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Thanks for having a look.  Yes, I was at a loss to understand how that change
(which is before the problematic hunk) would be the cause of the problem.  It
looks like we can rule that change out as a real fix.

> The code at RTL expansion time looks reasonable (also from an aliasing POV),
> if -fno-strict-aliasing fixes it, does -fno-schedule-insns{,2} also?

Yes, disabling scheduling also solves the issue.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (29 preceding siblings ...)
  2022-07-19  9:00 ` rearnsha at gcc dot gnu.org
@ 2022-07-19  9:13 ` rguenth at gcc dot gnu.org
  2022-07-21  9:25 ` rearnsha at gcc dot gnu.org
                   ` (28 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-07-19  9:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #30 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Earnshaw from comment #29)
> Thanks for having a look, yes, I was at a loss to understand how that change
> (which is before the problematic hunk) would be the cause of the problem.  It
> looks like we can rule that change out as a real fix.
> 
> > The code at RTL expansion time looks reasonable (also from an aliasing POV),
> > if -fno-strict-aliasing fixes it, does -fno-schedule-insns{,2} also?
> 
> Yes, disabling scheduling also solves the issue.

There are several sched* debug counters, so maybe bisecting to the wrong
schedule via -fdbg-cnt=sched_insn might work.  I think the above strongly
hints at either RTL/target messing up alias info somewhere or scheduling not
properly computing dependences.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (30 preceding siblings ...)
  2022-07-19  9:13 ` rguenth at gcc dot gnu.org
@ 2022-07-21  9:25 ` rearnsha at gcc dot gnu.org
  2022-07-22 12:52 ` rearnsha at gcc dot gnu.org
                   ` (27 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-21  9:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rearnsha at gcc dot gnu.org

--- Comment #31 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Created attachment 53331
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53331&action=edit
non-executable testcase

Further reduced testcase; no longer executable, but still shows the issue with
incorrect instruction re-ordering.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (31 preceding siblings ...)
  2022-07-21  9:25 ` rearnsha at gcc dot gnu.org
@ 2022-07-22 12:52 ` rearnsha at gcc dot gnu.org
  2022-07-22 13:24 ` rearnsha at gcc dot gnu.org
                   ` (26 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-22 12:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2022-07-04 00:00:00         |2022-07-22
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #32 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
So the problem is in postreload.

After register allocation we have (rewritten to make the short insns describe
what's happening):
r5=sp+0xb4
r7=sp+0x64
r4=sp+0xa4
r6=sp+0xa4

  102: r0:TI=[sp:SI + 0x84]     // Set 5
  103: [r4:SI] = r0:TI          // Set 5

  106: r0:TI=[r4:SI]            // Set 0 (memcpy)
  107: [r5:SI] = r0:TI          // Set 0

  110: r0:TI = [r5:SI]          // Set 2
  111: [r8:SI] = r0:TI          // Set 2

  114: r0:TI = [r8:SI]          // Set 2
  115: [lr:SI] = r0:TI          // Set 2

  118: r0:TI = [lr:SI]          // Set 2
  119: [r7:SI] = r0:TI          // Set 2

  122: r0:TI = [r7:SI]          // Set 2
  123: [r5:SI] = r0:TI          // Set 2

  126: r0:TI = [r5:SI]          // Set 2
  127: [SP:SI+0x14] = r0:TI     // Set 2

  130: r0:TI = [sp:SI+0x14]     // Set 2
  131: [r4:SI] = r0:TI          // Set 2

  143: s14:SF=[r6:SI]           // Set 1

Where alias set 1 is for float, alias set 2 is for v<float> and alias set 5 is
for v<int>.  Alias sets 1 and 2 conflict, but alias set 5 does not.

Postreload removes all the loads except the first and finally removes the store
at insn 131, because value-wise it replicates the store at insn 103.  But that
means that the alias dance through the memcpy is lost and so the compiler feels
it is now free (during sched2) to move insn 143 before insn 103.
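
The effect can be sketched with a toy model.  This is not GCC code: the
struct, the function names and the conflict table are assumptions made for
illustration, using only the alias-set relationships stated above (sets 1 and
2 conflict, set 5 conflicts with neither, and alias set 0 conflicts with
everything).

```cpp
#include <vector>

// Toy model only; every name here is hypothetical.
struct MemInsn {
  int uid;        // insn number from the dump above
  bool is_store;  // true for a store, false for a load
  int slot;       // stack offset standing in for the address
  int alias_set;  // 0 = memcpy, 1 = float, 2 = v<float>, 5 = v<int>
};

// Conflict relation as described in the comment above.
bool AliasSetsConflict(int a, int b) {
  if (a == 0 || b == 0) return true;  // set 0 aliases everything
  if (a == b) return true;
  return (a == 1 && b == 2) || (a == 2 && b == 1);
}

// True when some earlier store to the same slot has a conflicting alias set,
// i.e. a TBAA-based check would keep `load` ordered after that store.
bool LoadIsPinned(const std::vector<MemInsn>& earlier, const MemInsn& load) {
  for (const MemInsn& insn : earlier) {
    if (insn.is_store && insn.slot == load.slot &&
        AliasSetsConflict(insn.alias_set, load.alias_set)) {
      return true;
    }
  }
  return false;
}
```

With stores 103 (set 5) and 131 (set 2) both present at sp+0xa4, the load at
insn 143 (set 1) is pinned by insn 131.  Once insn 131 is deleted, only the
set 5 store remains and nothing in this model orders the load after it.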

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (32 preceding siblings ...)
  2022-07-22 12:52 ` rearnsha at gcc dot gnu.org
@ 2022-07-22 13:24 ` rearnsha at gcc dot gnu.org
  2022-07-25  6:12 ` rguenth at gcc dot gnu.org
                   ` (25 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-22 13:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #33 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
I suspect there is still a question, though, as to whether it is safe in
general for two objects with non-conflicting alias sets to share a stack slot.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (33 preceding siblings ...)
  2022-07-22 13:24 ` rearnsha at gcc dot gnu.org
@ 2022-07-25  6:12 ` rguenth at gcc dot gnu.org
  2022-07-25  9:44 ` rearnsha at gcc dot gnu.org
                   ` (24 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-07-25  6:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #34 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Earnshaw from comment #33)
> I suspect there is still a question, though, as to whether it is safe in
> general for two objects with non-conflicting alias sets to share a stack
> slot.

Might also be related to PR93946?  If postreload decides insn 131 is a
redundant store to insn 103 then it needs to check for alias set
compatibility.  Note that in some cases this is difficult (see PR101641).  I
wonder if the PR93946 fix is still incomplete - did you trace to the code
pieces in postreload.cc that remove the store?

There's no union involved here though but a memcpy used in BitCast.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (34 preceding siblings ...)
  2022-07-25  6:12 ` rguenth at gcc dot gnu.org
@ 2022-07-25  9:44 ` rearnsha at gcc dot gnu.org
  2022-07-25  9:50 ` rguenther at suse dot de
                   ` (23 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-25  9:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #35 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
> There's no union involved here though but a memcpy used in BitCast.
Agreed, but by creating a shared stack slot, the compiler is effectively
creating a union of its own, and I think that needs to be accounted for. 
update_alias_info_with_stack_vars handles the cases where we have pointers (at
the gimple level) into a shared stack slot, but doesn't (AFAICT) cater for RTL
lowering creating additional pointers (as it must since all objects on the
stack ultimately have to be addressed).

So if we create a shared stack slot for objects of different types, why do we
not also create an alias set for the combination of such types, much as we
would do for a union?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (35 preceding siblings ...)
  2022-07-25  9:44 ` rearnsha at gcc dot gnu.org
@ 2022-07-25  9:50 ` rguenther at suse dot de
  2022-07-25  9:59 ` rearnsha at gcc dot gnu.org
                   ` (22 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rguenther at suse dot de @ 2022-07-25  9:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #36 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 25 Jul 2022, rearnsha at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187
> 
> --- Comment #35 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
> > There's no union involved here though but a memcpy used in BitCast.
> Agreed, but by creating a shared stack slot, the compiler is effectively
> creating a union of its own, and I think that needs to be accounted for. 
> update_alias_info_with_stack_vars handles the cases where we have pointers (at
> the gimple level) into a shared stack slot, but doesn't (AFAICT) cater for RTL
> lowering creating additional pointers (as it must since all objects on the
> stack ultimately have to be addressed).
> 
> So if we create a shared stack slot for objects of different types, why do we
> not also create an alias set for the combination of such types, much as we
> would do for a union?

Note that the only thing we have to do is fix points-to info, the TBAA
info should be correct and OK even when objects share location, so there's
nothing we can do at RTL expansion time.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (36 preceding siblings ...)
  2022-07-25  9:50 ` rguenther at suse dot de
@ 2022-07-25  9:59 ` rearnsha at gcc dot gnu.org
  2022-07-25 10:24 ` rguenth at gcc dot gnu.org
                   ` (21 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-25  9:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #37 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
(In reply to rguenther@suse.de from comment #36)

> Note that the only thing we have to do is fix points-to info, the TBAA
> info should be correct and OK even when objects share location, so there's
> nothing we can do at RTL expansion time.

I haven't really studied the way the TBAA code works before, so I may have
missed something, but we clearly end up creating two MEMs for the same location
with non-conflicting alias sets.  So perhaps the problem is when we assign the
alias set when we create the MEM (it's taken from the original type, without
regard to the stack slot assignment).

What would be in the TBAA code to prevent

struct A
{
  int a[4];
};

struct B
{
  float b[4];
};

struct A x;
struct B y;

f ()
{
 struct A m;
 struct B n;
  ...
 x = m;   // m dead
 n = y;   // n born
 ...
}

from moving these two assignments past each other at the RTL level if they
shared the same stack slot?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (37 preceding siblings ...)
  2022-07-25  9:59 ` rearnsha at gcc dot gnu.org
@ 2022-07-25 10:24 ` rguenth at gcc dot gnu.org
  2022-07-25 10:26 ` [Bug rtl-optimization/106187] " rguenth at gcc dot gnu.org
                   ` (20 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-07-25 10:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #38 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Earnshaw from comment #37)
> (In reply to rguenther@suse.de from comment #36)
> 
> > Note that the only thing we have to do is fix points-to info, the TBAA
> > info should be correct and OK even when objects share location, so there's
> > nothing we can do at RTL expansion time.
> 
> I haven't really studied the way the TBAA code works before, so I may have
> missed something, but we clearly end up creating two MEMs for the same
> location with non-conflicting alias sets.  So perhaps the problem is when we
> assign the alias set when we create the MEM (it's taken from the original
> type, without regard to the stack slot assignment).
> 
> What would be in the TBAA code to prevent
> 
> struct A
> {
>   int a[4];
> };
> 
> struct B
> {
>   float b[4];
> };
> 
> struct A x;
> struct B y;
> 
> f ()
> {
>  struct A m;
>  struct B n;
>   ...
>  x = m;   // m dead
>  n = y;   // n born
>  ...
> }
> 
> from moving these two assignments past each other at the RTL level if they
> shared the same stack slot?

There's a WAR dependence between those assignments.  Write-after-read is
not allowed to use TBAA in our memory model (likewise write-after-write),
only read-after-write is.

One side-effect of this is that "redundant stores" (redundant in the sense
that the second store does not change any bits in the memory location)
are not always "redundant" with respect to the memory model.  Currently
we have to preserve those, because their effect is to change the effective
type of the memory location for downstream reads.

There's a (maybe too short) documentation about our TBAA memory model
in tree-ssa.texi at the very end.
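
As a rough sketch of the rule stated above (a hypothetical predicate, not a
GCC function): type-based alias information may be used to disambiguate a
read-after-write pair, but never a write-after-read or write-after-write
pair.

```cpp
// Hypothetical names for illustration only.
enum class DepKind { ReadAfterWrite, WriteAfterRead, WriteAfterWrite };

// May TBAA alone justify treating two accesses, which could touch the same
// location, as independent?
bool TbaaMayDisambiguate(DepKind kind, bool alias_sets_conflict) {
  if (alias_sets_conflict) return false;    // TBAA says they may alias
  return kind == DepKind::ReadAfterWrite;   // only RAW may use TBAA
}
```

In the example from comment #37, "x = m" reads m and "n = y" then writes n in
the same slot, so the pair is a write-after-read.  The predicate returns false
even though the int and float alias sets do not conflict, and the two
assignments must stay in order.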

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (38 preceding siblings ...)
  2022-07-25 10:24 ` rguenth at gcc dot gnu.org
@ 2022-07-25 10:26 ` rguenth at gcc dot gnu.org
  2022-07-25 10:33 ` rearnsha at gcc dot gnu.org
                   ` (19 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-07-25 10:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |rtl-optimization

--- Comment #39 from Richard Biener <rguenth at gcc dot gnu.org> ---
So to say - it's wrongdoing of postreload which should not remove the dead
store because it still needs to serve as a barrier for the subsequent load.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (39 preceding siblings ...)
  2022-07-25 10:26 ` [Bug rtl-optimization/106187] " rguenth at gcc dot gnu.org
@ 2022-07-25 10:33 ` rearnsha at gcc dot gnu.org
  2022-07-25 10:42 ` rguenth at gcc dot gnu.org
                   ` (18 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-25 10:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #40 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
reload_cse_noop_set_p is the function that decides the store is redundant.  For
this parallel case it's being called once for each set, but all the cases
return true, so the store insn gets removed.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (40 preceding siblings ...)
  2022-07-25 10:33 ` rearnsha at gcc dot gnu.org
@ 2022-07-25 10:42 ` rguenth at gcc dot gnu.org
  2022-07-25 10:48 ` rearnsha at gcc dot gnu.org
                   ` (17 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-07-25 10:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #41 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Earnshaw from comment #40)
> reload_cse_noop_set_p is the function that decides the store is redundant. 
> For this parallel case it's being called once for each set, but all the
> cases return true, so the store insn gets removed.

OK, so in the end it's rtx_equal_for_cselib_1 that needs similar treatment
as in r10-7635 but only for a WAR mode (when we check whether a SET_DEST
already contains SET_SRC).  I'm not sure if the info required is
eventually available, I'd have to trace where it decides the values are
the same.  If we can't reasonably do that we'd have to axe
reload_cse_noop_set_p when flag_strict_aliasing is in effect.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (41 preceding siblings ...)
  2022-07-25 10:42 ` rguenth at gcc dot gnu.org
@ 2022-07-25 10:48 ` rearnsha at gcc dot gnu.org
  2022-07-25 11:03 ` rguenther at suse dot de
                   ` (16 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-25 10:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #42 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
That would be unfortunate, it removes a lot of pointless loads in this case;
and even the store it removes ought to be safe, if it weren't for the corrupted
alias info that results (the values /are/ the same).

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (42 preceding siblings ...)
  2022-07-25 10:48 ` rearnsha at gcc dot gnu.org
@ 2022-07-25 11:03 ` rguenther at suse dot de
  2022-07-25 11:05 ` rguenth at gcc dot gnu.org
                   ` (15 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rguenther at suse dot de @ 2022-07-25 11:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #43 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 25 Jul 2022, rearnsha at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187
> 
> --- Comment #42 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
> That would be unfortunate, it removes a lot of pointless loads in this case;
> and even the store it removes ought to be safe, if it weren't for the corrupted
> alias info that results (the values /are/ the same).

We only need to scrap the case removing _stores_, removing loads is
all fine.  And the fix would, in this case, cause us to preserve
the store anyway (but I expect there are cases where we can detect
the store can be safely removed as well)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (43 preceding siblings ...)
  2022-07-25 11:03 ` rguenther at suse dot de
@ 2022-07-25 11:05 ` rguenth at gcc dot gnu.org
  2022-07-25 13:04 ` rearnsha at gcc dot gnu.org
                   ` (14 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-07-25 11:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #44 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to rguenther@suse.de from comment #43)
> On Mon, 25 Jul 2022, rearnsha at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187
> > 
> > --- Comment #42 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
> > That would be unfortunate, it removes a lot of pointless loads in this case;
> > and even the store it removes ought to be safe, if it weren't for the corrupted
> > alias info that results (the values /are/ the same).
> 
> We only need to scrap the case removing _stores_, removing loads is
> all fine.  And the fix would, in this case, cause us to preserve
> the store anyway (but I expect there are cases where we can detect
> the store can be safely removed as well)

Btw, there's pass_rtl_dse2 after postreload which may be able to catch
most of the important cases as well and it already received the "fix"
(and has the info we need readily available).

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (44 preceding siblings ...)
  2022-07-25 11:05 ` rguenth at gcc dot gnu.org
@ 2022-07-25 13:04 ` rearnsha at gcc dot gnu.org
  2022-07-25 14:45 ` rguenther at suse dot de
                   ` (13 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-25 13:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #45 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
The problem with changing rtx_equal_for_cselib_1 is that it is essentially
commutative in its operands - it doesn't disambiguate with x substituting for y
or vice-versa, so we cannot tell if an operation is a load or a store.

A minimal fix, which just suppresses stores would be:

@@ -81,6 +81,10 @@ reload_cse_noop_set_p (rtx set)
   if (cselib_reg_set_mode (SET_DEST (set)) != GET_MODE (SET_DEST (set)))
     return 0;

+  /* Fixme: we need to check that removing a store doesn't change
+     the alias computations.  */
+  if (flag_strict_aliasing && MEM_P (SET_DEST (set)))
+    return 0;
   return rtx_equal_for_cselib_p (SET_DEST (set), SET_SRC (set));
 }

But we could no-doubt improve on that.
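
To make the effect of that guard concrete, here is a toy model of it. It is only a sketch under assumptions: `toy_store` and `toy_store_removable_p` are hypothetical names, the `strict_aliasing` parameter stands in for `flag_strict_aliasing`, and the value-id comparison stands in for `rtx_equal_for_cselib_p`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a SET (not GCC's rtx).  */
struct toy_store {
  int dest_value_id;  /* value number assumed for the destination */
  int src_value_id;   /* value number assumed for the source */
  bool dest_is_mem;   /* models MEM_P (SET_DEST (set)) */
};

/* Models the guarded check: with strict aliasing enabled, a set whose
   destination is memory is never reported as a no-op, so stores are
   always kept.  Sets to non-memory destinations are judged as before.  */
bool toy_store_removable_p(const struct toy_store *set, bool strict_aliasing) {
  assert(set != NULL);
  if (strict_aliasing && set->dest_is_mem)
    return false;
  return set->dest_value_id == set->src_value_id;
}
```

The cost of this shape is visible in the model: a store that really is redundant is also kept whenever strict aliasing is on, which is why it is described as a minimal fix.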

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (45 preceding siblings ...)
  2022-07-25 13:04 ` rearnsha at gcc dot gnu.org
@ 2022-07-25 14:45 ` rguenther at suse dot de
  2022-07-27 13:35 ` rearnsha at gcc dot gnu.org
                   ` (12 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rguenther at suse dot de @ 2022-07-25 14:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #46 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 25 Jul 2022, rearnsha at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187
> 
> --- Comment #45 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
> The problem with changing rtx_equal_for_cselib_1 is that it is essentially
> commutative in its operands - it doesn't disambiguate with x substituting for y
> or vice-versa, so we cannot tell if an operation is a load or a store.

True, but the new special mode could require the first to be a load or 
store and the second a store taking place after the first arg (so we
have either WAW or WAR).

> A minimal fix, which just suppresses stores would be:
> 
> @@ -81,6 +81,10 @@ reload_cse_noop_set_p (rtx set)
>    if (cselib_reg_set_mode (SET_DEST (set)) != GET_MODE (SET_DEST (set)))
>      return 0;
> 
> +  /* Fixme: we need to check that removing a store doesn't change
> +     the alias computations.  */
> +  if (flag_strict_aliasing && MEM_P (SET_DEST (set)))
> +    return 0;
>    return rtx_equal_for_cselib_p (SET_DEST (set), SET_SRC (set));
>  }

Yeah, that works (does that catch all stores?  or at least all stores
that are simple enough for cselib to handle?).

> But we could no-doubt improve on that.

The issue here is that SET_SRC (set) is usually a REG, but we need
the corresponding earlier MEM that SET_DEST is equal to, and from which
the REG was derived, to decide whether the store is redundant from a
TBAA perspective as well.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (46 preceding siblings ...)
  2022-07-25 14:45 ` rguenther at suse dot de
@ 2022-07-27 13:35 ` rearnsha at gcc dot gnu.org
  2022-07-28 16:51 ` rearnsha at gcc dot gnu.org
                   ` (11 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-27 13:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #47 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Created attachment 53361
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53361&action=edit
Possible patch

A slightly more thorough attempt to avoid the problem by detecting when the
alias sets are known to conflict.  We track through the list of same values
that CSELIB has recorded to find one that writes the same location (because the
addresses are considered equal).

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (47 preceding siblings ...)
  2022-07-27 13:35 ` rearnsha at gcc dot gnu.org
@ 2022-07-28 16:51 ` rearnsha at gcc dot gnu.org
  2022-08-03  9:07 ` cvs-commit at gcc dot gnu.org
                   ` (10 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-07-28 16:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #48 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Improved version posted here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598993.html

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (48 preceding siblings ...)
  2022-07-28 16:51 ` rearnsha at gcc dot gnu.org
@ 2022-08-03  9:07 ` cvs-commit at gcc dot gnu.org
  2022-08-03  9:16 ` rearnsha at gcc dot gnu.org
                   ` (9 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-08-03  9:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #49 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Earnshaw <rearnsha@gcc.gnu.org>:

https://gcc.gnu.org/g:64ce76d940501cb04d14a0d36752b4f93473531c

commit r13-1948-g64ce76d940501cb04d14a0d36752b4f93473531c
Author: Richard Earnshaw <rearnsha@arm.com>
Date:   Wed Aug 3 10:01:51 2022 +0100

    cselib: add function to check if SET is redundant [PR106187]

    A SET operation that writes memory may have the same value as an
    earlier store but if the alias sets of the new and earlier store do
    not conflict then the set is not truly redundant.  This can happen,
    for example, if objects of different types share a stack slot.

    To fix this we define a new function in cselib that first checks for
    equality and if that is successful then finds the earlier store in the
    value history and checks the alias sets.

    The routine is used in two places elsewhere in the compiler:
    cfgcleanup and postreload.

    gcc/ChangeLog:

            PR rtl-optimization/106187
            * alias.h (mems_same_for_tbaa_p): Declare.
            * alias.cc (mems_same_for_tbaa_p): New function.
            * dse.cc (record_store): Use it instead of open-coding
            alias check.
            * cselib.h (cselib_redundant_set_p): Declare.
            * cselib.cc: Include alias.h
            (cselib_redundant_set_p): New function.
            * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead
            of rtx_equal_for_cselib_p.
            * postreload.cc (reload_cse_simplify): Use cselib_redundant_set_p.
            (reload_cse_noop_set_p): Delete.
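
The two-step check described in the commit message (equality first, then the alias sets of the earlier store) can be sketched as follows. This is a simplified toy model and not the code that was committed: `toy_mem`, `toy_redundant_store_p` and the integer ids are hypothetical, "alias sets agree" is reduced to integer equality, and returning false when no earlier store is found is an assumption made to keep the sketch conservative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory reference (not GCC's MEM rtx).  */
struct toy_mem {
  int address_id;  /* two references with equal ids hit the same location */
  int alias_set;   /* stand-in for the TBAA alias set of the access */
};

/* values_equal: result of the plain equality check on the SET.
   earlier/n:    earlier stores recorded as holding the same value.
   Returns true only when dropping the new store cannot change which
   alias set the location was last written through.  */
bool toy_redundant_store_p(bool values_equal, const struct toy_mem *dest,
                           const struct toy_mem *earlier, size_t n,
                           bool strict_aliasing) {
  assert(dest != NULL);
  if (!values_equal)
    return false;           /* step 1: not even the same value */
  if (!strict_aliasing)
    return true;            /* alias sets are irrelevant without TBAA */
  for (size_t i = 0; i < n; i++)
    if (earlier[i].address_id == dest->address_id)
      return earlier[i].alias_set == dest->alias_set;  /* step 2 */
  return false;             /* assumed: no earlier store found, keep it */
}
```

With this model, a store of an equal value to the same address is only dropped when the earlier store used the same alias set; the "different types sharing a stack slot" case falls into the branch that keeps the store.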

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (49 preceding siblings ...)
  2022-08-03  9:07 ` cvs-commit at gcc dot gnu.org
@ 2022-08-03  9:16 ` rearnsha at gcc dot gnu.org
  2022-08-10  7:06 ` malat at debian dot org
                   ` (8 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-08-03  9:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |rearnsha at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #50 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Fixed on master so far.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (50 preceding siblings ...)
  2022-08-03  9:16 ` rearnsha at gcc dot gnu.org
@ 2022-08-10  7:06 ` malat at debian dot org
  2022-08-10  7:11 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: malat at debian dot org @ 2022-08-10  7:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #51 from Mathieu Malaterre <malat at debian dot org> ---
(In reply to Richard Earnshaw from comment #50)
> Fixed on master so far.

It is not clear to me how this is possible. I reported an issue against gcc-11
which could not be reproduced using gcc-12. Are you saying the issue crept back
in with gcc-13.x?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (51 preceding siblings ...)
  2022-08-10  7:06 ` malat at debian dot org
@ 2022-08-10  7:11 ` pinskia at gcc dot gnu.org
  2022-09-02  9:28 ` malat at debian dot org
                   ` (6 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-08-10  7:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #52 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Mathieu Malaterre from comment #51)
> (In reply to Richard Earnshaw from comment #50)
> > Fixed on master so far.
> 
> Not clear how this is possible. I reported an issue against gcc-11 which
> could not be reproduced using gcc-12. Are you saying the issue crept back in
> gcc-13.x ?

The bug became latent in gcc 12.1.0 and on the trunk. The real fix for the
problem is on the trunk now.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (52 preceding siblings ...)
  2022-08-10  7:11 ` pinskia at gcc dot gnu.org
@ 2022-09-02  9:28 ` malat at debian dot org
  2022-09-02 12:30 ` cvs-commit at gcc dot gnu.org
                   ` (5 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: malat at debian dot org @ 2022-09-02  9:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #53 from Mathieu Malaterre <malat at debian dot org> ---
For later reference, the gcc-11 symptoms disappear in upstream git after commit:

*
https://github.com/google/highway/commit/4fa872a2a0d9944cb5fe761669ac63096607d3a3

gcc-12 seems to be generating wrong-code for a different unit-test:

% tests/mul_test
"--gtest_filter=HwyMulTestGroup/HwyMulTest.TestAllMulHigh/EMU128"
Running main() from ./googletest/src/gtest_main.cc
Note: Google Test filter = HwyMulTestGroup/HwyMulTest.TestAllMulHigh/EMU128
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HwyMulTestGroup/HwyMulTest
[ RUN      ] HwyMulTestGroup/HwyMulTest.TestAllMulHigh/EMU128


i16x8 expect [0+ ->]:
  0x3FFF,0x0FFF,0x03FF,0x00FF,0x003F,0x000F,0x0003,
i16x8 actual [0+ ->]:
  0xBFFF,0x0FFF,0xE400,0x00FF,0xF840,0x000F,0xFE04,
Abort at /home/malat/highway/hwy/tests/mul_test.cc:131: EMU128, i16x8 lane 0
mismatch: expected '0x3FFF', got '0xBFFF'.

zsh: abort      tests/mul_test

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (53 preceding siblings ...)
  2022-09-02  9:28 ` malat at debian dot org
@ 2022-09-02 12:30 ` cvs-commit at gcc dot gnu.org
  2022-09-02 12:32 ` rearnsha at gcc dot gnu.org
                   ` (4 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-09-02 12:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #54 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Richard Earnshaw
<rearnsha@gcc.gnu.org>:

https://gcc.gnu.org/g:3835765ae96d294bb71dd8cb05db543d89725f7b

commit r12-8738-g3835765ae96d294bb71dd8cb05db543d89725f7b
Author: Richard Earnshaw <rearnsha@arm.com>
Date:   Wed Aug 3 10:01:51 2022 +0100

    cselib: add function to check if SET is redundant [PR106187]

    A SET operation that writes memory may have the same value as an
    earlier store but if the alias sets of the new and earlier store do
    not conflict then the set is not truly redundant.  This can happen,
    for example, if objects of different types share a stack slot.

    To fix this we define a new function in cselib that first checks for
    equality and if that is successful then finds the earlier store in the
    value history and checks the alias sets.

    The routine is used in two places elsewhere in the compiler:
    cfgcleanup and postreload.

    gcc/ChangeLog:

            PR rtl-optimization/106187
            * alias.h (mems_same_for_tbaa_p): Declare.
            * alias.cc (mems_same_for_tbaa_p): New function.
            * dse.cc (record_store): Use it instead of open-coding
            alias check.
            * cselib.h (cselib_redundant_set_p): Declare.
            * cselib.cc: Include alias.h
            (cselib_redundant_set_p): New function.
            * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead
            of rtx_equal_for_cselib_p.
            * postreload.cc (reload_cse_simplify): Use cselib_redundant_set_p.
            (reload_cse_noop_set_p): Delete.

    (cherry picked from commit 64ce76d940501cb04d14a0d36752b4f93473531c)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (54 preceding siblings ...)
  2022-09-02 12:30 ` cvs-commit at gcc dot gnu.org
@ 2022-09-02 12:32 ` rearnsha at gcc dot gnu.org
  2022-09-02 14:07 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-09-02 12:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #55 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
(In reply to Mathieu Malaterre from comment #53)

> 
> gcc-12 seems to be generating wrong-code for a different unit-test:

I've just pushed my patch to the gcc-12 branch, could you try that please?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (55 preceding siblings ...)
  2022-09-02 12:32 ` rearnsha at gcc dot gnu.org
@ 2022-09-02 14:07 ` cvs-commit at gcc dot gnu.org
  2022-09-27 15:28 ` malat at debian dot org
                   ` (2 subsequent siblings)
  59 siblings, 0 replies; 61+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-09-02 14:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #56 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Richard Earnshaw
<rearnsha@gcc.gnu.org>:

https://gcc.gnu.org/g:50982aa1145fbdb315162349833412639aa8bc4c

commit r11-10232-g50982aa1145fbdb315162349833412639aa8bc4c
Author: Richard Earnshaw <rearnsha@arm.com>
Date:   Wed Aug 3 10:01:51 2022 +0100

    cselib: add function to check if SET is redundant [PR106187]

    A SET operation that writes memory may have the same value as an
    earlier store but if the alias sets of the new and earlier store do
    not conflict then the set is not truly redundant.  This can happen,
    for example, if objects of different types share a stack slot.

    To fix this we define a new function in cselib that first checks for
    equality and if that is successful then finds the earlier store in the
    value history and checks the alias sets.

    The routine is used in two places elsewhere in the compiler:
    cfgcleanup and postreload.

    gcc/ChangeLog:

            PR rtl-optimization/106187
            * alias.h (mems_same_for_tbaa_p): Declare.
            * alias.c (mems_same_for_tbaa_p): New function.
            * dse.c (record_store): Use it instead of open-coding
            alias check.
            * cselib.h (cselib_redundant_set_p): Declare.
            * cselib.c: Include alias.h
            (cselib_redundant_set_p): New function.
            * cfgcleanup.c: (mark_effect): Use cselib_redundant_set_p instead
            of rtx_equal_for_cselib_p.
            * postreload.c (reload_cse_simplify): Use cselib_redundant_set_p.
            (reload_cse_noop_set_p): Delete.

    (cherry picked from commit 64ce76d940501cb04d14a0d36752b4f93473531c)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (56 preceding siblings ...)
  2022-09-02 14:07 ` cvs-commit at gcc dot gnu.org
@ 2022-09-27 15:28 ` malat at debian dot org
  2022-09-27 15:54 ` rearnsha at gcc dot gnu.org
  2024-01-20 17:20 ` pinskia at gcc dot gnu.org
  59 siblings, 0 replies; 61+ messages in thread
From: malat at debian dot org @ 2022-09-27 15:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

--- Comment #57 from Mathieu Malaterre <malat at debian dot org> ---
(In reply to Richard Earnshaw from comment #55)
> (In reply to Mathieu Malaterre from comment #53)
> 
> > 
> > gcc-12 seems to be generating wrong-code for a different unit-test:
> 
> I've just pushed my patch to the gcc-12 branch, could you try that please?

Richard, I am using the latest gcc-12 update from doko@d.o:

*
https://tracker.debian.org/news/1363780/accepted-gcc-12-1220-3-source-into-unstable/

I can reproduce the failing test:

*
https://buildd.debian.org/status/fetch.php?pkg=highway&arch=armhf&ver=1.0.2%7Egit20220901.9b3bd6d-2&stamp=1664289672&raw=0

But I cannot reproduce it using gcc-snapshot:

* https://packages.qa.debian.org/g/gcc-snapshot/news/20220920T113715Z.html

Let me know if you want more info

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (57 preceding siblings ...)
  2022-09-27 15:28 ` malat at debian dot org
@ 2022-09-27 15:54 ` rearnsha at gcc dot gnu.org
  2024-01-20 17:20 ` pinskia at gcc dot gnu.org
  59 siblings, 0 replies; 61+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2022-09-27 15:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #58 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Thanks, I think that's enough evidence to resolve this then.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Bug rtl-optimization/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)
  2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
                   ` (58 preceding siblings ...)
  2022-09-27 15:54 ` rearnsha at gcc dot gnu.org
@ 2024-01-20 17:20 ` pinskia at gcc dot gnu.org
  59 siblings, 0 replies; 61+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-20 17:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.4

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2024-01-20 17:20 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-04 16:23 [Bug c++/106187] New: armhf: Miscompilation with -O2 mathieu.malaterre at gmail dot com
2022-07-04 16:37 ` [Bug c++/106187] " mathieu.malaterre at gmail dot com
2022-07-04 16:37 ` mathieu.malaterre at gmail dot com
2022-07-04 16:42 ` [Bug c++/106187] armhf: Miscompilation at all optimization levels mathieu.malaterre at gmail dot com
2022-07-04 20:19 ` pinskia at gcc dot gnu.org
2022-07-04 20:22 ` pinskia at gcc dot gnu.org
2022-07-05  7:18 ` [Bug target/106187] " mathieu.malaterre at gmail dot com
2022-07-05  7:46 ` [Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working) jan.wassenberg at gmail dot com
2022-07-05  8:00 ` mathieu.malaterre at gmail dot com
2022-07-07  7:50 ` mathieu.malaterre at gmail dot com
2022-07-07  8:00 ` mathieu.malaterre at gmail dot com
2022-07-07  9:38 ` rearnsha at gcc dot gnu.org
2022-07-08  9:01 ` mathieu.malaterre at gmail dot com
2022-07-08  9:01 ` mathieu.malaterre at gmail dot com
2022-07-08  9:03 ` mathieu.malaterre at gmail dot com
2022-07-08 13:50 ` rearnsha at gcc dot gnu.org
2022-07-08 13:59 ` malat at debian dot org
2022-07-08 14:16 ` rearnsha at gcc dot gnu.org
2022-07-08 14:18 ` malat at debian dot org
2022-07-08 14:51 ` rearnsha at gcc dot gnu.org
2022-07-08 15:03 ` malat at debian dot org
2022-07-08 17:24 ` rearnsha at gcc dot gnu.org
2022-07-08 17:31 ` rearnsha at gcc dot gnu.org
2022-07-08 19:27 ` jan.wassenberg at gmail dot com
2022-07-14 13:03 ` rearnsha at gcc dot gnu.org
2022-07-14 13:18 ` rearnsha at gcc dot gnu.org
2022-07-14 16:09 ` pinskia at gcc dot gnu.org
2022-07-18 15:45 ` rearnsha at gcc dot gnu.org
2022-07-18 15:48 ` rearnsha at gcc dot gnu.org
2022-07-19  7:27 ` rguenth at gcc dot gnu.org
2022-07-19  9:00 ` rearnsha at gcc dot gnu.org
2022-07-19  9:13 ` rguenth at gcc dot gnu.org
2022-07-21  9:25 ` rearnsha at gcc dot gnu.org
2022-07-22 12:52 ` rearnsha at gcc dot gnu.org
2022-07-22 13:24 ` rearnsha at gcc dot gnu.org
2022-07-25  6:12 ` rguenth at gcc dot gnu.org
2022-07-25  9:44 ` rearnsha at gcc dot gnu.org
2022-07-25  9:50 ` rguenther at suse dot de
2022-07-25  9:59 ` rearnsha at gcc dot gnu.org
2022-07-25 10:24 ` rguenth at gcc dot gnu.org
2022-07-25 10:26 ` [Bug rtl-optimization/106187] " rguenth at gcc dot gnu.org
2022-07-25 10:33 ` rearnsha at gcc dot gnu.org
2022-07-25 10:42 ` rguenth at gcc dot gnu.org
2022-07-25 10:48 ` rearnsha at gcc dot gnu.org
2022-07-25 11:03 ` rguenther at suse dot de
2022-07-25 11:05 ` rguenth at gcc dot gnu.org
2022-07-25 13:04 ` rearnsha at gcc dot gnu.org
2022-07-25 14:45 ` rguenther at suse dot de
2022-07-27 13:35 ` rearnsha at gcc dot gnu.org
2022-07-28 16:51 ` rearnsha at gcc dot gnu.org
2022-08-03  9:07 ` cvs-commit at gcc dot gnu.org
2022-08-03  9:16 ` rearnsha at gcc dot gnu.org
2022-08-10  7:06 ` malat at debian dot org
2022-08-10  7:11 ` pinskia at gcc dot gnu.org
2022-09-02  9:28 ` malat at debian dot org
2022-09-02 12:30 ` cvs-commit at gcc dot gnu.org
2022-09-02 12:32 ` rearnsha at gcc dot gnu.org
2022-09-02 14:07 ` cvs-commit at gcc dot gnu.org
2022-09-27 15:28 ` malat at debian dot org
2022-09-27 15:54 ` rearnsha at gcc dot gnu.org
2024-01-20 17:20 ` pinskia at gcc dot gnu.org
