public inbox for gcc-bugs@sourceware.org
* [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3
@ 2023-09-25 11:53 malat at debian dot org
  2023-09-25 11:55 ` [Bug target/111591] " malat at debian dot org
                   ` (43 more replies)
  0 siblings, 44 replies; 45+ messages in thread
From: malat at debian dot org @ 2023-09-25 11:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

            Bug ID: 111591
           Summary: ppc64be: miscompilation with -mstrict-align / -O3
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: malat at debian dot org
  Target Milestone: ---

I am seeing a regression in a highway unit test on ppc64be when using
-mstrict-align together with -O3:

454/530 Test #454: HwyWidenMulTestGroup/HwyWidenMulTest.TestAllSatWidenMulPairwiseAdd/EMU128  # GetParam() = 2305843009213693952 .............Subprocess aborted***Exception:   0.00 sec
Running main() from ./googletest/src/gtest_main.cc
Note: Google Test filter = HwyWidenMulTestGroup/HwyWidenMulTest.TestAllSatWidenMulPairwiseAdd/EMU128
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HwyWidenMulTestGroup/HwyWidenMulTest
[ RUN      ] HwyWidenMulTestGroup/HwyWidenMulTest.TestAllSatWidenMulPairwiseAdd/EMU128


i16x4 expect [0+ ->]:
  0x7FFF,0x7FFF,0x7FFF,0x7FFF,
i16x4 actual [0+ ->]:
  0x7FFF,0x01A5,0x7FFF,0x7FFF,
Abort at ./hwy/tests/widen_mul_test.cc:205: EMU128, i16x4 lane 1 mismatch: expected '0x7FFF', got '0x01A5'.



ref:
https://buildd.debian.org/status/fetch.php?pkg=highway&arch=ppc64&ver=1.0.8%7Egit20230918.1e3a3d7-4&stamp=1695113957&raw=0


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
@ 2023-09-25 11:55 ` malat at debian dot org
  2023-09-25 11:59 ` rguenth at gcc dot gnu.org
                   ` (42 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: malat at debian dot org @ 2023-09-25 11:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #1 from Mathieu Malaterre <malat at debian dot org> ---
Created attachment 55989
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55989&action=edit
cvise reduced test case

% g++ -std=c++11 -o works -DHWY_COMPILE_ONLY_EMU128 -DHWY_BROKEN_EMU128=0 -maltivec -mcpu=power8 -g -O3 alt.cc -Wall -Wextra -Werror -Wfatal-errors

% g++ -std=c++11 -o fails -DHWY_COMPILE_ONLY_EMU128 -DHWY_BROKEN_EMU128=0 -maltivec -mcpu=power8 -mstrict-align -g -O3 alt.cc -Wall -Wextra -Werror -Wfatal-errors

should give:

% ./works
-> success

but:

% ./fails 
fails: alt.cc:395: void hwy::detail::AssertArrayEqual(const TypeInfo&, const void*, const void*, size_t, const char*, const char*, int): Assertion `memcmp(a, b, c * ti.sizeof_t) == 0' failed.
zsh: abort      ./fails


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
  2023-09-25 11:55 ` [Bug target/111591] " malat at debian dot org
@ 2023-09-25 11:59 ` rguenth at gcc dot gnu.org
  2023-09-25 12:20 ` malat at debian dot org
                   ` (41 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-09-25 11:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |needs-bisection

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Does it work with older GCC?


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
  2023-09-25 11:55 ` [Bug target/111591] " malat at debian dot org
  2023-09-25 11:59 ` rguenth at gcc dot gnu.org
@ 2023-09-25 12:20 ` malat at debian dot org
  2023-09-25 13:15 ` malat at debian dot org
                   ` (40 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: malat at debian dot org @ 2023-09-25 12:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #3 from Mathieu Malaterre <malat at debian dot org> ---
I can make the upstream code fail using the g++-11 and g++-12 versions in
Debian/sid.


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (2 preceding siblings ...)
  2023-09-25 12:20 ` malat at debian dot org
@ 2023-09-25 13:15 ` malat at debian dot org
  2023-09-25 13:41 ` malat at debian dot org
                   ` (39 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: malat at debian dot org @ 2023-09-25 13:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Mathieu Malaterre <malat at debian dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |10.5.0

--- Comment #4 from Mathieu Malaterre <malat at debian dot org> ---
g++-10 seems to handle -O3 with -mstrict-align correctly.


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (3 preceding siblings ...)
  2023-09-25 13:15 ` malat at debian dot org
@ 2023-09-25 13:41 ` malat at debian dot org
  2023-09-26  6:50 ` linkw at gcc dot gnu.org
                   ` (38 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: malat at debian dot org @ 2023-09-25 13:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Mathieu Malaterre <malat at debian dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |11.4.0

--- Comment #5 from Mathieu Malaterre <malat at debian dot org> ---
(In reply to Mathieu Malaterre from comment #3)
> I can make the upstream code fails using g++-11 / g++-12 version
> (Debian/sid).

Never mind, it seems g++ 11.4.0 can handle the original test case.


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (4 preceding siblings ...)
  2023-09-25 13:41 ` malat at debian dot org
@ 2023-09-26  6:50 ` linkw at gcc dot gnu.org
  2023-09-26  7:56 ` rguenth at gcc dot gnu.org
                   ` (37 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: linkw at gcc dot gnu.org @ 2023-09-26  6:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bergner at gcc dot gnu.org,
                   |                            |linkw at gcc dot gnu.org,
                   |                            |segher at gcc dot gnu.org
   Last reconfirmed|                            |2023-09-26
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #6 from Kewen Lin <linkw at gcc dot gnu.org> ---
Confirmed, thanks for reporting.

I noticed that the reduced test case in #c1 makes gcc-13 complain with:

test.cc:67:16: error: expected type-specifier before ‘__remove_reference’
   67 |   using type = __remove_reference(_Tp);
      |                ^~~~~~~~~~~~~~~~~~

The gcc-12 and gcc-11 builds complain as well.

Is this expected? If so, could we have a reduced test case that also compiles
with gcc-12 and gcc-11? I think it would help bisection.


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (5 preceding siblings ...)
  2023-09-26  6:50 ` linkw at gcc dot gnu.org
@ 2023-09-26  7:56 ` rguenth at gcc dot gnu.org
  2023-09-26  8:14 ` linkw at gcc dot gnu.org
                   ` (36 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-09-26  7:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
I suppose it works with -fno-tree-vectorize on top of the flags?


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (6 preceding siblings ...)
  2023-09-26  7:56 ` rguenth at gcc dot gnu.org
@ 2023-09-26  8:14 ` linkw at gcc dot gnu.org
  2023-09-26  9:28 ` malat at debian dot org
                   ` (35 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: linkw at gcc dot gnu.org @ 2023-09-26  8:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #8 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #7)
> I suppose it works with -fno-tree-vectorize ontop of the flags?

Appending -fno-tree-vectorize at the end of the given flags in #c1
(-mstrict-align version), it still fails.


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (7 preceding siblings ...)
  2023-09-26  8:14 ` linkw at gcc dot gnu.org
@ 2023-09-26  9:28 ` malat at debian dot org
  2023-09-26  9:28 ` malat at debian dot org
                   ` (34 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: malat at debian dot org @ 2023-09-26  9:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #9 from Mathieu Malaterre <malat at debian dot org> ---
Created attachment 55992
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55992&action=edit
foo.cc


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (8 preceding siblings ...)
  2023-09-26  9:28 ` malat at debian dot org
@ 2023-09-26  9:28 ` malat at debian dot org
  2023-09-26  9:31 ` malat at debian dot org
                   ` (33 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: malat at debian dot org @ 2023-09-26  9:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #10 from Mathieu Malaterre <malat at debian dot org> ---
Created attachment 55993
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55993&action=edit
widen_mul_test.cc


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (9 preceding siblings ...)
  2023-09-26  9:28 ` malat at debian dot org
@ 2023-09-26  9:31 ` malat at debian dot org
  2023-09-26  9:31 ` malat at debian dot org
                   ` (32 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: malat at debian dot org @ 2023-09-26  9:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #11 from Mathieu Malaterre <malat at debian dot org> ---
(In reply to Kewen Lin from comment #6)
> Confirmed, thanks for reporting.
> 
> I noticed that the reduced test case in #c1 can make gcc-13 complain with:
> 
> test.cc:67:16: error: expected type-specifier before ‘__remove_reference’
>    67 |   using type = __remove_reference(_Tp);
>       |                ^~~~~~~~~~~~~~~~~~
> 
> Also gcc-12, gcc-11 build.
> 
> Is it expected? If yes, could we have a reduced test case to survive for
> gcc-12 and gcc-11 compilation? I think it would help bisection.

`__remove_reference` must be something new in g++-13.
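
As a minimal sketch of how a reduced test case could avoid depending on that identifier, the same type computation can be spelled with the standard `std::remove_reference` trait, which any C++11 compiler accepts. The wrapper name `remove_reference_portable` is hypothetical and is not taken from the attached test case; this only illustrates the idea and is not the actual reduction.

```cpp
#include <type_traits>

// Hypothetical stand-in for a `using type = __remove_reference(_Tp);` line in
// preprocessed headers: it relies only on the standard C++11 trait, so it does
// not need a compiler-specific built-in.
template <typename Tp>
struct remove_reference_portable {
  using type = typename std::remove_reference<Tp>::type;
};

// Compile-time checks: lvalue and rvalue references are both stripped.
static_assert(
    std::is_same<remove_reference_portable<int&>::type, int>::value,
    "lvalue reference should be removed");
static_assert(
    std::is_same<remove_reference_portable<int&&>::type, int>::value,
    "rvalue reference should be removed");
```

A test case written this way should be accepted by g++-11, g++-12 and g++-13 alike, which is what a bisection across those versions needs.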

Anyway, I started a cvise run using g++-11 against g++-12. You can try on
your side:

g++-11 -std=c++11 -o works -maltivec -mcpu=power8 -mstrict-align -g -O3 widen_mul_test.cc foo.cc -Wall -Wextra -Werror -Wfatal-errors

vs

g++-12 -std=c++11 -o fails -maltivec -mcpu=power8 -mstrict-align -g -O3 widen_mul_test.cc foo.cc -Wall -Wextra -Werror -Wfatal-errors

For some reason, if I copy/paste foo.cc into the main .cc file (with
gnu::noipa), I cannot reproduce the issue, so you'll have to download both
*.cc files.

Thanks !


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (10 preceding siblings ...)
  2023-09-26  9:31 ` malat at debian dot org
@ 2023-09-26  9:31 ` malat at debian dot org
  2023-09-27  9:24 ` linkw at gcc dot gnu.org
                   ` (31 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: malat at debian dot org @ 2023-09-26  9:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #12 from Mathieu Malaterre <malat at debian dot org> ---
For reference

malat@perotto ~/pr2 % g++-11 --version
g++-11 (Debian 11.4.0-4) 11.4.0

malat@perotto ~/pr2 % g++-12 --version
g++-12 (Debian 12.3.0-9) 12.3.0


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (11 preceding siblings ...)
  2023-09-26  9:31 ` malat at debian dot org
@ 2023-09-27  9:24 ` linkw at gcc dot gnu.org
  2023-09-27  9:26 ` rguenth at gcc dot gnu.org
                   ` (30 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: linkw at gcc dot gnu.org @ 2023-09-27  9:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |msebor at gcc dot gnu.org

--- Comment #13 from Kewen Lin <linkw at gcc dot gnu.org> ---
Thanks again for the reduced test case and the information!

I tried to bisect it but ran into some build failures (_Float32 errors, etc.).
After grepping the log, I switched to bisecting between r13-2887 (good) and
r13-7206 (bad).

The bisection shows the culprit commit is r13-3378-gf6c168f8c06047, which was
backported to GCC 12. That seems to match the observation that recent gcc-12
fails while gcc-11 passes.


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (12 preceding siblings ...)
  2023-09-27  9:24 ` linkw at gcc dot gnu.org
@ 2023-09-27  9:26 ` rguenth at gcc dot gnu.org
  2023-09-28  0:20 ` linkw at gcc dot gnu.org
                   ` (29 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-09-27  9:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Kewen Lin from comment #13)
> Thanks again for the reduced test case and the information!
> 
> I tried to bisect it but encountered some build failures on _Float32 error
> etc., through grepping the log I switched to start from r13-2887 (good) to
> r13-7206 (bad).
> 
> The bisection shows the culprit commit is r13-3378-gf6c168f8c06047 which was
> backported to GCC-12, it seems to match the observation new gcc-12 fail
> while gcc-11 pass.

Note that this change likely just triggers a latent issue, but it might help
in analyzing the problem.


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (13 preceding siblings ...)
  2023-09-27  9:26 ` rguenth at gcc dot gnu.org
@ 2023-09-28  0:20 ` linkw at gcc dot gnu.org
  2023-10-13 10:19 ` linkw at gcc dot gnu.org
                   ` (28 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: linkw at gcc dot gnu.org @ 2023-09-28  0:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |linkw at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #15 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #14)
> (In reply to Kewen Lin from comment #13)
> > Thanks again for the reduced test case and the information!
> > 
> > I tried to bisect it but encountered some build failures on _Float32 error
> > etc., through grepping the log I switched to start from r13-2887 (good) to
> > r13-7206 (bad).
> > 
> > The bisection shows the culprit commit is r13-3378-gf6c168f8c06047 which was
> > backported to GCC-12, it seems to match the observation new gcc-12 fail
> > while gcc-11 pass.
> 
> Note this change likely triggers a latent issue but it might help analyzing
> the issue.

Thanks for the hint! Yes: I tried -fdisable-tree-esra and -fdisable-tree-sra
and the failure is still there, even though I supposed that commit only takes
effect when SRA is enabled. I'll continue to investigate it. By the way, I'm
just starting a two-week vacation, so I may respond slowly.


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (14 preceding siblings ...)
  2023-09-28  0:20 ` linkw at gcc dot gnu.org
@ 2023-10-13 10:19 ` linkw at gcc dot gnu.org
  2023-10-13 11:08 ` rguenth at gcc dot gnu.org
                   ` (27 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: linkw at gcc dot gnu.org @ 2023-10-13 10:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #16 from Kewen Lin <linkw at gcc dot gnu.org> ---
Tracing it down with template specialization, the abort happens on

  auto vn_b = Load(dn, in_b.get());
  HWY_ASSERT_VEC_EQ(
      dw, vw_signed_max,
      SatWidenMulPairwiseAdd(
          dw, InterleaveLower(dn_u, BitCast(dn_u, vn_b), vn_unsigned_max),
          InterleaveLower(dn, vn_b, vn_signed_max)));

with "void operator()(int8_t, CappedTag<int8_t, 8> dn)"

By isolating further, the value that is not as expected is "b0" in the function

template <class DI16, class VU8, class VI8>
HWY_API Vec<DI16> SatWidenMulPairwiseAdd(DI16 di16, VU8 a, VI8 b) {
  RebindToUnsigned<decltype(di16)> du16;
  auto a0 = And(BitCast(di16, a), Set(di16, 255));
  auto b0 = ShiftRight<8>(ShiftLeft<8>(BitCast(di16, b)));
  auto a1 = BitCast(di16, ShiftRight<8>(BitCast(du16, a)));
  auto b1 = ShiftRight<8>(BitCast(di16, b));
  return SaturatedAdd(Mul(a0, b0), Mul(a1, b1));
}

specialized with 
template <> HWY_API Vec128<int16_t, 4> SatWidenMulPairwiseAdd(Simd<int16_t, 4,
0> di16, Vec128<uint8_t, 8> a, Vec128<int8_t, 8> b)
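
For readers unfamiliar with the operation, the per-lane arithmetic of the function quoted above can be sketched in scalar C++. This is a hypothetical model written for illustration, not Highway's implementation: the name `SatWidenMulPairwiseAddLane` and the helper `SignExtend8` are made up, and each 16-bit argument is assumed to pack the two 8-bit inputs that feed one i16 output lane.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: interpret the low 8 bits of `byte` as a signed value.
inline int32_t SignExtend8(uint32_t byte) {
  byte &= 0xFF;
  return byte >= 128 ? static_cast<int32_t>(byte) - 256
                     : static_cast<int32_t>(byte);
}

// Hypothetical scalar model of one i16 output lane (not Highway code).
// `a` packs two unsigned bytes, `b` packs two signed bytes.
inline int16_t SatWidenMulPairwiseAddLane(uint16_t a, uint16_t b) {
  const int32_t a0 = a & 0xFF;            // And(a, 255)
  const int32_t b0 = SignExtend8(b);      // ShiftRight<8>(ShiftLeft<8>(b))
  const int32_t a1 = a >> 8;              // logical shift of the unsigned view
  const int32_t b1 = SignExtend8(b >> 8); // arithmetic ShiftRight<8>(b)

  // Each product fits in int16_t (255 * 127 = 32385, 255 * -128 = -32640),
  // so only the final addition can saturate.
  const int32_t sum = a0 * b0 + a1 * b1;
  const int32_t hi = std::numeric_limits<int16_t>::max();
  const int32_t lo = std::numeric_limits<int16_t>::min();
  return static_cast<int16_t>(sum > hi ? hi : (sum < lo ? lo : sum));
}
```

With both bytes of `a` at 255 and both bytes of `b` at 127, the exact sum is 64770, which saturates to 0x7FFF, the lane value the failing test expects. A wrong `b0` therefore shows up directly as a wrong lane.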

I further found that the unexpected values come from ShiftLeft<8>: the
tree-optimized code looks as expected, but the final insn sequence is in the
wrong order. Either -fdisable-rtl-sched2 or -fdisable-rtl-sched1 makes it pass.
With a debug counter, I see an unexpected insn movement in sched2 on insn 395.

...

 1436: %10:DI=0x70
      REG_EQUIV 0x70
 1438: %9:DI=0xc0
      REG_EQUIV 0xc0
 1437: %8:DI=0x1e0
      REG_EQUIV 0x1e0
 1441: %7:DI=0xd0
      REG_EQUIV 0xd0
  389: %0:V2DI=[%1:DI+%9:DI]
      REG_DEAD %9:DI
      REG_EQUAL [sfp:DI+0xc0]
 1445: %5:DI=0xb0
      REG_EQUIV 0xb0
 1714: %9:DI=0xff0000
      REG_EQUIV 0xff0000
  373: [%1:DI+0x70]=%4:DI
      REG_DEAD %4:DI
  375: [%1:DI+0x78]=%6:DI
      REG_DEAD %6:DI
 1715: %9:DI=%9:DI|0xff
 1785: %25:DI=high(unspec[`*.LC8',%2:DI] 47)
 1716: %9:DI=%9:DI&0xffffffff|%9:DI<<0x20
      REG_EQUIV 0xff00ff00ff00ff
  410: %28:DI=%1:DI+0xae
      REG_EQUAL sfp:DI+0xae
    6: %31:SI=0
      REG_EQUAL 0
 1786: %25:DI=%25:DI+low(unspec[`*.LC8',%2:DI] 47)
      REG_DEAD %2:DI
      REG_EQUAL `*.LC8'
  392: [%1:DI+%7:DI]=%0:V2DI
      REG_DEAD %7:DI
                                         // the unexpected version has insn 395 moved here
 1738: %12:V2DI=[%1:DI+%10:DI]
  376: [%1:DI+%8:DI]=%12:V2DI
      REG_DEAD %12:V2DI
      REG_DEAD %8:DI
      REG_EQUIV [sfp:DI+%8:DI]
      REG_EQUAL [sfp:DI+0x70]
  390: [%1:DI+%10:DI]=%0:V2DI            // this store writes 16 bytes at
                                         // [%1:DI+0x70], so the read below
                                         // can't be moved above it
      REG_DEAD %0:V2DI
  395: %4:DI=zero_extend([%1:DI+0x70])   //  <------ this is expected
  398: %6:DI=zero_extend([%1:DI+0x72])
  401: %7:DI=zero_extend([%1:DI+0x74])
  404: %8:DI=zero_extend([%1:DI+0x76])
  396: %4:SI=%4:SI<<0x8
  399: %6:SI=%6:SI<<0x8
  402: %7:SI=%7:SI<<0x8
  405: %8:SI=%8:SI<<0x8

 ....

The tree-optimized IR for this part looks as expected?

  <bb 51> [local count: 119292722]:
  v = a;
  MEM <unsigned char[16]> [(char * {ref-all})&D.38735] = MEM <unsigned char[16]> [(char * {ref-all})&v];
  v ={v} {CLOBBER(eol)};
  vect_a_raw_0_1121.562_722 = MEM <vector(4) short int> [(short int *)&D.38735];
  _215 = VIEW_CONVERT_EXPR<long unsigned int>(vect_a_raw_0_1121.562_722);
  _830 = _215 & 71777214294589695;
  _1549 = BIT_FIELD_REF <_830, 16, 32>;
  _1537 = BIT_FIELD_REF <_830, 16, 16>;
  _323 = BIT_FIELD_REF <_830, 16, 0>;
  v = b;
  MEM <unsigned char[16]> [(char * {ref-all})&b00] = MEM <unsigned char[16]> [(char * {ref-all})&v];

                          ==> ref-all here, so should be executed before any reads below?

  v ={v} {CLOBBER(eol)};
  v = b00;
  raw_u_1323 = v.raw[0];
  _1324 = raw_u_1323 << 8;
  v.raw[0] = _1324;
  raw_u_1403 = v.raw[1];
  _1404 = raw_u_1403 << 8;
  v.raw[1] = _1404;
  raw_u_1447 = v.raw[2];
  _1448 = raw_u_1447 << 8;
  v.raw[2] = _1448;
  raw_u_128 = v.raw[3];
  _129 = raw_u_128 << 8;
  v.raw[3] = _129;
  b01 = v;
  v ={v} {CLOBBER(eol)};
  ivtmp.577_734 = (unsigned long) &MEM <struct Vec128> [(void *)&b01 + -2B];

...

I guess there is some way to keep this kind of aliasing information after
expansion; more investigation is needed into why sched considers the move safe.
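
To make the ordering constraint concrete, here is a small C++ sketch of why a 2-byte load may not be hoisted above a 16-byte store that covers the same address. It is only a model of the dependence between the store in insn 390 and the load in insn 395; the `Slot` type and both function names are hypothetical, and nothing here reproduces the actual miscompilation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of one 16-byte stack slot.
struct Slot {
  unsigned char bytes[16];
};

// Program order: the 16-byte store (like insn 390) happens first, then the
// 2-byte load from the start of the slot (like insn 395).
inline uint16_t LoadAfterStore(Slot slot, const Slot& new_value) {
  std::memcpy(slot.bytes, new_value.bytes, sizeof(slot.bytes));
  uint16_t lane0;
  std::memcpy(&lane0, slot.bytes, sizeof(lane0));
  return lane0;
}

// Invalid order: the load is moved above the overlapping store, so it
// observes whatever the slot held before.
inline uint16_t LoadBeforeStore(Slot slot, const Slot& new_value) {
  uint16_t lane0;
  std::memcpy(&lane0, slot.bytes, sizeof(lane0));  // reads stale bytes
  std::memcpy(slot.bytes, new_value.bytes, sizeof(slot.bytes));
  return lane0;
}
```

If the slot initially holds 0xAA in every byte and the stored value holds 0x11 in every byte, the first function returns 0x1111 and the second returns 0xAAAA, regardless of endianness. A scheduler that treats the two accesses as independent effectively turns the first function into the second.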


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (15 preceding siblings ...)
  2023-10-13 10:19 ` linkw at gcc dot gnu.org
@ 2023-10-13 11:08 ` rguenth at gcc dot gnu.org
  2023-10-13 12:09 ` linkw at gcc dot gnu.org
                   ` (26 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-10-13 11:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
It stores to a different object, but seeing the CLOBBERs, does
-fstack-reuse=none fix the issue?  Is r1 the stack pointer?

ref-all is carried to RTL by MEM_ALIAS_SET == 0.


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (16 preceding siblings ...)
  2023-10-13 11:08 ` rguenth at gcc dot gnu.org
@ 2023-10-13 12:09 ` linkw at gcc dot gnu.org
  2023-10-13 12:32 ` rguenth at gcc dot gnu.org
                   ` (25 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: linkw at gcc dot gnu.org @ 2023-10-13 12:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #18 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #17)
> it stores to a different object - but seeing the CLOBBERs, does
> -fstack-reuse=none fix the issue?  Is r1 the stack pointer?

I just tried -fstack-reuse=none, and it makes the test pass! Yes, r1 is the
stack pointer.

> 
> ref-all is carried to RTL by MEM_ALIAS_SET == 0.

Got it, thanks!


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (17 preceding siblings ...)
  2023-10-13 12:09 ` linkw at gcc dot gnu.org
@ 2023-10-13 12:32 ` rguenth at gcc dot gnu.org
  2023-10-16  9:11 ` linkw at gcc dot gnu.org
                   ` (24 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-10-13 12:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
So maybe it's the same issue as PR90348 (you can verify the RTL expansion dump
on whether the two involved decls are coalesced and see whether that's valid).


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (18 preceding siblings ...)
  2023-10-13 12:32 ` rguenth at gcc dot gnu.org
@ 2023-10-16  9:11 ` linkw at gcc dot gnu.org
  2023-10-19  7:27 ` linkw at gcc dot gnu.org
                   ` (23 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: linkw at gcc dot gnu.org @ 2023-10-16  9:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #20 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #19)
> So maybe it's the same issue as PR90348 (you can verify the RTL expansion
> dump on whether the two involved decls are coalesced and see whether that's
> valid).

Thanks for the hints! Unfortunately, the internal BE machine I was using for
this is unreachable today; I will post more findings when it comes back.


* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (19 preceding siblings ...)
  2023-10-16  9:11 ` linkw at gcc dot gnu.org
@ 2023-10-19  7:27 ` linkw at gcc dot gnu.org
  2023-10-19 11:06 ` rguenth at gcc dot gnu.org
                   ` (22 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: linkw at gcc dot gnu.org @ 2023-10-19  7:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #21 from Kewen Lin <linkw at gcc dot gnu.org> ---
For optimized IR:

  a$raw$3_220 = D.39813.rawD.30221[3];
  vect_a_raw_4_70.539_1584 = MEM <vector(4) short intD.20> [(short intD.20 *)&D.39813 + 8B];
  _1640 = a$raw$0_221 & 255;
  _1649 = a$raw$1_74 & 255;
  _1658 = a$raw$2_264 & 255;
  _52 = a$raw$3_220 & 255;
  vD.39776 = bD.39739;                  // involved decl1
  MEM <unsigned charD.25[16]> [(charD.5 * {ref-all})&b00D.39742] = MEM <unsigned charD.25[16]> [(charD.5 * {ref-all})&vD.39776];
  vD.39776 ={v} {CLOBBER(eol)};
  vD.39779 = b00D.39742;                // involved decl2
  raw_u_1614 = vD.39779.rawD.30221[0];
  _1615 = raw_u_1614 << 8;
  vD.39779.rawD.30221[0] = _1615;
  raw_u_1622 = vD.39779.rawD.30221[1];
  _1623 = raw_u_1622 << 8;
  vD.39779.rawD.30221[1] = _1623;
...

Partition 1: size 16 align 16
        D.39819  vD.39749 vD.39756 vD.39764 aD.39773
        vD.39779 vD.39735 vD.39736 aD.39630 vD.39636
        aD.39640 vD.39753 vD.39761 vD.39776 vD.39782

vD.39776 and vD.39779 are coalesced.

It's expanded as:

  vD.39776 = bD.39739;    

(insn 383 382 384 (set (reg:V2DI 616)
        (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 48 [0x30])) [7 MEM[(struct Vec128D.30433 *)_1274]+0
S16 A128])) -1
     (nil))

(insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [7 MEM[(struct Vec128D.30433 *)_10]+0
S16 A128])
        (reg:V2DI 616)) -1
     (nil))

  MEM <unsigned charD.25[16]> [(charD.5 * {ref-all})&b00D.39742] = MEM
<unsigned charD.25[16]> [(charD.5 * {ref-all})&vD.39776];

(insn 385 384 386 (set (reg:V2DI 617)
        (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [0 MEM <unsigned charD.25[16]> [(charD.5
* {ref-all})_10]+0 S16 A128])) "test.cc":14:19 -1
     (nil))

(insn 386 385 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 80 [0x50])) [0 MEM <unsigned charD.25[16]> [(charD.5
* {ref-all})_1277]+0 S16 A128])
        (reg:V2DI 617)) "test.cc":14:19 -1
     (nil))

  vD.39776 ={v} {CLOBBER(eol)};

  vD.39779 = b00D.39742;

(insn 387 386 388 (set (reg:V2DI 618)
        (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 80 [0x50])) [5 MEM[(struct Vec128D.30212 *)_1277]+0
S16 A128])) -1
     (nil))

(insn 388 387 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [5 MEM[(struct Vec128D.30212 *)_10]+0
S16 A128])
        (reg:V2DI 618)) -1
     (nil))

  raw_u_1614 = vD.39779.rawD.30221[0];
  _1615 = raw_u_1614 << 8;
  vD.39779.rawD.30221[0] = _1615;

;; v.raw[0] = _1615;

(insn 389 388 390 (set (reg:HI 619)
        (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [4 MEM[(struct Vec128D.30212
*)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
     (nil))

(insn 390 389 391 (set (reg:SI 620)
        (ashift:SI (subreg:SI (reg:HI 619) 0)
            (const_int 8 [0x8]))) "test.cc":218:14 -1
     (nil))

(insn 391 390 0 (set (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [4 MEM[(struct Vec128D.30212
*)_10].rawD.30221[0]+0 S2 A128])
        (subreg:HI (reg:SI 620) 2)) "test.cc":218:14 -1
     (nil))

=========

Later, insn 388 is removed (along with insns 387 and 385) because it stores
exactly the same value as insn 384 does. The scheduler then sees no dependence
between insn 389 and insn 384, which results in an unexpected move.

Hi Richi, do you think this is an exact duplicate of the known -fstack-reuse
issue?

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: rguenth at gcc dot gnu.org @ 2023-10-19 11:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #22 from Richard Biener <rguenth at gcc dot gnu.org> ---
I see the mems properly get their base adjusted:

(insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [7 MEM[(struct Vec128D.30433 *)_10]+0
S16 A128])
        (reg:V2DI 616)) -1
     (nil))

vs.

(insn 389 388 390 (set (reg:HI 619)
        (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [4 MEM[(struct Vec128D.30212
*)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
     (nil))

both are based off a fake _10.  But we get alias sets 7 and 4 used here
which might be a problem.

See update_alias_info_with_stack_vars and uses of decls_to_pointers,
in particular from set_mem_attributes_minus_bitpos where we preserve
TBAA info with the rewrite.  I'm not sure why that should be OK ...
(but I'm sure I must have thought of this problem back in time)

Does the following fix the testcase?

diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index 84b6833225e..81c0a63eddc 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -2128,7 +2128,7 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int
objectp,
              tree *orig_base = &attrs.expr;
              while (handled_component_p (*orig_base))
                orig_base = &TREE_OPERAND (*orig_base, 0);
-             tree aptrt = reference_alias_ptr_type (*orig_base);
+             tree aptrt = ptr_type_node;
              *orig_base = build2 (MEM_REF, TREE_TYPE (*orig_base), *namep,
                                   build_int_cst (aptrt, 0));
            }

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: rguenth at gcc dot gnu.org @ 2023-10-19 11:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #23 from Richard Biener <rguenth at gcc dot gnu.org> ---
A less strict patch would remember whether all accesses to the coalesced decls
are compatible with a common effective type (just checking whether
all decls have the same type isn't enough, even when they are not
addressable!),
so it's going to be a bit awkward to check. An IL walk noting per decl
whether all accesses conform to its declared type might be a good first step,
but with C++ abstraction we will likely end up with few cases that
can be easily identified as doing so.

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: linkw at gcc dot gnu.org @ 2023-10-20  5:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #24 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #22)
> I see the mems properly get their base adjusted:
> 
> (insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
>                 (const_int 16 [0x10])) [7 MEM[(struct Vec128D.30433 *)_10]+0
> S16 A128])
>         (reg:V2DI 616)) -1
>      (nil))
> 
> vs.
> 
> (insn 389 388 390 (set (reg:HI 619)
>         (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
>                 (const_int 16 [0x10])) [4 MEM[(struct Vec128D.30212
> *)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
>      (nil))
> 
> both are based off a fake _10.  But we get alias sets 7 and 4 used here
> which might be a problem.
> 
> See update_alias_info_with_stack_vars and uses of decls_to_pointers,
> in particular from set_mem_attributes_minus_bitpos where we preserve
> TBAA info with the rewrite.  I'm not sure why that should be OK ...
> (but I'm sure I must have thought of this problem back in time)
> 
> Does the following fix the testcase?
> 
> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
> index 84b6833225e..81c0a63eddc 100644
> --- a/gcc/emit-rtl.cc
> +++ b/gcc/emit-rtl.cc
> @@ -2128,7 +2128,7 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int
> objectp,
>               tree *orig_base = &attrs.expr;
>               while (handled_component_p (*orig_base))
>                 orig_base = &TREE_OPERAND (*orig_base, 0);
> -             tree aptrt = reference_alias_ptr_type (*orig_base);
> +             tree aptrt = ptr_type_node;
>               *orig_base = build2 (MEM_REF, TREE_TYPE (*orig_base), *namep,
>                                    build_int_cst (aptrt, 0));
>             }

Sorry, this doesn't help.

I noticed that it turns insns 384 and 389 into:

(insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [7 MEM <struct Vec128D.30433> [(voidD.48
*)_10]+0 S16 A128])
        (reg:V2DI 616)) -1
     (nil))

(insn 389 388 390 (set (reg:HI 619)
        (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [4 MEM <struct Vec128D.30212> [(voidD.48
*)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
     (nil))

The alias sets are not changed. Hacking further and more aggressively with
attrs.alias = 0 makes it pass. Can we create a new alias set for each
partition? Then all involved decls in the same partition would alias each
other: a particular involved decl would alias both the earlier and the later
decls in its own partition.
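
To illustrate why distinct alias sets such as 7 and 4 let the scheduler treat
two accesses to the same slot as independent, and why alias set 0 hides the
problem, here is a toy model of the conflict test. It is only a sketch: the
function name is made up, and the real check (alias_sets_conflict_p in
alias.cc) also consults a subset relation between alias sets, which is
deliberately ignored here.

```cpp
// Toy model of a TBAA conflict test. Assumption: neither alias set is
// recorded as a subset of the other, so only two rules remain:
//   - alias set 0 conflicts with everything
//   - otherwise two sets conflict only when they are equal
bool toy_alias_sets_conflict(int set1, int set2) {
  return set1 == 0 || set2 == 0 || set1 == set2;
}
```

Under this model the pair (7, 4) does not conflict while (0, 4) does, which
matches the observation that forcing attrs.alias = 0 makes the test pass.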

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: rguenth at gcc dot gnu.org @ 2023-10-20  6:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at suse dot de

--- Comment #25 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Kewen Lin from comment #24)
> (In reply to Richard Biener from comment #22)
> > I see the mems properly get their base adjusted:
> > 
> > (insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
> >                 (const_int 16 [0x10])) [7 MEM[(struct Vec128D.30433 *)_10]+0
> > S16 A128])
> >         (reg:V2DI 616)) -1
> >      (nil))
> > 
> > vs.
> > 
> > (insn 389 388 390 (set (reg:HI 619)
> >         (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
> >                 (const_int 16 [0x10])) [4 MEM[(struct Vec128D.30212
> > *)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
> >      (nil))
> > 
> > both are based off a fake _10.  But we get alias sets 7 and 4 used here
> > which might be a problem.
> > 
> > See update_alias_info_with_stack_vars and uses of decls_to_pointers,
> > in particular from set_mem_attributes_minus_bitpos where we preserve
> > TBAA info with the rewrite.  I'm not sure why that should be OK ...
> > (but I'm sure I must have thought of this problem back in time)
> > 
> > Does the following fix the testcase?
> > 
> > diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
> > index 84b6833225e..81c0a63eddc 100644
> > --- a/gcc/emit-rtl.cc
> > +++ b/gcc/emit-rtl.cc
> > @@ -2128,7 +2128,7 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int
> > objectp,
> >               tree *orig_base = &attrs.expr;
> >               while (handled_component_p (*orig_base))
> >                 orig_base = &TREE_OPERAND (*orig_base, 0);
> > -             tree aptrt = reference_alias_ptr_type (*orig_base);
> > +             tree aptrt = ptr_type_node;
> >               *orig_base = build2 (MEM_REF, TREE_TYPE (*orig_base), *namep,
> >                                    build_int_cst (aptrt, 0));
> >             }
> 
> Sorry, this doesn't help.
> 
> I noticed that it makes insns 384 and 389 become to:
> 
> (insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
>                 (const_int 16 [0x10])) [7 MEM <struct Vec128D.30433>
> [(voidD.48 *)_10]+0 S16 A128])
>         (reg:V2DI 616)) -1
>      (nil))
> 
> (insn 389 388 390 (set (reg:HI 619)
>         (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
>                 (const_int 16 [0x10])) [4 MEM <struct Vec128D.30212>
> [(voidD.48 *)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
>      (nil))
> 
> alias sets are not changed.

Ah, probably the alias-set is determined from the unmangled ref ...

> Aggressively further hacking with attrs.alias =
> 0 can make it pass. Can we make an new alias set for each partition? then
> all involved decls in the same partition is aliased. For a particular
> involved decl, it's aliased to the previous ones and the new ones in its own
> partitions.

hmm, no - this won't work.  In fact even attrs.alias = 0 will probably
not work reliably since we can coalesce variables that escape and thus
the above will only alter accesses via the original decls but not any
accesses done via pointers.  So indeed any alias-set mangling is pointless
here.

Consider

 {
   A x;
   int * volatile p = &x;
   *p = 1;
   .. = *p;
 }
 {
   B y;
   float * volatile q = &y;
   *q = 1;
   .. = *q;
 }

if we coalesce x and y then we are not rewriting any accesses
but obviously the accesses still need to conflict - but the
indirect accesses will have their original non-conflicting alias-set
and thus the scheduler would be free to move the store to *q across
the load from *p (the "trick" would be to make an incentive to do so
of course).

That means we'd have to constrain code motion of accesses to remain
within the declared lifetime of the objects not only on GIMPLE but also
on RTL.  I don't see how we can do that, not even with any of the proposed
fixes to the stack-reuse issues :/  The whole point of the stack
coalescing code is to allow addresses of objects to escape - thus not
requiring us to see all accesses but relying on markers in the IL constraining
object lifetime.  And those markers are thrown away - but they also only
are barriers for TBAA compatible accesses, not "storage-reuse" accesses
that the GIMPLE memory model (and C++ with placement new) allows :/
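
A compilable variant of the scenario above, as a minimal sketch: the struct
definitions and the function name run_scopes are made up for illustration and
are not taken from the testcase. The two objects live in disjoint scopes, so
they are candidates for sharing one stack slot, while every access goes
through a pointer with an int or float type.

```cpp
struct A { int i; };
struct B { float f; };

int run_scopes() {
  int first;
  {
    A x;
    int *volatile p = &x.i;    // address of x escapes through p
    *p = 1;
    first = *p;
  }                            // lifetime of x ends here
  float second;
  {
    B y;                       // may be assigned the stack slot x used
    float *volatile q = &y.f;  // address of y escapes through q
    *q = 1.0f;
    second = *q;
  }
  return first + static_cast<int>(second);
}
```

The program is well defined and run_scopes() returns 2; the point is only that
the store through q must not be moved above the load through p once x and y
share storage.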

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: linkw at gcc dot gnu.org @ 2023-10-23  3:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #26 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #25)
> (In reply to Kewen Lin from comment #24)
> > (In reply to Richard Biener from comment #22)
> > > I see the mems properly get their base adjusted:
> > > 
> > > (insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
> > >                 (const_int 16 [0x10])) [7 MEM[(struct Vec128D.30433 *)_10]+0
> > > S16 A128])
> > >         (reg:V2DI 616)) -1
> > >      (nil))
> > > 
> > > vs.
> > > 
> > > (insn 389 388 390 (set (reg:HI 619)
> > >         (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
> > >                 (const_int 16 [0x10])) [4 MEM[(struct Vec128D.30212
> > > *)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
> > >      (nil))
> > > 
> > > both are based off a fake _10.  But we get alias sets 7 and 4 used here
> > > which might be a problem.
> > > 
> > > See update_alias_info_with_stack_vars and uses of decls_to_pointers,
> > > in particular from set_mem_attributes_minus_bitpos where we preserve
> > > TBAA info with the rewrite.  I'm not sure why that should be OK ...
> > > (but I'm sure I must have thought of this problem back in time)
> > > 
> > > Does the following fix the testcase?
> > > 
> > > diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
> > > index 84b6833225e..81c0a63eddc 100644
> > > --- a/gcc/emit-rtl.cc
> > > +++ b/gcc/emit-rtl.cc
> > > @@ -2128,7 +2128,7 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int
> > > objectp,
> > >               tree *orig_base = &attrs.expr;
> > >               while (handled_component_p (*orig_base))
> > >                 orig_base = &TREE_OPERAND (*orig_base, 0);
> > > -             tree aptrt = reference_alias_ptr_type (*orig_base);
> > > +             tree aptrt = ptr_type_node;
> > >               *orig_base = build2 (MEM_REF, TREE_TYPE (*orig_base), *namep,
> > >                                    build_int_cst (aptrt, 0));
> > >             }
> > 
> > Sorry, this doesn't help.
> > 
> > I noticed that it makes insns 384 and 389 become to:
> > 
> > (insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
> >                 (const_int 16 [0x10])) [7 MEM <struct Vec128D.30433>
> > [(voidD.48 *)_10]+0 S16 A128])
> >         (reg:V2DI 616)) -1
> >      (nil))
> > 
> > (insn 389 388 390 (set (reg:HI 619)
> >         (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
> >                 (const_int 16 [0x10])) [4 MEM <struct Vec128D.30212>
> > [(voidD.48 *)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
> >      (nil))
> > 
> > alias sets are not changed.
> 
> Ah, probably the alias-set is determined from the unmangled ref ...
> 
> > Aggressively further hacking with attrs.alias =
> > 0 can make it pass. Can we make an new alias set for each partition? then
> > all involved decls in the same partition is aliased. For a particular
> > involved decl, it's aliased to the previous ones and the new ones in its own
> > partitions.
> 
> hmm, no - this won't work.  In fact even attrs.alias = 0 will probably
> not work reliably since we can coalesce variables that escape and thus
> the above will only alter accesses via the original decls but not any
> accesses done via pointers.  So indeed any alias-set mangling is pointless
> here.
> 
> Consider
> 
>  {
>    A x;
>    int * volatile p = &x;
>    *p = 1;
>    .. = *p;
>  }
>  {
>    B y;
>    float * volatile q = &y;
>    *q = 1;
>    .. = *q;
>  }
> 
> if we coalesce x and y then we are not rewriting any accesses
> but obviously the accesses still need to conflict - but the
> indirect accesses will have their original non-conflicting alias-set
> and thus the scheduler would be free to move the store to *q across
> the load from *p (the "trick" would be to make an incentive to do so
> of course).

Thanks for the clarification! Is it possible to update the alias set for the
indirect accesses as well, since we know the address is originally taken from
one coalesced decl (and also to update the pointers propagated from it)?

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: rguenth at gcc dot gnu.org @ 2023-10-23  9:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #27 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Kewen Lin from comment #26)
> (In reply to Richard Biener from comment #25)
> > (In reply to Kewen Lin from comment #24)
[...]
> > Ah, probably the alias-set is determined from the unmangled ref ...
> > 
> > > Aggressively further hacking with attrs.alias =
> > > 0 can make it pass. Can we make an new alias set for each partition? then
> > > all involved decls in the same partition is aliased. For a particular
> > > involved decl, it's aliased to the previous ones and the new ones in its own
> > > partitions.
> > 
> > hmm, no - this won't work.  In fact even attrs.alias = 0 will probably
> > not work reliably since we can coalesce variables that escape and thus
> > the above will only alter accesses via the original decls but not any
> > accesses done via pointers.  So indeed any alias-set mangling is pointless
> > here.
> > 
> > Consider
> > 
> >  {
> >    A x;
> >    int * volatile p = &x;
> >    *p = 1;
> >    .. = *p;
> >  }
> >  {
> >    B y;
> >    float * volatile q = &y;
> >    *q = 1;
> >    .. = *q;
> >  }
> > 
> > if we coalesce x and y then we are not rewriting any accesses
> > but obviously the accesses still need to conflict - but the
> > indirect accesses will have their original non-conflicting alias-set
> > and thus the scheduler would be free to move the store to *q across
> > the load from *p (the "trick" would be to make an incentive to do so
> > of course).
> 
> Thanks for the clarification! Is it possible to update the alias set for the
> indirect accesses as well? since we know the address is originally taken
> from one coalesced decl (also update its propagated ones).

I suppose we could record a bitmap of all decls participating in any
coalescing, check whether a MEM could possibly refer to any of them
via the points-to API and then force alias-set zero for those.  We
could also try to do sophisticated analysis to make assigning a new
alias-set for each coalesce group work, merging groups when there are
indirect accesses that could alias a member of more than a single
group.

Note that the other bugs linked perform wrong coalescings (for things
which have overlapping lifetimes) while this one performs coalescing
wrong (not properly adjusting accesses so they later conflict).
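
The second idea (one alias-set per coalesce group, merging groups when an
indirect access may alias members of more than one group) can be pictured as a
union-find over group ids. The following is a toy model under that assumption
only; GroupSets and note_indirect_access are hypothetical names, and nothing
here is GCC code or uses the points-to API.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Toy union-find over coalesce-group ids 0..n-1.
struct GroupSets {
  std::vector<int> parent;

  explicit GroupSets(int n) : parent(n) {
    std::iota(parent.begin(), parent.end(), 0);  // each group starts alone
  }

  // Representative of the merged set that group g belongs to.
  int find(int g) {
    while (parent[g] != g) {
      parent[g] = parent[parent[g]];  // path halving
      g = parent[g];
    }
    return g;
  }

  // An indirect access may point into every group listed in may_alias, so
  // all of those groups have to end up sharing one alias set.
  void note_indirect_access(const std::vector<int> &may_alias) {
    for (std::size_t i = 1; i < may_alias.size(); ++i) {
      int a = find(may_alias[0]);
      int b = find(may_alias[i]);
      parent[b] = a;
    }
  }
};
```

After all indirect accesses have been noted, groups with the same
representative would get the same alias set and untouched groups keep their
own.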

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: matz at gcc dot gnu.org @ 2023-10-23 12:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Michael Matz <matz at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at gcc dot gnu.org

--- Comment #28 from Michael Matz <matz at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #27)
> (In reply to Kewen Lin from comment #26)
> > Thanks for the clarification! Is it possible to update the alias set for the
> > indirect accesses as well? since we know the address is originally taken
> > from one coalesced decl (also update its propagated ones).

That's not generally possible, the address-taking and the actual access might
be separated by arbitrary obfuscating code:

   char *p = &x;
   char *p2 = get_some_pointer(p);
   *p2 = ...

Here p2 may, or may not, point to x.  So we'd need to be fairly conservative
here ...

> I suppose we could record a bitmap of all decls participating in any
> coalescing, check whether a MEM could possibly refer to any of them
> via the points-to API

... which the points-to API of course will be.

> and then force alias-set zero for those.

So that will work.  But I wonder if the result then won't be that essentially
all of the mem accesses will get alias set zero, at least if there was any
coalescing.  At that point we may also bite the bullet and just do away
with any TBAA alias sets in RTL at all.

> We
> could also try to do sophisticated analysis to make assigning a new
> alias-set for each coalesce group work, merging groups when there's
> indirect accesses that could alias a member of more than a single
> group.

Question is if the sophistication is worth it.
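
For reference, a self-contained version of the obfuscation example above. It
is a sketch only: this get_some_pointer takes two extra, made-up parameters
(fallback and keep) so that the "may, or may not, point to x" part can be
shown in a few lines.

```cpp
// Hypothetical stand-in: whether the result still points to the object
// behind p depends on a runtime value the compiler cannot see through.
char *get_some_pointer(char *p, char *fallback, bool keep) {
  return keep ? p : fallback;
}

char store_through(bool keep) {
  char x = 'x';
  char other = 'o';
  char *p = &x;
  char *p2 = get_some_pointer(p, &other, keep);
  *p2 = '!';  // modifies x only when keep is true
  return x;
}
```

store_through(true) returns '!' and store_through(false) returns 'x', so any
analysis of the store through p2 has to assume it may touch x.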

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: rguenth at gcc dot gnu.org @ 2023-10-23 13:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|linkw at gcc dot gnu.org           |rguenth at gcc dot gnu.org

--- Comment #29 from Richard Biener <rguenth at gcc dot gnu.org> ---
I will try to implement the simple approach.

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: rguenth at gcc dot gnu.org @ 2023-10-23 14:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #30 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 56175
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56175&action=edit
prototype patch

This is an (untested) fix; API-wise it still needs some cleanup.  It's the
simplest fix that doesn't completely give up on TBAA.

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: linkw at gcc dot gnu.org @ 2023-10-24  2:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #31 from Kewen Lin <linkw at gcc dot gnu.org> ---
Thanks for the explanation from both of you!

(In reply to Richard Biener from comment #30)
> Created attachment 56175 [details]
> prototype patch

I confirmed that this fix makes the test case (#c9 + #c10) and its reduced case
pass, but the original test case (#c1) still fails with it. It also fails with
-fstack-reuse=none + -fno-strict-aliasing + -O2, so I think the original
case still suffers from another latent bug.

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: linkw at gcc dot gnu.org @ 2023-10-31  6:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #32 from Kewen Lin <linkw at gcc dot gnu.org> ---
> case pass, but the original test case (#c1) can't pass with this, it can't
> pass with -fstack-reuse=none + -fno-strict-aliasing + -O2 either, I think
> the original case still suffers another latent bug.

Well, I spent some time narrowing down the remaining issue on #c1 and found
that the test case reduction generated unexpected test source (an uninitialized
array which is used for the assert comparison). It is located at lines 438 to
439 below:

   428  struct TestSatWidenMulPairwiseAdd {
   429    template <typename TN, class DN> void operator()(TN, DN dn) {
   430      using TN_U = MakeUnsigned<TN>;
   431      using TW = MakeWide<TN>;
   432      RepartitionToWide<DN> dw;
   433      using VW = Vec<decltype(dw)>;
   434      using VN = Vec<decltype(dn)>;
   435      size_t NN = Lanes(dn), NW = Lanes(dw), kMaxLanesPerNBlock =
sizeof(TN),
   436             kMaxLanesPerWBlock = sizeof(TW);
   437      RebindToUnsigned<decltype(dn)> dn_u;
   438      VW f0;
   439      VN nf0, nf1 = Set(dn, TN{});
   440      AssertVecEqual(dw, f0, SatWidenMulPairwiseAdd(dw, BitCast(dn_u,
nf0), nf0),
   441                     "", 0);


The expected value is f0 and the operand of the following
SatWidenMulPairwiseAdd computation is nf0, but neither is initialized.
I'd expect both to be initialized, like:

  VW f0 = Set(dw, TW{});
  VN nf0 = Set(dn, TN{});

So IMHO the #c1 test case is problematic. Hi @Mathieu, could you double-check
it?
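
As a small illustration of the difference, here is a sketch with a
hypothetical Vec4 type (a stand-in, not highway's Vec). Value-initializing
both operands makes the comparison deterministic, whereas a plain `Vec4 v;`
would leave the lanes indeterminate.

```cpp
#include <cstdint>

// Hypothetical stand-in for the VW / VN vector types in the listing.
struct Vec4 {
  std::int16_t raw[4];
};

// Value-initialization: every lane is zero.
Vec4 zero_vec() {
  Vec4 v{};
  return v;
}

// Lane-by-lane comparison, in the spirit of AssertVecEqual.
bool lanes_equal(const Vec4 &a, const Vec4 &b) {
  for (int i = 0; i < 4; ++i) {
    if (a.raw[i] != b.raw[i]) return false;
  }
  return true;
}
```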

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: cvs-commit at gcc dot gnu.org @ 2023-12-13  8:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #33 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:97094d2ffd7d00261e6d7cc5d4a62dc7c2c89b64

commit r14-6481-g97094d2ffd7d00261e6d7cc5d4a62dc7c2c89b64
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Dec 13 08:54:49 2023 +0100

    middle-end/111591 - explain why TBAA doesn't need adjustment

    While tidying the prototype patch I've done for the reduced testcase
    in PR111591 and in that process trying to produce a testcase that
    is miscompiled by stack slot coalescing and the TBAA info that
    remains un-altered I've realized we do not need to adjust TBAA info.

    The following documents this in the place we adjust points-to info
    which we do need to adjust.

            PR middle-end/111591
            * cfgexpand.cc (update_alias_info_with_stack_vars): Document
            why not adjusting TBAA info on accesses is OK.

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: rguenth at gcc dot gnu.org @ 2023-12-13  8:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #56175|0                           |1
        is obsolete|                            |

--- Comment #34 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 56867
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56867&action=edit
patch adjusting TBAA info after stack slot sharing (unneeded)

This is the fully fleshed out patch adjusting TBAA info, which in hindsight
shouldn't be necessary.  As a side effect the alias-set zero could of course
avoid the miscompile of the reduced testcase, but not by design.

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
From: rguenth at gcc dot gnu.org @ 2023-12-13  8:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|rguenth at gcc dot gnu.org         |unassigned at gcc dot gnu.org
             Status|ASSIGNED                    |WAITING

--- Comment #35 from Richard Biener <rguenth at gcc dot gnu.org> ---
Back to the reporter for the question in comment#32 around the validity of the
testcase.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (34 preceding siblings ...)
  2023-12-13  8:12 ` rguenth at gcc dot gnu.org
@ 2023-12-15  7:32 ` malat at debian dot org
  2023-12-15  7:43 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: malat at debian dot org @ 2023-12-15  7:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #36 from Mathieu Malaterre <malat at debian dot org> ---
(In reply to Kewen Lin from comment #32)
[...]
> So IMHO #c1 test case is problematic, hi @Mathieu, could you have a double
> check?

I vaguely recall crafting this test-case with cvise with gcc-13. This is why it
is using some kind of gcc-13 specific `__remove_reference`. I cannot verify the
code using clang because of this. I also do not see anything wrong under
valgrind.

If the other test-case(s) (also cvise-reduced) is/are now working I would say
let's close this one as fixed and if it reappears in the original highway
source code, I'll re-run yet another cvise reduction.

Thanks everyone for your work! Very much appreciated.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (35 preceding siblings ...)
  2023-12-15  7:32 ` malat at debian dot org
@ 2023-12-15  7:43 ` rguenth at gcc dot gnu.org
  2023-12-15 10:01 ` linkw at gcc dot gnu.org
                   ` (6 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-12-15  7:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |FIXED

--- Comment #37 from Richard Biener <rguenth at gcc dot gnu.org> ---
Let's do that then.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (36 preceding siblings ...)
  2023-12-15  7:43 ` rguenth at gcc dot gnu.org
@ 2023-12-15 10:01 ` linkw at gcc dot gnu.org
  2023-12-15 10:23 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: linkw at gcc dot gnu.org @ 2023-12-15 10:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #38 from Kewen Lin <linkw at gcc dot gnu.org> ---
I found this has been marked as resolved but it seems that the patch in comment
#34 hasn't been pushed. Is that intended? Or did I miss a commit that was
pushed but wasn't associated with this PR?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (37 preceding siblings ...)
  2023-12-15 10:01 ` linkw at gcc dot gnu.org
@ 2023-12-15 10:23 ` rguenth at gcc dot gnu.org
  2023-12-15 11:28 ` malat at debian dot org
                   ` (4 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-12-15 10:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #39 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Kewen Lin from comment #38)
> I found this has been marked as resolved but it seems that the patch in
> comment #34 hasn't been pushed. Is that intended? Or did I miss a commit
> that was pushed but wasn't associated with this PR?

Yes, that was intended - the patch is not necessary, there's no bug on the TBAA
side.  I understand that highway itself is now fine (for whatever reason) and
the reduced testcase invokes undefined behavior.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (38 preceding siblings ...)
  2023-12-15 10:23 ` rguenth at gcc dot gnu.org
@ 2023-12-15 11:28 ` malat at debian dot org
  2023-12-15 11:32 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: malat at debian dot org @ 2023-12-15 11:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #40 from Mathieu Malaterre <malat at debian dot org> ---
(In reply to Richard Biener from comment #39)
> (In reply to Kewen Lin from comment #38)
> > I found this has been marked as resolved but it seems that the patch in
> > comment #34 hasn't been pushed. Is that intended? Or did I miss a commit
> > that was pushed but wasn't associated with this PR?
> 
> Yes, that was intended - the patch is not necessary, there's no bug on the
> TBAA side.  I understand that highway itself is now fine (for whatever
> reason) and the reduced testcase invokes undefined behavior.

Could someone confirm that the patch in c30 is merged (commit id would be nice!),
since it solves the other test case? Thanks

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (39 preceding siblings ...)
  2023-12-15 11:28 ` malat at debian dot org
@ 2023-12-15 11:32 ` rguenth at gcc dot gnu.org
  2023-12-18  6:01 ` linkw at gcc dot gnu.org
                   ` (2 subsequent siblings)
  43 siblings, 0 replies; 45+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-12-15 11:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #41 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Mathieu Malaterre from comment #40)
> (In reply to Richard Biener from comment #39)
> > (In reply to Kewen Lin from comment #38)
> > > I found this has been marked as resolved but it seems that the patch in
> > > comment #34 hasn't been pushed. Is that intended? Or did I miss a commit
> > > that was pushed but wasn't associated with this PR?
> > 
> > Yes, that was intended - the patch is not necessary, there's no bug on the
> > TBAA side.  I understand that highway itself is now fine (for whatever
> > reason) and the reduced testcase invokes undefined behavior.
> 
> Could someone confirm that the patch in c30 is merged (commit id would be
> nice!), since it solves the other test case? Thanks

No, it's not merged. It's basically the same as the comment#34 one and shouldn't
be needed (but it might cause another unidentified issue to become latent).

What's the "other" testcase?  Do we know that doesn't suffer from the same
uninitialized issue?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (40 preceding siblings ...)
  2023-12-15 11:32 ` rguenth at gcc dot gnu.org
@ 2023-12-18  6:01 ` linkw at gcc dot gnu.org
  2023-12-18  6:03 ` linkw at gcc dot gnu.org
  2023-12-19  5:38 ` linkw at gcc dot gnu.org
  43 siblings, 0 replies; 45+ messages in thread
From: linkw at gcc dot gnu.org @ 2023-12-18  6:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #42 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #41)
> What's the "other" testcase?  Do we know that doesn't suffer from the same
> uninitialized issue?

As for the "other" test cases, I guess he was referring to my comment #c31;
these are the ones in comment #c9 and #c10. I previously reduced #c10 further
and didn't detect any obvious uninitialized issue (but I could be wrong).

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (41 preceding siblings ...)
  2023-12-18  6:01 ` linkw at gcc dot gnu.org
@ 2023-12-18  6:03 ` linkw at gcc dot gnu.org
  2023-12-19  5:38 ` linkw at gcc dot gnu.org
  43 siblings, 0 replies; 45+ messages in thread
From: linkw at gcc dot gnu.org @ 2023-12-18  6:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #43 from Kewen Lin <linkw at gcc dot gnu.org> ---
Created attachment 56899
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56899&action=edit
Previously reduced case for comment 10

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
  2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
                   ` (42 preceding siblings ...)
  2023-12-18  6:03 ` linkw at gcc dot gnu.org
@ 2023-12-19  5:38 ` linkw at gcc dot gnu.org
  43 siblings, 0 replies; 45+ messages in thread
From: linkw at gcc dot gnu.org @ 2023-12-19  5:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |---
             Status|RESOLVED                    |REOPENED

--- Comment #44 from Kewen Lin <linkw at gcc dot gnu.org> ---
I just checked the test case in comment #43. I think those Set/Load are able to
initialize those arrays as expected, so re-opening this.

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2023-12-19  5:38 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
2023-09-25 11:53 [Bug target/111591] New: ppc64be: miscompilation with -mstrict-align / -O3 malat at debian dot org
2023-09-25 11:55 ` [Bug target/111591] " malat at debian dot org
2023-09-25 11:59 ` rguenth at gcc dot gnu.org
2023-09-25 12:20 ` malat at debian dot org
2023-09-25 13:15 ` malat at debian dot org
2023-09-25 13:41 ` malat at debian dot org
2023-09-26  6:50 ` linkw at gcc dot gnu.org
2023-09-26  7:56 ` rguenth at gcc dot gnu.org
2023-09-26  8:14 ` linkw at gcc dot gnu.org
2023-09-26  9:28 ` malat at debian dot org
2023-09-26  9:28 ` malat at debian dot org
2023-09-26  9:31 ` malat at debian dot org
2023-09-26  9:31 ` malat at debian dot org
2023-09-27  9:24 ` linkw at gcc dot gnu.org
2023-09-27  9:26 ` rguenth at gcc dot gnu.org
2023-09-28  0:20 ` linkw at gcc dot gnu.org
2023-10-13 10:19 ` linkw at gcc dot gnu.org
2023-10-13 11:08 ` rguenth at gcc dot gnu.org
2023-10-13 12:09 ` linkw at gcc dot gnu.org
2023-10-13 12:32 ` rguenth at gcc dot gnu.org
2023-10-16  9:11 ` linkw at gcc dot gnu.org
2023-10-19  7:27 ` linkw at gcc dot gnu.org
2023-10-19 11:06 ` rguenth at gcc dot gnu.org
2023-10-19 11:12 ` rguenth at gcc dot gnu.org
2023-10-20  5:53 ` linkw at gcc dot gnu.org
2023-10-20  6:25 ` rguenth at gcc dot gnu.org
2023-10-23  3:21 ` linkw at gcc dot gnu.org
2023-10-23  9:48 ` rguenth at gcc dot gnu.org
2023-10-23 12:36 ` matz at gcc dot gnu.org
2023-10-23 13:30 ` rguenth at gcc dot gnu.org
2023-10-23 14:04 ` rguenth at gcc dot gnu.org
2023-10-24  2:59 ` linkw at gcc dot gnu.org
2023-10-31  6:40 ` linkw at gcc dot gnu.org
2023-12-13  8:00 ` cvs-commit at gcc dot gnu.org
2023-12-13  8:02 ` rguenth at gcc dot gnu.org
2023-12-13  8:12 ` rguenth at gcc dot gnu.org
2023-12-15  7:32 ` malat at debian dot org
2023-12-15  7:43 ` rguenth at gcc dot gnu.org
2023-12-15 10:01 ` linkw at gcc dot gnu.org
2023-12-15 10:23 ` rguenth at gcc dot gnu.org
2023-12-15 11:28 ` malat at debian dot org
2023-12-15 11:32 ` rguenth at gcc dot gnu.org
2023-12-18  6:01 ` linkw at gcc dot gnu.org
2023-12-18  6:03 ` linkw at gcc dot gnu.org
2023-12-19  5:38 ` linkw at gcc dot gnu.org
