From: "ubizjak at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/106322] i386: Wrong code at O2 level (O0 / O1 are working)
Date: Tue, 19 Jul 2022 07:58:11 +0000
X-Bugzilla-Version: 12.1.0
X-Bugzilla-Status: WAITING

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106322

--- Comment #10 from Uroš Bizjak ---
(In reply to Mathieu Malaterre from comment #9)
> Technically I can also execute the `uint16` portion of the unit test and
> produce a failure (so this seems to be consistent behavior with signed
> counterpart):
>
> ```
> HWY_NOINLINE void TestAllMulHigh() {
>   ForPartialVectors test;
>   // test(int16_t());
>   test(uint16_t());
> }

As this is a runtime failure, you will have to provide a (minimized) runtime
testcase. I took a quick look at the sources, and it looks to me that the
following procedure can produce a testcase:

Use tests/mul_tests.cc and strip out as many lines as possible.
Above the part that you show are several tests. Please find out which test
fails.

As can be seen from the test run, the failure is in the 128-bit emulation
part. These operations are in hwy/ops/emu128-inl.h, specifically:

--cut here--
HWY_API Vec128<uint16_t, N> MulHigh(Vec128<uint16_t, N> a,
                                    const Vec128<uint16_t, N> b) {
  for (size_t i = 0; i < N; ++i) {
    // Cast to uint32_t first to prevent overflow. Otherwise the result of
    // uint16_t * uint16_t is in "int" which may overflow. In practice the
    // result is the same but this way it is also defined.
    a.raw[i] = static_cast<uint16_t>(
        (static_cast<uint32_t>(a.raw[i]) * static_cast<uint32_t>(b.raw[i])) >>
        16);
  }
  return a;
}
--cut here--

Put everything together in one file, check that it still fails, and you have a
testcase. If possible, simplify it as much as you can; if you can convert it
to plain C, the testcase will be much easier to analyse.

The test starts failing with gcc-12 because gcc-12 enabled auto-vectorisation
at -O2. The failure suggests there are some issues with the vectorisation of
the above code, or perhaps with the preparation of test values before the
loop.
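
For illustration, the procedure above could end up with something along the
lines of the following sketch. It is not taken from the Highway sources and
has not been confirmed to reproduce the failure: Vec8U16, kLanes, MulHighRef,
CountMulHighMismatches and the input values are hypothetical names and
choices, a fixed count of 8 lanes is assumed, and __attribute__((noinline)) is
a GCC/Clang extension. Only the arithmetic inside the loop follows the
MulHigh code quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the emulated 128-bit vector: 8 lanes of uint16_t.
constexpr size_t kLanes = 8;
struct Vec8U16 {
  uint16_t raw[kLanes];
};

// Same arithmetic as the quoted MulHigh loop. noinline (GCC/Clang attribute)
// keeps the loop in its own function so the vectoriser sees it in isolation.
__attribute__((noinline)) Vec8U16 MulHigh(Vec8U16 a, const Vec8U16 b) {
  for (size_t i = 0; i < kLanes; ++i) {
    a.raw[i] = static_cast<uint16_t>(
        (static_cast<uint32_t>(a.raw[i]) * static_cast<uint32_t>(b.raw[i])) >>
        16);
  }
  return a;
}

// Scalar reference for a single lane. The volatile temporary is meant to keep
// the compiler from folding or vectorising the reference computation.
uint16_t MulHighRef(uint16_t x, uint16_t y) {
  volatile uint32_t wide = static_cast<uint32_t>(x) * static_cast<uint32_t>(y);
  return static_cast<uint16_t>(wide >> 16);
}

// Returns the number of lanes where MulHigh disagrees with the reference.
int CountMulHighMismatches() {
  // Sanity check of the reference itself: 0xFFFF * 0xFFFF = 0xFFFE0001.
  assert(MulHighRef(0xFFFF, 0xFFFF) == 0xFFFE);

  // Read the seed through a volatile so the inputs are prepared at runtime.
  volatile uint16_t seed = 0x1234;
  Vec8U16 a, b;
  for (size_t i = 0; i < kLanes; ++i) {
    a.raw[i] = static_cast<uint16_t>(seed * (2 * i + 1) + 0x8000);
    b.raw[i] = static_cast<uint16_t>(0xFFFF - seed * i);
  }

  const Vec8U16 r = MulHigh(a, b);
  int mismatches = 0;
  for (size_t i = 0; i < kLanes; ++i) {
    if (r.raw[i] != MulHighRef(a.raw[i], b.raw[i])) ++mismatches;
  }
  return mismatches;
}
```

To turn this into a runtime testcase, call CountMulHighMismatches() from
main() and abort on a nonzero count, then compare a build at -O2 against one
at -O1. If the two builds disagree, the file is a candidate for attaching to
the bug; if they agree, the stripped-down version has lost whatever triggers
the failure and more of the original test has to be kept.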