public inbox for gcc-bugs@sourceware.org
* [Bug target/110274] New: [14 Regression] Wrong AVX2 code on highway-1.0.4 on -O1 and above
From: slyfox at gcc dot gnu.org @ 2023-06-15 22:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110274
Bug ID: 110274
Summary: [14 Regression] Wrong AVX2 code on highway-1.0.4 on
-O1 and above
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: slyfox at gcc dot gnu.org
Target Milestone: ---
I initially observed test failures in the highway-1.0.4 project when building
with r14-1868-ga4df0ce78d6f1b. Its test suite fails with:
The following tests FAILED:
299 - HwyCombineTestGroup/HwyCombineTest.TestAllConcatOddEven/AVX2 #
GetParam() = 512 (Subprocess aborted)
684 - HwyMulTestGroup/HwyMulTest.TestAllRearrangeToOddPlusEven/AVX2 #
GetParam() = 512 (Subprocess aborted)
Unless I missed something, here is a self-contained example:
$ cat shift_test.cc
#include <cstdint> /* int16_t */
#include <cstring> /* memcmp() */
#include <immintrin.h>

typedef __m256i v16x256;
typedef __m256i v32x256;
typedef __m128i v16x128;

static v16x256 Zero_() {
  return _mm256_setzero_si256();
}

// Lanes 0..15 hold the values 0..15 (_mm256_set_epi16 takes the highest lane first).
static v16x256 Iota0_() {
  return _mm256_set_epi16(
      int16_t{15}, int16_t{14}, int16_t{13}, int16_t{12},
      int16_t{11}, int16_t{10}, int16_t{9}, int16_t{8},
      int16_t{7}, int16_t{6}, int16_t{5}, int16_t{4},
      int16_t{3}, int16_t{2}, int16_t{1}, int16_t{0});
}

static v16x128 LowerHalf_(v16x256 v) { return _mm256_castsi256_si128(v); }
static v16x128 UpperHalf_(v16x256 v) { return _mm256_extracti128_si256(v, 1); }

static v32x256 bcast_16_to_32(v16x256 v) { return v; }
static v32x256 And_(v32x256 a, v32x256 b) { return _mm256_and_si256(a, b); }
static v32x256 Set_16(int t) { return _mm256_set1_epi32(t); }

static v16x256 ConcatEven_(v16x256 hi, v16x256 lo) {
  // Isolate lower 16 bits per u32 so we can pack.
  const v32x256 mask = Set_16(0x0000FFFF);
  const v32x256 uH = And_(bcast_16_to_32(hi), mask);
  const v32x256 uL = And_(bcast_16_to_32(lo), mask);
  const __m256i u16 = _mm256_packus_epi32(uL, uH);
  // packus works per 128-bit lane, so reorder the 64-bit quads to lo, lo, hi, hi.
  return _mm256_permute4x64_epi64(u16, _MM_SHUFFLE(3, 1, 2, 0));
}

static v32x256 PromoteTo_(v16x128 v) { return _mm256_cvtepu16_epi32(v); }
static v32x256 Shl_32(v32x256 v, v32x256 bits) {
  return _mm256_sllv_epi32(v, bits);
}
static v16x256 bcast_32_to_16(v32x256 v) { return v; }

// 16-bit variable shift emulated with two 32-bit variable shifts.
static v16x256 AVX2ShlU16Vec256_(v16x256 v, v16x256 bits) {
  const v32x256 lo_shl_result = Shl_32(PromoteTo_(LowerHalf_(v)),
                                       PromoteTo_(LowerHalf_(bits)));
  const v32x256 hi_shl_result = Shl_32(PromoteTo_(UpperHalf_(v)),
                                       PromoteTo_(UpperHalf_(bits)));
  return ConcatEven_(bcast_32_to_16(hi_shl_result),
                     bcast_32_to_16(lo_shl_result));
}

static v16x256 Shl_16(v16x256 v, v16x256 bits) {
  return AVX2ShlU16Vec256_(v, bits);
}

static void TestAllVariableShifts() {
  const auto v0 = Zero_();
  const auto values = Iota0_();
  const auto r = Shl_16(values, v0);
  // is there a better way to compare __m256i?
  if (memcmp(&values, &r, sizeof(r)) != 0)
    __builtin_trap();
}

int main() { TestAllVariableShifts(); }
Triggering the bug:
$ g++ -o bug -O0 -mavx2 shift_test.cc && ./bug
<ok>
$ g++ -o bug -O1 -mavx2 shift_test.cc && ./bug
Illegal instruction (core dumped)
From what I understand, the test generates an Iota sample vector and shifts
each lane left by 0 bits, with the per-lane shift counts taken from a second
(all-zero) vector. The test expects the result to equal the original Iota
value, but gcc-14 generates something else.
It is possible that I extracted the test incorrectly and introduced a bug
myself, but neither asan nor ubsan complains about it, so I would expect -O0
and -O1 to produce the same result in any case.
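To make the expected result concrete, here is a scalar sketch of what the
reproducer computes, written without intrinsics so it does not depend on how
the compiler handles AVX2. It is only a model: the helper names
(PromoteHalfModel, ShlModel, SatU16, ConcatEvenModel, Shl16Model,
IdentityShiftHolds) are hypothetical and do not appear in highway or in the
report, and the lane semantics are taken from the documented behaviour of
_mm256_cvtepu16_epi32, _mm256_sllv_epi32, _mm256_packus_epi32 and
_mm256_permute4x64_epi64.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using U16x16 = std::array<std::uint16_t, 16>;  // a 256-bit vector viewed as u16 lanes
using U32x8 = std::array<std::uint32_t, 8>;    // a 256-bit vector viewed as u32 lanes

// Models _mm256_cvtepu16_epi32 on one 128-bit half of v:
// zero-extends the 8 u16 lanes starting at index `first`.
static U32x8 PromoteHalfModel(const U16x16& v, std::size_t first) {
  U32x8 r{};
  for (std::size_t i = 0; i < 8; ++i) r[i] = v[first + i];
  return r;
}

// Models _mm256_sllv_epi32: per-lane left shift; counts above 31 yield 0.
static U32x8 ShlModel(const U32x8& v, const U32x8& bits) {
  U32x8 r{};
  for (std::size_t i = 0; i < 8; ++i) r[i] = bits[i] > 31 ? 0u : v[i] << bits[i];
  return r;
}

// Models the unsigned saturation packus applies to a lane read as signed 32-bit.
static std::uint16_t SatU16(std::uint32_t x) {
  if (x > 0x7FFFFFFFu) return 0;  // negative when read as int32_t
  if (x > 0xFFFFu) return 0xFFFFu;
  return static_cast<std::uint16_t>(x);
}

// Models ConcatEven_(hi, lo): mask, packus per 128-bit lane, then permute quads.
static U16x16 ConcatEvenModel(const U32x8& hi, const U32x8& lo) {
  U16x16 packed{};
  for (std::size_t lane = 0; lane < 2; ++lane) {
    for (std::size_t i = 0; i < 4; ++i) {
      // packus(uL, uH): each lane gets 4 values from uL, then 4 from uH.
      packed[lane * 8 + i] = SatU16(lo[lane * 4 + i] & 0xFFFFu);
      packed[lane * 8 + 4 + i] = SatU16(hi[lane * 4 + i] & 0xFFFFu);
    }
  }
  // _MM_SHUFFLE(3, 1, 2, 0) selects source quads 0, 2, 1, 3 in that order.
  static const std::size_t kSrcQuad[4] = {0, 2, 1, 3};
  U16x16 r{};
  for (std::size_t q = 0; q < 4; ++q)
    for (std::size_t i = 0; i < 4; ++i) r[q * 4 + i] = packed[kSrcQuad[q] * 4 + i];
  return r;
}

// Models Shl_16(v, bits) from the report.
static U16x16 Shl16Model(const U16x16& v, const U16x16& bits) {
  const U32x8 lo = ShlModel(PromoteHalfModel(v, 0), PromoteHalfModel(bits, 0));
  const U32x8 hi = ShlModel(PromoteHalfModel(v, 8), PromoteHalfModel(bits, 8));
  return ConcatEvenModel(hi, lo);
}

// Mirrors TestAllVariableShifts(): shifting Iota by zero should return Iota.
static bool IdentityShiftHolds() {
  U16x16 iota{};
  for (std::size_t i = 0; i < 16; ++i) iota[i] = static_cast<std::uint16_t>(i);
  const U16x16 zero{};
  return Shl16Model(iota, zero) == iota;
}
```

Calling assert(IdentityShiftHolds()) from a main() checks the model. Under
this model the shift by zero is an identity, which is the result the -O0 build
of shift_test.cc reports.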
$ g++ -v
Using built-in specs.
COLLECT_GCC=/<<NIX>>/gcc-14.0.0/bin/g++
COLLECT_LTO_WRAPPER=/<<NIX>>/gcc-14.0.0/libexec/gcc/x86_64-unknown-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with:
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 99999999 (experimental) (GCC)
From: slyfox at gcc dot gnu.org @ 2023-06-15 22:53 UTC (permalink / raw)
--- Comment #1 from Sergei Trofimovich <slyfox at gcc dot gnu.org> ---
Created attachment 55336
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55336&action=edit
shift_test.cc
From: pinskia at gcc dot gnu.org @ 2023-06-15 23:01 UTC (permalink / raw)
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Dup of bug 110235.
*** This bug has been marked as a duplicate of bug 110235 ***
From: pinskia at gcc dot gnu.org @ 2023-06-15 23:02 UTC (permalink / raw)
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note it is _mm256_packus_epi32 which is being miscompiled.
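A narrower check of that single intrinsic could look like the sketch below. It
compares _mm256_packus_epi32 against a scalar reference for the same inputs.
The function names (PackusMatchesReference, RunPackusCheck) are hypothetical,
the code assumes GCC or Clang (for the target attribute and
__builtin_cpu_supports), and it is not confirmed to trigger the miscompile on
an affected compiler; it only shows what a correct result looks like.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

// The target attribute lets this function use AVX2 without -mavx2 on the
// command line; the caller must confirm CPU support before calling it.
__attribute__((target("avx2")))
static bool PackusMatchesReference() {
  const std::int32_t a[8] = {0, 1, 2, 3, 4, 5, 6, 7};
  const std::int32_t b[8] = {8, 9, 10, 11, 12, 13, 14, 15};
  const __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a));
  const __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b));

  std::uint16_t got[16];
  _mm256_storeu_si256(reinterpret_cast<__m256i*>(got),
                      _mm256_packus_epi32(va, vb));

  // Scalar reference: each 128-bit lane holds 4 values from a, then 4 from b.
  // All inputs fit in 16 bits, so saturation does not change them.
  std::uint16_t want[16];
  for (std::size_t lane = 0; lane < 2; ++lane) {
    for (std::size_t i = 0; i < 4; ++i) {
      want[lane * 8 + i] = static_cast<std::uint16_t>(a[lane * 4 + i]);
      want[lane * 8 + 4 + i] = static_cast<std::uint16_t>(b[lane * 4 + i]);
    }
  }
  return std::memcmp(got, want, sizeof(want)) == 0;
}

// Returns true when the intrinsic matches the reference, or when the CPU
// has no AVX2 and there is nothing to check.
static bool RunPackusCheck() {
  if (!__builtin_cpu_supports("avx2")) return true;
  return PackusMatchesReference();
}
#else
// Not an x86 target: nothing to check.
static bool RunPackusCheck() { return true; }
#endif
```

With these inputs the expected u16 lanes are 0 1 2 3 8 9 10 11 4 5 6 7 12 13
14 15, which is why ConcatEven_ in the report follows the pack with a 64-bit
permute.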