* [Bug target/99321] [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} since r11-7121-g37876976b0511ec9
2021-03-01 15:53 [Bug target/99321] New: [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} zsojka at seznam dot cz
@ 2021-03-01 17:04 ` marxin at gcc dot gnu.org
2021-03-01 17:05 ` jakub at gcc dot gnu.org
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-03-01 17:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99321
Martin Liška <marxin at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |11.0
Last reconfirmed| |2021-03-01
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
Summary|[11 Regression] ICE: in |[11 Regression] ICE: in
|extract_constrain_insn, at |extract_constrain_insn, at
|recog.c:2670: insn does not |recog.c:2670: insn does not
|satisfy its constraints: |satisfy its constraints:
|{*uminv16qi3} |{*uminv16qi3} since
| |r11-7121-g37876976b0511ec9
CC| |jakub at gcc dot gnu.org,
| |marxin at gcc dot gnu.org
--- Comment #1 from Martin Liška <marxin at gcc dot gnu.org> ---
Confirmed, started with r11-7121-g37876976b0511ec9.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug target/99321] [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} since r11-7121-g37876976b0511ec9
2021-03-01 15:53 [Bug target/99321] New: [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} zsojka at seznam dot cz
2021-03-01 17:04 ` [Bug target/99321] [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} since r11-7121-g37876976b0511ec9 marxin at gcc dot gnu.org
@ 2021-03-01 17:05 ` jakub at gcc dot gnu.org
2021-03-01 19:06 ` jakub at gcc dot gnu.org
` (8 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-01 17:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99321
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug target/99321] [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} since r11-7121-g37876976b0511ec9
2021-03-01 15:53 [Bug target/99321] New: [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} zsojka at seznam dot cz
2021-03-01 17:04 ` [Bug target/99321] [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} since r11-7121-g37876976b0511ec9 marxin at gcc dot gnu.org
2021-03-01 17:05 ` jakub at gcc dot gnu.org
@ 2021-03-01 19:06 ` jakub at gcc dot gnu.org
2021-03-02 17:18 ` jakub at gcc dot gnu.org
` (7 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-01 19:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99321
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I'm afraid we have multiple problems with -mavx512vl -mno-avx512bw (are there
any CPUs with that combination of ISA sets though?).
In r7-618-g9bdf001b7a2232753e4a92582218bb4f24c8d809 I've fixed the 16-byte
vp{min,max}ub to not allow v constraints when not AVX512BW.
But clearly many other patterns need something like that and don't have that.
E.g. vp{add,sub,{min,max},{u,s}}{b,w}, both 16-byte and 32-byte.
The result of that aren't ICEs, but code silently using AVX512BW features when
AVX512VL is enabled but AVX512BW is not.
Similarly, vpmullq needs AVX512DQ.
And, another thing is that the:
(define_peephole2
[(set (match_operand 0 "sse_reg_operand")
(match_operand 1 "sse_reg_operand"))
(set (match_dup 0)
(match_operator 3 "commutative_operator"
[(match_dup 0)
(match_operand 2 "memory_operand")]))]
"REGNO (operands[0]) != REGNO (operands[1])"
[(set (match_dup 0) (match_dup 2))
(set (match_dup 0)
(match_op_dup 3 [(match_dup 0) (match_dup 1)]))])
peephole2 doesn't work and results in ICEs if the patterns are correct (as is
the case of *uminv16qi3) if one is unlucky and operands[1] is [xy]mm16 or
higher register and operands[0] is not.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug target/99321] [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} since r11-7121-g37876976b0511ec9
2021-03-01 15:53 [Bug target/99321] New: [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} zsojka at seznam dot cz
` (2 preceding siblings ...)
2021-03-01 19:06 ` jakub at gcc dot gnu.org
@ 2021-03-02 17:18 ` jakub at gcc dot gnu.org
2021-03-03 9:07 ` cvs-commit at gcc dot gnu.org
` (6 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-02 17:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99321
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 50288
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50288&action=edit
gcc11-pr99321.patch
Untested fix for the peephole2.
The rest will be done separately.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug target/99321] [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} since r11-7121-g37876976b0511ec9
2021-03-01 15:53 [Bug target/99321] New: [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} zsojka at seznam dot cz
` (3 preceding siblings ...)
2021-03-02 17:18 ` jakub at gcc dot gnu.org
@ 2021-03-03 9:07 ` cvs-commit at gcc dot gnu.org
2021-03-05 17:37 ` jakub at gcc dot gnu.org
` (5 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-03-03 9:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99321
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:f1b13064609a41fcaf4d1859663453bba237e277
commit r11-7474-gf1b13064609a41fcaf4d1859663453bba237e277
Author: Jakub Jelinek <jakub@redhat.com>
Date: Wed Mar 3 10:06:14 2021 +0100
i386: Fix a peephole2 for -mavx512vl -mno-avx512bw [PR99321]
As the testcase shows, the
(define_peephole2
[(set (match_operand 0 "sse_reg_operand")
(match_operand 1 "sse_reg_operand"))
(set (match_dup 0)
(match_operator 3 "commutative_operator"
[(match_dup 0)
(match_operand 2 "memory_operand")]))]
peephole2 can for AVX512VL without AVX512BW (I guess it is a hyphothetical
CPU, but unfortunately they are separate CPUID bits and we have separate
options for them) turn something that is valid without that peephole2
into something that is invalid (and in this case ICEs).
The problem is that the vpadd[bw], vpmullw, vpmin[su][bw] and vpmax[su][bw]
instructions require both AVX512BW and AVX512VL when they have
16-byte or 32-byte operands. If operands[0] is %[xy]mm0 .. %[xy]mm15
but operands[1] is %[xy]mm16 .. %[xy]mm31, then before we have
a vector move which uses vmovdqa{32,64} and doesn't need AVX512BW,
AVX512VL is I think implied from HARD_REGNO_MODE_OK only supporting
V{16Q,32Q,8H,16H}imode in EXT_REX_SSE_REGNO_P regs with AVX512VL, and then
we have a commutative operation with that %[xy]mm0 .. %[xy]mm15 destination
and one source and a memory operand, so VEX encoded operation.
And, the peephole2 wants to replace it with a load into the destination
register from memory (ok) and then the commutative arith instruction.
But that needs EVEX encoding because of the high register and so requires
AVX512BW which might not be enabled.
The exception is and/ior/xor, because the hw doesn't have
vp{and,or,xor}{b,w} instructions at all, it uses vp{and,or,xor}d instead
and that of course doesn't need AVX512BW.
BTW, there are other bugs I need to look at, while the vp{min,max}ub with
16-byte operands instruction properly requires avx512bw for v constraints
and otherwise uses x, e.g. the vpadd[bw] etc. instructions don't.
I'll try to handle that incrementally later this week.
2021-03-03 Jakub Jelinek <jakub@redhat.com>
PR target/99321
* config/i386/predicates.md (logic_operator): New define_predicate.
* config/i386/i386.md (mov + mem using comm arith peephole2):
Punt if operands[1] is EXT_REX_SSE_REGNO_P, AVX512BW is not enabled
and the inner mode is [QH]Imode.
* gcc.target/i386/pr99321.c: New test.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug target/99321] [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} since r11-7121-g37876976b0511ec9
2021-03-01 15:53 [Bug target/99321] New: [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} zsojka at seznam dot cz
` (4 preceding siblings ...)
2021-03-03 9:07 ` cvs-commit at gcc dot gnu.org
@ 2021-03-05 17:37 ` jakub at gcc dot gnu.org
2021-03-05 18:16 ` jakub at gcc dot gnu.org
` (4 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-05 17:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99321
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |hjl.tools at gmail dot com
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Unfinished testcase showing various problems:
/* PR target/99321 */
/* Would need some effective target for GNU as that supports -march=+noavx512bw
etc. */
/* { dg-do compile { lp64 } } */
/* { dg-options "-O2 -mavx512vl -mno-avx512bw -Wa,-march=+noavx512bw" } */
#include <x86intrin.h>
typedef unsigned char V1 __attribute__((vector_size (16)));
typedef unsigned char V2 __attribute__((vector_size (32)));
typedef unsigned short V3 __attribute__((vector_size (16)));
typedef unsigned short V4 __attribute__((vector_size (32)));
void f1 (void) { register V1 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a += b; __asm ("" : : "v" (a)); }
void f2 (void) { register V2 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a += b; __asm ("" : : "v" (a)); }
void f3 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a += b; __asm ("" : : "v" (a)); }
void f4 (void) { register V4 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a += b; __asm ("" : : "v" (a)); }
void f5 (void) { register V1 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a -= b; __asm ("" : : "v" (a)); }
void f6 (void) { register V2 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a -= b; __asm ("" : : "v" (a)); }
void f7 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a -= b; __asm ("" : : "v" (a)); }
void f8 (void) { register V4 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a -= b; __asm ("" : : "v" (a)); }
void f9 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a *= b; __asm ("" : : "v" (a)); }
void f10 (void) { register V4 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a *= b; __asm ("" : : "v" (a)); }
void f11 (void) { register V1 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V1) _mm_min_epu8 ((__m128i) a, (__m128i) b); __asm
("" : : "v" (a)); }
void f12 (void) { register V2 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V2) _mm256_min_epu8 ((__m256i) a, (__m256i) b);
__asm ("" : : "v" (a)); }
void f13 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V3) _mm_min_epu16 ((__m128i) a, (__m128i) b); __asm
("" : : "v" (a)); }
void f14 (void) { register V4 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V4) _mm256_min_epu16 ((__m256i) a, (__m256i) b);
__asm ("" : : "v" (a)); }
void f15 (void) { register V1 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V1) _mm_min_epi8 ((__m128i) a, (__m128i) b); __asm
("" : : "v" (a)); }
void f16 (void) { register V2 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V2) _mm256_min_epi8 ((__m256i) a, (__m256i) b);
__asm ("" : : "v" (a)); }
void f17 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V3) _mm_min_epi16 ((__m128i) a, (__m128i) b); __asm
("" : : "v" (a)); }
void f18 (void) { register V4 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V4) _mm256_min_epi16 ((__m256i) a, (__m256i) b);
__asm ("" : : "v" (a)); }
void f19 (void) { register V1 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V1) _mm_max_epu8 ((__m128i) a, (__m128i) b); __asm
("" : : "v" (a)); }
void f20 (void) { register V2 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V2) _mm256_max_epu8 ((__m256i) a, (__m256i) b);
__asm ("" : : "v" (a)); }
void f21 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V3) _mm_max_epu16 ((__m128i) a, (__m128i) b); __asm
("" : : "v" (a)); }
void f22 (void) { register V4 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V4) _mm256_max_epu16 ((__m256i) a, (__m256i) b);
__asm ("" : : "v" (a)); }
void f23 (void) { register V1 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V1) _mm_max_epi8 ((__m128i) a, (__m128i) b); __asm
("" : : "v" (a)); }
void f24 (void) { register V2 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V2) _mm256_max_epi8 ((__m256i) a, (__m256i) b);
__asm ("" : : "v" (a)); }
void f25 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V3) _mm_max_epi16 ((__m128i) a, (__m128i) b); __asm
("" : : "v" (a)); }
void f26 (void) { register V4 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm (""
: "=v" (a), "=v" (b)); a = (V4) _mm256_max_epi16 ((__m256i) a, (__m256i) b);
__asm ("" : : "v" (a)); }
/tmp/ccW4PsfG.s: Assembler messages:
/tmp/ccW4PsfG.s:9: Error: unsupported instruction `vpaddb'
/tmp/ccW4PsfG.s:20: Error: unsupported instruction `vpaddb'
/tmp/ccW4PsfG.s:31: Error: unsupported instruction `vpaddw'
/tmp/ccW4PsfG.s:42: Error: unsupported instruction `vpaddw'
/tmp/ccW4PsfG.s:53: Error: unsupported instruction `vpsubb'
/tmp/ccW4PsfG.s:64: Error: unsupported instruction `vpsubb'
/tmp/ccW4PsfG.s:75: Error: unsupported instruction `vpsubw'
/tmp/ccW4PsfG.s:86: Error: unsupported instruction `vpsubw'
/tmp/ccW4PsfG.s:97: Error: unsupported instruction `vpmullw'
/tmp/ccW4PsfG.s:108: Error: unsupported instruction `vpmullw'
/tmp/ccW4PsfG.s:133: Error: unsupported instruction `vpminub'
/tmp/ccW4PsfG.s:144: Error: unsupported instruction `vpminuw'
/tmp/ccW4PsfG.s:155: Error: unsupported instruction `vpminuw'
/tmp/ccW4PsfG.s:166: Error: unsupported instruction `vpminsb'
/tmp/ccW4PsfG.s:177: Error: unsupported instruction `vpminsb'
/tmp/ccW4PsfG.s:202: Error: unsupported instruction `vpminsw'
/tmp/ccW4PsfG.s:227: Error: unsupported instruction `vpmaxub'
/tmp/ccW4PsfG.s:238: Error: unsupported instruction `vpmaxuw'
/tmp/ccW4PsfG.s:249: Error: unsupported instruction `vpmaxuw'
/tmp/ccW4PsfG.s:260: Error: unsupported instruction `vpmaxsb'
/tmp/ccW4PsfG.s:271: Error: unsupported instruction `vpmaxsb'
/tmp/ccW4PsfG.s:296: Error: unsupported instruction `vpmaxsw'
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug target/99321] [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} since r11-7121-g37876976b0511ec9
2021-03-01 15:53 [Bug target/99321] New: [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} zsojka at seznam dot cz
` (5 preceding siblings ...)
2021-03-05 17:37 ` jakub at gcc dot gnu.org
@ 2021-03-05 18:16 ` jakub at gcc dot gnu.org
2021-03-05 18:41 ` jakub at gcc dot gnu.org
` (3 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-05 18:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99321
--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 50311
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50311&action=edit
gcc11-pr99321.patch
One possible way to fix the above testcase (but not the many other insns, like
maybe:
vdbpsadbw vmovdqu16 vmovdqu8 vpabsb vpabsw vpackssdw vpacksswb vpackusdw
vpackuswb vpaddb vpaddsb vpaddsw vpaddusb vpaddusw vpaddw vpalignr vpavgb
vpavgw vpblendmb vpblendmw vpbroadcastb vpbroadcastw vpcmpb vpcmpeqb vpcmpeqw
vpcmpgtb vpcmpgtw vpcmpub vpcmpuw vpcmpw vpermi2w vpermt2w vpermw vpextrb
vpextrw vpinsrb vpinsrw vpmaddubsw vpmaddwd vpmaxsb vpmaxsw vpmaxub vpmaxuw
vpminsb vpminsw vpminub vpminuw vpmovb2m vpmovm2b vpmovm2w vpmovswb vpmovsxbw
vpmovuswb vpmovw2m vpmovwb vpmovzxbw vpmulhrsw vpmulhuw vpmulhw vpmullw vpsadbw
vpshufb vpshufhw vpshuflw vpslldq vpsllvw vpsllw vpsravw vpsraw vpsrldq vpsrlvw
vpsrlw vpsubb vpsubsb vpsubsw vpsubusb vpsubusw vpsubw vptestmb vptestmw
vptestnmb vptestnmw vpunpckhbw vpunpckhwd vpunpcklbw vpunpcklwd
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug target/99321] [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} since r11-7121-g37876976b0511ec9
2021-03-01 15:53 [Bug target/99321] New: [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} zsojka at seznam dot cz
` (6 preceding siblings ...)
2021-03-05 18:16 ` jakub at gcc dot gnu.org
@ 2021-03-05 18:41 ` jakub at gcc dot gnu.org
2021-03-07 9:30 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-05 18:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99321
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #50311|0 |1
is obsolete| |
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 50312
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50312&action=edit
gcc11-pr99321.patch
Better version of that fix.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug target/99321] [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} since r11-7121-g37876976b0511ec9
2021-03-01 15:53 [Bug target/99321] New: [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} zsojka at seznam dot cz
` (7 preceding siblings ...)
2021-03-05 18:41 ` jakub at gcc dot gnu.org
@ 2021-03-07 9:30 ` cvs-commit at gcc dot gnu.org
2021-03-12 13:36 ` cvs-commit at gcc dot gnu.org
2021-03-12 13:39 ` jakub at gcc dot gnu.org
10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-03-07 9:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99321
--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:a18ebd6c439227b048a91fbfa66f5983f884c157
commit r11-7548-ga18ebd6c439227b048a91fbfa66f5983f884c157
Author: Jakub Jelinek <jakub@redhat.com>
Date: Sun Mar 7 10:27:28 2021 +0100
i386: Fix some -mavx512vl -mno-avx512bw bugs [PR99321]
As I wrote in the mail with the previous PR99321 fix, we have various
bugs where we emit instructions that need avx512bw and avx512vl
ISAs when compiling with -mavx512vl -mno-avx512bw.
Without the following patch, the attached testcase fails with:
/tmp/ccW4PsfG.s: Assembler messages:
/tmp/ccW4PsfG.s:9: Error: unsupported instruction `vpaddb'
/tmp/ccW4PsfG.s:20: Error: unsupported instruction `vpaddb'
/tmp/ccW4PsfG.s:31: Error: unsupported instruction `vpaddw'
/tmp/ccW4PsfG.s:42: Error: unsupported instruction `vpaddw'
/tmp/ccW4PsfG.s:53: Error: unsupported instruction `vpsubb'
/tmp/ccW4PsfG.s:64: Error: unsupported instruction `vpsubb'
/tmp/ccW4PsfG.s:75: Error: unsupported instruction `vpsubw'
/tmp/ccW4PsfG.s:86: Error: unsupported instruction `vpsubw'
/tmp/ccW4PsfG.s:97: Error: unsupported instruction `vpmullw'
/tmp/ccW4PsfG.s:108: Error: unsupported instruction `vpmullw'
/tmp/ccW4PsfG.s:133: Error: unsupported instruction `vpminub'
/tmp/ccW4PsfG.s:144: Error: unsupported instruction `vpminuw'
/tmp/ccW4PsfG.s:155: Error: unsupported instruction `vpminuw'
/tmp/ccW4PsfG.s:166: Error: unsupported instruction `vpminsb'
/tmp/ccW4PsfG.s:177: Error: unsupported instruction `vpminsb'
/tmp/ccW4PsfG.s:202: Error: unsupported instruction `vpminsw'
/tmp/ccW4PsfG.s:227: Error: unsupported instruction `vpmaxub'
/tmp/ccW4PsfG.s:238: Error: unsupported instruction `vpmaxuw'
/tmp/ccW4PsfG.s:249: Error: unsupported instruction `vpmaxuw'
/tmp/ccW4PsfG.s:260: Error: unsupported instruction `vpmaxsb'
/tmp/ccW4PsfG.s:271: Error: unsupported instruction `vpmaxsb'
/tmp/ccW4PsfG.s:296: Error: unsupported instruction `vpmaxsw'
We already have Yw constraint which is equivalent to v for
-mavx512bw -mavx512vl and to nothing otherwise, per discussions
this patch changes it to stand for x otherwise. As it is an
undocumented internal constraint, hopefully it won't affect
any inline asm in the wild.
For the instructions that need both we need to use Yw and
v for modes that don't need that.
2021-03-07 Jakub Jelinek <jakub@redhat.com>
PR target/99321
* config/i386/constraints.md (Yw): Use SSE_REGS if TARGET_SSE
but TARGET_AVX512BW or TARGET_AVX512VL is not set. Adjust
description
and comment.
* config/i386/sse.md (v_Yw): New define_mode_attr.
(*<insn><mode>3, *mul<mode>3<mask_name>, *avx2_<code><mode>3,
*sse4_1_<code><mode>3<mask_name>): Use <v_Yw> instead of v
in constraints.
* config/i386/mmx.md (mmx_pshufw_1, *vec_dupv4hi): Use Yw instead
of
xYw in constraints.
* lib/target-supports.exp
(check_effective_target_assembler_march_noavx512bw): New effective
target.
* gcc.target/i386/avx512vl-pr99321-1.c: New test.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug target/99321] [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} since r11-7121-g37876976b0511ec9
2021-03-01 15:53 [Bug target/99321] New: [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} zsojka at seznam dot cz
` (8 preceding siblings ...)
2021-03-07 9:30 ` cvs-commit at gcc dot gnu.org
@ 2021-03-12 13:36 ` cvs-commit at gcc dot gnu.org
2021-03-12 13:39 ` jakub at gcc dot gnu.org
10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-03-12 13:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99321
--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:3bb345c9313ad8f6a6c24abd7d5eaa11413bbe22
commit r11-7646-g3bb345c9313ad8f6a6c24abd7d5eaa11413bbe22
Author: Jakub Jelinek <jakub@redhat.com>
Date: Fri Mar 12 14:34:32 2021 +0100
i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]
This is the final patch of the series started with
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566139.html
and continued with
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566356.html
This time, I went through all the remaining instructions marked
by gas as requiring both AVX512BW and AVX512VL and for each checked
tmp-mddump.md, figure out if it ever could be a problem (e.g. instructions
that require AVX512BW+AVX512VL, but didn't exist before AVX512F are usually
fine, the patterns have the right conditions, the bugs are typically on
pre-AVX512F patterns where we have just blindly added v while they actually
can't access those unless AVX512BW+AVX512VL), added test where possible
(the test doesn't cover MMX though)and fixed md bugs.
For mmx pextr[bw]/pinsr[bw] patterns it introduces per discussions
a new YW constraint that only requires AVX512BW and not AVX512VL, because
those instructions only require the former and not latter when using EVEX
encoding.
There are some other interesting details, e.g. most of the 8 interleave
patterns (vpunck[hl]{bw,wd}) had correctly
&& <mask_avx512vl_condition> && <mask_avx512bw_condition>
in the conditions because for masking it needs to be always EVEX encoded
and then it needs both VL+BW, but 2 of those 8 had just
&& <mask_avx512vl_condition>
and so again would run into the -mavx512vl -mno-avx512bw problems.
Another problem different from others was mmx eq/gt comparisons, that was
using Yv constraints, so would happily accept %xmm16+ registers for
-mavx512vl, but there actually are no such EVEX encoded instructions,
as AVX512 comparisons work with %k* registers instead.
The newly added testcase without the patch fails with:
/tmp/ccVROLo2.s: Assembler messages:
/tmp/ccVROLo2.s:9: Error: unsupported instruction `vpabsb'
/tmp/ccVROLo2.s:20: Error: unsupported instruction `vpabsb'
/tmp/ccVROLo2.s:31: Error: unsupported instruction `vpabsw'
/tmp/ccVROLo2.s:42: Error: unsupported instruction `vpabsw'
/tmp/ccVROLo2.s:53: Error: unsupported instruction `vpaddsb'
/tmp/ccVROLo2.s:64: Error: unsupported instruction `vpaddsb'
/tmp/ccVROLo2.s:75: Error: unsupported instruction `vpaddsw'
/tmp/ccVROLo2.s:86: Error: unsupported instruction `vpaddsw'
/tmp/ccVROLo2.s:97: Error: unsupported instruction `vpsubsb'
/tmp/ccVROLo2.s:108: Error: unsupported instruction `vpsubsb'
/tmp/ccVROLo2.s:119: Error: unsupported instruction `vpsubsw'
/tmp/ccVROLo2.s:130: Error: unsupported instruction `vpsubsw'
/tmp/ccVROLo2.s:141: Error: unsupported instruction `vpaddusb'
/tmp/ccVROLo2.s:152: Error: unsupported instruction `vpaddusb'
/tmp/ccVROLo2.s:163: Error: unsupported instruction `vpaddusw'
/tmp/ccVROLo2.s:174: Error: unsupported instruction `vpaddusw'
/tmp/ccVROLo2.s:185: Error: unsupported instruction `vpsubusb'
/tmp/ccVROLo2.s:196: Error: unsupported instruction `vpsubusb'
/tmp/ccVROLo2.s:207: Error: unsupported instruction `vpsubusw'
/tmp/ccVROLo2.s:218: Error: unsupported instruction `vpsubusw'
/tmp/ccVROLo2.s:258: Error: unsupported instruction `vpaddusw'
/tmp/ccVROLo2.s:269: Error: unsupported instruction `vpavgb'
/tmp/ccVROLo2.s:280: Error: unsupported instruction `vpavgb'
/tmp/ccVROLo2.s:291: Error: unsupported instruction `vpavgw'
/tmp/ccVROLo2.s:302: Error: unsupported instruction `vpavgw'
/tmp/ccVROLo2.s:475: Error: unsupported instruction `vpmovsxbw'
/tmp/ccVROLo2.s:486: Error: unsupported instruction `vpmovsxbw'
/tmp/ccVROLo2.s:497: Error: unsupported instruction `vpmovzxbw'
/tmp/ccVROLo2.s:508: Error: unsupported instruction `vpmovzxbw'
/tmp/ccVROLo2.s:548: Error: unsupported instruction `vpmulhuw'
/tmp/ccVROLo2.s:559: Error: unsupported instruction `vpmulhuw'
/tmp/ccVROLo2.s:570: Error: unsupported instruction `vpmulhw'
/tmp/ccVROLo2.s:581: Error: unsupported instruction `vpmulhw'
/tmp/ccVROLo2.s:592: Error: unsupported instruction `vpsadbw'
/tmp/ccVROLo2.s:603: Error: unsupported instruction `vpsadbw'
/tmp/ccVROLo2.s:643: Error: unsupported instruction `vpshufhw'
/tmp/ccVROLo2.s:654: Error: unsupported instruction `vpshufhw'
/tmp/ccVROLo2.s:665: Error: unsupported instruction `vpshuflw'
/tmp/ccVROLo2.s:676: Error: unsupported instruction `vpshuflw'
/tmp/ccVROLo2.s:687: Error: unsupported instruction `vpslldq'
/tmp/ccVROLo2.s:698: Error: unsupported instruction `vpslldq'
/tmp/ccVROLo2.s:709: Error: unsupported instruction `vpsrldq'
/tmp/ccVROLo2.s:720: Error: unsupported instruction `vpsrldq'
/tmp/ccVROLo2.s:899: Error: unsupported instruction `vpunpckhbw'
/tmp/ccVROLo2.s:910: Error: unsupported instruction `vpunpckhbw'
/tmp/ccVROLo2.s:921: Error: unsupported instruction `vpunpckhwd'
/tmp/ccVROLo2.s:932: Error: unsupported instruction `vpunpckhwd'
/tmp/ccVROLo2.s:943: Error: unsupported instruction `vpunpcklbw'
/tmp/ccVROLo2.s:954: Error: unsupported instruction `vpunpcklbw'
/tmp/ccVROLo2.s:965: Error: unsupported instruction `vpunpcklwd'
/tmp/ccVROLo2.s:976: Error: unsupported instruction `vpunpcklwd'
2021-03-12 Jakub Jelinek <jakub@redhat.com>
PR target/99321
* config/i386/constraints.md (YW): New internal constraint.
* config/i386/sse.md (v_Yw): Add V4TI, V2TI, V1TI and TI cases.
(*<sse2_avx2>_<insn><mode>3<mask_name>,
*<sse2_avx2>_uavg<mode>3<mask_name>, *abs<mode>2,
*<s>mul<mode>3_highpart<mask_name>): Use <v_Yw> instead of v in
constraints.
(<sse2_avx2>_psadbw): Use YW instead of v in constraints.
(*avx2_pmaddwd, *sse2_pmaddwd, *<code>v8hi3, *<code>v16qi3,
avx2_pmaddubsw256, ssse3_pmaddubsw128): Merge last two alternatives
into one, use Yw instead of former x,v.
(ashr<mode>3, <insn><mode>3): Use <v_Yw> instead of x in
constraints of
the last alternative.
(<sse2_avx2>_packsswb<mask_name>, <sse2_avx2>_packssdw<mask_name>,
<sse2_avx2>_packuswb<mask_name>, <sse4_1_avx2>_packusdw<mask_name>,
*<ssse3_avx2>_pmulhrsw<mode>3<mask_name>,
<ssse3_avx2>_palignr<mode>,
<ssse3_avx2>_pshufb<mode>3<mask_name>): Merge last two alternatives
into one, use <v_Yw> instead of former x,v.
(avx2_interleave_highv32qi<mask_name>,
vec_interleave_highv16qi<mask_name>): Use Yw instead of v in
constraints. Add && <mask_avx512bw_condition> to condition.
(avx2_interleave_lowv32qi<mask_name>,
vec_interleave_lowv16qi<mask_name>,
avx2_interleave_highv16hi<mask_name>,
vec_interleave_highv8hi<mask_name>,
avx2_interleave_lowv16hi<mask_name>,
vec_interleave_lowv8hi<mask_name>,
avx2_pshuflw_1<mask_name>, sse2_pshuflw_1<mask_name>,
avx2_pshufhw_1<mask_name>, sse2_pshufhw_1<mask_name>,
avx2_<code>v16qiv16hi2<mask_name>,
sse4_1_<code>v8qiv8hi2<mask_name>,
*sse4_1_<code>v8qiv8hi2<mask_name>_1, <sse2_avx2>_<insn><mode>3):
Use
Yw instead of v in constraints.
* config/i386/mmx.md (Yv_Yw): New define_mode_attr.
(*mmx_<insn><mode>3, mmx_ashr<mode>3, mmx_<insn><mode>3): Use
<Yv_Yw>
instead of Yv in constraints.
(*mmx_<insn><mode>3, *mmx_mulv4hi3, *mmx_smulv4hi3_highpart,
*mmx_umulv4hi3_highpart, *mmx_pmaddwd, *mmx_<code>v4hi3,
*mmx_<code>v8qi3, mmx_pack<s_trunsuffix>swb, mmx_packssdw,
mmx_punpckhbw, mmx_punpcklbw, mmx_punpckhwd, mmx_punpcklwd,
*mmx_uavgv8qi3, *mmx_uavgv4hi3, mmx_psadbw): Use Yw instead of Yv
in
constraints.
(*mmx_pinsrw, *mmx_pinsrb, *mmx_pextrw, *mmx_pextrw_zext,
*mmx_pextrb,
*mmx_pextrb_zext): Use YW instead of Yv in constraints.
(*mmx_eq<mode>3, mmx_gt<mode>3): Use x instead of Yv in
constraints.
(mmx_andnot<mode>3, *mmx_<code><mode>3): Split last alternative
into
two, one with just x, another isa avx512vl with v.
* gcc.target/i386/avx512vl-pr99321-2.c: New test.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug target/99321] [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} since r11-7121-g37876976b0511ec9
2021-03-01 15:53 [Bug target/99321] New: [11 Regression] ICE: in extract_constrain_insn, at recog.c:2670: insn does not satisfy its constraints: {*uminv16qi3} zsojka at seznam dot cz
` (9 preceding siblings ...)
2021-03-12 13:36 ` cvs-commit at gcc dot gnu.org
@ 2021-03-12 13:39 ` jakub at gcc dot gnu.org
10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-12 13:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99321
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|ASSIGNED |RESOLVED
--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed now.
^ permalink raw reply [flat|nested] 12+ messages in thread