public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/102986] New: [12 Regression] ICE: in expand_shift_1, at expmed.c:2671 with a negative shift of a vector
From: zsojka at seznam dot cz @ 2021-10-28 12:32 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102986

            Bug ID: 102986
           Summary: [12 Regression] ICE: in expand_shift_1, at expmed.c:2671 with a negative shift of a vector
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zsojka at seznam dot cz
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu

Created attachment 51689
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51689&action=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc testcase.c
testcase.c: In function 'foo':
testcase.c:8:5: warning: right shift count is negative [-Wshift-count-negative]
    8 |   v >>= -1;
      |     ^~~
during RTL pass: expand
testcase.c:8:5: internal compiler error: in expand_shift_1, at expmed.c:2671
    8 |   v >>= -1;
      |     ^~~
0x6b5040 expand_shift_1
        /repo/gcc-trunk/gcc/expmed.c:2671
0xf62135 expand_variable_shift(tree_code, machine_mode, rtx_def*, tree_node*, rtx_def*, int)
        /repo/gcc-trunk/gcc/expmed.c:2712
0xf774cf expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier)
        /repo/gcc-trunk/gcc/expr.c:9945
0xe4109f expand_gimple_stmt_1
        /repo/gcc-trunk/gcc/cfgexpand.c:3979
0xe4109f expand_gimple_stmt
        /repo/gcc-trunk/gcc/cfgexpand.c:4040
0xe46ed8 expand_gimple_basic_block
        /repo/gcc-trunk/gcc/cfgexpand.c:6085
0xe490c7 execute
        /repo/gcc-trunk/gcc/cfgexpand.c:6811
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r12-4764-20211028112842-ga84b9d5373c-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r12-4764-20211028112842-ga84b9d5373c-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.0 20211028 (experimental) (GCC)

* [Bug rtl-optimization/102986] [12 Regression] ICE: in expand_shift_1, at expmed.c:2671 with a negative shift of a vector
From: jakub at gcc dot gnu.org @ 2021-10-28 18:23 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102986

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
                 CC|                            |jakub at gcc dot gnu.org
   Last reconfirmed|                            |2021-10-28
           Priority|P3                          |P1
   Target Milestone|---                         |12.0
     Ever confirmed|0                           |1

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Started with r12-4702-g6b8b25575570ffde37cc8997af096514b929779d

* [Bug rtl-optimization/102986] [12 Regression] ICE: in expand_shift_1, at expmed.c:2671 with a negative shift of a vector
From: jakub at gcc dot gnu.org @ 2021-10-29 10:02 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102986

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sayle at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
/* PR target/102986 */
/* { dg-do compile } */
/* { dg-options "-O2" } */

typedef unsigned __int128 __attribute__((__vector_size__ (16))) V;

V v;

void
foo (int x)
{
  v >>= x;
}

ICEs too.
The shift expanders aren't allowed to fail, at least when may_fail isn't set in calls to expand_shift_1, but the new expanders only allow const_int_operand and so ICE if the shift amount is not constant or it is an out of bounds shift (in that case the code earlier forces the shift amount into a register).
I guess a way out of this would be to change the predicate from "const_int_operand" to "nonmemory_operand" and if the last operand isn't CONST_INT in ix86_expand_v1ti_shift, subreg the first operand to TImode, expand_variable_shift it (unfortunately with make_tree on the second operand, or instead export expand_shift_1 and use that directly?) and then subreg to V1TImode back.

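For readers following along, the round trip suggested in comment #2 (view the one-element vector as a 128-bit scalar, shift the scalar, view the result as a vector again) can be illustrated at the source level. This is only a sketch of the semantics, not the expander code: `lshr_via_scalar`, `lane0` and `check_lshr_via_scalar` are hypothetical names, masking the amount to 0..127 is an added assumption to keep the C well defined, and it assumes GCC on a 64-bit target where `__int128` and `vector_size` are available.

```c
#include <assert.h>
#include <string.h>

typedef unsigned __int128 u128;
typedef u128 __attribute__((__vector_size__ (16))) V;

/* Hypothetical helper: read the single 128-bit element of V.  */
static u128
lane0 (V v)
{
  u128 t;
  memcpy (&t, &v, sizeof t);
  return t;
}

/* Hypothetical helper: logical right shift of a one-element vector by a
   run-time amount, done on the scalar type.  This mirrors the idea of
   going V1TImode -> TImode -> V1TImode; it is not the GCC implementation.  */
static V
lshr_via_scalar (V v, unsigned amount)
{
  u128 t;
  memcpy (&t, &v, sizeof t);   /* view the vector as a scalar */
  t >>= (amount & 127);        /* assumption: keep the amount in range */
  memcpy (&v, &t, sizeof v);   /* view the scalar as a vector again */
  return v;
}

/* Small self-check of the helper above.  */
static void
check_lshr_via_scalar (void)
{
  V v = { ((u128) 1 << 127) | 16 };
  assert (lane0 (lshr_via_scalar (v, 4)) == (((u128) 1 << 123) | 1));
  assert (lane0 (lshr_via_scalar (v, 127)) == 1);
}
```

Calling `check_lshr_via_scalar ()` from a `main` exercises the helper with amounts that are only known at run time, which is the case the original expanders rejected.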
* [Bug rtl-optimization/102986] [12 Regression] ICE: in expand_shift_1, at expmed.c:2671 with a negative shift of a vector
From: jakub at gcc dot gnu.org @ 2021-10-29 10:05 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102986

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Or tree-vect-generic.c would need to check the predicates of the vector shift and try to figure out if they accept REGs or not. But no idea how to do that cleanly, the predicate could be very well some target specific predicate...

* [Bug target/102986] [12 Regression] ICE: in expand_shift_1, at expmed.c:2671 with a negative shift of a vector
From: roger at nextmovesoftware dot com @ 2021-10-29 11:02 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102986

Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
                 CC|                              |roger at nextmovesoftware dot com
             Status|NEW                           |ASSIGNED
          Component|rtl-optimization              |target
           Assignee|unassigned at gcc dot gnu.org |roger at nextmovesoftware dot com

--- Comment #4 from Roger Sayle <roger at nextmovesoftware dot com> ---
My apologies for the inconvenience. I'm already bootstrapping and regression testing a fix that I'd hoped to submit before anyone noticed the breakage.

* [Bug target/102986] [12 Regression] ICE: in expand_shift_1, at expmed.c:2671 with a negative shift of a vector
From: roger at nextmovesoftware dot com @ 2021-10-30 10:21 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102986

--- Comment #5 from Roger Sayle <roger at nextmovesoftware dot com> ---
Patch proposed: https://gcc.gnu.org/pipermail/gcc-patches/2021-October/582931.html

* [Bug target/102986] [12 Regression] ICE: in expand_shift_1, at expmed.c:2671 with a negative shift of a vector
From: cvs-commit at gcc dot gnu.org @ 2021-11-02 8:27 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102986

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:1188cf5fb7d9c3f0753cdb11d961fe90113991e8

commit r12-4838-g1188cf5fb7d9c3f0753cdb11d961fe90113991e8
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Tue Nov 2 08:23:04 2021 +0000

    x86_64: Expand ashrv1ti (and PR target/102986)

    This patch was originally intended to implement 128-bit arithmetic right shifts by constants of vector registers (V1TImode), but while working on it I discovered the (my) recent ICE on valid regression now known as PR target/102986.

    As diagnosed by Jakub, expanders for shifts are not allowed to fail, and so any backend that provides a shift optab needs to handle variable amount shifts as well as constant shifts [even though the middle-end knows how to synthesize these for vector modes]. This constraint could be relaxed in the middle-end, but it makes sense to fix this in my erroneous code.

    The solution is to change the constraints on the recently added (and new) shift expanders from SImode const_int_operand to QImode general_operand, matching the TImode expanders' constraints, and then simply check for !CONST_INT_P at the start of the ix86_expand_v1ti_* functions, converting the operands from V1TImode to TImode, performing the TImode operation and converting the result back to V1TImode.

    One nice benefit of this strategy is that it allows us to implement Uros' recent suggestion, that we should be more efficiently converting between these modes, avoiding the use of memory and using the same idiom as LLVM or using pextrq/pinsrq where available. The new helper functions ix86_expand_v1ti_to_ti and ix86_expand_ti_to_v1ti are sufficient to take care of this. Interestingly partial support for this is already present, but x86_64's generic tuning prefers memory transfers to avoid penalizing microarchitectures with significant interunit delays. With these changes we now generate both pextrq and pinsrq for -mtune=native.

    The main body of the patch is to implement arithmetic right shift in addition to the logical right shift and left shift implemented in the previous patch. This expander provides no less than 13 different code sequences, special casing the different constant shifts, including variants taking advantage of TARGET_AVX2 and TARGET_SSE4_1. The code is structured with the faster/shorter sequences at the start, and the generic implementations at the end.

    For the record, the implementations are:

    ashr_127:       // Shift 127, 2 operations, 10 bytes
            pshufd  $255, %xmm0, %xmm0
            psrad   $31, %xmm0
            ret

    ashr_64:        // Shift by 64, 3 operations, 14 bytes
            pshufd  $255, %xmm0, %xmm1
            psrad   $31, %xmm1
            punpckhqdq      %xmm1, %xmm0
            ret

    ashr_96:        // Shift by 96, 3 operations, 18 bytes
            movdqa  %xmm0, %xmm1
            psrad   $31, %xmm1
            punpckhqdq      %xmm1, %xmm0
            pshufd  $253, %xmm0, %xmm0
            ret

    ashr_8:         // Shift by 8/16/24/32 on AVX2, 3 operations, 16 bytes
            vpsrad  $8, %xmm0, %xmm1
            vpsrldq $1, %xmm0, %xmm0
            vpblendd        $7, %xmm0, %xmm1, %xmm0
            ret

    ashr_8:         // Shift by 8/16/24/32 on SSE4.1, 3 operations, 24 bytes
            movdqa  %xmm0, %xmm1
            psrldq  $1, %xmm0
            psrad   $8, %xmm1
            pblendw $63, %xmm0, %xmm1
            movdqa  %xmm1, %xmm0
            ret

    ashr_97:        // Shifts by 97..126, 4 operations, 23 bytes
            movdqa  %xmm0, %xmm1
            psrad   $31, %xmm0
            psrad   $1, %xmm1
            punpckhqdq      %xmm0, %xmm1
            pshufd  $253, %xmm1, %xmm0
            ret

    ashr_48:        // Shifts by 48/80 on SSE4.1, 4 operations, 25 bytes
            movdqa  %xmm0, %xmm1
            pshufd  $255, %xmm0, %xmm0
            psrldq  $6, %xmm1
            psrad   $31, %xmm0
            pblendw $31, %xmm1, %xmm0
            ret

    ashr_8:         // Shifts by multiples of 8, 5 operations, 28 bytes
            movdqa  %xmm0, %xmm1
            pshufd  $255, %xmm0, %xmm0
            psrad   $31, %xmm0
            psrldq  $1, %xmm1
            pslldq  $15, %xmm0
            por     %xmm1, %xmm0
            ret

    ashr_1:         // Shifts by 1..31 on AVX2, 6 operations, 30 bytes
            vpsrldq $8, %xmm0, %xmm2
            vpsrad  $1, %xmm0, %xmm1
            vpsllq  $63, %xmm2, %xmm2
            vpsrlq  $1, %xmm0, %xmm0
            vpor    %xmm2, %xmm0, %xmm0
            vpblendd        $7, %xmm0, %xmm1, %xmm0
            ret

    ashr_1:         // Shifts by 1..15 on SSE4.1, 6 operations, 42 bytes
            movdqa  %xmm0, %xmm2
            movdqa  %xmm0, %xmm1
            psrldq  $8, %xmm2
            psrlq   $1, %xmm0
            psllq   $63, %xmm2
            psrad   $1, %xmm1
            por     %xmm2, %xmm0
            pblendw $63, %xmm0, %xmm1
            movdqa  %xmm1, %xmm0
            ret

    ashr_1:         // Shift by 1, 8 operations, 46 bytes
            movdqa  %xmm0, %xmm1
            movdqa  %xmm0, %xmm2
            psrldq  $8, %xmm2
            psrlq   $63, %xmm1
            psllq   $63, %xmm2
            psrlq   $1, %xmm0
            pshufd  $191, %xmm1, %xmm1
            por     %xmm2, %xmm0
            psllq   $31, %xmm1
            por     %xmm1, %xmm0
            ret

    ashr_65:        // Shifts by 65..95, 8 operations, 42 bytes
            pshufd  $255, %xmm0, %xmm1
            psrldq  $8, %xmm0
            psrad   $31, %xmm1
            psrlq   $1, %xmm0
            movdqa  %xmm1, %xmm2
            psllq   $63, %xmm1
            pslldq  $8, %xmm2
            por     %xmm2, %xmm1
            por     %xmm1, %xmm0
            ret

    ashr_2:         // Shifts from 2..63, 9 operations, 47 bytes
            pshufd  $255, %xmm0, %xmm1
            movdqa  %xmm0, %xmm2
            psrad   $31, %xmm1
            psrldq  $8, %xmm2
            psllq   $62, %xmm2
            psrlq   $2, %xmm0
            pslldq  $8, %xmm1
            por     %xmm2, %xmm0
            psllq   $62, %xmm1
            por     %xmm1, %xmm0
            ret

    To test these changes there are several new test cases. sse2-v1ti-shift-2.c is a compile-test designed to spot/catch PR target/102986 [for all shifts and rotates by variable amounts], and sse2-v1ti-shift-3.c is an execution test to confirm shifts/rotates by variable amounts produce the same results for TImode and V1TImode. sse2-v1ti-ashiftrt-1.c is a (similar) execution test to confirm arithmetic right shifts by different constants produce identical results between TImode and V1TImode. sse2-v1ti-ashiftrt-[23].c are duplicates of this file as compilation tests specifying -mavx2 and -msse4.1 respectively to trigger all the paths through the new expander.

    2021-11-02  Roger Sayle  <roger@nextmovesoftware.com>
                Jakub Jelinek  <jakub@redhat.com>

    gcc/ChangeLog
            PR target/102986
            * config/i386/i386-expand.c (ix86_expand_v1ti_to_ti,
            ix86_expand_ti_to_v1ti): New helper functions.
            (ix86_expand_v1ti_shift): Check if the amount operand is an
            integer constant, and expand as a TImode shift if it isn't.
            (ix86_expand_v1ti_rotate): Check if the amount operand is an
            integer constant, and expand as a TImode rotate if it isn't.
            (ix86_expand_v1ti_ashiftrt): New function to expand arithmetic
            right shifts of V1TImode quantities.
            * config/i386/i386-protos.h (ix86_expand_v1ti_ashift): Prototype.
            * config/i386/sse.md (ashlv1ti3, lshrv1ti3): Change constraints
            to QImode general_operand, and let the helper functions lower
            shifts by non-constant operands, as TImode shifts.  Make
            conditional on TARGET_64BIT.
            (ashrv1ti3): New expander calling ix86_expand_v1ti_ashiftrt.
            (rotlv1ti3, rotrv1ti3): Change shift operand to QImode.
            Make conditional on TARGET_64BIT.

    gcc/testsuite/ChangeLog
            PR target/102986
            * gcc.target/i386/sse2-v1ti-ashiftrt-1.c: New test case.
            * gcc.target/i386/sse2-v1ti-ashiftrt-2.c: New test case.
            * gcc.target/i386/sse2-v1ti-ashiftrt-3.c: New test case.
            * gcc.target/i386/sse2-v1ti-shift-2.c: New test case.
            * gcc.target/i386/sse2-v1ti-shift-3.c: New test case.

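As a reading aid for the shortest sequence in the commit message above (ashr_127: `pshufd $255` followed by `psrad $31`), here is a portable C model of what those two instructions compute. It is a sketch under stated assumptions, not code from the patch: the `xmm_model` struct and every function name are hypothetical, the register is modelled as four 32-bit lanes with lane 0 least significant, and the self-check relies on GCC's documented behaviour that `>>` on a negative `__int128` is an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

typedef __int128 i128;
typedef unsigned __int128 u128;

/* Hypothetical model of an XMM register: four 32-bit lanes,
   lane[0] being the least significant.  */
typedef struct { uint32_t lane[4]; } xmm_model;

static xmm_model
from_i128 (i128 x)
{
  xmm_model r;
  u128 u = (u128) x;
  for (int i = 0; i < 4; i++)
    r.lane[i] = (uint32_t) (u >> (32 * i));
  return r;
}

static i128
to_i128 (xmm_model m)
{
  u128 u = 0;
  for (int i = 0; i < 4; i++)
    u |= (u128) m.lane[i] << (32 * i);
  return (i128) u;
}

/* pshufd $255: immediate 0xFF selects source lane 3 for every
   destination lane, i.e. broadcasts the top 32 bits.  */
static xmm_model
pshufd_255 (xmm_model m)
{
  xmm_model r;
  for (int i = 0; i < 4; i++)
    r.lane[i] = m.lane[3];
  return r;
}

/* psrad $31: arithmetic right shift of each 32-bit lane by 31, leaving
   all-ones for a negative lane and zero otherwise.  */
static xmm_model
psrad_31 (xmm_model m)
{
  xmm_model r;
  for (int i = 0; i < 4; i++)
    r.lane[i] = (m.lane[i] & 0x80000000u) ? 0xFFFFFFFFu : 0u;
  return r;
}

/* Model of the whole ashr_127 sequence.  */
static i128
ashr_127_model (i128 x)
{
  return to_i128 (psrad_31 (pshufd_255 (from_i128 (x))));
}

/* Self-check against the scalar shift (arithmetic on GCC).  */
static void
check_ashr_127_model (void)
{
  assert (ashr_127_model ((i128) -5) == ((i128) -5 >> 127));
  assert (ashr_127_model ((i128) 5) == ((i128) 5 >> 127));
}
```

Both instructions only look at the sign bit of the top lane, which is why a shift by 127 needs no cross-lane carries and ends up as the cheapest case in the list.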
* [Bug target/102986] [12 Regression] ICE: in expand_shift_1, at expmed.c:2671 with a negative shift of a vector
From: roger at nextmovesoftware dot com @ 2021-11-03 13:53 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102986

Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #7 from Roger Sayle <roger at nextmovesoftware dot com> ---
This is now fixed on mainline. Sorry again for the breakage.

* [Bug target/102986] [12 Regression] ICE: in expand_shift_1, at expmed.c:2671 with a negative shift of a vector
From: cvs-commit at gcc dot gnu.org @ 2022-03-23 9:30 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102986

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:4a9e92164a547afcf8cd3fc593c7660238ad2d59

commit r12-7777-g4a9e92164a547afcf8cd3fc593c7660238ad2d59
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Mar 23 10:29:37 2022 +0100

    testsuite: Fix up sse2-v1ti-shift-3.c test [PR102986]

    This test is dg-do run and invokes UB when these rotate functions are called with 0 as second argument. There are some other tests that do this but they are dg-do compile only and do not even call those functions at all, so it IMHO doesn't matter that they are only well defined for [1,127] and not [0,127].

    The following patch fixes it, we pattern recognize both forms as rotates and we emit identical assembly.

    2022-03-23  Jakub Jelinek  <jakub@redhat.com>

            PR target/102986
            * gcc.target/i386/sse2-v1ti-shift-3.c (rotr_v1ti, rotl_v1ti,
            rotr_ti, rotl_ti): Use -i&127 instead of 128-i to avoid UB on
            i == 0.

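The `-i&127` change described in the commit above can be illustrated with a short sketch. With `128-i`, an amount of 0 turns the second shift into a shift by 128, which is undefined for a 128-bit type; `-i & 127` yields the same count for 1..127 and 0 for i == 0. The function bodies below are a reconstruction for illustration only (the test file itself is not quoted in this thread); only the names `rotr_ti` and `rotl_ti` come from the ChangeLog, and the `unsigned` parameter type is an assumption.

```c
#include <assert.h>

typedef unsigned __int128 u128;

/* Rotate right by i, for i in [0,127].  For unsigned i, -i & 127 equals
   128 - i when i is in 1..127 and 0 when i is 0, so neither shift count
   ever reaches 128.  */
static u128
rotr_ti (u128 x, unsigned i)
{
  return (x >> i) | (x << (-i & 127));
}

/* Rotate left by i, for i in [0,127], using the same idiom.  */
static u128
rotl_ti (u128 x, unsigned i)
{
  return (x << i) | (x >> (-i & 127));
}

/* Self-check: amount 0 is the identity, and a rotate by 1 wraps the
   low bit to the top.  */
static void
check_rotates (void)
{
  u128 x = ((u128) 0x0123456789abcdefULL << 64) | 0xfedcba9876543210ULL;
  assert (rotr_ti (x, 0) == x);
  assert (rotl_ti (x, 0) == x);
  assert (rotr_ti (1, 1) == (u128) 1 << 127);
}
```

When i is 0 both halves of the expression are plain `x`, and OR-ing them gives `x` back, which is the expected result of rotating by zero.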