* fix ssse3_pshufbv8qi3 post-reload const pool load @ 2021-03-19 6:29 Alexandre Oliva 2021-03-19 8:10 ` Uros Bizjak 0 siblings, 1 reply; 6+ messages in thread From: Alexandre Oliva @ 2021-03-19 6:29 UTC (permalink / raw) To: gcc-patches; +Cc: Jan Hubicka, Uros Bizjak, Kirill Yukhin The split in ssse3_pshufbv8qi3 forces a const vector into the constant pool, and loads from it. That runs after reload, so if the load requires any reloading, we're out of luck. Indeed, if the load address is not legitimate, e.g. -mcmodel=large, the insn is no longer recognized. This patch turns the constant into an input operand, introduces an expander to generate the constant unconditionally, and arranges for this input operand to be retained as an unused immediate in the alternatives that don't undergo splitting, and for it to be loaded into the scratch register for those that do. It is now the register allocator that arranges to load the const vector into a register, so it deals with whatever legitimizing steps needed for the target configuration. Regstrapped on x86_64-linux-gnu. Ok to install? for gcc/ChangeLog * config/i386/predicates.md (register_or_const_vec_operand): New. * config/i386/sse.md (ssse3_pshufbv8qi3): Add an expander for the now *-prefixed insn_and_split, turn the splitter const vec into an input for the insn, making it an ignored immediate for non-split cases, and loaded into the scratch register otherwise. --- gcc/config/i386/predicates.md | 6 ++++++ gcc/config/i386/sse.md | 26 +++++++++++++++++++------- 2 files changed, 25 insertions(+), 7 deletions(-) diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md index b6dd5e9d3b243..f1da005c95cf3 100644 --- a/gcc/config/i386/predicates.md +++ b/gcc/config/i386/predicates.md @@ -1153,6 +1153,12 @@ (define_predicate "nonimmediate_or_const_vector_operand" (ior (match_operand 0 "nonimmediate_operand") (match_code "const_vector"))) +;; Return true when OP is either register operand, or any +;; CONST_VECTOR. +(define_predicate "register_or_const_vector_operand" + (ior (match_operand 0 "register_operand") + (match_code "const_vector"))) + ;; Return true when OP is nonimmediate or standard SSE constant. (define_predicate "nonimmediate_or_sse_const_operand" (ior (match_operand 0 "nonimmediate_operand") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 43e4d57ec6a3d..b693864e62d1b 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -17159,10 +17159,26 @@ (define_insn "<ssse3_avx2>_pshufb<mode>3<mask_name>" (set_attr "btver2_decode" "vector") (set_attr "mode" "<sseinsnmode>")]) -(define_insn_and_split "ssse3_pshufbv8qi3" +(define_expand "ssse3_pshufbv8qi3" + [(parallel + [(set (match_operand:V8QI 0 "register_operand" "=") + (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "") + (match_operand:V8QI 2 "register_mmxmem_operand" "") + (const_vector:V4SI [(match_dup 3) (match_dup 3) + (match_dup 3) (match_dup 3)])] + UNSPEC_PSHUFB)) + (clobber (match_scratch:V4SI 4 "="))])] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" +{ + operands[3] = gen_int_mode (0xf7f7f7f7, SImode); +}) + +(define_insn_and_split "*ssse3_pshufbv8qi3" [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv") (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,0,Yv") - (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")] + (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv") + (match_operand:V4SI 4 "register_or_const_vector_operand" + "i,3,3")] UNSPEC_PSHUFB)) (clobber (match_scratch:V4SI 3 "=X,&x,&Yv"))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" @@ -17172,8 +17188,7 @@ (define_insn_and_split "ssse3_pshufbv8qi3" #" "TARGET_SSSE3 && reload_completed && SSE_REGNO_P (REGNO (operands[0]))" - [(set (match_dup 3) (match_dup 5)) - (set (match_dup 3) + [(set (match_dup 3) (and:V4SI (match_dup 3) (match_dup 2))) (set (match_dup 0) (unspec:V16QI [(match_dup 1) (match_dup 4)] UNSPEC_PSHUFB))] @@ -17188,9 +17203,6 @@ (define_insn_and_split "ssse3_pshufbv8qi3" GET_MODE (operands[2])); operands[4] = lowpart_subreg (V16QImode, operands[3], GET_MODE (operands[3])); - rtx vec_const = ix86_build_const_vector (V4SImode, true, - gen_int_mode (0xf7f7f7f7, SImode)); - operands[5] = force_const_mem (V4SImode, vec_const); } [(set_attr "mmx_isa" "native,sse_noavx,avx") (set_attr "prefix_extra" "1") -- Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer Vim, Vi, Voltei pro Emacs -- GNUlius Caesar ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: fix ssse3_pshufbv8qi3 post-reload const pool load 2021-03-19 6:29 fix ssse3_pshufbv8qi3 post-reload const pool load Alexandre Oliva @ 2021-03-19 8:10 ` Uros Bizjak 2021-03-19 22:44 ` Alexandre Oliva 0 siblings, 1 reply; 6+ messages in thread From: Uros Bizjak @ 2021-03-19 8:10 UTC (permalink / raw) To: Alexandre Oliva; +Cc: gcc-patches, Jan Hubicka, Kirill Yukhin On Fri, Mar 19, 2021 at 7:29 AM Alexandre Oliva <oliva@gnu.org> wrote: > > > The split in ssse3_pshufbv8qi3 forces a const vector into the constant > pool, and loads from it. That runs after reload, so if the load > requires any reloading, we're out of luck. Indeed, if the load > address is not legitimate, e.g. -mcmodel=large, the insn is no longer > recognized. > > This patch turns the constant into an input operand, introduces an > expander to generate the constant unconditionally, and arranges for > this input operand to be retained as an unused immediate in the > alternatives that don't undergo splitting, and for it to be loaded > into the scratch register for those that do. > > It is now the register allocator that arranges to load the const > vector into a register, so it deals with whatever legitimizing steps > needed for the target configuration. > > Regstrapped on x86_64-linux-gnu. Ok to install? > > > for gcc/ChangeLog > > * config/i386/predicates.md (register_or_const_vec_operand): > New. > * config/i386/sse.md (ssse3_pshufbv8qi3): Add an expander for > the now *-prefixed insn_and_split, turn the splitter const vec > into an input for the insn, making it an ignored immediate for > non-split cases, and loaded into the scratch register > otherwise. Testcase? > --- > gcc/config/i386/predicates.md | 6 ++++++ > gcc/config/i386/sse.md | 26 +++++++++++++++++++------- > 2 files changed, 25 insertions(+), 7 deletions(-) > > diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md > index b6dd5e9d3b243..f1da005c95cf3 100644 > --- a/gcc/config/i386/predicates.md > +++ b/gcc/config/i386/predicates.md > @@ -1153,6 +1153,12 @@ (define_predicate "nonimmediate_or_const_vector_operand" > (ior (match_operand 0 "nonimmediate_operand") > (match_code "const_vector"))) > > +;; Return true when OP is either register operand, or any > +;; CONST_VECTOR. > +(define_predicate "register_or_const_vector_operand" please name this "reg_or_const_vector_operand" > + (ior (match_operand 0 "register_operand") > + (match_code "const_vector"))) > + > ;; Return true when OP is nonimmediate or standard SSE constant. > (define_predicate "nonimmediate_or_sse_const_operand" > (ior (match_operand 0 "nonimmediate_operand") > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > index 43e4d57ec6a3d..b693864e62d1b 100644 > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -17159,10 +17159,26 @@ (define_insn "<ssse3_avx2>_pshufb<mode>3<mask_name>" > (set_attr "btver2_decode" "vector") > (set_attr "mode" "<sseinsnmode>")]) > > -(define_insn_and_split "ssse3_pshufbv8qi3" > +(define_expand "ssse3_pshufbv8qi3" > + [(parallel > + [(set (match_operand:V8QI 0 "register_operand" "=") > + (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "") > + (match_operand:V8QI 2 "register_mmxmem_operand" "") > + (const_vector:V4SI [(match_dup 3) (match_dup 3) > + (match_dup 3) (match_dup 3)])] > + UNSPEC_PSHUFB)) > + (clobber (match_scratch:V4SI 4 "="))])] All constraints should be removed from an expander. > + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" > +{ > + operands[3] = gen_int_mode (0xf7f7f7f7, SImode); You can use: ix86_build_const_vector (V4SImode, true, gen_int_mode (0xf7f7f7f7, SImode)); to generate the whole const_vector. Uros. > +}) > + > +(define_insn_and_split "*ssse3_pshufbv8qi3" > [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv") > (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,0,Yv") > - (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")] > + (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv") > + (match_operand:V4SI 4 "register_or_const_vector_operand" > + "i,3,3")] > UNSPEC_PSHUFB)) > (clobber (match_scratch:V4SI 3 "=X,&x,&Yv"))] > "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" > @@ -17172,8 +17188,7 @@ (define_insn_and_split "ssse3_pshufbv8qi3" > #" > "TARGET_SSSE3 && reload_completed > && SSE_REGNO_P (REGNO (operands[0]))" > - [(set (match_dup 3) (match_dup 5)) > - (set (match_dup 3) > + [(set (match_dup 3) > (and:V4SI (match_dup 3) (match_dup 2))) > (set (match_dup 0) > (unspec:V16QI [(match_dup 1) (match_dup 4)] UNSPEC_PSHUFB))] > @@ -17188,9 +17203,6 @@ (define_insn_and_split "ssse3_pshufbv8qi3" > GET_MODE (operands[2])); > operands[4] = lowpart_subreg (V16QImode, operands[3], > GET_MODE (operands[3])); > - rtx vec_const = ix86_build_const_vector (V4SImode, true, > - gen_int_mode (0xf7f7f7f7, SImode)); > - operands[5] = force_const_mem (V4SImode, vec_const); > } > [(set_attr "mmx_isa" "native,sse_noavx,avx") > (set_attr "prefix_extra" "1") > > > -- > Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ > Free Software Activist GNU Toolchain Engineer > Vim, Vi, Voltei pro Emacs -- GNUlius Caesar ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: fix ssse3_pshufbv8qi3 post-reload const pool load 2021-03-19 8:10 ` Uros Bizjak @ 2021-03-19 22:44 ` Alexandre Oliva 2021-03-19 22:57 ` Jakub Jelinek 0 siblings, 1 reply; 6+ messages in thread From: Alexandre Oliva @ 2021-03-19 22:44 UTC (permalink / raw) To: Uros Bizjak; +Cc: gcc-patches, Jan Hubicka, Kirill Yukhin On Mar 19, 2021, Uros Bizjak <ubizjak@gmail.com> wrote: >> * config/i386/predicates.md (register_or_const_vec_operand): >> New. >> * config/i386/sse.md (ssse3_pshufbv8qi3): Add an expander for >> the now *-prefixed insn_and_split, turn the splitter const vec >> into an input for the insn, making it an ignored immediate for >> non-split cases, and loaded into the scratch register >> otherwise. > Testcase? Uhh, sorry I failed to mention that. gcc.target/i386/pr94467-1.c was what I used (with -mcmodel=large) to duplicate the problem and then confirm the fix in the trunk. However, I had a total of 15 similar fails within gcc.target/i386 in a gcc-10 tree configured with -mcmodel=large, all fixed by this patch: avx-1.c, avx-2.c, avx-vpshufb-1.c, pr82483-1.c, pr94467-1.c, pr94467-2.c, sse-13.c, sse-14.c, sse-22.c, sse-22a.c, sse-23.c, sse-24.c, sse-25.c, sse-26.c, and ssse3-pshufb.c. > please name this "reg_or_const_vector_operand" 'k > All constraints should be removed from an expander. 'k >> + operands[3] = gen_int_mode (0xf7f7f7f7, SImode); > You can use: > ix86_build_const_vector (V4SImode, true, > gen_int_mode (0xf7f7f7f7, SImode)); heh, I had it spelled that way in my local tree at some point. I thought the explicit const_vector in the pattern looked clearer. Changed, here's what I'm regstrapping. Ok to install if it succeeds? for gcc/ChangeLog * config/i386/predicates.md (reg_or_const_vec_operand): New. * config/i386/sse.md (ssse3_pshufbv8qi3): Add an expander for the now *-prefixed insn_and_split, turn the splitter const vec into an input for the insn, making it an ignored immediate for non-split cases, and loaded into the scratch register otherwise. --- gcc/config/i386/predicates.md | 6 ++++++ gcc/config/i386/sse.md | 25 ++++++++++++++++++------- 2 files changed, 24 insertions(+), 7 deletions(-) diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md index b6dd5e9d3b243..b1df8548af639 100644 --- a/gcc/config/i386/predicates.md +++ b/gcc/config/i386/predicates.md @@ -1153,6 +1153,12 @@ (define_predicate "nonimmediate_or_const_vector_operand" (ior (match_operand 0 "nonimmediate_operand") (match_code "const_vector"))) +;; Return true when OP is either register operand, or any +;; CONST_VECTOR. +(define_predicate "reg_or_const_vector_operand" + (ior (match_operand 0 "register_operand") + (match_code "const_vector"))) + ;; Return true when OP is nonimmediate or standard SSE constant. (define_predicate "nonimmediate_or_sse_const_operand" (ior (match_operand 0 "nonimmediate_operand") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 43e4d57ec6a3d..9d3728d1cb08b 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -17159,10 +17159,25 @@ (define_insn "<ssse3_avx2>_pshufb<mode>3<mask_name>" (set_attr "btver2_decode" "vector") (set_attr "mode" "<sseinsnmode>")]) -(define_insn_and_split "ssse3_pshufbv8qi3" +(define_expand "ssse3_pshufbv8qi3" + [(parallel + [(set (match_operand:V8QI 0 "register_operand") + (unspec:V8QI [(match_operand:V8QI 1 "register_operand") + (match_operand:V8QI 2 "register_mmxmem_operand") + (match_dup 3)] UNSPEC_PSHUFB)) + (clobber (match_scratch:V4SI 4))])] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" +{ + operands[3] = ix86_build_const_vector (V4SImode, true, + gen_int_mode (0xf7f7f7f7, SImode)); +}) + +(define_insn_and_split "*ssse3_pshufbv8qi3" [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv") (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,0,Yv") - (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")] + (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv") + (match_operand:V4SI 4 "reg_or_const_vector_operand" + "i,3,3")] UNSPEC_PSHUFB)) (clobber (match_scratch:V4SI 3 "=X,&x,&Yv"))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" @@ -17172,8 +17187,7 @@ (define_insn_and_split "ssse3_pshufbv8qi3" #" "TARGET_SSSE3 && reload_completed && SSE_REGNO_P (REGNO (operands[0]))" - [(set (match_dup 3) (match_dup 5)) - (set (match_dup 3) + [(set (match_dup 3) (and:V4SI (match_dup 3) (match_dup 2))) (set (match_dup 0) (unspec:V16QI [(match_dup 1) (match_dup 4)] UNSPEC_PSHUFB))] @@ -17188,9 +17202,6 @@ (define_insn_and_split "ssse3_pshufbv8qi3" GET_MODE (operands[2])); operands[4] = lowpart_subreg (V16QImode, operands[3], GET_MODE (operands[3])); - rtx vec_const = ix86_build_const_vector (V4SImode, true, - gen_int_mode (0xf7f7f7f7, SImode)); - operands[5] = force_const_mem (V4SImode, vec_const); } [(set_attr "mmx_isa" "native,sse_noavx,avx") (set_attr "prefix_extra" "1") -- Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer Vim, Vi, Voltei pro Emacs -- GNUlius Caesar ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: fix ssse3_pshufbv8qi3 post-reload const pool load 2021-03-19 22:44 ` Alexandre Oliva @ 2021-03-19 22:57 ` Jakub Jelinek 2021-03-24 6:46 ` Alexandre Oliva 0 siblings, 1 reply; 6+ messages in thread From: Jakub Jelinek @ 2021-03-19 22:57 UTC (permalink / raw) To: Alexandre Oliva; +Cc: Uros Bizjak, gcc-patches, Jan Hubicka On Fri, Mar 19, 2021 at 07:44:01PM -0300, Alexandre Oliva wrote: > > Testcase? > > Uhh, sorry I failed to mention that. > > gcc.target/i386/pr94467-1.c was what I used (with -mcmodel=large) to > duplicate the problem and then confirm the fix in the trunk. > > However, I had a total of 15 similar fails within gcc.target/i386 in a > gcc-10 tree configured with -mcmodel=large, all fixed by this patch: > avx-1.c, avx-2.c, avx-vpshufb-1.c, pr82483-1.c, pr94467-1.c, > pr94467-2.c, sse-13.c, sse-14.c, sse-22.c, sse-22a.c, sse-23.c, > sse-24.c, sse-25.c, sse-26.c, and ssse3-pshufb.c. But then we should add at least one new testcase for that, it can #include an existing testcase and just add -mcmodel=large in dg-options or dg-additional-options, and lp64 effective target (or target on dg-additional-options). Jakub ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: fix ssse3_pshufbv8qi3 post-reload const pool load 2021-03-19 22:57 ` Jakub Jelinek @ 2021-03-24 6:46 ` Alexandre Oliva 2021-03-24 6:52 ` Uros Bizjak 0 siblings, 1 reply; 6+ messages in thread From: Alexandre Oliva @ 2021-03-24 6:46 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Uros Bizjak, gcc-patches, Jan Hubicka Hello, Jakub, On Mar 19, 2021, Jakub Jelinek <jakub@redhat.com> wrote: > On Fri, Mar 19, 2021 at 07:44:01PM -0300, Alexandre Oliva wrote: >> However, I had a total of 15 similar fails within gcc.target/i386 in a >> gcc-10 tree configured with -mcmodel=large > But then we should add at least one new testcase for that Yeah, that's a good idea. I don't suppose x86_64 -mcmodel=large gets much community testing. This (minus lp64 in the test) was regstrapped on x86_64-linux-gnu, along with other testsuite patches I'm about to post, and tested on x86_64-vx7r2. After noticing the failure to constrain the new test to lp64 only, I fixed and retested it on x86_64-linux-gnu. Ok to install? fix ssse3_pshufbv8qi3 post-reload const pool load The split in ssse3_pshufbv8qi3 forces a const vector into the constant pool, and loads from it. That runs after reload, so if the load requires any reloading, we're out of luck. Indeed, if the load address is not legitimate, e.g. -mcmodel=large, the insn is no longer recognized. This patch turns the constant into an input operand, introduces an expander to generate the constant unconditionally, and arranges for this input operand to be retained as an unused immediate in the alternatives that don't undergo splitting, and for it to be loaded into the scratch register for those that do. It is now the register allocator that arranges to load the const vector into a register, so it deals with whatever legitimizing steps needed for the target configuration. for gcc/ChangeLog * config/i386/predicates.md (reg_or_const_vec_operand): New. * config/i386/sse.md (ssse3_pshufbv8qi3): Add an expander for the now *-prefixed insn_and_split, turn the splitter const vec into an input for the insn, making it an ignored immediate for non-split cases, and loaded into the scratch register otherwise. for gcc/testsuite/ChangeLog * gcc.target/i386/pr94467-3.c: New. --- gcc/config/i386/predicates.md | 6 ++++++ gcc/config/i386/sse.md | 25 ++++++++++++++++++------- gcc/testsuite/gcc.target/i386/pr94467-3.c | 4 ++++ 3 files changed, 28 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr94467-3.c diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md index b6dd5e9d3b243..b1df8548af639 100644 --- a/gcc/config/i386/predicates.md +++ b/gcc/config/i386/predicates.md @@ -1153,6 +1153,12 @@ (define_predicate "nonimmediate_or_const_vector_operand" (ior (match_operand 0 "nonimmediate_operand") (match_code "const_vector"))) +;; Return true when OP is either register operand, or any +;; CONST_VECTOR. +(define_predicate "reg_or_const_vector_operand" + (ior (match_operand 0 "register_operand") + (match_code "const_vector"))) + ;; Return true when OP is nonimmediate or standard SSE constant. (define_predicate "nonimmediate_or_sse_const_operand" (ior (match_operand 0 "nonimmediate_operand") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 43e4d57ec6a3d..9d3728d1cb08b 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -17159,10 +17159,25 @@ (define_insn "<ssse3_avx2>_pshufb<mode>3<mask_name>" (set_attr "btver2_decode" "vector") (set_attr "mode" "<sseinsnmode>")]) -(define_insn_and_split "ssse3_pshufbv8qi3" +(define_expand "ssse3_pshufbv8qi3" + [(parallel + [(set (match_operand:V8QI 0 "register_operand") + (unspec:V8QI [(match_operand:V8QI 1 "register_operand") + (match_operand:V8QI 2 "register_mmxmem_operand") + (match_dup 3)] UNSPEC_PSHUFB)) + (clobber (match_scratch:V4SI 4))])] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" +{ + operands[3] = ix86_build_const_vector (V4SImode, true, + gen_int_mode (0xf7f7f7f7, SImode)); +}) + +(define_insn_and_split "*ssse3_pshufbv8qi3" [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv") (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,0,Yv") - (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")] + (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv") + (match_operand:V4SI 4 "reg_or_const_vector_operand" + "i,3,3")] UNSPEC_PSHUFB)) (clobber (match_scratch:V4SI 3 "=X,&x,&Yv"))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" @@ -17172,8 +17187,7 @@ (define_insn_and_split "ssse3_pshufbv8qi3" #" "TARGET_SSSE3 && reload_completed && SSE_REGNO_P (REGNO (operands[0]))" - [(set (match_dup 3) (match_dup 5)) - (set (match_dup 3) + [(set (match_dup 3) (and:V4SI (match_dup 3) (match_dup 2))) (set (match_dup 0) (unspec:V16QI [(match_dup 1) (match_dup 4)] UNSPEC_PSHUFB))] @@ -17188,9 +17202,6 @@ (define_insn_and_split "ssse3_pshufbv8qi3" GET_MODE (operands[2])); operands[4] = lowpart_subreg (V16QImode, operands[3], GET_MODE (operands[3])); - rtx vec_const = ix86_build_const_vector (V4SImode, true, - gen_int_mode (0xf7f7f7f7, SImode)); - operands[5] = force_const_mem (V4SImode, vec_const); } [(set_attr "mmx_isa" "native,sse_noavx,avx") (set_attr "prefix_extra" "1") diff --git a/gcc/testsuite/gcc.target/i386/pr94467-3.c b/gcc/testsuite/gcc.target/i386/pr94467-3.c new file mode 100644 index 0000000000000..b415847b25634 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr94467-3.c @@ -0,0 +1,4 @@ +/* { dg-do compile { target { lp64 } } } */ +/* { dg-options "-O -mavx -mcmodel=large" } */ + +#include "pr94467-1.c" -- Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer Vim, Vi, Voltei pro Emacs -- GNUlius Caesar ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: fix ssse3_pshufbv8qi3 post-reload const pool load 2021-03-24 6:46 ` Alexandre Oliva @ 2021-03-24 6:52 ` Uros Bizjak 0 siblings, 0 replies; 6+ messages in thread From: Uros Bizjak @ 2021-03-24 6:52 UTC (permalink / raw) To: Alexandre Oliva; +Cc: Jakub Jelinek, gcc-patches, Jan Hubicka On Wed, Mar 24, 2021 at 7:46 AM Alexandre Oliva <oliva@adacore.com> wrote: > > Hello, Jakub, > > On Mar 19, 2021, Jakub Jelinek <jakub@redhat.com> wrote: > > > On Fri, Mar 19, 2021 at 07:44:01PM -0300, Alexandre Oliva wrote: > > >> However, I had a total of 15 similar fails within gcc.target/i386 in a > >> gcc-10 tree configured with -mcmodel=large > > > But then we should add at least one new testcase for that > > Yeah, that's a good idea. I don't suppose x86_64 -mcmodel=large gets > much community testing. > > This (minus lp64 in the test) was regstrapped on x86_64-linux-gnu, along > with other testsuite patches I'm about to post, and tested on > x86_64-vx7r2. After noticing the failure to constrain the new test to > lp64 only, I fixed and retested it on x86_64-linux-gnu. Ok to install? > > > fix ssse3_pshufbv8qi3 post-reload const pool load > > The split in ssse3_pshufbv8qi3 forces a const vector into the constant > pool, and loads from it. That runs after reload, so if the load > requires any reloading, we're out of luck. Indeed, if the load > address is not legitimate, e.g. -mcmodel=large, the insn is no longer > recognized. > > This patch turns the constant into an input operand, introduces an > expander to generate the constant unconditionally, and arranges for > this input operand to be retained as an unused immediate in the > alternatives that don't undergo splitting, and for it to be loaded > into the scratch register for those that do. > > It is now the register allocator that arranges to load the const > vector into a register, so it deals with whatever legitimizing steps > needed for the target configuration. > > > for gcc/ChangeLog > > * config/i386/predicates.md (reg_or_const_vec_operand): New. > * config/i386/sse.md (ssse3_pshufbv8qi3): Add an expander for > the now *-prefixed insn_and_split, turn the splitter const vec > into an input for the insn, making it an ignored immediate for > non-split cases, and loaded into the scratch register > otherwise. > > for gcc/testsuite/ChangeLog > > * gcc.target/i386/pr94467-3.c: New. OK. Thanks, Uros. > --- > gcc/config/i386/predicates.md | 6 ++++++ > gcc/config/i386/sse.md | 25 ++++++++++++++++++------- > gcc/testsuite/gcc.target/i386/pr94467-3.c | 4 ++++ > 3 files changed, 28 insertions(+), 7 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr94467-3.c > > diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md > index b6dd5e9d3b243..b1df8548af639 100644 > --- a/gcc/config/i386/predicates.md > +++ b/gcc/config/i386/predicates.md > @@ -1153,6 +1153,12 @@ (define_predicate "nonimmediate_or_const_vector_operand" > (ior (match_operand 0 "nonimmediate_operand") > (match_code "const_vector"))) > > +;; Return true when OP is either register operand, or any > +;; CONST_VECTOR. > +(define_predicate "reg_or_const_vector_operand" > + (ior (match_operand 0 "register_operand") > + (match_code "const_vector"))) > + > ;; Return true when OP is nonimmediate or standard SSE constant. > (define_predicate "nonimmediate_or_sse_const_operand" > (ior (match_operand 0 "nonimmediate_operand") > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > index 43e4d57ec6a3d..9d3728d1cb08b 100644 > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -17159,10 +17159,25 @@ (define_insn "<ssse3_avx2>_pshufb<mode>3<mask_name>" > (set_attr "btver2_decode" "vector") > (set_attr "mode" "<sseinsnmode>")]) > > -(define_insn_and_split "ssse3_pshufbv8qi3" > +(define_expand "ssse3_pshufbv8qi3" > + [(parallel > + [(set (match_operand:V8QI 0 "register_operand") > + (unspec:V8QI [(match_operand:V8QI 1 "register_operand") > + (match_operand:V8QI 2 "register_mmxmem_operand") > + (match_dup 3)] UNSPEC_PSHUFB)) > + (clobber (match_scratch:V4SI 4))])] > + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" > +{ > + operands[3] = ix86_build_const_vector (V4SImode, true, > + gen_int_mode (0xf7f7f7f7, SImode)); > +}) > + > +(define_insn_and_split "*ssse3_pshufbv8qi3" > [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv") > (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,0,Yv") > - (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")] > + (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv") > + (match_operand:V4SI 4 "reg_or_const_vector_operand" > + "i,3,3")] > UNSPEC_PSHUFB)) > (clobber (match_scratch:V4SI 3 "=X,&x,&Yv"))] > "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" > @@ -17172,8 +17187,7 @@ (define_insn_and_split "ssse3_pshufbv8qi3" > #" > "TARGET_SSSE3 && reload_completed > && SSE_REGNO_P (REGNO (operands[0]))" > - [(set (match_dup 3) (match_dup 5)) > - (set (match_dup 3) > + [(set (match_dup 3) > (and:V4SI (match_dup 3) (match_dup 2))) > (set (match_dup 0) > (unspec:V16QI [(match_dup 1) (match_dup 4)] UNSPEC_PSHUFB))] > @@ -17188,9 +17202,6 @@ (define_insn_and_split "ssse3_pshufbv8qi3" > GET_MODE (operands[2])); > operands[4] = lowpart_subreg (V16QImode, operands[3], > GET_MODE (operands[3])); > - rtx vec_const = ix86_build_const_vector (V4SImode, true, > - gen_int_mode (0xf7f7f7f7, SImode)); > - operands[5] = force_const_mem (V4SImode, vec_const); > } > [(set_attr "mmx_isa" "native,sse_noavx,avx") > (set_attr "prefix_extra" "1") > diff --git a/gcc/testsuite/gcc.target/i386/pr94467-3.c b/gcc/testsuite/gcc.target/i386/pr94467-3.c > new file mode 100644 > index 0000000000000..b415847b25634 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr94467-3.c > @@ -0,0 +1,4 @@ > +/* { dg-do compile { target { lp64 } } } */ > +/* { dg-options "-O -mavx -mcmodel=large" } */ > + > +#include "pr94467-1.c" > > > -- > Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ > Free Software Activist GNU Toolchain Engineer > Vim, Vi, Voltei pro Emacs -- GNUlius Caesar ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-03-24 6:53 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-03-19 6:29 fix ssse3_pshufbv8qi3 post-reload const pool load Alexandre Oliva 2021-03-19 8:10 ` Uros Bizjak 2021-03-19 22:44 ` Alexandre Oliva 2021-03-19 22:57 ` Jakub Jelinek 2021-03-24 6:46 ` Alexandre Oliva 2021-03-24 6:52 ` Uros Bizjak
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).