public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [x86 PATCH] PR target/106577: force_reg may clobber operands during split.
@ 2022-08-12 20:40 Roger Sayle
  2022-08-15  7:45 ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Roger Sayle @ 2022-08-12 20:40 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2241 bytes --]


This patch fixes PR target/106577 which is a recent ICE on valid regression
caused by my introduction of a *testti_doubleword pre-reload splitter in
i386.md.  During the split pass before reload, this converts the virtual
*testti_doubleword into an *andti3_doubleword and *cmpti_doubleword,
checking that any immediate operand is a valid "x86_64_hilo_general_operand"
and placing it into a TImode register using force_reg if it isn't.

The unexpected behaviour (that caught me out) is that calling force_reg
may occasionally clobber the contents of the global operands array, or
more accurately recog_data.operand[0], which means that by the time
split_XXX calls gen_split_YYY the replacement insn's operands have been
corrupted.

It's difficult to tell who (if anyone is at fault).  The re-entrant
stack trace (for the attached PR) looks like:

gen_split_203 (*testti_doubleword) calls
force_reg calls
emit_move_insn calls
emit_move_insn_1 calls
gen_movti calls
ix86_expand_move calls
ix86_convert_const_wide_int_to_broadcast calls
ix86_vector_duplicate_value calls
recog_memoized calls
recog.

By far the simplest and possibly correct fix is rather than attempt
to push and pop recog_data, to simply (in pre-reload splits) save a
copy of any operands that will be needed after force_reg, and use
these copies afterwards.  Many pre-reload splitters avoid this issue
using "[(clobber (const_int 0))]" and so avoid gen_split_YYY functions,
but in our case we still need to save a copy of operands[0] (even if we
call emit_insn or expand_* ourselves), so we might as well continue to
use the conveniently generated gen_split.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures. Ok for mainline?


2022-08-12  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        PR target/106577
        * config/i386/i386.md (*testti_doubleword): Preserve a copy of
        operands[0], and move initialization of operands[2] later, as the
        call to force_reg may clobber the contents of the operands array.

gcc/testsuite/ChangeLog
        PR target/106577
        * gcc.target/i386/pr106577.c: New test case.


Thanks,
Roger
--


[-- Attachment #2: patchfr.txt --]
[-- Type: text/plain, Size: 1113 bytes --]

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 2fde8cd..e9232cd 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -9772,9 +9772,12 @@
               (clobber (reg:CC FLAGS_REG))])
    (set (reg:CCZ FLAGS_REG) (compare:CCZ (match_dup 2) (const_int 0)))]
 {
-  operands[2] = gen_reg_rtx (TImode);
+  /* Calling force_reg may clobber operands[0].  */
+  rtx save_op0 = operands[0];
   if (!x86_64_hilo_general_operand (operands[1], TImode))
     operands[1] = force_reg (TImode, operands[1]);
+  operands[2] = gen_reg_rtx (TImode);
+  operands[0] = save_op0;
 })
 
 ;; Combine likes to form bit extractions for some tests.  Humor it.
diff --git a/gcc/testsuite/gcc.target/i386/pr106577.c b/gcc/testsuite/gcc.target/i386/pr106577.c
new file mode 100644
index 0000000..4e35031
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106577.c
@@ -0,0 +1,7 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2 -mavx" } */
+
+int i;
+void foo(void) {
+  i ^= !(((unsigned __int128)0xf0f0f0f0f0f0f0f0 << 64 | 0xf0f0f0f0f0f0f0f0) & i);
+}

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [x86 PATCH] PR target/106577: force_reg may clobber operands during split.
  2022-08-12 20:40 [x86 PATCH] PR target/106577: force_reg may clobber operands during split Roger Sayle
@ 2022-08-15  7:45 ` Richard Biener
  2022-08-16  8:14   ` Richard Sandiford
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2022-08-15  7:45 UTC (permalink / raw)
  To: Roger Sayle; +Cc: GCC Patches

On Fri, Aug 12, 2022 at 10:41 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch fixes PR target/106577 which is a recent ICE on valid regression
> caused by my introduction of a *testti_doubleword pre-reload splitter in
> i386.md.  During the split pass before reload, this converts the virtual
> *testti_doubleword into an *andti3_doubleword and *cmpti_doubleword,
> checking that any immediate operand is a valid "x86_64_hilo_general_operand"
> and placing it into a TImode register using force_reg if it isn't.
>
> The unexpected behaviour (that caught me out) is that calling force_reg
> may occasionally clobber the contents of the global operands array, or
> more accurately recog_data.operand[0], which means that by the time
> split_XXX calls gen_split_YYY the replacement insn's operands have been
> corrupted.
>
> It's difficult to tell who (if anyone is at fault).  The re-entrant
> stack trace (for the attached PR) looks like:
>
> gen_split_203 (*testti_doubleword) calls
> force_reg calls
> emit_move_insn calls
> emit_move_insn_1 calls
> gen_movti calls
> ix86_expand_move calls
> ix86_convert_const_wide_int_to_broadcast calls
> ix86_vector_duplicate_value calls
> recog_memoized calls
> recog.
>
> By far the simplest and possibly correct fix is rather than attempt
> to push and pop recog_data, to simply (in pre-reload splits) save a
> copy of any operands that will be needed after force_reg, and use
> these copies afterwards.  Many pre-reload splitters avoid this issue
> using "[(clobber (const_int 0))]" and so avoid gen_split_YYY functions,
> but in our case we still need to save a copy of operands[0] (even if we
> call emit_insn or expand_* ourselves), so we might as well continue to
> use the conveniently generated gen_split.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures. Ok for mainline?

Why this obviously fixes the issue seen I wonder whether there's
more of recog_data that might be used after control flow returns
to recog_memoized and thus the fix would be there, not in any
backend pattern triggering the issue like this?

The "easiest" fix would maybe to add a in_recog flag and
simply return FAIL from recog when recursing.  Not sure what
the effect on this particular pattern would be though?

The better(?) fix might be to push/pop recog_data in 'recog', but
of course give that recog_data is currently a global leakage
in intermediate code can still happen.

That said - does anybody know of similar fixes for this issue in other
backends patterns?

Thanks,
Richard.

>
>
> 2022-08-12  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR target/106577
>         * config/i386/i386.md (*testti_doubleword): Preserve a copy of
>         operands[0], and move initialization of operands[2] later, as the
>         call to force_reg may clobber the contents of the operands array.
>
> gcc/testsuite/ChangeLog
>         PR target/106577
>         * gcc.target/i386/pr106577.c: New test case.
>
>
> Thanks,
> Roger
> --
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [x86 PATCH] PR target/106577: force_reg may clobber operands during split.
  2022-08-15  7:45 ` Richard Biener
@ 2022-08-16  8:14   ` Richard Sandiford
  2022-08-16  8:26     ` Richard Biener
  2022-12-02  9:39     ` [PATCH] i386: Save/restore recog_data in ix86_vector_duplicate_value [PR106577] Jakub Jelinek
  0 siblings, 2 replies; 7+ messages in thread
From: Richard Sandiford @ 2022-08-16  8:14 UTC (permalink / raw)
  To: Richard Biener via Gcc-patches; +Cc: Roger Sayle, Richard Biener

Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Fri, Aug 12, 2022 at 10:41 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>>
>>
>> This patch fixes PR target/106577 which is a recent ICE on valid regression
>> caused by my introduction of a *testti_doubleword pre-reload splitter in
>> i386.md.  During the split pass before reload, this converts the virtual
>> *testti_doubleword into an *andti3_doubleword and *cmpti_doubleword,
>> checking that any immediate operand is a valid "x86_64_hilo_general_operand"
>> and placing it into a TImode register using force_reg if it isn't.
>>
>> The unexpected behaviour (that caught me out) is that calling force_reg
>> may occasionally clobber the contents of the global operands array, or
>> more accurately recog_data.operand[0], which means that by the time
>> split_XXX calls gen_split_YYY the replacement insn's operands have been
>> corrupted.
>>
>> It's difficult to tell who (if anyone is at fault).  The re-entrant
>> stack trace (for the attached PR) looks like:
>>
>> gen_split_203 (*testti_doubleword) calls
>> force_reg calls
>> emit_move_insn calls
>> emit_move_insn_1 calls
>> gen_movti calls
>> ix86_expand_move calls
>> ix86_convert_const_wide_int_to_broadcast calls
>> ix86_vector_duplicate_value calls
>> recog_memoized calls
>> recog.
>>
>> By far the simplest and possibly correct fix is rather than attempt
>> to push and pop recog_data, to simply (in pre-reload splits) save a
>> copy of any operands that will be needed after force_reg, and use
>> these copies afterwards.  Many pre-reload splitters avoid this issue
>> using "[(clobber (const_int 0))]" and so avoid gen_split_YYY functions,
>> but in our case we still need to save a copy of operands[0] (even if we
>> call emit_insn or expand_* ourselves), so we might as well continue to
>> use the conveniently generated gen_split.
>>
>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>> and make -k check, both with and without --target_board=unix{-m32},
>> with no new failures. Ok for mainline?
>
> Why this obviously fixes the issue seen I wonder whether there's
> more of recog_data that might be used after control flow returns
> to recog_memoized and thus the fix would be there, not in any
> backend pattern triggering the issue like this?
>
> The "easiest" fix would maybe to add a in_recog flag and
> simply return FAIL from recog when recursing.  Not sure what
> the effect on this particular pattern would be though?
>
> The better(?) fix might be to push/pop recog_data in 'recog', but
> of course give that recog_data is currently a global leakage
> in intermediate code can still happen.
>
> That said - does anybody know of similar fixes for this issue in other
> backends patterns?

I don't think it's valid for a simple query function like
ix86_vector_duplicate_value to clobber global state.  Doing that
could cause problems in other situations, not just splits.

Ideally, it would be good to wean insn-recog.cc:recog off global state.
The only parts of recog_data it uses (if I didn't miss something)
are recog_data.operands and recog_data.insn (but only to nullify
it for recog_memoized, which wouldn't be necessary if recog didn't
clobber recog_data.operands).  But I guess some .md expand/insn
conditions probably rely on the operands array being in recog_data,
so that might not be easy.

IMO the correct low-effort fix is to save and restore recog_data
in ix86_vector_duplicate_value.  It's a relatively big copy,
but the current code is pretty wasteful anyway (allocating at
least a new SET and INSN for every query).  Compared to the
overhead of doing that, a copy to and from the stack shouldn't
be too bad.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [x86 PATCH] PR target/106577: force_reg may clobber operands during split.
  2022-08-16  8:14   ` Richard Sandiford
@ 2022-08-16  8:26     ` Richard Biener
  2022-08-16  9:02       ` Richard Sandiford
  2022-12-02  9:39     ` [PATCH] i386: Save/restore recog_data in ix86_vector_duplicate_value [PR106577] Jakub Jelinek
  1 sibling, 1 reply; 7+ messages in thread
From: Richard Biener @ 2022-08-16  8:26 UTC (permalink / raw)
  To: Richard Biener via Gcc-patches, Roger Sayle, Richard Biener,
	Richard Sandiford

On Tue, Aug 16, 2022 at 10:14 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > On Fri, Aug 12, 2022 at 10:41 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
> >>
> >>
> >> This patch fixes PR target/106577 which is a recent ICE on valid regression
> >> caused by my introduction of a *testti_doubleword pre-reload splitter in
> >> i386.md.  During the split pass before reload, this converts the virtual
> >> *testti_doubleword into an *andti3_doubleword and *cmpti_doubleword,
> >> checking that any immediate operand is a valid "x86_64_hilo_general_operand"
> >> and placing it into a TImode register using force_reg if it isn't.
> >>
> >> The unexpected behaviour (that caught me out) is that calling force_reg
> >> may occasionally clobber the contents of the global operands array, or
> >> more accurately recog_data.operand[0], which means that by the time
> >> split_XXX calls gen_split_YYY the replacement insn's operands have been
> >> corrupted.
> >>
> >> It's difficult to tell who (if anyone is at fault).  The re-entrant
> >> stack trace (for the attached PR) looks like:
> >>
> >> gen_split_203 (*testti_doubleword) calls
> >> force_reg calls
> >> emit_move_insn calls
> >> emit_move_insn_1 calls
> >> gen_movti calls
> >> ix86_expand_move calls
> >> ix86_convert_const_wide_int_to_broadcast calls
> >> ix86_vector_duplicate_value calls
> >> recog_memoized calls
> >> recog.
> >>
> >> By far the simplest and possibly correct fix is rather than attempt
> >> to push and pop recog_data, to simply (in pre-reload splits) save a
> >> copy of any operands that will be needed after force_reg, and use
> >> these copies afterwards.  Many pre-reload splitters avoid this issue
> >> using "[(clobber (const_int 0))]" and so avoid gen_split_YYY functions,
> >> but in our case we still need to save a copy of operands[0] (even if we
> >> call emit_insn or expand_* ourselves), so we might as well continue to
> >> use the conveniently generated gen_split.
> >>
> >> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> >> and make -k check, both with and without --target_board=unix{-m32},
> >> with no new failures. Ok for mainline?
> >
> > Why this obviously fixes the issue seen I wonder whether there's
> > more of recog_data that might be used after control flow returns
> > to recog_memoized and thus the fix would be there, not in any
> > backend pattern triggering the issue like this?
> >
> > The "easiest" fix would maybe to add a in_recog flag and
> > simply return FAIL from recog when recursing.  Not sure what
> > the effect on this particular pattern would be though?
> >
> > The better(?) fix might be to push/pop recog_data in 'recog', but
> > of course give that recog_data is currently a global leakage
> > in intermediate code can still happen.
> >
> > That said - does anybody know of similar fixes for this issue in other
> > backends patterns?
>
> I don't think it's valid for a simple query function like
> ix86_vector_duplicate_value to clobber global state.  Doing that
> could cause problems in other situations, not just splits.
>
> Ideally, it would be good to wean insn-recog.cc:recog off global state.
> The only parts of recog_data it uses (if I didn't miss something)
> are recog_data.operands and recog_data.insn (but only to nullify
> it for recog_memoized, which wouldn't be necessary if recog didn't
> clobber recog_data.operands).  But I guess some .md expand/insn
> conditions probably rely on the operands array being in recog_data,
> so that might not be easy.
>
> IMO the correct low-effort fix is to save and restore recog_data
> in ix86_vector_duplicate_value.  It's a relatively big copy,
> but the current code is pretty wasteful anyway (allocating at
> least a new SET and INSN for every query).  Compared to the
> overhead of doing that, a copy to and from the stack shouldn't
> be too bad.

I see.  I wonder if we should at least add some public API for
save/restore of recog_data so the many places don't need to
invent their own version and they are more easily to find later.
Maybe some RAII

{
  push_recog_data saved ();

}

?  Shall we armor recog () for recursive invocation by adding a
->in_recog member to recog_data?

>
> Thanks,
> Richard

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [x86 PATCH] PR target/106577: force_reg may clobber operands during split.
  2022-08-16  8:26     ` Richard Biener
@ 2022-08-16  9:02       ` Richard Sandiford
  0 siblings, 0 replies; 7+ messages in thread
From: Richard Sandiford @ 2022-08-16  9:02 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Biener via Gcc-patches, Roger Sayle

Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Aug 16, 2022 at 10:14 AM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> > On Fri, Aug 12, 2022 at 10:41 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>> >>
>> >>
>> >> This patch fixes PR target/106577 which is a recent ICE on valid regression
>> >> caused by my introduction of a *testti_doubleword pre-reload splitter in
>> >> i386.md.  During the split pass before reload, this converts the virtual
>> >> *testti_doubleword into an *andti3_doubleword and *cmpti_doubleword,
>> >> checking that any immediate operand is a valid "x86_64_hilo_general_operand"
>> >> and placing it into a TImode register using force_reg if it isn't.
>> >>
>> >> The unexpected behaviour (that caught me out) is that calling force_reg
>> >> may occasionally clobber the contents of the global operands array, or
>> >> more accurately recog_data.operand[0], which means that by the time
>> >> split_XXX calls gen_split_YYY the replacement insn's operands have been
>> >> corrupted.
>> >>
>> >> It's difficult to tell who (if anyone is at fault).  The re-entrant
>> >> stack trace (for the attached PR) looks like:
>> >>
>> >> gen_split_203 (*testti_doubleword) calls
>> >> force_reg calls
>> >> emit_move_insn calls
>> >> emit_move_insn_1 calls
>> >> gen_movti calls
>> >> ix86_expand_move calls
>> >> ix86_convert_const_wide_int_to_broadcast calls
>> >> ix86_vector_duplicate_value calls
>> >> recog_memoized calls
>> >> recog.
>> >>
>> >> By far the simplest and possibly correct fix is rather than attempt
>> >> to push and pop recog_data, to simply (in pre-reload splits) save a
>> >> copy of any operands that will be needed after force_reg, and use
>> >> these copies afterwards.  Many pre-reload splitters avoid this issue
>> >> using "[(clobber (const_int 0))]" and so avoid gen_split_YYY functions,
>> >> but in our case we still need to save a copy of operands[0] (even if we
>> >> call emit_insn or expand_* ourselves), so we might as well continue to
>> >> use the conveniently generated gen_split.
>> >>
>> >> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>> >> and make -k check, both with and without --target_board=unix{-m32},
>> >> with no new failures. Ok for mainline?
>> >
>> > Why this obviously fixes the issue seen I wonder whether there's
>> > more of recog_data that might be used after control flow returns
>> > to recog_memoized and thus the fix would be there, not in any
>> > backend pattern triggering the issue like this?
>> >
>> > The "easiest" fix would maybe to add a in_recog flag and
>> > simply return FAIL from recog when recursing.  Not sure what
>> > the effect on this particular pattern would be though?
>> >
>> > The better(?) fix might be to push/pop recog_data in 'recog', but
>> > of course give that recog_data is currently a global leakage
>> > in intermediate code can still happen.
>> >
>> > That said - does anybody know of similar fixes for this issue in other
>> > backends patterns?
>>
>> I don't think it's valid for a simple query function like
>> ix86_vector_duplicate_value to clobber global state.  Doing that
>> could cause problems in other situations, not just splits.
>>
>> Ideally, it would be good to wean insn-recog.cc:recog off global state.
>> The only parts of recog_data it uses (if I didn't miss something)
>> are recog_data.operands and recog_data.insn (but only to nullify
>> it for recog_memoized, which wouldn't be necessary if recog didn't
>> clobber recog_data.operands).  But I guess some .md expand/insn
>> conditions probably rely on the operands array being in recog_data,
>> so that might not be easy.
>>
>> IMO the correct low-effort fix is to save and restore recog_data
>> in ix86_vector_duplicate_value.  It's a relatively big copy,
>> but the current code is pretty wasteful anyway (allocating at
>> least a new SET and INSN for every query).  Compared to the
>> overhead of doing that, a copy to and from the stack shouldn't
>> be too bad.
>
> I see.  I wonder if we should at least add some public API for
> save/restore of recog_data so the many places don't need to
> invent their own version and they are more easily to find later.

Plain assignment should work.  The structure isn't very fancy ;-)

> Maybe some RAII
>
> {
>   push_recog_data saved ();
>
> }
>
> ?

Maybe.  But if we're going to spend effort on something, moving away
from the global state seems better IMO.

> Shall we armor recog () for recursive invocation by adding a
> ->in_recog member to recog_data?

Yeah, we could do that, but it wouldn't catch the current bug.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] i386: Save/restore recog_data in ix86_vector_duplicate_value [PR106577]
  2022-08-16  8:14   ` Richard Sandiford
  2022-08-16  8:26     ` Richard Biener
@ 2022-12-02  9:39     ` Jakub Jelinek
  2022-12-02  9:51       ` Uros Bizjak
  1 sibling, 1 reply; 7+ messages in thread
From: Jakub Jelinek @ 2022-12-02  9:39 UTC (permalink / raw)
  To: Uros Bizjak, Richard Biener, richard.sandiford, Roger Sayle; +Cc: gcc-patches

Hi!

On Tue, Aug 16, 2022 at 09:14:06AM +0100, Richard Sandiford via Gcc-patches wrote:
> IMO the correct low-effort fix is to save and restore recog_data
> in ix86_vector_duplicate_value.  It's a relatively big copy,
> but the current code is pretty wasteful anyway (allocating at
> least a new SET and INSN for every query).  Compared to the
> overhead of doing that, a copy to and from the stack shouldn't
> be too bad.

The following patch does that.
It isn't the first spot in the compiler that does that, not even the first
spot in the i386 backend.
In i386-expand.cc beyond these 2 recog_memoized calls there is one in
expand_vselect, but I think it is unlikely we'd run into these issues trying
to expand new permutations from splitters.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-12-02  Jakub Jelinek  <jakub@redhat.com>

	PR target/106577
	* config/i386/i386-expand.cc (ix86_vector_duplicate_value): Save/restore
	recog_data around recog_memoized calls.

	* gcc.target/i386/pr106577.c: New test.

--- gcc/config/i386/i386-expand.cc.jj	2022-12-01 09:29:15.233466321 +0100
+++ gcc/config/i386/i386-expand.cc	2022-12-01 14:05:55.901157211 +0100
@@ -15187,6 +15187,10 @@ ix86_vector_duplicate_value (machine_mod
   bool ok;
   rtx_insn *insn;
   rtx dup;
+  /* Save/restore recog_data in case this is called from splitters
+     or other routines where recog_data needs to stay valid across
+     force_reg.  See PR106577.  */
+  recog_data_d recog_data_save = recog_data;
 
   /* First attempt to recognize VAL as-is.  */
   dup = gen_vec_duplicate (mode, val);
@@ -15212,6 +15216,7 @@ ix86_vector_duplicate_value (machine_mod
       ok = recog_memoized (insn) >= 0;
       gcc_assert (ok);
     }
+  recog_data = recog_data_save;
   return true;
 }
 
--- gcc/testsuite/gcc.target/i386/pr106577.c.jj	2022-12-01 14:13:03.973872383 +0100
+++ gcc/testsuite/gcc.target/i386/pr106577.c	2022-12-01 14:13:03.973872383 +0100
@@ -0,0 +1,10 @@
+/* PR target/106577 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2 -mavx" } */
+
+int i;
+void
+foo (void)
+{
+  i ^= !(((unsigned __int128)0xf0f0f0f0f0f0f0f0 << 64 | 0xf0f0f0f0f0f0f0f0) & i);
+}


	Jakub


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] i386: Save/restore recog_data in ix86_vector_duplicate_value [PR106577]
  2022-12-02  9:39     ` [PATCH] i386: Save/restore recog_data in ix86_vector_duplicate_value [PR106577] Jakub Jelinek
@ 2022-12-02  9:51       ` Uros Bizjak
  0 siblings, 0 replies; 7+ messages in thread
From: Uros Bizjak @ 2022-12-02  9:51 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, richard.sandiford, Roger Sayle, gcc-patches

On Fri, Dec 2, 2022 at 10:39 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> On Tue, Aug 16, 2022 at 09:14:06AM +0100, Richard Sandiford via Gcc-patches wrote:
> > IMO the correct low-effort fix is to save and restore recog_data
> > in ix86_vector_duplicate_value.  It's a relatively big copy,
> > but the current code is pretty wasteful anyway (allocating at
> > least a new SET and INSN for every query).  Compared to the
> > overhead of doing that, a copy to and from the stack shouldn't
> > be too bad.
>
> The following patch does that.
> It isn't the first spot in the compiler that does that, not even the first
> spot in the i386 backend.
> In i386-expand.cc beyond these 2 recog_memoized calls there is one in
> expand_vselect, but I think it is unlikely we'd run into these issues trying
> to expand new permutations from splitters.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2022-12-02  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/106577
>         * config/i386/i386-expand.cc (ix86_vector_duplicate_value): Save/restore
>         recog_data around recog_memoized calls.
>
>         * gcc.target/i386/pr106577.c: New test.

OK.

Thanks,
Uros.

>
> --- gcc/config/i386/i386-expand.cc.jj   2022-12-01 09:29:15.233466321 +0100
> +++ gcc/config/i386/i386-expand.cc      2022-12-01 14:05:55.901157211 +0100
> @@ -15187,6 +15187,10 @@ ix86_vector_duplicate_value (machine_mod
>    bool ok;
>    rtx_insn *insn;
>    rtx dup;
> +  /* Save/restore recog_data in case this is called from splitters
> +     or other routines where recog_data needs to stay valid across
> +     force_reg.  See PR106577.  */
> +  recog_data_d recog_data_save = recog_data;
>
>    /* First attempt to recognize VAL as-is.  */
>    dup = gen_vec_duplicate (mode, val);
> @@ -15212,6 +15216,7 @@ ix86_vector_duplicate_value (machine_mod
>        ok = recog_memoized (insn) >= 0;
>        gcc_assert (ok);
>      }
> +  recog_data = recog_data_save;
>    return true;
>  }
>
> --- gcc/testsuite/gcc.target/i386/pr106577.c.jj 2022-12-01 14:13:03.973872383 +0100
> +++ gcc/testsuite/gcc.target/i386/pr106577.c    2022-12-01 14:13:03.973872383 +0100
> @@ -0,0 +1,10 @@
> +/* PR target/106577 */
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-O2 -mavx" } */
> +
> +int i;
> +void
> +foo (void)
> +{
> +  i ^= !(((unsigned __int128)0xf0f0f0f0f0f0f0f0 << 64 | 0xf0f0f0f0f0f0f0f0) & i);
> +}
>
>
>         Jakub
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-12-02  9:51 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-12 20:40 [x86 PATCH] PR target/106577: force_reg may clobber operands during split Roger Sayle
2022-08-15  7:45 ` Richard Biener
2022-08-16  8:14   ` Richard Sandiford
2022-08-16  8:26     ` Richard Biener
2022-08-16  9:02       ` Richard Sandiford
2022-12-02  9:39     ` [PATCH] i386: Save/restore recog_data in ix86_vector_duplicate_value [PR106577] Jakub Jelinek
2022-12-02  9:51       ` Uros Bizjak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).