public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Allow early sets of SSE hard registers from standard_sse_constant_p
@ 2021-10-15 12:15 Roger Sayle
  2021-10-15 14:41 ` Uros Bizjak
  0 siblings, 1 reply; 2+ messages in thread
From: Roger Sayle @ 2021-10-15 12:15 UTC (permalink / raw)
  To: 'GCC Patches'

[-- Attachment #1: Type: text/plain, Size: 1846 bytes --]


My previous patch, which was intended to reduce the differences seen by
the combination of -march=cascadelake and -m32, has additionally found
some more instances where this combination behaves differently to regular
x86_64-pc-linux-gnu.  The middle-end always, and backends usually, use
emit_move_insn to emit/expand move instructions allowing the backend
control over placing things in constant pools, adding REG_EQUAL notes,
and so on.  Several of the AVX512 built-in expanders bypass this logic,
and instead generate moves directly using emit_insn(gen_rtx_SET (dst,src)).

For example, i386-expand.c line 12004 contains:
      for (i = 0; i < 8; i++)
        emit_insn (gen_rtx_SET (xmm_regs[i], const0_rtx));

I suspect that in this case, loading of standard_sse_constant_p, my
change to require loading of likely spilled hard registers via a
pseudo is perhaps overly strict, so this patch/fix reallows these
immediate constants values to be loaded directly prior to reload.

If anyone notices a (SPEC benchmark) performance regression with
this patch, I'll propose the more invasive fix to make more use of
emit_move_insn in the backend (and revert this fix), but all things
being equal it's best to leave things the way they previously were.

This patch not only cures the regressions reported by Sunil's
tester, but in combination with the previous patch now has 7 fewer
unexpected failures in the testsuite with -m32 -march=cascadelake.
This patch has also been tested with "make bootstrap" and
"make -k check" on x86_64-pc-linux-gnu with no new failures.

Ok for mainline?
Sorry again for the temporary inconvenience.


2021-10-15  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386.c (ix86_hardreg_mov_ok): For vector modes,
	allow standard_sse_constant_p immediate constants.


Thanks in advance,
Roger
--


[-- Attachment #2: patchv4b.txt --]
[-- Type: text/plain, Size: 714 bytes --]

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index fb65609..9cc903e 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -19303,7 +19303,9 @@ ix86_hardreg_mov_ok (rtx dst, rtx src)
   /* Avoid complex sets of likely_spilled hard registers before reload.  */
   if (REG_P (dst) && HARD_REGISTER_P (dst)
       && !REG_P (src) && !MEM_P (src)
-      && !x86_64_immediate_operand (src, GET_MODE (dst))
+      && !(VECTOR_MODE_P (GET_MODE (dst))
+	   ? standard_sse_constant_p (src, GET_MODE (dst))
+	   : x86_64_immediate_operand (src, GET_MODE (dst)))
       && ix86_class_likely_spilled_p (REGNO_REG_CLASS (REGNO (dst)))
       && !reload_completed)
     return false;

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH] Allow early sets of SSE hard registers from standard_sse_constant_p
  2021-10-15 12:15 [PATCH] Allow early sets of SSE hard registers from standard_sse_constant_p Roger Sayle
@ 2021-10-15 14:41 ` Uros Bizjak
  0 siblings, 0 replies; 2+ messages in thread
From: Uros Bizjak @ 2021-10-15 14:41 UTC (permalink / raw)
  To: Roger Sayle; +Cc: GCC Patches, H.J. Lu

On Fri, Oct 15, 2021 at 2:15 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> My previous patch, which was intended to reduce the differences seen by
> the combination of -march=cascadelake and -m32, has additionally found
> some more instances where this combination behaves differently to regular
> x86_64-pc-linux-gnu.  The middle-end always, and backends usually, use
> emit_move_insn to emit/expand move instructions allowing the backend
> control over placing things in constant pools, adding REG_EQUAL notes,
> and so on.  Several of the AVX512 built-in expanders bypass this logic,
> and instead generate moves directly using emit_insn(gen_rtx_SET (dst,src)).
>
> For example, i386-expand.c line 12004 contains:
>       for (i = 0; i < 8; i++)
>         emit_insn (gen_rtx_SET (xmm_regs[i], const0_rtx));
>
> I suspect that in this case, loading of standard_sse_constant_p, my
> change to require loading of likely spilled hard registers via a
> pseudo is perhaps overly strict, so this patch/fix reallows these
> immediate constants values to be loaded directly prior to reload.
>
> If anyone notices a (SPEC benchmark) performance regression with
> this patch, I'll propose the more invasive fix to make more use of
> emit_move_insn in the backend (and revert this fix), but all things
> being equal it's best to leave things the way they previously were.
>
> This patch not only cures the regressions reported by Sunil's
> tester, but in combination with the previous patch now has 7 fewer
> unexpected failures in the testsuite with -m32 -march=cascadelake.
> This patch has also been tested with "make bootstrap" and
> "make -k check" on x86_64-pc-linux-gnu with no new failures.
>
> Ok for mainline?
> Sorry again for the temporary inconvenience.
>
>
> 2021-10-15  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * config/i386/i386.c (ix86_hardreg_mov_ok): For vector modes,
>         allow standard_sse_constant_p immediate constants.

LGTM.

Thanks,
Uros.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-10-15 14:41 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-15 12:15 [PATCH] Allow early sets of SSE hard registers from standard_sse_constant_p Roger Sayle
2021-10-15 14:41 ` Uros Bizjak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).