From: "Roger Sayle" <roger@nextmovesoftware.com>
To: "'GCC Patches'" <gcc-patches@gcc.gnu.org>
Subject: [PATCH] PR rtl-optimization/7061: Complex number arguments on x86_64-like ABIs.
Date: Mon, 30 May 2022 11:06:16 +0100
Message-ID: <00e101d8740c$e785e110$b691a330$@nextmovesoftware.com>
[-- Attachment #1: Type: text/plain, Size: 2574 bytes --]
This patch addresses the issue in comment #6 of PR rtl-optimization/7061
(a four digit PR number) from 2006 where on x86_64 complex number arguments
are unconditionally spilled to the stack.
For the test cases below:
float re(float _Complex a) { return __real__ a; }
float im(float _Complex a) { return __imag__ a; }
GCC with -O2 currently generates:
re:     movq    %xmm0, -8(%rsp)
        movss   -8(%rsp), %xmm0
        ret
im:     movq    %xmm0, -8(%rsp)
        movss   -4(%rsp), %xmm0
        ret
with this patch we now generate:
re:     ret
im:     movq    %xmm0, %rax
        shrq    $32, %rax
        movd    %eax, %xmm0
        ret
[Technically, this shift can be performed on %xmm0 in a single
instruction, but the backend needs to be taught to do that; the
important bit is that the SCmode argument isn't written to the
stack].
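As a rough C-level model of what the new "im" sequence computes, the sketch below treats the float _Complex argument as a 64-bit integer and picks the halves out with a cast and a shift. It is only an illustration, not code from the patch: the helper names are hypothetical, and it assumes a little-endian target such as x86_64, where the real part occupies the low 32 bits.

```c
#include <assert.h>   /* static_assert */
#include <complex.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float _Complex is two 32-bit floats, as on x86_64. */
static_assert(sizeof(float _Complex) == sizeof(uint64_t),
              "sketch assumes a 64-bit float _Complex");

/* Reinterpret the 8 bytes of a float _Complex as one 64-bit integer. */
static uint64_t complex_bits(float _Complex z)
{
    uint64_t bits;
    memcpy(&bits, &z, sizeof bits);
    return bits;
}

/* Reinterpret a 32-bit pattern as a float. */
static float float_from_bits(uint32_t word)
{
    float f;
    memcpy(&f, &word, sizeof f);
    return f;
}

/* Real part: the low 32 bits (little-endian assumption). */
static float re_from_bits(uint64_t bits)
{
    return float_from_bits((uint32_t)bits);
}

/* Imaginary part: shift right by 32, mirroring "shrq $32, %rax". */
static float im_from_bits(uint64_t bits)
{
    return float_from_bits((uint32_t)(bits >> 32));
}
```

No memory round trip is needed at the machine level; the memcpy calls here are just the portable way to express a bit-for-bit reinterpretation in C.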
The patch itself is to emit_group_store where, just before RTL
expansion commits to writing to the stack, we check whether the
store group consists of a single scalar integer register that holds
a complex mode value; on x86_64, SCmode arguments are passed in
DImode registers. If this is the case, we can use a SUBREG to
"view_convert" the integer to the equivalent complex mode.
An interesting corner case that showed up during testing is that
x86_64 also passes HCmode arguments in DImode registers(!), i.e.
using modes of different sizes. This is easily handled/supported
by first converting to an integer mode of the correct size, and
then generating a complex mode SUBREG of this. This is similar
in concept to the patch I proposed here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html
which was almost (but not quite) approved here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591139.html
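To make the size-mismatch case concrete, here is a small C analogy of the two steps described above for HCmode: first narrow the 64-bit register to an integer of the value's own size, then reinterpret those bits as the complex value. This is a sketch under stated assumptions, not the RTL the patch emits: the struct and function names are hypothetical, the two halves are kept as raw 16-bit patterns to avoid depending on _Float16 support, and a little-endian layout is assumed.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an HCmode value: two raw half-precision
   bit patterns, real part first.  */
struct half_complex_bits {
    uint16_t re;
    uint16_t im;
};

static_assert(sizeof(struct half_complex_bits) == sizeof(uint32_t),
              "sketch assumes a 32-bit half-precision complex");

static struct half_complex_bits half_complex_from_reg(uint64_t reg)
{
    /* Step 1: narrow to an integer of the complex value's size
       (the analogue of taking the SImode lowpart of a DImode register). */
    uint32_t narrowed = (uint32_t)reg;

    /* Step 2: reinterpret those 32 bits as the complex value
       (the analogue of the complex-mode SUBREG). */
    struct half_complex_bits z;
    memcpy(&z, &narrowed, sizeof z);
    return z;
}
```

The upper 32 bits of the register are simply ignored, which is why the narrowing step has to come before the reinterpretation.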
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures. Ok for mainline?
2022-05-30 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/7061
* expr.cc (emit_group_store): For groups that consist of a single
scalar integer register that holds a complex mode value, use
gen_lowpart to generate a SUBREG to "view_convert" to the complex
mode. For modes of different sizes, first convert to an integer
mode of the appropriate size.
gcc/testsuite/ChangeLog
PR rtl-optimization/7061
* gcc.target/i386/pr7061-1.c: New test case.
* gcc.target/i386/pr7061-2.c: New test case.
Thanks in advance,
Roger
--
[-- Attachment #2: patchcx.txt --]
[-- Type: text/plain, Size: 1908 bytes --]
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 7197996..c9df206 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -2803,10 +2803,26 @@ emit_group_store (rtx orig_dst, rtx src, tree type ATTRIBUTE_UNUSED,
{
machine_mode dest_mode = GET_MODE (dest);
machine_mode tmp_mode = GET_MODE (tmps[i]);
+ scalar_int_mode imode;
gcc_assert (known_eq (bytepos, 0) && XVECLEN (src, 0));
- if (GET_MODE_ALIGNMENT (dest_mode)
+ if (finish == 1
+ && REG_P (tmps[i])
+ && COMPLEX_MODE_P (dest_mode)
+ && SCALAR_INT_MODE_P (tmp_mode)
+ && int_mode_for_mode (dest_mode).exists (&imode))
+ {
+ if (tmp_mode != imode)
+ {
+ rtx tmp = gen_reg_rtx (imode);
+ emit_move_insn (tmp, gen_lowpart (imode, tmps[i]));
+ dst = gen_lowpart (dest_mode, tmp);
+ }
+ else
+ dst = gen_lowpart (dest_mode, tmps[i]);
+ }
+ else if (GET_MODE_ALIGNMENT (dest_mode)
>= GET_MODE_ALIGNMENT (tmp_mode))
{
dest = assign_stack_temp (dest_mode,
diff --git a/gcc/testsuite/gcc.target/i386/pr7061-1.c b/gcc/testsuite/gcc.target/i386/pr7061-1.c
new file mode 100644
index 0000000..ce5f6b2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr7061-1.c
@@ -0,0 +1,4 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2" } */
+float re(float _Complex a) { return __real__ a; }
+/* { dg-final { scan-assembler-not "mov" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr7061-2.c b/gcc/testsuite/gcc.target/i386/pr7061-2.c
new file mode 100644
index 0000000..ac33340
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr7061-2.c
@@ -0,0 +1,5 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2" } */
+float im(float _Complex a) { return __imag__ a; }
+/* { dg-final { scan-assembler-not "movss" } } */
+/* { dg-final { scan-assembler-not "rsp" } } */