* [gcc r13-1038] PR rtl-optimization/7061: Complex number arguments on x86_64-like ABIs.
From: Roger Sayle @ 2022-06-10 14:20 UTC
  To: gcc-cvs

https://gcc.gnu.org/g:1753a7120109c1d3b682f9487d6cca64fb2f0929

commit r13-1038-g1753a7120109c1d3b682f9487d6cca64fb2f0929
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Fri Jun 10 15:14:23 2022 +0100

    PR rtl-optimization/7061: Complex number arguments on x86_64-like ABIs.
    
    This patch addresses the issue in comment #6 of PR rtl-optimization/7061
    (a four-digit PR number) from 2006, where on x86_64 complex number
    arguments are unconditionally spilled to the stack.
    
    For the test cases below:
    float re(float _Complex a) { return __real__ a; }
    float im(float _Complex a) { return __imag__ a; }
    
    GCC with -O2 currently generates:
    
    re:     movq    %xmm0, -8(%rsp)
            movss   -8(%rsp), %xmm0
            ret
    im:     movq    %xmm0, -8(%rsp)
            movss   -4(%rsp), %xmm0
            ret
    
    With this patch we now generate:
    
    re:     ret
    im:     movq    %xmm0, %rax
            shrq    $32, %rax
            movd    %eax, %xmm0
            ret
    
    [Technically, this shift can be performed on %xmm0 in a single
    instruction, but the backend needs to be taught to do that.  The
    important bit is that the SCmode argument isn't written to the
    stack.]
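    
    As a rough illustration of what the new code for im computes, the
    sketch below reinterprets the 8 bytes of a float _Complex as a 64-bit
    integer and shifts the imaginary part down.  The helper names
    (complex_bits, re_via_bits, im_via_bits) are hypothetical and not part
    of the patch, and the sketch assumes a little-endian target such as
    x86_64, where the real part occupies the low 32 bits.

```c
#include <assert.h>
#include <complex.h>
#include <stdint.h>
#include <string.h>

/* An SCmode value (two 32-bit floats) must fit in one 64-bit integer. */
static_assert(sizeof(float _Complex) == sizeof(uint64_t),
              "float _Complex is expected to be 8 bytes");

/* Hypothetical helper: view the complex value as a 64-bit integer,
   comparable to "movq %xmm0, %rax". */
uint64_t complex_bits(float _Complex a)
{
    uint64_t bits;
    memcpy(&bits, &a, sizeof bits);
    return bits;
}

/* Real part: low 32 bits on a little-endian target (assumption). */
float re_via_bits(float _Complex a)
{
    uint32_t lo = (uint32_t) complex_bits(a);
    float r;
    memcpy(&r, &lo, sizeof r);
    return r;
}

/* Imaginary part: shift right by 32, comparable to "shrq $32, %rax",
   then view the low 32 bits as a float ("movd %eax, %xmm0"). */
float im_via_bits(float _Complex a)
{
    uint32_t hi = (uint32_t) (complex_bits(a) >> 32);
    float r;
    memcpy(&r, &hi, sizeof r);
    return r;
}
```

    Under these assumptions, re_via_bits and im_via_bits return the same
    values as __real__ and __imag__ without touching memory in the
    generated RTL sense; the memcpy calls here are only the portable C way
    of spelling a bit-level view conversion.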
    
    The patch itself is to emit_group_store: just before RTL expansion
    commits to writing to the stack, we check whether the store group
    consists of a single scalar integer register that holds a complex
    mode value; on x86_64 SCmode arguments are passed in DImode
    registers.  If this is the case, we can use a SUBREG to
    "view_convert" the integer to the equivalent complex mode.
    
    An interesting corner case that showed up during testing is that
    x86_64 also passes HCmode arguments in DImode registers(!), i.e.
    using modes of different sizes.  This is easily handled/supported
    by first converting to an integer mode of the correct size, and
    then generating a complex mode SUBREG of this.  This is similar
    in concept to the patch I proposed here:
    https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html
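    
    The two-step handling of the HCmode case can be modelled in plain C
    as follows.  This is only a sketch: the struct and function names are
    hypothetical, the half-precision values are represented by their raw
    16-bit patterns to avoid depending on _Float16 support, and a
    little-endian layout is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an HCmode value: two 16-bit half-float
   bit patterns, real part first. */
typedef struct {
    uint16_t re_bits;
    uint16_t im_bits;
} half_complex_bits;

static_assert(sizeof(half_complex_bits) == sizeof(uint32_t),
              "the modelled HCmode value is expected to be 4 bytes");

/* The argument arrives in a 64-bit (DImode) register, but the complex
   value is only 32 bits wide. */
half_complex_bits half_complex_from_reg(uint64_t reg)
{
    /* Step 1: take the low part in an integer mode of the matching
       size (SImode, 32 bits); the upper 32 bits are ignored. */
    uint32_t narrowed = (uint32_t) reg;

    /* Step 2: view those 32 bits as the two halves of the complex
       value (little-endian assumption: real part in the low half). */
    half_complex_bits out;
    out.re_bits = (uint16_t) narrowed;
    out.im_bits = (uint16_t) (narrowed >> 16);
    return out;
}
```

    For example, with 0x3C00 and 0x4000 (the half-precision encodings of
    1.0 and 2.0) in the low 32 bits, the upper 32 bits of the register
    have no effect on the result.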
    
    2022-06-10  Roger Sayle  <roger@nextmovesoftware.com>
    
    gcc/ChangeLog
            PR rtl-optimization/7061
            * expr.cc (emit_group_store): For groups that consist of a single
            scalar integer register that holds a complex mode value, use
            gen_lowpart to generate a SUBREG to "view_convert" to the complex
            mode.  For modes of different sizes, first convert to an integer
            mode of the appropriate size.
    
    gcc/testsuite/ChangeLog
            PR rtl-optimization/7061
            * gcc.target/i386/pr7061-1.c: New test case.
            * gcc.target/i386/pr7061-2.c: New test case.

Diff:
---
 gcc/expr.cc                              | 18 +++++++++++++++++-
 gcc/testsuite/gcc.target/i386/pr7061-1.c |  4 ++++
 gcc/testsuite/gcc.target/i386/pr7061-2.c |  5 +++++
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index c37a9990536..78c839ab425 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -2801,10 +2801,26 @@ emit_group_store (rtx orig_dst, rtx src, tree type ATTRIBUTE_UNUSED,
 	    {
 	      machine_mode dest_mode = GET_MODE (dest);
 	      machine_mode tmp_mode = GET_MODE (tmps[i]);
+	      scalar_int_mode imode;
 
 	      gcc_assert (known_eq (bytepos, 0) && XVECLEN (src, 0));
 
-	      if (GET_MODE_ALIGNMENT (dest_mode)
+	      if (finish == 1
+		  && REG_P (tmps[i])
+		  && COMPLEX_MODE_P (dest_mode)
+		  && SCALAR_INT_MODE_P (tmp_mode)
+		  && int_mode_for_mode (dest_mode).exists (&imode))
+		{
+		  if (tmp_mode != imode)
+		    {
+		      rtx tmp = gen_reg_rtx (imode);
+		      emit_move_insn (tmp, gen_lowpart (imode, tmps[i]));
+		      dst = gen_lowpart (dest_mode, tmp);
+		    }
+		  else
+		    dst = gen_lowpart (dest_mode, tmps[i]);
+		}
+	      else if (GET_MODE_ALIGNMENT (dest_mode)
 		  >= GET_MODE_ALIGNMENT (tmp_mode))
 		{
 		  dest = assign_stack_temp (dest_mode,
diff --git a/gcc/testsuite/gcc.target/i386/pr7061-1.c b/gcc/testsuite/gcc.target/i386/pr7061-1.c
new file mode 100644
index 00000000000..ce5f6b2741c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr7061-1.c
@@ -0,0 +1,4 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2" } */
+float re(float _Complex a) { return __real__ a; }
+/* { dg-final { scan-assembler-not "mov" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr7061-2.c b/gcc/testsuite/gcc.target/i386/pr7061-2.c
new file mode 100644
index 00000000000..ac33340099b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr7061-2.c
@@ -0,0 +1,5 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2" } */
+float im(float _Complex a) { return __imag__ a; }
+/* { dg-final { scan-assembler-not "movss" } } */
+/* { dg-final { scan-assembler-not "rsp" } } */

