From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 71EBA3858D20; Thu, 27 Apr 2023 07:52:05 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 71EBA3858D20
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1682581925;
	bh=xoeybBNo7fuH6aE37qix3l+KWzhXMLCyt6KrjFI3NYI=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=tvjX8ycTt/q7O1i+R+f6MLCYrECbwZzSlMQDmn4YYmEL/rZG1tVUBlxnj6U8wCWhd
	 YOeREhKZ9K2vFLeQTwsyFkmzTNT2qfUyeQFqKC0nrJB3s2mnj4dBs2yfDIx4KhxEyJ
	 WkvsYOIaS1VV9QQ7+OdkpoG7beSkOmdRT1jeYjMg=
From: "rsandifo at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/109632] Inefficient codegen when complex numbers are
 emulated with structs
Date: Thu, 27 Apr 2023 07:52:04 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rsandifo at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: cc
Message-ID: <bug-109632-4-OSsl5awA6n@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-109632-4@http.gcc.gnu.org/bugzilla/>
References: <bug-109632-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org
--- Comment #4 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
It may be worth noting that if the complex arguments are passed
by value instead, giving:

struct complx_t {
    float re;
    float im;
};

complx_t
add(const complx_t a, const complx_t b) {
  return {a.re + b.re, a.im + b.im};
}

and SLP is disabled, we get:

        fmov    w4, s1
        fmov    w3, s3
        fmov    x0, d0
        fmov    x1, d2
        mov     x2, 0
        bfi     x0, x4, 32, 32
        bfi     x1, x3, 32, 32
        fmov    d0, x0
        fmov    d1, x1
        sbfx    x3, x0, 0, 32
        sbfx    x0, x1, 0, 32
        ushr    d1, d1, 32
        fmov    d3, x0
        fmov    d2, x3
        ushr    d0, d0, 32
        fadd    s2, s2, s3
        fadd    s0, s0, s1
        fmov    w1, s2
        fmov    w0, s0
        bfi     x2, x1, 0, 32
        bfi     x2, x0, 32, 32
        lsr     x0, x2, 32
        lsr     w2, w2, 0
        fmov    s1, w0
        fmov    s0, w2
        ret

which is almost impressive, in its way.

I think we need a way in gimple of “SRA-ing” the arguments
and return value, in cases where that's forced by the ABI.
I.e. provide separate incoming values of a.re and a.im,
and store them to “a” on entry.  Then similarly make the
return stmt return RESULT_DECL.re and RESULT_DECL.im
separately.
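
As a rough source-level picture of that idea (only a sketch, not
actual gimple and not a proposed implementation): add_scalarised and
check_equivalence below are hypothetical names, and the reference
out-parameters merely stand in for the separate return registers
that the ABI would use.

```cpp
#include <cassert>

struct complx_t {
    float re;
    float im;
};

// The by-value function from the testcase above.
complx_t
add(const complx_t a, const complx_t b) {
  return {a.re + b.re, a.im + b.im};
}

// Hypothetical scalarised form: each field arrives as its own
// incoming value, and each field of the result leaves separately.
void
add_scalarised(float a_re, float a_im, float b_re, float b_im,
               float &ret_re, float &ret_im) {
  // Store the separate incoming values to "a" and "b" on entry.
  complx_t a;
  a.re = a_re;
  a.im = a_im;
  complx_t b;
  b.re = b_re;
  b.im = b_im;

  // The body is unchanged and still works on the aggregates.
  complx_t result = {a.re + b.re, a.im + b.im};

  // Return the two fields of the result separately.
  ret_re = result.re;
  ret_im = result.im;
}

// Checks that both forms agree on one exactly representable input.
void
check_equivalence() {
  complx_t r = add({1.0f, 2.0f}, {3.0f, 4.5f});
  float re = 0.0f;
  float im = 0.0f;
  add_scalarised(1.0f, 2.0f, 3.0f, 4.5f, re, im);
  assert(r.re == re);
  assert(r.im == im);
}
```

With the fields visible as scalars at the function boundary, the
local copies of "a", "b" and the result become ordinary SRA
candidates, so ideally only the two additions would remain.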