[Bug target/109632] Inefficient codegen when complex numbers are emulated with structs

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/109632] Inefficient codegen when complex numbers are emulated with structs
Date: Tue, 23 May 2023 10:34:59 +0000	[thread overview]
Message-ID: <bug-109632-4-J8hzULSr6y@http.gcc.gnu.org/bugzilla/> (raw)
In-Reply-To: <bug-109632-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

--- Comment #11 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>:

https://gcc.gnu.org/g:b096a6ebe9d9f9fed4c105f6555f724eb32af95c

commit r14-1131-gb096a6ebe9d9f9fed4c105f6555f724eb32af95c
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Tue May 23 11:34:42 2023 +0100

    aarch64: Provide FPR alternatives for some bit insertions [PR109632]

    At -O2, and so with SLP vectorisation enabled:

        struct complx_t { float re, im; };
        complx_t add(complx_t a, complx_t b) {
          return {a.re + b.re, a.im + b.im};
        }

    generates:

            fmov    w3, s1
            fmov    x0, d0
            fmov    x1, d2
            fmov    w2, s3
            bfi     x0, x3, 32, 32
            fmov    d31, x0
            bfi     x1, x2, 32, 32
            fmov    d30, x1
            fadd    v31.2s, v31.2s, v30.2s
            fmov    x1, d31
            lsr     x0, x1, 32
            fmov    s1, w0
            lsr     w0, w1, 0
            fmov    s0, w0
            ret

    This is because complx_t is passed and returned in FPRs, but GCC gives
    it DImode.  We therefore âneedâ to assemble a DImode pseudo from the
    two individual floats, bitcast it to a vector, do the arithmetic,
    bitcast it back to a DImode pseudo, then extract the individual floats.

    There are many problems here.  The most basic is that we shouldn't
    use SLP for such a trivial example.  But SLP should in principle be
    beneficial for more complicated examples, so preventing SLP for the
    example above just changes the reproducer needed.  A more fundamental
    problem is that it doesn't make sense to use single DImode pseudos in a
    testcase like this.  I have a WIP patch to allow re and im to be stored
    in individual SFmode pseudos instead, but it's quite an invasive change
    and might end up going nowhere.

    A simpler problem to tackle is that we allow DImode pseudos to be stored
    in FPRs, but we don't provide any patterns for inserting values into
    them, even though INS makes that easy for element-like insertions.
    This patch adds some patterns for that.

    Doing that showed that aarch64_modes_tieable_p was too strict:
    it didn't allow SFmode and DImode values to be tied, even though
    both of them occupy a single GPR and FPR, and even though we allow
    both classes to change between the modes.

    The *aarch64_bfidi<ALLX:mode>_subreg_<SUBDI_BITS> pattern is
    especially ugly, but it's not clear what target-independent
    code ought to simplify it to, if it was going to simplify it.

    We should probably do the same thing for extractions, but that's left
    as future work.

    After the patch we generate:

            ins     v0.s[1], v1.s[0]
            ins     v2.s[1], v3.s[0]
            fadd    v0.2s, v0.2s, v2.2s
            fmov    x0, d0
            ushr    d1, d0, 32
            lsr     w0, w0, 0
            fmov    s0, w0
            ret

    which seems like a step in the right direction.

    All in all, there's nothing elegant about this patchh.  It just
    seems like the least worst option.

    gcc/
            PR target/109632
            * config/aarch64/aarch64.cc (aarch64_modes_tieable_p): Allow
            subregs between any scalars that are 64 bits or smaller.
            * config/aarch64/iterators.md (SUBDI_BITS): New int iterator.
            (bits_etype): New int attribute.
            * config/aarch64/aarch64.md (*insv_reg<mode>_<SUBDI_BITS>)
            (*aarch64_bfi<GPI:mode><ALLX:mode>_<SUBDI_BITS>): New patterns.
            (*aarch64_bfidi<ALLX:mode>_subreg_<SUBDI_BITS>): Likewise.

    gcc/testsuite/
            * gcc.target/aarch64/ins_bitfield_1.c: New test.
            * gcc.target/aarch64/ins_bitfield_2.c: Likewise.
            * gcc.target/aarch64/ins_bitfield_3.c: Likewise.
            * gcc.target/aarch64/ins_bitfield_4.c: Likewise.
            * gcc.target/aarch64/ins_bitfield_5.c: Likewise.
            * gcc.target/aarch64/ins_bitfield_6.c: Likewise.

next prev parent reply	other threads:[~2023-05-23 10:35 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-26 13:03 [Bug target/109632] New: " tnfchris at gcc dot gnu.org
2023-04-26 14:14 ` [Bug target/109632] " rguenth at gcc dot gnu.org
2023-04-26 14:40 ` tnfchris at gcc dot gnu.org
2023-04-26 15:23 ` tnfchris at gcc dot gnu.org
2023-04-27  7:52 ` rsandifo at gcc dot gnu.org
2023-04-27 11:17 ` rsandifo at gcc dot gnu.org
2023-04-27 11:24 ` tnfchris at gcc dot gnu.org
2023-04-27 13:33 ` rsandifo at gcc dot gnu.org
2023-04-27 17:50 ` rsandifo at gcc dot gnu.org
2023-04-27 19:11 ` tnfchris at gcc dot gnu.org
2023-05-02 15:00 ` rsandifo at gcc dot gnu.org
2023-05-23 10:34 ` cvs-commit at gcc dot gnu.org [this message]
2023-05-23 19:16 ` rsandifo at gcc dot gnu.org

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-109632-4-J8hzULSr6y@http.gcc.gnu.org/bugzilla/ \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).