public inbox for gcc-patches@gcc.gnu.org
From: Robin Dapp <rdapp@linux.ibm.com>
To: Jeff Law <jeffreyalaw@gmail.com>, gcc-patches@gcc.gnu.org
Subject: Re: [RFC] postreload cse'ing vector constants
Date: Tue, 27 Sep 2022 19:40:23 +0200
Message-ID: <7bd6cb29-a107-a7f2-463f-75bf811792a7@linux.ibm.com>
In-Reply-To: <caf21e5f-966e-d8e7-a7b5-feda97b9d543@linux.ibm.com>

> I did bootstrapping and ran the testsuite on x86(-64), aarch64, Power9
> and s390.  Everything looks good except two additional fails on x86
> where code actually looks worse.
> 
> gcc.target/i386/keylocker-encodekey128.c
> 
> 17c17,18
> <       movaps  %xmm4, k2(%rip)
> ---
>>       pxor    %xmm0, %xmm0
>>       movaps  %xmm0, k2(%rip)
> 
> gcc.target/i386/keylocker-encodekey256.c:
> 
> 19c19,20
> <       movaps  %xmm4, k3(%rip)
> ---
>>       pxor    %xmm0, %xmm0
>>       movaps  %xmm0, k3(%rip)

Before the patch and after postreload we have:

(insn (set (reg:V2DI xmm0)
        (reg:V2DI xmm4))
     (expr_list:REG_DEAD (reg:V2DI 24 xmm4)
        (expr_list:REG_EQUIV (const_vector:V2DI [
                    (const_int 0 [0]) repeated x2
                ]))))
(insn (set (mem/c:V2DI (symbol_ref:DI ("k2")))
        (reg:V2DI xmm0)))

which is converted by cprop_hardreg to:

(insn (set (mem/c:V2DI (symbol_ref:DI ("k2")))
        (reg:V2DI xmm4)))

With the change there is:

(insn (set (reg:V2DI xmm0)
        (const_vector:V2DI [
                (const_int 0 [0]) repeated x2
            ])))
(insn (set (mem/c:V2DI (symbol_ref:DI ("k2")))
        (reg:V2DI xmm0)))

which is not simplified further because xmm0 needs to be explicitly
zeroed while xmm4 is assumed to be zeroed by encodekey128.  I'm not
familiar with this so I'm supposing this is correct even though I found
"XMM4 through XMM6 are reserved for future usages and software should
not rely upon them being zeroed." online.
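
To make the difference between the two sequences concrete, here is a
minimal toy model of forwarding hard-register copies.  It is only a
sketch: the tuple encoding and the function name forward_copies are made
up for illustration and are not how regcprop is implemented in GCC.  It
shows that a reg-reg copy can be forwarded into the store and then
dropped, while a constant load gives the pass nothing to forward.

```python
# Toy model of hard-register copy propagation; NOT GCC's implementation.
# Hypothetical instruction encoding:
#   ("copy",  dst, src)     dst = src            (mov reg,reg)
#   ("const", dst, value)   dst = value          (e.g. pxor reg,reg for 0)
#   ("store", symbol, src)  memory[symbol] = src

def forward_copies(insns, live_out=()):
    """Forward reg-reg copies into later uses, then drop register writes
    whose destination is never read and is not live on exit."""
    copy_of = {}      # register -> older register holding the same value
    rewritten = []
    for insn in insns:
        kind = insn[0]
        if kind == "copy":
            _, dst, src = insn
            src = copy_of.get(src, src)
            if dst == src:
                continue  # the copy became a no-op
            # dst is overwritten: forget every fact that mentions it.
            copy_of = {r: s for r, s in copy_of.items() if dst not in (r, s)}
            copy_of[dst] = src
            rewritten.append(("copy", dst, src))
        elif kind == "const":
            dst = insn[1]
            copy_of = {r: s for r, s in copy_of.items() if dst not in (r, s)}
            rewritten.append(insn)
        else:  # store
            _, symbol, src = insn
            rewritten.append(("store", symbol, copy_of.get(src, src)))

    # Backward walk: keep a register write only if its value is needed.
    needed = set(live_out)
    kept = []
    for insn in reversed(rewritten):
        kind = insn[0]
        if kind == "store":
            needed.add(insn[2])
            kept.append(insn)
        elif insn[1] in needed:
            needed.discard(insn[1])
            if kind == "copy":
                needed.add(insn[2])
            kept.append(insn)
    kept.reverse()
    return kept

before_patch = [("copy", "xmm0", "xmm4"), ("store", "k2", "xmm0")]
with_patch = [("const", "xmm0", 0), ("store", "k2", "xmm0")]

print(forward_copies(before_patch))  # [('store', 'k2', 'xmm4')]
print(forward_copies(with_patch))    # [('const', 'xmm0', 0), ('store', 'k2', 'xmm0')]
```

In this model the first sequence shrinks to a single store from xmm4,
matching the movaps %xmm4 line in the diff above, whereas the second
keeps both instructions.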

Even if xmm4 were zeroed explicitly, I guess in this case the simple
costing of mov reg,reg vs mov reg,imm (with the latter not being more
expensive) falls short?  cprop_hardreg can actually propagate the zeroed
xmm4 into the next move.
The same mechanism could possibly even elide many such moves, which would
mean we'd unnecessarily emit many mov reg,0?  Hmm...
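
The costing concern can be put into a small arithmetic sketch.  The
cost values and both function names below are hypothetical, chosen only
to illustrate the argument; they are not taken from the i386 cost tables
or from postreload.  The point is that a comparison of the two moves in
isolation prefers the constant when it is not more expensive, even
though the reg-reg copy might have disappeared entirely in a later pass.

```python
# Toy cost model only; the numbers and names are assumptions for illustration.

def local_choice(reg_move_cost, const_load_cost):
    """Compare the two moves in isolation; take the constant load
    whenever it is not more expensive than the reg-reg move."""
    return "const" if const_load_cost <= reg_move_cost else "reg"

def cost_after_cprop(choice, reg_move_cost, const_load_cost, copy_is_forwardable):
    """Cost that is left once a later pass may forward and delete
    a reg-reg copy.  A constant load is never removed in this model."""
    if choice == "reg":
        return 0 if copy_is_forwardable else reg_move_cost
    return const_load_cost

REG_MOVE = 2    # hypothetical cost of mov reg,reg
CONST_LOAD = 2  # hypothetical cost of mov reg,imm (not more expensive)

choice = local_choice(REG_MOVE, CONST_LOAD)
print(choice)                                                # const
print(cost_after_cprop("reg", REG_MOVE, CONST_LOAD, True))   # 0
print(cost_after_cprop(choice, REG_MOVE, CONST_LOAD, True))  # 2
```

Under these assumed numbers the local comparison picks the constant,
yet keeping the copy would have cost nothing after propagation.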

