[Bug target/89984] Extra register move - cvs-commit at gcc dot gnu.org

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/89984] Extra register move
Date: Wed, 08 Sep 2021 12:09:25 +0000	[thread overview]
Message-ID: <bug-89984-4-2xTiALmfKi@http.gcc.gnu.org/bugzilla/> (raw)
In-Reply-To: <bug-89984-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89984

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:7485a52551d71db2e8bbfc4c484196bcc321a1cd

commit r12-3417-g7485a52551d71db2e8bbfc4c484196bcc321a1cd
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Sep 8 14:06:10 2021 +0200

    i386: Fix up xorsign for AVX [PR89984]

    Thinking about it more this morning, while this patch fixes the problems
    revealed in the testcase, the recent PR89984 change was buggy too, but
    perhaps that can be fixed incrementally.  Because for AVX the new code
    destructively modifies op1.  If that is different from dest, say on:
    float
    foo (float x, float y)
    {
      return x * __builtin_copysignf (1.0f, y) + y;
    }
    then we get after RA:
    (insn 8 7 9 2 (set (reg:SF 20 xmm0 [orig:82 _2 ] [82])
            (unspec:SF [
                    (reg:SF 20 xmm0 [88])
                    (reg:SF 21 xmm1 [89])
                    (mem/u/c:V4SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0 
S16 A128])
                ] UNSPEC_XORSIGN)) "hohoho.c":4:12 649 {xorsignsf3_1}
         (nil))
    (insn 9 8 15 2 (set (reg:SF 20 xmm0 [87])
            (plus:SF (reg:SF 20 xmm0 [orig:82 _2 ] [82])
                (reg:SF 21 xmm1 [89]))) "hohoho.c":4:44 1021 {*fop_sf_comm}
         (nil))
    but split the xorsign into:
            vandps  .LC0(%rip), %xmm1, %xmm1
            vxorps  %xmm0, %xmm1, %xmm0
    and then the addition:
            vaddss  %xmm1, %xmm0, %xmm0
    which means we miscompile it - instead of adding y in the end we add
    __builtin_copysignf (0.0f, y).
    So, wonder if we don't want instead in addition to the &Yv <- Yv, 0
    alternative (enabled for both pre-AVX and AVX as in this patch) the
    &Yv <- Yv, Yv where destination must be different from inputs and another
    Yv <- Yv, Yv where it can be the same but then need a match_scratch
    (with X for the other alternatives and =Yv for the last one).
    That way we'd always have a safe register we can store the op1 & mask
    value into, either the destination (in the first alternative known to
    be equal to op1 which is needed for non-AVX but ok for AVX too), in the
    second alternative known to be different from both inputs and in the third
    which could be used for those
    float bar (float x, float y) { return x * __builtin_copysignf (1.0f, y); }
    cases where op1 is naturally xmm1 and dest == op0 naturally xmm0 we'd use
    some other register like xmm2.

    On Wed, Sep 08, 2021 at 05:23:40PM +0800, Hongtao Liu wrote:
    > I'm curious why we need the  post_reload splitter @xorsign<mode>3_1
    > for scalar mode, can't we just expand them into and/xor operations in
    > the expander, just like vector modes did.

    Following seems to work for all the testcases I've tried (and in some
    generates better code than the post-reload splitter).

    2021-09-08  Jakub Jelinek  <jakub@redhat.com>
                liuhongt  <hongtao.liu@intel.com>

            PR target/89984
            * config/i386/i386.md (@xorsign<mode>3_1): Remove.
            * config/i386/i386-expand.c (ix86_expand_xorsign): Expand right
away
            into AND with mask and XOR, using paradoxical subregs.
            (ix86_split_xorsign): Remove.
            * config/i386/i386-protos.h (ix86_split_xorsign): Remove.

            * gcc.target/i386/avx-pr102224.c: Fix up PR number.
            * gcc.dg/pr89984.c: New test.
            * gcc.target/i386/avx-pr89984.c: New test.

next prev parent reply	other threads:[~2021-09-08 12:09 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <bug-89984-4@http.gcc.gnu.org/bugzilla/>
2021-09-01  8:42 ` pinskia at gcc dot gnu.org
2021-09-04  6:41 ` pinskia at gcc dot gnu.org
2021-09-06 12:18 ` cvs-commit at gcc dot gnu.org
2021-09-06 12:35 ` hjl.tools at gmail dot com
2021-09-08  8:24 ` jakub at gcc dot gnu.org
2021-09-08 12:09 ` cvs-commit at gcc dot gnu.org [this message]
2021-09-08 12:11 ` jakub at gcc dot gnu.org

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-89984-4-2xTiALmfKi@http.gcc.gnu.org/bugzilla/ \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).