RE: [x86 PATCH] Fix FAIL of gcc.target/i386/pr91681-1.c

public inbox for gcc-patches@gcc.gnu.org

From: "Roger Sayle" <roger@nextmovesoftware.com>
To: "'Jiang, Haochen'" <haochen.jiang@intel.com>, <gcc-patches@gcc.gnu.org>
Cc: "'Uros Bizjak'" <ubizjak@gmail.com>
Subject: RE: [x86 PATCH] Fix FAIL of gcc.target/i386/pr91681-1.c
Date: Mon, 17 Jul 2023 08:54:16 +0100
Message-ID: <003801d9b883$e6415c50$b2c414f0$@nextmovesoftware.com>
In-Reply-To: <SA1PR11MB594662B874596DF8E284BCA4EC3BA@SA1PR11MB5946.namprd11.prod.outlook.com>

> From: Jiang, Haochen <haochen.jiang@intel.com>
> Sent: 17 July 2023 02:50
> 
> > From: Jiang, Haochen
> > Sent: Friday, July 14, 2023 10:50 AM
> >
> > > The recent change in TImode parameter passing on x86_64 results in
> > > the FAIL of pr91681-1.c.  The issue is that with the extra
> > > flexibility, the combine pass is now spoilt for choice between using
> > > either the *add<dwi>3_doubleword_concat or the
> > > > *add<dwi>3_doubleword_zext patterns, when one operand is a *concat and
> > > > the other is a zero_extend.
> > > > The solution proposed below is to provide an
> > > *add<dwi>3_doubleword_concat_zext define_insn_and_split, that can
> > > benefit both from the register allocation of *concat, and still
> > > avoid the xor normally required by zero extension.
> > >
> > > I'm investigating a follow-up refinement to improve register
> > > allocation further by avoiding the early clobber in the =&r, and
> > > > handling (custom) reloads explicitly, but this piece resolves the
> > > > testcase failure.
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with make
> > > bootstrap and make -k check, both with and without
> > > --target_board=unix{-m32} with no new failures.  Ok for mainline?
> > >
> > >
> > > 2023-07-11  Roger Sayle  <roger@nextmovesoftware.com>
> > >
> > > gcc/ChangeLog
> > >         PR target/91681
> > > >         * config/i386/i386.md (*add<dwi>3_doubleword_concat_zext): New
> > > >         define_insn_and_split derived from *add<dwi>3_doubleword_concat
> > > >         and *add<dwi>3_doubleword_zext.
> >
> > Hi Roger,
> >
> > This commit currently changed the codegen of testcase p443644-2.c from:
> 
> Oops, a typo, I mean pr43644-2.c.
> 
> Haochen

I'm working on a fix and hope to have this resolved soon.  Unfortunately,
fixing things in a post-reload splitter isn't working out due to reload's
choices, so the solution will likely be a peephole2.

The problem is that pr91681-1.c and pr43644-2.c can't both PASS (as written)!
The operation x = y + 0 can be generated as either "mov y,x; add $0,x" or as
"xor x,x; add y,x".  pr91681-1.c checks there isn't an xor, pr43644-2.c checks
there isn't a mov.  Doh!  As the author of both these test cases, I've painted
myself into a corner.

The solution is that "add $0,x" should be generated (optimal) when y is
already in x, and "xor x,x; add y,x" used otherwise (as this is shorter than
"mov y,x; add $0,x", both sequences being approximately equal
performance-wise).
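
The selection rule described above can be sketched in C as follows.  This is
only an illustration of the decision, not code from GCC or from the patch; the
names choose_sequence, seq_text and the enum are hypothetical.

```c
/* Hypothetical illustration of the selection rule; not GCC code. */
enum seq { SEQ_ADD0, SEQ_XOR_ADD };

/* dest_holds_y is nonzero when y already lives in the destination
   register x, so no copy or clear is needed before the add.  */
enum seq choose_sequence(int dest_holds_y)
{
    return dest_holds_y ? SEQ_ADD0 : SEQ_XOR_ADD;
}

/* Textual form of each sequence, using the notation from the message.  */
const char *seq_text(enum seq s)
{
    return s == SEQ_ADD0 ? "add $0,x" : "xor x,x; add y,x";
}
```

The "mov y,x; add $0,x" form never appears in this sketch because, per the
reasoning above, the xor form is shorter at roughly equal speed.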

> >         movq    %rdx, %rax
> >         xorl    %edx, %edx
> >         addq    %rdi, %rax
> >         adcq    %rsi, %rdx
> > to:
> >         movq    %rdx, %rcx
> >         movq    %rdi, %rax
> >         movq    %rsi, %rdx
> >         addq    %rcx, %rax
> >         adcq    $0, %rdx
> >
> > which causes the testcase to fail under -m64.
> > Is this within your expectation?

You're right that the original (using xor) is better for pr43644-2.c's test
case:
unsigned __int128 foo(unsigned __int128 x, unsigned long long y) { return x+y; }
but the closely related (swapping the argument order):
unsigned __int128 bar(unsigned long long y, unsigned __int128 x) { return x+y; }
is better using "adcq $0" than having a superfluous xor.
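
For anyone wanting to experiment, the two functions can be put in one file
with a small carry check.  This is a minimal sketch: the u128 typedef and the
make_u128 and self_check helpers are assumptions added for illustration, and
the register comments reflect the x86_64 SysV calling convention as seen in
the assembly quoted above.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;   /* GCC/Clang extension */

/* x arrives in %rdi (low) and %rsi (high), y in %rdx; the result is
   returned in %rax (low) and %rdx (high).  */
u128 foo(u128 x, unsigned long long y) { return x + y; }

/* y arrives in %rdi, x in %rsi (low) and %rdx (high), so the high half
   of x already sits in the high result register.  */
u128 bar(unsigned long long y, u128 x) { return x + y; }

/* Hypothetical helper: build a 128-bit value from two 64-bit halves.  */
u128 make_u128(uint64_t hi, uint64_t lo) { return ((u128)hi << 64) | lo; }

/* Both argument orders must carry from the low half into the high half.  */
int self_check(void)
{
    u128 x = make_u128(1, UINT64_MAX);
    u128 expect = make_u128(2, 0);
    assert(foo(x, 1) == expect);
    assert(bar(1, x) == expect);
    return 0;
}
```

Compiling this with -O2 -S on x86_64 and comparing the output for foo and bar
is one way to see which of the two sequences the compiler picks for each
argument order.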

Executive summary: This FAIL isn't serious.  I'll silence it soon.

> > BRs,
> > Haochen
> >
> > >
> > >
> > > Thanks,
> > > Roger
> > > --


Thread overview: 5+ messages
2023-07-11 20:07 Roger Sayle
2023-07-12  9:37 ` Uros Bizjak
2023-07-14  2:50 ` Jiang, Haochen
2023-07-17  1:49   ` Jiang, Haochen
2023-07-17  7:54     ` Roger Sayle [this message]
