public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Alexander Monakov <amonakov@ispras.ru>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: Jeff Law <jeffreyalaw@gmail.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] [x86]Don't optimize cmp mem, 0 to load mem, reg + test reg, reg
Date: Fri, 16 Sep 2022 16:38:01 +0300 (MSK)	[thread overview]
Message-ID: <c9cbba46-557e-d662-2c49-fadbbff45f9c@ispras.ru> (raw)
In-Reply-To: <CAFULd4aX94u4ZJ3o9RZsT_3eW-Ei2NzoMOSvyQipyC7aYKKSGA@mail.gmail.com>

On Fri, 16 Sep 2022, Uros Bizjak via Gcc-patches wrote:

> On Fri, Sep 16, 2022 at 3:32 AM Jeff Law via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> >
> > On 9/15/22 19:06, liuhongt via Gcc-patches wrote:
> > > There's peephole2 submit in 1990s which split cmp mem, 0 to load mem,
> > > reg + test reg, reg. I don't know exact reason why gcc do this.
> > >
> > > For latest x86 processors, ciscization should help processor frontend
> > > also codesize, for processor backend, they should be the same(has same
> > > uops).
> > >
> > > So the patch deleted the peephole2, and also modify another splitter to
> > > generate more cmp mem, 0 for 32-bit target.
> > >
> > > It will help instruction fetch.
> > >
> > > for minmax-1.c minmax-2.c minmax-10, pr96891.c, it's supposed to scan there's no
> > > comparison to 1 or -1, so adjust the testcase since under 32-bit
> > > target, we now generate cmp mem, 0 instead of load + test.
> > >
> > > Similar for pr78035.c.
> > >
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> > > No performance impact for SPEC2017 on ICX/Znver3.
> > >
> > It was almost certainly for PPro/P2 given it was rth's work from
> > 1999.    Probably should have been conditionalized on PPro/P2 at the
> > time.   No worries losing it now...
> 
> Please add a tune flag in x86-tune.def under "Historical relics" and
> use it in the relevant peephole2 instead of deleting it.

When the next instruction after 'load mem; test reg, reg' is a conditional
branch, this disables macro-op fusion because Intel CPUs do not macro-fuse
'cmp mem, imm; jcc'.

It would be nice to rephrase the commit message to acknowledge this (the
statement 'has same uops' is not always true with this considered).

AMD CPUs can fuse some 'cmp mem, imm; jcc' under some conditions, so this
should be beneficial for AMD.

Alexander

  reply	other threads:[~2022-09-16 13:38 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-16  1:06 liuhongt
2022-09-16  1:13 ` Hongtao Liu
2022-09-16  1:31 ` Jeff Law
2022-09-16  6:47   ` Uros Bizjak
2022-09-16 13:38     ` Alexander Monakov [this message]
2022-09-20  2:27       ` Hongtao Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c9cbba46-557e-d662-2c49-fadbbff45f9c@ispras.ru \
    --to=amonakov@ispras.ru \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jeffreyalaw@gmail.com \
    --cc=ubizjak@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).