public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Uros Bizjak <ubizjak@gmail.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Jeff Law <law@redhat.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] Optimize away x86 mem stores of what the mem contains already (PR rtl-optimization/79593)
Date: Tue, 08 Jan 2019 10:49:00 -0000	[thread overview]
Message-ID: <CAFULd4a2eg+M_J5cEgH7z7RsF92XpVcOZnFBw8nvSP-xx432LA@mail.gmail.com> (raw)
In-Reply-To: <20190108092714.GX30353@tucnak>

On Tue, Jan 8, 2019 at 10:27 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Tue, Jan 08, 2019 at 08:29:03AM +0100, Uros Bizjak wrote:
> > Is there a reason stack registers are excluded? Before stackreg pass,
> > these registers are just like other hard registers.
>
> I was just afraid of those, after stack pass removing a store but not
> load would result in the stack getting out of sync, and I wasn't sure if the
> stack pass will tolerate a RA chosen stack reg not being touched anymore.

It does. There is a stack compensating code that handles these situations.

> Also, doesn't loading a float/double/long double into a stack reg and
> storing again canonicalize it in any way?  Say NaNs?  Or unnormal or
> pseudo-denormal long doubles etc.?  Tried that now with a sNaN and the
> load/store didn't change it, but haven't tried with exceptions enabled to
> see if it would raise one.

FLD from memory in SF and DFmode is considered a conversion, and
converts sNaN to NaN (and emits #IA exception). But sNaN handling is
already busted in the compiler as RA is free to spill the register in
non-XFmode. IMO, the peephole2 pattern is no worse than the current
situation.

> If that is fine, I'll leave that out.

I think we are good to do so.

> Looking around, I was using a wrong macro anyway and am surprised nothing
> complained, it should have been STACK_REG_P rather than STACK_REGNO_P.
>
> > Other that that, there is no need for REG_P predicate; after reload we
> > don't have subregs and register_operand will match only hard regs.
>
> Ok, I wasn't sure because "register_operand" allows (subreg (reg)) even if
> reload_completed, just disallows subregs of mem after that.

At least for x86, there are no SUBREGs after reload, otherwise other
parts of the compiler would break.

> > Also, please put peep2_reg_dead_p predicate in the pattern predicate.
>
> I don't see how, that would mean I'd have to write two peephole2s instead of
> one.  It tries to deal with two different cases, one is where the temporary
> reg is dead, in that case we can optimize away both the load or store, the
> second case is where the temporary reg isn't dead, in that case we can
> optimize away the store, but not the load.  With the optimizing away of both
> load and store I was just trying to do a cheap DCE there.

I didn't realize this is an optimization, a comment would be welcome here.

Uros.

> Looking around more, I actually think I need to replace
> (match_dup 1) with (match_operand 2 "memory_operand"), add rtx_equal_p
> (operands[1], operands[2]) and !MEM_VOLATILE_P (operands[2]), because
> apparently rtx_equal_p doesn't check the MEM_VOLATILE_P bit.
>
> > > 2019-01-07  Jakub Jelinek  <jakub@redhat.com>
> > >
> > >         PR rtl-optimization/79593
> > >         * config/i386/i386.md (reg = mem; mem = reg): New define_peephole2.
> > >
> > > --- gcc/config/i386/i386.md.jj  2019-01-01 12:37:31.564738571 +0100
> > > +++ gcc/config/i386/i386.md     2019-01-07 17:11:21.056392168 +0100
> > > @@ -18740,6 +18740,21 @@ (define_peephole2
> > >                        const0_rtx);
> > >  })
> > >
> > > +;; Attempt to optimize away memory stores of values the memory already
> > > +;; has.  See PR79593.
> > > +(define_peephole2
> > > +  [(set (match_operand 0 "register_operand")
> > > +        (match_operand 1 "memory_operand"))
> > > +   (set (match_dup 1) (match_dup 0))]
> > > +  "REG_P (operands[0])
> > > +   && !STACK_REGNO_P (operands[0])
> > > +   && !MEM_VOLATILE_P (operands[1])"
> > > +  [(set (match_dup 0) (match_dup 1))]
> > > +{
> > > +  if (peep2_reg_dead_p (1, operands[0]))
> > > +    DONE;
> > > +})
> > > +
> > >  ;; Attempt to always use XOR for zeroing registers (including FP modes).
> > >  (define_peephole2
> > >    [(set (match_operand 0 "general_reg_operand")
>
>         Jakub

  reply	other threads:[~2019-01-08 10:49 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-07 22:51 Jakub Jelinek
2019-01-08  7:29 ` Uros Bizjak
2019-01-08  9:27   ` Jakub Jelinek
2019-01-08 10:49     ` Uros Bizjak [this message]
2019-01-08 11:43       ` Jakub Jelinek
2019-01-08 17:40         ` Uros Bizjak
2019-01-11 19:03 ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFULd4a2eg+M_J5cEgH7z7RsF92XpVcOZnFBw8nvSP-xx432LA@mail.gmail.com \
    --to=ubizjak@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=law@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).