From: Uros Bizjak <ubizjak@gmail.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Jeff Law <law@redhat.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] Optimize away x86 mem stores of what the mem contains already (PR rtl-optimization/79593)
Date: Tue, 08 Jan 2019 10:49:00 -0000 [thread overview]
Message-ID: <CAFULd4a2eg+M_J5cEgH7z7RsF92XpVcOZnFBw8nvSP-xx432LA@mail.gmail.com> (raw)
In-Reply-To: <20190108092714.GX30353@tucnak>
On Tue, Jan 8, 2019 at 10:27 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Tue, Jan 08, 2019 at 08:29:03AM +0100, Uros Bizjak wrote:
> > Is there a reason stack registers are excluded? Before stackreg pass,
> > these registers are just like other hard registers.
>
> I was just afraid of those, after stack pass removing a store but not
> load would result in the stack getting out of sync, and I wasn't sure if the
> stack pass will tolerate a RA chosen stack reg not being touched anymore.
It does. There is a stack compensating code that handles these situations.
> Also, doesn't loading a float/double/long double into a stack reg and
> storing again canonicalize it in any way? Say NaNs? Or unnormal or
> pseudo-denormal long doubles etc.? Tried that now with a sNaN and the
> load/store didn't change it, but haven't tried with exceptions enabled to
> see if it would raise one.
FLD from memory in SF and DFmode is considered a conversion, and
converts sNaN to NaN (and emits #IA exception). But sNaN handling is
already busted in the compiler as RA is free to spill the register in
non-XFmode. IMO, the peephole2 pattern is no worse than the current
situation.
> If that is fine, I'll leave that out.
I think we are good to do so.
> Looking around, I was using a wrong macro anyway and am surprised nothing
> complained, it should have been STACK_REG_P rather than STACK_REGNO_P.
>
> > Other that that, there is no need for REG_P predicate; after reload we
> > don't have subregs and register_operand will match only hard regs.
>
> Ok, I wasn't sure because "register_operand" allows (subreg (reg)) even if
> reload_completed, just disallows subregs of mem after that.
At least for x86, there are no SUBREGs after reload, otherwise other
parts of the compiler would break.
> > Also, please put peep2_reg_dead_p predicate in the pattern predicate.
>
> I don't see how, that would mean I'd have to write two peephole2s instead of
> one. It tries to deal with two different cases, one is where the temporary
> reg is dead, in that case we can optimize away both the load or store, the
> second case is where the temporary reg isn't dead, in that case we can
> optimize away the store, but not the load. With the optimizing away of both
> load and store I was just trying to do a cheap DCE there.
I didn't realize this is an optimization, a comment would be welcome here.
Uros.
> Looking around more, I actually think I need to replace
> (match_dup 1) with (match_operand 2 "memory_operand"), add rtx_equal_p
> (operands[1], operands[2]) and !MEM_VOLATILE_P (operands[2]), because
> apparently rtx_equal_p doesn't check the MEM_VOLATILE_P bit.
>
> > > 2019-01-07 Jakub Jelinek <jakub@redhat.com>
> > >
> > > PR rtl-optimization/79593
> > > * config/i386/i386.md (reg = mem; mem = reg): New define_peephole2.
> > >
> > > --- gcc/config/i386/i386.md.jj 2019-01-01 12:37:31.564738571 +0100
> > > +++ gcc/config/i386/i386.md 2019-01-07 17:11:21.056392168 +0100
> > > @@ -18740,6 +18740,21 @@ (define_peephole2
> > > const0_rtx);
> > > })
> > >
> > > +;; Attempt to optimize away memory stores of values the memory already
> > > +;; has. See PR79593.
> > > +(define_peephole2
> > > + [(set (match_operand 0 "register_operand")
> > > + (match_operand 1 "memory_operand"))
> > > + (set (match_dup 1) (match_dup 0))]
> > > + "REG_P (operands[0])
> > > + && !STACK_REGNO_P (operands[0])
> > > + && !MEM_VOLATILE_P (operands[1])"
> > > + [(set (match_dup 0) (match_dup 1))]
> > > +{
> > > + if (peep2_reg_dead_p (1, operands[0]))
> > > + DONE;
> > > +})
> > > +
> > > ;; Attempt to always use XOR for zeroing registers (including FP modes).
> > > (define_peephole2
> > > [(set (match_operand 0 "general_reg_operand")
>
> Jakub
next prev parent reply other threads:[~2019-01-08 10:49 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-07 22:51 Jakub Jelinek
2019-01-08 7:29 ` Uros Bizjak
2019-01-08 9:27 ` Jakub Jelinek
2019-01-08 10:49 ` Uros Bizjak [this message]
2019-01-08 11:43 ` Jakub Jelinek
2019-01-08 17:40 ` Uros Bizjak
2019-01-11 19:03 ` Jeff Law
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAFULd4a2eg+M_J5cEgH7z7RsF92XpVcOZnFBw8nvSP-xx432LA@mail.gmail.com \
--to=ubizjak@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
--cc=law@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).