public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Uros Bizjak <ubizjak@gmail.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Kirill Yukhin <kirill.yukhin@gmail.com>,
		"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] Fix MMX/SSE/AVX* shifts by non-immediate scalar (PR target/80286)
Date: Tue, 04 Apr 2017 12:33:00 -0000	[thread overview]
Message-ID: <CAFULd4aRw2xdbk0QtOhtuxamnb87bQvASnF5H6W-EO6PZ5GPVw@mail.gmail.com> (raw)
In-Reply-To: <20170404120039.GG17461@tucnak>

On Tue, Apr 4, 2017 at 2:00 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Apr 04, 2017 at 08:39:59AM +0200, Uros Bizjak wrote:
>> > Any thoughts on what to do to generate reasonable code when the shift count
>> > comes from memory (e.g. as int variable) or is in the low bits of some XMM
>> > regioster?
>>
>> The problem with int variable from memory is, that shifts access full
>> 128bits for their count operand, so this is effectively a no-go. If
>> there is a 128bit count value in memory, we can maybe define shift
>> pattern with:
>>
>> (subreg:DI (match_operand:V2DI 2 "general_operand" "xmN,vmN"))
>>
>> ?
>
> Well, if the original memory is say int, then we can't just read it as V2DI
> or V4SI.

Of course. The above was for the case when we *want* to load from
memory. The insn loads full 128bit value.

>> > First of all, perhaps we could have some combiner (or peephole) pattern that would
>> > transform sign-extend from e.g. SI to DI on the shift count into zero-extend
>> > if there are no other uses of the extension result - if the shift count is
>> > negative in SImode (or even QImode), then it is already large number and the
>> > upper 32 bits or more don't really change anything on that.
>>
>> We can introduce shift patterns with embedded extensions, and split
>> them to zext + shift. These new patterns can be easily macroized with
>> any_extend code iterator and SWI124 mode iterator, so we avoid pattern
>> explosion.
>
> I assume split those before reload.  Because we want to give reload a chance
> to do the zero extension on GPRs if it is more beneficial, and it might
> choose to store it into memory and load into XMM from memory and that is
> hard to do after reload.

Yes, split before reload, and hope that alternative's decorations play
well with RA.

>> > Then perhaps we could emit pmovzxdq for SSE4.1+ instead of going through
>> > GPRs and back, or for SSE2 pxor on a scratch reg and punpck* to get it zero
>> > extended.  Not sure if we want to add =v / vm alternative to
>> > zero_extendsidi2*, it already has some x but with ?s that prevent the RA
>> > from using it.  So thoughts on that?
>>
>> The ? is there to discourage RA from allocating xmm reg (all these
>> alternatives have * on xmm reg), in effect instructing RA to prefer
>> GPRs. If the value is already in xmm reg, then I expect ? alternative
>> will be used. So, yes, v/v alternative as you proposed would be a good
>> addition to zero_extendsidi alternatives. Please note though that
>> pmovzxdq operates on a vector value, so memory operands should be
>> avoided.
>
> With ? in front of it or without?  I admit I've only tried so far:

I'd leave ?* in this case. In my experience, RA allocates alternative
with ?* only when really needed.

> @@ -4049,24 +4049,29 @@ (define_expand "extendsidi2"
>  })
>
>  (define_insn "*extendsidi2_rex64"
> -  [(set (match_operand:DI 0 "register_operand" "=*a,r")
> -       (sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "*0,rm")))]
> +  [(set (match_operand:DI 0 "register_operand" "=*a,r,v")
> +       (sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "*0,rm,vm")))]
>    "TARGET_64BIT"
>    "@
>     {cltq|cdqe}
> -   movs{lq|x}\t{%1, %0|%0, %1}"
> -  [(set_attr "type" "imovx")
> -   (set_attr "mode" "DI")
> -   (set_attr "prefix_0f" "0")
> -   (set_attr "modrm" "0,1")])
> +   movs{lq|x}\t{%1, %0|%0, %1}
> +   %vpmovsxdq\t{%1, %0|%0, %1}"
> +  [(set_attr "isa" "*,*,sse4")
> +   (set_attr "type" "imovx,imovx,ssemov")
> +   (set_attr "mode" "DI,DI,TI")
> +   (set_attr "prefix_0f" "0,0,*")
> +   (set_attr "prefix_extra" "*,*,1")
> +   (set_attr "prefix" "orig,orig,maybe_evex")
> +   (set_attr "modrm" "0,1,*")])
>
>
> and with the ? in front of v it for some reason didn't trigger.
> I'll try the zero_extendsidi2 now and see how it works.
>
>> OK for trunk and backports.
>
> Committed to trunk so far, backports in a week or so when I backport
> dozens of other patches together with it.
>
>         Jakub

Uros.

  reply	other threads:[~2017-04-04 12:33 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-03 20:34 Jakub Jelinek
2017-04-04  6:40 ` Uros Bizjak
2017-04-04 12:01   ` Jakub Jelinek
2017-04-04 12:33     ` Uros Bizjak [this message]
2017-04-04 15:09       ` Jakub Jelinek
2017-04-06  7:34         ` Uros Bizjak
2017-04-06  8:40           ` Uros Bizjak
2017-04-06  8:40           ` Jakub Jelinek
2017-04-06  8:47             ` Uros Bizjak
2017-04-06  9:56               ` Jakub Jelinek
2017-04-06  8:48             ` Jakub Jelinek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFULd4aRw2xdbk0QtOhtuxamnb87bQvASnF5H6W-EO6PZ5GPVw@mail.gmail.com \
    --to=ubizjak@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=kirill.yukhin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).