From: Kirill Yukhin <kirill.yukhin@gmail.com>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [RFT PATCH, i386]: Optimize zero-extensions from mask registers
Date: Mon, 22 Aug 2016 15:25:00 -0000
Message-ID: <20160822152454.GA25898@titus>
In-Reply-To: <CAFULd4bHkc1gmN3B0ddHvQEjJ5b0vyD1sUmWLC99hXrb2oa9wA@mail.gmail.com>
Hello Uroš,
On 05 Aug 14:22, Uros Bizjak wrote:
> Hello!
>
> The attached patch was inspired by the assembly from the PR 72805 testcase.
> Currently, the compiler generates:
>
> test:
>         vpternlogd $0xFF, %zmm0, %zmm0, %zmm0
>         vpxord %zmm1, %zmm1, %zmm1
>         vpcmpd $1, %zmm1, %zmm0, %k1
>         kmovw %k1, %eax
>         movzwl %ax, %eax
>         ret
>
> Please note that kmovw already zero-extends from a mask register.
>
> 2016-08-05 Uros Bizjak <ubizjak@gmail.com>
>
> * config/i386/i386.md (*zero_extendsidi2): Add (*r,*k) alternative.
> (zero_extend<mode>di2): Ditto.
> (*zero_extend<mode>si2): Ditto.
> (*zero_extendqihi2): Ditto.
>
> Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> The patch is in RFT state, since I have no means to test AVX512 stuff.
> Kirill, can someone from Intel please test the patch?
I gave your patch a try and see no regressions or bootstrap failures on i386/x86_64 (run under SDE).
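For readers following the kmovw remark quoted above, here is a minimal sketch in portable C of why the trailing movzwl is redundant. It is not part of the patch and was not part of the testing. The names model_kmovw, model_movzwl, movzwl_is_redundant and check_all_masks are hypothetical. They only model the effect of the two instructions on %eax, assuming that kmovw to a 32-bit register copies the 16 mask bits and clears the upper bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "kmovw %k1, %eax": the 16-bit mask is copied
   into the low half of the 32-bit register and the upper half is zeroed. */
static uint32_t model_kmovw(uint16_t k1)
{
    return (uint32_t)k1;
}

/* Hypothetical model of "movzwl %ax, %eax": keep the low 16 bits,
   zero the upper 16 bits. */
static uint32_t model_movzwl(uint32_t eax)
{
    return eax & 0xFFFFu;
}

/* Returns 1 if movzwl leaves the kmovw result unchanged for this mask. */
static int movzwl_is_redundant(uint16_t k1)
{
    uint32_t eax = model_kmovw(k1);
    return model_movzwl(eax) == eax;
}

/* Exhaustively checks every possible 16-bit mask value. */
static void check_all_masks(void)
{
    for (uint32_t k = 0; k <= 0xFFFFu; ++k)
        assert(movzwl_is_redundant((uint16_t)k));
}
```

Under that assumption the second instruction never changes the register, which is the case the new (*r,*k) alternatives are meant to cover.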
--
Thanks, K
>
> Uros.
> Index: config/i386/i386.md
> ===================================================================
> --- config/i386/i386.md (revision 239166)
> +++ config/i386/i386.md (working copy)
> @@ -3688,10 +3688,10 @@
>
> (define_insn "*zero_extendsidi2"
> [(set (match_operand:DI 0 "nonimmediate_operand"
> - "=r,?r,?o,r ,o,?*Ym,?!*y,?r ,?r,?*Yi,?*x")
> + "=r,?r,?o,r ,o,?*Ym,?!*y,?r ,?r,?*Yi,?*x,*r")
> (zero_extend:DI
> (match_operand:SI 1 "x86_64_zext_operand"
> - "0 ,rm,r ,rmWz,0,r ,m ,*Yj,*x,r ,m")))]
> + "0 ,rm,r ,rmWz,0,r ,m ,*Yj,*x,r ,m ,*k")))]
> ""
> {
> switch (get_attr_type (insn))
> @@ -3717,6 +3717,9 @@
>
> return "%vmovd\t{%1, %0|%0, %1}";
>
> + case TYPE_MSKMOV:
> + return "kmovd\t{%1, %k0|%k0, %1}";
> +
> default:
> gcc_unreachable ();
> }
> @@ -3724,7 +3727,7 @@
> [(set (attr "isa")
> (cond [(eq_attr "alternative" "0,1,2")
> (const_string "nox64")
> - (eq_attr "alternative" "3,7")
> + (eq_attr "alternative" "3,7,11")
> (const_string "x64")
> (eq_attr "alternative" "8")
> (const_string "x64_sse4")
> @@ -3741,6 +3744,8 @@
> (const_string "ssemov")
> (eq_attr "alternative" "8")
> (const_string "sselog1")
> + (eq_attr "alternative" "11")
> + (const_string "mskmov")
> ]
> (const_string "imovx")))
> (set (attr "prefix_extra")
> @@ -3792,12 +3797,14 @@
> "split_double_mode (DImode, &operands[0], 1, &operands[3], &operands[4]);")
>
> (define_insn "zero_extend<mode>di2"
> - [(set (match_operand:DI 0 "register_operand" "=r")
> + [(set (match_operand:DI 0 "register_operand" "=r,*r")
> (zero_extend:DI
> - (match_operand:SWI12 1 "nonimmediate_operand" "<r>m")))]
> + (match_operand:SWI12 1 "nonimmediate_operand" "<r>m,*k")))]
> "TARGET_64BIT"
> - "movz{<imodesuffix>l|x}\t{%1, %k0|%k0, %1}"
> - [(set_attr "type" "imovx")
> + "@
> + movz{<imodesuffix>l|x}\t{%1, %k0|%k0, %1}
> + kmov<mskmodesuffix>\t{%1, %k0|%k0, %1}"
> + [(set_attr "type" "imovx,mskmov")
> (set_attr "mode" "SI")])
>
> (define_expand "zero_extend<mode>si2"
> @@ -3841,13 +3848,15 @@
> (set_attr "mode" "SI")])
>
> (define_insn "*zero_extend<mode>si2"
> - [(set (match_operand:SI 0 "register_operand" "=r")
> + [(set (match_operand:SI 0 "register_operand" "=r,*r")
> (zero_extend:SI
> - (match_operand:SWI12 1 "nonimmediate_operand" "<r>m")))]
> + (match_operand:SWI12 1 "nonimmediate_operand" "<r>m,*k")))]
> "!(TARGET_ZERO_EXTEND_WITH_AND && optimize_function_for_speed_p (cfun))"
> - "movz{<imodesuffix>l|x}\t{%1, %0|%0, %1}"
> - [(set_attr "type" "imovx")
> - (set_attr "mode" "SI")])
> + "@
> + movz{<imodesuffix>l|x}\t{%1, %0|%0, %1}
> + kmov<mskmodesuffix>\t{%1, %0|%0, %1}"
> + [(set_attr "type" "imovx,mskmov")
> + (set_attr "mode" "SI,<MODE>")])
>
> (define_expand "zero_extendqihi2"
> [(set (match_operand:HI 0 "register_operand")
> @@ -3890,12 +3899,14 @@
>
> ; zero extend to SImode to avoid partial register stalls
> (define_insn "*zero_extendqihi2"
> - [(set (match_operand:HI 0 "register_operand" "=r")
> - (zero_extend:HI (match_operand:QI 1 "nonimmediate_operand" "qm")))]
> + [(set (match_operand:HI 0 "register_operand" "=r,*r")
> + (zero_extend:HI (match_operand:QI 1 "nonimmediate_operand" "qm,*k")))]
> "!(TARGET_ZERO_EXTEND_WITH_AND && optimize_function_for_speed_p (cfun))"
> - "movz{bl|x}\t{%1, %k0|%k0, %1}"
> - [(set_attr "type" "imovx")
> - (set_attr "mode" "SI")])
> + "@
> + movz{bl|x}\t{%1, %k0|%k0, %1}
> + kmovb\t{%1, %k0|%k0, %1}"
> + [(set_attr "type" "imovx,mskmov")
> + (set_attr "mode" "SI,QI")])
>
> (define_insn_and_split "*zext<mode>_doubleword_and"
> [(set (match_operand:DI 0 "register_operand" "=&<r>")