public inbox for binutils@sourceware.org
* [PATCH] x86: improve/shorten vector zeroing-idiom optimization conditional
@ 2022-08-02 15:20 Jan Beulich
  2022-08-02 15:56 ` H.J. Lu
From: Jan Beulich @ 2022-08-02 15:20 UTC
  To: Binutils

- Drop the rounding type check: We're past template matching, and none
  of the involved insns support embedded rounding.
- Drop the extension opcode check: None of the involved opcodes have
  variants with it being other than None.
- Instead check opcode space, even if just to be on the safe side going
  forward.
- Reduce the number of comparisons by folding two groups.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -4329,24 +4329,19 @@ optimize_encoding (void)
 	   && !i.types[2].bitfield.xmmword
 	   && (i.tm.opcode_modifier.vex
 	       || ((!i.mask.reg || i.mask.zeroing)
-		   && i.rounding.type == rc_none
 		   && is_evex_encoding (&i.tm)
 		   && (i.vec_encoding != vex_encoding_evex
 		       || cpu_arch_isa_flags.bitfield.cpuavx512vl
 		       || i.tm.cpu_flags.bitfield.cpuavx512vl
 		       || (i.tm.operand_types[2].bitfield.zmmword
 			   && i.types[2].bitfield.ymmword))))
-	   && ((i.tm.base_opcode == 0x55
-		|| i.tm.base_opcode == 0x57
-		|| i.tm.base_opcode == 0xdf
-		|| i.tm.base_opcode == 0xef
-		|| i.tm.base_opcode == 0xf8
-		|| i.tm.base_opcode == 0xf9
-		|| i.tm.base_opcode == 0xfa
-		|| i.tm.base_opcode == 0xfb
-		|| i.tm.base_opcode == 0x42
-		|| i.tm.base_opcode == 0x47)
-	       && i.tm.extension_opcode == None))
+	   && i.tm.opcode_modifier.opcodespace == SPACE_0F
+	   && ((i.tm.base_opcode | 2) == 0x57
+	       || i.tm.base_opcode == 0xdf
+	       || i.tm.base_opcode == 0xef
+	       || (i.tm.base_opcode | 3) == 0xfb
+	       || i.tm.base_opcode == 0x42
+	       || i.tm.base_opcode == 0x47))
     {
       /* Optimize: -O1:
 	   VOP, one of vandnps, vandnpd, vxorps, vxorpd, vpsubb, vpsubd,
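
For anyone who wants to convince themselves that the folding in the hunk above does not change which base opcodes match, here is a minimal sketch. It only models the base_opcode part of the conditional (not the new opcode space check), and the function names are hypothetical helpers for illustration, not anything in gas. `(op | 2) == 0x57` accepts exactly 0x55 and 0x57, and `(op | 3) == 0xfb` accepts exactly 0xf8 through 0xfb.

```c
#include <stdbool.h>

/* Hypothetical helper: the opcode test as written before the patch,
   i.e. ten explicit comparisons.  */
bool
match_unfolded (unsigned int op)
{
  return (op == 0x55 || op == 0x57 || op == 0xdf || op == 0xef
          || op == 0xf8 || op == 0xf9 || op == 0xfa || op == 0xfb
          || op == 0x42 || op == 0x47);
}

/* Hypothetical helper: the opcode test as written after the patch,
   with two groups folded by OR-ing in the low bits that may vary.  */
bool
match_folded (unsigned int op)
{
  return ((op | 2) == 0x57          /* 0x55, 0x57 */
          || op == 0xdf
          || op == 0xef
          || (op | 3) == 0xfb       /* 0xf8 ... 0xfb */
          || op == 0x42
          || op == 0x47);
}

/* Count the values in [0, 0xffff] on which the two forms disagree.  */
unsigned int
count_mismatches (void)
{
  unsigned int op, bad = 0;

  for (op = 0; op <= 0xffff; op++)
    if (match_unfolded (op) != match_folded (op))
      bad++;
  return bad;
}
```

Calling count_mismatches () from a small test program should return 0, since the OR masks only ignore the bits that actually differ within each folded group.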


* Re: [PATCH] x86: improve/shorten vector zeroing-idiom optimization conditional
  2022-08-02 15:20 [PATCH] x86: improve/shorten vector zeroing-idiom optimization conditional Jan Beulich
@ 2022-08-02 15:56 ` H.J. Lu
From: H.J. Lu @ 2022-08-02 15:56 UTC
  To: Jan Beulich; +Cc: Binutils

On Tue, Aug 2, 2022 at 8:20 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> - Drop the rounding type check: We're past template matching, and none
>   of the involved insns support embedded rounding.
> - Drop the extension opcode check: None of the involved opcodes have
>   variants with it being other than None.
> - Instead check opcode space, even if just to be on the safe side going
>   forward.
> - Reduce the number of comparisons by folding two groups.
>
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -4329,24 +4329,19 @@ optimize_encoding (void)
>            && !i.types[2].bitfield.xmmword
>            && (i.tm.opcode_modifier.vex
>                || ((!i.mask.reg || i.mask.zeroing)
> -                  && i.rounding.type == rc_none
>                    && is_evex_encoding (&i.tm)
>                    && (i.vec_encoding != vex_encoding_evex
>                        || cpu_arch_isa_flags.bitfield.cpuavx512vl
>                        || i.tm.cpu_flags.bitfield.cpuavx512vl
>                        || (i.tm.operand_types[2].bitfield.zmmword
>                            && i.types[2].bitfield.ymmword))))
> -          && ((i.tm.base_opcode == 0x55
> -               || i.tm.base_opcode == 0x57
> -               || i.tm.base_opcode == 0xdf
> -               || i.tm.base_opcode == 0xef
> -               || i.tm.base_opcode == 0xf8
> -               || i.tm.base_opcode == 0xf9
> -               || i.tm.base_opcode == 0xfa
> -               || i.tm.base_opcode == 0xfb
> -               || i.tm.base_opcode == 0x42
> -               || i.tm.base_opcode == 0x47)
> -              && i.tm.extension_opcode == None))
> +          && i.tm.opcode_modifier.opcodespace == SPACE_0F
> +          && ((i.tm.base_opcode | 2) == 0x57
> +              || i.tm.base_opcode == 0xdf
> +              || i.tm.base_opcode == 0xef
> +              || (i.tm.base_opcode | 3) == 0xfb
> +              || i.tm.base_opcode == 0x42
> +              || i.tm.base_opcode == 0x47))
>      {
>        /* Optimize: -O1:
>            VOP, one of vandnps, vandnpd, vxorps, vxorpd, vpsubb, vpsubd,

OK.

Thanks.

-- 
H.J.
