Message-Id: <5C90A88F020000780022025A@prv1-mh.provo.novell.com>
Date: Tue, 19 Mar 2019 08:30:00 -0000
From: "Jan Beulich"
To: "H.J. Lu"
Subject: Re: [PATCH] x86: Correct EVEX vector load/store optimization
References: <20190315235414.11609-1-hjl.tools@gmail.com> <20190317204712.GA6721@gmail.com> <5C8FA1E8020000780021FE41@prv1-mh.provo.novell.com>

>>> On 19.03.19 at 07:20, wrote:
> On Mon, Mar 18, 2019 at 9:49 PM Jan Beulich wrote:
>>
>> >>> On 17.03.19 at 21:47, wrote:
>> > --- a/gas/config/tc-i386.c
>> > +++ b/gas/config/tc-i386.c
>> > @@ -4075,6 +4075,56 @@ optimize_encoding (void)
>> >             i.types[j].bitfield.ymmword = 0;
>> >           }
>> >       }
>> > +  else if ((cpu_arch_flags.bitfield.cpuavx
>> > +            || cpu_arch_isa_flags.bitfield.cpuavx)
>>
>> Once again a questionable condition, as per earlier replies to
>> other patches of yours.
>
> Fixed.
>
>> > +           && i.vec_encoding != vex_encoding_evex
>> > +           && !i.types[0].bitfield.zmmword
>> > +           && !i.mask
>> > +           && is_evex_encoding (&i.tm)
>> > +           && (i.tm.base_opcode == 0x666f
>> > +               || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0x666f
>> > +               || i.tm.base_opcode == 0xf36f
>> > +               || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf36f
>> > +               || i.tm.base_opcode == 0xf26f
>> > +               || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf26f)
>>
>> All three of these can be expressed with just a single comparison,
>> using & or | instead of ^ and (if necessary) adjusting the literal
>> value compared against.
>
> Fixed.
>
>> > +           && i.tm.extension_opcode == None)
>> > +    {
>> > +      /* Optimize: -O1:
>> > +           VOP, one of vmovdqa32, vmovdqa64, vmovdqu8, vmovdqu16,
>> > +           vmovdqu32 and vmovdqu64:
>> > +             EVEX VOP %xmmM, %xmmN
>> > +               -> VEX vmovdqa|vmovdqu %xmmM, %xmmN (M and N < 16)
>> > +             EVEX VOP %ymmM, %ymmN
>> > +               -> VEX vmovdqa|vmovdqu %ymmM, %ymmN (M and N < 16)
>> > +             EVEX VOP %xmmM, mem
>> > +               -> VEX vmovdqa|vmovdqu %xmmM, mem (M < 16)
>> > +             EVEX VOP %ymmM, mem
>> > +               -> VEX vmovdqa|vmovdqu %ymmM, mem (M < 16)
>> > +             EVEX VOP mem, %xmmN
>> > +               -> VEX mvmovdqa|vmovdquem, %xmmN (N < 16)
>>
>> There's some confusion on this line.
>>
>> > +             EVEX VOP mem, %ymmN
>> > +               -> VEX vmovdqa|vmovdqu mem, %ymmN (N < 16)
>> > +       */
>>
>> For the variants with a memory operand I doubt the conversion
>> is always a win, and it may be against the user request in case of
>> -Os. This is because of the Disp8 scaling the EVEX encoding permits.
>
> Fixed.
>
>> > +      if (i.tm.base_opcode == 0xf26f)
>> > +        i.tm.base_opcode = 0xf36f;
>> > +      else if ((i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf26f)
>> > +        i.tm.base_opcode = 0xf36f ^ Opcode_SIMD_IntD;
>>
>> This again can be expressed without "else if()" afaict.
>>
>
> Fixed.
>
> Here is the patch. Thanks.
>--- a/gas/config/tc-i386.c
>+++ b/gas/config/tc-i386.c
>@@ -4068,18 +4068,14 @@ optimize_encoding (void)
>             i.types[j].bitfield.ymmword = 0;
>           }
>       }
>-  else if ((cpu_arch_flags.bitfield.cpuavx
>-            || cpu_arch_isa_flags.bitfield.cpuavx)
>-           && i.vec_encoding != vex_encoding_evex
>+  else if (i.vec_encoding != vex_encoding_evex
>            && !i.types[0].bitfield.zmmword

Ah, here the remaining cpuavx goes away as well.

>+      if ((i.tm.base_opcode & ~Opcode_SIMD_IntD) == 0xf26f)
>+        {
>+          i.tm.base_opcode &= Opcode_SIMD_IntD;
>+          i.tm.base_opcode |= 0xf36f;
>+        }

How about the even simpler

      if ((i.tm.base_opcode & ~Opcode_SIMD_IntD) == 0xf26f)
        i.tm.base_opcode ^= 0xf36f ^ 0xf26f;

?

Jan
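
For reference, the three forms of the 0xf26f -> 0xf36f rewrite discussed above can be compared in isolation. The following is only a standalone sketch, not code from gas: the function names are made up for illustration, and SIMD_INT_D is assumed to have the value 0x10 (the bit that separates the 0x6f load opcode from the 0x7f store opcode), standing in for Opcode_SIMD_IntD. Under that assumption all three forms should agree on every 16-bit opcode value.

```c
#include <assert.h>

/* Assumption: stands in for gas's Opcode_SIMD_IntD, taken here to be 0x10,
   i.e. the bit distinguishing the 0x6f (load) and 0x7f (store) forms.  */
#define SIMD_INT_D 0x10u

/* Hypothetical helper: the original "if / else if" form from the first patch.  */
static unsigned int
remap_else_if (unsigned int op)
{
  if (op == 0xf26f)
    op = 0xf36f;
  else if ((op ^ SIMD_INT_D) == 0xf26f)
    op = 0xf36f ^ SIMD_INT_D;
  return op;
}

/* Hypothetical helper: the revised patch, which keeps only the direction
   bit and then ORs in the new opcode.  */
static unsigned int
remap_mask_or (unsigned int op)
{
  if ((op & ~SIMD_INT_D) == 0xf26f)
    {
      op &= SIMD_INT_D;
      op |= 0xf36f;
    }
  return op;
}

/* Hypothetical helper: the suggested form.  0xf36f ^ 0xf26f is 0x0100, so
   the XOR flips exactly the one bit in which the two opcodes differ and
   leaves the direction bit untouched.  */
static unsigned int
remap_xor (unsigned int op)
{
  if ((op & ~SIMD_INT_D) == 0xf26f)
    op ^= 0xf36f ^ 0xf26f;
  return op;
}

/* Returns the number of 16-bit opcode values checked; asserts that the
   three forms agree on each of them.  */
static unsigned int
check_forms_agree (void)
{
  unsigned int op, checked = 0;

  for (op = 0; op <= 0xffff; op++)
    {
      assert (remap_else_if (op) == remap_mask_or (op));
      assert (remap_mask_or (op) == remap_xor (op));
      checked++;
    }
  return checked;
}
```

The same masking idea is what allows each pair of comparisons in the condition to collapse into one: `(op & ~SIMD_INT_D) == 0x666f` matches both the load and the store form.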