public inbox for binutils@sourceware.org
From: Jan Beulich <jbeulich@suse.com>
To: "Cui, Lili" <lili.cui@intel.com>
Cc: "hjl.tools@gmail.com" <hjl.tools@gmail.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: Re: FW: [PATCH 1/2] [PATCH 1/2] Enable Intel AVX512_FP16 instructions
Date: Tue, 20 Jul 2021 16:15:18 +0200
Message-ID: <213b8124-b67f-6c12-f3a1-605770b3ec3d@suse.com>
In-Reply-To: <BY5PR11MB4008045B744E6FE7566AD4229EE29@BY5PR11MB4008.namprd11.prod.outlook.com>

On 20.07.2021 15:38, Cui, Lili wrote:
> 
> 
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Tuesday, July 20, 2021 9:03 PM
>> To: Cui, Lili <lili.cui@intel.com>
>> Cc: hjl.tools@gmail.com; binutils@sourceware.org
>> Subject: Re: FW: [PATCH 1/2] [PATCH 1/2] Enable Intel AVX512_FP16
>> instructions
>>
>> On 20.07.2021 13:26, Cui, Lili wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: Tuesday, July 20, 2021 4:46 PM
>>>> To: Cui, Lili <lili.cui@intel.com>
>>>> Cc: hjl.tools@gmail.com; binutils@sourceware.org
>>>> Subject: Re: FW: [PATCH 1/2] [PATCH 1/2] Enable Intel AVX512_FP16
>>>> instructions
>>>>
>>>> On 20.07.2021 09:08, Cui, Lili wrote:
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>> Sent: Wednesday, July 14, 2021 11:21 PM
>>>>>> To: Cui, Lili <lili.cui@intel.com>
>>>>>> Cc: hjl.tools@gmail.com; binutils@sourceware.org
>>>>>> Subject: Re: [PATCH 1/2] [PATCH 1/2] Enable Intel AVX512_FP16
>>>>>> instructions
>>>>>>
>>>>>> On 13.07.2021 08:58, Cui, Lili wrote:
>>>>>>
>>>>>> Disassembler:
>>>>>>
>>>>>> d_scalar_mode looks to be unused.
>>>>>>
>>>>>> This
>>>>>>
>>>>>>   /* EVEX_W_MAP5_2A_P_1 */
>>>>>>   {
>>>>>>     { "vcvtsi2sh{%LQ|}",	{ XMScalar, VexScalar, EXxEVexR, Ed }, 0 },
>>>>>>     { "vcvtsi2sh{%LQ|}",	{ XMScalar, VexScalar, EXxEVexR, Eq }, 0 },
>>>>>>   },
>>>>>>
>>>>>> can imo be expressed without decoding EVEX.W, by using Edq instead
>>>>>> of (separately) Ed and Eq. There's at least one similar case
>>>>>> elsewhere. Interestingly in the 2si/2usi conversions you do use Gdq
>>>>>> already, which I think handles the EVEX.W=1 case correctly outside
>>>>>> of 64-bit mode (unlike Eq, which will unconditionally produce
>>>>>> 64-bit register names afaict).
>>>>>>
>>>>>> As to a broader question on decoding EVEX.W: Did you consider
>>>>>> introducing e.g. %XH (paralleling %XW, just that EVEX.W=1 is not a
>>>>>> valid encoding), to avoid this decode step for perhaps almost all
>>>>>> entries? And if that's not an option, decoding EVEX.W first for all
>>>>>> the opcodes which previously had no meaning at all would, in some
>>>>>> cases, reduce the overall number of table entries (and in all other
>>>>>> cases this would then merely be for consistency, as it also
>>>>>> wouldn't increase the number of table entries). To give an example:
>>>>>>
>>>>>>     { PREFIX_TABLE (PREFIX_EVEX_0F3AC2) },
>>>>>>
>>>>>> =>
>>>>>>
>>>>>>   /* PREFIX_EVEX_0F3AC2 */
>>>>>>   {
>>>>>>     { VEX_W_TABLE (EVEX_W_0F3AC2_P_0) },
>>>>>>     { VEX_W_TABLE (EVEX_W_0F3AC2_P_1) },
>>>>>>   },
>>>>>>
>>>>>> =>
>>>>>>
>>>>>>   /* EVEX_W_0F3AC2_P_0 */
>>>>>>   {
>>>>>>     { "vcmpph",	{ XMask, Vex, EXxh, EXxEVexS, Ib }, 0 },
>>>>>>   },
>>>>>>   /* EVEX_W_0F3AC2_P_1 */
>>>>>>   {
>>>>>>     { "vcmpsh",	{ XMask, VexScalar, EXxmm_mw, EXxEVexS, Ib }, 0 },
>>>>>>   },
>>>>>>
>>>>>> i.e. a total of 1 + 4 + 2 * 2 entries. Whereas decoding W first
>>>>>> would yield 1 (evex) + 2 (evex_w) + 4 (prefix) entries.
>>>>>
>>>>> Hi Jan,
>>>>>
>>>>> Do you want me to change it like this?
>>>>>      { PREFIX_TABLE (PREFIX_EVEX_0F3AC2) },
>>>>>
>>>>>  =>
>>>>>
>>>>>    /* PREFIX_EVEX_0F3AC2 */
>>>>>    {
>>>>>      { "vcmp%XH",	{ XMask, Vex, EXxh, EXxEVexS, Ib }, 0 },
>>>>>      { "vcmp%XH",	{ XMask, VexScalar, EXxmm_mw, EXxEVexS, Ib }, 0 },
>>>>>    },
>>>>>
>>>>> "XH" => print 'ph', 'sh' depending on the EVEX.ll bit, if EVEX.W==W1
>>>>> report bad code.
>>>>> if  (EVEX.LL== EVEX.LLIG)
>>>>>       print 'sh'
>>>>> else
>>>>>       print 'ph'
>>>>
>>>> Not exactly, no. %XH was meant to parallel %XW, which prints 's' or 'd'
>>>> depending on VEX.W. %XH would print 'h' if EVEX.W is clear and
>>>> produce an appropriate indication of the encoding being bad if EVEX.W
>>>> is set.
>>>> IOW something like
>>>>
>>>>    /* PREFIX_EVEX_0F3AC2 */
>>>>    {
>>>>      { "vcmpp%XH",	{ XMask, Vex, EXxh, EXxEVexS, Ib }, 0 },
>>>>      { "vcmps%XH",	{ XMask, VexScalar, EXxmm_mw, EXxEVexS, Ib }, 0 },
>>>>    },
>>>>
>>>>>> The delta is even larger for something like MAP5_7D: 1 + 4 + 4 * 2
>>>>>> vs. 1 + 2 + 4. This also results in more related entries ending up
>>>>>> closer to one another.
>>>>>>
>>>>> I don't quite understand here. Should I let all FP16 disassembler
>>>>> entries go through W_TABLE first, or just add something like %XH
>>>>> instead of going through W_TABLE? Thanks.
>>>>
>>>> Where beneficial you will want to decode EVEX.W first, yes. Unless,
>>>> as per above, you can avoid that decoding step altogether by using %XH.
>>>>
>>> I prefer to decode EVEX.W first instead of using %XH.
>>
>> Well, I can't stop you from avoiding %XH, but I did intentionally say "Where
>> beneficial you will want to ...". That is, I think that where possible you should
>> use %XH, and only where that's not suitable decode EVEX.W (typically earlier
>> than EVEX.pp).
>>
> OK, for instructions where EVEX.W cannot be decoded earlier than EVEX.pp, I will use %XH.

I'm sorry, but no, it's the other way around. The first check would be whether
%XH can be used. Only then would you check whether decoding EVEX.W first is at
least no worse than decoding EVEX.pp first; I think there will be few if any
cases where decoding EVEX.pp first is beneficial.
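
To make the two points in this thread concrete, here is a minimal C sketch.
It is not code from the patch or from the disassembler. The function names
(expand_xh, entries_pp_first, entries_w_first) and the "{bad}" marker text
are assumptions made purely for illustration. The first function mimics what
%XH is described as doing above: emit 'h' when EVEX.W is clear and flag the
encoding as bad when it is set. The other two reproduce the table entry
counts quoted earlier.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical marker for an invalid encoding; the text the real
   disassembler would print is not stated in this thread.  */
#define BAD_MARKER "{bad}"

/* Expand every "%XH" in TMPL into OUT (OUTSZ bytes including the NUL):
   'h' when EVEX_W is 0, BAD_MARKER otherwise.  All other characters are
   copied unchanged.  Returns 0 on success, -1 if OUT is too small.  */
static int
expand_xh (const char *tmpl, int evex_w, char *out, size_t outsz)
{
  size_t used = 0;

  assert (tmpl != NULL && out != NULL);
  if (outsz == 0)
    return -1;
  while (*tmpl != '\0')
    {
      const char *piece;
      size_t len;

      if (strncmp (tmpl, "%XH", 3) == 0)
        {
          piece = evex_w ? BAD_MARKER : "h";
          len = strlen (piece);
          tmpl += 3;
        }
      else
        {
          piece = tmpl;
          len = 1;
          tmpl += 1;
        }
      /* Always leave room for the terminating NUL.  */
      if (used + len + 1 > outsz)
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }
  out[used] = '\0';
  return 0;
}

/* Entries when EVEX.pp is decoded first: 1 opcode-table slot, 4 prefix
   slots, and a 2-entry EVEX.W table for each populated prefix.  */
static unsigned
entries_pp_first (unsigned populated_prefixes)
{
  return 1 + 4 + populated_prefixes * 2;
}

/* Entries when EVEX.W is decoded first: 1 opcode-table slot, 2 EVEX.W
   slots, and a single 4-entry prefix table under W=0.  */
static unsigned
entries_w_first (void)
{
  return 1 + 2 + 4;
}
```

With these definitions, expand_xh ("vcmps%XH", 0, buf, sizeof buf) leaves
"vcmpsh" in buf. entries_pp_first (2) and entries_pp_first (4) give the
0F3AC2 and MAP5_7D totals quoted above, to be compared against
entries_w_first (). Where %XH applies, neither extra table level is needed.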

Jan


