Re: [PATCH 2/8] Support APX GPR32 with extend evex prefix

public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed

From: Jan Beulich <jbeulich@suse.com>
To: "Cui, Lili" <lili.cui@intel.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: Re: [PATCH 2/8] Support APX GPR32 with extend evex prefix
Date: Wed, 18 Oct 2023 08:40:53 +0200	[thread overview]
Message-ID: <fa567d75-7d91-15ce-aa2b-cbfd087b0183@suse.com> (raw)
In-Reply-To: <SJ0PR11MB56008D045F483814CCBB895E9ED6A@SJ0PR11MB5600.namprd11.prod.outlook.com>

On 17.10.2023 17:48, Cui, Lili wrote:
>> On 19.09.2023 17:25, Cui, Lili wrote:
> 
>>> --- a/gas/config/tc-i386.c
> 
>>> +++ b/gas/config/tc-i386.c
> 
>>> @@ -1945,6 +1945,30 @@ cpu_flags_match (const insn_template *t)
> 
>>>                         && (!x.bitfield.cpuvpclmulqdq ||
> 
>> cpu.bitfield.cpuvpclmulqdq))
> 
>>>                       match |= CPU_FLAGS_ARCH_MATCH;
> 
>>>             }
> 
>>> +        else if (x.bitfield.cpuapx_f)
> 
>>> +          {
> 
>>> +            if (cpu.bitfield.cpuapx_f
> 
>>> +                      && (!x.bitfield.cpumovbe || cpu.bitfield.cpumovbe)
> 
>>> +                      && (!x.bitfield.cpuept || cpu.bitfield.cpuept)
> 
>>> +                      && (!x.bitfield.cpuinvpcid || cpu.bitfield.cpuinvpcid)
> 
>>> +                      && (!x.bitfield.cpusse4_2 || cpu.bitfield.cpusse4_2)
> 
>>> +                      && (!x.bitfield.cpubmi2 || cpu.bitfield.cpubmi2)
> 
>>> +                      && (!x.bitfield.cpubmi || cpu.bitfield.cpubmi)
> 
>>> +                      && (!x.bitfield.cpuadx || cpu.bitfield.cpuadx)
> 
>>> +                      && (!x.bitfield.cpusha || cpu.bitfield.cpusha)
> 
>>> +                      && (!x.bitfield.cpuavx512bw || cpu.bitfield.cpuavx512bw)
> 
>>> +                      && (!x.bitfield.cpuavx512dq || cpu.bitfield.cpuavx512dq)
> 
>>> +                      && (!x.bitfield.cpuavx512f || cpu.bitfield.cpuavx512f)
> 
>>> +                      && (!x.bitfield.cpushstk || cpu.bitfield.cpushstk)
> 
>>> +                      && (!x.bitfield.cpumovdir64b || cpu.bitfield.cpumovdir64b)
> 
>>> +                      && (!x.bitfield.cpumovdiri || cpu.bitfield.cpumovdiri)
> 
>>> +                      && (!x.bitfield.cpuenqcmd || cpu.bitfield.cpuenqcmd)
> 
>>> +                      && (!x.bitfield.cpukl || cpu.bitfield.cpukl)
> 
>>> +                      && (!x.bitfield.cpuwidekl || cpu.bitfield.cpuwidekl)
> 
>>> +                      && (!x.bitfield.cpucmpccxadd || cpu.bitfield.cpucmpccxadd)
> 
>>> +                      && (!x.bitfield.cpurao_int || cpu.bitfield.cpurao_int))
> 
>>> +                    match |= CPU_FLAGS_ARCH_MATCH;
> 
>>> +          }
> 
>>>           else
> 
>>>             match |= CPU_FLAGS_ARCH_MATCH;
> 
>>>
> 
>>
> 
>> This is getting unwieldy, so I think we will need to think of a better way of
> 
>> expressing both "multiple ISAs need to be enabled" and "one of a set of ISAs
> 
>> needs to be enabled". It's only the mix of these expressed in a uniform way in
> 
>> the insn table that requires these extra conditionals. With the size of
> 
>> i386_cpu_attr greatly shrunk as of recently, I wonder if we couldn't simply add
> 
>> a 2nd instance of it to insn_template. One would be "all of these are required",
> 
>> while the other would be "any one of these is sufficient".
> 
>>
> 
> 
> 
> I didn't find a better way to distinguish these two types of requirements in insn_template.
> 
> I wrote a new function "cpu_flags_not_or_check" to replace these clumsy judgments.
> 
> 
> 
> static INLINE int
> 
> cpu_flags_not_or_check (const union i386_cpu_flags *x,
> 
>                         const union i386_cpu_flags *y)
> 
> {
> 
>   switch (ARRAY_SIZE(x->array))
> 
>     {
> 
>     case 5:
> 
>       if ((~x->array[4] | y->array[4]) != 0xffffffff)
> 
>         return 0;
> 
>       /* Fall through.  */
> 
>     case 4:
> 
>       if ((~x->array[3] | y->array[3]) != 0xffffffff)
> 
>         return 0;
> 
>       /* Fall through.  */
> 
>     case 3:
> 
>       if ((~x->array[2] | y->array[2]) != 0xffffffff)
> 
>         return 0;
> 
>       /* Fall through.  */
> 
>     case 2:
> 
>       if ((~x->array[1] | y->array[1]) != 0xffffffff)
> 
>         return 0;
> 
>       /* Fall through.  */
> 
>     case 1:
> 
>      return ((~x->array[1] | y->array[1]) == 0Xffffffff);
> 
>       break;
> 
>     default:
> 
>       abort ();
> 
>     }
> 
> }

Without seeing how this is used I can't comment on it. It feels though
as if you may not have fully understood my earlier reply: Even prior
to APX we already have cases where one CPU specifier in the opcode
table using | means "both" and another means "either". I think we want
to split that, and thus simplify the logic in cpu_flags_match(). That's
separate prereq work, of course.

Thing is that prior to 734dfd1cc966 this would have been prohibitively
expensive in terms of table size growth. But now we can afford having
two i386_cpu_attr fields, one meaning "all of these", the other meaning
"any of these". To limit churn in the opcode table, I'd be inclined to
continue to express CPU requirements in a single field there, using e.g.
(CpuA|CpuB)&CpuC&CpuD (Cpu prefixes re-added here for clarity, even if
they aren't present in the opcode table anymore).

I'd be happy to do that prereq work (if we can agree on the approach),
but it may mean a little bit of delay, as after my vacation I need to
catch up with a few other thinghs first.

>>> +  if (i.prefix[DATA_PREFIX] != 0)
> 
>>> +    {
> 
>>> +      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
> 
>>> +      i.prefix[DATA_PREFIX] = 0;
> 
>>> +    }
> 
>>
> 
>> While this looks to be correct for the case when the prefix was derived from an
> 
>> insn template and the use of 16-bit operands, I don't think it is uniformly
> 
>> correct when "data16" was used as a prefix explicitly. In such a case either
> 
>> REX2 encoding needs to be used, or an error needs emitting.
> 
>>
> 
>> You may further want to assert that i.tm.opcode_modifier.opcodeprefix is still
> 
>> zero ahead of the assignment.
> 
>>
> 
> 
> 
> For REX2 encoding, we add no special handling, just follow REX.
> 
> For EVEX-promoted encoding, such as “data16 aand   %r25d,0x123(%r31,%rax,4)”, the following existing code will report an error.
> 
> 
> 
>   if (is_any_vex_encoding (&i.tm)
> 
>       || i.tm.operand_types[i.imm_operands].bitfield.class >= RegMMX
> 
>       || i.tm.operand_types[i.imm_operands + 1].bitfield.class >= RegMMX)
> 
>     {
> 
>       /* Check for data size prefix on VEX/XOP/EVEX encoded and SIMD insns.  */
> 
>       if (i.prefix[DATA_PREFIX])
> 
>         {
> 
>           as_bad (_("data size prefix invalid with `%s'"), insn_name (&i.tm));
> 
>           return;
> 
>         }

Thinking of it, I may need to revise my earlier comment some: RAO-INT insns
are a bad example here, since despite being legacy-encoded they don't permit
a data16 prefix to specify 16-bit operand size. Consider the same for e.g.
AND. The legacy form permits use of data16 (leaving aside that it's better /
clearer to simply use 16-bit register names), so the promoted forms likely
ought to permit such as well. IOW perhaps the code you have is correct, but
the check you quote may need adjusting.

>>> @@ -1121,6 +1122,15 @@ process_i386_opcode_modifier (FILE *table,
> 
>> char *mod, unsigned int space,
> 
>>>         fprintf (stderr,
> 
>>>                        "%s: %d: W modifier without Word/Dword/Qword
> 
>> operand(s)\n",
> 
>>>                        filename, lineno);
> 
>>> +      if (modifiers[Vex].value
> 
>>> +        || (space > SPACE_0F
> 
>>> +            && !(space == SPACE_EVEXMAP4
> 
>>> +                       || modifiers[EVex].value
> 
>>> +                       || modifiers[Disp8MemShift].value
> 
>>> +                       || modifiers[Broadcast].value
> 
>>> +                       || modifiers[Masking].value
> 
>>> +                       || modifiers[SAE].value)))
> 
>>
> 
>> First of all, this wants simplifying to
> 
>>
> 
>>       if (modifiers[Vex].value
> 
>>              || (space > SPACE_0F
> 
>>                  && space != SPACE_EVEXMAP4
> 
>>                  && !modifiers[EVex].value
> 
>>                  && !modifiers[Disp8MemShift].value
> 
>>                  && !modifiers[Broadcast].value
> 
>>                  && !modifiers[Masking].value
> 
>>                  && !modifiers[SAE].value))
> 
>>
> 
>> which helps readability and makes more obvious that this parallels tc-
> 
>> i386.c:is_evex_encoding(). Such a connection, where updates need to be
> 
>> made in sync, needs pointing out in code comments at both sites.
> 
>>
> 
> 
> 
> Done.
> 
> 
> 
>> Yet of course this condition won't hold anymore for combined VEX/EVEX
> 
>> templates.
> 
>>
> 
> 
> 
> I rebased master and as you predicted this doesn't work, one entry contains both VEX and EVEX, VEX requires No_egpr=1 and EVEX requires No_egpr=0,
> 
> Finally I chose to add "No_egpr=1" for it. And added the following judgment in check_EgprOperands.
> 
> 
> 
> check_EgprOperands (const insn_template *t)
> 
> {
> 
> -  if (t->opcode_modifier.noegpr)
> 
> if (t->opcode_modifier.noegpr && !need_evex_encoding())

I'll need to look at this in context, so I can't comment right here.

Also, just to mention it: Something's wrong with your reply (also visible
in the list archive copy), harming readability quite a bit. There are
extra blank lines between any two real ones.

Jan

next prev parent reply	other threads:[~2023-10-18  6:40 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-19 15:25 [PATCH 0/8] [RFC] Support Intel APX EGPR Cui, Lili
2023-09-19 15:25 ` [PATCH 1/8] Support APX GPR32 with rex2 prefix Cui, Lili
2023-09-21 15:27   ` Jan Beulich
2023-09-27 15:57     ` Cui, Lili
2023-09-21 15:51   ` Jan Beulich
2023-09-27 15:59     ` Cui, Lili
2023-09-28  8:02       ` Jan Beulich
2023-10-07  3:27         ` Cui, Lili
2023-09-19 15:25 ` [PATCH 2/8] Support APX GPR32 with extend evex prefix Cui, Lili
2023-09-22 10:12   ` Jan Beulich
2023-10-17 15:48     ` Cui, Lili
2023-10-18  6:40       ` Jan Beulich [this message]
2023-10-18 10:44         ` Cui, Lili
2023-10-18 10:50           ` Jan Beulich
2023-09-22 10:50   ` Jan Beulich
2023-10-17 15:50     ` Cui, Lili
2023-10-17 16:11       ` Jan Beulich
2023-10-18  2:02         ` Cui, Lili
2023-10-18  6:10           ` Jan Beulich
2023-09-25  6:03   ` Jan Beulich
2023-10-17 15:52     ` Cui, Lili
2023-10-17 16:12       ` Jan Beulich
2023-10-18  6:31         ` Cui, Lili
2023-10-18  6:47           ` Jan Beulich
2023-10-18  7:52             ` Cui, Lili
2023-10-18  8:21               ` Jan Beulich
2023-10-18 11:30                 ` Cui, Lili
2023-10-19 11:58                   ` Cui, Lili
2023-10-19 15:24                     ` Jan Beulich
2023-10-19 16:38                       ` Cui, Lili
2023-10-20  6:25                         ` Jan Beulich
2023-10-22 14:33                           ` Cui, Lili
2023-09-19 15:25 ` [PATCH 3/8] Add tests for " Cui, Lili
2023-09-27 13:11   ` Jan Beulich
2023-10-17 15:53     ` FW: " Cui, Lili
2023-10-17 16:19       ` Jan Beulich
2023-10-18  2:32         ` Cui, Lili
2023-10-18  6:05           ` Jan Beulich
2023-10-18  7:16             ` Cui, Lili
2023-10-18  8:05               ` Jan Beulich
2023-10-18 11:26                 ` Cui, Lili
2023-10-18 12:06                   ` Jan Beulich
2023-10-25 16:03                     ` Cui, Lili
2023-09-27 13:19   ` Jan Beulich
2023-09-19 15:25 ` [PATCH 4/8] Support APX NDD Cui, Lili
2023-09-27 14:44   ` Jan Beulich
2023-10-22 14:05     ` Cui, Lili
2023-10-23  7:12       ` Jan Beulich
2023-10-25  8:10         ` Cui, Lili
2023-10-25  8:47           ` Jan Beulich
2023-10-25 15:49             ` Cui, Lili
2023-10-25 15:59               ` Jan Beulich
2023-09-28  7:57   ` Jan Beulich
2023-10-22 14:57     ` Cui, Lili
2023-10-24 11:39     ` Cui, Lili
2023-10-24 11:58       ` Jan Beulich
2023-10-25 15:29         ` Cui, Lili
2023-09-19 15:25 ` [PATCH 5/8] Support APX NDD optimized encoding Cui, Lili
2023-09-28  9:29   ` Jan Beulich
2023-10-23  2:57     ` Hu, Lin1
2023-10-23  7:23       ` Jan Beulich
2023-10-23  7:50         ` Hu, Lin1
2023-10-23  8:15           ` Jan Beulich
2023-10-24  1:40             ` Hu, Lin1
2023-10-24  6:03               ` Jan Beulich
2023-10-24  6:08                 ` Hu, Lin1
2023-10-23  3:07     ` [PATCH-V2] " Hu, Lin1
2023-10-23  3:30     ` [PATCH 5/8] [v2] " Hu, Lin1
2023-10-23  7:26       ` Jan Beulich
2023-09-19 15:25 ` [PATCH 6/8] Support APX Push2/Pop2 Cui, Lili
2023-09-28 11:37   ` Jan Beulich
2023-10-30 15:21     ` Cui, Lili
2023-10-30 15:31       ` Jan Beulich
2023-11-20 13:05         ` Cui, Lili
2023-09-19 15:25 ` [PATCH 7/8] Support APX NF Cui, Lili
2023-09-25  6:07   ` Jan Beulich
2023-09-28 12:42   ` Jan Beulich
2023-11-02 10:15     ` Cui, Lili
2023-11-02 10:23       ` Jan Beulich
2023-11-02 10:46         ` Cui, Lili
2023-12-12  2:59           ` H.J. Lu
2023-09-19 15:25 ` [PATCH 8/8] Support APX JMPABS Cui, Lili
2023-09-28 13:11   ` Jan Beulich
2023-11-02  2:32     ` Hu, Lin1

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fa567d75-7d91-15ce-aa2b-cbfd087b0183@suse.com \
    --to=jbeulich@suse.com \
    --cc=binutils@sourceware.org \
    --cc=hongjiu.lu@intel.com \
    --cc=lili.cui@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).