From: "Cui, Lili" <lili.cui@intel.com>
To: "Beulich, Jan" <JBeulich@suse.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
"binutils@sourceware.org" <binutils@sourceware.org>
Subject: RE: [PATCH v3 4/9] Support APX GPR32 with extend evex prefix
Date: Tue, 12 Dec 2023 12:32:03 +0000
Message-ID: <SJ0PR11MB5600B6C624DD37091E4738F69E8EA@SJ0PR11MB5600.namprd11.prod.outlook.com>
In-Reply-To: <61ef66ac-ae1c-4c57-b800-475437e225e6@suse.com>
> >>>>> @@ -3670,10 +3673,11 @@ install_template (const insn_template *t)
> >>>>>
> >>>>> /* Dual VEX/EVEX templates need stripping one of the possible variants. */
> >>>>> if (t->opcode_modifier.vex && t->opcode_modifier.evex)
> >>>>> - {
> >>>>> - if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
> >>>>> - || maybe_cpu (t, CpuFMA))
> >>>>> - && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)))
> >>>>> + {
> >>>>> + if (AVX512F(CpuAVX) || AVX512F(CpuAVX2) || AVX512F(CpuFMA)
> >>>>> + || AVX512VL(CpuAVX) || AVX512VL(CpuAVX2) || APX_F(CpuCMPCCXADD)
> >>>>> + || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F) || APX_F(CpuAVX512DQ)
> >>>>> + || APX_F(CpuAVX512BW) || APX_F(CpuBMI) || APX_F(CpuBMI2))
> >>>>> {
> >>>>> if (need_evex_encoding ())
> >>>>
> >>>> There are several issues here:
> >>>> - Why did you need to change (to the worse) the original code?
> >>>> - Why did you not model the addition after that original code?
> >>>> - How come APX_F (CpuAVX512*) constructs appear here, when no AVX512
> >>>> insn can be VEX-encoded?
> >>>
> >>> I don't understand what you mean, we have this combination.
> >>>
> >>> kmov<dq>, 0x<dq:kpfx>90, AVX512BW&(AVX512BW|APX_F),
> >>> Modrm|Vex128|EVex128|Space0F|VexW1|<dq:kvsz>|NoSuf, {
> >>> RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
> >>
> >> Oh, I'm sorry: I forgot about the mask register insns.
> >>
> >>>> - If these new macros are really needed for whatever reason, they
> >>>> shouldn't be added to opcodes/i386-opc.h when they're useful only in
> >>>> the assembler.
> >>>> - Style requires a blank before the opening parenthesis in function
> >>>> invocations (which also covers function-like macro invocations).
> >>>>
> >>>> I think I asked before: How is it that you get away without
> >>>> altering cpu_flags_match(), containing related and quite similar logic?
> >>>>
> >>>
> >>> For the original logic ( ... || ... ) && ( ... || ... ), the content
> >>> in the first bracket and the content in the following brackets can
> >>> be combined arbitrarily. I think it is inaccurate.
> >>
> >> In which way? If there are issues with the existing code, these
> >> issues want taking care of in separate (prereq) patches. Of course
> >> there are assumptions made here about the CPU combinations that can
> >> (and cannot) occur in any of our templates. Similar assumptions are imo
> >> fine to make in the APX additions.
> >>
> >> Note how I used two nested if()s despite that not having been
> >> necessary at that time. I did so in anticipation that for APX you'd
> >> want to add another (separate) inner if(), rather than altering the
> >> one that's there.
> >
> > Could we remove the CPU check here? It's a bit ugly and has limited
> > effectiveness.
> >
> > if (t->opcode_modifier.vex && t->opcode_modifier.evex)
> > {
> > if (AVX512F(CpuAVX) || AVX512F(CpuAVX2) || AVX512F(CpuFMA)
> > || AVX512VL(CpuAVX) || AVX512VL(CpuAVX2) || APX_F(CpuCMPCCXADD)
> > || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F) || APX_F(CpuAVX512DQ)
> > || APX_F(CpuAVX512BW) || APX_F(CpuBMI) || APX_F(CpuBMI2))
>
> I agree on the "a bit ugly" part, but taking what's there right now I don't
> understand "has limited effectiveness". Of course you can remove any code
> you want, provided you can prove nothing breaks.
>
This part is about install_template().
All I can say is that after removing the CPU check, no test cases failed. I know it is hard to convince you that the check can be deleted; if not, what do you suggest doing with it? APX requires a change here, otherwise the test cases will fail.
- if (AVX512F(CpuAVX) || AVX512F(CpuAVX2) || AVX512F(CpuFMA)
- || AVX512VL(CpuAVX) || AVX512VL(CpuAVX2) || APX_F(CpuCMPCCXADD)
- || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F) || APX_F(CpuAVX512DQ)
- || APX_F(CpuAVX512BW) || APX_F(CpuBMI) || APX_F(CpuBMI2))
- {
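
To make the "combined arbitrarily" point concrete, here is a minimal sketch comparing the original ( ... || ... ) && ( ... || ... ) shape with the pairwise list, restricted to the AVX/AVX512 pairs. The flag constants, function names, and the reading of a test such as AVX512VL(CpuAVX) as "both bits set" are assumptions made for illustration only; they are not the real definitions from i386-opc.h or the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for template CPU feature bits; these are not
   the real i386-opc.h definitions.  */
enum
{
  F_AVX      = 1u << 0,
  F_AVX2     = 1u << 1,
  F_FMA      = 1u << 2,
  F_AVX512F  = 1u << 3,
  F_AVX512VL = 1u << 4
};

/* Original shape: (A || B || C) && (D || E).  Any member of the first
   group may pair with any member of the second group.  */
bool
cross_product_check (unsigned int flags)
{
  return (flags & (F_AVX | F_AVX2 | F_FMA)) != 0
         && (flags & (F_AVX512F | F_AVX512VL)) != 0;
}

/* Assumed meaning of one pair test: both feature bits are present.  */
bool
has_pair (unsigned int flags, unsigned int evex_bit, unsigned int vex_bit)
{
  return (flags & evex_bit) != 0 && (flags & vex_bit) != 0;
}

/* Pairwise shape: only the five AVX/AVX512 pairs from the list pass.  */
bool
pairwise_check (unsigned int flags)
{
  return has_pair (flags, F_AVX512F, F_AVX)
         || has_pair (flags, F_AVX512F, F_AVX2)
         || has_pair (flags, F_AVX512F, F_FMA)
         || has_pair (flags, F_AVX512VL, F_AVX)
         || has_pair (flags, F_AVX512VL, F_AVX2);
}
```

In this model, a template carrying only FMA and AVX512VL passes cross_product_check () but not pairwise_check (), because no FMA/AVX512VL pair is listed. Whether such a template exists in the opcode table is a separate question.
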
> >>> So I give examples one by one for each identified combination.
> >>
> >> Which examples are you talking about? I see none given in your reply.
> >>
> >
> > Sorry, I want to say "I've listed every possible combination".
> >
> >>> Just found cpu_flags_match() has similar logic, I think the
> >>> following is the only code related to CPUID alerts, but none of
> >>> our combinations are related to cpuavx.
> >>>
> >>> if (all.bitfield.cpuavx)
> >>> {
> >>> /* We need to check SSE2AVX with AVX. */
> >>> if (!t->opcode_modifier.sse2avx
> >>> || (sse2avx && !i.prefix[DATA_PREFIX]))
> >>> match |= CPU_FLAGS_ARCH_MATCH;
> >>> }
> >>
> >> Not sure why you pick out this one. This special case is needed for
> >> sse2avx; I don't see how it's related here. What I've been pointing
> >> you at is the code in that function which follows a similar
> >> "Dual VEX/EVEX templates ..." comment.
> >>
> >
> > I know you're talking about this code, I'm just guessing what it does?
> > Don't know what I missed.
>
> You pulled out this sse2avx code. Hence I was expecting you to tell me why
> you consider it relevant here.
>
This part is about cpu_flags_match().
I rechecked the code; maybe you mean that I missed the outer condition:

  cpu = cpu_flags_and (any, active);
  if (cpu_flags_all_zero (&any) || !cpu_flags_all_zero (&cpu))
    {
      if (all.bitfield.cpuavx)
        {
          /* We need to check SSE2AVX with AVX. */
          if (!t->opcode_modifier.sse2avx
              || (sse2avx && !i.prefix[DATA_PREFIX]))
            match |= CPU_FLAGS_ARCH_MATCH;
        }
      else
        match |= CPU_FLAGS_ARCH_MATCH;
    }
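
As a rough illustration of how this condition combines with the earlier "all" check, here is a simplified model using plain bitmasks. The flag values and the arch_match () helper are hypothetical, the SSE2AVX special case is left out, and this is not the real cpu_flags_match () code.

```c
#include <stdbool.h>

/* Hypothetical feature bits; not the real i386_cpu_flags layout.  */
enum
{
  F_BMI   = 1u << 0,
  F_APX_F = 1u << 1,
  F_AVX   = 1u << 2
};

/* Simplified model of the architecture match: every bit in ALL must be
   enabled in ACTIVE, and if ANY is non-empty at least one of its bits
   must be enabled as well.  */
bool
arch_match (unsigned int all, unsigned int any, unsigned int active)
{
  /* Corresponds to the cpu_flags_equal (&cpu, &all) test.  */
  if ((all & active) != all)
    return false;

  /* Corresponds to the outer condition on ANY shown above.  */
  return any == 0 || (any & active) != 0;
}
```

Assuming a template that requires BMI in "all" and lists BMI and APX_F in "any", arch_match (F_BMI, F_BMI | F_APX_F, F_APX_F) models the ".arch .nobmi" case: the first test already fails, so the condition on "any" is never reached.
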
> > For example
> >
> > .arch .nobmi
> > andn (%eax), %eax, %eax
> >
> > ---------------------------------------------------------------------------------------------
> > if (flag_code != CODE_64BIT)
> > active = cpu_flags_and_not (cpu_arch_flags, cpu_64_flags);
> > else
> > active = cpu_arch_flags; ---> cpubmi = 0;
> > cpu = cpu_flags_and (all, active); ---> cpuapx =1; cpubmi = 0;
> > if (cpu_flags_equal (&cpu, &all)) ---> &cpu and &all are not same.
> > {
> > ...
> > }
> > Return CPU_FLAGS_64BIT_MATCH
> > ---------------------------------------------------------------------------------------------
> > Then we will report an arch error.
> >
> > if (supported != CPU_FLAGS_PERFECT_MATCH)
> > {
> > as_bad (_("`%s' is not supported on `%s%s'"),
> > insn_name (current_templates.start),
> > cpu_arch_name ? cpu_arch_name : default_arch,
> > cpu_sub_arch_name ? cpu_sub_arch_name : "");
> > return NULL;
> > }
>
> Which is what we want, I think (for the particular example you picked)? Yet
> again, I don't think I can see what you're trying to tell me. I also have to
> confess I've lost track of whether we're discussing install_template(),
> cpu_flag_match(), or both. For example in install_template() you may indeed
> be able to get away with little or no changes, as long as there's no used
> features tracking for APX (see the early ELF-specific part of output_insn()).
> Things would be somewhat inconsistent then, but that may be tolerable (as
> long as properly justified in the patch description). Not getting this into
> proper shape right with the introduction of APX may bite us later, though.
>
This part is about cpu_flags_match().
I just want to say that the APX part does not need to be handled in the "Dual VEX/EVEX templates ..." block.
Thanks,
Lili.