From: Jan Beulich <jbeulich@suse.com>
To: "Hu, Lin1" <lin1.hu@intel.com>, "Cui, Lili" <lili.cui@intel.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
"ccoutant@gmail.com" <ccoutant@gmail.com>,
"binutils@sourceware.org" <binutils@sourceware.org>
Subject: Re: [PATCH 7/8] Support APX NDD optimized encoding.
Date: Tue, 14 Nov 2023 11:50:35 +0100
Message-ID: <8ed3b7a2-8cba-6428-1c01-5b6c28ca4a89@suse.com>
In-Reply-To: <SJ0PR11MB5940FDFB811A68B067277C32A6B2A@SJ0PR11MB5940.namprd11.prod.outlook.com>
On 14.11.2023 03:28, Hu, Lin1 wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>>
>> On 10.11.2023 06:43, Hu, Lin1 wrote:
>>>> On 02.11.2023 12:29, Cui, Lili wrote:
>>>>> + unsigned int src2 = (i.operands > 3) ? i.operands - 3 : 0;
>>>>> +
>>>>> + if (i.types[src1].bitfield.class == Reg
>>>>> + && i.op[src1].regs == i.op[dest].regs)
>>>>> + readonly_var = src2;
>>>>
>>>> As can be seen in the testcase, this also results in ADCX/ADOX to be
>>>> converted to non-ND EVEX forms, i.e. even when that's not a win at all.
>>>> We shouldn't change what the user has written when the encoding
>>>> doesn't actually improve. (Or else, but I'd be hesitant to accept
>>>> that, at the very least the effect would need pointing out in the
>>>> description or even a code comment, so that later on it is possible
>>>> to figure out whether that was intentional or an
>>>> oversight.)
>>>>
>>>> This is where my template ordering remark in reply to patch 5 comes into play:
>>>> Whether invoking re-parse is okay would further need to depend on
>>>> whether an alternative (earlier) template actually allows
>>>> REX2 encoding (same base-opcode could be one of the criteria for how
>>>> far to look back through earlier templates; an option might also be
>>>> to put the 3-operand templates first, so that looking backwards
>>>> wouldn't be necessary in the first place). This would then likely
>>>> also address one of the forward looking concerns I've raised above.
>>>>
>>>
>>> Indeed, adcx's legacy insn can't support rex2.
>>>
>>> For my problem, I prefer to re-order templates order, because, I hadn't
>>> thought of a way to simply move t to the farthest same base_opcode template
>>> for the moment. The following is a tentative scenario: the order will be
>>> ndd evex - rex2 - evex.
>>
>> Yes, this matches my understanding / expectation.
>>
>>> And I will need a tmp_variable to avoid the insn doesn't match the rex2,
>>> let me backtrack the match's result and the value of i.
>>
>> This, however, I'm not convinced of. I'd rather see this vaguely in line with
>> 58bceb182740 ("x86: prefer VEX encodings over EVEX ones when possible"): Do
>> another full matching round with the removed operand, arranging for "internal
>> error" to be raised in case that fails. Your approach would, I think, result
>> in silent bad code generation in case something went wrong. Thing is - you
>> don't even need to advance (or backtrack) t in that case.
>>
>
> I tried to reorder the templates and modify the code as follows:
>
> @@ -7728,6 +7765,40 @@ match_template (char mnem_suffix)
> i.memshift = memshift;
> }
>
> + /* If we can optimize a NDD insn to non-NDD insn, like
> + add %r16, %r8, %r8 -> add %r16, %r8,
> + add %r8, %r16, %r8 -> add %r16, %r8, then rematch template.
> + Note that the semantics have not been changed. */
> + if (optimize
> + && !i.no_optimize
> + && i.vec_encoding != vex_encoding_evex
> + && t + 1 < current_templates->end
> + && !t[1].opcode_modifier.evex)
> + {
> + unsigned int readonly_var = convert_NDD_to_REX2 (t);
> + if (readonly_var != ~0)
> + {
> + if (!check_EgprOperands (t + 1))
> + {
> + specific_error = progress (internal_error);
> + continue;
> + }
> + ++i.operands;
> + ++i.reg_operands;
Did you mean to decrement rather than increment these? We're trying to go
from 3 to 2 operands, after all.
> + ++i.tm.operands;
Why is this needed? Aren't we at a point here where i.tm hasn't been filled yet?
> +
> + if (readonly_var == 1)
> + swap_2_operands (0, 1);
> + }
> + }
>
> convert_NDD_to_REX2 return readonly_var now. check_EgprOperands aims to exclude some insns like adcx and adox. Because their opcode_space is legacy-map2 can't support rex2.
Good. Looking forward to seeing the full change.
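
For readers following the operand-count remarks above, the intended
transformation can be sketched with a toy model. This is not gas code:
struct toy_insn and both toy_* functions are hypothetical names made up
for illustration, and register equality is modelled by string comparison.
It only shows that dropping the NDD destination takes the operand count
from 3 to 2 (a decrement), and that the sources are swapped only when the
first source duplicates the destination and the operation is commutative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for the assembler's
   per-instruction state.  Operands are in AT&T order: src1, src2, dest.  */
struct toy_insn
{
  unsigned int operands;   /* number of operands currently present */
  const char *reg[3];      /* register names, e.g. "%r8" */
  bool commutative;        /* true for e.g. add, false for e.g. sub */
};

/* Return the index of the source operand that duplicates the destination,
   or ~0u if the 3-operand form has to be kept.  */
static unsigned int
toy_find_redundant_source (const struct toy_insn *insn)
{
  if (insn->operands != 3)
    return ~0u;
  if (strcmp (insn->reg[1], insn->reg[2]) == 0)
    return 1;
  if (insn->commutative && strcmp (insn->reg[0], insn->reg[2]) == 0)
    return 0;
  return ~0u;
}

/* Try to turn the 3-operand form into a 2-operand one.  Returns true if
   the instruction was rewritten, false if it was left untouched.  */
static bool
toy_drop_ndd_dest (struct toy_insn *insn)
{
  unsigned int dup = toy_find_redundant_source (insn);

  if (dup == ~0u)
    return false;

  if (dup == 0)
    {
      /* Source 1 equals the destination: swap the two sources so that
         the surviving second operand is the read-write one.  */
      const char *tmp = insn->reg[0];
      insn->reg[0] = insn->reg[1];
      insn->reg[1] = tmp;
    }

  insn->reg[2] = NULL;
  --insn->operands;   /* going from 3 to 2 operands */

  /* Fail loudly instead of silently carrying on with a bad state.  */
  assert (insn->operands == 2);
  return true;
}
```

With this model, both "add %r16, %r8, %r8" and "add %r8, %r16, %r8" end up
as "add %r16, %r8", matching the two cases named in the quoted comment. A
non-commutative operation with src1 equal to dest is left in 3-operand form.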
> And I need some modifications in tc-i386.c after reorder i386-opc.tbl.
>
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index 7a86aff1828..d98950c7dfd 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -14401,7 +14401,9 @@ static bool check_register (const reg_entry *r)
>
> if (r->reg_flags & RegRex2)
> {
> - if (is_evex_encoding (current_templates->start))
> + if (is_evex_encoding (current_templates->start)
> + && ((current_templates->start + 1 >= current_templates->end)
> + || (is_evex_encoding (current_templates->start + 1))))
> i.vec_encoding = vex_encoding_evex;
>
> if (!cpu_arch_flags.bitfield.cpuapx_f
>
> What's your opinion?
See my comments to Lili on the original code here (which you modify
further). There cannot be a dependency on current_templates here, imo.
Lili - the fact that Lin needs the modification above actually looks to
support my view on this.
Jan