public inbox for binutils@sourceware.org
From: Jan Beulich <jbeulich@suse.com>
To: "Hu, Lin1" <lin1.hu@intel.com>, "Cui, Lili" <lili.cui@intel.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
	"ccoutant@gmail.com" <ccoutant@gmail.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: Re: [PATCH 7/8] Support APX NDD optimized encoding.
Date: Tue, 14 Nov 2023 11:50:35 +0100
Message-ID: <8ed3b7a2-8cba-6428-1c01-5b6c28ca4a89@suse.com>
In-Reply-To: <SJ0PR11MB5940FDFB811A68B067277C32A6B2A@SJ0PR11MB5940.namprd11.prod.outlook.com>

On 14.11.2023 03:28, Hu, Lin1 wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>>
>> On 10.11.2023 06:43, Hu, Lin1 wrote:
>>>> On 02.11.2023 12:29, Cui, Lili wrote:
>>>>> +      unsigned int src2 = (i.operands > 3) ? i.operands - 3 : 0;
>>>>> +
>>>>> +      if (i.types[src1].bitfield.class == Reg
>>>>> +	  && i.op[src1].regs == i.op[dest].regs)
>>>>> +	readonly_var = src2;
>>>>
>>>> As can be seen in the testcase, this also results in ADCX/ADOX to be
>>>> converted to non-ND EVEX forms, i.e. even when that's not a win at all.
>>>> We shouldn't change what the user has written when the encoding
>>>> doesn't actually improve. (Or else, but I'd be hesitant to accept
>>>> that, at the very least the effect would need pointing out in the
>>>> description or even a code comment, so that later on it is possible
>>>> to figure out whether that was intentional or an
>>>> oversight.)
>>>>
>>>> This is where my template ordering remark in reply to patch 5 comes into play:
>>>> Whether invoking re-parse is okay would further need to depend on
>>>> whether an alternative (earlier) template actually allows
>>>> REX2 encoding (same base-opcode could be one of the criteria for how
>>>> far to look back through earlier templates; an option might also be
>>>> to put the 3-operand templates first, so that looking backwards
>>>> wouldn't be necessary in the first place). This would then likely
>>>> also address one of the forward looking concerns I've raised above.
>>>>
>>>
>>> Indeed, adcx's legacy insn can't support rex2.
>>>
>>> For my problem, I prefer to reorder the templates, because I haven't yet
>>> thought of a simple way to move t to the farthest template with the same
>>> base_opcode. The following is a tentative scenario: the order will be
>>> NDD EVEX - REX2 - EVEX.
>>
>> Yes, this matches my understanding / expectation.
>>
>>> And I will need a temporary variable for the case where the insn doesn't
>>> match the REX2 template, so that I can backtrack the match's result and
>>> the value of i.
>>
>> This, however, I'm not convinced of. I'd rather see this vaguely in line with
>> 58bceb182740 ("x86: prefer VEX encodings over EVEX ones when possible"): Do
>> another full matching round with the removed operand, arranging for "internal
>> error" to be raised in case that fails. Your approach would, I think, result
>> in silent bad code generation in case something went wrong. Thing is - you
>> don't even need to advance (or backtrack) t in that case.
>>
> 
> I tried to reorder the templates and modify the code as follows:
> 
> @@ -7728,6 +7765,40 @@ match_template (char mnem_suffix)
>           i.memshift = memshift;
>         }
> 
> +      /* If we can optimize a NDD insn to non-NDD insn, like
> +        add %r16, %r8, %r8 -> add %r16, %r8,
> +        add  %r8, %r16, %r8 -> add %r16, %r8, then rematch template.
> +        Note that the semantics have not been changed.  */
> +      if (optimize
> +         && !i.no_optimize
> +         && i.vec_encoding != vex_encoding_evex
> +         && t + 1 < current_templates->end
> +         && !t[1].opcode_modifier.evex)
> +       {
> +         unsigned int readonly_var = convert_NDD_to_REX2 (t);
> +         if (readonly_var != ~0)
> +           {
> +             if (!check_EgprOperands (t + 1))
> +               {
> +                 specific_error = progress (internal_error);
> +                 continue;
> +               }
> +             ++i.operands;
> +             ++i.reg_operands;

Did you mean decrement rather than increment for these? We're trying to go
from 3 to 2 operands, after all.
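
As a rough illustration of the direction I'd expect (a toy model only, with
made-up names such as toy_insn and toy_convert; it does not use gas's real
i / template structures, and it assumes a commutative insn like ADD with
operands in AT&T order, destination last):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, NOT gas's data structures: registers are plain numbers,
   operands are in AT&T order with the destination last.  */
struct toy_insn
{
  unsigned int operands;   /* number of operands */
  unsigned int reg[3];     /* register number per operand */
};

#define NO_CONVERSION (~0u)

/* Return the index (0 or 1) of the source operand that is only read,
   provided the destination duplicates the other source.  Otherwise
   return NO_CONVERSION.  */
static unsigned int
toy_readonly_operand (const struct toy_insn *insn)
{
  if (insn->operands != 3)
    return NO_CONVERSION;
  if (insn->reg[1] == insn->reg[2])
    return 0;
  if (insn->reg[0] == insn->reg[2])
    return 1;
  return NO_CONVERSION;
}

/* Drop the destination of a commutative 3-operand insn in place.
   Returns 1 if the insn was converted, 0 if it was left alone.  */
static int
toy_convert (struct toy_insn *insn)
{
  unsigned int readonly;

  assert (insn != NULL);
  readonly = toy_readonly_operand (insn);
  if (readonly == NO_CONVERSION)
    return 0;

  /* Make the read-only source operand 0, so that operand 1 is the
     register which is both read and written.  */
  if (readonly == 1)
    {
      unsigned int tmp = insn->reg[0];
      insn->reg[0] = insn->reg[1];
      insn->reg[1] = tmp;
    }

  /* Going from 3 to 2 operands: the count goes down, not up.  */
  --insn->operands;
  return 1;
}
```

With { 3, { 8, 16, 8 } } (i.e. add %r8, %r16, %r8) this leaves 2 operands,
%r16 and %r8, matching the second example in the patch comment.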

> +             ++i.tm.operands;

Why is this? Aren't we ahead of filling i.tm here?

> +
> +             if (readonly_var == 1)
> +               swap_2_operands (0, 1);
> +           }
> +       }
> 
> convert_NDD_to_REX2 returns readonly_var now. check_EgprOperands aims to exclude insns like adcx and adox, because their opcode space (legacy map 2) can't support REX2.

Good. Looking forward to seeing the full change.

> And I need some modifications in tc-i386.c after reordering i386-opc.tbl.
> 
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index 7a86aff1828..d98950c7dfd 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -14401,7 +14401,9 @@ static bool check_register (const reg_entry *r)
> 
>    if (r->reg_flags & RegRex2)
>      {
> -      if (is_evex_encoding (current_templates->start))
> +      if (is_evex_encoding (current_templates->start)
> +         && ((current_templates->start + 1 >= current_templates->end)
> +             || (is_evex_encoding (current_templates->start + 1))))
>         i.vec_encoding = vex_encoding_evex;
> 
>        if (!cpu_arch_flags.bitfield.cpuapx_f
> 
> What's your opinion?

See my comments to Lili on the original code (which you modify further
here). There cannot be a dependency on current_templates here, imo.
Lili - the fact that Lin needs the modification above actually looks to
support my view on this.

Jan
