public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] x86: Don't check if AVX512 template requires AVX512VL
@ 2023-06-20 16:44 H.J. Lu
  2023-06-21  6:52 ` Jan Beulich
  2023-06-21 10:04 ` Jan Beulich
  0 siblings, 2 replies; 6+ messages in thread
From: H.J. Lu @ 2023-06-20 16:44 UTC (permalink / raw)
  To: binutils

If the ZMM operand in an AVX12 template also allows XMM or ZMM, this
template must require AVX512VL.  Drop the AVX512VL requirement check
on such AVX512 templates.

	* config/tc-i386.c (check_VecOperands): Don't check if AVX512
	template requires AVX512VL.
---
 gas/config/tc-i386.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index de35ee2a2c6..dcafac0c0cd 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -6288,11 +6288,10 @@ check_VecOperands (const insn_template *t)
   /* Templates allowing for ZMMword as well as YMMword and/or XMMword for
      any one operand are implicity requiring AVX512VL support if the actual
      operand size is YMMword or XMMword.  Since this function runs after
-     template matching, there's no need to check for YMMword/XMMword in
-     the template.  */
+     template matching, there's no need to check for YMMword/XMMword nor
+     AVX512VL in the template.  */
   cpu = cpu_flags_and (t->cpu_flags, avx512);
   if (!cpu_flags_all_zero (&cpu)
-      && !t->cpu_flags.bitfield.cpuavx512vl
       && !cpu_arch_flags.bitfield.cpuavx512vl)
     {
       for (op = 0; op < t->operands; ++op)
-- 
2.40.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: Don't check if AVX512 template requires AVX512VL
  2023-06-20 16:44 [PATCH] x86: Don't check if AVX512 template requires AVX512VL H.J. Lu
@ 2023-06-21  6:52 ` Jan Beulich
  2023-06-21 16:06   ` H.J. Lu
  2023-06-21 10:04 ` Jan Beulich
  1 sibling, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2023-06-21  6:52 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 20.06.2023 18:44, H.J. Lu via Binutils wrote:
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -6288,11 +6288,10 @@ check_VecOperands (const insn_template *t)
>    /* Templates allowing for ZMMword as well as YMMword and/or XMMword for
>       any one operand are implicity requiring AVX512VL support if the actual
>       operand size is YMMword or XMMword.  Since this function runs after
> -     template matching, there's no need to check for YMMword/XMMword in
> -     the template.  */
> +     template matching, there's no need to check for YMMword/XMMword nor
> +     AVX512VL in the template.  */
>    cpu = cpu_flags_and (t->cpu_flags, avx512);
>    if (!cpu_flags_all_zero (&cpu)
> -      && !t->cpu_flags.bitfield.cpuavx512vl
>        && !cpu_arch_flags.bitfield.cpuavx512vl)
>      {
>        for (op = 0; op < t->operands; ++op)

While the code change is correct afaict, I don't think we want it,
which is related to your imo wrong editing of the comment: For one we
check for the flag to be clear (unlike the xmm/ymm checks). And then
the check is there to skip a pointless (in that case) loop. IOW the
loop is only needed for templates which do not explicitly name
AVX512VL as a required feature; templates which do can't validly have
a mix of zmm and one or both of xmm/ymm in any one operand. In fact if
you grep i386-opc.tbl for "AVX612VL.*ZMM" you won't find any single
match.

Jan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: Don't check if AVX512 template requires AVX512VL
  2023-06-20 16:44 [PATCH] x86: Don't check if AVX512 template requires AVX512VL H.J. Lu
  2023-06-21  6:52 ` Jan Beulich
@ 2023-06-21 10:04 ` Jan Beulich
  1 sibling, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2023-06-21 10:04 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 20.06.2023 18:44, H.J. Lu via Binutils wrote:
> If the ZMM operand in an AVX12 template also allows XMM or ZMM, this
> template must require AVX512VL.  Drop the AVX512VL requirement check
> on such AVX512 templates.
> 
> 	* config/tc-i386.c (check_VecOperands): Don't check if AVX512
> 	template requires AVX512VL.
> ---
>  gas/config/tc-i386.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)

I notice you've committed this change already. May I please ask that
you wait a little with committing any patches where there is a chance
that there are going to be comments? (Typically I wait for at least a
week unless it's a clear bugfix.)

Jan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: Don't check if AVX512 template requires AVX512VL
  2023-06-21  6:52 ` Jan Beulich
@ 2023-06-21 16:06   ` H.J. Lu
  2023-06-22  9:00     ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: H.J. Lu @ 2023-06-21 16:06 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Tue, Jun 20, 2023 at 11:52 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 20.06.2023 18:44, H.J. Lu via Binutils wrote:
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -6288,11 +6288,10 @@ check_VecOperands (const insn_template *t)
> >    /* Templates allowing for ZMMword as well as YMMword and/or XMMword for
> >       any one operand are implicity requiring AVX512VL support if the actual
> >       operand size is YMMword or XMMword.  Since this function runs after
> > -     template matching, there's no need to check for YMMword/XMMword in
> > -     the template.  */
> > +     template matching, there's no need to check for YMMword/XMMword nor
> > +     AVX512VL in the template.  */
> >    cpu = cpu_flags_and (t->cpu_flags, avx512);
> >    if (!cpu_flags_all_zero (&cpu)
> > -      && !t->cpu_flags.bitfield.cpuavx512vl
> >        && !cpu_arch_flags.bitfield.cpuavx512vl)
> >      {
> >        for (op = 0; op < t->operands; ++op)
>
> While the code change is correct afaict, I don't think we want it,
> which is related to your imo wrong editing of the comment: For one we
> check for the flag to be clear (unlike the xmm/ymm checks). And then
> the check is there to skip a pointless (in that case) loop. IOW the
> loop is only needed for templates which do not explicitly name
> AVX512VL as a required feature; templates which do can't validly have
> a mix of zmm and one or both of xmm/ymm in any one operand. In fact if
> you grep i386-opc.tbl for "AVX612VL.*ZMM" you won't find any single
> match.
>

That is why the check isn't needed.

-- 
H.J.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: Don't check if AVX512 template requires AVX512VL
  2023-06-21 16:06   ` H.J. Lu
@ 2023-06-22  9:00     ` Jan Beulich
  2023-06-23  5:59       ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2023-06-22  9:00 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 21.06.2023 18:06, H.J. Lu wrote:
> On Tue, Jun 20, 2023 at 11:52 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 20.06.2023 18:44, H.J. Lu via Binutils wrote:
>>> --- a/gas/config/tc-i386.c
>>> +++ b/gas/config/tc-i386.c
>>> @@ -6288,11 +6288,10 @@ check_VecOperands (const insn_template *t)
>>>    /* Templates allowing for ZMMword as well as YMMword and/or XMMword for
>>>       any one operand are implicity requiring AVX512VL support if the actual
>>>       operand size is YMMword or XMMword.  Since this function runs after
>>> -     template matching, there's no need to check for YMMword/XMMword in
>>> -     the template.  */
>>> +     template matching, there's no need to check for YMMword/XMMword nor
>>> +     AVX512VL in the template.  */
>>>    cpu = cpu_flags_and (t->cpu_flags, avx512);
>>>    if (!cpu_flags_all_zero (&cpu)
>>> -      && !t->cpu_flags.bitfield.cpuavx512vl
>>>        && !cpu_arch_flags.bitfield.cpuavx512vl)
>>>      {
>>>        for (op = 0; op < t->operands; ++op)
>>
>> While the code change is correct afaict, I don't think we want it,
>> which is related to your imo wrong editing of the comment: For one we
>> check for the flag to be clear (unlike the xmm/ymm checks). And then
>> the check is there to skip a pointless (in that case) loop. IOW the
>> loop is only needed for templates which do not explicitly name
>> AVX512VL as a required feature; templates which do can't validly have
>> a mix of zmm and one or both of xmm/ymm in any one operand. In fact if
>> you grep i386-opc.tbl for "AVX612VL.*ZMM" you won't find any single
>> match.
> 
> That is why the check isn't needed.

I'm confused now: You explicitly want to make gas perform more work for
no gain? The flag being clear is the common case, and hence in the
common case we can avoid getting into this loop. So yes, as said, your
code change is functionally correct (as in: not breaking things), but
it is at the same time making overall behavior worse. Please revert.

Jan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: Don't check if AVX512 template requires AVX512VL
  2023-06-22  9:00     ` Jan Beulich
@ 2023-06-23  5:59       ` Jan Beulich
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2023-06-23  5:59 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 22.06.2023 11:00, Jan Beulich wrote:
> On 21.06.2023 18:06, H.J. Lu wrote:
>> On Tue, Jun 20, 2023 at 11:52 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>
>>> On 20.06.2023 18:44, H.J. Lu via Binutils wrote:
>>>> --- a/gas/config/tc-i386.c
>>>> +++ b/gas/config/tc-i386.c
>>>> @@ -6288,11 +6288,10 @@ check_VecOperands (const insn_template *t)
>>>>    /* Templates allowing for ZMMword as well as YMMword and/or XMMword for
>>>>       any one operand are implicity requiring AVX512VL support if the actual
>>>>       operand size is YMMword or XMMword.  Since this function runs after
>>>> -     template matching, there's no need to check for YMMword/XMMword in
>>>> -     the template.  */
>>>> +     template matching, there's no need to check for YMMword/XMMword nor
>>>> +     AVX512VL in the template.  */
>>>>    cpu = cpu_flags_and (t->cpu_flags, avx512);
>>>>    if (!cpu_flags_all_zero (&cpu)
>>>> -      && !t->cpu_flags.bitfield.cpuavx512vl
>>>>        && !cpu_arch_flags.bitfield.cpuavx512vl)
>>>>      {
>>>>        for (op = 0; op < t->operands; ++op)
>>>
>>> While the code change is correct afaict, I don't think we want it,
>>> which is related to your imo wrong editing of the comment: For one we
>>> check for the flag to be clear (unlike the xmm/ymm checks). And then
>>> the check is there to skip a pointless (in that case) loop. IOW the
>>> loop is only needed for templates which do not explicitly name
>>> AVX512VL as a required feature; templates which do can't validly have
>>> a mix of zmm and one or both of xmm/ymm in any one operand. In fact if
>>> you grep i386-opc.tbl for "AVX612VL.*ZMM" you won't find any single
>>> match.
>>
>> That is why the check isn't needed.
> 
> I'm confused now: You explicitly want to make gas perform more work for
> no gain? The flag being clear is the common case, and hence in the
> common case we can avoid getting into this loop.

I guess I managed to contradict myself here: The flag being clear is the
common case, and hence the effect of avoiding to enter the loop is more
limited than I first inferred.

Jan

> So yes, as said, your
> code change is functionally correct (as in: not breaking things), but
> it is at the same time making overall behavior worse. Please revert.
> 
> Jan


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-06-23  5:59 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-20 16:44 [PATCH] x86: Don't check if AVX512 template requires AVX512VL H.J. Lu
2023-06-21  6:52 ` Jan Beulich
2023-06-21 16:06   ` H.J. Lu
2023-06-22  9:00     ` Jan Beulich
2023-06-23  5:59       ` Jan Beulich
2023-06-21 10:04 ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).