public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: "Hu, Lin1" <lin1.hu@intel.com>
To: "Beulich, Jan" <JBeulich@suse.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>,
	"Cui, Lili" <lili.cui@intel.com>
Subject: RE: [PATCH 5/8] Support APX NDD optimized encoding.
Date: Mon, 23 Oct 2023 07:50:39 +0000	[thread overview]
Message-ID: <SJ0PR11MB5940E4168DA5C09580058927A6D8A@SJ0PR11MB5940.namprd11.prod.outlook.com> (raw)
In-Reply-To: <68619a61-a0f4-a851-77c6-e324eca30cb4@suse.com>



> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Monday, October 23, 2023 3:24 PM
> To: Hu, Lin1 <lin1.hu@intel.com>
> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; binutils@sourceware.org; Cui, Lili
> <lili.cui@intel.com>
> Subject: Re: [PATCH 5/8] Support APX NDD optimized encoding.
> 
> On 23.10.2023 04:57, Hu, Lin1 wrote:
> > Thanks for your reviews, I have responded to your comments. The new patch
> will be attached to a subsequent email.
> 
> Up-front remark: Your way of replying makes it hard to spot what your
> responses actually are. Many mail programs allow you to properly prefix (by
> convention using '>') responded-to mail contents, it's usually merely a matter of
> enabling that mode of operation.
> 
> > -----Original Message-----
> > From: Jan Beulich <jbeulich@suse.com>
> > Sent: Thursday, September 28, 2023 5:30 PM
> >
> > On 19.09.2023 17:25, Cui, Lili wrote:
> >> --- a/gas/config/tc-i386.c
> >> +++ b/gas/config/tc-i386.c
> >> @@ -7091,6 +7091,46 @@ check_EgprOperands (const insn_template *t)
> >>    return 0;
> >>  }
> >>
> >> +/* Optimize APX NDD insns to non-NDD insns.  */
> >> +
> >> +static int
> >
> > "bool" please when the function merely returns a yes/no indicator.
> >
> > * Have modified.
> >
> >> +optimize_NDD_to_nonNDD (const insn_template *t) {
> >> +  if (t->opcode_modifier.vexvvvv
> >> +      && t->opcode_space == SPACE_EVEXMAP4
> >> +      && i.reg_operands >= 2
> >
> > See the remark near the bottom of the changes to this file: This
> > condition is likely insufficient, as
> > - further insns allowing ND may not be treated this way (CCMPscc,
> >   CTESTscc, and one of the CFCMOVcc forms at the very least),
> > - {nf} uses will want excluding, as it would be merely a waste of
> >   time to try to re-match with fewer operands.
> >
> > * CCMPSCC and CTESTSCC’s vexvvvv will be false. I think one of the
> CFCMOVCC forms is same.
> 
> By "is same" do you mean "fits the optimization pattern here"?
> 

Because I didn't find any insn that allowing ND, but its vexvvvv is true in CFCMOVcc's table. I'm going to assume that you found it, but I don't see it. I believe its vexvvvv is false, too. So I say it is same as CCMPSCC and CTESTSCC.

> >> @@ -7562,6 +7602,15 @@ match_template (char mnem_suffix)
> >>  	     slip through to break.  */
> >>  	}
> >>
> >> +      /* If we can optimize a NDD insn to non-NDD insn, like
> >> +	 add %r16, %r8, %r8 -> add %r16, %r8, then rematch template.  */
> >> +      if (optimize_NDD_to_nonNDD (t))
> >
> > I don't think such an optimization should be done without any form of -O.
> >
> > As to the function name, maybe better optimize_NDD_to_REX2()?
> >
> > * Refer to the optimization of VOP, temporarily set to O1 will be optimized.  If
> we use 32bit register, some instructions will be optimized from NDD to rex or
> legacy. Like cmovg  0x90909090(%eax),%edx,%edx, imul   %rdx,%rax,%rdx in our
> test.
> 
> What you you mean by saying "temporarily"?

Since we don't have any relevant experience, we'd like to see if you have any opinion on this. If you haven't, users will use O1 to optimize their NDD insns.

> 
> >> --- /dev/null
> >> +++ b/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
> >> @@ -0,0 +1,115 @@
> >> +# Check 64bit APX NDD instructions with optimized encoding
> >> +
> >> +	.allow_index_reg
> >> +	.text
> >> +_start:
> >> +inc    %ax,%ax
> >> +inc    %eax,%eax
> >> +inc    %rax,%rax
> >> +inc    %r16,%r16
> >> +add    %r31b,%r8b,%r8b
> >> +add    %r31b,%r8b,%r31b
> >> +addb    %r31b,%r8b,%r8b
> >> +add    %r31,%r8,%r8
> >> +addq    %r31,%r8,%r8
> >> +add    %r31d,%r8d,%r8d
> >> +addl    %r31d,%r8d,%r8d
> >> +add    %r31w,%r8w,%r8w
> >> +addw    %r31w,%r8w,%r8w
> >> +{store} add    %r31,%r8,%r8
> >> +{load}  add    %r31,%r8,%r8
> >> +add    %r31,(%r8),%r31
> >> +add    (%r31),%r8,%r8
> >> +add    $0x12344433,%r15,%r15
> >> +add    $0xfffffffff4332211,%r8,%r8
> >> +dec    %r17,%r17
> >> +not    %r17,%r17
> >> +neg    %r17,%r17
> >> +sub    %r15,%r17,%r17
> >> +sub    %r15d,(%r8),%r15d
> >> +sub    (%r15,%rax,1),%r16,%r16
> >> +sub    $0x1234,%r30,%r30
> >> +sbb    %r15,%r17,%r17
> >> +sbb    %r15,(%r8),%r15
> >> +sbb    (%r15,%rax,1),%r16,%r16
> >> +sbb    $0x1234,%r30,%r30
> >> +adc    %r15,%r17,%r17
> >> +adc    %r15d,(%r8),%r15d
> >> +adc    (%r15,%rax,1),%r16,%r16
> >> +adc    $0x1234,%r30,%r30
> >> +or    %r15,%r17,%r17
> >> +or    %r15d,(%r8),%r15d
> >> +or    (%r15,%rax,1),%r16,%r16
> >> +or    $0x1234,%r30,%r30
> >> +xor    %r15,%r17,%r17
> >> +xor    %r15d,(%r8),%r15d
> >> +xor    (%r15,%rax,1),%r16,%r16
> >> +xor    $0x1234,%r30,%r30
> >> +and    %r15,%r17,%r17
> >> +and    %r15d,(%r8),%r15d
> >> +and    (%r15,%rax,1),%r16,%r16
> >> +and    $0x1234,%r30,%r30
> >> +and    $0x1234,%r30
> >> +ror    %r31,%r31
> >> +rorb   %r31b,%r31b
> >
> > Please be consistent with omitting (or having) suffixes (further up you simply
> test both variants, but that's not the case throughout ...
> >
> >> +ror    $0x2,%r12,%r12
> >> +rol    %r31,%r31
> >> +rolb   %r31b,%r31b
> >> +rol    $0x2,%r12,%r12
> >> +rcr    %r31,%r31
> >> +rcrb   %r31b,%r31b
> >> +rcr    $0x2,%r12b,%r12b
> >> +rcr    $0x2,%r12,%r12
> >> +rcl    %r31,%r31
> >> +rclb   %r31b,%r31b
> >> +rcl    $0x2,%r12b,%r12b
> >> +rcl    $0x2,%r12,%r12
> >> +shl    %r31,%r31
> >> +shlb   %r31b,%r31b
> >> +shl    $0x2,%r12b,%r12b
> >> +shl    $0x2,%r12,%r12
> >> +sar    %r31,%r31
> >> +sarb   %r31b,%r31b
> >> +sar    $0x2,%r12b,%r12b
> >> +sar    $0x2,%r12,%r12
> >> +shl    %r31,%r31
> >> +shlb   %r31b,%r31b
> >> +shl    $0x2,%r12b,%r12b
> >> +shl    $0x2,%r12,%r12
> >> +shr    %r31,%r31
> >> +shrb   %r31b,%r31b
> >
> > ... here.
> >
> > * Added suffixes to the command names, I don't understand very well what is
> suggested inside the parentheses, I just aligned most of the insns' test items.
> 
> In parentheses I've merely tried to make more clear what aspects I mean to be
> considered for the result being consistent.

OK,  I'm tentatively assuming that my changes are well, and I'll talk to lili about putting our patches together to get back to you.
> 
> Jan

  reply	other threads:[~2023-10-23  7:50 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-19 15:25 [PATCH 0/8] [RFC] Support Intel APX EGPR Cui, Lili
2023-09-19 15:25 ` [PATCH 1/8] Support APX GPR32 with rex2 prefix Cui, Lili
2023-09-21 15:27   ` Jan Beulich
2023-09-27 15:57     ` Cui, Lili
2023-09-21 15:51   ` Jan Beulich
2023-09-27 15:59     ` Cui, Lili
2023-09-28  8:02       ` Jan Beulich
2023-10-07  3:27         ` Cui, Lili
2023-09-19 15:25 ` [PATCH 2/8] Support APX GPR32 with extend evex prefix Cui, Lili
2023-09-22 10:12   ` Jan Beulich
2023-10-17 15:48     ` Cui, Lili
2023-10-18  6:40       ` Jan Beulich
2023-10-18 10:44         ` Cui, Lili
2023-10-18 10:50           ` Jan Beulich
2023-09-22 10:50   ` Jan Beulich
2023-10-17 15:50     ` Cui, Lili
2023-10-17 16:11       ` Jan Beulich
2023-10-18  2:02         ` Cui, Lili
2023-10-18  6:10           ` Jan Beulich
2023-09-25  6:03   ` Jan Beulich
2023-10-17 15:52     ` Cui, Lili
2023-10-17 16:12       ` Jan Beulich
2023-10-18  6:31         ` Cui, Lili
2023-10-18  6:47           ` Jan Beulich
2023-10-18  7:52             ` Cui, Lili
2023-10-18  8:21               ` Jan Beulich
2023-10-18 11:30                 ` Cui, Lili
2023-10-19 11:58                   ` Cui, Lili
2023-10-19 15:24                     ` Jan Beulich
2023-10-19 16:38                       ` Cui, Lili
2023-10-20  6:25                         ` Jan Beulich
2023-10-22 14:33                           ` Cui, Lili
2023-09-19 15:25 ` [PATCH 3/8] Add tests for " Cui, Lili
2023-09-27 13:11   ` Jan Beulich
2023-10-17 15:53     ` FW: " Cui, Lili
2023-10-17 16:19       ` Jan Beulich
2023-10-18  2:32         ` Cui, Lili
2023-10-18  6:05           ` Jan Beulich
2023-10-18  7:16             ` Cui, Lili
2023-10-18  8:05               ` Jan Beulich
2023-10-18 11:26                 ` Cui, Lili
2023-10-18 12:06                   ` Jan Beulich
2023-10-25 16:03                     ` Cui, Lili
2023-09-27 13:19   ` Jan Beulich
2023-09-19 15:25 ` [PATCH 4/8] Support APX NDD Cui, Lili
2023-09-27 14:44   ` Jan Beulich
2023-10-22 14:05     ` Cui, Lili
2023-10-23  7:12       ` Jan Beulich
2023-10-25  8:10         ` Cui, Lili
2023-10-25  8:47           ` Jan Beulich
2023-10-25 15:49             ` Cui, Lili
2023-10-25 15:59               ` Jan Beulich
2023-09-28  7:57   ` Jan Beulich
2023-10-22 14:57     ` Cui, Lili
2023-10-24 11:39     ` Cui, Lili
2023-10-24 11:58       ` Jan Beulich
2023-10-25 15:29         ` Cui, Lili
2023-09-19 15:25 ` [PATCH 5/8] Support APX NDD optimized encoding Cui, Lili
2023-09-28  9:29   ` Jan Beulich
2023-10-23  2:57     ` Hu, Lin1
2023-10-23  7:23       ` Jan Beulich
2023-10-23  7:50         ` Hu, Lin1 [this message]
2023-10-23  8:15           ` Jan Beulich
2023-10-24  1:40             ` Hu, Lin1
2023-10-24  6:03               ` Jan Beulich
2023-10-24  6:08                 ` Hu, Lin1
2023-10-23  3:07     ` [PATCH-V2] " Hu, Lin1
2023-10-23  3:30     ` [PATCH 5/8] [v2] " Hu, Lin1
2023-10-23  7:26       ` Jan Beulich
2023-09-19 15:25 ` [PATCH 6/8] Support APX Push2/Pop2 Cui, Lili
2023-09-28 11:37   ` Jan Beulich
2023-10-30 15:21     ` Cui, Lili
2023-10-30 15:31       ` Jan Beulich
2023-11-20 13:05         ` Cui, Lili
2023-09-19 15:25 ` [PATCH 7/8] Support APX NF Cui, Lili
2023-09-25  6:07   ` Jan Beulich
2023-09-28 12:42   ` Jan Beulich
2023-11-02 10:15     ` Cui, Lili
2023-11-02 10:23       ` Jan Beulich
2023-11-02 10:46         ` Cui, Lili
2023-12-12  2:59           ` H.J. Lu
2023-09-19 15:25 ` [PATCH 8/8] Support APX JMPABS Cui, Lili
2023-09-28 13:11   ` Jan Beulich
2023-11-02  2:32     ` Hu, Lin1

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=SJ0PR11MB5940E4168DA5C09580058927A6D8A@SJ0PR11MB5940.namprd11.prod.outlook.com \
    --to=lin1.hu@intel.com \
    --cc=JBeulich@suse.com \
    --cc=binutils@sourceware.org \
    --cc=hongjiu.lu@intel.com \
    --cc=lili.cui@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).