From: Carl Love <cel@us.ibm.com>
To: "Kewen.Lin" <linkw@linux.ibm.com>
Cc: Peter Bergner <bergner@vnet.ibm.com>,
Segher Boessenkool <segher@kernel.crashing.org>,
gcc-patches@gcc.gnu.org, David Edelsohn <dje.gcc@gmail.com>
Subject: Re: [PATCH] rs6000: Update the vsx-vector-6.* tests.
Date: Thu, 29 Jun 2023 14:36:02 -0700 [thread overview]
Message-ID: <4d3135956e493fc12311af8fce18d043769ab7a4.camel@us.ibm.com> (raw)
In-Reply-To: <3f8d0bdc-bddf-d178-ee76-8d41c4b8755f@linux.ibm.com>
Kewen:
On Wed, 2023-06-28 at 16:35 +0800, Kewen.Lin wrote:
> > Yea, I was going with a runnable test and didn't include the
> > instruction counts. Added back in. Rather then doing by processor
> > version (P8, P9, P10) I was able to do it by BE/LE. The
> > instruction
> > counts were the same for LE accross processor versions but there
> > are a
> > few instruction counts that vary with BE and LE.
>
> But the original test case only checks for cpu-types (processor
> version)
> but not for endianness, it means for the bif usages, there should not
> be
> different for endianness. Why does this changes with your new test
> case?
> Could you have a further look and make it consistent with some
> adjustment
> if possible? As we know, checking insn counts sometimes are fragile,
> so
> I think we should try our best to make it as robust as possible in
> the
> first place.
>
> Besides, the original case also have some differences between p7/p8
> and
> p9.
>
There are differences on P8 LE versus BE. I did a diff between the P8
and P9 tests:
diff vsx-vector-6.p8.c vsx-vector-6.p9.c
3,4c3,4
< /* { dg-require-effective-target powerpc_p8vector_ok } */
< /* { dg-options "-O2 -mdejagnu-cpu=power8" } */
---
> /* { dg-require-effective-target powerpc_p9vector_ok } */
> /* { dg-options "-O2 -mdejagnu-cpu=power9" } */
12c12
< /* { dg-final { scan-assembler-times {\mvperm\M} 1 } } */
---
> /* { dg-final { scan-assembler-times {\m(?:v|xx)permr?\M} 1 } } */
23d22
< /* { dg-final { scan-assembler-times {\mxvmsub[am]dp\M} 1 } } */
37c36
< /* { dg-final { scan-assembler-times {\mxvsubdp\M} 1 } } */
---
> /* { dg-final { scan-assembler-times {\mxvmsub[am]dp\M} 1 } } */
So we can see the vperm, vpermr, xxpermr, xvmsubadp, xvmsubmdp,
xvsubdp, xvmsubadp, xvmsubmdp instruction count checks are different
between the two architectures. I then wrote a script to compile the
CPU specific test on Power 8, Power 9 and Power 10 architectures and
then grep for the above list of instructions. If I run the scrip on P8
BE and LE I get
Power 8 BE Power 8 LE Power 9 LE Power 9 BE Power 10 LE*
(makalu-lp1) (genoa) (marlin) (nilram) (ltcd97-lp3)
instruction count count count count count
vperm 1 1 0 0 0
vpermr 0 0 0 0 0
xxpermr 0 0 1 0 1
xvmsubadp 1 0 1 1 1
xvmsubmdp 0 1 0 0 0
xvsubdp 1 1 1 1 1
From the diff we see
{ dg-final {scan-assembler-times {\mxvmsub[am]dp\M} 1 } }
This test picks up the correct subtraction instruction for LE versus BE
so this "masks" the LE/BE difference. I changed the check in vsx-
vector-6-func-3op.c to match. This eliminates the LE and BE checks and
reduces the number of specific checks.
In vsx-vector-6-func-3op.c The new test checks the counts for
xxpermdi, which the original test does not check. The check for
xxpermdi are not needed. They are not directly related to the builtin
tests. I removed them.
Looking at the LE/BE checks in the other test file vsx-vector-6-func-
2op.c, instructions xvmaxsp, xvminsp and xvmaxdp were not checked in
the original test. The functions where these instructions are used get
inlined. On LE, the binary instructions show up in the inlined code as
well as what appears to be the binary for the original, non-inlined
function. Best I can see, the binary for the original function is dead
code. I don't see any calls to it. Seems like it shouldn't be there
as it would make the binary smaller. On BE, I don't see the binary for
the original non-inlined function.
I had played with putting -Wno-inline on the command line but that
didn't seem to make any difference. However, you suggestion of
__attribute__ ((noipa)) does prevent the inlining and we don't get the
second copy of the instructions showing up. The inlining eliminated the
LE/BE differences for xvmaxsp, xvminsp and xvmaxdp.
The instruction count test for xxlor in vsx-vector-6-func-2lop.c
differs on LE and BE vsx-vector-6-func-2op.c. I believe the
instruction is used with loads to reorder the data. I don't see anyway
to get around the extra xxlor instructions and verify the vec_or
builtin test generates the instruction.
I was able to eliminate all of the LE/BE qualifiers in the instruction
counts with the exception of xxlor. By using the same checks that look
for multiple versions of xvmsumb*, as was done in the original test, we
can also eliminate LE/BE specific tests and account for different
instructions across CPU versions. We could go back to checking for
specific instructions being generated on Power 8, Power 9, Power 10 if
you prefer not using checks that cover multiple flavors of a given
instruction across different CPU types.
FYI, I eliminated the function call to do the various tests. Instead,
I modified the macro to generate a function call to do the test and
check the results.
Carl
next prev parent reply other threads:[~2023-06-29 21:36 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-30 20:46 Carl Love
2023-06-19 7:17 ` Kewen.Lin
2023-06-21 22:42 ` Carl Love
2023-06-28 8:35 ` Kewen.Lin
2023-06-29 21:36 ` Carl Love [this message]
2023-06-30 3:37 ` Kewen.Lin
2023-06-30 22:20 ` Carl Love
2023-06-30 23:50 ` Carl Love
2023-07-01 0:03 ` Peter Bergner
2023-06-30 23:59 ` Peter Bergner
2023-07-03 15:57 ` Carl Love
2023-07-04 2:08 ` Kewen.Lin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4d3135956e493fc12311af8fce18d043769ab7a4.camel@us.ibm.com \
--to=cel@us.ibm.com \
--cc=bergner@vnet.ibm.com \
--cc=dje.gcc@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=linkw@linux.ibm.com \
--cc=segher@kernel.crashing.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).