From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>,
Mike Stump <mikestump@comcast.net>,
David Edelsohn <dje.gcc@gmail.com>, Kewen Lin <linkw@gcc.gnu.org>,
gcc-patches@gcc.gnu.org, Alexandre Oliva <oliva@adacore.com>,
Vladimir Makarov <vmakarov@redhat.com>
Subject: Re: [PATCH] [testsuite] [powerpc] adjust -m32 counts for fold-vec-extract*
Date: Wed, 31 May 2023 17:00:05 +0800 [thread overview]
Message-ID: <64d67b5d-6ab5-79bf-0375-49e426b7559e@linux.ibm.com> (raw)
In-Reply-To: <20230525112200.GJ19790@gate.crashing.org>
Hi Segher,
on 2023/5/25 19:22, Segher Boessenkool wrote:
> Hi!
>
> On Thu, May 25, 2023 at 07:05:55AM -0300, Alexandre Oliva wrote:
>> On May 25, 2023, "Kewen.Lin" <linkw@linux.ibm.com> wrote:
>>> So both lp64 and ilp32 have the same count, could we merge it and
>>> remove the selectors?
>>
>> We could, but... I thought I wouldn't, since they were different
>> before, and they're likely to diverge again in the future. I thought
>> that combining them might suggest that they ought to be the same, when
>> we already know that this is not the case.
>>
>> I'll prepare an alternate patch that combines them.
>
> Fwiw, updating the insn counts blindly like this has very small value on
> the one hand, and negative value on the other. In total, negative
> value.
>
> If it is not possible to keep these tests up-to-date easily the test
> should be improved. If tests regressed otoh we should ***not*** paper
> over that with patches like this, but investigate what happened instead:
> such regressions are *real*.
>
> So which is it here? I am assuming it is a not-to-well written testcase
> without all the necessary noipa attrs, and/or putting more than one
> thing to test per function directly. Insn counts then shift easily if
> the compiler decides to factor (CSE etc.) your code differently, but
> that is a testcase artifact then, not something we want to adjust counts
> for all of the time.
>
> It is feasible to do these insn count things only for trivial tiny
> snippets. Everything bigger will regress all of the time, no one will
> look at it properly, and instead people will just do blind "update
> counts" patches like this :-/ *Good* insn count tests are quite
> valuable, but harder to write. But maintenance costs noticably bigger
> than zero for a testcase are not good, how many testcases do we run in
> the testsuite?
>
> So, can we fix the underlying problem here please?
Thanks for all the comments and good question.
I looked into this issue and found the current counts for 32-bit are
mainly for aix (it doesn't need any updates there), and there are some
generated assembly differences between aix 32-bit and 32-bit Linux, and
it seems to be related to if compiler saves the frame pointer or not.
Take a function testbc_var from fold-vec-extract-char.p7.c as example:
#include <altivec.h>
unsigned char
testbc_var (vector bool char vbc2, signed int si)
{
return vec_extract (vbc2, si);
}
1) on aix 32-bit, with trunk:
.testbc_var:
li 9,32
addi 10,1,-64
stxvw4x 34,10,9
rlwinm 3,3,0,28,31
addi 9,3,-64 // these two lines
add 3,9,1 // can be combined, see below.
lbz 3,32(3)
blr
with old gcc (without r11-6615):
.testbc_var:
addi 10,1,-64
li 9,32
stxvw4x 34,10,9
rlwinm 3,3,0,28,31
add 3,10,3 // better
lbz 3,32(3)
blr
apparently an extra unnecessary addi is created. The test
case expects one addi to adjust stack for a temp space,
one add to prepare the index for the extracted byte.
2) same thing happens on aix 64-bit and Linux 64-bit:
trunk:
.testbc_var:
li 9,48
addi 10,1,-64
stxvw4x 34,10,9
rldicl 5,5,0,60
addi 9,5,-64 // similar to aix 32-bit
add 5,9,1 // ....
lbz 3,48(5)
blr
vs. optimized:
.testbc_var:
addi 10,1,-64
li 9,48
stxvw4x 34,10,9
rldicl 5,5,0,60
add 5,10,5 // better
lbz 3,48(5)
blr
3) but for Linux 32-bit, they are the same between trunk
and old gcc (without r11-6615):
testbc_var:
stwu 1,-48(1)
li 9,16
rlwinm 3,3,0,28,31
stxvw4x 34,1,9
add 3,1,3
lbz 3,16(3)
addi 1,1,48
blr
So the expected count adjusted for aix 32-bit broke Linux
32-bit.
As above, the behavior change (one more addi) on 64-bit and aix
32-bit results in sub-optimal code than before, but we updated
the counts previously, so I changed PR101169's component to
rtl-optimization for further investigation. From 3), what
Alexandre proposed to fix for Linux 32-bit is actually to restore
the expected count back to before. :)
BR,
Kewen
prev parent reply other threads:[~2023-05-31 9:00 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-24 5:51 Alexandre Oliva
2023-05-25 5:44 ` Kewen.Lin
2023-05-25 10:05 ` Alexandre Oliva
2023-05-25 11:22 ` Segher Boessenkool
2023-05-25 13:55 ` Alexandre Oliva
2023-05-25 15:33 ` Segher Boessenkool
2024-04-22 10:11 ` [PATCH v2] " Alexandre Oliva
2024-05-25 8:17 ` Alexandre Oliva
2024-05-27 3:11 ` Kewen.Lin
2024-05-29 17:01 ` Alexandre Oliva
2023-05-31 9:00 ` Kewen.Lin [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=64d67b5d-6ab5-79bf-0375-49e426b7559e@linux.ibm.com \
--to=linkw@linux.ibm.com \
--cc=dje.gcc@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=linkw@gcc.gnu.org \
--cc=mikestump@comcast.net \
--cc=oliva@adacore.com \
--cc=ro@CeBiTec.Uni-Bielefeld.DE \
--cc=segher@kernel.crashing.org \
--cc=vmakarov@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).