public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/108226] New: __restrict on inlined function parameters does not function as expected
@ 2022-12-26 0:05 jhaberman at gmail dot com
2023-01-09 13:54 ` [Bug ipa/108226] " rguenth at gcc dot gnu.org
2023-02-17 13:57 ` jamborm at gcc dot gnu.org
0 siblings, 2 replies; 3+ messages in thread
From: jhaberman at gmail dot com @ 2022-12-26 0:05 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108226
Bug ID: 108226
Summary: __restrict on inlined function parameters does not
function as expected
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: jhaberman at gmail dot com
Target Milestone: ---
In bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58526 and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60712#c3 it is said that
restrict/__restrict on inlined function parameters was fixed in GCC 5. But I
ran into a case where __restrict does not work as expected:
// Godbolt link for this example: https://godbolt.org/z/e5j93Ex3v
long g;

static void Func1(void* p1, int* p2) {
  switch (*p2) {
    case 2:
      __builtin_memcpy(p1, &g, 1);
      return;
    case 1:
      __builtin_memcpy(p1, &g, 8);
      return;
    case 0: {
      __builtin_memcpy(p1, &g, 16);
      return;
    }
  }
}

static void Func2(char* __restrict p1, int* __restrict p2) {
  *p2 = 1;
  *p1 = 123;
  Func1(p1, p2);
}

void Func3(char* p1, int* p2) {
  *p2 = 1;
  Func2(p1, p2);
}
The __restrict qualifiers on Func2() should allow the switch() to be
optimized away. Clang optimizes it; GCC does not.
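For illustration only, here is a sketch of what Func3() could reduce to once the
switch() is folded. The name Func3_expected is hypothetical and not part of the
report; the sketch assumes an 8-byte long (as on x86-64 Linux) and relies on the
__restrict promise that the store through p1 cannot change *p2, so only the
"case 1" branch remains reachable.

```c
#include <string.h>

long g;

/* Hypothetical hand-folded equivalent of Func3() from the example above.
   Assumes p1 and p2 do not alias, which is what __restrict on Func2() promises,
   and that long is 8 bytes wide. */
void Func3_expected(char *p1, int *p2) {
  *p2 = 1;            /* the selector is known to be 1 */
  *p1 = 123;          /* overwritten by the copy below */
  memcpy(p1, &g, 8);  /* the only reachable branch: case 1 */
}
```

This is not compiler output, only the source-level result one would hope for.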
It appears that __restrict on function parameters can even make the code worse.
Consider a slight variation on this example:
// Godbolt link for this example: https://godbolt.org/z/Y61qajETd
long g;

static void Func1(void* p1, int* p2) {
  switch (*p2) {
    case 2:
      __builtin_memcpy(p1, &g, 1);
      return;
    case 1:
      __builtin_memcpy(p1, &g, 8);
      return;
    case 0: {
      __builtin_memcpy(p1, &g, 16);
      return;
    }
  }
}

// If we remove __restrict here, GCC succeeds in optimizing away the switch().
static void Func2(char* __restrict p1, int* __restrict p2) {
  *p1 = 123;
  *p2 = 1;
  Func1(p1, p2);
}

void Func3(char* p1, int* p2) {
  *p2 = 1;
  Func2(p1, p2);
}
In this case, it should be straightforward to optimize away the switch(), even
without __restrict. But GCC does not optimize this correctly unless we
*remove* __restrict.
* [Bug ipa/108226] __restrict on inlined function parameters does not function as expected
From: rguenth at gcc dot gnu.org @ 2023-01-09 13:54 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108226
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jamborm at gcc dot gnu.org
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
Component|tree-optimization |ipa
Last reconfirmed| |2023-01-09
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
For the first case it's the order of inlining - we first inline Func2 into
Func3 early and only then inline Func1 at IPA time which fails to put the Func1
accesses under __restrict.
For the second case it's with __restrict:
> ./cc1 -quiet t.c -O2 -fopt-info -fdump-tree-all
t.c:27:3: optimized: Inlining Func2/2 into Func3/3.
t.c:22:3: optimized: Inlined Func1.isra/5 into Func3/3 which now has time
12.500000 and size 21, net change of -7.
vs without
> ./cc1 -quiet t.c -O2 -fopt-info -fdump-tree-all
t.c:27:3: optimized: Inlining Func2/2 into Func3/3.
t.c:22:3: optimized: Inlined Func1.constprop.isra/6 into Func3/3 which now has
time 4.375000 and size 7, net change of -21.
so somehow the restrict qualification pessimizes IPA-CP?! Martin?
Note with restrict it's again the first issue. -fno-early-inlining helps
there.
* [Bug ipa/108226] __restrict on inlined function parameters does not function as expected
From: jamborm at gcc dot gnu.org @ 2023-02-17 13:57 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108226
--- Comment #2 from Martin Jambor <jamborm at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #1)
>
> so somehow the restrict qualification pessimizes IPA-CP?! Martin?
>
Well, funny thing. Without restrict, IPA-CP sees (from release_ssa dump):
void Func3 (char * p1, int * p2)
{
  <bb 2> [local count: 1073741824]:
  *p1_3(D) = 123;
  *p2_2(D) = 1;
  Func1 (p1_3(D), p2_2(D));
  return;
}
But with restrict in Func2 parameters, Func3 becomes:
void Func3 (char * p1, int * p2)
{
  <bb 2> [local count: 1073741824]:
  *p2_2(D) = 1;
  *p1_4(D) = 123;
  Func1 (p1_4(D), p2_2(D));
  return;
}
And the different ordering of the two stores is the problem, even when
p1 is not a char pointer, because we don't trust the types of the
actual/formal parameters for TBAA (we would need to know in what types
they are read in Func1).
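As a rough illustration of why the store order matters, the sketch below puts
the two orderings side by side. The function names are hypothetical and do not
come from GCC or the report. When the store to *p2 comes last, the value read
back is 1 regardless of aliasing; when a store through p1 follows it, the value
is only known to be 1 if p1 cannot point into *p2.

```c
/* Order seen without __restrict: *p1 first, *p2 last.
   The read of *p2 is 1 whether or not p1 aliases p2. */
int order_p1_then_p2(char *p1, int *p2) {
  *p1 = 123;
  *p2 = 1;
  return *p2;
}

/* Order seen with __restrict on Func2: *p2 first, *p1 last.
   If p1 points into *p2, the later store changes the value read. */
int order_p2_then_p1(char *p1, int *p2) {
  *p2 = 1;
  *p1 = 123;
  return *p2;
}
```

Calling both with distinct objects returns 1 from each; calling them with p1
pointing at the first byte of *p2 makes only the second one return something
other than 1.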