* autoinc / postinc not used
@ 2024-01-16 20:02 stefan
2024-01-16 23:52 ` Oleg Endo
0 siblings, 1 reply; 6+ messages in thread
From: stefan @ 2024-01-16 20:02 UTC (permalink / raw)
To: gcc-help
Hi all,
I work a lot with the good old m68k target, where post-increment addressing is supported, and I was surprised that almost no post-increments are used in the generated code.
This simple code:
void memclr (int length, long * ptr) {
  for(;length--;){
    *ptr++= 0;
  }
}
does not use post-increments on AVR or SH. See also https://godbolt.org/z/fTvdv65rr
On M68K a post-increment appears:
memclr:
        move.l 4(%sp),%d1
        move.l 8(%sp),%a0
        move.l %d1,%d0
        subq.l #1,%d0
        tst.l %d1
        jeq .L1
.L3:
        clr.l (%a0)+
        dbra %d0,.L3
        clr.w %d0
        subq.l #1,%d0
        jcc .L3
.L1:
        rts
If you change the code and add a 2nd statement to the loop:
void memclr (int length, long * ptr) {
  for(;length--;){
    *ptr++= 0;
    *ptr++= 0;
  }
}
the post-increment disappears:
memclr:
        move.l 4(%sp),%d1
        move.l 8(%sp),%a0
        move.l %d1,%d0
        subq.l #1,%d0
        tst.l %d1
        jeq .L1
.L3:
        clr.l (%a0)
        addq.l #8,%a0
        clr.l -4(%a0)
        dbra %d0,.L3
        clr.w %d0
        subq.l #1,%d0
        jcc .L3
.L1:
        rts
This is caused by several unfortunate conversions/optimizations. Here comes the first:
The gimplification pass lowers each post-increment by creating the next pointer before the current pointer is used, which looks like this:
ptr.0 = ptr;
ptr = ptr.0 + 4;
*ptr.0 = 0;
ptr.1 = ptr;
ptr = ptr.1 + 4;
*ptr.1 = 0;
In the following passes this gets optimized further, but the addition always stays in front of the last zero assignment, and the two additions end up as a single +8. Since +8 does not match the access size of 4 bytes, the first post-increment is lost as well, and the last zero assignment is done with offset -4. That explains the generated code.
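To make the ordering issue concrete, here is a small C sketch of the two statement orders. It is only an illustration: the function names are invented and nothing here is GCC code. `+ 1` on a `long *` stands in for the `+ 4` bytes in the dump above, assuming a target where `long` is 4 bytes, as on m68k.

```c
#include <assert.h>
#include <string.h>

/* Order produced by gimplification: the pointer is advanced before the
   store through the saved copy (ptr.0 / ptr.1 above become p0 / p1). */
long *clear_two_bump_first(long *ptr)
{
    long *p0 = ptr;
    ptr = p0 + 1;
    *p0 = 0;
    long *p1 = ptr;
    ptr = p1 + 1;
    *p1 = 0;
    return ptr;
}

/* Desired order: store first, then advance, which is the shape of
   clr.l (%a0)+. */
long *clear_two_store_first(long *ptr)
{
    long *p0 = ptr;
    *p0 = 0;
    ptr = p0 + 1;
    long *p1 = ptr;
    *p1 = 0;
    ptr = p1 + 1;
    return ptr;
}

/* Both orders must clear the same two words and return the same pointer. */
void check_orders_agree(void)
{
    long a[4] = {1, 2, 3, 4};
    long b[4] = {1, 2, 3, 4};
    long *end_a = clear_two_bump_first(a);
    long *end_b = clear_two_store_first(b);
    assert(end_a - a == 2 && end_b - b == 2);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

Both functions have the same effect, so the reordering is legal in this simple case. Only the second order keeps each store directly followed by an increment of matching size.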
Now here comes my question:
Is there a more conforming/easier/better way to swap the generated GIMPLE statements than patching gimplify_modify_expr to check for assignment pairs where the pointer add can be moved behind the memory assignment?
My hack is ugly:
/* p2 is the statement just emitted, p1 the statement before it. */
gimple * p2 = gimple_seq_last_stmt(*pre_p);
if (p2->code == GIMPLE_ASSIGN && p2->prev && p2->prev != p2)
  {
    gimple * p1 = p2->prev;
    if (p1->code == GIMPLE_ASSIGN)
      {
        /* b: variable written by p1; x1/x2: LHS and first RHS operand of p2. */
        tree b = gimple_assign_lhs(p1);
        tree x1 = gimple_assign_lhs(p2);
        tree x2 = gimple_assign_rhs1(p2);
        /* p2 must be a load from or a store to memory and must not use b as its value. */
        if (b != x2 && (TREE_CODE(b) == VAR_DECL || TREE_CODE(x2) == VAR_DECL || TREE_CODE(b) == PARM_DECL || TREE_CODE(x2) == PARM_DECL) &&
            ((TREE_CODE(x1) == VAR_DECL && TREE_CODE(x2) == MEM_REF && TREE_OPERAND(x2, 0) != b
              && (TREE_CODE(TREE_OPERAND(x2, 0)) == VAR_DECL || TREE_CODE(TREE_OPERAND(x2, 0)) == PARM_DECL)) ||
             (TREE_CODE(x1) == MEM_REF && (TREE_CODE(x2) == INTEGER_CST || (TREE_CODE(x2) == VAR_DECL && TREE_OPERAND(x1, 0) != b)))
             && (TREE_CODE(TREE_OPERAND(x1, 0)) == VAR_DECL || TREE_CODE(TREE_OPERAND(x1, 0)) == PARM_DECL)))
          {
            /* Unlink p1 and re-insert it after p2, so the memory access comes first. */
            gimple_stmt_iterator to = gsi_last (*pre_p);
            gimple_stmt_iterator from = to;
            from.ptr = p1;
            gsi_remove (&from, false);
            gsi_insert_after (&to, p1, GSI_NEW_STMT);
          }
      }
  }
(More modifications are necessary to create better code, but it is possible.)
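The idea behind the hack can be modelled outside of GCC. The following is a toy sketch, not GCC code: `toy_stmt`, `toy_can_sink` and `toy_sink_pointer_adds` are invented names, statements live in a plain array instead of a gimple_seq, and aliasing is ignored because the toy assignments never read memory. It moves an assignment behind the following store when the store neither addresses through nor reads the assigned variable.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CONST (-1)  /* marks "no variable", e.g. a constant operand */

enum toy_kind { TOY_ASSIGN, TOY_STORE };

/* Toy stand-in for a GIMPLE statement; variables are small integer ids. */
struct toy_stmt {
    enum toy_kind kind;
    int def;   /* TOY_ASSIGN: variable written, else TOY_CONST */
    int base;  /* TOY_STORE: pointer variable stored through, else TOY_CONST */
    int use;   /* variable read as value, or TOY_CONST */
};

/* The assignment may sink below the store only if the store neither
   addresses through nor reads the variable the assignment writes. */
int toy_can_sink(const struct toy_stmt *assign, const struct toy_stmt *store)
{
    return assign->kind == TOY_ASSIGN
        && store->kind == TOY_STORE
        && store->base != assign->def
        && store->use != assign->def;
}

/* Swap each qualifying adjacent pair; returns the number of swaps. */
size_t toy_sink_pointer_adds(struct toy_stmt *seq, size_t n)
{
    size_t swaps = 0;
    assert(seq != NULL || n == 0);
    for (size_t i = 0; i + 1 < n; i++) {
        if (toy_can_sink(&seq[i], &seq[i + 1])) {
            struct toy_stmt tmp = seq[i];
            seq[i] = seq[i + 1];
            seq[i + 1] = tmp;
            swaps++;
            i++;  /* skip the assignment that was just moved */
        }
    }
    return swaps;
}
```

Applied to the six-statement sequence from the dump above (with ptr, ptr.0 and ptr.1 as ids 0, 1 and 2), this performs two swaps, so that each store is directly followed by its pointer add.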
Thanks
Stefan
* Re: autoinc / postinc not used
2024-01-16 20:02 autoinc / postinc not used stefan
@ 2024-01-16 23:52 ` Oleg Endo
2024-01-17 15:19 ` AW: " stefan
0 siblings, 1 reply; 6+ messages in thread
From: Oleg Endo @ 2024-01-16 23:52 UTC (permalink / raw)
To: stefan, gcc-help
Hi,
On Tue, 2024-01-16 at 21:02 +0100, stefan@franke.ms wrote:
> Hi all,
>
> I work a lot with the good old m68k target where post-increment is supported, and I was surprised that almost no post-increments are used in the generated code.
>
> This simple code:
>
> void memclr (int length, long * ptr) {
>   for(;length--;){
>     *ptr++= 0;
>   }
> }
>
> does not use post-increments on AVR or SH. See also https://godbolt.org/z/fTvdv65rr
>
This issue has been around for a very long time. I guess auto-inc is not
important enough on modern architectures to be of a major concern.
On SH{1,2,3,4} post-inc is only available for mem loads, not for mem stores.
SH2A adds support for stores with post-inc. You have to specify "-mb -m2a"
for that, as the default target of sh-elf is SH1 (-m1). But yeah, it's also
not utilized in this case.
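For comparison, a load-side loop is the case that the SH post-increment load form `mov.l @Rm+,Rn` covers. A minimal sketch follows; the function name `sum_words` is made up for illustration, and whether a given GCC version actually selects the post-increment load for it has not been checked here.

```c
#include <assert.h>
#include <stddef.h>

/* Sum `length` words starting at `ptr`.  Each iteration loads through
   the pointer and then advances it by one element, which is the access
   pattern a post-increment load is designed for. */
long sum_words(const long *ptr, size_t length)
{
    long sum = 0;
    assert(ptr != NULL || length == 0);
    while (length--)
        sum += *ptr++;
    return sum;
}
```

Compiling this for sh-elf and looking at the assembly output shows whether the load and the pointer increment are combined.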
Cheers,
Oleg
* AW: autoinc / postinc not used
2024-01-16 23:52 ` Oleg Endo
@ 2024-01-17 15:19 ` stefan
2024-01-17 23:32 ` Oleg Endo
0 siblings, 1 reply; 6+ messages in thread
From: stefan @ 2024-01-17 15:19 UTC (permalink / raw)
To: gcc-help
There are some more hacks needed to fix the handling here and there; most of the work is needed for -funroll-loops. In the end, gcc creates beautiful code, as shown here: http://franke.ms/cex/z/MGTb5P . That is still the old version 6.5.0, but maybe I'll port it to a more recent version.
If there is interest for other targets, I can provide the information needed to apply my changes. Or you can grab it yourself from my GitHub repo.
Regards
Stefan
* Re: AW: autoinc / postinc not used
2024-01-17 15:19 ` AW: " stefan
@ 2024-01-17 23:32 ` Oleg Endo
2024-01-18 16:17 ` AW: " stefan
0 siblings, 1 reply; 6+ messages in thread
From: Oleg Endo @ 2024-01-17 23:32 UTC (permalink / raw)
To: stefan, gcc-help
On Wed, 2024-01-17 at 16:19 +0100, stefan@franke.ms wrote:
> There are some more hacks needed to fix the handling here and there, most work is needed for -funroll-loops. After all, gcc creates beautiful code as shown here http://franke.ms/cex/z/MGTb5P . There it's still the old version 6.5.0, but maybe I'll port that to a more recent version.
> If there is interest for other targets, I can provide the information to apply my changes. Or you grab it yourself from my github repo.
>
The results you're getting on GCC 6 look great. Where is the patch or your
github repo?
Cheers,
Oleg
* AW: AW: autoinc / postinc not used
2024-01-17 23:32 ` Oleg Endo
@ 2024-01-18 16:17 ` stefan
2024-01-30 12:22 ` Oleg Endo
0 siblings, 1 reply; 6+ messages in thread
From: stefan @ 2024-01-18 16:17 UTC (permalink / raw)
To: gcc-help
My gcc repo is here: https://github.com/bebbo/gcc and the most important branch is amiga6. Some changes are inside ifdef blocks with TARGET_M68K or TARGET_AMIGAOS, so don't expect benefits out of the box...
...
Regards
Stefan
* Re: AW: AW: autoinc / postinc not used
2024-01-18 16:17 ` AW: " stefan
@ 2024-01-30 12:22 ` Oleg Endo
0 siblings, 0 replies; 6+ messages in thread
From: Oleg Endo @ 2024-01-30 12:22 UTC (permalink / raw)
To: stefan, gcc-help
On Thu, 2024-01-18 at 17:17 +0100, stefan@franke.ms wrote:
> My gcc repo is here: https://github.com/bebbo/gcc and the most important branch is amiga6. Some changes are inside ifdef blocks with TARGET_M68K or TARGET_AMIGAOS, so don't expect benefits out of the box...
> ...
>
>
Nice, although a bit difficult to distill the actual changes. It's better
to send a patch to gcc-patches or start a discussion for the anticipated
changes at the gimple level on the development mailing list.
Erik Varga and I tried to do something for addressing mode optimizations
some years ago:
https://github.com/erikvarga/gcc/
The idea was to do it solely at the RTL level. The original first target was
SH, but I had M68K on the radar as well. Actually, many other targets would
benefit from this. E.g., AFAIK the RISC-V backend has rolled its own way of
optimizing for short displacements.
It's a bit difficult to come up with a generic and yet easy to use
optimization pass that just magically works. Let me know if you're
interested in picking up any of it.
Cheers,
Oleg