public inbox for gcc-help@gcc.gnu.org
* autoinc / postinc not used
From: stefan @ 2024-01-16 20:02 UTC (permalink / raw)
  To: gcc-help

Hi all,

 

I work a lot with the good old m68k target, where post-increment is supported, and I was surprised that almost no post-increments are used in the generated code.

This simple code:

void memclr (int length, long * ptr) {
  for(;length--;){
    *ptr++= 0;
  }
}

does not use post-increments on AVR or SH. See also https://godbolt.org/z/fTvdv65rr

On M68K a post-increment appears:

memclr:
        move.l 4(%sp),%d1
        move.l 8(%sp),%a0
        move.l %d1,%d0
        subq.l #1,%d0
        tst.l %d1
        jeq .L1
.L3:
        clr.l (%a0)+
        dbra %d0,.L3
        clr.w %d0
        subq.l #1,%d0
        jcc .L3
.L1:
        rts

If you change the code and add a second statement to the loop:

void memclr (int length, long * ptr) {
  for(;length--;){
    *ptr++= 0;
    *ptr++= 0;
  }
}

the post-increment disappears:

memclr:
        move.l 4(%sp),%d1
        move.l 8(%sp),%a0
        move.l %d1,%d0
        subq.l #1,%d0
        tst.l %d1
        jeq .L1
.L3:
        clr.l (%a0)
        addq.l #8,%a0
        clr.l -4(%a0)
        dbra %d0,.L3
        clr.w %d0
        subq.l #1,%d0
        jcc .L3
.L1:
        rts

This is caused by several unfortunate conversions/optimizations. Here is the first one:

The gimplifier lowers each post-increment by creating the next pointer value before the current pointer is used, which looks like this:

  ptr.0 = ptr;
  ptr = ptr.0 + 4;
  *ptr.0 = 0;
  ptr.1 = ptr;
  ptr = ptr.1 + 4;
  *ptr.1 = 0;

In the following passes this gets optimized further, but in the end the addition always stays in front of the last zero assignment and ends up as a single +8. Since +8 does not match the access size, the first post-increment is lost as well, and the last zero assignment is done with offset -4. That explains the generated code.
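
To make the ordering concrete, here is a small source-level C sketch of the three shapes discussed above: the order the gimplifier produces (pointer add first, store second), the merged form that matches the generated m68k loop (one +8 and a store at offset -4), and the store-then-increment order that maps directly onto `clr.l (%a0)+`. The function names are made up for illustration, and `int32_t` stands in for the 4-byte `long` of m68k. All three are semantically equivalent; only the statement order differs.

```c
#include <stdint.h>

/* Order produced by the gimplifier: the pointer is advanced
   before the store through the saved old value. */
void memclr_add_first(int length, int32_t *ptr)
{
    while (length--) {
        int32_t *p0 = ptr;
        ptr = p0 + 1;          /* ptr = ptr.0 + 4 (bytes) */
        *p0 = 0;               /* *ptr.0 = 0 */
        int32_t *p1 = ptr;
        ptr = p1 + 1;          /* ptr = ptr.1 + 4 (bytes) */
        *p1 = 0;               /* *ptr.1 = 0 */
    }
}

/* Shape after the two adds are merged: one +8, last store at offset -4. */
void memclr_merged(int length, int32_t *ptr)
{
    while (length--) {
        ptr[0] = 0;            /* clr.l (%a0)    */
        ptr += 2;              /* addq.l #8,%a0  */
        ptr[-1] = 0;           /* clr.l -4(%a0)  */
    }
}

/* Store first, then advance: each pair is a post-increment candidate. */
void memclr_store_first(int length, int32_t *ptr)
{
    while (length--) {
        *ptr = 0; ptr = ptr + 1;
        *ptr = 0; ptr = ptr + 1;
    }
}
```

Calling any of the three with `length = 2` clears four consecutive elements; the difference only matters for how easily a later pass can recognize the store and the add as one auto-increment access.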

 

Now here comes my question:

Is there a more conforming/easier/better way to swap the generated GIMPLE statements than patching gimplify_modify_expr to check for assignment pairs where the pointer add can be moved behind the memory assignment?

My hack is ugly:

 

  /* Look at the last two statements emitted so far.  */
  gimple * p2 = gimple_seq_last_stmt(*pre_p);
  if (p2->code == GIMPLE_ASSIGN && p2->prev && p2->prev != p2)
    {
      gimple * p1 = p2->prev;
      if (p1->code == GIMPLE_ASSIGN)
        {
          tree b = gimple_assign_lhs(p1);
          tree x1 = gimple_assign_lhs(p2);
          tree x2 = gimple_assign_rhs1(p2);
          /* Accept p2 only if it is a load from, or a store to, a MEM_REF
             based on a plain variable or parameter; b is the variable
             assigned by p1.  */
          if (b != x2 && (TREE_CODE(b) == VAR_DECL || TREE_CODE(x2) == VAR_DECL || TREE_CODE(b) == PARM_DECL || TREE_CODE(x2) == PARM_DECL) &&
              ((TREE_CODE(x1) == VAR_DECL && TREE_CODE(x2) == MEM_REF && TREE_OPERAND(x2, 0) != b
                && (TREE_CODE(TREE_OPERAND(x2, 0)) == VAR_DECL || TREE_CODE(TREE_OPERAND(x2, 0)) == PARM_DECL)) ||
               (TREE_CODE(x1) == MEM_REF && (TREE_CODE(x2) == INTEGER_CST || (TREE_CODE(x2) == VAR_DECL && TREE_OPERAND(x1, 0) != b)))
                && (TREE_CODE(TREE_OPERAND(x1, 0)) == VAR_DECL || TREE_CODE(TREE_OPERAND(x1, 0)) == PARM_DECL)))
            {
              /* Unlink p1 and re-insert it after p2, swapping the two.  */
              gimple_stmt_iterator to = gsi_last (*pre_p);
              gimple_stmt_iterator from = to;
              from.ptr = p1;
              gsi_remove (&from, false);
              gsi_insert_after (&to, p1, GSI_NEW_STMT);
            }
        }
    }
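
For readers who do not have the GCC sources at hand, the idea behind the hack can be sketched on a toy statement list. This is not GCC code: the `struct stmt` mini-IR, the variable numbering and `sink_ptr_adds` are all hypothetical, and the legality check is far simpler than what real GIMPLE needs. It only shows the transformation itself, moving a pointer add behind a directly following store that does not use the variable the add writes.

```c
#include <stddef.h>

/* Hypothetical mini-IR, not GCC's GIMPLE data structures. */
enum kind { COPY, PTR_ADD, STORE };

struct stmt {
    enum kind kind;
    int def;   /* variable written (-1 for STORE, which writes memory) */
    int use;   /* variable read: source operand, or address for STORE  */
};

/* Move each PTR_ADD behind a directly following STORE when the store's
   address variable is not the one the add writes.
   Returns the number of swaps performed. */
size_t sink_ptr_adds(struct stmt *s, size_t n)
{
    size_t swaps = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        if (s[i].kind == PTR_ADD && s[i + 1].kind == STORE
            && s[i + 1].use != s[i].def) {
            struct stmt tmp = s[i];
            s[i] = s[i + 1];
            s[i + 1] = tmp;
            swaps++;
            i++;   /* skip the add that was just moved */
        }
    }
    return swaps;
}
```

With the variables numbered `ptr = 0`, `ptr.0 = 1`, `ptr.1 = 2`, the six-statement sequence from the unrolled loop body turns from copy/add/store, copy/add/store into copy/store/add, copy/store/add, which is the order a post-increment needs.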

 

(More modifications are necessary to create better code, but it’s possible.)

Thanks

Stefan

* Re: autoinc / postinc not used
From: Oleg Endo @ 2024-01-16 23:52 UTC (permalink / raw)
  To: stefan, gcc-help

Hi,


On Tue, 2024-01-16 at 21:02 +0100, stefan@franke.ms wrote:
> Hi all,
>
> I work a lot with the good old m68k target, where post-increment is supported, and I was surprised that almost no post-increments are used in the generated code.
>
> This simple code:
>
> void memclr (int length, long * ptr) {
>   for(;length--;){
>     *ptr++= 0;
>   }
> }
>
> does not use post-increments on AVR or SH. See also https://godbolt.org/z/fTvdv65rr
>

This issue has been around for a very long time.  I guess auto-inc is not
important enough on modern architectures to be of a major concern.

On SH{1,2,3,4} post-inc is only available for mem loads, not for mem stores.
SH2A adds support for stores with post-inc.  You have to specify "-mb -m2a"
for that, as the default target of sh-elf is SH1 (-m1).  But yeah, it's also
not utilized in this case.
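
As a point of reference, the C idioms below are the access patterns that post-increment and pre-decrement addressing modes are designed for. This is only an illustrative sketch with made-up function names; which pattern a particular SH variant can encode for loads versus stores depends on the ISA level, and whether GCC actually picks an auto-modifying instruction depends on the version and options used.

```c
/* Load through *p++ : candidate for a post-increment load. */
long sum_up(const long *p, int n)
{
    long sum = 0;
    while (n--)
        sum += *p++;
    return sum;
}

/* Store through *p++ : candidate for a post-increment store. */
void clear_up(long *p, int n)
{
    while (n--)
        *p++ = 0;
}

/* Store through *--p : candidate for a pre-decrement store.
   `end` points one past the last element to clear. */
void clear_down(long *end, int n)
{
    while (n--)
        *--end = 0;
}
```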

Cheers,
Oleg

* AW: autoinc / postinc not used
From: stefan @ 2024-01-17 15:19 UTC (permalink / raw)
  To: gcc-help

There are some more hacks needed to fix the handling here and there; most of the work is needed for -funroll-loops. After all that, gcc creates beautiful code, as shown here: http://franke.ms/cex/z/MGTb5P . That is still the old version 6.5.0, but maybe I'll port it to a more recent version.
If there is interest for other targets, I can provide the information needed to apply my changes. Or you can grab it yourself from my github repo.

Regards

Stefan


* Re: AW: autoinc / postinc not used
From: Oleg Endo @ 2024-01-17 23:32 UTC (permalink / raw)
  To: stefan, gcc-help


The results you're getting on GCC 6 look great.  Where is the patch or your
github repo?

Cheers,
Oleg

* AW: AW: autoinc / postinc not used
From: stefan @ 2024-01-18 16:17 UTC (permalink / raw)
  To: gcc-help

My gcc repo is here: https://github.com/bebbo/gcc and the most important branch is amiga6. Some changes are inside ifdef blocks with TARGET_M68K or TARGET_AMIGAOS, so don't expect benefits out of the box...

Regards

Stefan


* Re: AW: AW: autoinc / postinc not used
From: Oleg Endo @ 2024-01-30 12:22 UTC (permalink / raw)
  To: stefan, gcc-help


Nice, although a bit difficult to distill the actual changes.  It's better
to send a patch to gcc-patches or start a discussion for the anticipated
changes at the gimple level on the development mailing list.

Erik Varga and I tried to do something for addressing mode optimizations
some years ago.

https://github.com/erikvarga/gcc/

The idea was to do it solely at the RTL level.  The original first target
was SH, but I had M68K on the radar as well.  Actually, many other targets
would benefit from this.  E.g., AFAIK the RISC-V backend has rolled its own
way of optimizing for short displacements.

It's a bit difficult to come up with a generic and yet easy to use
optimization pass that just magically works.  Let me know if you're
interested in picking up any of it.

Cheers,
Oleg
