Re: [Patch, tentative] Use MOVE_MAX_PIECES instead of MOVE_MAX in gimple_fold_builtin_memory_op

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Richard Biener <rguenther@suse.de>
To: Senthil Kumar Selvaraj <senthil_kumar.selvaraj@atmel.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [Patch, tentative] Use MOVE_MAX_PIECES instead of MOVE_MAX in gimple_fold_builtin_memory_op
Date: Fri, 25 Nov 2016 08:04:00 -0000	[thread overview]
Message-ID: <alpine.LSU.2.11.1611250859430.5294@t29.fhfr.qr> (raw)
In-Reply-To: <87ziko7ozp.fsf@atmel.com>

On Thu, 24 Nov 2016, Senthil Kumar Selvaraj wrote:

> 
> Richard Biener writes:
> 
> > On Thu, 24 Nov 2016, Richard Biener wrote:
> >
> >> On Thu, 24 Nov 2016, Senthil Kumar Selvaraj wrote:
> >> 
> >> > Hi,
> >> > 
> >> >   I've been analyzing a failing regtest (gcc.dg/strlenopt-8.c) for the avr
> >> >   target. I found that the (dump) failure is because there are 4
> >> >   instances of memcpy, while the testcase expects only 2 for a
> >> >   non-strict align target like the avr.
> >> > 
> >> >   Comparing that with a dump generated by x64_64-pc-linux, I found that
> >> >   the extra memcpy's come from the forwprop pass, when it replaces
> >> >   strcat with strlen and memcpy. For x86_64, the memcpy generated gets
> >> >   folded into a load/store in gimple_fold_builtin_memory_op. That
> >> >   doesn't happen for the avr because len (2) happens to be bigger than
> >> >   MOVE_MAX (1).
> >> > 
> >> >   The avr can only move 1 byte efficiently from reg <-> memory, but it's
> >> >   more efficient to load and store 2 bytes than to call memcpy, so
> >> >   MOVE_MAX_PIECES is set to 2.
> >> > 
> >> >   Given that gimple_fold_builtin_memory_op gets to choose between
> >> >   leaving the memcpy call as is, or breaking it down to a by-pieces
> >> >   move, shouldn't it use MOVE_MAX_PIECES instead of
> >> >   MOV_MAX?
> >> > 
> >> >   That is what the below patch does, and that makes the test
> >> >   pass. Does this sound right?
> >> 
> >> No, as we handle both memcpy and memmove this way we rely on
> >> the whole storage fit in a single register so we do the
> >> right thing for overlapping memory.
> >
> > So actually your patch doesn't chnage that, the ordering is ensured
> > by emitting a single GIMPLE load/store pair.  There are only
> > four targets defining MOVE_MAX_PIECES, and one (s390) even has
> > a smaller MOVE_MAX_PIECES than MOVE_MAX (huh).  AVR has larger
> > MOVE_MAX_PIECES than MOVE_MAX, but that seems to not make much
> > sense to me given their very similar description plus the
> > fact that AVR can only load a single byte at a time...
> >
> > The x86 comment says
> >
> > /* MOVE_MAX_PIECES is the number of bytes at a time which we can
> >    move efficiently, as opposed to  MOVE_MAX which is the maximum
> >    number of bytes we can move with a single instruction.
> >
> > which doesn't match up with
> >
> > @defmac MOVE_MAX
> > The maximum number of bytes that a single instruction can move quickly
> > between memory and registers or between two memory locations.
> > @end defmac
> >
> > note "quickly" here.  But OTOH
> >
> > @defmac MOVE_MAX_PIECES
> > A C expression used by @code{move_by_pieces} to determine the largest unit
> > a load or store used to copy memory is.  Defaults to @code{MOVE_MAX}.
> > @end defmac
> >
> > here the only difference is "copy memory".  But we try to special
> > case the one load - one store case, not generally "copy memory".
> >
> > So I think MOVE_MAX matches my intent when writing the code.
> 
> Ok, but isn't that inconsistent with tree-inline.c:estimate_move_cost, which
> considers MOVE_MAX_PIECES and MOVE_RATIO to decide between a libcall and
> by-pieces move?

Well, I don't understand why we have both MOVE_MAX and MOVE_MAX_PIECES.
There are exactly two uses of MOVE_MAX in GCC AFAICS, the gimple-fold.c
one and caller-save.c which derives MOVE_MAX_WORDS from it.  
MOVE_MAX_PIECES has the only use in block move expansion plus the
single use in tree-inline.c.

So I can't give a reason why one or the other should be more valid
but the tree-inline.c one tries to match memcpy expansion (obviously),
while the gimple-fold.c one tries to get at the maximum possible
single-insn move amount (and AVR is odd here having that lower than
MOVE_MAX_PIECES, compared to say s390 which has it the opposite way
around).

Richard.

> Regards
> Senthil
> 
> >
> > Richard.
> >
> >> Richard.
> >> 
> >> > Regards
> >> > Senthil
> >> > 
> >> > Index: gcc/gimple-fold.c
> >> > ===================================================================
> >> > --- gcc/gimple-fold.c	(revision 242741)
> >> > +++ gcc/gimple-fold.c	(working copy)
> >> > @@ -703,7 +703,7 @@
> >> >        src_align = get_pointer_alignment (src);
> >> >        dest_align = get_pointer_alignment (dest);
> >> >        if (tree_fits_uhwi_p (len)
> >> > -	  && compare_tree_int (len, MOVE_MAX) <= 0
> >> > +	  && compare_tree_int (len, MOVE_MAX_PIECES) <= 0
> >> >  	  /* ???  Don't transform copies from strings with known length this
> >> >  	     confuses the tree-ssa-strlen.c.  This doesn't handle
> >> >  	     the case in gcc.dg/strlenopt-8.c which is XFAILed for that
> >> > 
> >> > 
> >> 
> >> 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

next prev parent reply	other threads:[~2016-11-25  8:04 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-24 13:36 Senthil Kumar Selvaraj
2016-11-24 13:57 ` Richard Biener
2016-11-24 15:09   ` Richard Biener
2016-11-24 17:45     ` Senthil Kumar Selvaraj
2016-11-25  8:04       ` Richard Biener [this message]
2016-11-28 21:58         ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.11.1611250859430.5294@t29.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=senthil_kumar.selvaraj@atmel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).