* questions -mstringop-strategy option
@ 2019-01-18 22:36 Martin Sebor
From: Martin Sebor @ 2019-01-18 22:36 UTC (permalink / raw)
  To: gcc

I'm looking at the option to see if I could use it to better exercise
warnings like -Wstringop-overflow and avoid the persistent regressions
on secondary targets like arm that don't inline memory built-ins.  As
an experiment, I thought I'd try to use the option to disable memcpy
inlining altogether as that's what the option is documented to do:

   Override the internal decision heuristic to decide if __builtin_memcpy
   should be inlined and what inline algorithm to use when the expected
   size of the copy operation is known

The manual says that the argument to -mmemcpy-strategy=strategy is
a comma-separated list of alg:max_size:dest_align triplets, with alg
being one of the algorithms listed under -mstringop-strategy.  The
documented algs are:

   rep_byte
   rep_4byte
   rep_8byte
   byte_loop
   loop
   unrolled_loop
   libcall

The max_size component is said to specify the max byte size with
which inline algorithm alg is allowed, but the dest_align component
is undocumented.

Looking through the tests, for example, I found:

   -mmemcpy-strategy=libcall:-1:noalign
   -mmemcpy-strategy=vector_loop:3000:align,libcall:-1:align
   -mmemcpy-strategy=libcall:-1:align
   -mmemcpy-strategy=vector_loop:2000:align,libcall:-1:align
   -mmemcpy-strategy=rep_8byte:-1:noalign
   -mmemcpy-strategy=vector_loop:-1:align

The vector_loop alg is not documented, but it looks like dest_align
can be either align or noalign (anything else?).
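
To make the triplet format concrete, here is a minimal sketch of how
such a strategy string decomposes.  It is an illustration only, not
GCC's parser: struct triplet and parse_strategy are hypothetical names,
and the assumption that dest_align is exactly align or noalign comes
from the test examples above, not from the manual.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration only; this is not how GCC stores it.  */
struct triplet
{
  char alg[32];   /* algorithm name, e.g. "libcall" */
  long max_size;  /* largest size for ALG; -1 is used in the tests */
  int align;      /* 1 for "align", 0 for "noalign" (assumed values) */
};

/* Split STR into at most MAX alg:max_size:dest_align triplets.
   Return the number parsed, or -1 if a component is malformed.  */
int
parse_strategy (const char *str, struct triplet *out, int max)
{
  int n = 0;
  while (*str)
    {
      char flag[16];
      int used = 0;
      if (n == max)
        return -1;
      /* %n records how many characters this triplet consumed.  */
      if (sscanf (str, "%31[^:,]:%ld:%15[^:,]%n",
                  out[n].alg, &out[n].max_size, flag, &used) != 3)
        return -1;
      if (strcmp (flag, "align") == 0)
        out[n].align = 1;
      else if (strcmp (flag, "noalign") == 0)
        out[n].align = 0;
      else
        return -1;
      n++;
      str += used;
      if (*str == ',')
        str++;            /* move on to the next triplet */
      else if (*str != '\0')
        return -1;        /* unexpected trailing text */
    }
  return n;
}
```

For example, passing "vector_loop:3000:align,libcall:-1:align" yields
two triplets, the first with alg vector_loop and max_size 3000.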

So with that, I tried -mmemcpy-strategy=libcall:-1:noalign as my
first attempt (see below).  Clearly, that doesn't work: the 4-byte
copy is still turned into a load and a store.  Without looking at
the code, my guess is that the inlining heuristic the manual talks
about has to do with the expansion of memcpy calls that have not
been transformed into MEM_REFs by the middle end.  Is that right?

If yes, should the manual be updated to make this clear, and to
explain the undocumented components?  (I can put together a patch
if someone can confirm that I understand this right.)

   $ cat t.c && gcc -O2 -S -Wall -Wextra -Wpedantic \
       -fdump-tree-optimized=/dev/stdout -mmemcpy-strategy=libcall:-1:noalign t.c

   void f (void *d, const void *s)
   {
     __builtin_memcpy (d, s, 4);
   }

   ;; Function f (f, funcdef_no=0, decl_uid=1911, cgraph_uid=1, symbol_order=0)

   f (void * d, const void * s)
   {
     unsigned int _3;

     <bb 2> [local count: 1073741824]:
     _3 = MEM[(char * {ref-all})s_2(D)];
     MEM[(char * {ref-all})d_4(D)] = _3;
     return;
   }
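
One way to check the guess above would be to repeat the experiment with
copies that the middle end is unlikely to fold.  The following is only
a sketch under that assumption: copy_n and copy_256 are names made up
for illustration, and whether -mmemcpy-strategy actually changes how
they are expanded is exactly the open question.

```c
#include <stddef.h>

/* The size is not a compile-time constant, so there is no
   fixed-width MEM_REF the call could be folded into.  */
void
copy_n (void *d, const void *s, size_t n)
{
  __builtin_memcpy (d, s, n);
}

/* Constant size, but assumed too large to be turned into a single
   load and store the way the 4-byte copy is.  */
void
copy_256 (void *d, const void *s)
{
  __builtin_memcpy (d, s, 256);
}
```

Compiling this with the same command line, with and without
-mmemcpy-strategy=libcall:-1:noalign, and comparing the assembly would
show whether the option has any effect on these calls.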

Martin
