public inbox for gcc@gcc.gnu.org
* Fixing inline expansion of overlapping memmove and non-overlapping memcpy
@ 2019-05-14 19:21 Aaron Sawdey
From: Aaron Sawdey @ 2019-05-14 19:21 UTC
  To: gcc, Joseph Myers, Jakub Jelinek, Richard Biener, law
  Cc: Segher Boessenkool, David Edelsohn, Bill Schmidt

GCC does not currently do inline expansion of overlapping memmove, nor does it
have an expansion pattern to allow for non-overlapping memcpy, so I plan to add
patterns and support to implement this in the GCC 10 timeframe.

At present memcpy and memmove are kind of entangled. Here's the current state of
play:

memcpy -> expand with movmem pattern
memmove (no overlap) -> transform to memcpy -> expand with movmem pattern
memmove (overlap) -> remains memmove -> glibc call
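
To make the three cases concrete, here is a small C sketch of the kind of
source that falls into each one. The names (BLOCK, copy_block, move_disjoint,
shift_left, demo) are made up for illustration, and whether a particular
compiler version and target really treats each call as described is an
assumption, not something this sketch verifies.

```c
#include <assert.h>
#include <string.h>

#define BLOCK 24  /* hypothetical block size, chosen only for illustration */

struct pair { char a[BLOCK]; char b[BLOCK]; };

/* Case 1: memcpy. The caller guarantees dst and src do not overlap. */
void copy_block(char *dst, const char *src)
{
    memcpy(dst, src, BLOCK);
}

/* Case 2: memmove between two distinct members of one struct. The two
   ranges cannot overlap, so this is the sort of call that could be
   rewritten as memcpy (assumption: the compiler can prove it). */
void move_disjoint(struct pair *p)
{
    memmove(p->a, p->b, BLOCK);
}

/* Case 3: memmove within one buffer. Source [4, 4+BLOCK) and destination
   [0, BLOCK) overlap, so this has to keep memmove semantics. */
void shift_left(char *buf)
{
    memmove(buf, buf + 4, BLOCK);
}

/* Small self-check that exercises the three calls. */
void demo(void)
{
    char src[BLOCK], dst[BLOCK];
    memset(src, 'x', sizeof src);
    copy_block(dst, src);
    assert(memcmp(dst, src, BLOCK) == 0);

    struct pair p;
    memset(p.a, 'a', BLOCK);
    memset(p.b, 'b', BLOCK);
    move_disjoint(&p);
    assert(memcmp(p.a, p.b, BLOCK) == 0);

    char buf[BLOCK + 4];
    for (int i = 0; i < BLOCK + 4; i++)
        buf[i] = (char)i;
    shift_left(buf);
    for (int i = 0; i < BLOCK; i++)
        assert(buf[i] == (char)(i + 4));
}
```

Comparing the generated assembly for the three functions (for example with
-O2 -S) is one way to see which calls end up inline and which become a
library call on a given target.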

There are several problems with this. If the memmove() arguments do in fact
overlap, then the inline expansion is not used at all, which makes no sense:
small blocks pay the cost of calling a library function instead of being
expanded inline.
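
One way an inline expansion of a small memmove can stay correct for
overlapping arguments is to load the whole block before storing any of it.
The following is a minimal sketch of that idea in plain C, not what GCC
emits; small_memmove and SMALL_MAX are hypothetical names, and the temporary
array merely stands in for the registers an expander would use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SMALL_MAX 32  /* hypothetical cutoff for a "small" block */

/* Overlap-safe move of at most SMALL_MAX bytes: every load happens
   before the first store, so the copy direction does not matter. */
void small_memmove(void *dst, const void *src, size_t n)
{
    unsigned char tmp[SMALL_MAX];   /* stands in for registers */

    assert(n <= SMALL_MAX);
    memcpy(tmp, src, n);            /* all loads first */
    memcpy(dst, tmp, n);            /* then all stores */
}
```

Because nothing is stored until the source has been read completely, the same
sequence works whether dst lies before or after src.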

There is currently no way to have a separate memcpy pattern. I know from
experience with expansion of memcmp on Power that lengths on the order of
hundreds of bytes are needed before optimized glibc code overcomes the function
call overhead. But we need the memcpy guarantee of non-overlapping arguments to
expand blocks of that size inline, as we don't want to do a runtime overlap
test.
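
For reference, this is roughly what a runtime overlap test looks like when
written out in C. It is only a sketch: ranges_overlap and checked_memcpy are
hypothetical helpers, and comparing addresses through uintptr_t assumes a flat
address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return nonzero if [dst, dst+n) and [src, src+n) share at least one byte.
   Assumption: pointers convert to uintptr_t as flat addresses. */
int ranges_overlap(const void *dst, const void *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    uintptr_t dist = d > s ? d - s : s - d;

    return dist < n;
}

/* The memcpy contract written as an assertion: the caller has already
   promised that the two ranges are disjoint. */
void *checked_memcpy(void *dst, const void *src, size_t n)
{
    assert(!ranges_overlap(dst, src, n));
    return memcpy(dst, src, n);
}
```

With memcpy the check is the caller's promise and costs nothing at run time.
An expansion that only knows it has a memmove would have to execute the
comparison and branch on every call.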

There is some analysis that happens in gimple_fold_builtin_memory_op() that
determines when memmove calls cannot have an overlap between the arguments and
converts them into memcpy(), which is nice.

However, in builtins.c expand_builtin_memmove() does not actually do the
expansion using the movmem pattern. This is why a memmove() call that cannot be
converted to memcpy() by gimple_fold_builtin_memory_op() is not expanded and we
call glibc memmove(). Only expand_builtin_memcpy() actually uses the movmem
pattern.

So here's my proposed set of fixes:
 * Add new optab entries for nonoverlapping_memcpy and overlapping_memmove
   cases.
 * The movmem optab will continue to be treated exactly as it is today so
   that ports that might have a broken movmem pattern that doesn't actually
   handle the overlap cases will continue to work.
 * expand_builtin_memmove() needs to actually do the memmove() expansion.
 * expand_builtin_memcpy() needs to use cpymem. Currently this happens down in
   emit_block_move_via_movmem() so some functions might need to be renamed.
 * Ports can then add the new overlapping move and nonoverlapping copy expanders
   and will get better expansion of both memmove and memcpy functions.

I'd be interested in any comments about pieces of this machinery that need to
work a certain way, or other related issues that should be addressed in
between expand_builtin_memcpy() and emit_block_move_via_movmem().

Thanks!
   Aaron

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain


