public inbox for gcc@gcc.gnu.org
* Fixing inline expansion of overlapping memmove and non-overlapping memcpy
@ 2019-05-14 19:21 Aaron Sawdey
  2019-05-15 12:23 ` Richard Biener
  2019-05-15 13:10 ` Michael Matz
  0 siblings, 2 replies; 13+ messages in thread
From: Aaron Sawdey @ 2019-05-14 19:21 UTC (permalink / raw)
  To: gcc, Joseph Myers, Jakub Jelinek, Richard Biener, law
  Cc: Segher Boessenkool, David Edelsohn, Bill Schmidt

GCC does not currently do inline expansion of overlapping memmove, nor does it
have an expansion pattern to allow for non-overlapping memcpy, so I plan to add
patterns and support to implement this in gcc 10 timeframe.

At present memcpy and memmove are kind of entangled. Here's the current state of
play:

memcpy -> expand with movmem pattern
memmove (no overlap) -> transform to memcpy -> expand with movmem pattern
memmove (overlap) -> remains memmove -> glibc call

There are several problems currently. If the memmove() arguments are in fact
overlapping, then the expansion is actually not used which makes no sense and
costs performance of calling a library function instead of inline expanding
memmove() of small blocks.
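
As a minimal sketch of the three cases listed above (the function names are
made up for illustration, and whether a given GCC version actually folds or
expands each call depends on its alias analysis and on the target):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers; the names are illustrative only. */

/* memcpy: a candidate for inline expansion through the movmem pattern. */
void copy_block(char *restrict dst, const char *restrict src)
{
    memcpy(dst, src, 16);
}

/* memmove on operands the compiler may be able to prove distinct:
   the kind of call that can be folded to memcpy and then expanded. */
void move_no_overlap(char *restrict dst, const char *restrict src)
{
    memmove(dst, src, 16);
}

/* memmove within one 16-byte buffer: source and destination overlap,
   so the call stays a memmove and goes to the library. */
void shift_right_by_one(char *buf)
{
    assert(buf != NULL);
    memmove(buf + 1, buf, 15);
}
```

The last function is the small-block case that currently pays for a library
call even though the size is a compile-time constant.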

There is currently no way to have a separate memcpy pattern. I know from
experience with expansion of memcmp on power that lengths on the order of
hundreds of bytes are needed before the function call overhead is overcome by
optimized glibc code. But we need the memcpy guarantee of non-overlapping
arguments to make that happen, as we don't want to do a runtime overlap test.
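
For a sense of what is being avoided: without the memcpy guarantee, an inline
expansion that is not overlap-safe would have to be guarded by a check roughly
like the following. This is only a sketch; ranges_overlap is a hypothetical
helper, not a GCC function, and it assumes pointers convert to uintptr_t in
the usual flat-address way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if [dst, dst+n) and [src, src+n) share at least one byte. */
int ranges_overlap(const void *dst, const void *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    assert(n <= SIZE_MAX / 2);  /* keep the distance comparison meaningful */
    /* The ranges overlap when the distance between the two start
       addresses is smaller than the length. */
    return d < s ? (s - d < n) : (d - s < n);
}
```

The check itself is cheap, but it adds a compare and branch to every expanded
call, which is what the non-overlap guarantee of memcpy lets us skip.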

There is some analysis that happens in gimple_fold_builtin_memory_op() that
determines when memmove calls cannot have an overlap between the arguments and
converts them into memcpy() which is nice.

However, in builtins.c, expand_builtin_memmove() does not actually do the
expansion using the movmem pattern. This is why a memmove() call that cannot be
converted to memcpy() by gimple_fold_builtin_memory_op() is not expanded and we
call glibc memmove(). Only expand_builtin_memcpy() actually uses the movmem
pattern.

So here's my proposed set of fixes:
 * Add new optab entries for nonoverlapping_memcpy and overlapping_memmove
   cases.
 * The movmem optab will continue to be treated exactly as it is today so
   that ports that might have a broken movmem pattern that doesn't actually
   handle the overlap cases will continue to work.
 * expand_builtin_memmove() needs to actually do the memmove() expansion.
 * expand_builtin_memcpy() needs to use cpymem. Currently this happens down in
   emit_block_move_via_movmem() so some functions might need to be renamed.
 * ports can then add the new overlapping move and nonoverlapping copy expanders
   and will get better expansion of both memmove and memcpy functions.

I'd be interested in any comments about pieces of this machinery that need to
work a certain way, or other related issues that should be addressed in
between expand_builtin_memcpy() and emit_block_move_via_movmem().

Thanks!
   Aaron

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain


* Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
  2019-05-14 19:21 Fixing inline expansion of overlapping memmove and non-overlapping memcpy Aaron Sawdey
@ 2019-05-15 12:23 ` Richard Biener
  2019-05-15 13:24   ` Aaron Sawdey
  2019-05-15 13:10 ` Michael Matz
  1 sibling, 1 reply; 13+ messages in thread
From: Richard Biener @ 2019-05-15 12:23 UTC (permalink / raw)
  To: Aaron Sawdey
  Cc: GCC Development, Joseph Myers, Jakub Jelinek, Jeff Law,
	Segher Boessenkool, David Edelsohn, Bill Schmidt

On Tue, May 14, 2019 at 9:21 PM Aaron Sawdey <acsawdey@linux.ibm.com> wrote:
>
> GCC does not currently do inline expansion of overlapping memmove, nor does it
> have an expansion pattern to allow for non-overlapping memcpy, so I plan to add
> patterns and support to implement this in gcc 10 timeframe.
>
> At present memcpy and memmove are kind of entangled. Here's the current state of
> play:
>
> memcpy -> expand with movmem pattern
> memmove (no overlap) -> transform to memcpy -> expand with movmem pattern
> memmove (overlap) -> remains memmove -> glibc call
>
> There are several problems currently. If the memmove() arguments are in fact
> overlapping, then the expansion is actually not used which makes no sense and
> costs performance of calling a library function instead of inline expanding
> memmove() of small blocks.
>
> There is currently no way to have a separate memcpy pattern. I know from
> experience with expansion of memcmp on power that lengths on the order of
> hundreds of bytes are needed before the function call overhead is overcome by
> optimized glibc code. But we need the memcpy guarantee of non-overlapping
> arguments to make that happen, as we don't want to do a runtime overlap test.
>
> There is some analysis that happens in gimple_fold_builtin_memory_op() that
> determines when memmove calls cannot have an overlap between the arguments and
> converts them into memcpy() which is nice.
>
> However, in builtins.c, expand_builtin_memmove() does not actually do the
> expansion using the movmem pattern. This is why a memmove() call that cannot be
> converted to memcpy() by gimple_fold_builtin_memory_op() is not expanded and we
> call glibc memmove(). Only expand_builtin_memcpy() actually uses the movmem
> pattern.
>
> So here's my proposed set of fixes:
>  * Add new optab entries for nonoverlapping_memcpy and overlapping_memmove
>    cases.
>  * The movmem optab will continue to be treated exactly as it is today so
>    that ports that might have a broken movmem pattern that doesn't actually
>    handle the overlap cases will continue to work.
>  * expand_builtin_memmove() needs to actually do the memmove() expansion.
>  * expand_builtin_memcpy() needs to use cpymem. Currently this happens down in
>    emit_block_move_via_movmem() so some functions might need to be renamed.
>  * ports can then add the new overlapping move and nonoverlapping copy expanders
>    and will get better expansion of both memmove and memcpy functions.
>
> I'd be interested in any comments about pieces of this machinery that need to
> work a certain way, or other related issues that should be addressed in
> between expand_builtin_memcpy() and emit_block_move_via_movmem().

I wonder if introducing a __builtin_memmove_with_hints specifying whether
src < dst, src > dst, or unknown, and/or a safe block size where that
doesn't matter, would help?  It can then be safely expanded to memmove() or
to specific inline code.
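
A user-level sketch of how a direction hint could select the copy strategy
follows. Everything here is hypothetical: the enum, its values, and
move_with_hint are invented for illustration and do not correspond to an
existing GCC builtin or internal function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hint values describing what is known about the operands. */
enum overlap_hint {
    HINT_UNKNOWN,        /* nothing known: must be fully overlap-safe */
    HINT_DST_BELOW_SRC,  /* dst < src: a forward copy is safe */
    HINT_DST_ABOVE_SRC   /* dst > src: a backward copy is safe */
};

void *move_with_hint(void *dst, const void *src, size_t n,
                     enum overlap_hint hint)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(dst != NULL && src != NULL);
    switch (hint) {
    case HINT_DST_BELOW_SRC:
        /* Low to high: each source byte is read before it is overwritten. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
        return dst;
    case HINT_DST_ABOVE_SRC:
        /* High to low, for the same reason in the other direction. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
        return dst;
    default:
        /* No usable hint: fall back to the library memmove(). */
        return memmove(dst, src, n);
    }
}
```

In the compiler the two known-direction cases would be expanded inline and
only the unknown case would become a call.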

Richard.

> Thanks!
>    Aaron
>
> --
> Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
> 050-2/C113  (507) 253-7520 home: 507/263-0782
> IBM Linux Technology Center - PPC Toolchain
>


* Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
  2019-05-14 19:21 Fixing inline expansion of overlapping memmove and non-overlapping memcpy Aaron Sawdey
  2019-05-15 12:23 ` Richard Biener
@ 2019-05-15 13:10 ` Michael Matz
  2019-05-15 13:16   ` Aaron Sawdey
  1 sibling, 1 reply; 13+ messages in thread
From: Michael Matz @ 2019-05-15 13:10 UTC (permalink / raw)
  To: Aaron Sawdey
  Cc: gcc, Joseph Myers, Jakub Jelinek, Richard Biener, law,
	Segher Boessenkool, David Edelsohn, Bill Schmidt

Hi,

On Tue, 14 May 2019, Aaron Sawdey wrote:

> memcpy -> expand with movmem pattern
> memmove (no overlap) -> transform to memcpy -> expand with movmem pattern
> memmove (overlap) -> remains memmove -> glibc call
...
> However in builtins.c expand_builtin_memmove() does not actually do the 
> expansion using the movmem pattern.

Because it can't: the movmem pattern is not defined to require handling 
overlaps, and hence can't be used for any possibly overlapping 
memmove.  (So, in a way the pattern is misnamed and should probably have 
been called cpymem from the beginning, alas there we are).

> So here's my proposed set of fixes:
>  * Add new optab entries for nonoverlapping_memcpy and overlapping_memmove
>    cases.

Wouldn't it be nicer to rename the current movmem pattern to cpymem 
wholesale for all ports (i.e. roughly a big s/movmem/cpymem/ over the 
whole tree) and then introduce a new optional movmem pattern with 
overlapping semantics?


Ciao,
Michael.


* Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
  2019-05-15 13:10 ` Michael Matz
@ 2019-05-15 13:16   ` Aaron Sawdey
  0 siblings, 0 replies; 13+ messages in thread
From: Aaron Sawdey @ 2019-05-15 13:16 UTC (permalink / raw)
  To: Michael Matz
  Cc: gcc, Joseph Myers, Jakub Jelinek, Richard Biener, law,
	Segher Boessenkool, David Edelsohn, Bill Schmidt

On 5/15/19 8:10 AM, Michael Matz wrote:
> On Tue, 14 May 2019, Aaron Sawdey wrote:
> 
>> memcpy -> expand with movmem pattern
>> memmove (no overlap) -> transform to memcpy -> expand with movmem pattern
>> memmove (overlap) -> remains memmove -> glibc call
> ...
>> However in builtins.c expand_builtin_memmove() does not actually do the 
>> expansion using the movmem pattern.
> 
> Because it can't: the movmem pattern is not defined to require handling 
> overlaps, and hence can't be used for any possibly overlapping 
> memmove.  (So, in a way the pattern is misnamed and should probably have 
> been called cpymem from the beginning, alas there we are).
> 
>> So here's my proposed set of fixes:
>>  * Add new optab entries for nonoverlapping_memcpy and overlapping_memmove
>>    cases.
> 
> Wouldn't it be nicer to rename the current movmem pattern to cpymem 
> wholesale for all ports (i.e. roughly a big s/movmem/cpymem/ over the 
> whole tree) and then introduce a new optional movmem pattern with 
> overlapping semantics?

Yeah, that makes a lot of sense. I was unaware of that history, and was led
astray by the fact that the powerpc implementation of movmem works by
doing a bunch of loads into registers followed by a bunch of stores and
so (I think) would actually work for the overlap case.
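
A small C model of that load-everything-then-store-everything shape is below.
It is only a sketch of the idea, not the powerpc expander: the helper name and
the fixed 16-byte size are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: move exactly 16 bytes, loading everything first. */
void move16_load_then_store(void *dst, const void *src)
{
    uint64_t lo, hi;

    assert(dst != NULL && src != NULL);
    /* Load phase: the whole source is read into temporaries. */
    memcpy(&lo, src, sizeof lo);
    memcpy(&hi, (const unsigned char *)src + 8, sizeof hi);
    /* Store phase: nothing is written until every load has completed,
       so overlapping source and destination cannot corrupt the data. */
    memcpy(dst, &lo, sizeof lo);
    memcpy((unsigned char *)dst + 8, &hi, sizeof hi);
}
```

This only works while the whole block fits in registers; once the expansion
has to interleave loads and stores, the overlap case needs separate handling.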

Thanks,
   Aaron




-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain


* Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
  2019-05-15 12:23 ` Richard Biener
@ 2019-05-15 13:24   ` Aaron Sawdey
  2019-05-15 14:02     ` Michael Matz
  0 siblings, 1 reply; 13+ messages in thread
From: Aaron Sawdey @ 2019-05-15 13:24 UTC (permalink / raw)
  To: Richard Biener
  Cc: GCC Development, Joseph Myers, Jakub Jelinek, Jeff Law,
	Segher Boessenkool, David Edelsohn, Bill Schmidt

On 5/15/19 7:22 AM, Richard Biener wrote:
> On Tue, May 14, 2019 at 9:21 PM Aaron Sawdey <acsawdey@linux.ibm.com> wrote:
>> I'd be interested in any comments about pieces of this machinery that need to
>> work a certain way, or other related issues that should be addressed in
>> between expand_builtin_memcpy() and emit_block_move_via_movmem().
> 
> I wonder if introducing a __builtin_memmove_with_hints specifying whether
> src < dst, src > dst, or unknown, and/or a safe block size where that
> doesn't matter, would help?  It can then be safely expanded to memmove() or
> to specific inline code.

Yes this would be a nice thing to get to, a single move/copy underlying
builtin, to which we communicate what the compiler's analysis tells us
about whether the operands overlap and by how much.

Next question would be how do we move from the existing movmem pattern
(which Michael Matz tells us should be renamed cpymem anyway) to this
new thing. Are you proposing that we still have both movmem and cpymem
optab entries underneath to call the patterns but introduce this
new memmove_with_hints() to be used by things called by expand_builtin_memmove()
and expand_builtin_memcpy()?

Thanks!
   Aaron

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain


* Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
  2019-05-15 13:24   ` Aaron Sawdey
@ 2019-05-15 14:02     ` Michael Matz
  2019-05-15 14:11       ` Jakub Jelinek
  2019-05-15 16:24       ` Aaron Sawdey
  0 siblings, 2 replies; 13+ messages in thread
From: Michael Matz @ 2019-05-15 14:02 UTC (permalink / raw)
  To: Aaron Sawdey
  Cc: Richard Biener, GCC Development, Joseph Myers, Jakub Jelinek,
	Jeff Law, Segher Boessenkool, David Edelsohn, Bill Schmidt

Hi,

On Wed, 15 May 2019, Aaron Sawdey wrote:

> Yes this would be a nice thing to get to, a single move/copy underlying 
> builtin, to which we communicate what the compiler's analysis tells us 
> about whether the operands overlap and by how much.
> 
> Next question would be how do we move from the existing movmem pattern 
> (which Michael Matz tells us should be renamed cpymem anyway) to this 
> new thing. Are you proposing that we still have both movmem and cpymem 
> optab entries underneath to call the patterns but introduce this new 
> memmove_with_hints() to be used by things called by 
> expand_builtin_memmove() and expand_builtin_memcpy()?

I'd say so.  There are multiple levels at play:
a) exposure to the user: probably a new __builtin_memmove, or a new combined 
   builtin with a hint param to differentiate (but we can't get rid of 
   __builtin_memcpy/mempcpy/strcpy, which all can go through the same 
   route in the middleend)
b) getting it through the gimple pipeline, probably just a new builtin 
   code, trivial
c) expanding the new builtin, with the help of next items
d) RTL block moves: they are defined as non-overlapping and I don't think 
   we should change this (essentially they're the reflection of struct 
   copies in C)
e) how any of the above (builtins and RTL block moves) are implemented: 
   currently non-overlapping only, using movmem pattern when possible; 
   ultimately all sitting in the emit_block_move_hints() routine.

So, I'd add a new method to emit_block_move_hints indicating possible 
overlap, disabling the use of move_by_pieces.  Then in 
emit_block_move_via_movmem (also getting an indication of overlap), do the 
equivalent of:

  finished = 0;
  if (overlap_possible) {
    if (optab[movmem])
      finished = emit(movmem);
  } else {
    if (optab[cpymem])
      finished = emit(cpymem);
    if (!finished && optab[movmem])  // movmem handles overlap, so it is valid here too
      finished = emit(movmem);
  }

The overlap_possible method would only ever be used from the builtin 
expansion, and never from the RTL block move expand.  Additionally a 
target may optionally only define the movmem pattern if it's just as good 
as the cpymem pattern (e.g. because it only handles fixed small sizes and 
uses a load-all then store-all sequence).
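
The selection logic above can be modelled as a small runnable sketch. All
names below (target_patterns, choose_block_move, the stub expanders) are
invented for illustration and are not GCC interfaces; a real implementation
would live in emit_block_move_via_movmem and query the optabs.

```c
#include <assert.h>
#include <stddef.h>

/* An expander returns nonzero if it managed to emit code. */
typedef int (*expander_fn)(void);

struct target_patterns {
    expander_fn cpymem;  /* non-overlapping copy; NULL if not provided */
    expander_fn movmem;  /* overlap-safe move; NULL if not provided */
};

enum chosen { CHOSE_NONE, CHOSE_CPYMEM, CHOSE_MOVMEM };

/* CHOSE_NONE means: fall back to a library call. */
enum chosen choose_block_move(const struct target_patterns *t,
                              int overlap_possible)
{
    assert(t != NULL);
    /* cpymem may only be tried when overlap has been ruled out. */
    if (!overlap_possible && t->cpymem && t->cpymem())
        return CHOSE_CPYMEM;
    /* movmem is valid in both cases, so it also serves as the fallback. */
    if (t->movmem && t->movmem())
        return CHOSE_MOVMEM;
    return CHOSE_NONE;
}

/* Stub expanders standing in for target patterns. */
int expander_succeeds(void) { return 1; }
int expander_declines(void) { return 0; }
```

A target that provides only movmem still gets inline expansion for memcpy,
which matches the last point above.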


Ciao,
Michael.


* Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
  2019-05-15 14:02     ` Michael Matz
@ 2019-05-15 14:11       ` Jakub Jelinek
  2019-05-15 14:47         ` Michael Matz
  2019-05-15 16:24       ` Aaron Sawdey
  1 sibling, 1 reply; 13+ messages in thread
From: Jakub Jelinek @ 2019-05-15 14:11 UTC (permalink / raw)
  To: Michael Matz
  Cc: Aaron Sawdey, Richard Biener, GCC Development, Joseph Myers,
	Jeff Law, Segher Boessenkool, David Edelsohn, Bill Schmidt

On Wed, May 15, 2019 at 02:02:32PM +0000, Michael Matz wrote:
> > Yes this would be a nice thing to get to, a single move/copy underlying 
> > builtin, to which we communicate what the compiler's analysis tells us 
> > about whether the operands overlap and by how much.
> > 
> > Next question would be how do we move from the existing movmem pattern 
> > (which Michael Matz tells us should be renamed cpymem anyway) to this 
> > new thing. Are you proposing that we still have both movmem and cpymem 
> > optab entries underneath to call the patterns but introduce this new 
> > memmove_with_hints() to be used by things called by 
> > expand_builtin_memmove() and expand_builtin_memcpy()?
> 
> I'd say so.  There are multiple levels at play:
> a) exposure to the user: probably a new __builtin_memmove, or a new combined 
>    builtin with a hint param to differentiate (but we can't get rid of 
>    __builtin_memcpy/mempcpy/strcpy, which all can go through the same 
>    route in the middleend)
> b) getting it through the gimple pipeline, probably just a new builtin 
>    code, trivial
> c) expanding the new builtin, with the help of next items
> d) RTL block moves: they are defined as non-overlapping and I don't think 
>    we should change this (essentially they're the reflection of struct 
>    copies in C)
> e) how any of the above (builtins and RTL block moves) are implemented: 
>    currently non-overlapping only, using movmem pattern when possible; 
>    ultimately all sitting in the emit_block_move_hints() routine.

Just one thing to note: our "memcpy" expectation is that either there is no
overlap, or there is 100% overlap (src == dest). Both the current movmem
(future cpymem) expanders and all the supported library implementations
support that, though the latter only de facto; it isn't a written guarantee.
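
One place where the exact-overlap case arises is struct assignment through
pointers that may be equal. C allows an assignment whose source and
destination overlap exactly, and the compiler is free to implement it as a
block copy. The sketch below uses a made-up struct and function name purely
for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type, large enough that a block copy is likely. */
struct big { char payload[256]; };

/* When p == q this is a self-assignment: source and destination
   overlap exactly, and the result must still be correct. */
void assign(struct big *p, const struct big *q)
{
    assert(p != NULL && q != NULL);
    *p = *q;
}
```

If such an assignment is lowered to a memcpy call or a cpymem expansion, that
code has to tolerate src == dest, which is the expectation described above.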

	Jakub


* Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
  2019-05-15 14:11       ` Jakub Jelinek
@ 2019-05-15 14:47         ` Michael Matz
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Matz @ 2019-05-15 14:47 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Aaron Sawdey, Richard Biener, GCC Development, Joseph Myers,
	Jeff Law, Segher Boessenkool, David Edelsohn, Bill Schmidt

Hi,

On Wed, 15 May 2019, Jakub Jelinek wrote:

> Just one thing to note: our "memcpy" expectation is that either there is 
> no overlap, or there is 100% overlap (src == dest). Both the current 
> movmem (future cpymem) expanders and all the supported library 
> implementations support that, though the latter only de facto; it 
> isn't a written guarantee.

Yes, I should have been more precise, complete overlap is always de-facto 
supported as well.


Ciao,
Michael.


* Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
  2019-05-15 14:02     ` Michael Matz
  2019-05-15 14:11       ` Jakub Jelinek
@ 2019-05-15 16:24       ` Aaron Sawdey
  2019-05-15 16:31         ` Jakub Jelinek
  1 sibling, 1 reply; 13+ messages in thread
From: Aaron Sawdey @ 2019-05-15 16:24 UTC (permalink / raw)
  To: Michael Matz
  Cc: Richard Biener, GCC Development, Joseph Myers, Jakub Jelinek,
	Jeff Law, Segher Boessenkool, David Edelsohn, Bill Schmidt

On 5/15/19 9:02 AM, Michael Matz wrote:
> On Wed, 15 May 2019, Aaron Sawdey wrote:
>> Next question would be how do we move from the existing movmem pattern 
>> (which Michael Matz tells us should be renamed cpymem anyway) to this 
>> new thing. Are you proposing that we still have both movmem and cpymem 
>> optab entries underneath to call the patterns but introduce this new 
>> memmove_with_hints() to be used by things called by 
>> expand_builtin_memmove() and expand_builtin_memcpy()?
> 
> I'd say so.  There are multiple levels at play:
> a) exposure to the user: probably a new __builtin_memmove, or a new combined 
>    builtin with a hint param to differentiate (but we can't get rid of 
>    __builtin_memcpy/mempcpy/strcpy, which all can go through the same 
>    route in the middleend)
> b) getting it through the gimple pipeline, probably just a new builtin 
>    code, trivial
> c) expanding the new builtin, with the help of next items
> d) RTL block moves: they are defined as non-overlapping and I don't think 
>    we should change this (essentially they're the reflection of struct 
>    copies in C)
> e) how any of the above (builtins and RTL block moves) are implemented: 
>    currently non-overlapping only, using movmem pattern when possible; 
>    ultimately all sitting in the emit_block_move_hints() routine.
> 
> So, I'd add a new method to emit_block_move_hints indicating possible 
> overlap, disabling the use of move_by_pieces.  Then in 
> emit_block_move_via_movmem (also getting an indication of overlap), do the 
> equivalent of:
> 
>   finished = 0;
>   if (overlap_possible) {
>     if (optab[movmem])
>       finished = emit(movmem);
>   } else {
>     if (optab[cpymem])
>       finished = emit(cpymem);
>     if (!finished && optab[movmem])  // movmem handles overlap, so it is valid here too
>       finished = emit(movmem);
>   }
> 
> The overlap_possible method would only ever be used from the builtin 
> expansion, and never from the RTL block move expand.  Additionally a 
> target may optionally only define the movmem pattern if it's just as good 
> as the cpymem pattern (e.g. because it only handles fixed small sizes and 
> uses a load-all then store-all sequence).

We currently have gimple_fold_builtin_memory_op() figuring out where there
is no overlap and converting __builtin_memmove() to __builtin_memcpy(). Would
you foresee converting __builtin_memmove() with overlap into
a call to __builtin_memmove_hint if it is a case where we can define the
overlap precisely enough to provide the hint? My guess is that this wouldn't
be a very common case.

My goals for this are:
 * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
 * memmove() call becomes __builtin_memmove (or __builtin_memcpy based
   on the gimple analysis) and goes through optab[movmem] or optab[cpymem]

I think what you've described meets these goals and cleans things up.

Thanks,
    Aaron


-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain


* Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
  2019-05-15 16:24       ` Aaron Sawdey
@ 2019-05-15 16:31         ` Jakub Jelinek
  2019-05-15 17:59           ` Aaron Sawdey
  0 siblings, 1 reply; 13+ messages in thread
From: Jakub Jelinek @ 2019-05-15 16:31 UTC (permalink / raw)
  To: Aaron Sawdey
  Cc: Michael Matz, Richard Biener, GCC Development, Joseph Myers,
	Jeff Law, Segher Boessenkool, David Edelsohn, Bill Schmidt

On Wed, May 15, 2019 at 11:23:54AM -0500, Aaron Sawdey wrote:
> We currently have gimple_fold_builtin_memory_op() figuring out where there
> is no overlap and converting __builtin_memmove() to __builtin_memcpy(). Would
> you foresee converting __builtin_memmove() with overlap into
> a call to __builtin_memmove_hint if it is a case where we can define the

Please do not introduce user visible builtins that you are not intending to
support for user use.  Thus, I think you want internal function
.MEMMOVE_HINT as opposed to __builtin_memmove_hint.

> overlap precisely enough to provide the hint? My guess is that this wouldn't
> be a very common case.
> 
> My goals for this are:
>  * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
>  * memmove() call becomes __builtin_memmove (or __builtin_memcpy based
>    on the gimple analysis) and goes through optab[movmem] or optab[cpymem]

Except for the "becomes" part (the memcpy call is the same thing as
__builtin_memcpy in the middle-end; all you care about is whether it is
BUILT_IN_MEMCPY etc. and whether it has compatible arguments), and for the
missing optab[movmem] part and movmem->cpymem renaming, isn't that what we
have already?

	Jakub


* Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
  2019-05-15 16:31         ` Jakub Jelinek
@ 2019-05-15 17:59           ` Aaron Sawdey
  2019-05-15 18:01             ` Jakub Jelinek
  0 siblings, 1 reply; 13+ messages in thread
From: Aaron Sawdey @ 2019-05-15 17:59 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Michael Matz, Richard Biener, GCC Development, Joseph Myers,
	Jeff Law, Segher Boessenkool, David Edelsohn, Bill Schmidt

On 5/15/19 11:31 AM, Jakub Jelinek wrote:
> On Wed, May 15, 2019 at 11:23:54AM -0500, Aaron Sawdey wrote:
>> My goals for this are:
>>  * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
>>  * memmove() call becomes __builtin_memmove (or __builtin_memcpy based
>>    on the gimple analysis) and goes through optab[movmem] or optab[cpymem]
> 
> Except for the "becomes" part (the memcpy call is the same thing as
> __builtin_memcpy in the middle-end; all you care about is whether it is
> BUILT_IN_MEMCPY etc. and whether it has compatible arguments), and for the
> missing optab[movmem] part and movmem->cpymem renaming, isn't that what we
> have already?

Yes. I was just trying to state what I wanted it to become, some of which
is already present. So I think I will start working on two patches:

1) rename optab movmem and the underlying patterns to cpymem.
2) add a new optab movmem that is really memmove() and add support for
having __builtin_memmove() use it.

Handling of the partial overlap case can be a separate piece of work.

Thanks,
   Aaron

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain


* Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
  2019-05-15 17:59           ` Aaron Sawdey
@ 2019-05-15 18:01             ` Jakub Jelinek
  2019-05-15 18:03               ` Aaron Sawdey
  0 siblings, 1 reply; 13+ messages in thread
From: Jakub Jelinek @ 2019-05-15 18:01 UTC (permalink / raw)
  To: Aaron Sawdey
  Cc: Michael Matz, Richard Biener, GCC Development, Joseph Myers,
	Jeff Law, Segher Boessenkool, David Edelsohn, Bill Schmidt

On Wed, May 15, 2019 at 12:59:01PM -0500, Aaron Sawdey wrote:
> On 5/15/19 11:31 AM, Jakub Jelinek wrote:
> > On Wed, May 15, 2019 at 11:23:54AM -0500, Aaron Sawdey wrote:
> >> My goals for this are:
> >>  * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
> >>  * memmove() call becomes __builtin_memmove (or __builtin_memcpy based
> >>    on the gimple analysis) and goes through optab[movmem] or optab[cpymem]
> > 
> > Except for the "becomes" part (the memcpy call is the same thing as
> > __builtin_memcpy in the middle-end; all you care about is whether it is
> > BUILT_IN_MEMCPY etc. and whether it has compatible arguments), and for the
> > missing optab[movmem] part and movmem->cpymem renaming, isn't that what we
> > have already?
> 
> Yes. I was just trying to state what I wanted it to become, some of which
> is already present. So I think I will start working on two patches:
> 
> 1) rename optab movmem and the underlying patterns to cpymem.
> 2) add a new optab movmem that is really memmove() and add support for
> having __builtin_memmove() use it.
> 
> Handling of the partial overlap case can be a separate piece of work.

That 1) and 2) can be also separate pieces of work ;).

	Jakub


* Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
  2019-05-15 18:01             ` Jakub Jelinek
@ 2019-05-15 18:03               ` Aaron Sawdey
  0 siblings, 0 replies; 13+ messages in thread
From: Aaron Sawdey @ 2019-05-15 18:03 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Michael Matz, Richard Biener, GCC Development, Joseph Myers,
	Jeff Law, Segher Boessenkool, David Edelsohn, Bill Schmidt

On 5/15/19 1:01 PM, Jakub Jelinek wrote:
> On Wed, May 15, 2019 at 12:59:01PM -0500, Aaron Sawdey wrote:
>> 1) rename optab movmem and the underlying patterns to cpymem.
>> 2) add a new optab movmem that is really memmove() and add support for
>> having __builtin_memmove() use it.
>>
>> Handling of the partial overlap case can be a separate piece of work.
> 
> That 1) and 2) can be also separate pieces of work ;).
Exactly -- make things as easy as possible when I go begging for reviewers :-)

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
