public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations
@ 2021-06-24 14:36 unlvsur at live dot com
  2021-06-25  6:22 ` [Bug middle-end/101197] " rguenth at gcc dot gnu.org
                   ` (17 more replies)
  0 siblings, 18 replies; 19+ messages in thread
From: unlvsur at live dot com @ 2021-06-24 14:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

            Bug ID: 101197
           Summary: __builtin_memmove does not perform constant
                    optimizations
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: unlvsur at live dot com
  Target Milestone: ---

https://godbolt.org/z/qTMEo93j1

The two functions do exactly the same thing, but GCC does not optimize them to
the same code.

Clang, by contrast, generates exactly the same assembly for both:
https://godbolt.org/z/7a4r1hxj7

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
@ 2021-06-25  6:22 ` rguenth at gcc dot gnu.org
  2021-06-25  6:55 ` marxin at gcc dot gnu.org
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-06-25  6:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-06-25
          Component|tree-optimization           |middle-end

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
void foo (char *a, char *b)
{
  __builtin_memmove (a, b, 32);
}

is not expanded inline (not even with -minline-all-stringops).

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
  2021-06-25  6:22 ` [Bug middle-end/101197] " rguenth at gcc dot gnu.org
@ 2021-06-25  6:55 ` marxin at gcc dot gnu.org
  2021-06-25  7:53 ` jakub at gcc dot gnu.org
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-06-25  6:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |marxin at gcc dot gnu.org

--- Comment #2 from Martin Liška <marxin at gcc dot gnu.org> ---
I would like to take a look.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
  2021-06-25  6:22 ` [Bug middle-end/101197] " rguenth at gcc dot gnu.org
  2021-06-25  6:55 ` marxin at gcc dot gnu.org
@ 2021-06-25  7:53 ` jakub at gcc dot gnu.org
  2021-06-25  8:01 ` jakub at gcc dot gnu.org
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-06-25  7:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I guess it can be expanded inline iff it is done through a modified
move_by_pieces - one that instead of emitting read store read store read store
emits all the reads first and then all the stores.  And hopefully the aliasing
info will make it clear for all the following RTL passes that it can overlap
and thus scheduling etc. can't reorder the stores with the reads.
emit_block_move_hints will need some small tweaks.
Currently for might_overlap it only tries emit_block_move_via_pattern and punts
if that fails.
I think we'd need some m_* flag and adjust the run method so that if that is
true, it emits the stores in a separate sequence from the reads.
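
The "all reads first, then all stores" ordering described above can be
illustrated at the C level. This is only a sketch of the idea, not GCC's
move_by_pieces code; the function name move32_loads_then_stores is
hypothetical, and the 32-byte size is taken from the test case in comment #1.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): move exactly 32 bytes, correct
   even when the source and destination ranges overlap.  */
void move32_loads_then_stores(void *dst, const void *src)
{
    const unsigned char *s = src;
    unsigned char *d = dst;
    uint64_t r0, r1, r2, r3;

    /* Phase 1: read the whole source into temporaries.  */
    memcpy(&r0, s + 0, 8);
    memcpy(&r1, s + 8, 8);
    memcpy(&r2, s + 16, 8);
    memcpy(&r3, s + 24, 8);

    /* Phase 2: only now write the destination, so no store can clobber
       source bytes that have not been read yet.  */
    memcpy(d + 0, &r0, 8);
    memcpy(d + 8, &r1, 8);
    memcpy(d + 16, &r2, 8);
    memcpy(d + 24, &r3, 8);
}
```

An interleaved read/store/read/store sequence would be wrong for overlapping
ranges, which is why the ordering matters. The four temporaries also show why
the size limit depends on how many registers the target has.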

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (2 preceding siblings ...)
  2021-06-25  7:53 ` jakub at gcc dot gnu.org
@ 2021-06-25  8:01 ` jakub at gcc dot gnu.org
  2021-08-16 10:23 ` tnfchris at gcc dot gnu.org
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-06-25  8:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Also, one will probably need to rename all the MOVE_* and *move_* stuff to
COPY_* and *copy_* and reserve MOVE_* and *move_* for the overlapping copies. 
And most likely on various arches it might need smaller size limits, because it
will also depend on how many registers the target has (besides fixed/special
ones or those that likely need to be used for the to and from addresses), some
small spilling might be ok, but heavy spilling would essentially mean memcpy to
a temporary spill area and memcpy back.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (3 preceding siblings ...)
  2021-06-25  8:01 ` jakub at gcc dot gnu.org
@ 2021-08-16 10:23 ` tnfchris at gcc dot gnu.org
  2021-08-16 10:29 ` jakub at gcc dot gnu.org
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2021-08-16 10:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Build|x86_64-linux-gnu            |x86_64-linux-gnu,
                   |                            |aarch64-linux-gnu
             Target|x86_64-linux-gnu            |x86_64-linux-gnu,
                   |                            |aarch64-linux-gnu
               Host|x86_64-linux-gnu            |x86_64-linux-gnu,
                   |                            |aarch64-linux-gnu
                 CC|                            |tnfchris at gcc dot gnu.org
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=90262

--- Comment #5 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
Clang seems to also be able to do better alias analysis here and transform
memmove to memcpy if it can prove the pointers don't overlap in the copy
region.

This seems to give GCC about a 30% loss compared to clang in some widely used
open source compression library.
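
As a rough illustration of the memmove-to-memcpy case, here is a minimal
sketch in which the non-overlap is easy to prove: the destination is a fresh
local array that no caller-supplied pointer can alias. The function name
checksum16 is hypothetical, and this is not the code from the compression
library. Whether a given compiler actually performs the transformation is not
guaranteed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: tmp is a new local object, so the 16 bytes at src
   cannot overlap it and the memmove could be treated as a memcpy.  */
unsigned checksum16(const unsigned char *src)
{
    unsigned char tmp[16];
    unsigned sum = 0;

    memmove(tmp, src, sizeof tmp);
    for (size_t i = 0; i < sizeof tmp; i++)
        sum += tmp[i];
    return sum;
}
```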

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (4 preceding siblings ...)
  2021-08-16 10:23 ` tnfchris at gcc dot gnu.org
@ 2021-08-16 10:29 ` jakub at gcc dot gnu.org
  2021-08-16 10:33 ` unlvsur at live dot com
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-08-16 10:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Shouldn't that be a different PR with details?  I mean, this PR is that we
should expand shorter memmove inline even if the regions do overlap.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (5 preceding siblings ...)
  2021-08-16 10:29 ` jakub at gcc dot gnu.org
@ 2021-08-16 10:33 ` unlvsur at live dot com
  2021-08-16 10:35 ` tnfchris at gcc dot gnu.org
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: unlvsur at live dot com @ 2021-08-16 10:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #7 from cqwrteur <unlvsur at live dot com> ---
(In reply to Jakub Jelinek from comment #6)
> Shouldn't that be a different PR with details?  I mean, this PR is that we
> should expand shorter memmove inline even if the regions do overlap.

Clang also has __builtin_memcpy_inline, __builtin_memmove_inline and
__builtin_memset_inline, which only accept a constant size. Do we need
something like that?
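
For reference, a guarded sketch of how such a builtin might be used. It
assumes Clang's __builtin_memcpy_inline(dst, src, size) with a constant size,
and falls back to plain memcpy on compilers that do not provide it. The
wrapper name copy16 is hypothetical.

```c
#include <string.h>

/* Older compilers may lack __has_builtin; treat every builtin as absent.  */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Hypothetical wrapper: copy exactly 16 non-overlapping bytes.  */
void copy16(void *dst, const void *src)
{
#if __has_builtin(__builtin_memcpy_inline)
    /* The size argument must be a compile-time constant.  */
    __builtin_memcpy_inline(dst, src, 16);
#else
    memcpy(dst, src, 16);
#endif
}
```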

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (6 preceding siblings ...)
  2021-08-16 10:33 ` unlvsur at live dot com
@ 2021-08-16 10:35 ` tnfchris at gcc dot gnu.org
  2021-08-16 10:37 ` unlvsur at live dot com
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2021-08-16 10:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #8 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #6)
> Shouldn't that be a different PR with details?  I mean, this PR is that we
> should expand shorter memmove inline even if the regions do overlap.

Sure, I'm still trying to create a minimal representative example (it's C++ and
templated) unless just pointing at the github is enough. 

To be clear though, just inlining memmove at all will cover most of the
distance, it's just that you require less registers.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (7 preceding siblings ...)
  2021-08-16 10:35 ` tnfchris at gcc dot gnu.org
@ 2021-08-16 10:37 ` unlvsur at live dot com
  2021-08-16 10:53 ` tnfchris at gcc dot gnu.org
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: unlvsur at live dot com @ 2021-08-16 10:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #9 from cqwrteur <unlvsur at live dot com> ---
(In reply to Tamar Christina from comment #8)
> (In reply to Jakub Jelinek from comment #6)
> > Shouldn't that be a different PR with details?  I mean, this PR is that we
> > should expand shorter memmove inline even if the regions do overlap.
> 
> Sure, I'm still trying to create a minimal representative example (it's C++
> and templated) unless just pointing at the github is enough. 
> 
> To be clear though, just inlining memmove at all will cover most of the
> distance, it's just that you require less registers.

Inlining things like memcpy and memmove will lead to serious binary bloat. The
compiler usually chooses to emit a call to libc's memcpy or memmove, which is
usually highly optimized assembly code.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (8 preceding siblings ...)
  2021-08-16 10:37 ` unlvsur at live dot com
@ 2021-08-16 10:53 ` tnfchris at gcc dot gnu.org
  2021-08-16 10:54 ` unlvsur at live dot com
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2021-08-16 10:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #10 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to cqwrteur from comment #9)
> (In reply to Tamar Christina from comment #8)
> > (In reply to Jakub Jelinek from comment #6)
> > > Shouldn't that be a different PR with details?  I mean, this PR is that we
> > > should expand shorter memmove inline even if the regions do overlap.
> > 
> > Sure, I'm still trying to create a minimal representative example (it's C++
> > and templated) unless just pointing at the github is enough. 
> > 
> > To be clear though, just inlining memmove at all will cover most of the
> > distance, it's just that you require less registers.
> 
> inline things like memcpy and memmove will lead to serious binary bloat. The
> compiler usually picks to emit call to libc's memcpy and memmove that is
> usually highly optimized with assembly code.

Yes, your binary will grow, but for small memcpy and memmove the calling
overhead, not to mention the register allocation overhead you might get from
having to spill your caller-saved registers, more than makes up for it.

We already inline memcpy and memset. There's no reason not to do memmove,
especially at -O3.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (9 preceding siblings ...)
  2021-08-16 10:53 ` tnfchris at gcc dot gnu.org
@ 2021-08-16 10:54 ` unlvsur at live dot com
  2021-08-16 10:54 ` unlvsur at live dot com
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: unlvsur at live dot com @ 2021-08-16 10:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #11 from cqwrteur <unlvsur at live dot com> ---
(In reply to Tamar Christina from comment #10)
> (In reply to cqwrteur from comment #9)
> > (In reply to Tamar Christina from comment #8)
> > > (In reply to Jakub Jelinek from comment #6)
> > > > Shouldn't that be a different PR with details?  I mean, this PR is that we
> > > > should expand shorter memmove inline even if the regions do overlap.
> > > 
> > > Sure, I'm still trying to create a minimal representative example (it's C++
> > > and templated) unless just pointing at the github is enough. 
> > > 
> > > To be clear though, just inlining memmove at all will cover most of the
> > > distance, it's just that you require less registers.
> > 
> > inline things like memcpy and memmove will lead to serious binary bloat. The
> > compiler usually picks to emit call to libc's memcpy and memmove that is
> > usually highly optimized with assembly code.
> 
> Yes your binary will grow, but on small memcopy and memmove. the calling
> overhead, not to mention the register allocation overhead you might get from
> having to spill your caller saves more than makes up for it.
> 
> We already inline memcpy and memset. there's no reason not to do memmove,
> especially at -O3.

That is false. Inlining memcpy and memset only works when the size is constant.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (10 preceding siblings ...)
  2021-08-16 10:54 ` unlvsur at live dot com
@ 2021-08-16 10:54 ` unlvsur at live dot com
  2021-08-16 10:56 ` tnfchris at gcc dot gnu.org
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: unlvsur at live dot com @ 2021-08-16 10:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #12 from cqwrteur <unlvsur at live dot com> ---
(In reply to cqwrteur from comment #11)
> (In reply to Tamar Christina from comment #10)
> > (In reply to cqwrteur from comment #9)
> > > (In reply to Tamar Christina from comment #8)
> > > > (In reply to Jakub Jelinek from comment #6)
> > > > > Shouldn't that be a different PR with details?  I mean, this PR is that we
> > > > > should expand shorter memmove inline even if the regions do overlap.
> > > > 
> > > > Sure, I'm still trying to create a minimal representative example (it's C++
> > > > and templated) unless just pointing at the github is enough. 
> > > > 
> > > > To be clear though, just inlining memmove at all will cover most of the
> > > > distance, it's just that you require less registers.
> > > 
> > > inline things like memcpy and memmove will lead to serious binary bloat. The
> > > compiler usually picks to emit call to libc's memcpy and memmove that is
> > > usually highly optimized with assembly code.
> > 
> > Yes your binary will grow, but on small memcopy and memmove. the calling
> > overhead, not to mention the register allocation overhead you might get from
> > having to spill your caller saves more than makes up for it.
> > 
> > We already inline memcpy and memset. there's no reason not to do memmove,
> > especially at -O3.
> 
> That is false. inline memcpy and memset only works when the size is constant.

It is more for type-punning reasons.
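
The type-punning use of a constant-size memcpy mentioned here is the usual
idiom below. It is a minimal sketch: the function name float_bits is
hypothetical, and it assumes float and uint32_t have the same size.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Hypothetical helper: reinterpret the bytes of a float as an integer
   without violating the aliasing rules.  The size is a compile-time
   constant, so compilers typically expand the memcpy inline.  */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```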

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (11 preceding siblings ...)
  2021-08-16 10:54 ` unlvsur at live dot com
@ 2021-08-16 10:56 ` tnfchris at gcc dot gnu.org
  2021-08-16 11:07 ` unlvsur at live dot com
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2021-08-16 10:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #13 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to cqwrteur from comment #12)
> (In reply to cqwrteur from comment #11)
> > (In reply to Tamar Christina from comment #10)
> > > (In reply to cqwrteur from comment #9)
> > > > (In reply to Tamar Christina from comment #8)
> > > > > (In reply to Jakub Jelinek from comment #6)
> > > > > > Shouldn't that be a different PR with details?  I mean, this PR is that we
> > > > > > should expand shorter memmove inline even if the regions do overlap.
> > > > > 
> > > > > Sure, I'm still trying to create a minimal representative example (it's C++
> > > > > and templated) unless just pointing at the github is enough. 
> > > > > 
> > > > > To be clear though, just inlining memmove at all will cover most of the
> > > > > distance, it's just that you require less registers.
> > > > 
> > > > inline things like memcpy and memmove will lead to serious binary bloat. The
> > > > compiler usually picks to emit call to libc's memcpy and memmove that is
> > > > usually highly optimized with assembly code.
> > > 
> > > Yes your binary will grow, but on small memcopy and memmove. the calling
> > > overhead, not to mention the register allocation overhead you might get from
> > > having to spill your caller saves more than makes up for it.
> > > 
> > > We already inline memcpy and memset. there's no reason not to do memmove,
> > > especially at -O3.
> > 
> > That is false. inline memcpy and memset only works when the size is constant.
> 
> more for type punning reason.

How do you think you know when the size is small?

> but on small memcopy and memmove.

By logic this means you know the size is constant.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (12 preceding siblings ...)
  2021-08-16 10:56 ` tnfchris at gcc dot gnu.org
@ 2021-08-16 11:07 ` unlvsur at live dot com
  2021-08-16 11:08 ` unlvsur at live dot com
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: unlvsur at live dot com @ 2021-08-16 11:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #14 from cqwrteur <unlvsur at live dot com> ---
(In reply to Tamar Christina from comment #13)
> (In reply to cqwrteur from comment #12)
> > (In reply to cqwrteur from comment #11)
> > > (In reply to Tamar Christina from comment #10)
> How do you think you know when the size is small?

Unfortunately, you have no way to deal with that.

> > but on small memcopy and memmove.
> 
> By logic this means you know the size is constant.

That is for the type-punning reason I mentioned before.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (13 preceding siblings ...)
  2021-08-16 11:07 ` unlvsur at live dot com
@ 2021-08-16 11:08 ` unlvsur at live dot com
  2021-08-16 12:20 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: unlvsur at live dot com @ 2021-08-16 11:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #15 from cqwrteur <unlvsur at live dot com> ---
(In reply to cqwrteur from comment #14)
> (In reply to Tamar Christina from comment #13)
> > (In reply to cqwrteur from comment #12)
> > > (In reply to cqwrteur from comment #11)
> > > > (In reply to Tamar Christina from comment #10)
> > How do you think you know when the size is small?
> 
> That is unfortunate you have no way to deal with that.
> 
> > > but on small memcopy and memmove.
> > 
> > By logic this means you know the size is constant.
> 
> That is for type punning reason I just mentioned before.

You have no way to tell the compiler that your size is small, or how small it
is.

Maybe __builtin_unreachable() could help, but it is still useless.
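
The kind of hint referred to here would look roughly like the sketch below.
The function name move_small and the bound of 16 are hypothetical. The
builtin only states that the size never exceeds the bound; as the comment
above says, it may not change the generated code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the caller promises n <= 16.  Calling it with a
   larger n is undefined behaviour.  */
void move_small(void *dst, const void *src, size_t n)
{
    if (n > 16)
        __builtin_unreachable();  /* tells the optimizer n is at most 16 */
    memmove(dst, src, n);
}
```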

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (14 preceding siblings ...)
  2021-08-16 11:08 ` unlvsur at live dot com
@ 2021-08-16 12:20 ` rguenth at gcc dot gnu.org
  2021-08-18 12:53 ` marxin at gcc dot gnu.org
  2021-08-18 15:57 ` tnfchris at gcc dot gnu.org
  17 siblings, 0 replies; 19+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-08-16 12:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
We're expanding memmove inline on GIMPLE if the size is a constant power of two
and up to MOVE_MAX.

      /* If we can perform the copy efficiently with first doing all loads
         and then all stores inline it that way.  Currently efficiently
         means that we can load all the memory into a single integer
         register which is what MOVE_MAX gives us.  */

note as targets allow by_pieces to use larger regs now we could up that
limit as well (we just need an appropriate large integer type for the
copy).  Emitting multiple loads/stores on GIMPLE is of course also
possible, likewise would be SRA actually analyzing memmove/copy as
copies (tracking addressability properly of course - rewriting the calls
to aggregate copies of char[] would eventually ease this) so we'd use
matching loads/stores, easing followup optimization and avoiding STLF
(store-to-load forwarding) penalties when using too large accesses.
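
At the source level, the single-register case described above corresponds
roughly to the following sketch. It illustrates the load-then-store shape
only, not the GIMPLE folding code itself; the function name move8 is
hypothetical, and 8 bytes is assumed to be within MOVE_MAX for the target.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: an 8-byte memmove done as one load into an integer
   temporary followed by one store.  Because the whole source is read
   before anything is written, overlap is harmless.  */
void move8(void *dst, const void *src)
{
    uint64_t tmp;

    memcpy(&tmp, src, sizeof tmp);  /* the single load */
    memcpy(dst, &tmp, sizeof tmp);  /* the single store */
}
```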

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (15 preceding siblings ...)
  2021-08-16 12:20 ` rguenth at gcc dot gnu.org
@ 2021-08-18 12:53 ` marxin at gcc dot gnu.org
  2021-08-18 15:57 ` tnfchris at gcc dot gnu.org
  17 siblings, 0 replies; 19+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-08-18 12:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
           Assignee|marxin at gcc dot gnu.org          |unassigned at gcc dot gnu.org

--- Comment #17 from Martin Liška <marxin at gcc dot gnu.org> ---
Waiting for Tamar's test-case now.
Btw. can you please share a pointer to the GitHub repository?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
  2021-06-24 14:36 [Bug tree-optimization/101197] New: __builtin_memmove does not perform constant optimizations unlvsur at live dot com
                   ` (16 preceding siblings ...)
  2021-08-18 12:53 ` marxin at gcc dot gnu.org
@ 2021-08-18 15:57 ` tnfchris at gcc dot gnu.org
  17 siblings, 0 replies; 19+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2021-08-18 15:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #18 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Martin Liška from comment #17)
> Waiting for Tamar's test-case now.
> Btw. can you please share a pointer to the GitHub repository?

Sure, it's this project and this particular call
https://github.com/google/snappy/blob/b4888f76161debdbcde30a64be577b82fd40de29/snappy.cc#L1166

^ permalink raw reply	[flat|nested] 19+ messages in thread
