public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/112683] New: Optimizing memcpy range by extending to word bounds
@ 2023-11-23 14:41 antoshkka at gmail dot com
  2023-11-23 21:21 ` [Bug middle-end/112683] " pinskia at gcc dot gnu.org
  2023-11-24  8:01 ` rguenth at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: antoshkka at gmail dot com @ 2023-11-23 14:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112683

            Bug ID: 112683
           Summary: Optimizing memcpy range by extending to word bounds
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: antoshkka at gmail dot com
  Target Milestone: ---

Consider the following minimized source code from libstdc++:

```
struct string {
    unsigned long _M_string_length;
    enum { _S_local_capacity = 15 };
    char _M_local_buf[_S_local_capacity + 1];
};

string copy(const string& __str) noexcept {
    string result;

    if (__str._M_string_length > __str._S_local_capacity)
        __builtin_unreachable();

    result._M_string_length = __str._M_string_length;
    __builtin_memcpy(result._M_local_buf, __str._M_local_buf,
                     __str._M_string_length + 1);

    return result;
}
```

Right now GCC with -O2 emits a long assembly sequence of roughly 50 instructions:
https://godbolt.org/z/a89bh17hd

However, note that
* `result._M_local_buf` is uninitialized, and
* at most 16 bytes are copied into `result._M_local_buf`, which is itself
  16 bytes in size.

So the compiler could optimize the code to always copy 16 bytes. The change in
behavior is not observable by the user, because the uninitialized bytes could
contain any data, including the same bytes as `__str._M_local_buf`.

As a result of always copying 16 bytes, the assembly becomes more than 7 times
shorter and the conditional jumps go away: https://godbolt.org/z/r5GPYTs4Y


* [Bug middle-end/112683] Optimizing memcpy range by extending to word bounds
  2023-11-23 14:41 [Bug tree-optimization/112683] New: Optimizing memcpy range by extending to word bounds antoshkka at gmail dot com
@ 2023-11-23 21:21 ` pinskia at gcc dot gnu.org
  2023-11-24  8:01 ` rguenth at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-11-23 21:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112683

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |middle-end

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
  # RANGE [irange] long unsigned int [1, 16] MASK 0x1f VALUE 0x0
  _2 = _1 + 1;
  # PT = nonlocal 
  _3 = &__str_5(D)->_M_local_bufD.4676;
  # .MEM_7 = VDEF <.MEM_6>
  memcpyD.1403 (&<retval>._M_local_bufD.4676, _3, _2);

The range information is there already for _2.


Note that the hugely expanded instruction sequence is a target issue, though.


* [Bug middle-end/112683] Optimizing memcpy range by extending to word bounds
  2023-11-23 14:41 [Bug tree-optimization/112683] New: Optimizing memcpy range by extending to word bounds antoshkka at gmail dot com
  2023-11-23 21:21 ` [Bug middle-end/112683] " pinskia at gcc dot gnu.org
@ 2023-11-24  8:01 ` rguenth at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-11-24  8:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112683

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Won't copying more likely trigger valgrind complaints about accessing
uninitialized memory?  Technically it also makes the IL invoke
undefined behavior if we'd expand this to byte-by-byte copies with
registers.

So I'm not sure this is a good idea.
