* [Bug target/96532] New: [m68k] gcc 10.x generates calls to memset even for very small amounts
From: admin@tho-otto.de @ 2020-08-07 23:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96532

            Bug ID: 96532
           Summary: [m68k] gcc 10.x generates calls to memset even for
                    very small amounts
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: admin@tho-otto.de
  Target Milestone: ---

Created attachment 49028
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49028&action=edit
Sample program

Starting with gcc 10.x, compiling the attached small sample generates library
calls to memset, although the compiler could determine that at most 4 bytes
have to be set.

The compiler was generated from a vanilla releases/gcc-10 branch, with a
configuration of:

configure --target=m68k-elf '--prefix=/usr' '--libdir=/usr/lib64'
'--bindir=/usr/bin' '--libexecdir=${libdir}' 'CFLAGS_FOR_BUILD=-O2
-fomit-frame-pointer' 'CFLAGS=-O2 -fomit-frame-pointer' 'CXXFLAGS_FOR_BUILD=-O2
-fomit-frame-pointer' 'CXXFLAGS=-O2 -fomit-frame-pointer' 'BOOT_CFLAGS=-O2
-fomit-frame-pointer' 'CFLAGS_FOR_TARGET=-O2 -fomit-frame-pointer'
'CXXFLAGS_FOR_TARGET=-O2 -fomit-frame-pointer' 'LDFLAGS_FOR_BUILD=' 'LDFLAGS='
'--disable-libvtv' '--disable-libmpx' '--disable-libcc1' '--disable-werror'
'--with-gxx-include-dir=/usr/m68k-elf/sys-root/usr/include/c++/10'
'--with-default-libstdcxx-abi=gcc4-compatible' '--with-gcc-major-version-only'
'--with-gcc' '--with-gnu-as' '--with-gnu-ld' '--with-system-zlib'
'--disable-libgomp' '--without-newlib' '--disable-libstdcxx-pch'
'--disable-threads' '--disable-win32-registry' '--disable-lto' '--enable-ssp'
'--enable-libssp' '--disable-plugin' '--disable-decimal-float' '--disable-nls'
'--with-libiconv-prefix=/usr' '--with-libintl-prefix=/usr'
'--with-sysroot=/usr/m68k-elf/sys-root' 'CC=gcc' 'CXX=g++'
'--enable-languages=c'

Attached are the sample, the assembler output produced by gcc 10, and the
assembler output produced by gcc-7.1.0.

* [Bug target/96532] [m68k] gcc 10.x generates calls to memset even for very small amounts
From: admin@tho-otto.de @ 2020-08-07 23:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96532

--- Comment #1 from Thorsten Otto <admin@tho-otto.de> ---
Created attachment 49029
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49029&action=edit
Assembler output produced by gcc 10

* [Bug target/96532] [m68k] gcc 10.x generates calls to memset even for very small amounts
From: admin@tho-otto.de @ 2020-08-07 23:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96532

--- Comment #2 from Thorsten Otto <admin@tho-otto.de> ---
Created attachment 49030
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49030&action=edit
Assembler output produced by gcc 7.1.0

* [Bug target/96532] [m68k] gcc 10.x generates calls to memset even for very small amounts
From: mikpelinux at gmail dot com @ 2020-08-08  9:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96532

Mikael Pettersson <mikpelinux at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mikpelinux at gmail dot com

--- Comment #3 from Mikael Pettersson <mikpelinux at gmail dot com> ---
This happens for multiple targets: I can reproduce it with gcc-10.2 crosses to
m68k, sparc64, and aarch64, but not with a cross to s390x or natively on
x86_64.

* [Bug target/96532] [m68k] gcc 10.x generates calls to memset even for very small amounts
From: admin@tho-otto.de @ 2020-08-08 13:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96532

--- Comment #4 from Thorsten Otto <admin@tho-otto.de> ---
This might be caused by x86 and s390 having a machine-dependent pattern for
setmem/cpymem, which possibly eliminates the library call again, while other
targets don't have such a pattern.

* [Bug target/96532] [m68k] gcc 10.x generates calls to memset even for very small amounts
From: czietz at gmx dot net @ 2020-08-08 14:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96532

Christian Zietz <czietz at gmx dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |czietz at gmx dot net

--- Comment #5 from Christian Zietz <czietz at gmx dot net> ---
The call to __builtin_memset() is added by the "tree-ldist" pass. On x86_64 it
is replaced by inline code in the "rtl-expand" pass. On m68k it isn't.
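
To make the kind of loop under discussion concrete, here is a minimal sketch.
It is a hypothetical function written for illustration, NOT the sample attached
to this bug, and whether it actually triggers the transformation depends on the
compiler version and target. The loop count is provably between 1 and 4, yet
the loop has the shape that loop distribution can recognize as a memset. The
GCC options -fdump-tree-ldist (to inspect the pass output) and
-fno-tree-loop-distribute-patterns (to disable the pattern recognition) may
help when comparing results.

```c
#include <stdint.h>

/* Hypothetical reproducer shape; NOT the sample attached to this bug. */
void clear_small(uint8_t *dst, unsigned int n)
{
    /* (n & 3u) + 1u is always in the range 1..4, so at most 4 bytes
       are written no matter what the caller passes. */
    unsigned int count = (n & 3u) + 1u;
    unsigned int i;

    for (i = 0; i < count; i++)
        dst[i] = 0;  /* zeroing loop: a candidate for memset recognition */
}
```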

* [Bug target/96532] [m68k] gcc 10.x generates calls to memset even for very small amounts
From: admin@tho-otto.de @ 2020-08-25 11:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96532

--- Comment #6 from Thorsten Otto <admin@tho-otto.de> ---
Created attachment 49116
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49116&action=edit
Assembler output produced by gcc 11.0.0 for arm

* [Bug target/96532] [m68k] gcc 10.x generates calls to memset even for very small amounts
From: eerott at gmail dot com @ 2023-06-30 21:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96532

Eero Tamminen <eerott at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |eerott at gmail dot com

--- Comment #7 from Eero Tamminen <eerott at gmail dot com> ---
Timing and profiling the whole EmuTOS (m68k ROM) bootup showed that these added
memcpy() calls add 8% to the boot time [1] with GCC 13.1.

For that particular case, all those extra (20000) memcpy() calls, and the
associated 8% bootup overhead, came from this loop:
-----------------------------------
uint32_t pair_planes[4];
...
for (i = 0; i < v_planes / 2; i++) {
    *(uint32_t*)addr = pair_planes[i];
    addr += sizeof(uint32_t);
} 
-----------------------------------
The overhead went away when the GCC -ffreestanding option was used.
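
For experimenting outside EmuTOS, the loop above can be wrapped into a
self-contained function roughly as follows. This is only a sketch: the function
name store_planes, the parameter types, and the file-scope placement of
pair_planes are assumptions made for illustration and are not taken from the
EmuTOS sources. The caller is assumed to pass a destination that is suitably
aligned for uint32_t stores.

```c
#include <stdint.h>

uint32_t pair_planes[4];  /* declaration as quoted; file scope is an assumption */

/* Hypothetical wrapper around the quoted loop. Copies v_planes / 2 words
   from pair_planes to addr and returns the advanced destination pointer.
   Nothing here tells the compiler that v_planes / 2 is at most 4. */
uint8_t *store_planes(uint8_t *addr, unsigned int v_planes)
{
    unsigned int i;

    for (i = 0; i < v_planes / 2; i++) {
        *(uint32_t *)addr = pair_planes[i];  /* addr must be 4-byte aligned */
        addr += sizeof(uint32_t);
    }
    return addr;
}
```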

Without that memcpy() overhead, GCC 13.1 perf was then very close to GCC 4.6
perf in that particular case (it did not help other cases where newer GCC was
slower).

Further testing with Compiler Explorer showed that when the compiler was given
a better hint that the loop it replaced with memcpy() iterates at most 4 times,
those memcpy() instances also went away:
-----------------------------------
if (v_planes > 2*ARRAY_SIZE(pair_planes)) return;
-----------------------------------
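
Put together with the loop, the hint could look like the sketch below. The
ARRAY_SIZE definition, the function name store_planes_bounded, and the
parameter types are assumptions for illustration. Only the guard expression and
the loop body come from this comment. With the guard in place, v_planes / 2 is
at most 4 on every path that reaches the loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Common element-count macro; the exact EmuTOS definition is an assumption. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

uint32_t pair_planes[4];

/* Hypothetical wrapper: same loop as quoted above, preceded by the guard
   that bounds the trip count. Returns addr unchanged if v_planes is too big. */
uint8_t *store_planes_bounded(uint8_t *addr, unsigned int v_planes)
{
    unsigned int i;

    if (v_planes > 2 * ARRAY_SIZE(pair_planes))
        return addr;  /* more than 8 planes would index past pair_planes[3] */

    for (i = 0; i < v_planes / 2; i++) {
        *(uint32_t *)addr = pair_planes[i];  /* addr must be 4-byte aligned */
        addr += sizeof(uint32_t);
    }
    return addr;
}
```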

How did GCC deduce that the above loop was large enough for replacing it with a
memcpy() call, and its overhead, to make sense?  From the maximum valid index
for "pair_planes", it should already have been clear that any larger index
leads to "undefined behavior".

[1] 1/3 of the boot time went to the timeout waiting for user interaction, and
1/3 went to waiting for slow disk responses, so relative to the remaining third
the overhead was really about 3x 8%.
