public inbox for gcc-bugs@sourceware.org
* [Bug c/38453] New: Output code optimisation excessive use of builtins
@ 2008-12-09 14:51 vince at simtec dot co dot uk
2008-12-09 14:52 ` [Bug c/38453] " vince at simtec dot co dot uk
` (5 more replies)
0 siblings, 6 replies; 8+ messages in thread
From: vince at simtec dot co dot uk @ 2008-12-09 14:51 UTC (permalink / raw)
To: gcc-bugs
While compiling LZMA compression code for an embedded ARM target I have
discovered a regression from previous releases of GCC.
I have pared this down to a trivial example (attached) which boils down to an
application-specific modulus operation. Please note that this is the *minimal*
test case; the original is obviously a bit more complex and is buried in the
middle of the compression system. The behavior exhibited is the same in both
the large and small cases.
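The attachment is not reproduced in this message. A minimal sketch of what such a test case might look like, reconstructed from the GIMPLE dump in comment #2 and from the stores in the disassembly, is given here. The struct name `props`, the field order, and the helper `foo_matches_div_mod` are assumptions for illustration, not the attached source.

```c
#include <assert.h>

/* Hypothetical layout: the disassembly stores the remainder at offset 0
   and bumps a counter at offset 4, so lc is assumed to come first. */
struct props {
    int lc;
    int pb;
};

/* Application-specific modulus by repeated subtraction. */
void foo(struct props *propsRes, const unsigned char *propsData)
{
    unsigned char prop0 = *propsData;

    while (prop0 > 44) {
        propsRes->pb++;
        prop0 -= 45;
    }
    propsRes->lc = prop0;
}

/* Hypothetical helper: compares foo() with / and % for every byte value
   and returns the number of inputs checked. */
int foo_matches_div_mod(void)
{
    int checked = 0;

    for (int x = 0; x < 256; x++) {
        unsigned char byte = (unsigned char)x;
        struct props p = { 0, 0 };

        foo(&p, &byte);
        assert(p.lc == x % 45);
        assert(p.pb == x / 45);
        checked++;
    }
    return checked;
}
```

For an 8-bit input the loop runs at most five times, which is why a hand-written subtraction loop can be preferable to a library division on a target without a divide instruction.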
The simple test case is compiled with
arm-unknown-linux-gnu-gcc -Os -o foo test.c
and the resulting objdump is:
000083fc <foo>:
83fc: e92d4010 push {r4, lr}
8400: e5d11000 ldrb r1, [r1]
8404: e1a04000 mov r4, r0
8408: e1a02001 mov r2, r1
840c: ea000002 b 841c <foo+0x20>
8410: e5943004 ldr r3, [r4, #4]
8414: e2833001 add r3, r3, #1 ; 0x1
8418: e5843004 str r3, [r4, #4]
841c: e242302d sub r3, r2, #45 ; 0x2d
8420: e352002c cmp r2, #44 ; 0x2c
8424: e20320ff and r2, r3, #255 ; 0xff
8428: 8afffff8 bhi 8410 <foo+0x14>
842c: e1a00001 mov r0, r1
8430: e3a0102d mov r1, #45 ; 0x2d
8434: eb000003 bl 8448 <__umodsi3>
8438: e20000ff and r0, r0, #255 ; 0xff
843c: e5840000 str r0, [r4]
8440: e8bd8010 pop {r4, pc}
If a different optimisation level is used:
arm-unknown-linux-gnu-gcc -O2 -o foo test.c
000083fc <foo>:
83fc: e92d4070 push {r4, r5, r6, lr}
8400: e5d14000 ldrb r4, [r1]
8404: e354002c cmp r4, #44 ; 0x2c
8408: e1a06000 mov r6, r0
840c: 9a00000e bls 844c <foo+0x50>
8410: e244402d sub r4, r4, #45 ; 0x2d
8414: e20440ff and r4, r4, #255 ; 0xff
8418: e5905004 ldr r5, [r0, #4]
841c: e3a0102d mov r1, #45 ; 0x2d
8420: e1a00004 mov r0, r4
8424: eb00004f bl 8568 <__umodsi3>
8428: e3a0102d mov r1, #45 ; 0x2d
842c: e1a03000 mov r3, r0
8430: e1a00004 mov r0, r4
8434: e20340ff and r4, r3, #255 ; 0xff
8438: eb000006 bl 8458 <__aeabi_uidiv>
843c: e2855001 add r5, r5, #1 ; 0x1
8440: e20000ff and r0, r0, #255 ; 0xff
8444: e0855000 add r5, r5, r0
8448: e5865004 str r5, [r6, #4]
844c: e5864000 str r4, [r6]
8450: e8bd8070 pop {r4, r5, r6, pc}
Several optimization levels were tried and all produced similar output.
GCC 4.2.2 and 4.2.4 (our current compilers), invoked as
arm-unknown-linux-gnueabi-gcc -Os -o foo test.c
produce:
00008328 <foo>:
8328: e5d12000 ldrb r2, [r1]
832c: ea000003 b 8340 <foo+0x18>
8330: e5903004 ldr r3, [r0, #4]
8334: e20120ff and r2, r1, #255 ; 0xff
8338: e2833001 add r3, r3, #1 ; 0x1
833c: e5803004 str r3, [r0, #4]
8340: e352002c cmp r2, #44 ; 0x2c
8344: e242102d sub r1, r2, #45 ; 0x2d
8348: 8afffff8 bhi 8330 <foo+0x8>
834c: e5802000 str r2, [r0]
8350: e12fff1e bx lr
As can be seen, the trivial loop is performed and the quotient and remainder
are found, but then the __umodsi3 builtin is called to do the operation *again*,
and its return value is used to assign the result, which is already available
from the loop!
This odd behavior is seen in cross-built (and native) GCC 4.3.2 but not in
4.2.4. It seems to be present in current development builds too; however, I have
issues building those reliably, so I cannot give definite results.
The behavior is especially obvious, with large performance and code size
degradation, in compression code on small embedded systems. The additional
need to link in the __umodsi3 implementation also costs more space.
This has also been observed in some circumstances within ARM kernels when using
modulus on powers of two! The obvious optimisation using shifts is performed,
and then the value is recomputed using __modsi3.
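The kernel code in question is not shown in this report. As a rough illustration of why no library call should be needed for a power-of-two modulus, the sketch here writes the mask-and-shift forms out by hand. The divisor 8 and all function names are arbitrary choices for the example, and the signed variant assumes two's complement.

```c
#include <assert.h>

/* Unsigned: remainder and quotient by 8 are a mask and a shift. */
unsigned umod8(unsigned x) { return x & 7u; }
unsigned udiv8(unsigned x) { return x >> 3; }

/* Signed: C's % truncates toward zero, so a negative x with a non-zero
   low part needs its residue moved into the range -7..-1. */
int smod8(int x)
{
    int low = x & 7;   /* assumes two's complement */
    return (x < 0 && low != 0) ? low - 8 : low;
}

/* Hypothetical helper: checks the hand-written forms against % and /. */
int pow2_forms_agree(void)
{
    for (int x = -1024; x <= 1024; x++) {
        assert(smod8(x) == x % 8);
        if (x >= 0) {
            assert(umod8((unsigned)x) == (unsigned)x % 8u);
            assert(udiv8((unsigned)x) == (unsigned)x / 8u);
        }
    }
    return 1;
}
```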
Just for completeness, here is the GCC 4.3.2 compiler used for the tests (4.3.4
produces identical compiled output but has other undesirable behaviors not
relevant to this report):
arm-unknown-linux-gnu-gcc -v
Using built-in specs.
Target: arm-unknown-linux-gnu
Configured with: /opt/simtec/crosstool-ng/targets/src/gcc-4.3.2/configure
--build=x86_64-build_unknown-linux-gnu --host=x86_64-build_unknown-linux-gnu
--target=arm-unknown-linux-gnu --prefix=/opt/simtec/arm-unknown-linux-gnu
--with-sysroot=/opt/simtec/arm-unknown-linux-gnu/arm-unknown-linux-gnu/sys-root
--enable-languages=c,c++,fortran,java --disable-multilib --with-float=soft
--with-gmp=/opt/simtec/arm-unknown-linux-gnu
--with-mpfr=/opt/simtec/arm-unknown-linux-gnu
--with-pkgversion=crosstool-NG-1.3.0 --enable-__cxa_atexit
--with-local-prefix=/opt/simtec/arm-unknown-linux-gnu/arm-unknown-linux-gnu/sys-root
--disable-nls --enable-threads=posix --enable-symvers=gnu --enable-c99
--enable-long-long --enable-target-optspace
Thread model: posix
gcc version 4.3.2 (crosstool-NG-1.3.0)
--
Summary: Output code optimisation excessive use of builtins
Product: gcc
Version: 4.3.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: vince at simtec dot co dot uk
GCC build triplet: x86_64-build_unknown-linux-gnu
GCC host triplet: x86_64-build_unknown-linux-gnu
GCC target triplet: arm-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38453
* [Bug c/38453] Output code optimisation excessive use of builtins
2008-12-09 14:51 [Bug c/38453] New: Output code optimisation excessive use of builtins vince at simtec dot co dot uk
@ 2008-12-09 14:52 ` vince at simtec dot co dot uk
2008-12-10 0:27 ` pinskia at gcc dot gnu dot org
` (4 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: vince at simtec dot co dot uk @ 2008-12-09 14:52 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from vince at simtec dot co dot uk 2008-12-09 14:51 -------
Created an attachment (id=16854)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16854&action=view)
Trivial test code to show behaviour
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38453
* [Bug c/38453] Output code optimisation excessive use of builtins
2008-12-09 14:51 [Bug c/38453] New: Output code optimisation excessive use of builtins vince at simtec dot co dot uk
2008-12-09 14:52 ` [Bug c/38453] " vince at simtec dot co dot uk
@ 2008-12-10 0:27 ` pinskia at gcc dot gnu dot org
2008-12-10 10:56 ` steven at gcc dot gnu dot org
` (3 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2008-12-10 0:27 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from pinskia at gcc dot gnu dot org 2008-12-10 00:25 -------
I don't see an issue here really, the code got optimized to just:
<bb 2>:
prop0.24 = *propsData;
prop0 = prop0.24;
goto <bb 4>;
<bb 3>:
propsRes->pb = [plus_expr] propsRes->pb + 1;
prop0 = prop0 + 211;
<bb 4>:
if (prop0 > 44)
goto <bb 3>;
else
goto <bb 5>;
<bb 5>:
propsRes->lc = (int) (int) (prop0.24 % 45);
return;
But since for arm, there is no %/divide instruction (which is sad by the way),
a call to __umodsi3/__aeabi_uidiv is used.
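Read back into C, the dump above corresponds roughly to the second function in this sketch: the loop only counts, and the value the loop variable has on exit is recomputed from the original input with `%`. This kind of rewrite is commonly called final value replacement. The struct layout and all function names here are assumptions for illustration.

```c
#include <assert.h>

struct props { int lc; int pb; };   /* assumed layout, not from the attachment */

/* Source-level shape: the loop leaves the remainder in x. */
void by_subtraction(struct props *p, unsigned char x)
{
    while (x > 44) {
        p->pb++;
        x -= 45;
    }
    p->lc = x;
}

/* Shape of the dump: the exit value of the loop variable is not used;
   the remainder is recomputed from the original input instead. */
void as_in_dump(struct props *p, unsigned char x)
{
    unsigned char i = x;

    while (i > 44) {
        p->pb++;
        i = (unsigned char)(i + 211);   /* 211 == -45 modulo 256 */
    }
    assert(i == x % 45);                /* the loop already holds the remainder */
    p->lc = (int)(x % 45);              /* a library call without a divide instruction */
}

/* Hypothetical helper: counts byte values for which the two forms differ. */
int count_mismatches(void)
{
    int bad = 0;

    for (int v = 0; v < 256; v++) {
        struct props a = { 0, 0 }, b = { 0, 0 };

        by_subtraction(&a, (unsigned char)v);
        as_in_dump(&b, (unsigned char)v);
        if (a.lc != b.lc || a.pb != b.pb)
            bad++;
    }
    return bad;
}
```

The two forms are equivalent in result, which is why the transformation is valid; the complaint in this report is about cost on a target where `%` is a library call.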
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38453
* [Bug c/38453] Output code optimisation excessive use of builtins
2008-12-09 14:51 [Bug c/38453] New: Output code optimisation excessive use of builtins vince at simtec dot co dot uk
2008-12-09 14:52 ` [Bug c/38453] " vince at simtec dot co dot uk
2008-12-10 0:27 ` pinskia at gcc dot gnu dot org
@ 2008-12-10 10:56 ` steven at gcc dot gnu dot org
2008-12-10 11:20 ` Andrew Thomas Pinski
2008-12-10 11:21 ` pinskia at gmail dot com
` (2 subsequent siblings)
5 siblings, 1 reply; 8+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-12-10 10:56 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from steven at gcc dot gnu dot org 2008-12-10 10:51 -------
Investigating.
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
AssignedTo|unassigned at gcc dot gnu |steven at gcc dot gnu dot
|dot org |org
Status|UNCONFIRMED |ASSIGNED
Ever Confirmed|0 |1
Last reconfirmed|0000-00-00 00:00:00 |2008-12-10 10:51:37
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38453
* Re: [Bug c/38453] Output code optimisation excessive use of builtins
2008-12-10 10:56 ` steven at gcc dot gnu dot org
@ 2008-12-10 11:20 ` Andrew Thomas Pinski
0 siblings, 0 replies; 8+ messages in thread
From: Andrew Thomas Pinski @ 2008-12-10 11:20 UTC (permalink / raw)
To: gcc-bugzilla; +Cc: gcc-bugs
Sent from my iPhone
On Dec 10, 2008, at 2:51 AM, "steven at gcc dot gnu dot org"
<gcc-bugzilla@gcc.gnu.org> wrote:
> ------- Comment #3 from steven at gcc dot gnu dot org 2008-12-10 10:51 -------
> Investigating.
There is no reason to investigate. This change happened because the
heuristic in scev-cp was removed, so the transformation is now always
done. There is another bug about this with respect to the Linux kernel
too.
Thanks,
Andrew Pinski
* [Bug c/38453] Output code optimisation excessive use of builtins
2008-12-09 14:51 [Bug c/38453] New: Output code optimisation excessive use of builtins vince at simtec dot co dot uk
` (2 preceding siblings ...)
2008-12-10 10:56 ` steven at gcc dot gnu dot org
@ 2008-12-10 11:21 ` pinskia at gmail dot com
2008-12-10 11:26 ` steven at gcc dot gnu dot org
2008-12-10 11:29 ` steven at gcc dot gnu dot org
5 siblings, 0 replies; 8+ messages in thread
From: pinskia at gmail dot com @ 2008-12-10 11:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from pinskia at gmail dot com 2008-12-10 11:20 -------
Subject: Re: Output code optimisation excessive use of builtins
Sent from my iPhone
On Dec 10, 2008, at 2:51 AM, "steven at gcc dot gnu dot org"
<gcc-bugzilla@gcc.gnu.org> wrote:
> ------- Comment #3 from steven at gcc dot gnu dot org 2008-12-10 10:51 -------
> Investigating.
There is no reason to investigate. This change happened because the
heuristic in scev-cp was removed, so the transformation is now always
done. There is another bug about this with respect to the Linux kernel
too.
Thanks,
Andrew Pinski
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38453
* [Bug c/38453] Output code optimisation excessive use of builtins
2008-12-09 14:51 [Bug c/38453] New: Output code optimisation excessive use of builtins vince at simtec dot co dot uk
` (3 preceding siblings ...)
2008-12-10 11:21 ` pinskia at gmail dot com
@ 2008-12-10 11:26 ` steven at gcc dot gnu dot org
2008-12-10 11:29 ` steven at gcc dot gnu dot org
5 siblings, 0 replies; 8+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-12-10 11:26 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from steven at gcc dot gnu dot org 2008-12-10 11:24 -------
See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32044#c5
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38453
* [Bug c/38453] Output code optimisation excessive use of builtins
2008-12-09 14:51 [Bug c/38453] New: Output code optimisation excessive use of builtins vince at simtec dot co dot uk
` (4 preceding siblings ...)
2008-12-10 11:26 ` steven at gcc dot gnu dot org
@ 2008-12-10 11:29 ` steven at gcc dot gnu dot org
5 siblings, 0 replies; 8+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-12-10 11:29 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from steven at gcc dot gnu dot org 2008-12-10 11:25 -------
*** This bug has been marked as a duplicate of 32044 ***
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |DUPLICATE
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38453