public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily
[not found] <bug-40893-4@http.gcc.gnu.org/bugzilla/>
@ 2010-10-05 18:14 ` paul at pwsan dot com
2010-10-08 13:05 ` rearnsha at gcc dot gnu.org
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: paul at pwsan dot com @ 2010-10-05 18:14 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893
paul walmsley <paul at pwsan dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |paul at pwsan dot com
--- Comment #2 from paul walmsley <paul at pwsan dot com> 2010-10-05 18:14:35 UTC ---
Here's a minimal test case:
void foo(unsigned int v)
{
*(volatile unsigned short *)0xabcdefab = (v);
}
arm-linux-gcc -O2 -march=armv7-a -c test.c; arm-linux-objdump -DS test.o
| less
00000000 <foo>:
0: e30e3fff movw r3, #61439 ; 0xefff
4: e34a3bcd movt r3, #43981 ; 0xabcd
8: e6ff0070 uxth r0, r0
c: e14305b4 strh r0, [r3, #-84]
10: e12fff1e bx lr
As David notes, the expected behavior is that the uxth should not be generated
for >= armv6 targets, and the two shifts should not be generated on < armv6
targets, as they should be superfluous.
http://marc.info/?l=linux-omap&m=128630215909798&w=2
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily
[not found] <bug-40893-4@http.gcc.gnu.org/bugzilla/>
2010-10-05 18:14 ` [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily paul at pwsan dot com
@ 2010-10-08 13:05 ` rearnsha at gcc dot gnu.org
2010-10-08 17:17 ` paul at pwsan dot com
2021-12-13 8:55 ` [Bug rtl-optimization/40893] " pinskia at gcc dot gnu.org
3 siblings, 0 replies; 6+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2010-10-08 13:05 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893
Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rearnsha at gcc dot gnu.org
--- Comment #3 from Richard Earnshaw <rearnsha at gcc dot gnu.org> 2010-10-08 13:04:49 UTC ---
(In reply to comment #2)
> Here's a minimal test case:
>
> void foo(unsigned int v)
> {
> *(volatile unsigned short *)0xabcdefab = (v);
> }
>
>
The compiler has to be extremely conservative with this code as it has a
volatile memory reference to deal with. It must take extreme care not to
modify that operation and one consequence of this is that it is then difficult
to remove the narrowing operation.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily
[not found] <bug-40893-4@http.gcc.gnu.org/bugzilla/>
2010-10-05 18:14 ` [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily paul at pwsan dot com
2010-10-08 13:05 ` rearnsha at gcc dot gnu.org
@ 2010-10-08 17:17 ` paul at pwsan dot com
2021-12-13 8:55 ` [Bug rtl-optimization/40893] " pinskia at gcc dot gnu.org
3 siblings, 0 replies; 6+ messages in thread
From: paul at pwsan dot com @ 2010-10-08 17:17 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893
--- Comment #4 from paul walmsley <paul at pwsan dot com> 2010-10-08 17:17:24 UTC ---
The bug also appears without volatile:
----
/* generates an unnecessary uxth */
void foo(unsigned short a, unsigned short b, unsigned short c,
unsigned short *e, unsigned short *f)
{
*e = (a + b) + c;
*f = (a + b) - c;
}
/* works as expected */
void bar(unsigned short a, unsigned short b, unsigned short c,
unsigned short *e)
{
*e = (a + b) + c;
}
/* works as expected */
void baz(unsigned short a, unsigned short b, unsigned short c,
unsigned short *e)
{
*e = (a + b) - c;
}
-----
compiled and dumped with:
arm-linux-gnueabi-gcc -O2 -c test.c ; objdump -DS test.o
produces:
-----
Disassembly of section .text:
00000000 <foo>:
0: e0811000 add r1, r1, r0
4: e1a01801 lsl r1, r1, #16
8: e1a01821 lsr r1, r1, #16
c: e0620001 rsb r0, r2, r1
10: e0821001 add r1, r2, r1
14: e1c310b0 strh r1, [r3]
18: e59d3000 ldr r3, [sp]
1c: e1c300b0 strh r0, [r3]
20: e12fff1e bx lr
00000024 <bar>:
24: e0811000 add r1, r1, r0
28: e0822001 add r2, r2, r1
2c: e1c320b0 strh r2, [r3]
30: e12fff1e bx lr
00000034 <baz>:
34: e0811000 add r1, r1, r0
38: e0622001 rsb r2, r2, r1
3c: e1c320b0 strh r2, [r3]
40: e12fff1e bx lr
-----
gcc -v:
Using built-in specs.
Target: arm-linux-gnueabi
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-2'
--with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.4 --enable-shared --enable-multiarch
--enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/arm-linux-gnueabi/include/c++/4.4.5
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-objc-gc --disable-sjlj-exceptions --enable-checking=release
--program-prefix=arm-linux-gnueabi- --includedir=/usr/arm-linux-gnueabi/include
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=arm-linux-gnueabi
--with-headers=/usr/arm-linux-gnueabi/include
--with-libs=/usr/arm-linux-gnueabi/lib
Thread model: posix
gcc version 4.4.5 (Debian 4.4.5-2)
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug rtl-optimization/40893] ARM and PPC truncate intermediate operations unnecessarily
[not found] <bug-40893-4@http.gcc.gnu.org/bugzilla/>
` (2 preceding siblings ...)
2010-10-08 17:17 ` paul at pwsan dot com
@ 2021-12-13 8:55 ` pinskia at gcc dot gnu.org
3 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-12-13 8:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
Keywords| |missed-optimization
Component|middle-end |rtl-optimization
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug middle-end/40893] New: ARM and PPC truncate intermediate operations unnecessarily
@ 2009-07-28 16:28 lessen42+gcc at gmail dot com
2009-07-28 22:28 ` [Bug middle-end/40893] " lessen42+gcc at gmail dot com
2009-09-09 16:34 ` ramana at gcc dot gnu dot org
0 siblings, 2 replies; 6+ messages in thread
From: lessen42+gcc at gmail dot com @ 2009-07-28 16:28 UTC (permalink / raw)
To: gcc-bugs
Consider the following C code:
#include <inttypes.h>
void dct2x2dc_dconly( int16_t d[2][2] )
{
int d0 = d[0][0] + d[0][1];
int d1 = d[1][0] + d[1][1];
d[0][0] = d0 + d1;
d[0][1] = d0 - d1;
}
The following is generated with arm-none-linux-gnueabi-gcc-4.4.0 -O3
-mcpu=cortex-a8 -S
dct2x2dc_dconly:
ldrsh ip, [r0, #2]
ldrsh r3, [r0, #0]
ldrsh r1, [r0, #6]
ldrsh r2, [r0, #4]
add r3, ip, r3
add r2, r1, r2
uxth r3, r3
uxth r2, r2
rsb r1, r2, r3
add r3, r2, r3
strh r1, [r0, #2] @ movhi
strh r3, [r0, #0] @ movhi
bx lr
(with pre-armv6 targets the two uxth are replaced by asl #16, lsr #16 pairs.)
The following is generated with powerpc-unknown-linux-gnu-gcc-4.4.0 -O3
-mcpu=G4 -S
dct2x2dc_dconly:
lha 10,2(3)
lha 0,0(3)
lha 11,6(3)
lha 9,4(3)
add 0,10,0
rlwinm 0,0,0,0xffff
add 9,11,9
rlwinm 9,9,0,0xffff
subf 11,9,0
add 0,9,0
sth 11,2(3)
sth 0,0(3)
blr
The two uxth in the ARM version, and the two rlwinm in the PPC version are
completely unnecessary, as letting strh/sth truncate will give equivalent
results. x86 does not exhibit this behaviour, and removing either d0 + d1 or d0
- d1 will not cause d0 and d1 be truncated to to 16 bits on both ARM and PPC.
powerpc-unknown-linux-gnu-gcc-4.4.0 -v
Using built-in specs.
Target: powerpc-unknown-linux-gnu
Configured with: /var/tmp/portage/sys-devel/gcc-4.4.0/work/gcc-4.4.0/configure
--prefix=/usr --bindir=/usr/powerpc-unknown-linux-gnu/gcc-bin/4.4.0
--includedir=/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.0/include
--datadir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/4.4.0
--mandir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/4.4.0/man
--infodir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/4.4.0/info
--with-gxx-include-dir=/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.0/include/g++-v4
--host=powerpc-unknown-linux-gnu --build=powerpc-unknown-linux-gnu
--enable-altivec --disable-fixed-point --without-ppl --without-cloog
--disable-nls --with-system-zlib --disable-checking --disable-werror
--enable-secureplt --disable-multilib --disable-libmudflap --disable-libssp
--enable-libgomp --enable-cld --disable-libgcj --enable-languages=c,c++,fortran
--enable-shared --enable-threads=posix --enable-__cxa_atexit
--enable-clocale=gnu --with-bugurl=http://bugs.gentoo.org/
--with-pkgversion='Gentoo 4.4.0 p1.1'
Thread model: posix
gcc version 4.4.0 (Gentoo 4.4.0 p1.1)
arm-none-linux-gnueabi-gcc-4.4.0 -v
Using built-in specs.
Target: arm-none-linux-gnueabi
Configured with: ../gcc-4.4.0/configure --target=arm-none-linux-gnueabi
--prefix=/usr/local/arm --enable-threads
--with-sysroot=/usr/local/arm/arm-none-linux-gnueabi/libc
Thread model: posix
gcc version 4.4.0 (GCC)
--
Summary: ARM and PPC truncate intermediate operations
unnecessarily
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: lessen42+gcc at gmail dot com
GCC host triplet: i386-apple-darwin
GCC target triplet: arm-none-linux-gnueabi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily
2009-07-28 16:28 [Bug middle-end/40893] New: " lessen42+gcc at gmail dot com
@ 2009-07-28 22:28 ` lessen42+gcc at gmail dot com
2009-09-09 16:34 ` ramana at gcc dot gnu dot org
1 sibling, 0 replies; 6+ messages in thread
From: lessen42+gcc at gmail dot com @ 2009-07-28 22:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from lessen42+gcc at gmail dot com 2009-07-28 22:27 -------
More specifically, on x86_64 the following is generated with gcc-4.4 -O3
-march=core2 -S
_dct2x2dc_dconly:
movswl 2(%rdi),%edx
pushq %rbp
addw (%rdi), %dx
movswl 6(%rdi),%eax
movq %rsp, %rbp
addw 4(%rdi), %ax
leal (%rax,%rdx), %ecx
subw %ax, %dx
movw %cx, (%rdi)
movw %dx, 2(%rdi)
leave
ret
So it seems that the optimizer realizes that you don't need registers larger
than 16-bits, which allows memory operands on x86, which is optimal for this
case. However, other architectures follow this too literally, wasting
instructions to truncate intermediate results to 16 bits.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily
2009-07-28 16:28 [Bug middle-end/40893] New: " lessen42+gcc at gmail dot com
2009-07-28 22:28 ` [Bug middle-end/40893] " lessen42+gcc at gmail dot com
@ 2009-09-09 16:34 ` ramana at gcc dot gnu dot org
1 sibling, 0 replies; 6+ messages in thread
From: ramana at gcc dot gnu dot org @ 2009-09-09 16:34 UTC (permalink / raw)
To: gcc-bugs
--
ramana at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Last reconfirmed|0000-00-00 00:00:00 |2009-09-09 16:33:48
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-12-13 8:55 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <bug-40893-4@http.gcc.gnu.org/bugzilla/>
2010-10-05 18:14 ` [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily paul at pwsan dot com
2010-10-08 13:05 ` rearnsha at gcc dot gnu.org
2010-10-08 17:17 ` paul at pwsan dot com
2021-12-13 8:55 ` [Bug rtl-optimization/40893] " pinskia at gcc dot gnu.org
2009-07-28 16:28 [Bug middle-end/40893] New: " lessen42+gcc at gmail dot com
2009-07-28 22:28 ` [Bug middle-end/40893] " lessen42+gcc at gmail dot com
2009-09-09 16:34 ` ramana at gcc dot gnu dot org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).