public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/40893] New: ARM and PPC truncate intermediate operations unnecessarily
@ 2009-07-28 16:28 lessen42+gcc at gmail dot com
2009-07-28 22:28 ` [Bug middle-end/40893] " lessen42+gcc at gmail dot com
2009-09-09 16:34 ` ramana at gcc dot gnu dot org
0 siblings, 2 replies; 3+ messages in thread
From: lessen42+gcc at gmail dot com @ 2009-07-28 16:28 UTC (permalink / raw)
To: gcc-bugs
Consider the following C code:
#include <inttypes.h>
void dct2x2dc_dconly( int16_t d[2][2] )
{
    int d0 = d[0][0] + d[0][1];
    int d1 = d[1][0] + d[1][1];
    d[0][0] = d0 + d1;
    d[0][1] = d0 - d1;
}
The following is generated with arm-none-linux-gnueabi-gcc-4.4.0 -O3
-mcpu=cortex-a8 -S
dct2x2dc_dconly:
        ldrsh ip, [r0, #2]
        ldrsh r3, [r0, #0]
        ldrsh r1, [r0, #6]
        ldrsh r2, [r0, #4]
        add r3, ip, r3
        add r2, r1, r2
        uxth r3, r3
        uxth r2, r2
        rsb r1, r2, r3
        add r3, r2, r3
        strh r1, [r0, #2] @ movhi
        strh r3, [r0, #0] @ movhi
        bx lr
(With pre-ARMv6 targets, the two uxth are replaced by asl #16, lsr #16 pairs.)
The following is generated with powerpc-unknown-linux-gnu-gcc-4.4.0 -O3
-mcpu=G4 -S
dct2x2dc_dconly:
        lha 10,2(3)
        lha 0,0(3)
        lha 11,6(3)
        lha 9,4(3)
        add 0,10,0
        rlwinm 0,0,0,0xffff
        add 9,11,9
        rlwinm 9,9,0,0xffff
        subf 11,9,0
        add 0,9,0
        sth 11,2(3)
        sth 0,0(3)
        blr
The two uxth in the ARM version and the two rlwinm in the PPC version are
completely unnecessary, since letting strh/sth truncate gives equivalent
results. x86 does not exhibit this behaviour, and if either d0 + d1 or d0 - d1
is removed, d0 and d1 are no longer truncated to 16 bits on either ARM or PPC.
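The equivalence claimed above follows from modular arithmetic: the low 16 bits of a sum or difference depend only on the low 16 bits of the operands. A minimal sketch that checks this over a sample of the values d0 and d1 can take; the helper names (low16_wide, low16_truncated, count_mismatches) are made up for illustration and are not part of the original test case:

```c
#include <stdint.h>

/* Low 16 bits that strh/sth would store when the 32-bit
 * intermediates d0 and d1 are left untouched. */
static uint16_t low16_wide(int32_t d0, int32_t d1, int subtract)
{
    uint32_t a = (uint32_t)d0, b = (uint32_t)d1;
    return (uint16_t)(subtract ? a - b : a + b);
}

/* Same computation, but each intermediate is first masked to 16 bits,
 * as the uxth / rlwinm ...,0xffff instructions do. */
static uint16_t low16_truncated(int32_t d0, int32_t d1, int subtract)
{
    uint32_t a = (uint32_t)d0 & 0xffffu, b = (uint32_t)d1 & 0xffffu;
    return (uint16_t)(subtract ? a - b : a + b);
}

/* Counts disagreements over a sample of [-65536, 65534], the range a
 * sum of two int16_t values can take. The step sizes are arbitrary. */
static long count_mismatches(void)
{
    long mismatches = 0;
    for (int32_t d0 = -65536; d0 <= 65534; d0 += 257)
        for (int32_t d1 = -65536; d1 <= 65534; d1 += 251)
            for (int sub = 0; sub <= 1; sub++)
                if (low16_wide(d0, d1, sub) != low16_truncated(d0, d1, sub))
                    mismatches++;
    return mismatches;
}
```

If the reasoning holds, count_mismatches() returns 0, which is why the masking instructions before the final add/subtract should be removable.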
powerpc-unknown-linux-gnu-gcc-4.4.0 -v
Using built-in specs.
Target: powerpc-unknown-linux-gnu
Configured with: /var/tmp/portage/sys-devel/gcc-4.4.0/work/gcc-4.4.0/configure
--prefix=/usr --bindir=/usr/powerpc-unknown-linux-gnu/gcc-bin/4.4.0
--includedir=/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.0/include
--datadir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/4.4.0
--mandir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/4.4.0/man
--infodir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/4.4.0/info
--with-gxx-include-dir=/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.0/include/g++-v4
--host=powerpc-unknown-linux-gnu --build=powerpc-unknown-linux-gnu
--enable-altivec --disable-fixed-point --without-ppl --without-cloog
--disable-nls --with-system-zlib --disable-checking --disable-werror
--enable-secureplt --disable-multilib --disable-libmudflap --disable-libssp
--enable-libgomp --enable-cld --disable-libgcj --enable-languages=c,c++,fortran
--enable-shared --enable-threads=posix --enable-__cxa_atexit
--enable-clocale=gnu --with-bugurl=http://bugs.gentoo.org/
--with-pkgversion='Gentoo 4.4.0 p1.1'
Thread model: posix
gcc version 4.4.0 (Gentoo 4.4.0 p1.1)
arm-none-linux-gnueabi-gcc-4.4.0 -v
Using built-in specs.
Target: arm-none-linux-gnueabi
Configured with: ../gcc-4.4.0/configure --target=arm-none-linux-gnueabi
--prefix=/usr/local/arm --enable-threads
--with-sysroot=/usr/local/arm/arm-none-linux-gnueabi/libc
Thread model: posix
gcc version 4.4.0 (GCC)
--
Summary: ARM and PPC truncate intermediate operations unnecessarily
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: lessen42+gcc at gmail dot com
GCC host triplet: i386-apple-darwin
GCC target triplet: arm-none-linux-gnueabi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893
* [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily
From: lessen42+gcc at gmail dot com @ 2009-07-28 22:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from lessen42+gcc at gmail dot com 2009-07-28 22:27 -------
More specifically, on x86_64 the following is generated with gcc-4.4 -O3
-march=core2 -S
_dct2x2dc_dconly:
        movswl 2(%rdi),%edx
        pushq %rbp
        addw (%rdi), %dx
        movswl 6(%rdi),%eax
        movq %rsp, %rbp
        addw 4(%rdi), %ax
        leal (%rax,%rdx), %ecx
        subw %ax, %dx
        movw %cx, (%rdi)
        movw %dx, 2(%rdi)
        leave
        ret
So it seems that the optimizer realizes that registers wider than 16 bits are
not needed, which on x86 allows memory operands and is optimal for this
case. However, on other architectures it follows this too literally, wasting
instructions to truncate intermediate results to 16 bits.
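To illustrate the point about 16-bit registers, here is a rough sketch comparing the function from the report with a hypothetical variant that does every operation at 16-bit width, roughly what the addw/subw sequence computes. The names dct2x2dc_dconly_16bit and variants_agree are invented for this sketch, and it assumes the usual GCC behaviour of reducing out-of-range values modulo 2^16 when converting to int16_t:

```c
#include <stdint.h>
#include <string.h>

/* The function from the report: intermediates are plain int. */
void dct2x2dc_dconly(int16_t d[2][2])
{
    int d0 = d[0][0] + d[0][1];
    int d1 = d[1][0] + d[1][1];
    d[0][0] = d0 + d1;   /* narrowed to 16 bits on store */
    d[0][1] = d0 - d1;
}

/* Hypothetical variant: every result is wrapped to 16 bits right away,
 * modelling arithmetic done entirely in 16-bit registers. */
void dct2x2dc_dconly_16bit(int16_t d[2][2])
{
    uint16_t d0 = (uint16_t)((uint16_t)d[0][0] + (uint16_t)d[0][1]);
    uint16_t d1 = (uint16_t)((uint16_t)d[1][0] + (uint16_t)d[1][1]);
    d[0][0] = (int16_t)(uint16_t)(d0 + d1);
    d[0][1] = (int16_t)(uint16_t)(d0 - d1);
}

/* Runs both variants on the same input; returns 1 if the resulting
 * 2x2 blocks are byte-for-byte identical, 0 otherwise. */
int variants_agree(int16_t a, int16_t b, int16_t c, int16_t e)
{
    int16_t x[2][2] = { { a, b }, { c, e } };
    int16_t y[2][2] = { { a, b }, { c, e } };
    dct2x2dc_dconly(x);
    dct2x2dc_dconly_16bit(y);
    return memcmp(x, y, sizeof x) == 0;
}
```

Under the stated assumption the two variants should agree for all inputs, including ones where the int intermediates exceed the int16_t range, so a target without 16-bit arithmetic can keep the wide intermediates and rely on the final store to narrow.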
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893
* [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily
From: ramana at gcc dot gnu dot org @ 2009-09-09 16:34 UTC (permalink / raw)
To: gcc-bugs
--
ramana at gcc dot gnu dot org changed:
            What    |Removed                     |Added
----------------------------------------------------------------------------
              Status|UNCONFIRMED                 |NEW
      Ever Confirmed|0                           |1
    Last reconfirmed|0000-00-00 00:00:00         |2009-09-09 16:33:48
                date|                            |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893